The data has the following columns:
columns = [
    "ID", "Gender", "Age_Group", "Residence", "Education_Level", "Source_of_Income", "Marital_Status",
    "Smoked_Cigarettes", "Year_Diagnosed",
    "Surgical_Treatment", "Chemotherapy", "Radiotherapy", "Immunotherapy", "Molecular_targeted_Therapy",
    "Hospitalization_Number", "Time_to_Treatment", "Medical_Treatment_Need",
    "Emotional_Impact", "Travel_Impact", "Quality_of_Life",
    "Symptoms_exp_cough", "Symptoms_exp_Hoarseness", "Symptoms_exp_Blood_cough", "Symptoms_exp_chestpain", "Symptoms_exp_Shortness_of_breath", "Symptoms_exp_weakness", "Symptoms_exp_None",
    "Symptom_Frequency", "Symptom_Household_Impact", "Sleep_Issues", "Support_From_Close",
    "Dependency_Fear", "Health_Satisfaction",
    "Daily_Life_Impact_physical", "Daily_Life_Impact_Psychological",
    "Daily_Life_Impact_proffesional", "Daily_Life_Impact_family_life", "Daily_Life_Impact_social_life",
    "Daily_Life_Impact_no_effect",
    "Energy_Level", "Self_Care", "Daily_Activities_Difficulty",
    "Work_Readiness", "Support_Satisfaction", "Coping_Strategy", "Negative_Emotions",
]
1- These can be grouped into:
- Demographic features:
- Therapy type features:
- Symptom exp:
- Daily_Life_Impact:
- Emotions, Satisfaction ...
- ??
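
A minimal sketch of how this grouping could be expressed in code. The assignment of columns to the demographic and therapy groups is an assumption based only on the column names; the symptom and daily-life groups are picked up by their shared prefixes. Every other column is left in an "unassigned" bucket rather than guessed. The names `explicit_groups`, `prefix_groups` and `group_columns` are hypothetical.

```python
columns = [
    "ID", "Gender", "Age_Group", "Residence", "Education_Level",
    "Source_of_Income", "Marital_Status", "Smoked_Cigarettes", "Year_Diagnosed",
    "Surgical_Treatment", "Chemotherapy", "Radiotherapy", "Immunotherapy",
    "Molecular_targeted_Therapy", "Hospitalization_Number", "Time_to_Treatment",
    "Medical_Treatment_Need", "Emotional_Impact", "Travel_Impact", "Quality_of_Life",
    "Symptoms_exp_cough", "Symptoms_exp_Hoarseness", "Symptoms_exp_Blood_cough",
    "Symptoms_exp_chestpain", "Symptoms_exp_Shortness_of_breath",
    "Symptoms_exp_weakness", "Symptoms_exp_None",
    "Symptom_Frequency", "Symptom_Household_Impact", "Sleep_Issues",
    "Support_From_Close", "Dependency_Fear", "Health_Satisfaction",
    "Daily_Life_Impact_physical", "Daily_Life_Impact_Psychological",
    "Daily_Life_Impact_proffesional", "Daily_Life_Impact_family_life",
    "Daily_Life_Impact_social_life", "Daily_Life_Impact_no_effect",
    "Energy_Level", "Self_Care", "Daily_Activities_Difficulty",
    "Work_Readiness", "Support_Satisfaction", "Coping_Strategy", "Negative_Emotions",
]

# Assumption: these two groups have no common prefix, so they are listed by hand.
explicit_groups = {
    "demographic": ["Gender", "Age_Group", "Residence", "Education_Level",
                    "Source_of_Income", "Marital_Status"],
    "therapy_type": ["Surgical_Treatment", "Chemotherapy", "Radiotherapy",
                     "Immunotherapy", "Molecular_targeted_Therapy"],
}

# These two groups share a prefix in the column names.
prefix_groups = {
    "symptom_exp": "Symptoms_exp_",
    "daily_life_impact": "Daily_Life_Impact_",
}


def group_columns(cols):
    """Return a dict mapping group name to the list of columns in that group."""
    groups = {
        name: [c for c in members if c in cols]
        for name, members in explicit_groups.items()
    }
    for name, prefix in prefix_groups.items():
        groups[name] = [c for c in cols if c.startswith(prefix)]
    # Anything not matched above stays unassigned until a group is decided.
    assigned = {c for members in groups.values() for c in members}
    groups["unassigned"] = [c for c in cols if c not in assigned]
    return groups


groups = group_columns(columns)
for name, members in groups.items():
    print(f"{name}: {len(members)} columns")
```

Keeping an explicit "unassigned" bucket makes it easy to see which columns still need a home while the remaining groups are being decided.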
We will try to construct a target feature based on the groups above. As a starting point, we can look at which columns are highly correlated.
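
One possible way to look for highly correlated columns is sketched below, assuming the answers have already been encoded as numbers (for example ordinal codes). It uses pandas' `DataFrame.corr` with Spearman rank correlation, which is an assumption chosen here because survey answers are often ordinal; the helper name `highly_correlated_pairs`, the 0.7 threshold and the toy values are all made up for illustration and are not taken from the real data.

```python
import numpy as np
import pandas as pd


def highly_correlated_pairs(df, threshold=0.7, method="spearman"):
    """List column pairs whose absolute correlation is at least `threshold`."""
    corr = df.corr(method=method).abs()
    # Keep only the strict upper triangle so each pair appears once
    # and the diagonal (self-correlation) is excluded.
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack()
    # NaN entries (masked cells) compare as False and are dropped here.
    pairs = pairs[pairs >= threshold]
    return pairs.sort_values(ascending=False)


# Toy data with invented values, only to show the call.
toy = pd.DataFrame({
    "Quality_of_Life": [1, 2, 3, 4, 5, 2],
    "Health_Satisfaction": [1, 2, 3, 5, 4, 2],
    "Energy_Level": [5, 1, 4, 2, 3, 3],
})

pairs = highly_correlated_pairs(toy, threshold=0.7)
print(pairs)
```

Columns that turn out to be strongly correlated within a group are natural candidates to combine into a single target feature, though the threshold itself is a judgment call.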