Stroke Prediction Analysis
Stroke is a leading cause of death for both American women and men and a leading cause of serious long-term disability. During a stroke, about 32,000 brain cells die every second. Identifying at-risk individuals early is crucial for implementing preventive measures and interventions and for avoiding lasting damage.
Software Utilized
Weka, Python
Year
Fall 2023
Primary goal:
Create a machine learning model that accurately predicts an individual's likelihood of experiencing a stroke from risk factors and demographic factors.
Risk Factors Clusters:
•Behavioral – smoking, work type
•Environmental – residence type (exposure to pollution)
•Metabolic – high BMI, previous stroke, history of heart disease or hypertension
-
4 in 5 strokes are preventable.
Stroke Awareness Foundation
-
Every 40 seconds, someone in America has a stroke.
Stroke Awareness Foundation
-
795,000 Americans have a stroke each year.
Stroke Awareness Foundation
-
$56.5 billion is spent annually on stroke.
Stroke Awareness Foundation
Data Source and Description
Open Source Dataset
Kaggle
5,110 samples
12 variables
249 had a stroke
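To put these counts in perspective, the short sketch below computes how rare the positive class is. It uses only the two figures reported above; the function name is an illustrative choice, not part of the original analysis.

```python
# Counts taken from the data description above.
TOTAL_SAMPLES = 5110
STROKE_CASES = 249


def class_balance(total, positives):
    """Return (positive_rate, negatives_per_positive) for a binary target."""
    negatives = total - positives
    positive_rate = positives / total
    negatives_per_positive = negatives / positives
    return positive_rate, negatives_per_positive


rate, ratio = class_balance(TOTAL_SAMPLES, STROKE_CASES)
print(f"Stroke rate: {rate:.1%}")                        # Stroke rate: 4.9%
print(f"Non-stroke samples per stroke sample: {ratio:.1f}")  # 19.5
```

Fewer than one sample in twenty is a stroke case, which is worth keeping in mind when reading any accuracy figure for this dataset.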
Risk Factors
Binary Variables
Ever Married (Binary)
Stroke (Binary)
Hypertension (Binary)
Heart Disease (Binary)
Behavioral
Smoking Status:
•Formerly smoked
•Never smoked
•Smokes
•Unknown
Work Type:
•Private
•Self-Employed
•Govt Job
•Children
•Never worked
Profile
ID
Gender (Male, Female)
Age
Metabolic
Avg Glucose Level
BMI
Environmental
Residence Type (Urban, Rural)
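Before modeling, a mix of binary, numeric, and categorical variables like the one listed above usually has to be turned into numbers. The following is a minimal sketch of one common approach (one-hot encoding), not the preprocessing actually used in this project. The field names, the `encode` helper, and the example record are all hypothetical.

```python
# Smoking status levels as listed above; field names below are assumptions.
SMOKING_LEVELS = ["Formerly smoked", "Never smoked", "Smokes", "Unknown"]


def one_hot(value, levels):
    """Return a 0/1 indicator list with a single 1 at the matching level."""
    if value not in levels:
        raise ValueError(f"unexpected category: {value!r}")
    return [1 if value == level else 0 for level in levels]


def encode(record):
    """Turn one patient record (a dict) into a flat numeric feature list."""
    return [
        record["age"],
        record["avg_glucose_level"],
        record["bmi"],
        record["hypertension"],       # already binary (0/1)
        record["heart_disease"],      # already binary (0/1)
        1 if record["residence_type"] == "Urban" else 0,
        *one_hot(record["smoking_status"], SMOKING_LEVELS),
    ]


# A made-up record for illustration only; it is not a row from the dataset.
example = {
    "age": 58,
    "avg_glucose_level": 101.5,
    "bmi": 27.3,
    "hypertension": 1,
    "heart_disease": 0,
    "residence_type": "Rural",
    "smoking_status": "Never smoked",
}
features = encode(example)
print(features)
```

The ID column is left out on purpose, since an identifier carries no information about stroke risk.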
Limitations
The method of data collection is confidential (HIPAA policy).
Conclusion
The dataset contained too few positive stroke instances (249 of 5,110, about 4.9%) for our models to predict strokes reliably; however, we were able to identify non-stroke patients. Maintaining a healthy BMI and glucose level and keeping hypertension under control are associated with lower stroke risk, as indicated by the logistic regression and decision tree analyses.
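The gap between identifying non-stroke patients and identifying stroke patients can be illustrated with sensitivity and specificity. The sketch below does not reproduce the project's model results. As a reference point, it scores a hypothetical baseline that labels every patient "no stroke", using only the dataset counts reported in this document.

```python
def evaluate(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # share of strokes caught
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # share of non-strokes caught
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical baseline: predict "no stroke" for all 5,110 samples.
# Every one of the 249 stroke cases is then a false negative.
baseline = evaluate(tp=0, fn=249, tn=5110 - 249, fp=0)
for name, value in baseline.items():
    print(f"{name}: {value:.3f}")
```

This baseline reaches about 95% accuracy while catching no strokes at all, so accuracy alone says little on a dataset this imbalanced; sensitivity is the number to watch.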
Recommendation
Discuss these factors with your doctor and monitor them on your chart: heart disease, hypertension, smoking status, age, BMI, and average glucose level.