A Dual-stage Machine Learning Framework for Heart and Stroke Prediction with Progression Path Modeling

Author : Dr. S Krishnaveni and Muthunayaki M

Abstract :

This research introduces a machine learning framework designed to predict stroke risk while modelling its potential progression from heart disease. Using a structured healthcare dataset, the system applies three supervised learning algorithms (Random Forest, Decision Tree, and Naive Bayes) to analyse clinical, demographic, and lifestyle features. A key contribution is the development of a “Progression Path” logic that categorizes individuals into four stages: No Risk, Heart Only, Stroke Only, and Heart → Stroke. The models are evaluated using accuracy, confusion matrices, and feature importance analysis. Additionally, visual tools such as Sankey diagrams and patient-level probability simulations improve interpretability and clinical relevance. The proposed framework supports early diagnosis, personalized risk forecasting, and proactive intervention, demonstrating the practical value of machine learning in preventive healthcare.
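
The four-stage “Progression Path” categorization can be illustrated with a minimal sketch. The abstract does not state the exact rule, so the version below assumes that each patient carries two binary indicators (heart disease and stroke) and that the stage is derived from their combination. The function name `progression_stage`, the sample `patients` list, and the flag encoding are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Stage labels as named in the abstract.
STAGES = ("No Risk", "Heart Only", "Stroke Only", "Heart → Stroke")


def progression_stage(heart_disease: bool, stroke: bool) -> str:
    """Map two binary indicators to one of the four stages.

    Assumption: the stage is determined only by the combination of the
    two flags; the paper's actual rule may use additional features.
    """
    if heart_disease and stroke:
        return "Heart → Stroke"
    if heart_disease:
        return "Heart Only"
    if stroke:
        return "Stroke Only"
    return "No Risk"


# Hypothetical (heart_disease, stroke) records, for illustration only.
patients = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 0)]

# Count how many patients fall into each stage.
stage_counts = Counter(progression_stage(bool(h), bool(s)) for h, s in patients)

for stage in STAGES:
    print(f"{stage}: {stage_counts[stage]}")
```

Counts of this kind, grouped by stage, are the sort of input a Sankey diagram of flows between stages would need, although how the paper builds its diagrams is not described in the abstract.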

Keywords :

Stroke prediction, Sankey diagram, Random Forest, Naive Bayes, Machine learning, Disease progression.