Abstract:
Objective To construct a risk prediction model for healthcare-associated infection (HAI) in stroke patients, so that high-risk patients can be screened accurately and efficiently and targeted preventive interventions can be formulated to reduce the occurrence of infection.
Methods Stroke patients enrolled in the "Henan Stroke Cohort" from 2019 to 2021 were selected as the study subjects, and their clinical data were collected as the main analysis data for model construction and internal validation. Data on stroke patients from three hospitals that had never participated in the cohort construction, covering January to September 2022, were randomly selected as an external test set for external validation of the risk prediction model. The main analysis data were randomly divided into a training set and an internal test set, and risk prediction models were constructed using logistic regression, an artificial neural network (ANN), extreme gradient boosting, and random forest, respectively. Multiple indicators were used to evaluate the prediction performance of each model, and the optimal model was then validated on the external test set.
Results The infection rate of stroke patients was 20.6% in the main analysis data and 56.4% in the external test set. For the risk prediction model based on logistic regression, the accuracy was 91.2%, the area under the receiver operating characteristic (ROC) curve (AUC) was 0.938, and the precision, recall, specificity, and F1 score were 0.851, 0.695, 0.968, and 0.765, respectively. The accuracy, precision, specificity, and AUC of the logistic regression model and the ANN model were all significantly better than those of the other models, while the recall and F1 score of the logistic regression model were slightly better than those of the ANN model. The logistic regression model also showed excellent prediction performance in external validation.
Conclusion The HAI risk prediction model for stroke patients based on logistic regression can effectively identify stroke patients at high risk of infection and can contribute to formulating targeted preventive interventions to reduce the occurrence of infection.
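
The indicators reported in the Results (accuracy, precision, recall, specificity, and F1 score) follow the standard confusion-matrix definitions. The following is a minimal illustrative sketch of those definitions, not the authors' code: the function names and the confusion-matrix counts are hypothetical, and only the precision (0.851) and recall (0.695) passed to `f1_from` are taken from the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives, and false negatives (infected = positive class).
    """
    precision = tp / (tp + fp)        # share of predicted infections that are real
    recall = tp / (tp + fn)           # share of real infections that are detected
    specificity = tn / (tn + fp)      # share of non-infected patients correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to exercise the function.
example = classification_metrics(tp=70, fp=10, tn=300, fn=30)

# Consistency check using the precision and recall reported in the abstract.
reported_f1 = f1_from(0.851, 0.695)
print(round(reported_f1, 3))
```

Applied to the reported precision and recall, the harmonic mean agrees with the F1 score of 0.765 given in the Results.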