Abstract: Objective To construct an artificial intelligence (AI) judgement system for healthcare-associated infection (HAI) based on the DeepSeek large language model and to compare its performance with that of traditional manual review by individual reviewers. Methods A single-center retrospective study was conducted. Medical records of patients discharged from the Second Hospital of Dalian Medical University between January and June 2025 were included. The blinded consensus of 5 senior infection control experts served as the gold standard for judging HAI. The sensitivity, specificity, accuracy, area under the curve (AUC), and Kappa value of the AI system and of individual review were compared against this standard, and subgroup analysis by infection site and categorization of error types were also performed. Results According to the expert gold standard, 136 cases were positive and 184 cases were negative for HAI. The AI system outperformed individual review on the following indicators: sensitivity (92.6% vs 84.6%), accuracy (93.8% vs 89.1%), AUC (0.976 vs 0.897), and Kappa value (0.869 vs 0.776), and all differences were statistically significant (all P<0.05). The sensitivity of the AI system for different infection sites remained above 94%, and for bloodstream infection it was significantly higher than that of manual review (94.7% vs 73.7%, P=0.044). Analysis of error types showed that AI misjudgements were mainly caused by atypical clinical manifestations, whereas cases missed by manual review were often caused by negligence in reading medical records. Conclusion The DeepSeek-based AI judgement system for HAI shows high and stable judgement performance and can significantly improve the sensitivity and standardization of HAI identification. A human-AI collaborative model of "AI initial screening, manual final review" can serve as an intelligent solution for HAI prevention and control.
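For readers less familiar with the indicators named in Methods, the following is a minimal sketch of how sensitivity, specificity, accuracy, and Cohen's Kappa can be derived from a 2x2 table of reviewer judgements against a gold standard. The class name `ConfusionCounts`, the function `agreement_metrics`, and the counts used are hypothetical and chosen only for illustration. They are not the study's data and this is not the study's analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of judgements versus the gold standard (hypothetical helper)."""
    tp: int  # judged HAI, gold standard HAI
    fn: int  # judged non-HAI, gold standard HAI
    fp: int  # judged HAI, gold standard non-HAI
    tn: int  # judged non-HAI, gold standard non-HAI


def agreement_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, accuracy, and Cohen's Kappa."""
    n = c.tp + c.fn + c.fp + c.tn
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / n  # observed agreement

    # Agreement expected by chance, from the row and column marginals.
    p_both_pos = ((c.tp + c.fn) / n) * ((c.tp + c.fp) / n)
    p_both_neg = ((c.tn + c.fp) / n) * ((c.tn + c.fn) / n)
    expected = p_both_pos + p_both_neg

    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Illustrative counts only; NOT taken from the study.
example = ConfusionCounts(tp=90, fn=10, fp=20, tn=80)
metrics = agreement_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the sketch gives a sensitivity of 0.900, a specificity of 0.800, an accuracy of 0.850, and a Kappa of 0.700. AUC is omitted because it requires a continuous score or probability for each case rather than a 2x2 table.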