Objective: To construct a risk prediction model for postoperative sore throat (POST) using machine learning. Methods: A total of 757 adult patients who underwent tracheal intubation under general anesthesia from September 2022 to June 2023 were retrospectively enrolled: 327 males and 430 females, aged ≥ 18 years, ASA physical status Ⅰ-Ⅲ. The patients were divided into two groups according to whether POST occurred within 24 hours after surgery: the POST group (group P) and the non-POST group (group NP). Preoperative, intraoperative, and postoperative follow-up information was collected. The data set was randomly divided into a training set (80%) and a test set (20%). The models were developed on the training set using 20-fold cross-validation and grid search, and the test set was used for internal validation. Four models were established: logistic regression (LR), random forest (RF), adaptive boosting (AdaBoost), and extreme gradient boosting (XGBoost). Their performance was compared using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, area under the precision-recall curve (AUPRC), and Brier score. In addition, Shapley additive explanations (SHAP) were used to interpret the best-performing model. Results: A total of 221 patients (29.1%) had POST. The AUC values of the four machine learning models, namely LR, RF, AdaBoost, and XGBoost, were 0.89 (95% CI 0.87-0.90), 0.90 (95% CI 0.87-0.91), 0.86 (95% CI 0.81-0.89), and 0.88 (95% CI 0.86-0.91), respectively. The RF model performed best in the test set, with an accuracy of 84.0%, sensitivity of 77.0%, specificity of 86.0%, Brier score of 0.13, AUPRC of 0.81, and AUC of 0.90.
The SHAP analysis showed that blood in the tracheal tube contributed most to the RF model, followed in order by the duration of tracheal tube indwelling, amount of intraoperative fluids, gender, age, more than one intubation attempt, ASA physical status, and indwelling gastrostomy tube. Conclusion: In this study, a risk prediction model for POST was constructed using four algorithms (LR, RF, AdaBoost, and XGBoost), of which RF had the best performance.
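The workflow described in the Methods (80/20 split, grid search with 20-fold cross-validation on the training set, internal validation on the test set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual code: it uses scikit-learn, synthetic placeholder data generated with `make_classification` in place of the patient records, a hypothetical hyperparameter grid, a 0.5 classification threshold, and only the RF model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, average_precision_score,
                             brier_score_loss, confusion_matrix, roc_auc_score)
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic placeholder data (NOT the study data): 757 rows, roughly 29% positives.
X, y = make_classification(n_samples=757, n_features=8, n_informative=5,
                           weights=[0.71, 0.29], random_state=0)

# 80% training set, 20% test set; stratification is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Grid search with 20-fold cross-validation, run on the training set only.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}  # hypothetical grid
cv = StratifiedKFold(n_splits=20, shuffle=True, random_state=0)
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                      scoring="roc_auc", cv=cv)
search.fit(X_train, y_train)

# Internal validation: score the refitted best model on the held-out test set.
prob = search.predict_proba(X_test)[:, 1]
pred = (prob >= 0.5).astype(int)  # assumed threshold
tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()

metrics = {
    "auc": roc_auc_score(y_test, prob),
    "accuracy": accuracy_score(y_test, pred),
    "sensitivity": tp / (tp + fn),
    "specificity": tn / (tn + fp),
    # Average precision is a common estimate of the area under the PR curve.
    "auprc": average_precision_score(y_test, prob),
    "brier": brier_score_loss(y_test, prob),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The same loop could be repeated for LR, AdaBoost, and XGBoost by swapping the estimator and grid. The numbers printed here come from synthetic data and are not comparable to the results reported above.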