SyMPox: An Automated Monkeypox Detection System Based on Symptoms Using XGBoost

Alireza Farzipour; Roya Elmi; Hamid Nasiri

doi:10.36227/techrxiv.24265333.v2

Abstract

The monkeypox virus poses a novel public health risk that might quickly escalate into a worldwide epidemic. Machine learning (ML) has recently shown considerable promise in diagnosing diseases such as cancer, detecting tumor cells, and identifying COVID-19 patients. In this study, we created a dataset from data collected and published by Global Health and used by the World Health Organization (WHO). This dataset is entirely textual and captures the relationship between symptoms and monkeypox. We analyzed the data with gradient boosting methods, namely Extreme Gradient Boosting (XGBoost), CatBoost, and LightGBM, and with other standard machine learning methods, namely Support Vector Machine (SVM) and Random Forest, and compared all of these methods. The research aims to provide an ML model that diagnoses monkeypox from symptoms, whereas previous studies have examined diagnosis of the disease only from images. XGBoost performed best, with an accuracy of 1.0 in evaluation. To check the model’s flexibility, k-fold cross-validation was used, reaching an average accuracy of 0.9 across 5 different test-set splits. In addition, Shapley Additive Explanations (SHAP) helps examine and explain the output of the XGBoost model.
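As a rough illustration of the k-fold evaluation mentioned above, the following minimal sketch splits a dataset into 5 folds, scores a model on each held-out fold, and averages the accuracies. It is not the authors' pipeline: the toy symptom table, its column meanings, the helper names (`k_fold_indices`, `cross_validate`, `fit_majority`, `predict_majority`), and the majority-class stand-in model are all assumptions made for illustration. A real gradient boosting classifier such as XGBoost would take the place of the stand-in model.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Return one accuracy per fold; each fold serves once as the test set."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the held-out one.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        correct = sum(predict(model, X[j]) == y[j] for j in test_idx)
        scores.append(correct / len(test_idx))
    return scores


# Hypothetical stand-in model (NOT XGBoost): always predicts the most
# common training label. Swap in a real classifier's fit/predict here.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Made-up binary symptom table; the three columns are assumed to mean
# (rash, fever, headache). The label rule is invented for the demo.
X = [[i % 2, (i // 2) % 2, (i // 4) % 2] for i in range(20)]
y = [row[0] for row in X]

scores = cross_validate(X, y, fit_majority, predict_majority, k=5)
print("per-fold accuracy:", scores)
print("mean accuracy:", mean(scores))
```

Passing `fit` and `predict` as plain functions keeps the splitting and averaging logic independent of the model, so the same loop can compare several classifiers on identical folds.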