Machine learning model for early detection of sexually transmitted infections

Date

2025-07

Publisher

NM-AIST

Abstract

Sexually transmitted infections (STIs) are diseases transmitted mostly through unprotected sex with an infected partner. Worldwide, about one million people acquire an STI every day. The most vulnerable groups in Tanzania are commercial sex workers, long-distance truck drivers, and grocery and hotel workers. The most common STIs in Tanzania are gonorrhoea, syphilis, chlamydia and trichomoniasis. STIs that are not treated in time, or that are treated with the wrong medication, can have serious consequences: they can cause infertility, make the body more susceptible to serious diseases such as HIV, and even cause death. The stigma and humiliation associated with STIs create significant barriers to seeking effective diagnosis and treatment. This study aimed to develop a machine learning model integrated into a web application to facilitate seamless communication between patients and health centres, specifically addressing communication challenges between sexual health clinics and STI patients. Both qualitative and quantitative research methods were employed. Qualitative data were gathered through interviews with health practitioners and ICT officers from the respective hospitals, while quantitative data were collected using survey questionnaires from four hospitals, supported by the Government of Tanzania Health Operation Management Information System (GoT-HoMIS). The dataset's features comprised several STI symptoms, and its labels were laboratory diagnosis results. The model was trained on this local dataset using five machine learning algorithms: AdaBoost, Support Vector Machine (SVM), Random Forest, Decision Tree, and Stochastic Gradient Descent (SGD). The AdaBoost classifier gave the highest accuracy, 97.45%, with an F1 score of 97.70%. The AdaBoost model was therefore serialised for integration with the web app.
When the web app was validated, a majority of participants recommended that the system be used in the Health Information Management System. The developed machine learning model can benefit policymakers and health practitioners by supporting telemedicine for remote diagnosis and patient monitoring. Beyond telemedicine, the model can help remove stigma-related barriers for STI patients. Finally, a machine learning-powered system can increase patient adherence to medication and treatment strategies by anticipating future noncompliance and offering timely reminders or interventions.
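
The model-selection step described in the abstract can be illustrated with a minimal sketch. It is not the study's code: it assumes scikit-learn, replaces the hospital dataset with synthetic data from `make_classification`, uses default hyperparameters, and assumes `pickle` as the serialisation format, none of which the abstract specifies. The names `candidates`, `scores` and `best_name` are illustrative only.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the study's dataset (NOT the real data):
# 12 numeric "symptom" features and a binary "laboratory result" label.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The five algorithm families named in the abstract, with default settings
# (the study's actual hyperparameters are not given and are assumed here).
candidates = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "SVM": SVC(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SGD": SGDClassifier(random_state=0),
}

# Fit each candidate and record accuracy and F1 on the held-out split.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "f1": f1_score(y_test, predictions),
    }

# Choose the candidate with the best F1 score and serialise it so that a
# web application could load it later (pickle is an assumed format).
best_name = max(scores, key=lambda n: scores[n]["f1"])
serialised = pickle.dumps(candidates[best_name])
restored = pickle.loads(serialised)

for name, metric in scores.items():
    print(f"{name}: accuracy={metric['accuracy']:.4f}, f1={metric['f1']:.4f}")
print("Selected:", best_name)
```

On synthetic data the ranking of the five classifiers need not match the study's result; the sketch only shows the shape of the comparison and the serialisation hand-off.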

Sustainable Development Goals

SDG-5: Gender Equality
SDG-9: Industry, Innovation and Infrastructure
SDG-10: Reduced Inequalities
