Browsing by Author "Mduma, Neema"
Item The Artificial Neural Network-Based Smart Number Plate for Vehicles with Real-Time Traffic Signs Recognition and Notification (Springer Link, 2024-06-30) Niyomugaba, Alexandre; Kisangiri, Michael; Mduma, Neema
The world is advancing technologically in all sectors, including intelligent transportation, whereby various vehicles’ movements are monitored and controlled remotely. These technologies simplify the tasks in traffic control and increase road safety. Previous related works designed and implemented different technologies that can identify and locate a vehicle and detect its speed. However, even though these technologies have been implemented, drivers still lack assistance in learning the road situation early, and accidents are not reported in real time to dedicated authorities such as traffic police stations. In this paper, an Artificial Neural Network-based Smart Number Plate with real-time traffic sign recognition and notification was developed. The developed number plate comprises two units, the processing unit and the display unit, which communicate wirelessly. The processing unit contains a speed sensor, vibration shock sensors, Global System for Mobile Communication (GSM), Global Positioning System (GPS), and a Raspberry Pi 3 B+ that acts as the system controller. The display unit contains the Espressif board, a Liquid Crystal Display (LCD), and a buzzer. With the TensorFlow model for machine learning, the smart number plate classifies and recognizes traffic signs with real-time notification.
Moreover, this number plate was tested with different drivers and assisted them in obeying the traffic signs earlier, and the traffic station was alerted for emergency support.

Item Automatic Railway Road Crossing (RLC) Traffic Light System for Metric Gauge Railway Network in Tanzania (International Journal of Advances in Scientific Research and Engineering (ijasre), 2021-11) NKUNZIMANA, Libere; Minja, Gilbert; Mariki, Christina; Zirakwiye, Innocent; Mduma, Neema; Dida, Mussa
It has been established that Railway Level Crossings (RLCs) present a possible risk to road users. Because of the ever-increasing number of vehicles on the road every day, it was determined that employing automation at level crossings can be beneficial to both road and railway users' safety. The aim is to develop an automated railway level crossing system that would reduce the likelihood of collisions between trains and road users at intersections. From the perspective of a railway level crossing, the condition that safeguards must meet is straightforward: before a train passes, all road users must come to a complete stop. Two RFID sensors and ultrasonic sensors are located at the strike-in and strike-out points of the level crossing. The ultrasonic sensor detects automobiles stuck on the railroad once the train has activated the automation at the RLC. Other warning measures in the system include an automated barrier, Light Emitting Diode flashing lights, and an audio alarm device. Arduino UNO and ESP32 were used as microcontrollers to perform all the logical operations and control commands. The next train station from the RLC was also updated with the incoming train’s Expected Time of Arrival (ETA). The time it takes for the barriers to close is determined by the train’s speed. In this project work, the prospective application strategy for securing railroad crossings is described in detail.
It is the best feasible control of the level crossing by using the train detection system.

Item Banana Leaves Imagery Dataset (Springer Nature, 2025-03-21) Mduma, Neema; Elinisa, Christian
In this work, we present a dataset of banana leaf imagery, both with and without diseases. The dataset consists of 11,767 images, categorized as follows: 3,339 healthy images, 3,496 images of leaves affected by Black Sigatoka and 4,932 images of leaves affected by Fusarium Wilt Race 1. This data was collected to support machine learning diagnostics for disease detection. The data collection process involved farmers, researchers, agricultural experts and plant pathologists from the northern and southern highland regions of Tanzania. To ensure unbiased representation, farms were randomly selected from the Rungwe, Mbeya, Arumeru, and Arusha districts, based on the presence of banana crops and the targeted diseases. The dataset offers a comprehensive collection of images captured from November 2022 to January 2023, using a high-resolution smartphone camera across a wide geographical area. Researchers and developers can use this dataset to build machine learning solutions that automatically detect diseases in images, potentially enabling agricultural stakeholders, including farmers, to diagnose Fusarium Wilt Race 1 and Black Sigatoka early and take timely action.

Item A Battery Voltage Level Monitoring System for Telecommunication Towers (Engineering, Technology & Applied Science Research, 2021-12) Uwamahoro, Rahab; Mduma, Neema; Machuve, Dina
Voltage fluctuations in batteries are a major challenge that telecommunication towers face. These fluctuations mostly occur due to poor management and the lack of a battery voltage level monitoring system. The current paper presents a battery voltage-level monitoring system to be used in telecommunication towers.
The proposed solution is incorporated with a centralized mobile application dashboard for accessing the live data of the installed battery, integrated with voltage-level, current, temperature, fire, and gas sensors. An Arduino Uno microcontroller board is used to process and analyze the collected data from the sensors. The Global System for Mobile Communications (GSM) module is used to monitor and store data to the cloud. Users are alerted through Short Message Service (SMS) in the case of low voltage, fire, or an increase in harmful gases in the tower. The experiment was conducted at the Ngorongoro and Manyara telecommunication towers. The developed system can be used to access battery information remotely while allowing real-time continuous monitoring of battery usage. The proposed battery voltage-level monitoring system contributes to the elimination of battery hazards in towers. Therefore, it can be adopted by telecommunication tower engineers for the reduction of voltage fluctuation risks.

Item Characterisation of Malaria Diagnosis Data in High and Low Endemic Areas of Tanzania (East African Health Research Journal, 2022) Mariki, Martina; Mduma, Neema; Mkoba, Elizabeth
Background: Malaria remains a significant cause of morbidity and mortality, especially in the sub-Saharan African region. Malaria is considered preventable and treatable, but in recent years, it has increased outpatient visits, hospitalisation, and deaths worldwide, reaching a 9% prevalence in Tanzania. With the massive number of patient records in the health facilities, this study aims to understand the key characteristics and trends of malaria diagnostic symptoms, testing and treatment data in Tanzania’s high and low endemic regions. Methods: This study had retrospective and cross-sectional designs. The data were collected from four facilities in two regions in Tanzania, i.e., Morogoro Region (high endemicity) and Kilimanjaro Region (low endemicity).
Firstly, malaria patient records were extracted from malaria patients’ files from 2015 to 2018. Data collected include (i) the patient’s demographic information, (ii) the symptoms presented by the patient when consulting a doctor, (iii) the tests taken and results, (iv) diagnosis based on the laboratory results and (v) the treatment provided. Apart from that, we surveyed patients who visited the health facility with malaria-related symptoms to collect extra information such as travel history and the use of malaria control initiatives such as insecticide-treated nets. A descriptive analysis was generated to identify the frequency of responses. Correlation analysis using random effects logistic regression was performed to determine the association between malaria-related symptoms and positivity. Differences with p < 0.05 (i.e., a 95% confidence level) were accepted as significant. Results: Of the 2556 records collected, 1527 (60%) were from the high endemic area, while 1029 (40%) were from the low endemic area. The most frequently observed symptoms in facilities in the high endemic region were fever, followed by headache, vomiting and body pain; in facilities in the low endemic region they were high fever, sweating, fatigue and headache. The results showed that males with malaria symptoms had a higher chance of being diagnosed with malaria than females. Most patients with fever had a high probability of being diagnosed with malaria. From the interview, 68% of patients with malaria-related symptoms treated themselves without proper diagnosis. Conclusions: Our data indicate that proper malaria diagnosis is a significant concern. The majority still self-medicate with anti-malaria drugs once they experience any malaria-related symptoms.
Therefore, future studies should explore this challenge and investigate the potential of using malaria diagnosis records to diagnose the disease.

Item Combining Clinical Symptoms and Patient Features for Malaria Diagnosis: Machine Learning Approach (Taylor & Francis online, 2022-01-30) Mariki, Martina; Mkoba, Elizabeth; Mduma, Neema
Presumptive treatment and self-medication for malaria have been used in limited-resource countries. However, these approaches have been considered unreliable due to the unnecessary use of malaria medication. This study aims to demonstrate supervised machine learning models in diagnosing malaria using patient symptoms and demographic features. A malaria diagnosis dataset was extracted from two regions of Tanzania: Morogoro and Kilimanjaro. Important features were selected to improve model performance and reduce processing time. Machine learning classifiers with the k-fold cross-validation method were used to train and validate the model. The dataset was used to develop a machine learning model for malaria diagnosis using patient symptoms and demographic features. A malaria diagnosis dataset of 2556 patients’ records with 36 features was used. It was observed that the ranking of features differs among regions and in the combined dataset. The significant features selected were residence area, fever, age, general body malaise, visit date, and headache. Random Forest was the best classifier with an accuracy of 95% in Kilimanjaro, 87% in Morogoro and 82% in the combined dataset. Based on clinical symptoms and demographic features, a region-specific malaria predictive model was developed to demonstrate relevant machine learning classifiers. Important features are useful in making the disease prediction.

Item Common beans imagery dataset for early detection of bean rust and bean anthracnose diseases (Elsevier, 2024-05-11) Laizer, Hudson; Mduma, Neema; Machuve, Dina; Maganga, Reinfrid
Common bean plays a crucial role in the agricultural sector in Tanzania.
To most smallholder farmers, the crop serves as a principal source of protein and an essential source of income. Despite its significance, common bean production is often affected by diseases, particularly bean rust and bean anthracnose, resulting in low yields and diminished economic returns. To address this challenge, a comprehensive dataset of common bean leaf images has been collected by using smartphone cameras to capture the visual characteristics of healthy and diseased leaves. The dataset contains more than 59,072 labeled images, offering a valuable resource for developing machine learning models and user-friendly tools capable of early detection and diagnosis of bean rust and bean anthracnose diseases. The aim of generating this dataset is to facilitate the development of machine learning tools that will empower agricultural extension officers, smallholder farmers, and other stakeholders in agriculture to promptly identify and diagnose affected crops, enabling timely and effective interventions before the diseases cause significant economic loss. By equipping farmers with the knowledge and tools to combat these diseases, we can safeguard bean production, enhance food security, and strengthen the economic well-being of smallholder farmers in Tanzania and other parts of Africa.

Item Computer Science Education in Selected Countries from Sub-Saharan Africa (ACM Inroads, 2024-02-20) Bainomugisha, Engineer; Bradshaw, Karen; Ujakpa, Martin; Nakatumba-Nabende, Joyce; Nderu, Lawrence; Mduma, Neema; Kihoza, Patrick; Irungu, Annette
Computer Science education in sub-Saharan Africa has evolved over the past decades. The number of institutions offering distinct undergraduate programs has grown, thus increasing the number of students enrolling in the computer science discipline. Several computer science degree programs have emerged with one of the objectives being to satisfy the growing demand for local talent and skills.
In this paper, we provide a snapshot of the evolution of undergraduate computer science education in selected countries in Sub-Saharan Africa over the past 20+ years and an overview of the developments in computer science education and observed trends. The setup of educational institutions in Africa and the operational context require unique modalities for the design and delivery of computer science education that meets the demands of the industry, amongst others. This paper provides insights into the best practices in the computer science curricula in the selected countries, as well as an overview of the pedagogical and delivery approaches to computer science education. The paper highlights case studies from institutions in the selected countries, namely Uganda, South Africa, Ghana, Tanzania, and Kenya, with a consolidated summary of the current and emerging challenges and opportunities in all these countries. The paper concludes by providing perspectives on the future landscape of computer science in Sub-Saharan Africa.

Item Convolutional Neural Network Deep Learning Model for Early Detection of Streak Virus and Lethal Necrosis in Maize: A Case of Northern-Highlands, Tanzania (Springer Link, 2024-06-30) Mayo, Flavia; Mduma, Neema
In the Tanzanian context, maize is the dominant food crop, a common and traditional food grown on about 45% of the country’s farmland. However, its productivity is hindered by diseases that diminish its quality and quantity. Maize streak virus (MSV) and maize lethal necrosis (MLN) are the two diseases that farmers have reported as dominant for many years. These diseases are likely to be cured if detected early. Nevertheless, sophisticated tools for detecting these diseases are still lagging behind the fast pace of technology in developing countries like Tanzania.
That being the case, this study aims to fill the gap by investigating the need for, and developing, a deep learning model for early detection of these two diseases. In doing so, a deep learning solution based on Convolutional Neural Networks (CNN) has been developed to predict the early occurrence of these diseases in maize leaves. A CNN model was developed from scratch with a total of 1500 datasets belonging to three classes, namely healthy, MLN, and MSV. The developed model attained a validation accuracy of 98.44%. Since the validation accuracy exceeds 70%, the model is considered reliable and has the potential to be adopted for early prediction of MLN and MSV diseases. In addition, a vision transformer (ViT) model will be developed, and its efficiency will be compared with that of the CNN. The model with the best results will be deployed on a mobile device, ready for use by farmers in real-life environments.

Item Data Balancing Techniques for Predicting Student Dropout Using Machine Learning (MDPI, 2023-02-27) Mduma, Neema
Predicting student dropout is a challenging problem in the education sector. This is due to an imbalance in student dropout data, mainly because the number of registered students is always higher than the number of dropout students. Developing a model without taking the data imbalance issue into account may lead to an ungeneralized model. In this study, different data balancing techniques were applied to improve prediction accuracy in the minority class while maintaining a satisfactory overall classification performance. Random Over Sampling, Random Under Sampling, Synthetic Minority Over Sampling (SMOTE), SMOTE with Edited Nearest Neighbor and SMOTE with Tomek links were tested, along with three popular classification models: Logistic Regression, Random Forest, and Multi-Layer Perceptron. Publicly accessible datasets from Tanzania and India were used to evaluate the effectiveness of balancing techniques and prediction models.
The results indicate that SMOTE with Edited Nearest Neighbor achieved the best classification performance on the 10-fold holdout sample. Furthermore, Logistic Regression correctly classified the largest number of dropout students (57348 for the Uwezo dataset and 13430 for the India dataset) using the confusion matrix as the evaluation metric. The applications of these models allow for the precise prediction of at-risk students and the reduction of dropout rates.

Item Data driven approach for predicting student dropout in secondary schools (NM-AIST, 2020-06) Mduma, Neema
Student dropout is among the challenges that face most schools in developing countries, particularly in Africa. In Tanzania alone, student dropout in secondary schools is reported to be around 36%. In addressing the student dropout problem, a thorough understanding of the fundamental factors that cause student dropout is essential. Several researchers have identified and proposed causes, methods and strategies that will help to reduce or stop the student dropout problem; however, most of the proposed solutions did not show promising results, and the student dropout trend continues to increase over time. This study focused on developing a data driven approach that will help to identify and predict students who are at risk of dropping out of school in order to facilitate an intervention program as an active measure in eliminating the problem of dropout in Tanzania.
In doing so, (a) 122 research articles were examined, (b) 4 focus group discussions and 2 round table surveys with 38 respondents from 5 districts (Arusha, Mbeya, Kisarawe, Rufiji and Nzega) were conducted, and (c) 3 datasets from Tanzania and India were used in order to identify factors that contribute significantly to the student dropout problem, disclose the best classifier among the commonly used classifiers (Logistic Regression, Random Forest, K-nearest Neighbor and Multilayer Perceptron) and assess the data balancing techniques for the predictive performance of the model. Results revealed that most of the respondents mentioned students’ gender, age, parent’s income, number of qualified teachers and remoteness as the main contributing factors to the student dropout problem in secondary schools. Furthermore, results from the examined articles indicated that most studies conducted in developing countries focused on the social aspects of student dropout, and few mentioned the use of other approaches such as machine learning. Nevertheless, results from the development of the data driven approach show that Logistic Regression and Multilayer Perceptron achieved the highest performance when an over-sampling technique was employed. Also, hyperparameter tuning improved the algorithms' performance compared to their baseline settings, and stacking of the classifiers improved the overall predictive performance of the developed approach.
The study, therefore, recommends that relevant authorities consider the developed approach for identifying and predicting students at risk of dropping out, to support early intervention, planning and informed decision making in addressing the student dropout problem.

Item Data Synthesis Technique for Categorical Pestes Des Petits Ruminants (PPR) Data Using CTGAN Model (Preprints.org, 2023-05-11) Nyambo, Devotha; Mduma, Neema; Sinde, Ramadhani; Lyimo, Tumaini
Data scarcity is a significant challenge in the field of Machine Learning (ML), as data collection can be expensive, time-consuming, and difficult, particularly in developing countries. This challenge is exacerbated by the need to use datasets for livestock disease predictions for early intervention and surveillance. To address this challenge, this paper presents a data synthesis method that has been used to accurately generate new data samples from few real-world data. With more data available to train the ML models, overfitting is eliminated. We present the use of Generative Adversarial Networks, mainly the Conditional Tabular Generative Adversarial Network (CTGAN), to synthesize categorical data for training machine learning models for prediction of the Pestes des Petits Ruminants (PPR) disease. The results showed that the training score became 0.89 and the cross-validation score was 0.87 after synthesized data was used with the Random Forest algorithm. The resulting dataset can be used to support the prediction and surveillance of the Pestes des Petits Ruminants (PPR) disease.
The proposed method can also be applied to any domain with categorical data, and has the potential to improve the performance of machine learning models with increased data availability.

Item Dataset of banana leaves and stem images for object detection, classification and segmentation: A case of Tanzania (Elsevier, 2023-06-16) Mduma, Neema; Leo, Judith
Banana is among the major crops cultivated by most smallholder farmers in Tanzania and other parts of Africa. This crop is very important in the household economy as well as food security since it serves as both a food and a cash crop. Despite these benefits, the majority of smallholder farmers are experiencing low yields which are attributed to diseases. The most problematic diseases are Black Sigatoka and Fusarium Wilt Race 1. Black Sigatoka is a disease that produces spots on the leaves of bananas and is caused by an air-borne fungus called Pseudocercospora fijiensis, formerly known as Mycosphaerella fijiensis. Fusarium Wilt Race 1 is one of the most destructive banana diseases and is caused by a soil-borne fungus called Fusarium oxysporum f. sp. cubense (Foc). A dataset of curated banana crop images is presented in this article. Images of both healthy and diseased banana leaves and stems were taken in Tanzania and are included in the dataset. Smartphone cameras were used to take pictures of the banana leaves and stems. The dataset is the largest publicly accessible dataset for banana leaves and stems and includes 16,092 images. The dataset is significant and can be used to develop machine learning models for early detection of diseases affecting bananas. This dataset can be used for a number of computer vision applications, including object detection, classification, and image segmentation.
The motivation for generating this dataset is to contribute to developing machine learning tools and spur innovations that will help to address the issue of crop diseases and help to eradicate the problem of food insecurity in Africa.

Item A Deep Learning Model for Predicting Stock Prices in Tanzania (Engineering, Technology & Applied Science Research, 2023-04-02) Joseph, Samuel; Mduma, Neema; Nyambo, Devotha
Stock price prediction models help traders to reduce investment risk and choose the most profitable stocks. Machine learning and deep learning techniques have been applied to develop various models. As there is a lack of literature on efforts to utilize such techniques to predict stock prices in Tanzania, this study attempted to fill this gap. This study selected active stocks from the Dar es Salaam Stock Exchange and developed LSTM and GRU deep learning models to predict the next-day closing prices. The results showed that LSTM had the highest prediction accuracy with an RMSE of 4.7524 and an MAE of 2.4377. This study also aimed to examine whether it is significant to account for the outstanding shares of each stock when developing a joint model for predicting the closing prices of multiple stocks. Experimental results with both models revealed that prediction accuracy improved significantly when the number of outstanding shares of each stock was taken into account. The LSTM model achieved an RMSE of 10.4734 when the outstanding shares were not taken into account and 4.7524 when they were taken into account, showing an improvement of 54.62%. Similarly, GRU achieved an RMSE of 12.4583 when outstanding shares were not taken into account and 8.7162 when they were taken into account, showing an improvement of 30.04%.
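The improvement percentages quoted for the two models are consistent with a relative reduction in RMSE, (baseline - improved) / baseline. The abstract does not state the formula, so the sketch below assumes that definition; the function name `relative_improvement` is illustrative only.

```python
def relative_improvement(baseline_rmse: float, improved_rmse: float) -> float:
    """Return the percentage reduction in RMSE relative to the baseline."""
    return (baseline_rmse - improved_rmse) / baseline_rmse * 100.0


# RMSE values as quoted in the abstract:
# first argument = without outstanding shares, second = with outstanding shares.
lstm_gain = relative_improvement(10.4734, 4.7524)
gru_gain = relative_improvement(12.4583, 8.7162)

print(f"LSTM improvement: {lstm_gain:.2f}%")
print(f"GRU improvement:  {gru_gain:.2f}%")
```

Under this assumed definition the two results round to the 54.62% and 30.04% figures reported for LSTM and GRU.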
The best model was implemented in a web-based prototype to make it accessible to stockbrokers and investment advisors.

Item Deep learning models for the early detection of maize streak virus and maize lethal necrosis diseases in Tanzania (International Journal of Innovative Research & Development, 2024-08-16) Mduma, Neema; Mgala, Mvurya; Maina, Ciira; Mayo, Flavia
Agriculture is considered the backbone of Tanzania’s economy, with more than 60% of the residents depending on it for survival. Maize is the country’s dominant and primary food crop, accounting for 45% of all farmland production. However, its productivity is limited by the inability to detect maize diseases early enough. Maize streak virus (MSV) and maize lethal necrosis virus (MLN) are common diseases often detected too late by farmers. This has led to the need to develop a method for the early detection of these diseases so that they can be treated on time. This study investigated the potential of developing deep-learning models for the early detection of maize diseases in Tanzania. The regions where data was collected are Arusha, Kilimanjaro, and Manyara. Data was collected through observation by a plant. The study proposed convolutional neural network (CNN) and vision transformer (ViT) models. Four classes of imagery data were used to train both models: MLN, Healthy, MSV, and WRONG. The results revealed that the ViT model surpassed the CNN model, with accuracies of 93.1% and 90.96%, respectively. Further studies should focus on mobile app development and deployment of the model with greater precision for early detection of the diseases mentioned above in real life.

Item Development of a smart ugali cooker (International Journal of Advanced Technology and Engineering Exploration, 2021-02-21) Katwale, Samwel; Daudi, Ngollo; Hassan, Amran; Mduma, Neema; Dida, Mussa; Kisangiri, Michael
Ugali is a thick porridge that is one of the popular staple foods in East Africa.
Traditional methods of ugali preparation, cooking, and consumption are described. Firewood has been used as the primary energy source, followed by charcoal. In East Africa, electricity grids have expanded and reached a wider network, which has opened opportunities for electric cooking to domestic consumers, especially in urban areas that were previously dominated by charcoal, which is in scarce supply due to government regulations on environmental conservation. In this project, the smart ugali cooker was designed and developed to automate the process of cooking ugali in households, making it faster, safer, and healthier. The smart ugali cooker is an automated kitchen appliance designed to boil the mixture of water and maize flour into a dough mixture referred to as ugali. It consists of a driving motor, stirrer, flour dispenser, heat source, a cooking pan, a temperature sensor that measures the temperature of the boiling water in the pan, and the control system (Arduino board). The device has the following basic units: a dispenser, stirring unit, electronic control unit, pan, and electric heater. These units were fabricated and integrated to form the complete cooking device. Thereafter, the Arduino board was programmed to control the cooking process. Cooking experiments were conducted on the cooking duration and the texture of ugali based on the water to flour ratio. The results showed that ugali was cooked after ten minutes and the quality was good for consumption.
It is recommended that the right flour to water ratio be applied to obtain the desired texture of ugali.

Item Development of the RFID Based Library Management and Anti-Theft System: A Case of East African Community (EAC) Region (International Journal of Advances in Scientific Research and Engineering, 2021-05) Irankunda, Deo; Sinde, Ramadhani; Mduma, Neema; Dida, Mussa
Radio Frequency Identification (RFID) systems are becoming very useful in our daily life due to their advantages, such as reduction of human error, theft prevention, time savings, automatic identification of targeted objects, and business process automation. RFID systems have been applied in libraries to manage items and library operations. Different approaches have been adopted in library management systems in the East African region; unfortunately, some challenges, including theft, page removal, customer dissatisfaction, and the high cost of the systems used, still persist. To address these challenges, an RFID based library management and anti-theft system has been developed for the East African Community (EAC) library. It focuses on the use of the Ultra High Frequency (UHF) band, which enables readers and tags to transmit and receive data at long range. The developed system allows users to borrow and return library items using RFID modules and enables librarians to monitor and record library activities and prevent unissued items from crossing the library entrance or exit.

Item Enhancing Management of Nutrition Information Using Mobile Application: Prenatal and Postnatal Requirements (IST-Africa, 2017) Mduma, Neema; Kalegele, Khamisi
Malnutrition contributes to over one half of the deaths of children under the age of five years in developing countries and is the single greatest cause of child mortality in Tanzania. Investigations reveal that the issue of malnutrition is aggravated by lack of nutritional information, especially in rural communities.
Absence of proper tools makes collection, management and access to nutrition information very difficult. The aim of this study is to improve accessibility of nutritional information by taking advantage of advanced mobile technologies to integrate a mobile-based information management platform with existing Health Information Systems. The platform will give mothers instant access to nutritional tips, allow them to interact with nutrition practitioners and help in record keeping. In this paper, we present the requirements of a mobile application for managing prenatal and postnatal nutritional information. The requirements have been established from interviews with the various stakeholders and literature reviews. The established requirements become a necessary input towards development of a complete mobile-based nutrition information management platform, which is to be integrated with the existing health information system.

Item An Ensemble Predictive Model Based Prototype for Student Drop-out in Secondary Schools (Journal of Information Systems Engineering & Management, 2019-08-22) Mduma, Neema; Kalegele, Khamisi; Machuve, Dina
When a student is absent from school for a continuous number of days as defined by the relevant authority, that student is considered to have dropped out of school. In Tanzania, for instance, drop-out is when a student is absent continuously for a period of 90 days. Despite the fact that several efforts have been made to improve the overall status of education at secondary level, the student drop-out problem still persists. Taking advantage of advancement in technology, several studies have used machine learning to address the student drop-out problem. However, most of the conducted studies have used datasets from developed countries, while developing countries are facing challenges in generating public datasets to be used to address this problem.
Using a dataset from Tanzania which reflects a local scenario, this study presents an ensemble predictive model based prototype for student drop-out in secondary schools. The deployed model was developed by soft combining tuned Logistic Regression and Multi-Layer Perceptron models. A feature engineering experiment was conducted to obtain the most important features for predicting student drop-out. Furthermore, a visualization module was integrated to assist interpretation of the machine learning results, and the Flask framework was used in the development of the prototype.

Item Feature Selection Approach to Improve Malaria Prediction Model’s Performance for High- and Low-Endemic Areas of Tanzania (Springer Link, 2024-06) Mariki, Martina; Mduma, Neema; Mkoba, Elizabeth
Malaria remains a significant cause of death, especially in sub-Saharan Africa, with about 228 million malaria cases worldwide. Parasitological tests, like microscopic and rapid diagnostic tests (RDT), are the recommended and standard tools for diagnosing malaria. However, clinical diagnosis is advised in areas where parasitological tests for malaria are not readily available. This method is the least expensive and most widely practiced. A clinical diagnosis called presumptive treatment is based on the patient’s signs and symptoms and physical findings at the examination. A malaria diagnosis dataset was extracted from patients’ files from four (4) identified health facilities in Kilimanjaro and Morogoro. These regions were selected to represent the country’s high- (Morogoro) and low-endemic areas (Kilimanjaro). The dataset contained 2556 instances and 36 variables. The random forest classifier, a tree-based method, was used to select the most important features for malaria prediction because it is robust and has high performance. Regional-based features were obtained to facilitate accurate prediction.
The feature ranking indicated that fever is universally the most noteworthy feature for predicting malaria, followed by general body malaise, vomiting, and headache. However, these features are ranked differently across the regional datasets. Subsequently, six predictive models, using important features selected by the feature selection method, were used to evaluate the performance of the features. The identified features comply with the malaria diagnosis and treatment guidelines provided by WHO and Tanzania Mainland. This compliance is expected to produce a prediction model that will fit in the current healthcare provision system.
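To make the idea of ranking symptoms by predictive value concrete, the following is a minimal Python sketch. It is not the random forest procedure used in the study: it substitutes a simpler information-gain score, and the eight patient records, the feature names, and the helper functions are all made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Reduction in label entropy after splitting the records on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Return (feature, score) pairs sorted from most to least informative."""
    scores = {f: information_gain(rows, labels, f) for f in rows[0]}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records (1 = symptom present); NOT data from the study.
rows = [
    {"fever": 1, "headache": 1, "vomiting": 1},
    {"fever": 1, "headache": 0, "vomiting": 1},
    {"fever": 1, "headache": 1, "vomiting": 0},
    {"fever": 1, "headache": 0, "vomiting": 0},
    {"fever": 0, "headache": 1, "vomiting": 0},
    {"fever": 0, "headache": 0, "vomiting": 0},
    {"fever": 0, "headache": 1, "vomiting": 1},
    {"fever": 0, "headache": 0, "vomiting": 0},
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = malaria positive (made up)

ranking = rank_features(rows, labels)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In this toy data fever separates the two classes perfectly, so it ranks first. Running the same ranking separately on per-region subsets would illustrate how feature order can differ between regional datasets, as the abstract reports.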