dc.description.abstract | Due to the massive adoption of mobile money in Sub-Saharan countries, the global transaction
value of mobile money exceeded $2 billion in 2021. Projections show transaction values will exceed
$3 billion by the end of 2022, and Sub-Saharan Africa contributes half of the daily transactions. SMS
(Short Message Service) phishing costs corporations and individuals millions of dollars annually. Spammers
use Smishing (SMS Phishing) messages to trick a mobile money user into sending electronic cash to an
unintended mobile wallet. Though Smishing is a form of phishing, the two differ in the information
available and in attack strategy. As a result, detecting Smishing is difficult. Numerous models and
techniques to detect Smishing attacks have been introduced for high-resource languages, yet few target
low-resource languages such as Swahili. This study proposes a machine-learning-based model to classify
Swahili Smishing text messages targeting mobile money users. Experimental results show that a hybrid model
combining Extra Trees classifier feature selection with Random Forest, using TF-IDF (Term Frequency-Inverse
Document Frequency) vectorization, performs best, with an accuracy of 99.86%. Results are measured
against a baseline Multinomial Naïve Bayes model. The proposed model is also compared with a set of other classic
classifiers and returns the lowest false-positive and false-negative counts (2 and 4, respectively),
with a log loss of 0.04. A Swahili dataset of 32,259 messages is used for performance evaluation. | en_US |