NM-AIST Repository

Browsing by Author "Agbinya, Johnson"

Now showing 1 - 2 of 2
  • The Effect of Hyperparameter Optimization on the Estimation of Performance Metrics in Network Traffic Prediction using the Gradient Boosting Machine Model
    (Engineering, Technology & Applied Science Research (ETASR), 2023-06) Mbelwa, Jimmy; Agbinya, Johnson; Mwita, Machoke; Sam, Anael
    Information and Communication Technology (ICT) has changed the way we communicate and access information, resulting in the generation of large volumes of heterogeneous data. The amount of network traffic generated constantly increases in velocity, veracity, and volume as we enter the era of big data. Network traffic classification and intrusion detection are very important for the early detection and identification of unnecessary network traffic. The Machine Learning (ML) approach has recently taken center stage in accurate network traffic classification. However, in most cases, it is applied without model hyperparameter optimization. In this study, gradient boosting machine prediction was used with different hyperparameter configurations, varying interaction depth, tree number, learning rate, and sampling. Data were collected through an experimental setup using Sophos firewall and Cisco router data loggers. Data analysis was conducted in R version 4.2.0 with the RStudio Integrated Development Environment. The dataset was split into two partitions, with 70% used for training the model and 30% for testing. At a learning rate of 0.1, interaction depth of 14, and tree number of 2500, the model achieved the highest performance metrics, with an accuracy of 0.93 and R of 0.87, compared to 0.90 and 0.85 before model optimization. The same configuration attained the minimum classification error of 0.07, compared to 0.10 before model optimization. After model tweaking, a method was developed for achieving improved accuracy, R square, and mean decrease in Gini coefficients for more than 8 features, along with lower classification error, root mean square error, logarithmic loss, and mean square error in the model.
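    The search procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' R code: the grid values are hypothetical placeholders (only learning rate 0.1, interaction depth 14, and 2500 trees are reported above), the helper names are invented for this example, and no model is actually trained. It shows how the configurations could be enumerated, how a 70/30 split could be drawn, and that the reported classification errors (0.07 and 0.10) are the complements of the reported accuracies (0.93 and 0.90).

```python
import itertools
import random

# Hypothetical search grid. Only learning rate 0.1, interaction depth 14,
# and 2500 trees appear in the abstract; every other value is a placeholder.
GRID = {
    "learning_rate": [0.01, 0.1],
    "interaction_depth": [6, 14],
    "n_trees": [500, 2500],
    "sample_fraction": [0.5, 1.0],
}


def configurations(grid):
    """Yield every combination of hyperparameter values as a dict."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def split_70_30(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in a 70/30 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def classification_error(accuracy):
    """Classification error is the complement of accuracy."""
    return round(1.0 - accuracy, 2)


configs = list(configurations(GRID))
train, test = split_70_30(range(100))
print(len(configs), len(train), len(test))
print(classification_error(0.93), classification_error(0.90))
```

    In a real run, each configuration would be passed to a gradient boosting implementation, fitted on the training partition, and scored on the test partition; that step is omitted here.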
  • Performance Comparison of Ensemble Learning and Supervised Algorithms in Classifying Multi-label Network Traffic Flow
    (Engineering, Technology & Applied Science Research, 2022-06) Machoke, Mwita; Mbelwa, Jimmy; Agbinya, Johnson; Sam, Anael
    Network traffic classification is of significant importance. It helps identify network anomalies and assists in taking measures to avoid them. However, classifying network traffic correctly is a challenging task. This study compares ensemble learning methods with standard supervised classifiers in order to arrive at improved classification methods. Three types of network traffic were classified (Benign, Malicious, and Outliers). The data were collected experimentally with Paessler Router Traffic Grapher software and from online sources, and were analyzed in R. The datasets were used to train five supervised models (k-nearest neighbors, mixture discriminant analysis, Naïve Bayes, C5.0 classification model, and regularized discriminant analysis). The models were trained on 70% of the samples, and the remaining 30% were used for validation. The same samples were used separately to measure the accuracy of each individual model. The results were compared to those of ensemble learning models built from the same datasets. Among the five supervised classifiers, k-nearest neighbors and C5.0 classification scored the highest accuracies, 0.868 and 0.761 respectively. The ensemble learning classifiers Bagging (Random Forest) and Boosting (eXtreme Gradient Boosting) had accuracies of 0.904 and 0.902 respectively. The results show that the ensemble learning methods achieve higher accuracy than the standard supervised classifiers. Therefore, they can be used to detect malicious activities and anomalies in network traffic with improved accuracy.
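    The bagging idea behind the ensemble classifiers mentioned in this abstract can be sketched as follows. This is a simplified illustration in Python, not the study's method: the study used Random Forest and eXtreme Gradient Boosting in R, whereas this sketch swaps in a 1-nearest-neighbor base learner on a single feature to stay short. The toy data and all function names are invented for the example.

```python
import random
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among the votes."""
    return Counter(labels).most_common(1)[0][0]


def nearest_neighbor(train, point):
    """Base learner: label of the closest training row (1-nearest neighbor)."""
    _, label = min(train, key=lambda row: abs(row[0] - point))
    return label


def bagged_predict(train, point, n_models=25, seed=0):
    """Bagging: one base learner per bootstrap resample, combined by vote."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # Bootstrap resample: draw len(train) rows with replacement.
        resample = [rng.choice(train) for _ in train]
        votes.append(nearest_neighbor(resample, point))
    return majority_vote(votes)


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy one-feature rows, invented for illustration (not the study's data).
TRAIN = [(0.1, "Benign"), (0.2, "Benign"), (0.3, "Benign"),
         (0.7, "Malicious"), (0.8, "Malicious"), (0.9, "Malicious"),
         (1.8, "Outliers"), (1.9, "Outliers"), (2.0, "Outliers")]

print(bagged_predict(TRAIN, 0.0))
```

    Each bootstrap resample gives the base learner a slightly different view of the training data, and the vote averages out their individual mistakes. Boosting differs in that the learners are fitted sequentially, each one focusing on the errors of those before it.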
Contact us
  • library@nm-aist.ac.tz
  • The Nelson Mandela African Institution of Science and Technology, 404 Nganana, 2331 Kikwe, Arumeru P.O. Box 447, Arusha

Nelson Mandela - AIST | Copyright © 2025
