Browsing by Author "Myagila, Kasian"

Efficient spatio-temporal modeling for sign language recognition using CNN and RNN architectures
(Frontiers, 2025-08-25) Myagila, Kasian; Nyambo, Devotha; Dida, Mussa
Computer vision has been identified as one of the solutions for bridging communication barriers between speech-impaired populations and those without impairment, as most people are unaware of the sign language used by speech-impaired individuals. Numerous studies have been conducted to address this challenge. However, recognizing word signs, which are usually dynamic and involve more than one frame per sign, remains a challenge. This study used Tanzania Sign Language datasets collected using mobile phone selfie cameras to investigate the performance of deep learning algorithms that capture the spatial and temporal features of video frames. The study used CNN-LSTM and CNN-GRU architectures, and proposes a CNN-GRU with an ELU activation function to enhance learning efficiency and performance. The findings indicate that the proposed CNN-GRU model with ELU activation achieved an accuracy of 94%, compared to 93% for both the standard CNN-GRU model and the CNN-LSTM model. In addition, the study evaluated the performance of the proposed model in a signer-independent setting, where the results varied significantly across individual signers, with the highest accuracy reaching 66%. These results show that more effort is required to improve signer-independent performance, including addressing the challenge of hand dominance by optimizing spatial features.
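As a rough illustration of what "GRU with an ELU activation" can mean, the sketch below defines ELU and a single GRU time step in plain Python. The abstract does not say where ELU is applied, so using it for the candidate state (in place of the usual tanh) is an assumption, as are the names `gru_step` and `params` and the bias-free weight layout. It is not the authors' implementation.

```python
import math

def elu(x, alpha=1.0):
    """ELU: identity for positive inputs, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gru_step(x, h, params, candidate_act=elu):
    """One GRU time step without biases (hypothetical layout).

    x: input feature vector, h: previous hidden state.
    params: (Wz, Uz, Wr, Ur, Wh, Uh) weight matrices.
    candidate_act: activation for the candidate state; ELU here is an
    assumption, the conventional choice is tanh.
    """
    Wz, Uz, Wr, Ur, Wh, Uh = params
    # Update gate and reset gate.
    z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h))]
    # Candidate state computed from the input and the reset-scaled history.
    rh = [ri * hi for ri, hi in zip(r, h)]
    h_cand = [candidate_act(a + b)
              for a, b in zip(matvec(Wh, x), matvec(Uh, rh))]
    # Blend the previous state and the candidate using the update gate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_cand)]
```

In a CNN-GRU pipeline, `x` would be the per-frame feature vector produced by the CNN, and `gru_step` would be applied once per video frame, carrying `h` forward.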
Two stream GRU model with ELU activation function for sign language recognition
(Elsevier, 2025-04-05) Myagila, Kasian; Nyambo, Devotha; Dida, Mussa
Pose estimation features have been successfully used in human activity recognition, including sign language recognition. One of the key challenges in sign language recognition is handling signer-independent modes and signer hand dominance. This paper proposes the use of the Gated Recurrent Unit (GRU) with the ELU activation function to improve computational efficiency and to enhance model learning efficiency. In addition, the paper proposes a two-stream model architecture to address the challenge of left- and right-hand dominance. The study developed the model using Tanzania Sign Language datasets collected using mobile devices and extracted pose estimation features using the MediaPipe Holistic framework. According to the results, the proposed model not only achieves an overall accuracy of 95%, but also trains more efficiently than comparable algorithms. In the signer-independent mode in particular, the two-stream approach led to substantial improvements, achieving a maximum accuracy of 92% and a minimum accuracy of 70%, with the accuracy for the left-handed signer increasing by 37%. The results highlight the effectiveness of the two-stream approach in overcoming challenges related to left- and right-hand dominance, which often arise from signer-specific hand dominance. Additionally, the results indicate that the proposed model can be beneficial where computational resources are limited while also enhancing the model’s overall performance.
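The abstract does not describe how the two streams are built. One plausible reading, sketched below purely as an assumption, is that one stream sees the pose sequence as recorded and the other sees a horizontally mirrored copy, so that a left-handed signer looks right-handed to one of the streams. The frame layout, `mirror_frame`, `two_stream_scores`, and the score-averaging rule are all hypothetical; the only detail taken from MediaPipe is that landmark x and y coordinates are normalized to the range [0, 1].

```python
def mirror_frame(frame):
    """Mirror one frame of hand landmarks horizontally.

    frame: dict with 'left_hand' and 'right_hand', each a list of (x, y)
    pairs with x and y normalized to [0, 1] (hypothetical layout).
    Mirroring flips x to 1 - x and swaps the two hands.
    """
    def flip(points):
        return [(1.0 - x, y) for x, y in points]
    return {
        "left_hand": flip(frame["right_hand"]),
        "right_hand": flip(frame["left_hand"]),
    }

def two_stream_scores(sequence, score_fn):
    """Average class scores from the original and the mirrored sequence.

    sequence: list of frames; score_fn: any callable mapping a sequence
    to a list of per-class scores (for example, a trained GRU model).
    Averaging is an illustrative fusion rule, not the paper's.
    """
    original = score_fn(sequence)
    mirrored = score_fn([mirror_frame(f) for f in sequence])
    return [(a + b) / 2.0 for a, b in zip(original, mirrored)]
```

Here `score_fn` stands in for whatever sequence classifier is used; both streams could equally use separate models whose outputs are concatenated before a final layer.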
Two stream GRU model with ELU activation function for sign language recognition
(Elsevier, 2025-04-05) Nyambo, Devotha; Myagila, Kasian; Dida, Mussa
Pose estimation features have been successfully used in human activity recognition, including sign language recognition. One of the key challenges in sign language recognition is handling signer-independent modes and signer hand dominance. This paper proposes the use of the Gated Recurrent Unit (GRU) with the ELU activation function to improve computational efficiency and to enhance model learning efficiency. In addition, the paper proposes a two-stream model architecture to address the challenge of left- and right-hand dominance. The study developed the model using Tanzania Sign Language datasets collected using mobile devices and extracted pose estimation features using the MediaPipe Holistic framework. According to the results, the proposed model not only achieves an overall accuracy of 95%, but also trains more efficiently than comparable algorithms. In the signer-independent mode in particular, the two-stream approach led to substantial improvements, achieving a maximum accuracy of 92% and a minimum accuracy of 70%, with the accuracy for the left-handed signer increasing by 37%. The results highlight the effectiveness of the two-stream approach in overcoming challenges related to left- and right-hand dominance, which often arise from signer-specific hand dominance. Additionally, the results indicate that the proposed model can be beneficial where computational resources are limited while also enhancing the model’s overall performance.