Abstract

The objective of this work was to provide an app that can automatically recognize hand gestures from American Sign Language (ASL) on mobile devices. The app employs a Convolutional Neural Network (CNN) based model for gesture classification. Various CNN architectures and optimization strategies suitable for devices with limited resources were examined. The InceptionV3 and VGG-19 models exhibited negligibly higher accuracy than our own model, but they also had considerably more complicated architectures. Layer Decomposition proved to be the best network optimization method, achieving the lowest inference time while maintaining classification effectiveness. Each optimization method reduced the inference time of our model at a small expense of classification accuracy. The accelerators with the shortest inference time were the GPU and the CPU in a configuration with 5 threads. A prototype of the mobile application was developed for loading the trained models and running and testing their effectiveness under different hardware configurations: [removed].
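
The abstract does not name the on-device inference framework, so the following is only an illustrative sketch assuming a TensorFlow Lite deployment in Kotlin on Android; loadModelFile, buildInterpreter, and benchmark are hypothetical helpers, not code from the paper. It shows how a prototype might load a trained model and compare the two hardware configurations mentioned above (GPU delegate vs. CPU with 5 threads).

```kotlin
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.gpu.GpuDelegate
import java.io.File
import java.io.FileInputStream
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

// Hypothetical helper: memory-map a trained .tflite model from a file path.
fun loadModelFile(path: String): MappedByteBuffer {
    FileInputStream(File(path)).use { stream ->
        val channel = stream.channel
        return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size())
    }
}

// Build an interpreter for one hardware configuration:
// useGpu = true  -> GPU delegate
// useGpu = false -> CPU with the given number of threads (e.g. 5)
fun buildInterpreter(model: MappedByteBuffer, useGpu: Boolean, cpuThreads: Int = 5): Interpreter {
    val options = Interpreter.Options()
    if (useGpu) {
        options.addDelegate(GpuDelegate())
    } else {
        options.setNumThreads(cpuThreads)
    }
    return Interpreter(model, options)
}

// Measure the average inference time (ms) over several runs for one configuration.
// `input` is a [1][H][W][3] float image tensor; `numClasses` is the number of ASL gestures.
fun benchmark(
    interpreter: Interpreter,
    input: Array<Array<Array<FloatArray>>>,
    numClasses: Int,
    runs: Int = 50
): Double {
    val output = Array(1) { FloatArray(numClasses) }
    val start = System.nanoTime()
    repeat(runs) { interpreter.run(input, output) }
    return (System.nanoTime() - start) / 1e6 / runs
}
```

Running benchmark once with a GPU-backed interpreter and once with a 5-thread CPU interpreter would reproduce the kind of per-configuration inference-time comparison described in the abstract.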

Recommended Citation

Kobiela, J., Kobiela, D. & Artemiuk, A. (2024). Sign Language Recognition Using Convolution Neural Networks. In B. Marcinkowski, A. Przybylek, A. Jarzębowicz, N. Iivari, E. Insfran, M. Lang, H. Linger, & C. Schneider (Eds.), Harnessing Opportunities: Reshaping ISD in the post-COVID-19 and Generative AI Era (ISD2024 Proceedings). Gdańsk, Poland: University of Gdańsk. ISBN: 978-83-972632-0-8. https://doi.org/10.62036/ISD.2024.96

Paper Type

Poster

DOI

10.62036/ISD.2024.96

