Electromyography-Based, Robust Hand Motion Classification Employing Temporal Multi-Channel Vision Transformers

Published in 2022 9th IEEE RAS/EMBS International Conference for Biomedical Robotics and Biomechatronics (BioRob), 2022

With the increasing use of robotic and bionic devices for the execution of complex, everyday tasks, Electromyography (EMG) based interfaces are being explored as candidate technologies for facilitating an intuitive interaction with such devices. However, EMG-based interfaces typically require appropriate features to be extracted from the raw EMG signals using a plethora of feature extraction methods to achieve excellent performance in practical applications. To select an appropriate feature set that will lead to significant EMG-based decoding performance, a deep understanding of available methods and the human musculoskeletal system is needed. To overcome this issue, researchers have proposed the use of deep learning methods for automatically extracting complex features directly from the raw EMG data. In this work, we propose Temporal Multi-Channel Vision Transformers as a deep learning technique that has the potential to achieve dexterous control of robots and bionic hands. The performance of this method is evaluated and compared with other well-known methods, employing the open-access Ninapro dataset.

... applications, such as prosthesis control using gesture classification. Despite the advances introduced by new deep learning techniques, real-time control of robot arms and hands using EMG signals as input still lacks accuracy, especially when a plethora of gestures are included as labels in the case of classification. This has been observed to be due to the noise and non-stationarity of the EMG signals and the increased dimensionality of the problem. In this paper, we propose an intuitive, affordances-oriented EMG-based telemanipulation framework for a robot arm-hand system that allows for dexterous control of the device. An external camera is utilized to perform scene understanding and object detection and recognition, providing grasping and manipulation assistance to the user and simplifying control.
Object-specific Transformer-based classifiers are employed based on the affordances of the object of interest, reducing the number of possible gesture outputs, dividing and conquering the problem, and resulting in a more robust and accurate gesture decoding system when compared to a single generic classification model. The performance of the proposed system is experimentally validated in a remote telemanipulation setting, where the user successfully performs a set of dexterous manipulation tasks.
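
The object-specific decoding idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the affordance table, the gesture names, and the stub classifier are all hypothetical placeholders, and a trained Transformer-based model would take the place of each stub. The sketch only shows how choosing a classifier per detected object shrinks the set of gestures that can be predicted.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical affordance table: detected object -> gestures that apply to it.
AFFORDANCES: Dict[str, List[str]] = {
    "mug": ["rest", "power_grasp", "handle_pinch"],
    "pen": ["rest", "tripod_pinch"],
}

Classifier = Callable[[Sequence[float]], str]


def make_stub_classifier(gestures: List[str]) -> Classifier:
    """Stand-in for a trained object-specific model (assumption).

    It treats the first len(gestures) feature values as class scores and
    returns the gesture with the largest score.
    """
    def classify(features: Sequence[float]) -> str:
        scores = list(features[: len(gestures)])
        best = max(range(len(scores)), key=scores.__getitem__)
        return gestures[best]

    return classify


# One classifier per object, each limited to that object's gesture subset.
CLASSIFIERS: Dict[str, Classifier] = {
    obj: make_stub_classifier(gestures) for obj, gestures in AFFORDANCES.items()
}


def decode_gesture(detected_object: str, features: Sequence[float]) -> str:
    """Route the EMG-derived features to the classifier of the detected object."""
    return CLASSIFIERS[detected_object](features)
```

With this routing, the same feature vector can only be decoded into gestures that the detected object affords, which is the divide-and-conquer effect the abstract describes.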

Recommended citation: R. V. Godoy et al., "Electromyography-Based, Robust Hand Motion Classification Employing Temporal Multi-Channel Vision Transformers," 2022 9th IEEE RAS/EMBS International Conference for Biomedical Robotics and Biomechatronics (BioRob), Seoul, Korea, Republic of, 2022, pp. 1-8, doi: 10.1109/BioRob52689.2022.9925307.