Effect of Machine Learning Techniques for Efficient Classification of EMG Patterns in Gait Disorders

ABSTRACT - Gait disorders are very common in neurodegenerative diseases, and differentiating among disorders with similar kinematic patterns is a very challenging task. Muscle activity is responsible for the creation of kinematic patterns. Hence, one way to address this problem is to analyse muscle activation patterns to identify gait disorders. In this paper, we investigate the possibility of identifying gait disorders from EMG patterns with the help of various machine learning algorithms. The study included 25 healthy persons (13 male and 12 female, around 28 years of age) and 21 persons with gait disorders (11 male and 10 female, around 67 years of age). Four different machine learning algorithms were used to classify EMG patterns as belonging to healthy or unhealthy persons. The results obtained were used to distinguish between patients with gait disorders and healthy persons. Our results show that the Recurrent Neural Network achieved the best accuracy among the machine learning algorithms compared, with 91.3% in the case of two classes and 86.95% in the case of three classes.


░ 1. INTRODUCTION
Gait disorders are a common feature of many neurological conditions [1]. They pose a serious health problem because they are a frequent cause of falls, with subsequent morbidity and mortality. They also reduce patients' quality of life by impairing their ability to carry out everyday activities and to take part in normal daily living [1,2]. In addition, they generate considerably higher costs in the healthcare field [3,4,5].
A wide range of pathological entities affecting different parts of the nervous system can underlie gait disorders. The involvement of the various regions of the nervous system is assessed with different neurological tests. However, because of phenotypical similarities, the recognition of neurological gait disorders is not always correct, even for expert clinicians. Accurate recognition of the etiopathology underlying a gait disorder is essential for diagnostic workup, treatment, and prognosis. Improving the accuracy of recognition would enhance the quality of care and could reduce diagnostic and therapeutic costs. To examine gait characteristics, surface electromyography (EMG) and kinematic recordings have been used in many studies in recent years. EMG is particularly useful because it is close to the neuronal control of normal and pathological walking [6,7,8].
In this research, our goal is to test the hypothesis that muscle activation patterns during walking contain enough information to recognize gait disorders, with accuracy high enough to improve on clinical evaluation, which has been shown to reach between 50 and 80% accuracy [9,10]. Because EMG data are high-dimensional and complex, and because it is uncertain which features distinguish healthy from pathological gait, we rely on an automatic approach using classifiers. We investigate whether learning techniques applied to EMG signals are able to:
(1) efficiently distinguish patients from healthy people;
(2) efficiently distinguish the two classes of gait disorder, that is, ataxic and hypokinetic gait, from the healthy class and from each other.
The main contributions of our paper are: (1) We introduce a framework based on machine learning concepts to identify gait disorders. (2) The framework is based on EMG signals. (3) We report results for several machine learning techniques. (4) We are able to efficiently distinguish gait patients from healthy people. (5) We are able to efficiently distinguish the two classes of gait disorder, that is, ataxic and hypokinetic gait, from the healthy class and from each other.
The rest of the paper is organized as follows: Section 2 gives a detailed explanation of the experiment, Section 3 presents and discusses the results, and Section 4 concludes the paper.

Participants
Patients with neurological diseases were included in the research. The inclusion criteria were an age above 18 years and either a neurological gait disorder, categorized as ataxic gait or hypokinetic gait, or, for healthy participants, the absence of any gait disorder. Exclusion criteria were severe orthopaedic conditions, inability to walk freely, neuropsychiatric or other conditions interfering with participation, and pregnancy. Healthy participants were recruited from the general public.

Experimental Methods
Before the main experiment, all participants performed a test trial. They were then instructed to walk at a normal pace and afterwards asked to perform tandem gait. They were recorded during the gait tasks. We want to investigate two categories of gait disorder, hypokinetic and ataxic gait, and all tasks were chosen according to the requirements of these categories.
For the experiments, bipolar electrodes were attached over the following leg muscles of each participant: rectus femoris, vastus medialis, tibialis anterior, biceps femoris, and gastrocnemius lateralis. In addition, accelerometers were attached to each foot to obtain the information used to identify the gait cycle. The data were recorded by the system with the help of wireless accelerometers. Participants were asked to stand up and sit down, then walk about 10 meters, and then walk and sit again. This procedure was repeated 10 to 12 times to obtain more data. Only the EMG data from periods of free walking were kept. The electrodes were placed in the same positions for all participants.

Data Pre-Processing
All accelerometer and EMG data were exported to MATLAB, and the analysis was done in Python. Only data from periods in which participants were walking were kept; data from standing, sitting, and turning were discarded. For preprocessing, we follow the procedure of Fricke et al. [11]. A digital high-pass filter removes noise and DC components that arise at the electrode-skin interface and from movement. A low-pass filter is then used, together with full-wave rectification, to remove high-frequency artifacts.
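The preprocessing chain described above can be sketched as follows. This is a minimal illustration only: the filter type (simple first-order filters), the cutoff frequencies, the ordering of rectification between the two filters, and all function names are assumptions made for the example, since the text does not specify them; the procedure of Fricke et al. [11] may differ in these details.

```python
import math


def high_pass(samples, cutoff_hz, fs_hz):
    """First-order high-pass filter; removes DC offset and slow drift."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / fs_hz)
    out = [0.0] * len(samples)
    for i in range(1, len(samples)):
        out[i] = alpha * (out[i - 1] + samples[i] - samples[i - 1])
    return out


def low_pass(samples, cutoff_hz, fs_hz):
    """First-order low-pass filter; smooths high-frequency content."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = (1.0 / fs_hz) / (rc + 1.0 / fs_hz)
    out = [samples[0]]
    for value in samples[1:]:
        out.append(out[-1] + alpha * (value - out[-1]))
    return out


def emg_envelope(raw, fs_hz, hp_cutoff_hz=20.0, lp_cutoff_hz=10.0):
    """High-pass, full-wave rectify, then low-pass.

    The default cutoffs are assumed values, not taken from the paper.
    """
    centred = high_pass(raw, hp_cutoff_hz, fs_hz)
    rectified = [abs(v) for v in centred]  # full-wave rectification
    return low_pass(rectified, lp_cutoff_hz, fs_hz)
```

In practice a higher-order zero-phase filter from a signal-processing library would normally replace the first-order filters used here; they are kept only so that the sketch has no external dependencies.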
The accelerometer signals were then processed so that they represent the gait cycle clearly. The gait cycle was identified from the recordings of the accelerometers attached to both feet: each time a foot hit the ground during walking, the corresponding event was marked in the signal of the accelerometer on that foot.
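A minimal sketch of how foot-strike events from one accelerometer could be used to cut the EMG signal into gait cycles is given below. The threshold-crossing rule, the minimum gap between strikes, the optional resampling of each cycle to a fixed length, and all function names are assumptions for illustration; the text does not state how the events were detected.

```python
def detect_heel_strikes(accel, threshold, min_gap):
    """Return sample indices where accel rises through threshold.

    Crossings closer than min_gap samples to the previous strike are ignored.
    """
    strikes = []
    for i in range(1, len(accel)):
        crossed = accel[i - 1] < threshold <= accel[i]
        if crossed and (not strikes or i - strikes[-1] >= min_gap):
            strikes.append(i)
    return strikes


def split_gait_cycles(emg, strikes):
    """Cut the EMG signal into cycles between consecutive strikes of one foot."""
    return [emg[start:end] for start, end in zip(strikes, strikes[1:])]


def resample(cycle, n_points):
    """Linearly interpolate one cycle to n_points samples (n_points >= 2)."""
    if n_points < 2 or not cycle:
        raise ValueError("need a non-empty cycle and n_points >= 2")
    last = len(cycle) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(cycle[lo] * (1.0 - frac) + cycle[hi] * frac)
    return out
```

Resampling every cycle to the same number of points is one common way to give a classifier inputs of equal length; whether it was done in this study is not stated.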

Classification Techniques
We applied four supervised machine learning techniques to automatically identify gait disorders from features extracted with the proposed method. The techniques described in this framework are Recurrent Neural Networks (RNN), logistic regression, and Support Vector Machines (SVM). They are used to classify (1) gait disorder vs. healthy and (2) ataxic gait vs. hypokinetic gait vs. healthy.

Recurrent Neural Networks
An RNN is a type of deep learning model with wide applications in text and speech recognition. The main strength of the RNN classifier is its ability to learn further features from the extracted features [12]. The extracted features from the different tasks are fed to the RNN as input. The spatial layer of the RNN receives the input and filters it with different weights; the filter is moved horizontally and vertically across the layer. Various feature maps are obtained depending on the number of layers, which affects the classification accuracy. Before training the RNN, the data were first converted into the time-frequency domain with a continuous wavelet transform to improve accuracy. Past research has observed that the time-frequency representation yields more robust results.
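The conversion to the time-frequency domain can be illustrated with a small continuous wavelet transform. The sketch below assumes a complex Morlet wavelet with centre parameter w0 = 6 and a direct (slow) convolution; the wavelet family, the scales, and the function names are assumptions for the example, as the text does not name them.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet wavelet at time t (small correction term omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def scale_for_frequency(freq_hz, w0=6.0):
    """Approximate Morlet scale whose centre frequency is freq_hz."""
    return w0 / (2.0 * math.pi * freq_hz)


def cwt_morlet(signal, scales, dt, w0=6.0):
    """Return |CWT| as one row per scale and one column per sample."""
    n = len(signal)
    scalogram = []
    for s in scales:
        half = int(math.ceil(4.0 * s / dt))  # truncate wavelet at +-4 scaled units
        # Sampled, conjugated wavelet with the usual 1/sqrt(s) normalisation.
        kernel = [morlet(k * dt / s, w0).conjugate() * dt / math.sqrt(s)
                  for k in range(-half, half + 1)]
        row = []
        for i in range(n):
            acc = 0j
            for k in range(-half, half + 1):
                j = i + k
                if 0 <= j < n:  # samples outside the signal count as zero
                    acc += signal[j] * kernel[k + half]
            row.append(abs(acc))
        scalogram.append(row)
    return scalogram
```

Each row of the result shows how strongly the signal matches one frequency band over time, so the whole array can be treated as a time-frequency image of a gait cycle. A wavelet library would be the usual choice for real data; the direct loop here is only meant to make the definition explicit.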
One of the major problems in applying an RNN to a dataset is choosing a suitable number of layers, weights, and filters to obtain good results while avoiding underfitting and overfitting, because these factors play an important role in the value of the loss function. Transfer learning can help to overcome this problem.
Past research has shown that transfer learning works well even if the two datasets come from different domains [13]. Transfer learning performs well with small datasets and is therefore also very useful in medical imaging, where datasets are often small. Because our dataset is small, we also applied transfer learning in our research.

Logistic Regression and Support Vector Machine as Classification Technique
Logistic regression and SVM are classifiers, linear in their basic form and, in the case of SVM, non-linear when kernels are used, that operate on high-dimensional feature representations of the data and assign a class to every data point in the feature space. The support vector machine is a supervised method applied in many areas, such as facial expression recognition [14], bioinformatics [16], and face detection [15]. The goal of an SVM is to construct a separating hyperplane in a multidimensional space. The hyperplane is used to separate the dataset into two or more classes, using the kernel trick where the classes are not linearly separable [17]. It is chosen so as to maximize the distance between the hyperplane and the nearest data points, which gives the maximum-margin hyperplane. Logistic regression is another supervised machine learning technique. It is a linear classifier: it models the probability of each class as a logistic function of a weighted sum of the features, with the weights fitted during training, and it requires no prior knowledge of how the data points are distributed. Logistic regression assigns each data point in the dataset to the class with the highest predicted probability. The accuracy of the results depends on how the data points are scattered in the dataset.
To perform logistic regression and SVM, some features have to be extracted from the given dataset. We have followed the technique adopted by Alizadeh et al. [18].

International Journal of Electrical and Electronics Research (IJEER) Open Access | Rapid and quality publishing Research Article | Volume 10, Issue 2 | Pages 117-121| e-ISSN: 2347-470X
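As an illustration of how a logistic regression classifier is fitted to extracted features, a minimal sketch using batch gradient descent is given below. The feature vectors, learning rate, number of epochs, and function names are hypothetical; the feature set of Alizadeh et al. [18] is not reproduced here, and a library implementation would normally be used instead.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(features, labels, learning_rate=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log-loss.

    features: list of equal-length feature vectors; labels: 0 or 1.
    """
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y  # gradient of the log-loss w.r.t. z
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0
```

For example, with one hypothetical feature per gait cycle and labels 0 ("healthy") and 1 ("patient"), `train_logistic_regression([[0.1], [0.2], [0.8], [0.9]], [0, 0, 1, 1])` returns a weight and bias that can be passed to `predict` for new cycles.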

Training Classifiers
After pre-processing, we divided the data into validation and training sets. The validation set contains the EMG gait cycle information of 17 of the 46 subjects, the training set contains the data of 23 participants, and the data of 3 participants were left out of both the training and validation sets. After training all the classifiers, the remaining data were used to test the framework. The parameters for logistic regression and SVM were optimized while the remaining data were left out for testing. The setup and parameters were kept identical for the RNN. This process was repeated for all 46 subjects. In one investigation, the excluded subject was assigned to one of two classes ("patient" or "healthy"), and in another investigation to one of three classes ("ataxic", "hypokinetic", or "healthy"). Classification was performed for every gait cycle, and the classification outcome for a subject was the class predicted most frequently across all of that subject's gait cycles. Importantly, because of the k-fold cross-validation method, the classifier was trained without using any trials of the subject to be identified. The cross-validation process was repeated with every participant as the excluded subject. Finally, we assessed the classification outcome by computing sensitivity, specificity, and accuracy for the two-class problem.
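The evaluation scheme described above (exclude one subject, classify each of that subject's gait cycles, take the most frequent class, then compute the metrics) can be sketched as follows. The function names are hypothetical, and the metric definitions are the standard ones (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), accuracy = correct / total); they are assumed here because the corresponding equations do not appear in the text.

```python
from collections import Counter


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, training_subjects) for every subject."""
    unique = sorted(set(subject_ids))
    for held_out in unique:
        yield held_out, [s for s in unique if s != held_out]


def majority_vote(cycle_predictions):
    """Subject-level class: the label predicted for most gait cycles.

    Ties go to the label that appears first in cycle_predictions.
    """
    return Counter(cycle_predictions).most_common(1)[0][0]


def binary_metrics(true_labels, predicted_labels, positive="patient"):
    """Return (sensitivity, specificity, accuracy) for two-class labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return sensitivity, specificity, accuracy
```

Splitting by subject rather than by gait cycle is what guarantees that no cycle of the tested subject is seen during training.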

░ 3. RESULTS
We applied RNN, logistic regression, and SVM to our EMG data to recognize pathological gait disorders. The RNN classified the EMG data with a sensitivity, specificity, and accuracy of 92%, 90.47%, and 91.3%, respectively (Table 1). In this research, the RNN performed best within the present framework for classifying EMG gait cycles using the preprocessing and machine learning techniques described. The RNN also achieved higher specificity, sensitivity, and accuracy than the other methods in both the 3-class and the 2-class problem.
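As a plausibility check, the reported percentages can be compared with whole-subject counts. In the sketch below the group sizes (25 and 21) are taken from the abstract, whereas the numbers of correctly classified subjects (23, 19, and 40) are assumptions inferred from the percentages; they are not figures stated in the text.

```python
def percent(part, whole):
    """Return part / whole as a percentage."""
    return 100.0 * part / whole


GROUP_SIZES = (25, 21)      # group sizes stated in the abstract
ASSUMED_CORRECT = (23, 19)  # assumption: inferred from the percentages, not reported

# Per-group rates and overall two-class accuracy under the assumed counts.
per_group = [percent(c, n) for c, n in zip(ASSUMED_CORRECT, GROUP_SIZES)]
two_class_accuracy = percent(sum(ASSUMED_CORRECT), sum(GROUP_SIZES))

# Assumption: 40 of 46 subjects correct in the three-class problem.
three_class_accuracy = percent(40, 46)

print([round(p, 2) for p in per_group],
      round(two_class_accuracy, 2),
      round(three_class_accuracy, 2))
```

Under these assumed counts the rounded values are 92.0, 90.48, 91.3, and 86.96, which agree with the reported 92%, 90.47%, 91.3%, and 86.95% if the latter were truncated rather than rounded.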

░ 4. CONCLUSION AND FUTURE WORKS
We have investigated the performance of various machine learning techniques for identifying gait disorders using EMG. We observe that the RNN performs best among the methods used in this research, while the performance of logistic regression and the support vector machine is markedly lower. The results were consistent, with an accuracy of over 60% throughout the research, which compares favourably with the clinical accuracy of about 50-80% reported in practice. We also found that the RNN is a candidate for the automatic classification of gait disorders, with accuracies of 80-90%. To further enhance the technique, a large normative dataset is needed that contains subjects with different and more precisely specified gait disorders, possibly at different stages of their illness and with different degrees of symptom load, which could be used to train classifiers and neural networks more precisely. Such classifiers would be expected to identify disorders at an etiopathological level rather than at the level of phenotypic gait disorders, as examined here, although the fact that different pathologies may contribute to one gait pattern remains a major obstacle that may be difficult to assess. In the future, this work may find use in neuroscience and robotics, for example in identifying weak muscles, correcting improper robot walking, and selecting suitable shoes for athletes. We plan to apply further deep learning approaches to improve our results.