Painless Prognosis of Myasthenia Gravis using Machine Learningcs229.stanford.edu/proj2018/report/166.pdf · 2019. 1. 6. · Myasthenia gravis (MG) is a neuromuscular disorder that

Painless Prognosis of Myasthenia Gravis using Machine Learning

Abhishek Tapadar (atapadar); Asherin George Anto George (asherin)

13 December 2018

Project category: Life-Sciences

1 Abstract

We implement and compare machine learning algorithms to predict, with high confidence, the presence of Myasthenia Gravis, a chronic condition that affects about 200,000 people in the US alone. This could help eliminate the need for a painful and expensive Single-Fiber Electromyography (EMG) test and could potentially allow diagnosis with a single anti-acetylcholine receptor (AChR) antibody (Ab) test. We trained our algorithms on 22 co-factors/features commonly found with Myasthenia Gravis, and they can also predict the probability of being afflicted with Myasthenia Gravis given a patient history and a questionnaire.

2 Introduction to Myasthenia Gravis

Myasthenia gravis (MG) is a neuromuscular disorder that causes weakness in the skeletal muscles, which are the muscles the body uses for movement. It occurs when communication between nerve cells and muscles becomes impaired. This impairment prevents crucial muscle contractions from occurring, resulting in muscle weakness.

Figure 1a illustrates a typical setup of a Single-Fiber Electromyography (EMG) test. Electromyography (EMG) measures muscle response or electrical activity in response to a nerve's stimulation of the muscle. The test is used to help detect neuromuscular abnormalities. An example electrical signal output from the repeated stimulation is shown in Figure 1b. Using many of these muscle stimulations, it is possible to diagnose Myasthenia Gravis.

Figure 1: (a) Single Fiber Electromyography Schematic (b) RNG Muscle Stimulation Output

The input to our algorithm is anonymous medical data containing positive and negative labels for related medical tests, patient data such as age, gender and BMI, and federally accepted metrics regarding the patient's quality of living and sleep. We then use logistic regression as our baseline model and compare the results with other algorithms such as GDA, CNN, Gradient Boosted Trees and Random Forests, with the ultimate goal of predicting whether or not a patient has Myasthenia Gravis.

We propose that, with a training set containing the 22 cofactors and the Myasthenia diagnosis in existing patients, it would be possible for the system to predict the occurrence of Myasthenia with just a simple questionnaire and the AChR and MuSK phlebotomy tests, avoiding the painful EMG test. The computational cost is mainly in the form of training, which is incurred only once upfront. Once the algorithm is trained, predictions can be produced in a fraction of the training time, which would be beneficial to the patients.

3 Related Work

The state of the art in medical diagnosis using machine learning has primarily been applied to imaging diagnostics, in X-rays and tumour cell imaging. To our knowledge, it has not been studied on classification problems like diagnoses, and specifically not with Myasthenia Gravis. We have therefore taken regression analysis, which is standard in the field of medicine, as our baseline.

4 Dataset and Features

We propose that we can train the algorithms on a data set with 22 different co-factors related to Myasthenia Gravis and can therefore predict the occurrence of Myasthenia Gravis using a combination of factors.

We thank Dr. Srikanth Muppidi (muppids) at the Stanford Neurosciences Hospital for his immense help in gathering anonymous patient records for us. We have a set of 10,056 data points with the above-mentioned factors from an NIH repository for Myasthenia Gravis. We use a 95%:5% train:development set split to train and tune our model respectively. As for the test set, we use a separate set of 199 samples of anonymized data from patients only from the Menlo Park region, California.
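The 95%:5% split described above can be sketched with scikit-learn as follows. This is only an illustration: the data here is synthetic, and the stratification and the random seed are assumptions, not settings reported in this project.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 10,056 x 22 feature matrix (the real records are not available here).
rng = np.random.default_rng(0)
X = rng.normal(size=(10056, 22))
y = rng.integers(0, 2, size=10056)

# 95% train / 5% development. Stratifying on y (an assumption) keeps the
# positive/negative ratio similar in both parts.
X_train, X_dev, y_train, y_dev = train_test_split(
    X, y, test_size=0.05, stratify=y, random_state=0
)
print(X_train.shape, X_dev.shape)
```

The separate 199-sample test set would be held out entirely and touched only once, after tuning on the development set.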

The following are the various features each data sample holds.

1. Age: Age of the candidate is a factor in diagnosing Myasthenia Gravis as it potentially affects people in advanced ages.

2. Gender: There are studies that show that more women are affected with Myasthenia Gravis than men.

3. BMI: A higher BMI may correlate to a host of health problems and may be a feature in diagnosing Myasthenia.

4. Years Diagnosed with MG: This field is 0 if a patient does not have Myasthenia Gravis; if diagnosed, it gives the number of years since the diagnosis.

5. AChR Anti-bodies: Approximately 85-90 percent of patients with myasthenia gravis (MG) express antibodies to the acetylcholine receptor (AChR).

6. MuSK Anti-bodies: Useful for diagnosis of autoimmune muscle-specific kinase (MuSK) myasthenia gravis. A second-order test to aid in the diagnosis of autoimmune myasthenia gravis when first-line serologic tests are negative.

7. Presence of MuSK Ab and AChR Ab: This is a field we have generated to account for the presence of both antibodies, which might indicate a higher incidence of Myasthenia Gravis.

8. Seronegative: Around 10-20% of myasthenia gravis (MG) patients do not have acetylcholine receptor (AChR) antibodies (seronegative), of whom some have antibodies to a membrane-linked muscle specific kinase (MuSK). This field helps examine MG severity and long-term prognosis in seronegative MG compared with seropositive MG, looking specifically at anti-AChR antibody negative and anti-MuSK antibody negative patients.

9. Thymectomy: Surgical removal of the thymus gland; may indicate a lower incidence of Myasthenia Gravis.

10. Sleep Apnea: A potentially serious sleep disorder in which breathing repeatedly stops and starts. It commonly co-occurs with Myasthenia Gravis.

11. Sleep Apnea Number: This indicates whether the patient is aware of their affliction with Sleep Apnea.

12. Non-Invasive Ventilation Support: Noninvasive ventilation (NIV) refers to the provision of ventilatory support to the lungs without the use of an endotracheal airway. It has emerged as an important tool in the treatment of acute respiratory failure. This field records whether the patient has ever been put on an NIV system.

13. NIV number: The number of times a candidate has been on NIV support.

14. MG-QOL15: The MG-QOL15 is a brief survey, completed by the patient, that is designed to assess some aspects of "quality of life".

15. ESS: The Epworth Sleepiness Scale (ESS) is a scale intended to measure daytime sleepiness by use of a very short questionnaire.

16. ESS is greater than 10: This indicates strong daytime sleepiness and therefore increased chances of being diagnosed with Myasthenia Gravis.

17. PSQI: The Pittsburgh Sleep Quality Index (PSQI) is a self-report questionnaire that assesses sleep quality over a 1-month time interval.

18. PSQI is greater than 5: This indicates poor sleep rhythms due to fatigue and therefore increased chances of being diagnosed with Myasthenia Gravis.

19. FSS: The Fatigue Severity Scale (FSS) is a method of evaluating the impact of fatigue on the patient. The FSS is a short questionnaire that requires patients to rate their level of fatigue.

20. FSS is greater than 36: This indicates increased fatigue and therefore increased chances of being diagnosed with Myasthenia Gravis.

21. MG ADL: The MG-ADL profile provides a rapid assessment of MG symptom severity; it has been validated and shown to correlate with the QMG score.

22. MG ADL bulbar subset score: A short questionnaire to assess fatigue in the bulbar and throat region.

This data was pre-processed such that all text fields were transformed to take only numerical values (for instance, the gender field took '0' for 'male' patients and '1' for 'female' patients).
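As a rough illustration of this pre-processing and of the derived fields (features 7, 16, 18 and 20), the sketch below encodes one raw record. All dictionary keys are hypothetical names chosen for this example; only the gender encoding and the thresholds 10, 5 and 36 come from the text above.

```python
def encode_record(record):
    """Turn one raw patient record (hypothetical field names) into numeric features."""
    achr = int(record["achr_positive"])
    musk = int(record["musk_positive"])
    return {
        "gender": 0 if record["gender"] == "male" else 1,  # 0 = male, 1 = female
        "achr_ab": achr,
        "musk_ab": musk,
        "both_ab": achr & musk,                       # feature 7: both antibodies present
        "ess_gt_10": int(record["ess"] > 10),         # feature 16
        "psqi_gt_5": int(record["psqi"] > 5),         # feature 18
        "fss_gt_36": int(record["fss"] > 36),         # feature 20
    }


example = {
    "gender": "female",
    "achr_positive": True,
    "musk_positive": False,
    "ess": 12,
    "psqi": 4,
    "fss": 40,
}
encoded = encode_record(example)
print(encoded)
```

Keeping the raw scores alongside the thresholded flags, as the feature list does, lets a model use either the fine-grained value or the clinical cut-off.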

5 Methods

Various algorithms were used on the training data, with Logistic Regression as the baseline model since it is one of the most widely used machine learning algorithms for classifying and predicting medical data.

5.1 Logistic Regression

We focus on the binary classification problem in which y can take on only two values, 0 and 1. The logistic model is a widely used statistical model that, in its basic form, uses a logistic function to model a binary dependent variable. Cross-entropy loss is used to measure the performance of the model. Logistic regression models p(y = 1 | x; θ) as

$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}$$

where g is the sigmoid function. Because it makes significantly weaker assumptions than a generative model such as GDA, logistic regression is more robust and less sensitive to incorrect modeling assumptions.
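A minimal NumPy sketch of this model, trained by batch gradient descent on the cross-entropy loss, is given below. The learning rate, step count and synthetic data are assumptions made for the illustration; the project itself may have used a library implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cross_entropy(theta, X, y):
    # Mean negative log-likelihood of the labels under h_theta(x) = g(theta^T x).
    p = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


def fit_logistic(X, y, lr=0.1, n_steps=500):
    theta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        grad = X.T @ (sigmoid(X @ theta) - y) / len(y)  # gradient of the loss
        theta -= lr * grad
    return theta


# Synthetic data: an intercept column plus three features (illustrative only).
rng = np.random.default_rng(0)
X = np.hstack([np.ones((200, 1)), rng.normal(size=(200, 3))])
true_theta = np.array([0.5, 2.0, -1.0, 0.0])
y = (rng.random(200) < sigmoid(X @ true_theta)).astype(float)

theta = fit_logistic(X, y)
loss_before = cross_entropy(np.zeros(4), X, y)
loss_after = cross_entropy(theta, X, y)
print(loss_before, loss_after)
```

With θ initialised to zero every prediction is 0.5, so the starting loss is log 2; training should push it below that value.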

5.2 Gaussian Discriminant Analysis

When we have a classification problem in which the input features x are continuous-valued random variables, we can use the Gaussian Discriminant Analysis (GDA) model, which models p(x|y) using a multivariate normal distribution. The GDA model makes strong modelling assumptions, and when these assumptions are correct, informally, no other algorithm performs strictly better in the limit of large training sets.
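To make the model concrete, here is a small NumPy sketch of GDA with a covariance matrix shared between the two classes, using the closed-form maximum-likelihood estimates. The shared covariance and the synthetic data are assumptions for the illustration, not details confirmed by this report.

```python
import numpy as np


def fit_gda(X, y):
    """Maximum-likelihood estimates for GDA with a shared covariance matrix."""
    phi = y.mean()                      # class prior p(y = 1)
    mu0 = X[y == 0].mean(axis=0)
    mu1 = X[y == 1].mean(axis=0)
    centered = X - np.where(y[:, None] == 1, mu1, mu0)
    sigma = centered.T @ centered / len(y)
    return phi, mu0, mu1, sigma


def predict_gda(params, X):
    phi, mu0, mu1, sigma = params
    sigma_inv = np.linalg.inv(sigma)

    def log_joint(mu, prior):
        # log p(x|y) + log p(y), dropping terms that are equal for both classes.
        d = X - mu
        return -0.5 * np.einsum("ij,jk,ik->i", d, sigma_inv, d) + np.log(prior)

    return (log_joint(mu1, phi) > log_joint(mu0, 1 - phi)).astype(int)


# Two well-separated Gaussian blobs as illustrative data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)), rng.normal(3.0, 1.0, size=(200, 2))])
y = np.repeat([0, 1], 200)

params = fit_gda(X, y)
accuracy = (predict_gda(params, X) == y).mean()
print(accuracy)
```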


5.3 Convolutional Neural Network

An advantage of using a CNN is that it often requires very little pre-processing. A 1D convolution was used across the 22 features of the input data. The following CNN architecture was used.

Layer (type)                  Output Shape      Param #
input1 (InputLayer)           (None, 22, 1)     0
conv0 (Conv1D)                (None, 8, 16)     128
bn0 (BatchNormalization)      (None, 8, 16)     64
activation1 (Activation)      (None, 8, 16)     0
max Pool0 (MaxPooling1D)      (None, 4, 16)     0
flatten1 (Flatten)            (None, 64)        0
fc (Dense)                    (None, 1)         65
activation2 (Activation)      (None, 1)         0

Total params: 257
Trainable params: 225
Non-trainable params: 32
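The shapes and parameter counts in this summary can be checked by hand. The sketch below does the arithmetic under an assumed kernel size of 7, stride of 2, no padding and a pooling window of 2. These hyperparameters are not stated in the report; they are one combination that is consistent with the table (a stride of 3 with "same" padding would also give an output length of 8).

```python
def conv1d_out_len(n_in, kernel, stride):
    # Output length of an unpadded ("valid") 1D convolution.
    return (n_in - kernel) // stride + 1


# Assumed hyperparameters, inferred from the table rather than reported.
KERNEL, STRIDE, FILTERS, POOL = 7, 2, 16, 2
N_FEATURES, IN_CHANNELS = 22, 1

conv_len = conv1d_out_len(N_FEATURES, KERNEL, STRIDE)   # sequence length after conv0
conv_params = FILTERS * (KERNEL * IN_CHANNELS + 1)      # weights plus one bias per filter
bn_params = 4 * FILTERS                                 # gamma, beta, moving mean, moving variance
bn_non_trainable = 2 * FILTERS                          # moving mean and moving variance
pool_len = conv_len // POOL                             # sequence length after max pooling
flat_size = pool_len * FILTERS                          # size of the flattened vector
dense_params = flat_size * 1 + 1                        # one output unit plus bias

total_params = conv_params + bn_params + dense_params
trainable_params = total_params - bn_non_trainable
print(conv_len, conv_params, bn_params, flat_size, dense_params, total_params, trainable_params)
```

Under these assumptions every number matches the summary, which suggests a very small network: a single 16-filter convolution feeding one sigmoid output unit.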

5.4 Random Forests

Trees have the capacity to learn highly complex data patterns, depending on the depth of the trees. Deep trees, however, tend to overfit, which increases the variance of the model. Random forests address this: they can be thought of as a method of averaging across these deep decision trees while applying feature bagging. The bootstrapping procedure reduces the variance of the final model. Moreover, if a certain feature is a strong predictor of the final response, then that feature will be selected in many of the decision trees.
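A hedged scikit-learn sketch of such a forest follows, using the tree limits reported in the discussion of results (depth 2, at most 3 features per split, at most 2 leaf nodes). The number of trees, the random seed and the synthetic data are assumptions for the illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with 22 features; feature 4 is made a strong predictor on purpose.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 22))
y = (X[:, 4] + 0.5 * rng.normal(size=500) > 0).astype(int)

forest = RandomForestClassifier(
    n_estimators=100,    # assumption: not reported in the text
    max_depth=2,
    max_features=3,      # feature bagging: 3 candidate features per split
    max_leaf_nodes=2,
    bootstrap=True,      # each tree sees a bootstrap sample of the rows
    random_state=0,
)
forest.fit(X, y)

# Count how many trees chose the strong predictor for their root split.
root_features = [est.tree_.feature[0] for est in forest.estimators_]
trees_using_strong_feature = sum(int(f == 4) for f in root_features)
print(trees_using_strong_feature, "of", len(root_features), "trees split on feature 4")
```

With at most 2 leaf nodes each tree is a single split, so the forest behaves like a vote over many one-feature rules.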

6 Results

6.1 Metrics

We define three metrics that are very useful in determining whether our model performs well. They are: 1. Precision, 2. Recall and 3. F1 score.

Metric Name    Formula

Precision      $\frac{\text{True Positives}}{\text{Total Predicted Positives}}$

Recall         $\frac{\text{True Positives}}{\text{Total Actual Positives}}$

F1 score       $2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
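These three definitions translate directly into code. The following is a small plain-Python sketch; the label vectors are made up for the example and are not project data.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = positive)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    predicted_pos = sum(1 for p in y_pred if p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```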

6.2 Results from the models

Model                  Recall    F1 score
Logistic Regression    0.72      0.837
GDA                    0.745     0.854
CNN                    0.829     0.906
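Precision is not listed in this table, but it can be recovered from recall and F1 by rearranging the F1 formula to Precision = F1 · Recall / (2 · Recall − F1). The sketch below applies that rearrangement to the reported numbers; because those numbers are rounded, the results are only approximate.

```python
def implied_precision(recall, f1):
    # Rearranged from F1 = 2 * P * R / (P + R).
    return f1 * recall / (2 * recall - f1)


# (recall, F1) pairs as reported in the table above.
reported = {
    "Logistic Regression": (0.72, 0.837),
    "GDA": (0.745, 0.854),
    "CNN": (0.829, 0.906),
}
implied = {name: implied_precision(r, f) for name, (r, f) in reported.items()}
for name, p in implied.items():
    print(f"{name}: implied precision ~ {p:.3f}")
```

Within rounding error, all three models come out with a precision close to 1, which would mean that the differences in F1 are driven almost entirely by recall.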

(a) Logistic Regression (b) GDA (c) Random Forests - PRC (d) Random Forests - ROC

6.3 Discussion of Results

From our results, we can see that the CNN model performed best, with the highest F1 score among the models reported. The GDA model performs slightly better than logistic regression. This can be expected since most of the features are scores from questionnaires filled out by the patients themselves, and hence the scores should be treated as containing some noise. Many natural processes tend to be approximately normally distributed, which may be why GDA performs better than logistic regression. Random Forests overfit the data, so to prevent overfitting the depth of each tree was limited to 2, the maximum number of features to 3 and the maximum number of leaf nodes to 2. The CNN model, however, allowed for picking the features that strongly control the prediction since it used a convolution and a max pooling layer.

7 Future Work Scope

Some input features might affect our final prediction more strongly than others. More often than not, in the medical sphere, patients' medical records are documents with very few of these features available. If we can identify and drop the features that do not strongly affect the final prediction, we could predict whether a person has myasthenia gravis simply by looking at existing medical records, and open up new doors for more tests that will help with a more certain diagnosis.
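One possible way to rank features by their effect on the prediction is permutation importance, sketched below with scikit-learn. This is a suggestion for the future work described here, not something done in this project; the model choice and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

# Synthetic data: only feature 0 carries signal, the other four are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + 0.1 * rng.normal(size=400) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Shuffle one column at a time and measure how much the accuracy drops.
# In practice this would be run on the development set rather than the training data.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
print("features ranked by importance:", ranking)
```

Features whose shuffling barely changes the score are candidates for removal when only partial medical records are available.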

It can be noticed that the input features, excluding the AChR, MuSK and seronegativity tests, are side effects and related effects seen in a person suffering from Myasthenia gravis. Thus, if we knew the age, gender and BMI of a person suffering from Myasthenia gravis along with the remaining features, we could extend our model to predict the existence of other effects such as sleep apnea, or whether a thymectomy could help lessen the symptoms.


8 Contributions

The team consisted of two members: 1. Abhishek Tapadar and 2. Asherin George Anto George. Both team members helped reach out to the professionals at the Stanford Neurosciences Hospital in order to obtain the data required. Most of the discussions and decisions regarding this project were also made together.

Abhishek Tapadar contributed by writing code and deriving results for the Logistic Regression, GDA and CNN models. He also helped in deciding upon the hyperparameters for the above models and for the gradient boosted and random forests models. In addition, he helped in putting together the report for the various intermediate milestones and the final project report.

Asherin George Anto George contributed by implementing the gradient boosted and random forests models. He also helped in hyperparameter selection for the models Abhishek Tapadar was involved in. He contributed to putting together the write-up for the project proposal, milestone, poster presentation and the final project report.

9 References

[1] Chiou-Tan, F. Y. and Gilchrist, J. M. (2015), Repetitive nerve stimulation and single-fiber electromyography in the evaluation of patients with suspected myasthenia gravis or Lambert–Eaton myasthenic syndrome: Review of recent literature. Muscle Nerve, 52: 455-462. doi:10.1002/mus.24745

[2] Rajkomar, A., Oren, E., Chen, K., Dai, A.M., Hajaj, N., Hardt, M., Liu, P.J., Liu, X., Marcus, J., Sun, M. and Sundberg, P., 2018. Scalable and accurate deep learning with electronic health records. npj Digital Medicine, 1(1), p.18.

[3] H. Yu, Towards answering biological questions with experimental evidence: automatically identifying text that summarize image content in full-text articles, in: Proceedings of the AMIA 2006 Symposium, AMIA, 2006, pp. 834–838.

[4] Python libraries that were used: NumPy, scikit-learn and Keras

10 Code Repository

Link to our code: https://drive.google.com/file/d/1PsVVrVfs-g3Yp0kjofUi_MNkyd_mS0My/view
