Data Science project. ML algorithms to detect voice disorders.

desininja/voice-disorder

voice-disorder

This project investigates and compares the performance of several machine learning techniques for voice pathology detection. Results are evaluated in terms of accuracy, sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve.
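As a rough illustration of the four evaluation measures, here is a minimal sketch in plain Python. It is not the code used in this repository: the function names `binary_metrics` and `roc_auc`, the label convention (1 = pathological, 0 = healthy), the 0.5 threshold and the toy data are all assumptions made for the example. It also assumes both classes are present, otherwise the ratios are undefined.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity (assumed labels: 1 = pathological, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # pathological, detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # pathological, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive sample scores higher than a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Sensitivity and specificity are reported separately because, on an imbalanced dataset, accuracy alone can look good even when one class is rarely detected.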

The use of mobile devices for data transmission and for disease control and monitoring has attracted considerable attention from the research and business communities, because these devices offer numerous opportunities to build efficient mobile health (m-health) systems. Such solutions let patients and doctors access medical records, clinical audio-visual notes and drug information anywhere and at any time from a tablet or smartphone, and monitor a range of conditions. M-health solutions can also support the detection and prevention of specific diseases, decision making, and the management of chronic conditions and emergencies, improving the quality of patient care and reducing healthcare costs.

Dysphonia is a disorder in which voice quality, pitch and loudness are altered. About 10% of the population suffers from it, mainly because of unhealthy social habits and voice abuse. Unfortunately, many people with voice disorders do not seek treatment, so m-health systems could be an efficient support for the screening and diagnosis of voice disorders. This work discusses the application of machine learning algorithms and feature selection methods that discriminate between pathological and healthy voices with improved accuracy.

The file "code for data extraction" extracts the data from different files and sources that are available on the internet.

The file "voice disorder algorithm" contains the code for the machine learning models.

In it, I applied different ML algorithms to get the maximum accuracy on the dataset. However, I did not create a hybrid model, so this project is open to suggestions for building one to improve performance.
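One common reading of "hybrid model" is an ensemble that combines the predictions of several trained classifiers. The sketch below shows the simplest variant, a majority vote, in plain Python. It is only a suggestion of where such a model could start, not something implemented in this repository: the function `majority_vote`, the three model names and their predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    for votes in zip(*predictions_per_model):  # one tuple of votes per sample
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three already-trained classifiers on five recordings
# (assumed labels: 1 = pathological, 0 = healthy); values invented for illustration.
model_a_pred = [1, 0, 1, 1, 0]
model_b_pred = [1, 1, 0, 1, 0]
model_c_pred = [0, 0, 1, 1, 1]

ensemble_pred = majority_vote([model_a_pred, model_b_pred, model_c_pred])
print(ensemble_pred)
```

Using an odd number of models avoids ties for a two-class problem. Weighted voting or stacking are possible refinements if some models are clearly more reliable than others.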

The target is also imbalanced, so I used SMOTE (Synthetic Minority Over-sampling Technique) to overcome the imbalance.
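To show the idea behind SMOTE, here is a simplified sketch in plain Python: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is an illustration under assumptions, not the code from this repository. The function `smote_sketch`, its parameters and the toy feature vectors are hypothetical, and in practice a library implementation such as `SMOTE` from imbalanced-learn would normally be used instead.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        others = [x for x in minority if x is not base]
        neighbours = sorted(others, key=lambda x: math.dist(base, x))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy 2-D feature vectors, invented for illustration only.
minority_class = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.3, 0.25]]
majority_class = [[1.0 + 0.1 * i, 2.0 - 0.05 * i] for i in range(10)]

# Oversample the smaller class until both classes have the same size.
n_missing = len(majority_class) - len(minority_class)
balanced_minority = minority_class + smote_sketch(minority_class, n_missing)
print(len(balanced_minority), len(majority_class))
```

Oversampling should be applied to the training split only, so that synthetic samples never leak into the data used for evaluation.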

Thank you for reading. I am all ears for any suggestions to make this model better.