How Can Machine Learning Aid the Hearing Impaired?

Julia (technojules) · Jan 3, 2023

Introduction

According to the World Health Organization, about 5% of the world’s population (a whopping 430 million people) have disabling hearing loss. With so many people affected, a range of solutions exists to aid the hearing-impaired community. The most common is the hearing aid. Hearing aids can be analog or digital (digital aids convert sound into digital signals, and their microphones can selectively pick up certain sounds), and they come in a variety of sizes, shapes, and prices.

What if machine learning could also serve as an alternative way to identify, detect, and classify audio? Let’s find out.

Audio Classification: A Task in Machine Learning

The task of audio classification is to learn the distinguishing features of each audio category so that, when given a new audio clip, a model can accurately determine which category the clip belongs to. Classes might include a car horn, a bird chirp, and so on. This mirrors how babies learn to identify the sounds around them: by hearing many types of sounds many times, they gradually learn to recognize each one and attach a name to it.

The Objective

The goal of developing a machine learning model is to train it to learn in a way similar to how humans do. In this audio classification case, that means recognizing the significant features of an audio clip by being fed many different clips as input during training on sound datasets. Generally, the larger and more varied the dataset, the more accurate the model will be at classifying a sound (the output). And because machines can sift through thousands of audio samples far faster than humans can, a trained model can also classify audio much faster.

Step 1: The Dataset

Before training a model on an audio dataset, the first step is to download the dataset and load it into an environment. For example, you could load the popular UrbanSound8K dataset, which consists of 10 categories and over 8,000 .wav files, into a Google Colab notebook using Python. The next step is to clean the data: removing audio files that do not fit any category (and are therefore useless to the model’s learning), reassigning audio files that were placed in the wrong class, and so on.
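As a rough illustration, here is how that first loading step might look in a Colab notebook, assuming the dataset has already been downloaded (e.g., from Kaggle) and unpacked to a folder; the path below is hypothetical, so adjust it for your setup:

```python
import pandas as pd
import librosa

DATA_DIR = "/content/urbansound8k"  # hypothetical download location

# UrbanSound8K ships with a metadata CSV mapping each .wav file to its
# fold and class label (10 classes such as car_horn and dog_bark).
meta = pd.read_csv(f"{DATA_DIR}/metadata/UrbanSound8K.csv")
print(meta["class"].value_counts())  # quick sanity check on the categories

# Load a single clip to confirm the audio files are readable.
row = meta.iloc[0]
path = f"{DATA_DIR}/audio/fold{row.fold}/{row.slice_file_name}"
signal, sr = librosa.load(path, sr=22050)  # resamples to 22,050 Hz
print(signal.shape, sr)
```

The metadata CSV is also handy for the cleaning step, since it lets you inspect and re-label clips without touching the audio files themselves.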

Next, the dataset needs to be split into a training set and a validation/test set: the model is first trained on the larger portion, then evaluated on the smaller held-out portion to measure its classification accuracy. One of the most common splits is 80–20: 80% of the data is used for training, and 20% is used for testing the model. A minimal sketch of this split is shown below.
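This sketch uses scikit-learn’s train_test_split; the feature array X and label array y here are random placeholders standing in for real extracted features:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features/labels standing in for real spectrogram data:
# UrbanSound8K has 8,732 clips across 10 classes.
X = np.random.rand(8732, 64, 173)
y = np.random.randint(0, 10, size=8732)

# 80% for training, 20% held out for testing; stratifying keeps the
# class proportions similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(len(X_train), len(X_test))
```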

Step 2: The Model

There are a variety of ways to transform audio and train a model on it, but the most common and accurate approach is to first convert the audio into a 2D, image-like representation of its frequency content over time, such as a spectrogram, and then feed that representation into a machine learning architecture (a CNN, LSTM, RNN, etc.) that extracts feature maps and identifies the clip’s class. The most common choice is a 2D Convolutional Neural Network (CNN), which is known to robustly classify images and other 2D representations, including spectrograms. The model is then trained for a certain number of epochs so it can learn audio features from the training set and predict the correct category of each clip in the test set. The output of each epoch usually consists of a training loss and accuracy and a validation/testing loss and accuracy.
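Below is a minimal sketch of that pipeline, assuming librosa for the spectrogram and Keras/TensorFlow for the CNN; the layer sizes, input shape, and variable names are illustrative rather than a definitive recipe:

```python
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, models

def to_mel_spectrogram(signal, sr=22050, n_mels=64):
    """Convert a 1-D waveform into a 2-D log-scaled mel spectrogram."""
    mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

# A small 2D CNN: stacked convolution + pooling layers extract feature
# maps from the spectrogram "image", and the final dense layer outputs
# one probability per class.
model = models.Sequential([
    layers.Input(shape=(64, 173, 1)),        # (mel bands, time frames, channels)
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # 10 UrbanSound8K classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Each epoch reports training loss/accuracy and validation loss/accuracy.
# X_train/X_test would hold spectrograms shaped (n, 64, 173, 1):
# history = model.fit(X_train, y_train, epochs=30,
#                     validation_data=(X_test, y_test))
```

The 173 time frames correspond roughly to a 4-second clip at 22,050 Hz with librosa’s default hop length; real clips would need to be padded or trimmed to a fixed length before batching.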

From there, you can keep optimizing and re-training the model to push its classification validation accuracy as high as possible.
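One simple optimization, for example, is early stopping: halting training once the validation accuracy stops improving and keeping the best-performing weights. A Keras sketch, assuming the model and data from the previous snippet:

```python
from tensorflow.keras import callbacks

# Stop once validation accuracy has not improved for 5 epochs, and
# roll back to the weights from the best epoch rather than the last.
early_stop = callbacks.EarlyStopping(
    monitor="val_accuracy", patience=5, restore_best_weights=True
)

# history = model.fit(X_train, y_train, epochs=100,
#                     validation_data=(X_test, y_test),
#                     callbacks=[early_stop])
```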

Conclusion

Though machine learning is still a young field, people continue to develop accurate, accessible, and robust models and technologies that can solve problems in many areas. Right now, machine learning as a hearing aid alternative may seem far off, but with enough time and experimentation, we could potentially use ML as a real solution for the hearing impaired.

Cited Sources

Analog versus digital hearing aids. Hearing And Audiology. (2022, March 14). Retrieved July 2, 2022, from https://www.hearingandaudiology.com.au/blog/analog-hearing-aids-vs-digital-hearing-aids/

Gorgolewski, C. (2020, February 4). UrbanSound8K. Kaggle. Retrieved July 2, 2022, from https://www.kaggle.com/chrisfilo/urbansound8k

Bonner, A. (2019, June 1). The complete beginner’s guide to deep learning: Convolutional Neural Networks. Medium. Retrieved July 1, 2022, from https://towardsdatascience.com/wtf-is-image-classification-8e78a8235acb

Nandi, P. (2021, December 10). CNNs for Audio Classification. Medium. Retrieved July 1, 2022, from https://towardsdatascience.com/cnns-for-audio-classification-6244954665ab
