Environmental sound recognition
Overview
The aim of this project was to build a neural network that recognizes environmental sounds and can be used in real-time iOS applications. As a proof of concept, we detect the sound of boiling water in the kitchen.
Data
The data consists of two sets:
- Random kitchen noise in various environments before the water is boiling
- Sound of boiling water with and without noise
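As a rough sketch of how the two sets could be turned into training labels, the snippet below derives a label from the folder a recording is stored in. The folder names (`kitchen_noise`, `boiling_water`), the label names, and the helper `label_for` are assumptions made for illustration. The project's actual layout is not described here.

```python
from pathlib import PurePosixPath

# Hypothetical folder-to-label mapping; adjust to the real data layout.
LABELS = {
    "kitchen_noise": "not_boiling",
    "boiling_water": "boiling",
}


def label_for(path):
    """Return the class label implied by the recording's parent folder."""
    folder = PurePosixPath(path).parent.name
    if folder not in LABELS:
        raise ValueError(f"unexpected folder: {folder!r}")
    return LABELS[folder]


# Illustrative file names only.
paths = [
    "data/kitchen_noise/fan_01.wav",
    "data/boiling_water/kettle_03.wav",
]
labels = [label_for(p) for p in paths]
```

With Turi Create, a function like this could be applied to the `path` column returned by `tc.load_audio`, for example `data['label'] = data['path'].apply(label_for)`, and the new column used as the training target.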
Spectrogram
TempiFFT generates real-time spectrograms from the microphone input on iOS. A spectrogram is not strictly required for the detection, but it gives the app an intuitive visual display.
It uses the Fast Fourier Transform (FFT), a method for decomposing an audio signal (or any other time-based signal) into its constituent frequencies and their intensities.
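For discrete values (like sound data), the Discrete Fourier Transform (DFT) is given by

$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i k n / N}, \qquad k = 0, \dots, N-1,$$

where $x_n$ are the $N$ input samples and $X_k$ is the complex amplitude of the $k$-th frequency component.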
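To make the idea concrete, here is a minimal pure-Python sketch that evaluates the DFT sum X_k = Σ x_n · exp(-2πi·k·n/N) directly and stacks the magnitudes of successive frames into a spectrogram. It is not how TempiFFT works internally: an FFT computes the same values far more efficiently, and real spectrograms usually apply a window function to each frame, which is omitted here. The function names, frame size, hop size and test tone are assumptions chosen for the demonstration.

```python
import cmath
import math


def dft(samples):
    """Naive O(N^2) DFT: X_k = sum_n x_n * exp(-2*pi*i*k*n/N)."""
    n_samples = len(samples)
    return [
        sum(
            x * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, x in enumerate(samples)
        )
        for k in range(n_samples)
    ]


def spectrogram(signal, frame_size, hop):
    """Magnitude spectrum of each frame, non-negative frequencies only."""
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        spectrum = dft(frame)
        frames.append([abs(c) for c in spectrum[: frame_size // 2 + 1]])
    return frames


# Illustrative input: a 1 kHz sine sampled at 8 kHz.
sample_rate = 8000
frame_size = 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

columns = spectrogram(tone, frame_size=frame_size, hop=32)

# Strongest frequency bin of the first frame, converted to Hz.
peak_bin = max(range(len(columns[0])), key=lambda k: columns[0][k])
peak_hz = peak_bin * sample_rate / frame_size
```

Each entry of `columns` is one time slice of the spectrogram. For the test tone, the strongest bin of every slice corresponds to 1000 Hz, because the bin spacing is `sample_rate / frame_size` = 125 Hz.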
Neural Network
The neural network is based on Turi Create, a library provided by Apple for various machine learning tasks.
In this case we use its so-called Sound Classifier. Given a sound, the goal of the Sound Classifier is to assign it to one of a predetermined number of labels, such as baby crying, siren, or dog barking. The Sound Classifier is not intended for speech recognition.
Example Code
import turicreate as tc
from os.path import basename
# Load the audio data and meta data.
data = tc.load_audio('./ESC-50/audio/')
meta_data = tc.SFrame.read_csv('./ESC-50/meta/esc50.csv')
# Join the audio data and the meta data.
data['filename'] = data['path'].apply(lambda p: basename(p))
data = data.join(meta_data)
# Drop all records which are not part of the ESC-10.
data = data.filter_by('True', 'esc10')
# Make a train-test split, just use the first fold as our test set.
test_set = data.filter_by(1, 'fold')
train_set = data.filter_by(1, 'fold', exclude=True)
# Create the model.
model = tc.sound_classifier.create(train_set, target='category', feature='audio')
# Generate an SArray of predictions from the test set.
predictions = model.predict(test_set)
# Evaluate the model and print the results
metrics = model.evaluate(test_set)
print(metrics)
# Save the model for later use in Turi Create
model.save('mymodel.model')
# Export for use in Core ML
model.export_coreml('mymodel.mlmodel')
To use the Turi Create model in iOS, follow the steps described in Deployment to Core ML.
To distribute the app, the dynamic library of Turi Create has to be packaged into a framework. Watch out for the issue described in Turicreate Issue.
Result
The result was fairly good: the trained network detected the sound of boiling water in a variety of environments.
The resulting app, 212° F, will soon be available for download.