Python - Categorizing short audio samples
I have a small number of similar types of sounds (I shall refer to these as db_sounds) that I need to match to a recording (rec_sounds). Each rec_sound is short and unique, and needs to be matched to the corresponding db_sound. How do I go about matching them?
To illustrate the problem, consider the following:
- Bob, with a deep voice in room A (with background noise), says ma
- Alice, with a high voice in room B, says eh
- A baby is learning to speak. Its first word is eh
ma and eh are 2 different types of db_sounds, so they have to return 2 different results. I have several db_sound samples of different people saying ma and eh to compare the rec_sounds to.
The sounds I am dealing with are voice recordings of single syllables: la, ba, ne, eh, ma, etc.
How should I tackle this?
I don't think audio fingerprinting will work (see the spectrograms), and existing voice recognition software such as this Google API integration in Python doesn't work, since I am not trying to recognize human language, just sounds.
I don't mind building this from the ground up, so point me in the direction you think will work, and please add plenty of justification for why you think so.
[Figure: spectrograms of 8 samples of the baby saying eh]
[Figure: time-domain graphs of 8 samples of the baby saying eh]
If you want to recognize sounds, you can start with a simple procedure:
- Crop the silence from each sound sample (a simple energy threshold is enough).
- Compute audio features for each sample of your database (e.g. MFCCs).
- Perform a cross-validated classification procedure to map the audio features to the sound category you want to recognize.
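The first step can be sketched roughly as below. This is only a minimal illustration of energy-threshold cropping using NumPy; the function name `crop_silence`, the frame length, and the relative threshold are assumptions chosen for the example, not tuned values, and the input is a synthetic tone rather than a real recording.

```python
import numpy as np


def crop_silence(signal, frame_len=256, rel_threshold=0.1):
    """Trim leading and trailing frames whose RMS energy is below a threshold.

    frame_len and rel_threshold are illustrative values, not tuned ones.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return signal
    # Split into non-overlapping frames and compute the RMS energy of each.
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Frames louder than a fraction of the loudest frame count as "sound".
    active = np.flatnonzero(rms > rel_threshold * rms.max())
    if active.size == 0:
        return signal
    start = active[0] * frame_len
    end = (active[-1] + 1) * frame_len
    return signal[start:end]


# Synthetic example: silence, a short 440 Hz tone, then silence again.
sr = 8000
tone = np.sin(2 * np.pi * 440 * np.arange(2000) / sr)
padded = np.concatenate([np.zeros(4000), tone, np.zeros(4000)])
cropped = crop_silence(padded)
print(len(padded), len(cropped))
```

With real recordings the background noise is not exactly zero, so the threshold would probably need adjusting per recording setup.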
Helpful Python libs: scipy for reading wav files, essentia for audio feature extraction, and scikit-learn for classification and other machine learning.
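As a rough sketch of the cross-validated classification step with scikit-learn: the feature matrix here is random placeholder data standing in for per-sample features (for instance MFCCs averaged over time), and the choice of an SVM with feature scaling and 5 folds is an assumption made for illustration, not a recommendation specific to this data. Feature extraction itself is not shown.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: in practice each row of X would hold the features of one
# db_sound (e.g. its MFCCs averaged over time) and y its syllable label.
rng = np.random.default_rng(0)
n_per_class, n_features = 20, 13
X = np.vstack([
    rng.normal(loc=0.0, size=(n_per_class, n_features)),  # stand-in for "ma"
    rng.normal(loc=3.0, size=(n_per_class, n_features)),  # stand-in for "eh"
])
y = np.array(["ma"] * n_per_class + ["eh"] * n_per_class)

# Scale the features, then classify with an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Stratified folds keep the class balance the same in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv)
print("mean cross-validated accuracy: %.2f" % scores.mean())

# Once the scores look acceptable, fit on all db_sounds and label a rec_sound.
clf.fit(X, y)
predicted = clf.predict(X[:1])
```

The cross-validated score gives an estimate of how well the classifier would label a rec_sound it has not seen, which is more informative than accuracy on the training samples.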