python - categorizing short audio samples


I have a small number of similar types of sounds (I shall refer to these as db_sounds) that I need to match against recordings (rec_sounds). Each rec_sound is short and unique, and needs to be matched to its corresponding db_sound. How do I go about matching them?

To illustrate the problem, consider the following:
Bob, who has a deep voice, in room A (with background noise), says ma
Alice, who has a high voice, in room B, says eh
A baby is learning to speak. Its first word is eh

ma and eh are 2 different types of db_sounds, so they have to return 2 different results. I have several db_sound samples of different people saying ma and eh to compare the rec_sounds to.

The sounds I am dealing with are voice recordings of single syllables: la, ba, ne, eh, ma, etc.

How should I tackle this?
I don't think audio fingerprinting will work (see the spectrograms), and existing voice recognition software such as the Google API integration in Python doesn't work either, since I am not trying to recognize human language, just sounds.

I don't mind building this from the ground up. Just point me in a direction you think will work, and please add plenty of justification for why you think so.

Spectrograms of 8 samples of a baby saying eh: (image not included)

Time domain graphs of 8 samples of a baby saying eh: (image not included)

If you want to recognize the sounds, you can start with a simple procedure:

  1. Crop the silence from each sound sample (a simple energy threshold will do).
  2. Compute audio features for each sample in the database (e.g. MFCCs).
  3. Perform a cross-validated classification procedure to map the audio features to the sound category you want to recognize.
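
As a rough illustration of step 1, here is a minimal sketch of energy-threshold silence cropping using only numpy. The function name `crop_silence`, the frame length, and the threshold (10% of the loudest frame's RMS) are assumptions chosen for illustration, not values from the answer, and the input is a synthetic tone rather than a real recording.

```python
import numpy as np

def crop_silence(signal, frame_len=256, threshold_ratio=0.1):
    """Trim leading and trailing frames whose RMS energy is below a threshold.

    The threshold is relative to the loudest frame (an assumed heuristic).
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return signal  # too short to frame; leave untouched
    # Split into non-overlapping frames and compute the RMS energy of each.
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    if rms.max() == 0:
        return signal[:0]  # the whole sample is silent
    # Keep everything between the first and last "loud" frame.
    active = np.flatnonzero(rms >= threshold_ratio * rms.max())
    start = active[0] * frame_len
    end = (active[-1] + 1) * frame_len
    return signal[start:end]

# Synthetic stand-in for a recording: silence, a short 440 Hz tone, silence.
sr = 8000
t = np.arange(int(0.25 * sr)) / sr
tone = np.sin(2 * np.pi * 440 * t)
padded = np.concatenate([np.zeros(sr // 2), tone, np.zeros(sr // 2)])
cropped = crop_silence(padded)
print(len(padded), len(cropped))
```

With real recordings the background noise is not exactly zero, so the threshold ratio would probably need tuning per room or microphone.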

Helpful Python libs: scipy for reading wav files, essentia for audio feature extraction, and scikit-learn for classification and other machine learning.
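
Steps 2 and 3 could look roughly like the sketch below. To stay self-contained it does not call essentia. Instead it hand-rolls a simplified MFCC (mel filterbank, log, DCT, averaged over frames) with numpy and scipy, and it uses synthetic two-harmonic tones as stand-ins for "ma" and "eh" recordings. The helper names, the fundamental frequencies, and the choice of an SVM classifier are all assumptions made for illustration.

```python
import numpy as np
from scipy.fft import dct
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):
            fb[i, k] = (k - left) / (center - left)
        for k in range(center, right):
            fb[i, k] = (right - k) / (right - center)
    return fb

def mfcc_mean(signal, sr, n_fft=512, hop=256, n_mels=20, n_coeffs=13):
    """Simplified MFCCs averaged over frames: one fixed-length vector per sound."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < n_fft:
        signal = np.pad(signal, (0, n_fft - len(signal)))
    window = np.hanning(n_fft)
    starts = range(0, len(signal) - n_fft + 1, hop)
    frames = np.stack([signal[s:s + n_fft] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T
    coeffs = dct(np.log(mel_energy + 1e-10), type=2, axis=1, norm="ortho")
    return coeffs[:, :n_coeffs].mean(axis=0)

def fake_syllable(f0, sr, rng, duration=0.25):
    """Hypothetical stand-in for a recording: two harmonics plus noise."""
    t = np.arange(int(duration * sr)) / sr
    f = f0 * rng.uniform(0.9, 1.1)  # speaker-to-speaker pitch variation
    tone = np.sin(2 * np.pi * f * t) + 0.5 * np.sin(2 * np.pi * 2 * f * t)
    return tone + 0.01 * rng.standard_normal(len(t))

sr = 8000
rng = np.random.default_rng(0)
# Assumed fundamentals, chosen only to make the two classes differ.
classes = {"ma": 300.0, "eh": 900.0}
X, y = [], []
for label, f0 in classes.items():
    for _ in range(20):
        X.append(mfcc_mean(fake_syllable(f0, sr, rng), sr))
        y.append(label)
X, y = np.array(X), np.array(y)

# 5-fold cross-validation of a scaled SVM on the feature vectors.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(model, X, y, cv=5)
print("mean accuracy:", scores.mean())
```

For real data, the arrays returned by `scipy.io.wavfile.read` would replace `fake_syllable`, and a maintained MFCC implementation would likely be preferable to this hand-rolled one.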

