Hemant Vishwakarma: How to get complete fundamental (f0) frequency extraction with python lib librosa.pyin?

Thursday, 15 December 2022

How to get complete fundamental (f0) frequency extraction with python lib librosa.pyin?

I am running librosa.pyin on a speech audio clip, and it doesn't seem to be extracting all the fundamentals (f0) from the first part of the recording.

librosa documentation: https://librosa.org/doc/main/generated/librosa.pyin.html

sr: 22050

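import librosa

# Load the clip at sr = 22050 (assumes the linked quick_fox.wav was saved locally)
y, sr = librosa.load('quick_fox.wav', sr=22050)

# Pitch search range: C0 (about 16.35 Hz) to C7 (about 2093 Hz)
fmin = librosa.note_to_hz('C0')
fmax = librosa.note_to_hz('C7')

f0, voiced_flag, voiced_probs = librosa.pyin(y,
                                             fmin=fmin,
                                             fmax=fmax,
                                             pad_mode='constant',
                                             n_thresholds=10,
                                             max_transition_rate=100,
                                             sr=sr)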

Raw audio:

Spectrogram with fundamental tones, onsets, and onset strength. The first part doesn't have any fundamental tones extracted.
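To pin down exactly where f0 is missing, rather than judging it by eye from the plot, the unvoiced stretches can be listed as time spans. This is only a minimal sketch: the helper name unvoiced_gaps is hypothetical (not part of librosa), it assumes the f0 array uses NaN for unvoiced frames (the default fill value for librosa.pyin), and it assumes the default hop_length of 512 samples. The example_f0 list is a toy stand-in, not output from the recording.

```python
import math


def unvoiced_gaps(f0, sr=22050, hop_length=512):
    """Return (start_seconds, end_seconds) spans where f0 is NaN.

    Hypothetical helper: assumes one f0 value per frame and that
    frame i starts at i * hop_length / sr seconds.
    """
    gaps = []
    start = None  # index of the first frame in the current NaN run
    for i, value in enumerate(f0):
        missing = math.isnan(value)
        if missing and start is None:
            start = i
        elif not missing and start is not None:
            gaps.append((start * hop_length / sr, i * hop_length / sr))
            start = None
    # Close a run that extends to the end of the array
    if start is not None:
        gaps.append((start * hop_length / sr, len(f0) * hop_length / sr))
    return gaps


# Toy stand-in for the f0 array returned by librosa.pyin
example_f0 = [float("nan"), float("nan"), 120.0, 118.5, float("nan"), 130.0]
print(unvoiced_gaps(example_f0))
```

Passing the real f0 array from the pyin call in place of example_f0 should give the spans that can be compared against where the words fall in the recording.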

link to audio file: https://jasonmhead.com/wp-content/uploads/2022/12/quick_fox.wav

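# Onset strength envelope used for the onset markers in the plot
o_env = librosa.onset.onset_strength(y=y, sr=sr)
times = librosa.times_like(o_env, sr=sr)
onset_frames = librosa.onset.onset_detect(onset_envelope=o_env, sr=sr)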

Another view with power spectrogram:

I tried compressing the audio, but that didn't seem to work.

Any suggestions on what parameters I can adjust, or audio pre-processing that can be done to have fundamental tones extracted from all words?

What kinds of things affect the success of fundamental tone extraction?
