FOREX Press I. J. of Electrical & Electronics Research

Research Article

Fundamental Frequency Extraction by Utilizing the Combination of Spectrum in Noisy Speech

Author(s): Foujia Islam, Nargis Parvin, Moinur Rahman, Md. Tofael Ahmed, Dulal Chakraborty, and Md. Saifur Rahman*

Publisher : FOREX Publication

Published : 10 December 2025

e-ISSN : 2347-470X

Page(s) : 730-734

Foujia Islam, Department of ICT, Comilla University, Bangladesh; Email: foujiaislam4567@gmail.com

Nargis Parvin, Assistant Professor, Department of CSE, Bangladesh Army International University of Science and Technology, Bangladesh; Email: nargis.cse@baiust.ac.bd

Moinur Rahman, Lecturer, Department of ICT, Comilla University, Bangladesh; Email: moinur.rahman@cou.ac.bd

Md. Tofael Ahmed, Professor, Department of ICT, Comilla University, Bangladesh; Email: tofael@cou.ac.bd

Dulal Chakraborty, Associate Professor, Department of ICT, Comilla University, Bangladesh; Email: dulal.ict.cou@gmail.com

Md. Saifur Rahman*, Professor, Department of ICT, Comilla University, Bangladesh; Email: saifurice@cou.ac.bd


Foujia Islam, Nargis Parvin, Moinur Rahman, Md. Tofael Ahmed, Dulal Chakraborty, Md. Saifur Rahman (2025), Fundamental Frequency Extraction by Utilizing the Combination of Spectrum in Noisy Speech. IJEER 13(4), 730-734. DOI: 10.37391/IJEER.130413.