Research Article
Indian Classical Music Recognition using Deep Convolution Neural Network
Author(s): Swati Aswale*, Dr. Prabhat Chandra Shrivastava, Dr. Ratnesh Ranjan and Seema Shende
Published In : International Journal of Electrical and Electronics Research (IJEER) Volume 12, Issue 1
Publisher : FOREX Publication
Published : 05 February 2024
e-ISSN : 2347-470X
Page(s) : 73-82
Abstract
Music is a divine way of communicating feelings about the world, and the language of music is hugely varied. Classical music is one of the principal elements of India's cultural heritage, and Hindustani and Carnatic are its two primary subgenres. Models have been trained to distinguish between Carnatic and Hindustani songs. This paper presents Indian classical music recognition based on multiple acoustic features (MAF), comprising various statistical, spectral, and time-domain features. The MAF capture the changes in intonation, timbre, prosody, and pitch of the musical speech that arise from different ragas. A lightweight deep convolution neural network (DCNN) is used to improve the representation of the raga sound and to provide higher-order, abstract-level features. Raga classification performance is evaluated using accuracy, precision, recall, and F1-score. The proposed DCNN achieves an accuracy of 89.38% and a precision, recall, and F1-score of 0.89 each for eight-raga classification. Extensive experiments on eight classical ragas show a noteworthy improvement over traditional state-of-the-art methods.
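As an illustration of the four metrics named in the abstract, the following minimal sketch computes accuracy together with precision, recall, and F1-score for a multi-class problem such as eight-raga classification. It assumes macro averaging (an unweighted mean over classes); the abstract does not state which averaging scheme the authors used. The function name `classification_metrics` and the integer class labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall and F1-score.

    Assumption: macro averaging, i.e. each class contributes equally.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the true class but was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        pred_c = tp[c] + fp[c]  # samples predicted as class c
        true_c = tp[c] + fn[c]  # samples that really are class c
        prec = tp[c] / pred_c if pred_c else 0.0
        rec = tp[c] / true_c if true_c else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical toy labels for three classes (not data from the paper).
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    print(classification_metrics(y_true, y_pred, labels=[0, 1, 2]))
```

For an eight-raga experiment, `labels` would list the eight class identifiers; with roughly balanced classes, macro-averaged precision and recall tend to lie close to the overall accuracy, which fits the similar values the abstract reports.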
Keywords: Music recognition, Indian Raga Classification, Deep Convolution Neural Network, Spectral Features, Speech Recognition.
Swati Aswale*, Research Scholar, G.H. Raisoni University, Amravati, India; Email: swatiaswale31@gmail.com
Dr. Prabhat Chandra Shrivastava, Assistant Professor, JK Institute of Applied Physics Allahabad University, India; Email: prabhatphd@gmail.com
Dr. Ratnesh Ranjan, Assistant Professor, JK Institute of Applied Physics Allahabad University, India; Email: ratnesh.ranjan.ece.gmail.com
Seema Shende, Research Scholar, G.H. Raisoni University, Amravati, India; Email: seemashende18.gmail.com