Journal of Acoustical Engineering Society of Iran, 2016, 4(1): 1-20
Query-by-example music retrieval using genre recognition to speed up the performance
N. Borjian * , E. Kabir , S. Seyedin , E. Masehian
Abstract:

The goal of a query-by-example music information retrieval system is to retrieve, from a particular dataset, the target song corresponding to a user-provided example. The example can be a few-second excerpt recorded from any music source such as a TV, or even from a noisy environment such as a gym. In this paper, a query-by-example system for music retrieval using genre recognition is proposed. Its goal is to show how genre recognition helps such systems achieve accurate and rapid retrieval, even in the presence of background noise. The system consists of two basic blocks: genre recognition and matching-retrieval. A binary decision tree performs the genre recognition, and matching-retrieval uses both Euclidean and Kullback-Leibler (KL) distances along with score-level decision fusion. The proposed system is evaluated on the well-known GTZAN dataset (prepared by George Tzanetakis) using two random groups of pure and noisy queries. The results show accuracies of 97% and 86% for the pure and noisy query groups, respectively, with a retrieval time of 525 ms using the Euclidean distance. These values are 97% and 82% with a retrieval time of 380 ms using the KL distance.
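The matching-retrieval step described above can be illustrated with a minimal sketch. The following is not the authors' implementation; it assumes each song is summarised by a single Gaussian (mean and covariance of its frame-level features, e.g. MFCCs), uses the closed-form KL divergence between Gaussians, and fuses the Euclidean and KL distances with a simple min-max-normalised average as one possible score-level fusion rule. All function names (`gaussian_kl`, `retrieve`) and the equal fusion weights are illustrative assumptions.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL divergence D(N0 || N1) between two Gaussians."""
    k = mu0.shape[0]
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0)            # trace term
                  + diff @ inv1 @ diff             # Mahalanobis term
                  - k                              # dimensionality
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

def retrieve(query, candidates):
    """Return the index of the best-matching candidate song.

    `query` and each entry of `candidates` are (mean, covariance)
    pairs summarising a song's frame-level features. Euclidean and
    KL distances are min-max normalised across the candidate set,
    then averaged (a simple score-level fusion rule); the candidate
    with the smallest fused distance wins.
    """
    q_mu, q_cov = query
    eucl = np.array([np.linalg.norm(q_mu - mu) for mu, _ in candidates])
    kl = np.array([gaussian_kl(q_mu, q_cov, mu, cov) for mu, cov in candidates])

    def norm(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    fused = 0.5 * norm(eucl) + 0.5 * norm(kl)
    return int(np.argmin(fused))
```

In the full system, genre recognition would first restrict `candidates` to songs of the query's predicted genre, which is what shrinks the search space and speeds up retrieval.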

Keywords: Music information retrieval, Query by example, Genre recognition, Decision fusion, Noise.
Full-Text [PDF 253 kb]
Type of Study: Research |
Received: 2015/09/28 | Accepted: 2016/09/10 | Published: 2016/09/10
Borjian N, Kabir E, Seyedin S, Masehian E. Query-by-example music retrieval using genre recognition to speed up the performance. Journal of Acoustical Engineering Society of Iran. 2016; 4(1): 1-20.
URL: http://joasi.ir/article-1-76-en.html
