ES2608613A1

ES2608613A1 - Methodology for the automated recognition of reptiles by transforming the Markov model of the parametric fusion of characteristics of their sound production. (Machine-translation by Google Translate, not legally binding)

Info

Publication number: ES2608613A1
Application number: ES201600805A
Authority: ES
Inventors: Carlos Manuel TRAVIESO GONZÁLEZ; David De La Cruz SÁNCHEZ RODRÍGUEZ; Juan José NODA ARENCIBIA
Original assignee: Universidad de las Palmas de Gran Canaria
Current assignee: Universidad de las Palmas de Gran Canaria
Priority date: 2016-09-16
Filing date: 2016-09-16
Publication date: 2017-04-12
Anticipated expiration: 2036-09-16
Also published as: ES2608613B2

Abstract

The present invention relates to a method for the automated recognition and census of reptiles through the transformation, using hidden Markov models, of the fusion of characteristics from different domains of their acoustic signal emissions, allowing identification of the species and specific monitoring of individuals within the same species. The methodology uses a concatenation of temporal and spectral characteristics, which is then recognized automatically by an intelligent pattern recognition system.

Description


Methodology for the automated recognition of reptiles by transforming the Markov model of the parametric fusion of characteristics of their sound production.


Object of the invention

The present invention relates to a procedure for the automated recognition and census of reptiles through the hyperdimensionality of the transformation of their acoustic signal emissions, allowing identification of the species and specific monitoring of individuals within the same species. Reptiles generate bio-acoustic signals in various ways: by exciting the larynx, by expelling air through the nose or mouth, and by shaking or scraping body parts, among other mechanisms.

Background of the invention

Currently, bio-acoustic techniques for studying and monitoring animal species within their habitat are among the most important tools available to biologists and conservationists. Technological advances in acoustic sensors and digital recording media allow species to be censused and identified remotely, avoiding invasive techniques that alter ecosystems or require the physical presence of the biologist in the study area. The data collected allow animals to be tracked without physically marking them, and give researchers information on the biological indicators of the area. The presence or absence of certain species, and their numbers, can be used to determine the health of an ecosystem, detecting pollution, the state of water quality, climatic changes or even alterations in ultraviolet radiation.

There are numerous studies of the spectro-temporal characteristics of species, which attempt to analyze the frequency and time parameters of the acoustic signals or vocalizations produced by animals in order to identify patterns in their communication and social interactions. In general, the procedure consists of collecting hours of sound recordings by means of sensors or microphones placed in the study habitat; a biologist then listens to the recordings and analyzes them spectro-temporally to determine whether a given species is present in the area under investigation. However, this procedure is slow, because of the large number of recording hours that may have been collected and the need for a biologist who is expert in bio-acoustics and familiar with the animal species to be monitored. In recent years an effort has been made to automate this procedure by means of intelligent systems that use automatic recognition techniques. Studies have focused on species with extensive sound production, such as birds, frogs and whales, for which there are several promising lines of research. These apply techniques used in human speech recognition, through expert systems that recognize the species under study with varying degrees of success. Reptiles, by contrast, being considered mute or of limited sound production, have never been the object of this type of research. However, reptiles, among them crocodiles, geckos, snakes and turtles, are capable of producing bio-acoustic sounds that are species-specific.

The main studies in acoustic recognition have focused on the sounds generated by birds; examples can be found in the following articles:

i) A. Härmä, Automatic identification of bird species based on sinusoidal modeling of syllables, in: Acoustics, Speech, and Signal Processing 2003. Proceedings (ICASSP'03). 2003 IEEE International Conference on, Vol. 5, IEEE, 2003, pp. V-545.

ii) S. Fagerlund, Bird species recognition using support vector machines, EURASIP Journal on Applied Signal Processing 2007 (1) (2007) 64-64.

iii) Lee, Chang-Hsing, Chin-Chuan Han, and Ching-Chien Chuang. "Automatic classification of bird species from their sounds using two-dimensional cepstral coefficients." Audio, Speech, and Language Processing, IEEE Transactions on 16.8 (2008): 1541-1550.

iv) Jančovič, Peter, and Münevver Köküer. "Automatic detection and recognition of tonal bird sounds in noisy environments." EURASIP Journal on Advances in Signal Processing 2011.1 (2011): 982936.

v) Graciarena, Martín, et al. "Acoustic front-end optimization for bird species recognition." Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. IEEE, 2010.


vi) Graciarena, Martín, et al. "Bird species recognition combining acoustic and sequence modeling." Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on. IEEE, 2011.

vii) Lopes, Marcelo T., et al. "Automatic bird species identification for large number of species." Multimedia (ISM), 2011 IEEE International Symposium on. IEEE, 2011.

viii) Mporas, Iosif, et al. "Automated Acoustic Classification of Bird Species from Real-Field Recordings." Tools with Artificial Intelligence (ICTAI), 2012 IEEE 24th International Conference on. Vol. 1. IEEE, 2012.

ix) Juang, Chia-Feng, and Tai-Mou Chen. "Birdsong recognition using prediction-based recurrent neural fuzzy networks." Neurocomputing 71.1 (2007): 121-130.

Classic automatic acoustic recognition techniques have been used for the acoustic recognition of patterns, in both people and animals, as in:

x) R. Bardeli, Algorithmic analysis of Complex Audio Scenes. Universität Bonn. PhD Thesis, 2008.


xi) H. Xing, P.C. Loizou, Frequency Shift Detection of Speech with GMMs and SVMs, IEEE Workshop on Signal Processing Systems, (2002) 215-219.

In addition, classic automatic acoustic recognition techniques have likewise been applied to insects, bats and frogs; examples can be found in the following articles:

xii) K. Riede, Acoustic monitoring of orthoptera and its potential for conservation, Journal of Insect Conservation 2 (3-4) (1998) 217-223.


xiii) T. Ganchev, I. Potamitis, N. Fakotakis, Acoustic monitoring of singing insects, in: Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on, Vol. 4, IEEE, 2007, pp. IV-721.

xiv) Z. Leqing, Z. Zhen, Insect sound recognition based on sbc and hmm, in: Intelligent Computation Technology and Automation (ICICTA), 2010 International Conference on, Vol. 2, IEEE, 2010, pp. 544-548.

xv) D. Chesmore, Automated bioacoustic identification of species, Anais da Academia Brasileira de Ciências 76 (2) (2004) 436-440.

xvi) J. Pinhas, V. Soroker, A. Hetzroni, A. Mizrach, M. Teicher, J. Goldberger, Automatic acoustic detection of the red palm weevil, Computers and Electronics in Agriculture 63 (2) (2008) 131-139.


xvii) A. E. Chaves, C. M. Travieso, A. Camacho, J. B. Alonso, Katydids acoustic classification on verification approach based on mfcc and hmm, in: Intelligent Engineering Systems (INES), 2012 IEEE 16th International Conference on, IEEE, 2012, pp. 561-566.


xviii) S. Kaloudis, D. Anastopoulos, C. P. Yialouris, N. A. Lorentzos, A. B. Sideridis, Insect identification expert system for forest protection, Expert Systems with Applications 28 (3) (2005) 445-452.

xix) A. Henriquez, J. B. Alonso, C. M. Travieso, B. Rodríguez-Herrera, F. Bolanos, P. Alpízar, K. Lopez-de Ipina, P. Henriquez, An automatic acoustic bat identification system based on the audible spectrum, Expert Systems with Applications 41 (11) (2014) 5451-5465.

xx) G. Grigg, A. Taylor, H. McCallum, G. Watson, Monitoring frog communities: an application of machine learning, in: Proceedings of Eighth Innovative Applications of Artificial Intelligence Conference, Portland Oregon, 1996, pp. 1564-1569.

xxi) C.-H. Lee, C.-H. Chou, C.-C. Han, R.-Z. Huang, Automatic recognition of animal vocalizations using averaged mfcc and linear discriminant analysis, Pattern Recognition Letters 27 (2) (2006) 93-101.

xxii) T. S. Brandes, Feature vector selection and use with hidden markov models to identify frequency-modulated bioacoustic signals amidst noise, Audio, Speech, and Language Processing, IEEE Transactions on 16 (6) (2008) 1173-1180.

xxiii) C.-J. Huang, Y.-J. Yang, D.-X. Yang, Y.-J. Chen, Frog classification using machine learning techniques, Expert Systems with Applications 36 (2) (2009) 3737-3743.

xxiv) M. A. Acevedo, C. J. Corrada-Bravo, H. Corrada-Bravo, L. J. Villanueva-Rivera, T. M. Aide, Automated classification of bird and amphibian calls using machine learning: A comparison of methods, Ecological Informatics 4 (4) (2009) 206-214.

xxv) N. C. Han, S. V. Muniandy, J. Dayou, Acoustic classification of australian anurans based on hybrid spectral-entropy approach, Applied Acoustics 72 (9) (2011) 639-645.

xxvi) W.-P. Chen, S.-S. Chen, C.-C. Lin, Y.-Z. Chen, W.-C. Lin, Automatic recognition of frog calls using a multi-stage average spectrum, Computers & Mathematics with Applications 64 (5) (2012) 1270-1281.

xxvii) C. L. T. Yuan, D. A. Ramli, Frog sound identification system for frog species recognition, in: Context-Aware Systems and Applications, Springer, 2013, pp. 41-50.

xxviii) H. Jaafar, D. A. Ramli, B. A. Rosdi, S. Shahrudin, Frog identification system based on local means k-nearest neighbors with fuzzy distance weighting, in: The 8th International Conference on Robotic, Vision, Signal Processing & Power Applications, Springer, 2014, pp. 153-159.

xxix) C. Bedoya, C. Isaza, J. M. Daza, J. D. Lopez, Automatic recognition of anuran species based on syllable identification, Ecological Informatics 24 (2014) 200-209.


xxx) J. Xie, M. Towsey, A. Truskinger, P. Eichinski, J. Zhang, P. Roe, Acoustic classification of australian anurans using syllable features, in: Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), 2015 IEEE Tenth International Conference on, IEEE, 2015, pp. 1-6.


Other examples of the recognition of bio-acoustic vocalizations can be found in the automatic identification of marine mammals, where studies on whales stand out. The following publications are examples:

xxxi) Mouy, Xavier, Mohammed Bahoura, and Yvan Simard. "Automatic recognition of fin and blue whale calls for real-time monitoring in the St. Lawrence." The Journal of the Acoustical Society of America 126.6 (2009): 2918-2928.

xxxii) Dugan, Peter J., et al. "North Atlantic right whale acoustic signal processing: Part I. Comparison of machine learning recognition algorithms." Applications and Technology Conference (LISAT), 2010 Long Island Systems. IEEE, 2010.

xxxiii) Baumgartner, Mark F., and Sarah E. Mussoline. "A generalized baleen whale call detection and classification system." The Journal of the Acoustical Society of America 129.5 (2011): 2889-2902.

xxxiv) Seekings, Paul, and John Potter. "Classification of marine acoustic signals using Wavelets & Neural Networks." Proc. of 8th Western Pacific Acoustics Conf. (Wespac8). 2003.


There are several patents related to the bio-acoustic identification of species, which focus generically on collecting and comparing sound data and parameters based on vocalizations. However, all of them focus mainly on the identification of birds; none contemplates the acoustic identification of reptiles or takes their bio-acoustic specificities into account. Moreover, they only contemplate identifying species, not individual subjects, subfamilies or genera within a given species. Examples can be found in the following patents:

xxxv) WO 2005024782 A1 (Wildlife Acoustics Inc, Ian Agranat) "Method and apparatus for automatically identifying animal species from their vocalizations".

xxxvi) US 8599647 B2 (Wildlife Acoustics, Inc.) "Method for listening to ultrasonic animal sounds".

xxxvii) US 7963254 B2 (Pariff Llc) "Method and apparatus for the automatic identification of birds by their vocalizations".

xxxviii) US 20130282379 A1 (Tom Stephenson, Stephen Travis POPE) "Method and apparatus for analyzing animal vocalizations, extracting identification characteristics, and using databases of these characteristics for identifying the species of vocalizing animals".

xxxix) US 20040107104 A1 (Schaphorst Richard A.) "Method and apparatus for automated identification of animal sounds".

xl) US 8457962 B2 (Lawrence P. Jones) "Remote audio surveillance for detection and analysis of wildlife sounds".

As for reptiles, the few scattered studies focus on the spectro-temporal analysis of their acoustic characteristics, but none of them uses these characteristics for the automated recognition of these species. Moreover, they concentrate mainly on crocodiles and geckos, which are the most communicative species among reptiles.

xli) Vergne, A. L., M. B. Pritz, and N. Mathevon. "Acoustic communication in crocodilians: from behaviour to brain." Biological Reviews 84.3 (2009): 391-411.

xlii) Wang, Xianyan, et al. "Acoustic signals of Chinese alligators (Alligator sinensis): social communication." The Journal of the Acoustical Society of America 121.5 (2007): 2984-2989.


xliii) Ferrara, Camila R., Richard C. Vogt, and Renata S. Sousa-Lima. "Turtle vocalizations as the first evidence of posthatching parental care in chelonians." Journal of Comparative Psychology 127.1 (2013): 24.

xliv) Labra, Antonieta, et al. Acoustic features of the weeping lizard's distress call. Copeia, 2013, vol. 2013, no 2, p. 206-212.

It can therefore be observed that there is no record of the automated identification of reptiles by means of their sound production, either of the species to which a specimen belongs or for the individualized tracking of a particular specimen. The object of the present invention is the specific recognition of the species, family, subfamily and genus to which a given reptile belongs, based on its bio-acoustic emission characteristics and by means of the hyperdimensioning of the transformation of the fusion of its acoustic characteristics in the cepstral and temporal domains. Because of this step, the solution has not been found in the state of the art, in contrast to the case of the vocalizations produced by other animal species that possess vocal cords. The proposal would automatically recognize vocalizations and bio-acoustic emissions of any nature in reptiles. The invention would therefore have potential applications in the detection, identification and monitoring of the group of reptiles (Reptilia) or Sauropsida. This would allow population control, which in turn has applications in the control of pests or invasive species, in species conservation, in biological studies of animal behavior, in changes in environmental conditions, etc., and even in the detection of possible pathologies or plagues that could affect this animal group. The invention therefore opens up a wide range of possible applications in the biological and environmental conservation fields. For this reason, its analysis and detection is of the utmost importance now and in the future.


Cabe concluir tras estos antecedentes, que los estudios que se han desarrollado hasta el momento y que han tenido como parámetro característico los sonidos producidos por los reptiles, han sido utilizados básicamente para el estudio del comportamiento biológico de la especie, para caracterizar los parámetros acústicos fundamentales de sus llamadas, establecer su neurología o estudiar la implicación de las mismas en su comportamiento social. También 10 los antecedentes muestran trabajo específico para diferentes especies de animales, o bien sistemas generales basados en un sistema clásico de reconocimiento de patrones, sin particularidades sobre cómo mejorar el reconocimiento según la especie o la aplicación. El método propuesto, a diferencia de lo observado en el estado de la técnica, permite utilizar sus parámetros acústicos verbales y no verbales para posibilitar el reconocimiento de las especies 15 por medio de un módulo que aumenta la hiperdimensionalidad de la transformación de las características acústicas aplicado a sistemas inteligentes. Esto presenta la ventaja de no ser invasivo, pues con un sistema de micrófonos remoto se puede captar y analizar la señal acústica de los especímenes. Además, permite el seguimiento y detección de estas especies en condiciones de visibilidad limitada. 20 It is possible to conclude after this background, that the studies that have been developed so far and that have had as characteristic parameter the sounds produced by reptiles, have been used basically for the study of the biological behavior of the species, to characterize the fundamental acoustic parameters of their calls, establish their neurology or study their involvement in their social behavior. Also, the background shows specific work for different species of animals, or general systems based on a classic pattern recognition system, without particularities on how to improve the recognition according to the species or application. 
The proposed method, unlike the state of the art, uses the verbal and non-verbal acoustic parameters of reptiles to recognize species by means of a module that increases the hyperdimensionality of the transformation of the acoustic characteristics, applied to intelligent systems. This has the advantage of being non-invasive, because a remote microphone system can capture and analyze the acoustic signal of the specimens. It also allows these species to be monitored and detected in conditions of limited visibility.

Summary of the invention

The present invention relates to a method for the identification and census of reptile species based on the hyperdimensionality of the transformation of their sound production, following these five steps:

i) Pre-processing of the acoustic signal, emphasizing the regions that contain the most information.

ii) Automatic segmentation of the calls and of the verbal or non-verbal sounds detected in the acoustic signal, separating the different sound emissions in the audio signal, which may belong to different species or specimens.

iii) Parametric fusion of the frequency-domain and time-domain characteristics extracted from each segmented call or vocalization, to obtain a complete representation of the sound source across different domains.

iv) Transformation of the fused characteristics by generating a hyperdimensionality of them, creating a more discriminative Markov model representation domain.

v) Classification and identification of the species or individual by means of a machine learning algorithm.

Description of the figures

Figure 1 shows a schematic block diagram of the developed system.

Figure 2 shows the spectrograms of the sound emissions of reptiles.

a) Crotalus atrox.

b) Gekko gecko.

c) Alligator mississippiensis.

d) Chelonoidis nigra.

Figure 3 schematically represents the segmentation of the vocalizations.

a) Calculation of the fast Fourier transform (FFT).

b) Location of the point of greatest energy in the spectrogram.

c) The procedure is repeated until the end of the spectrogram.

Figure 4 schematically details the process of extracting spectral characteristics.

a) Calculation of the fast Fourier transform (FFT).

b) Filtering by means of a bank of triangular Mel filters.

c) Discrete cosine transform (DCT).

d) The first 14 DCT coefficients are retained.

Detailed description of a preferred embodiment of the invention

Although the invention is described in terms of a specific preferred embodiment, it will be readily apparent to those skilled in the art that various modifications, rearrangements and substitutions can be made. The scope of the invention is defined by the appended claims.

The proposed invention consists of a method that applies several sub-processes to reach the unequivocal identification, by means of intelligent systems, of the species to which the reptile belongs. The first sub-process pre-processes the signal (i). Next, the acoustic emissions contained in the audio recording are segmented by means of an automatic analysis of its spectrogram (ii). Time-domain and frequency-domain characteristics are then extracted from the audio segments to characterize each verbal or non-verbal sound, and all the characteristics are fused into a single, robust representation of the sound source (iii). A Markov model transformation is applied to the fused representation to generate a new, higher-dimensional representation of the characteristics (iv). The transformed parameters are sent to a pattern classification algorithm to identify the species (v).

The sub-processes listed above are described in detail below.

(i) Signal pre-processing consists of converting the audio from the sound recordings from stereo to mono by averaging the two channels, and low-pass filtering the signal with a cut-off frequency of 18 kHz, because reptile emissions are concentrated mainly at low frequencies. Next, a pre-emphasis filter is applied to equalize the energy of the spectrum, defined by the equation Y(n) = X(n) - 0.95 * X(n-1), where X(n) is the sound signal and Y(n) is the filter output. The pre-emphasis filter increases the contribution of the high frequencies to the identification of the specimen.
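The mono conversion and the pre-emphasis filter of step (i) can be sketched as follows. This is a minimal illustration in plain Python, not the implementation used in the invention: the function names are hypothetical, the first output sample is simply passed through (an assumption, since X(-1) is not defined in the text), and the 18 kHz low-pass filter is omitted for brevity.

```python
def stereo_to_mono(left, right):
    """Average the two channels sample by sample."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def pre_emphasis(x, alpha=0.95):
    """Apply Y(n) = X(n) - alpha * X(n-1).

    Assumption: the first sample is copied unchanged because X(-1) is undefined.
    """
    if not x:
        return []
    y = [x[0]]
    for n in range(1, len(x)):
        y.append(x[n] - alpha * x[n - 1])
    return y


# Toy two-channel signal (hypothetical values, for illustration only).
mono = stereo_to_mono([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
emphasized = pre_emphasis(mono)
```

A constant signal is almost cancelled by the filter, which illustrates how low-frequency content is attenuated relative to high-frequency content.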

(ii) Once the signal has been pre-processed, vocal and non-vocal sounds are segmented automatically by analyzing the spectrogram of the signal. A specially modified version of the Härmä algorithm is used to obtain the segments. The spectrogram is traversed by applying a Hamming window of 11.6 ms duration with 45% overlap. At each step of the window, the point of greatest energy in the spectrogram is located and the signal is taken to the left and right of that point until the energy drops to 20 dB, repeating the process at each step of the window.
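The peak-driven segmentation can be sketched on a simplified input. The sketch below is not the modified Härmä algorithm of the invention, whose details are not given in this text. It assumes a one-dimensional energy envelope in dB (one value per spectrogram frame), reads the 20 dB criterion as "20 dB below the current peak", and stops when the strongest remaining frame is 20 dB below the global peak. All names and the toy envelope are hypothetical.

```python
def segment_by_energy_peaks(frame_db, drop_db=20.0):
    """Return sorted (start, end) frame ranges, end exclusive, around energy peaks."""
    n = len(frame_db)
    if n == 0:
        return []
    used = [False] * n
    # Assumption: stop once the strongest free frame is drop_db below the global peak.
    stop_level = max(frame_db) - drop_db
    segments = []
    while True:
        free = [i for i in range(n) if not used[i]]
        if not free:
            break
        peak = max(free, key=lambda i: frame_db[i])
        if frame_db[peak] < stop_level:
            break
        limit = frame_db[peak] - drop_db
        # Grow the segment to the left and right while the energy stays above the limit.
        start = peak
        while start > 0 and not used[start - 1] and frame_db[start - 1] > limit:
            start -= 1
        end = peak + 1
        while end < n and not used[end] and frame_db[end] > limit:
            end += 1
        for i in range(start, end):
            used[i] = True
        segments.append((start, end))
    return sorted(segments)


# Toy envelope with two bursts of energy separated by near-silence.
envelope = [-60, -55, -10, -5, -12, -58, -60, -20, -8, -15, -61]
segments = segment_by_energy_peaks(envelope)
```

Each returned pair delimits one candidate sound emission; in a full system the frames would come from the spectrogram computed with the window length and overlap stated above.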
Next, an intelligent suppressor of incorrect samples is applied to automatically eliminate segments that contain no information relevant to identification. To do this, the Dynamic Time Warping (DTW) algorithm is applied, using the mean plus 1.8 times the standard deviation as the confidence interval. This design, which had not previously been used in animal detection, prevents sounds from the natural environment from interfering with the classification process, which increases the recognition success rate.
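A minimal sketch of a DTW-based suppressor follows. The text does not state what each segment is compared against, so this example assumes that a DTW distance to some reference is already available for every segment and that segments whose distance exceeds the mean plus 1.8 standard deviations are discarded. The function names, the absolute-difference cost and the sample distances are illustrative assumptions.

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def keep_mask(distances, k=1.8):
    """True for segments whose distance is within mean + k * standard deviation."""
    mean = sum(distances) / len(distances)
    std = math.sqrt(sum((d - mean) ** 2 for d in distances) / len(distances))
    limit = mean + k * std
    return [d <= limit for d in distances]


# Hypothetical DTW distances of six segments to a reference call.
distances = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0]
mask = keep_mask(distances)
```

In this toy case only the last segment, whose distance is far from the others, is rejected.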


(iii) Once the individual acoustic emissions have been obtained, the spectral characterization coefficients MFCC and LFCC (Mel and Linear Frequency Cepstral Coefficients) are extracted from each of them to obtain information on the entire spectrum, and temporal parameters such as the duration of the sound and its entropy are obtained.
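The cepstral pipeline (triangular filter bank, logarithm, DCT, first 14 coefficients) can be sketched in plain Python. The sketch assumes that the power spectrum of a frame has already been computed by an FFT, uses the common 2595 * log10(1 + f / 700) Mel formula, and picks 26 filters; none of these choices is stated in the text. Setting `mel=False` spaces the filters linearly, which gives LFCC-style coefficients.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def triangular_filterbank(n_filters, n_bins, sample_rate, mel=True):
    """One row of weights per triangular filter over bins spanning 0..sample_rate/2."""
    nyquist = sample_rate / 2.0
    if mel:
        top = hz_to_mel(nyquist)
        edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    else:
        edges_hz = [nyquist * i / (n_filters + 1) for i in range(n_filters + 2)]
    edges = [f / nyquist * (n_bins - 1) for f in edges_hz]  # fractional bin positions
    bank = []
    for k in range(1, n_filters + 1):
        lo, mid, hi = edges[k - 1], edges[k], edges[k + 1]
        row = []
        for b in range(n_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising edge
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct_ii(x, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n_out)]


def cepstral_coefficients(power_spectrum, sample_rate, n_filters=26, n_coeffs=14, mel=True):
    bank = triangular_filterbank(n_filters, len(power_spectrum), sample_rate, mel)
    energies = [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]
    log_energies = [math.log(e + 1e-12) for e in energies]  # small offset avoids log(0)
    return dct_ii(log_energies, n_coeffs)
```

For example, `cepstral_coefficients(spectrum, 44100)` and `cepstral_coefficients(spectrum, 44100, mel=False)` each return 14 values for a frame, where `spectrum` is a hypothetical list of power values.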


Next, the set of parameters extracted from each sound is fused, creating a single vector that characterizes each call in both frequency and time. 14 coefficients are taken for each of the characteristics, forming a vector of 28 coefficients per segment, which models the information in both the high and the low frequencies of the sounds produced by reptiles.
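The fusion step can be illustrated as a simple concatenation. The text gives 28 cepstral coefficients per segment and also names entropy and duration as temporal parameters. How these are laid out in the fused vector, and how the entropy is computed, is not specified, so the sketch below appends both at the end and uses the Shannon entropy of an amplitude histogram. Both choices, and all names, are assumptions.

```python
import math


def shannon_entropy(samples, n_bins=32):
    """Entropy in bits of the amplitude histogram of a segment (assumed definition)."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0
    counts = [0] * n_bins
    for s in samples:
        idx = min(int((s - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(samples)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def fuse_features(mfcc, lfcc, samples, sample_rate):
    """Concatenate spectral coefficients with entropy and duration in seconds."""
    duration_s = len(samples) / sample_rate
    return list(mfcc) + list(lfcc) + [shannon_entropy(samples), duration_s]


# Placeholder coefficients and a toy 4-sample segment (hypothetical values).
fused = fuse_features([0.0] * 14, [0.0] * 14, [0, 1, 0, 1], sample_rate=4)
```

With 14 MFCC and 14 LFCC values the spectral part has the 28 coefficients stated above, followed here by the two temporal parameters.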

(iv) A transformation based on hidden Markov models is applied to generate a higher dimensionality from the preceding parametric fusion. This new representation space is more discriminative and improves the recognition success rate over classic systems that do not use this kind of hyperdimensioning.

The transformation maps the vector obtained from the parametric fusion to a vector of much higher dimension, adapted to a representation space that depends on the number of states and the number of symbols per state of the hidden Markov model (HMM). The SVM classifier is applied to these transformed vectors to obtain a recognition result.

Following the nomenclature used to describe the hidden Markov model (HMM) classifier, P(X|λ) is interpreted as the probability that a feature vector X (the result of the parametric fusion) was generated by the Markov model λ, which is defined by the number of states and the number of symbols per state. The space adapted for the mapping of the fusion vectors is then defined as the gradient of the logarithm of that probability:

$$U_X = \nabla_{\lambda} \log P(X \mid \lambda)$$


Each component of U_X is the derivative with respect to a particular parameter of the hidden Markov model and therefore specifies the extent to which that parameter contributes to the parametric fusion vector. In this case, only the derivative with respect to the symbol emission probability matrix has been used. This matrix gives the probability of emitting a symbol vk while in state j, where N is the number of states and M is the number of symbols per state.

The Markov model transformation is then obtained from the following expression:
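The expression referred to here is not reproduced in this text. A reconstruction consistent with the surrounding definitions, assuming the usual unconstrained derivative of the hidden Markov model log-likelihood with respect to each emission probability, and writing x_t for the symbol observed at time t, T for the number of observations and b_j(v_k) for the emission probability (notation assumed here), would be:

```latex
\frac{\partial}{\partial b_j(v_k)} \log P(X \mid \lambda)
  = \frac{\sum_{t=1}^{T} \delta(x_t - v_k)\, \gamma_t(j)}{b_j(v_k)},
  \qquad 1 \le j \le N, \quad 1 \le k \le M
```

Under this reading, the numerator accumulates the state occupancy γ_t(j) over the instants at which the symbol v_k is observed, and the result has one component per state and symbol pair, N × M in total.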

Here δ is the Dirac delta function and the gamma matrix γt(i) indicates the probability of being in state i at time t. The numerator of the preceding expression gives the number of times each symbol has been used in each state.
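The computation can be sketched as follows, assuming a discrete hidden Markov model whose state posteriors γ have already been obtained (for instance with the forward-backward algorithm, not shown) and assuming that each score component is the γ-weighted count of a symbol in a state divided by the corresponding emission probability. The function name, the argument layout and the toy values are hypothetical.

```python
def emission_fisher_score(symbols, gamma, emission):
    """Flattened N*M vector of derivatives of log P(X | model) w.r.t. b_j(v_k).

    symbols:  observed symbol indices x_t, length T
    gamma:    T x N list of state posteriors gamma_t(j)
    emission: N x M list of emission probabilities b_j(v_k)
    """
    n_states = len(emission)
    n_symbols = len(emission[0])
    counts = [[0.0] * n_symbols for _ in range(n_states)]
    for t, k in enumerate(symbols):
        for j in range(n_states):
            # delta selects the observed symbol; gamma weights it by state occupancy
            counts[j][k] += gamma[t][j]
    return [counts[j][k] / emission[j][k]
            for j in range(n_states) for k in range(n_symbols)]


# Toy model with N = 2 states and M = 2 symbols (hypothetical values).
symbols = [0, 1, 1]
gamma = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
emission = [[0.5, 0.5], [0.25, 0.75]]
score = emission_fisher_score(symbols, gamma, emission)
```

The output has N × M entries regardless of the length of the observation sequence, which is what lets sounds of different durations be compared in a common space of fixed dimension.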

(v) The vectors are sent to a multi-class classification system based on a support vector machine (SVM) using the one-vs-one strategy, previously trained with audio from the reptile species to be identified. The output of the classifier is the recognition or detection of the species or individuals targeted by the study, census or monitoring. The support vector machine is configured with a Gaussian kernel, K(x, x') = exp(-γ‖x - x'‖²), with γ = 0.52 and a soft-margin parameter C = 20. The Markov model transformation allows a better separation of the sample space at the input of the SVM, separating the classes more efficiently and therefore enlarging its decision boundaries, which facilitates recognition. This design is therefore more effective than classic designs, since it differentiates the sounds better. Experimental results show success rates above 99% in identifying the species to which the reptile belongs.
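The Gaussian kernel and the one-vs-one voting scheme can be sketched as follows. This is not a trained SVM: the binary decision for each pair of classes is supplied by a hypothetical callback, and only the kernel evaluation with γ = 0.52 and the majority vote across pairs are shown. The kernel assumes the standard Gaussian form with a negative exponent and a squared distance.

```python
import math
from collections import Counter
from itertools import combinations


def gaussian_kernel(x, y, gamma=0.52):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def one_vs_one_predict(classes, pairwise_winner):
    """Majority vote over one binary decision per pair of classes.

    pairwise_winner(a, b) is a hypothetical callback that returns a or b,
    standing in for a trained binary SVM for that pair.
    """
    votes = Counter(pairwise_winner(a, b) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


species = ["Crotalus atrox", "Gekko gecko", "Alligator mississippiensis"]


def toy_winner(a, b):
    # Stand-in decision rule: "Gekko gecko" wins every pair it takes part in.
    return "Gekko gecko" if "Gekko gecko" in (a, b) else a


predicted = one_vs_one_predict(species, toy_winner)
```

With K classes the one-vs-one strategy evaluates K(K-1)/2 binary classifiers, three in this toy case.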

Claims

1. Method for the automatic identification and classification of animal specimens belonging to the class Reptilia (reptiles or sauropsids) by means of the Markov model transformation of the parametric fusion of the characteristics of their acoustic bio-emissions of verbal and non-verbal sounds, using an intelligent pattern recognition system and an intelligent suppression system for incorrect sounds or segments. The method also identifies the individual, species and gender of each reptile. The detected sounds are passed through an intelligent segment suppressor to discard sounds coming from the environment and all those that contain little or no information for identification. The taxonomic recognition of each reptile is produced by applying the Markov model transformation to the fusion of spectral and temporal parameters, which characterize the acoustic emissions of these specimens in frequency and time. In this case, reptile sounds are characterized by:


(i) The transformation of the Markov model of the parametric fusion of its characteristics to a space of greater dimensionality.

The transformation has been calculated using the derivative with respect to the symbol emission probability matrix, which indicates the probability of emitting a symbol vk while in state j, where N is the number of states and M is the number of symbols per state. The Markov model transformation is then obtained from the following expression:


Here δ is the Dirac delta function and the gamma matrix γt(i) indicates the probability of being in state i at time t. The numerator of the preceding expression gives the number of times each symbol is used in each state.

(ii) The parametric fusion is obtained from the MFCC and LFCC coefficients (Mel and Linear Frequency Cepstral Coefficients), which provide information on both the low and the high frequencies of the reptile's acoustic signals, and from temporal parameters of the bio-acoustic emissions with discriminant characteristics, such as entropy and sound duration:

Parametric Fusion = {MFCC_Parameters ∪ LFCC_Parameters ∪ Entropy ∪ Sound_Duration}


With the new, higher-dimensional representation space produced by the Markov model transformation of the parameter fusion, the species or individual is classified and identified by means of a machine learning algorithm.