CN107369451B - Bird voice recognition method for assisting phenological study of bird breeding period - Google Patents
Bird voice recognition method for assisting phenological study of bird breeding period
- Publication number
- CN107369451B
- Authority
- CN
- China
- Prior art keywords
- birds
- bird
- algorithm
- recording
- segments
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/16—Hidden Markov models [HMM]
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
A bird voice recognition method for assisting the phenological study of the bird breeding season. Field recordings, each containing several segments of birdsong, are read. A recognition algorithm then identifies the species of bird to which each song segment in the recording belongs and gives a recognition confidence, and the actual recording date of the segment is noted. From the recognition results, the number of birds singing across all recordings in the region, that is, the number of birds that have entered the breeding season, is calculated. When, after a certain time, this number rises above a preset threshold, the birds in the region are considered to have entered the breeding season from that moment; conversely, when the number later falls below the threshold, the birds are considered to have ended the breeding season.
Description
Technical Field
The invention relates to the technical field of bird voice recognition, in particular to a bird voice recognition method for assisting the phenological study of bird breeding season.
Background
Biologically, bird vocalizations are divided into calls (bird call) and songs (bird song). Bird song refers to the vocalizations that birds produce during the breeding season. The song pattern of a given species is highly consistent, while the songs of different species often differ greatly. Bird song can therefore be used as a means of identifying bird species.
Phenology is the study of the relationship between animals and periodic changes in the environment. One branch studies the relationship between the reproductive stage of birds and cyclic changes in the environment. Because the reproductive stage of birds can be inferred by recognizing their sounds, bird voice recognition can assist phenological research on the bird breeding season.
Disclosure of Invention
The invention aims to provide a bird voice recognition method for assisting the phenological study of the breeding period of birds.
To solve the technical problem, the following technical scheme is adopted. In this bird voice recognition method for assisting the phenological study of the bird breeding season, field recording segments containing several birdsong segments are read. A recognition algorithm then identifies the species of bird to which each song segment in the recording belongs, gives a recognition confidence, and records the time at which the song segment occurs in the recording. Finally, the number of birds singing, that is, the number of birds that have entered the breeding season, is calculated with reference to the recording time. When, after a certain time, this number exceeds a preset threshold, the birds in the region are considered to have entered the breeding season from that moment; conversely, when the number later falls below the threshold, the birds are considered to have ended the breeding season.
The specific steps of the recognition algorithm are as follows:
1) Source separation: semi-supervised non-negative matrix factorization is used.
2) Preprocessing: the signal is passed through a low-pass filter, and frequency compensation is then applied.
3) Segmentation: short-time energy is used to find the transition points between silence and calls. The short-time energy of the recording is calculated first, and sound segments are then found according to a threshold.
4) Feature extraction: overlapping windows are applied to each sound segment, each window forming a frame. Time-domain and frequency-domain features are extracted from the values in each window, most of the frequency-domain features being based on the short-time Fourier transform (STFT). The time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of the frame.
5) Dimensionality reduction and noise reduction: PCA is used as the means of dimensionality reduction.
6) Modeling and recognition: a hidden Markov model (HMM) is established for the song of each bird species. The model is first initialized with segmental k-means, and the HMM is then trained with the forward-backward algorithm. Once the HMMs are established, each new recording to be processed undergoes source separation, preprocessing, segmentation, feature extraction, and PCA, and the resulting feature sequence is compared with each trained HMM by decoding with the Viterbi algorithm, which yields a confidence. The model with the highest confidence is selected as the recognition result.
Drawings
FIG. 1 is a schematic diagram of the technical route of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
First, a field recording segment is read; it contains several segments of bird sound. The recognition algorithm then identifies the species of bird to which each song segment in the recording belongs, gives a recognition confidence, and records the time at which the song segment occurs in the recording. Finally, combined with the recording time, the number of birds singing, that is, the number of birds that have entered the breeding season, is calculated.
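The counting and threshold rule described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the `(day, species, confidence)` tuple layout, the `min_confidence` cut-off, and the choice to count one bird per accepted song segment are all assumptions made for the example.

```python
from datetime import date

def daily_song_counts(detections, species, days, min_confidence=0.5):
    """Count accepted song detections of one species for every recording day.

    detections: iterable of (day, species, confidence) tuples (assumed layout).
    days: all recording days, so that days without any song get a count of 0.
    """
    counts = {day: 0 for day in days}
    for day, detected_species, confidence in detections:
        if (detected_species == species
                and confidence >= min_confidence
                and day in counts):
            counts[day] += 1
    return sorted(counts.items())

def breeding_season(daily_counts, threshold):
    """Return (onset, end): the first day the count exceeds the threshold and
    the first later day it falls below it (None where not observed)."""
    onset = end = None
    for day, count in daily_counts:
        if onset is None:
            if count > threshold:
                onset = day
        elif count < threshold:
            end = day
            break
    return onset, end

# Made-up example data: five recording days with 1, 3, 4, 1, 0 accepted songs.
days = [date(2017, 3, d) for d in range(1, 6)]
songs_per_day = [1, 3, 4, 1, 0]
detections = [(day, "hypothetical_species", 0.9)
              for day, n in zip(days, songs_per_day) for _ in range(n)]
counts = daily_song_counts(detections, "hypothetical_species", days)
onset, end = breeding_season(counts, threshold=2)
```

Passing the full list of recording days matters here: a day on which no song is detected must still appear with a count of zero, otherwise the end of the season would never be observed.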
The identification algorithm comprises the following specific steps:
1) Semi-supervised NMF
Semi-supervised NMF is used for source separation. The sound captured by a recorder is a mixture of several sounds that sometimes overlap; source separation is the technique used to separate these sounds from one another.
NMF stands for non-negative matrix factorization. It is well suited to source separation: it decomposes a sound into a weighted combination of bases, so the result of source separation is a set of bases and their corresponding weights.
Semi-supervised NMF first trains on known data of a specific class to obtain the bases corresponding to that class, and then applies the NMF algorithm to the data to be processed, using those bases together with additional initial vectors. The separated result is obtained from the pre-trained bases of the known class and their weights, and is passed on to the subsequent steps.
Semi-supervised NMF achieves good separation and also suppresses noise effectively. In some environments it may outperform other noise reduction methods. Conventional noise reduction requires knowledge of the properties of the noise, but the conditions under which noise arises are highly uncertain and its properties cannot be described accurately in advance, so conventional methods perform poorly. A method based on semi-supervised NMF does not need to know the properties of the noise in advance, and its noise reduction is therefore more effective.
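The idea of semi-supervised NMF described above can be sketched in plain Python. The patent does not specify the update rule or cost function; this sketch assumes the standard multiplicative updates for the squared-error cost, keeps the pre-trained bird bases fixed, and lets a few free bases absorb everything else. All names (`semi_supervised_nmf`, `W_known`, `n_free`) and the toy matrices are hypothetical.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(V, W, H):
    """Squared Frobenius reconstruction error ||V - WH||^2."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr))

def semi_supervised_nmf(V, W_known, n_free, n_iter=200, seed=0, eps=1e-9):
    """Factor V ~ [W_known | W_free] H with the known bases held fixed."""
    rng = random.Random(seed)
    n_rows, n_cols = len(V), len(V[0])
    k_known = len(W_known[0])
    W = [list(W_known[i]) + [rng.random() + eps for _ in range(n_free)]
         for i in range(n_rows)]
    H = [[rng.random() + eps for _ in range(n_cols)]
         for _ in range(k_known + n_free)]
    for _ in range(n_iter):
        # Update all weights: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # Update only the free bases: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        for i in range(n_rows):
            for j in range(k_known, k_known + n_free):
                W[i][j] *= num[i][j] / (den[i][j] + eps)
    return W, H

# Toy magnitude spectrogram (3 frequency bins x 4 frames) and one
# hypothetical pre-trained bird basis.
W_bird = [[1.0], [0.0], [0.5]]
V = [[1.0, 2.0, 0.5, 0.0],
     [0.3, 0.1, 0.4, 0.2],
     [0.5, 1.0, 0.3, 0.1]]
W, H = semi_supervised_nmf(V, W_bird, n_free=1)
bird_part = matmul(W_bird, H[:1])  # estimate of the bird component of V
```

In this arrangement the free bases play the role of the unknown noise, which is why no noise model has to be supplied in advance.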
2) Preprocessing
Preprocessing consists of two steps: the signal is first passed through a low-pass filter, and frequency compensation is then applied.
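The two preprocessing steps might look like the sketch below. The patent names neither the filter design nor the form of frequency compensation, so both choices here are assumptions: a one-pole IIR low-pass filter, and first-order pre-emphasis as the compensation step. The parameter values `alpha` and `coeff` are illustrative only.

```python
def low_pass(signal, alpha=0.5):
    """One-pole IIR low-pass filter: y[n] = alpha * x[n] + (1 - alpha) * y[n-1]."""
    out, prev = [], 0.0
    for x in signal:
        prev = alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out

def pre_emphasis(signal, coeff=0.97):
    """First-order pre-emphasis: y[n] = x[n] - coeff * x[n-1] (assumed compensation)."""
    if not signal:
        return []
    return [signal[0]] + [x - coeff * p for x, p in zip(signal[1:], signal)]

def preprocess(signal, alpha=0.5, coeff=0.97):
    """Low-pass filtering followed by frequency compensation, in the order given above."""
    return pre_emphasis(low_pass(signal, alpha), coeff)
```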
3) Segmentation
Recordings are long and contain both silence and calls. The silent parts must be removed first so that only the parts containing calls remain, so the sound needs to be segmented. Short-time energy is used to find the transition points (endpoints) between silence and calls.
The short-time energy of the recording is calculated first, and sound segments are then found according to a threshold.
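A minimal sketch of energy-based segmentation follows. It assumes the common definition of short-time energy as the sum of squared samples in each frame, with a rectangular window; the frame length, hop size, and threshold are hypothetical parameters that the patent does not specify.

```python
def short_time_energy(signal, frame_len, hop):
    """Sum of squared samples in each frame (rectangular window assumed)."""
    return [sum(x * x for x in signal[start:start + frame_len])
            for start in range(0, len(signal) - frame_len + 1, hop)]

def find_segments(energies, threshold):
    """Return (start_frame, end_frame) pairs, end exclusive, where the
    energy stays above the threshold."""
    segments, start = [], None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i                      # silence -> call transition
        elif energy <= threshold and start is not None:
            segments.append((start, i))    # call -> silence transition
            start = None
    if start is not None:                  # recording ends during a call
        segments.append((start, len(energies)))
    return segments

# Toy signal: silence, a burst of sound, silence again.
signal = [0.0] * 8 + [1.0] * 8 + [0.0] * 8
energies = short_time_energy(signal, frame_len=4, hop=4)
segments = find_segments(energies, threshold=1.0)
```

Frame indices can be converted back to sample positions by multiplying by `hop`.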
4) Feature extraction
For each call segment obtained, features need to be extracted. Overlapping windows are first applied to the sound segment, each window being referred to as a frame, and time-domain and frequency-domain features are extracted from the values within each window. Most frequency-domain features are based on the short-time Fourier transform (STFT). The time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of the frame.
The time-domain features are: zero-crossing rate, short-time energy, entropy of energy.
The frequency-domain features are: MFCC, spectral centroid, spectral spread, spectral entropy, spectral flux, spectral rolloff.
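The framing and per-frame feature vector can be illustrated with a small subset of the features listed above. This sketch computes only the zero-crossing rate, short-time energy, and spectral centroid, using a direct DFT for clarity rather than speed; the function names and the feature order within the vector are assumptions, and MFCC and the remaining spectral features are omitted.

```python
import cmath
import math

def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames (overlap = frame_len - hop)."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def magnitude_spectrum(frame):
    """One-sided DFT magnitudes for bins 0 .. N/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def spectral_centroid(mags, sample_rate, frame_len):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / frame_len * m for k, m in enumerate(mags)) / total

def frame_features(frame, sample_rate):
    """Combine time-domain and frequency-domain features into one vector."""
    mags = magnitude_spectrum(frame)
    return [zero_crossing_rate(frame),
            sum(x * x for x in frame),
            spectral_centroid(mags, sample_rate, len(frame))]
```

For a whole segment, the feature sequence would be `[frame_features(f, sample_rate) for f in frames(segment, frame_len, hop)]`.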
5) PCA
Because the feature vectors obtained have a high dimension, operating on them directly is computationally expensive, and they also contain some noise. The data therefore needs dimensionality reduction, and PCA is used for this purpose.
PCA stands for principal component analysis. It is an effective means of dimensionality reduction: it lowers the dimension of the data, reduces the amount of computation, and can remove much of the noise, thereby improving system performance.
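As an illustration of the dimensionality reduction step, the sketch below implements PCA with power iteration and deflation on the sample covariance matrix. This is one of several ways to compute principal components and is chosen here only because it needs no external library; the function name `pca` and its parameters are hypothetical, and the fixed starting vector is an assumption that can fail in the rare case where it is orthogonal to the leading component.

```python
import math

def pca(rows, n_components, n_iter=200):
    """Return (mean, components, projected) for a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        v = [1.0 / math.sqrt(d)] * d
        for _ in range(n_iter):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue of the component just found.
        eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
        components.append(v)
        # Deflation: remove this component so the next iteration finds another.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
                 for c in centered]
    return mean, components, projected

# Toy data lying on the line y = 2x: one component captures all the variance.
rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, components, projected = pca(rows, n_components=1)
```

New feature vectors would be reduced with the same `mean` and `components` obtained from the training data.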
6) HMM
HMM stands for hidden Markov model, a well-known mathematical model for time-series modeling. Compared with other methods, HMMs offer higher recognition efficiency and robustness.
An HMM is established for the song of each bird species. The model is first initialized with segmental k-means, and the HMM is then trained with the forward-backward algorithm.
After training, each new feature sequence to be recognized, obtained after PCA processing, is decoded against each HMM with the Viterbi algorithm. The Viterbi algorithm yields a probability for each HMM, and the HMM with the highest probability, or the several highest if required, is selected as the result.
The output is the bird species and the confidence.
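The decoding step can be sketched as follows: score the observation sequence against every species model with the Viterbi algorithm in the log domain, then pick the model with the highest score. To keep the example short it assumes discrete observation symbols, whereas the feature vectors described above are continuous and would need, for example, Gaussian emission densities. Training is not shown, and the model parameters and species names are made up.

```python
import math

def _log(p):
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi_log_score(obs, start, trans, emit):
    """Log-probability of the single best state path for a discrete HMM.

    start[s]: initial probability of state s
    trans[r][s]: probability of moving from state r to state s
    emit[s][o]: probability that state s emits symbol o
    """
    n_states = len(start)
    delta = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n_states)]
    for o in obs[1:]:
        delta = [max(delta[r] + _log(trans[r][s]) for r in range(n_states))
                 + _log(emit[s][o])
                 for s in range(n_states)]
    return max(delta)

def classify(obs, models):
    """Return the best-scoring model name and the scores of all models."""
    scores = {name: viterbi_log_score(obs, *params)
              for name, params in models.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Two hypothetical two-state models over the symbol alphabet {0, 1}.
models = {
    "species_a": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]]),
    "species_b": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.1, 0.9], [0.3, 0.7]]),
}
best, scores = classify([0, 0, 1, 0], models)
```

Sorting `scores` in descending order gives the several most likely species when more than one candidate is wanted.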
The embodiments described above merely illustrate the principles and effects of the present invention. It will be apparent to those skilled in the art that various changes and modifications may be made without departing from the inventive concept of the present invention, and such changes and modifications fall within the scope of protection of the present invention.
Claims (1)
1. A bird voice recognition method for assisting the phenological study of the bird breeding season, characterized in that: field recording segments are read, the recordings containing a plurality of birdsong segments; a recognition algorithm then recognizes the species of bird to which each song segment in the recording belongs and gives a recognition confidence, and the actual recording date of the segment is recorded; finally, the number of birds singing in all recordings in the region, namely the number of birds that have entered the breeding season, is calculated from the recognition results of the algorithm; when, after a certain time, the number exceeds a preset threshold, the birds in the region are considered to have entered the breeding season from that moment, and conversely, when the number subsequently falls below the threshold, the birds are considered to have ended the breeding season; the specific steps of the recognition algorithm are as follows: 1) semi-supervised non-negative matrix factorization is used for source separation; 2) the signal is passed through a low-pass filter, and frequency compensation is then applied; 3) the sound is segmented: short-time energy is used to find the transition points between silence and calls, the short-time energy of the recording being calculated first and the sound segments then being found according to a threshold; 4) feature extraction: overlapping windows are applied to each sound segment, each window forming a frame, time-domain and frequency-domain features are extracted from the values in each window, most of the frequency-domain features being based on the short-time Fourier transform (STFT), and the time-domain and frequency-domain features are then combined into one vector that serves as the feature vector of the frame; 5) dimensionality reduction and noise reduction: PCA is used as the means of dimensionality reduction; 6) a hidden Markov model (HMM) is established for the song of each bird species, the model first being initialized with segmental k-means and the HMM then being trained with the forward-backward algorithm; after the HMMs are established, a new recording to be processed undergoes source separation, preprocessing, segmentation, feature extraction, and PCA, and the resulting feature sequence is compared with each trained HMM, namely by decoding with the Viterbi algorithm, to obtain a confidence; the model with the highest confidence is selected as the recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710583313.8A CN107369451B (en) | 2017-07-18 | 2017-07-18 | Bird voice recognition method for assisting phenological study of bird breeding period |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107369451A | 2017-11-21 |
CN107369451B | 2020-12-22 |
Family
ID=60308665
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710583313.8A Active CN107369451B (en) | 2017-07-18 | 2017-07-18 | Bird voice recognition method for assisting phenological study of bird breeding period |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107369451B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108898164A (en) * | 2018-06-11 | 2018-11-27 | 南京理工大学 | A kind of chirping of birds automatic identifying method based on Fusion Features |
CN110120224B (en) * | 2019-05-10 | 2023-01-20 | 平安科技(深圳)有限公司 | Method and device for constructing bird sound recognition model, computer equipment and storage medium |
CN110335613B (en) * | 2019-05-28 | 2021-07-09 | 广东工业大学 | Bird identification method adopting pickup for real-time detection |
CN113707158A (en) * | 2021-08-02 | 2021-11-26 | 南昌大学 | Power grid harmful bird seed singing recognition method based on VGGish migration learning network |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102708860A (en) * | 2012-06-27 | 2012-10-03 | 昆明信诺莱伯科技有限公司 | Method for establishing judgment standard for identifying bird type based on sound signal |
CN104658538A (en) * | 2013-11-18 | 2015-05-27 | 中国计量学院 | Mobile bird recognition method based on birdsong |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9058384B2 (en) * | 2012-04-05 | 2015-06-16 | Wisconsin Alumni Research Foundation | System and method for identification of highly-variable vocalizations |
US9177559B2 (en) * | 2012-04-24 | 2015-11-03 | Tom Stephenson | Method and apparatus for analyzing animal vocalizations, extracting identification characteristics, and using databases of these characteristics for identifying the species of vocalizing animals |
CN102930870B (en) * | 2012-09-27 | 2014-04-09 | 福州大学 | Bird voice recognition method using anti-noise power normalization cepstrum coefficients (APNCC) |
US8670986B2 (en) * | 2012-10-04 | 2014-03-11 | Medical Privacy Solutions, Llc | Method and apparatus for masking speech in a private environment |
CN103117061B (en) * | 2013-02-05 | 2016-01-20 | 广东欧珀移动通信有限公司 | A kind of voice-based animals recognition method and device |
CN103489446B (en) * | 2013-10-10 | 2016-01-06 | 福州大学 | Based on the twitter identification method that adaptive energy detects under complex environment |
CN103474072B (en) * | 2013-10-11 | 2016-06-01 | 福州大学 | Utilize the quick anti-noise chirping of birds sound recognition methods of textural characteristics and random forest |
CN103985385A (en) * | 2014-05-30 | 2014-08-13 | 安庆师范学院 | Method for identifying Batrachia individual information based on spectral features |
CN104102923A (en) * | 2014-07-16 | 2014-10-15 | 西安建筑科技大学 | Nipponia nippon individual recognition method based on MFCC algorithm |
CN104882144B (en) * | 2015-05-06 | 2018-10-30 | 福州大学 | Animal sounds recognition methods based on sonograph bicharacteristic |
CN106504762B (en) * | 2016-11-04 | 2023-04-14 | 中南民族大学 | Bird community number estimation system and method |
Also Published As
Publication number | Publication date |
---|---|
CN107369451A (en) | 2017-11-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||