CN107369451B - Bird voice recognition method for assisting phenological study of bird breeding period - Google Patents

Bird voice recognition method for assisting phenological study of bird breeding period

Info

Publication number
CN107369451B
Authority
CN
China
Prior art keywords
birds
bird
algorithm
recording
segments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710583313.8A
Other languages
Chinese (zh)
Other versions
CN107369451A (en)
Inventor
刘丰
李晟
申小莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING COMPUTING CENTER
Original Assignee
BEIJING COMPUTING CENTER
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-07-18
Filing date
2017-07-18
Publication date
2020-12-22
Application filed by BEIJING COMPUTING CENTER
Priority to CN201710583313.8A
Publication of CN107369451A
Application granted
Publication of CN107369451B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/26: Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G10L17/04: Training, enrolment or model building
    • G10L17/16: Hidden Markov models [HMM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

A bird voice recognition method for assisting the phenological study of the bird breeding season. Field recording segments, each containing several segments of bird song, are read. A recognition algorithm then identifies the bird species to which each song segment in the recording belongs and gives a recognition confidence, and the actual recording date of the segment is recorded. Finally, the number of birds singing across all recordings in the region, that is, the number of birds that have entered the breeding season, is calculated from the recognition results of the algorithm. Once, after some time, this number exceeds a preset threshold, the birds in the region are considered to have entered the breeding season from that moment; conversely, once the number has fallen back past the threshold, the birds are considered to have ended the breeding season.

Description

Bird voice recognition method for assisting phenological study of bird breeding period
Technical Field
The invention relates to the technical field of bird voice recognition, in particular to a bird voice recognition method for assisting the phenological study of bird breeding season.
Background
Biologically, bird vocalizations are classified into calls (bird call) and songs (bird song). Bird song refers to the vocalizations that birds produce during the breeding season. The song pattern of a given species is very stable, while the songs of different species often differ greatly. Bird song can therefore be used as a means of identifying bird species.
Phenology is the discipline that studies the relationship between animals and periodic changes in the environment. One branch studies the relationship between the breeding stage of birds and cyclic environmental changes. Because the breeding stage of birds can be determined by recognizing their songs, bird voice recognition can assist phenological research on the bird breeding season.
Disclosure of Invention
The invention aims to provide a bird voice recognition method for assisting the phenological study of the breeding period of birds.
To solve the technical problem, the following technical scheme is adopted. In a bird voice recognition method for assisting the phenological study of the bird breeding season, field recording segments, each containing several segments of bird song, are read. A recognition algorithm then identifies the bird species to which each song segment in the recording belongs and gives a recognition confidence, and the time at which the song segment occurs within the recording segment is recorded. Finally, combined with the recording time, the number of birds producing songs, that is, the number of birds that have entered the breeding season, is calculated. Once, after some time, this number exceeds a preset threshold, the birds in the region are considered to have entered the breeding season from that moment; conversely, once the number has fallen back past the threshold, the birds are considered to have ended the breeding season.
The specific steps of the recognition algorithm are as follows:
1) Source separation: semi-supervised non-negative matrix factorization (NMF) is used for source separation.
2) Preprocessing: the signal is passed through a low-pass filter, and frequency compensation is then performed.
3) Segmentation: short-time energy is used to find the transition points from silence to song. The short-time energy of the recording is calculated first:
[Formula image BDA0001352845910000011: short-time energy of the recording; not reproduced in this text]
Sound segments are then located according to a threshold.
4) Feature extraction: overlapping windows are applied to each sound segment, each window forming a frame. Time-domain and frequency-domain features are extracted from the values in each window; most of the frequency-domain features are based on the short-time Fourier transform (STFT). The time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of the frame.
5) Dimensionality reduction and noise reduction: PCA is used for dimensionality reduction.
6) Modeling and recognition: a hidden Markov model (HMM) is built for each bird song. The model is first initialized with segmental k-means, and the HMM is then trained with the forward-backward algorithm. Once the HMMs are built, a new recording to be processed undergoes source separation, preprocessing, segmentation, feature extraction, and PCA, and the resulting feature sequence is compared against each trained HMM, that is, decoded with the Viterbi algorithm to obtain a confidence. The model with the highest confidence is selected as the recognition result.
Drawings
FIG. 1 is a schematic diagram of the technical route of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
First, a field recording segment containing several segments of bird song is read. The recognition algorithm then identifies the bird species to which each song segment in the recording belongs and gives a recognition confidence, and the time at which the song segment occurs within the recording segment is recorded. Finally, combined with the recording time, the number of birds producing songs, that is, the number of birds that have entered the breeding season, is calculated.
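The counting and threshold logic described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `(date, species, confidence)` tuple layout, the confidence cut-off, and the use of the number of confidently recognized song segments as a stand-in for the number of singing birds are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def daily_singer_counts(detections, species, min_confidence=0.5):
    """Count confidently recognized song segments of one species per date.

    detections: iterable of (recording_date, species, confidence) tuples
    (hypothetical layout). The segment count is used here as a proxy for
    the number of singing birds, which is an assumption of this sketch.
    """
    counts = defaultdict(int)
    for day, name, confidence in detections:
        if name == species and confidence >= min_confidence:
            counts[day] += 1
    return dict(counts)


def breeding_season(counts, threshold):
    """Return (start, end): the first date on which the count exceeds the
    threshold, and the first later date on which it is back at or below it.
    Either value is None if the corresponding crossing never happens."""
    start = end = None
    for day in sorted(counts):
        if start is None and counts[day] > threshold:
            start = day
        elif start is not None and counts[day] <= threshold:
            end = day
            break
    return start, end


detections = (
    [(date(2017, 3, 1), "A", 0.9)]
    + [(date(2017, 3, 2), "A", c) for c in (0.9, 0.8, 0.3)]
    + [(date(2017, 3, 3), "A", 0.9)] * 3
    + [(date(2017, 3, 4), "A", 0.9)]
)
counts = daily_singer_counts(detections, "A")
season = breeding_season(counts, threshold=2)
```

In practice the daily counts would probably be smoothed over several days before thresholding, so that a single noisy day does not start or end the season.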
The identification algorithm comprises the following specific steps:
1) Semi-supervised NMF
Semi-supervised NMF is used for source separation. The sound captured by the recorder is a mixture of several sounds that sometimes overlap, and source separation is the technique used to separate these different sounds.
NMF stands for non-negative matrix factorization. It is a well-established method for source separation. It decomposes the sound into a weighted combination of bases, and the resulting set of bases and corresponding weights constitutes the source separation result.
Semi-supervised NMF means first training on known data of a specific class to obtain the bases for that class, and then applying the NMF algorithm to the data to be processed using those bases together with additional, freely initialized vectors. The pre-trained bases of the known class and their weights give the separated result, which is passed on to subsequent processing.
Semi-supervised NMF achieves good separation and, in addition, effectively suppresses noise. In some environments it may outperform other noise reduction methods, because conventional noise reduction requires knowledge of the nature of the noise, whereas the conditions under which noise arises are highly uncertain and its nature cannot be described accurately in advance. Conventional methods therefore perform poorly, while methods based on semi-supervised NMF do not need to know the nature of the noise in advance and so give better noise reduction.
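As a rough illustration of the idea, the sketch below keeps pre-trained bird bases fixed and learns a few extra "noise" bases on the mixture spectrogram. It assumes magnitude spectrograms as input, a Euclidean cost with multiplicative updates, and a soft-mask reconstruction; none of these choices is stated in the patent, and the names `nmf`, `separate_known_source`, and `n_noise` are hypothetical.

```python
import numpy as np


def nmf(V, rank, n_iter=200, fixed_W=None, seed=0):
    """Factor V ~ W @ H with multiplicative updates (Euclidean cost).

    If fixed_W is given, its columns are kept unchanged and `rank`
    additional free columns are learned next to them.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-9
    n_fixed = 0 if fixed_W is None else fixed_W.shape[1]
    W_free = rng.random((V.shape[0], rank)) + eps
    W = W_free if fixed_W is None else np.hstack([fixed_W, W_free])
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W_new = W * (V @ H.T) / (W @ H @ H.T + eps)
        W[:, n_fixed:] = W_new[:, n_fixed:]  # only the free bases move
    return W, H


def separate_known_source(V_mix, W_bird, n_noise=2, n_iter=200):
    """Estimate the part of V_mix explained by the pre-trained bird bases."""
    W, H = nmf(V_mix, n_noise, n_iter, fixed_W=W_bird)
    k = W_bird.shape[1]
    bird = W[:, :k] @ H[:k]
    total = W @ H + 1e-9
    return V_mix * bird / total  # soft mask with values in [0, 1)


rng = np.random.default_rng(1)
V_mix = rng.random((6, 10))   # stand-in for a mixture magnitude spectrogram
W_bird = rng.random((6, 2))   # stand-in for bases trained on clean bird song
bird_estimate = separate_known_source(V_mix, W_bird)
```

The bird bases themselves would be obtained beforehand by calling `nmf(V_train, rank)` on spectrograms of clean recordings of the target class and keeping the returned `W`.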
2) Preprocessing
Preprocessing consists of two steps. The signal is first passed through a low-pass filter, and frequency compensation is then performed.
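A possible reading of these two steps is sketched below. The patent does not specify the filter design or what "frequency compensation" means; the sketch assumes a windowed-sinc FIR low-pass filter and interprets frequency compensation as a first-order pre-emphasis filter, which is a common choice in speech processing but only an assumption here. The cutoff, tap count, and `alpha` values are illustrative.

```python
import numpy as np


def lowpass_fir(signal, cutoff_hz, sample_rate, num_taps=101):
    """Windowed-sinc FIR low-pass filter (Hamming window)."""
    fc = cutoff_hz / sample_rate                 # cutoff in cycles per sample
    n = np.arange(num_taps) - (num_taps - 1) / 2
    taps = 2 * fc * np.sinc(2 * fc * n) * np.hamming(num_taps)
    taps /= taps.sum()                           # unity gain at 0 Hz
    return np.convolve(signal, taps, mode="same")


def pre_emphasis(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; boosts high frequencies relative to low."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])


sample_rate = 8000
t = np.arange(4000) / sample_rate
low_tone = np.sin(2 * np.pi * 200 * t)
high_tone = np.sin(2 * np.pi * 3000 * t)
low_out = lowpass_fir(low_tone, 1000, sample_rate)
high_out = lowpass_fir(high_tone, 1000, sample_rate)
compensated = pre_emphasis(low_out)
```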
3) Segmentation
Recordings are long and contain both silence and calls. The silent parts must therefore be removed first, leaving only the parts that contain calls, so the sound needs to be segmented (segmentation). Short-time energy is used to find the transition points (endpoints) from silence to call.
First, the short-time energy of the recording is calculated, and the sound segments are then located according to a threshold.
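The segmentation step can be sketched as follows. The formula in the original is only available as an image, so the sketch uses a common textbook definition of short-time energy, E_n = Σ_m [x(m) w(n − m)]², which may differ in detail from the patent's formula. The Hamming window, frame length, hop size, and relative threshold are assumptions for the example.

```python
import numpy as np


def short_time_energy(signal, frame_len, hop):
    """Energy of each Hamming-windowed frame of the signal."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    return np.array([
        np.sum((signal[i * hop:i * hop + frame_len] * window) ** 2)
        for i in range(n_frames)
    ])


def find_segments(energy, threshold):
    """Return (start_frame, end_frame) pairs, end exclusive, for every run
    of consecutive frames whose energy is above the threshold."""
    active = energy > threshold
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(changes[0::2], changes[1::2])]


# Synthetic recording: silence, a 440 Hz tone, silence (8 kHz sample rate).
sample_rate = 8000
tone = np.sin(2 * np.pi * 440 * np.arange(1000) / sample_rate)
recording = np.concatenate([np.zeros(1000), tone, np.zeros(1000)])

energy = short_time_energy(recording, frame_len=200, hop=100)
segments = find_segments(energy, threshold=0.1 * energy.max())
```

Frame indices convert back to sample positions as `start_frame * hop` and `(end_frame - 1) * hop + frame_len`.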
4) Feature extraction
Features must be extracted from each of the call segments obtained. Overlapping windows are first applied to the sound segment, each window being referred to as a frame, and time-domain and frequency-domain features are extracted from the values within each window. Most frequency-domain features are based on the short-time Fourier transform (STFT). The time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of the frame.
The time-domain features are: zero-crossing rate, short-time energy, and entropy of energy.
The frequency-domain features are: MFCC, spectral centroid, spectral spread, spectral entropy, spectral flux, and spectral rolloff.
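A subset of these features can be computed per frame as sketched below. The exact definitions used in the patent are not given, so the formulas here follow common conventions and should be read as assumptions; MFCC and entropy of energy are omitted to keep the example short, and the frame length, hop size, and 90% rolloff fraction are illustrative.

```python
import numpy as np


def frame_features(frame, sample_rate, prev_mag=None):
    """Return (feature_vector, magnitude_spectrum) for one frame."""
    eps = 1e-12
    # Time-domain features.
    zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
    energy = np.sum(frame ** 2) / len(frame)
    # Frequency-domain features from the windowed magnitude spectrum.
    mag = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    p = mag / (mag.sum() + eps)                    # normalized spectrum
    centroid = np.sum(freqs * p)
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * p))
    entropy = -np.sum(p * np.log2(p + eps))
    cumulative = np.cumsum(mag ** 2)
    rolloff = freqs[np.searchsorted(cumulative, 0.9 * cumulative[-1])]
    if prev_mag is None:
        flux = 0.0                                 # no previous frame yet
    else:
        flux = np.sum((p - prev_mag / (prev_mag.sum() + eps)) ** 2)
    vector = np.array([zcr, energy, centroid, spread, entropy, flux, rolloff])
    return vector, mag


def extract_features(segment, sample_rate, frame_len=512, hop=256):
    """Stack one feature vector per overlapping frame (rows = frames)."""
    feats, prev = [], None
    for start in range(0, len(segment) - frame_len + 1, hop):
        vec, prev = frame_features(segment[start:start + frame_len],
                                   sample_rate, prev)
        feats.append(vec)
    return np.vstack(feats)


sample_rate = 8000
t = np.arange(2048) / sample_rate
features_1k = extract_features(np.sin(2 * np.pi * 1000 * t + 0.3), sample_rate)
features_3k = extract_features(np.sin(2 * np.pi * 3000 * t + 0.3), sample_rate)
```

Each row of the returned matrix is the feature vector of one frame, with columns in the order zero-crossing rate, energy, centroid, spread, entropy, flux, rolloff.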
5) PCA
Because the feature vectors obtained have a high dimension, operating on them directly is computationally expensive, and they also contain some noise. The data therefore need dimensionality reduction, for which PCA is used.
PCA stands for principal component analysis. It is an effective means of reducing data dimensionality and computation, and it can also remove much of the noise, thereby improving system performance.
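A minimal PCA via the singular value decomposition might look like this. The function names and the number of retained components are assumptions; the patent does not state how many components are kept or whether features are standardized first.

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA on X (rows = frames, columns = features).

    Returns the feature mean, the top principal directions (as rows),
    and the fraction of variance explained by each of them.
    """
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return mean, Vt[:n_components], explained[:n_components]


def pca_transform(X, mean, components):
    """Project X onto the fitted principal directions."""
    return (X - mean) @ components.T


# Synthetic data lying close to a single direction in 3-D.
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = t * np.array([1.0, 2.0, 3.0]) + 0.01 * rng.normal(size=(100, 3))

mean, components, explained = pca_fit(X, n_components=1)
Z = pca_transform(X, mean, components)
```

Because features such as spectral centroid (in Hz) and zero-crossing rate (a fraction) live on very different scales, scaling each feature to unit variance before PCA is usually advisable.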
6) HMM
HMM stands for hidden Markov model. It is a well-known mathematical model for time-series modeling. Compared with other methods, the HMM offers higher recognition efficiency and robustness.
An HMM is built for each bird song. The model is first initialized with segmental k-means, and the HMM is then trained with the forward-backward algorithm.
After training, each feature sequence to be recognized, obtained after PCA processing, is decoded with the Viterbi algorithm against each HMM. The Viterbi algorithm yields a probability, and the HMM or HMMs with the highest probability may be selected as the result, as required.
The HMM stage outputs the bird species and the confidence.
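The recognition stage can be sketched as below, assuming each species model is an HMM with diagonal-covariance Gaussian emissions whose parameters have already been trained. The emission type, the parameter layout `(log_pi, log_A, means, variances)`, and the use of the Viterbi log-probability as the confidence score are assumptions of this sketch; training (segmental k-means initialization and forward-backward re-estimation) is not shown.

```python
import numpy as np


def log_gaussian(X, means, variances):
    """Log density of every frame (row of X) under every state's
    diagonal Gaussian. Returns an array of shape (frames, states)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(diff ** 2 / variances
                         + np.log(2 * np.pi * variances), axis=2)


def viterbi_log_score(X, log_pi, log_A, means, variances):
    """Log-probability of the single best state path through the HMM."""
    log_b = log_gaussian(X, means, variances)
    delta = log_pi + log_b[0]
    for t in range(1, len(X)):
        # Best predecessor for each state, then emit frame t.
        delta = np.max(delta[:, None] + log_A, axis=0) + log_b[t]
    return float(np.max(delta))


def recognize(X, models):
    """Score X against every model; return the best name and all scores."""
    scores = {name: viterbi_log_score(X, *params)
              for name, params in models.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Two toy 2-state models over 1-D features (hypothetical parameters).
log_pi = np.log(np.array([0.5, 0.5]))
log_A = np.log(np.array([[0.5, 0.5], [0.5, 0.5]]))
unit_var = np.ones((2, 1))
models = {
    "species_a": (log_pi, log_A, np.array([[0.0], [1.0]]), unit_var),
    "species_b": (log_pi, log_A, np.array([[5.0], [6.0]]), unit_var),
}
sequence = np.array([[0.1], [0.9], [1.0]])
best, scores = recognize(sequence, models)
```

Sorting `scores` in descending order gives the several most probable models when more than one candidate is wanted.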
The above embodiments merely illustrate the principles and effects of the present invention. It will be apparent to those skilled in the art that various changes and modifications may be made without departing from the inventive concept of the present invention, and all such changes and modifications fall within the scope of protection of the present invention.

Claims (1)

1. A bird voice recognition method for assisting the phenological study of the bird breeding season, characterized in that field recording segments are read, the sound comprising a plurality of bird song segments; a recognition algorithm is then used to recognize the bird species to which the song segments in the recording belong, and a recognition confidence is given; the actual recording date of the segment of sound is recorded; finally, the number of birds singing in all recordings in the region, namely the number of birds entering the breeding season, is calculated according to the recognition results of the algorithm; once, after a certain time, the number exceeds a preset threshold, the birds in the region are considered to have entered the breeding season from that moment; conversely, once the number has fallen back past the threshold, the birds are considered to have ended the breeding season; and the specific steps of the recognition algorithm comprise the following steps: 1) adopting semi-supervised non-negative matrix factorization for source separation; 2) passing the signal through a low-pass filter and then performing frequency compensation; 3) segmenting the sound: finding the transition points from silence to song by using the short-time energy, first calculating the short-time energy of the recording:
[Formula image FDA0002469868220000011: short-time energy of the recording; not reproduced in this text]
then finding the sound segments according to a threshold; 4) feature extraction: first applying overlapping windows to a sound segment, each window forming a frame, extracting time-domain features and frequency-domain features from the values in each window, most of the frequency-domain features being based on the short-time Fourier transform (STFT), and then combining the time-domain features and the frequency-domain features into one vector used as the feature vector of the frame; 5) dimensionality reduction and noise reduction: using PCA as the means of dimensionality reduction; 6) establishing a mathematical model for each bird song by adopting a hidden Markov model (HMM), first initializing the model with segmental k-means and then training the HMM with the forward-backward algorithm; after the HMMs are established, subjecting a new recording to be processed to source separation, preprocessing, segmentation, feature extraction and PCA, then comparing the obtained feature sequence with each trained HMM, namely decoding with the Viterbi algorithm to obtain the confidence, and selecting the model with the highest confidence as the recognition result.
CN201710583313.8A 2017-07-18 2017-07-18 Bird voice recognition method for assisting phenological study of bird breeding period Active CN107369451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710583313.8A CN107369451B (en) 2017-07-18 2017-07-18 Bird voice recognition method for assisting phenological study of bird breeding period

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710583313.8A CN107369451B (en) 2017-07-18 2017-07-18 Bird voice recognition method for assisting phenological study of bird breeding period

Publications (2)

Publication Number Publication Date
CN107369451A (en) 2017-11-21
CN107369451B (en) 2020-12-22

Family

ID=60308665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710583313.8A Active CN107369451B (en) 2017-07-18 2017-07-18 Bird voice recognition method for assisting phenological study of bird breeding period

Country Status (1)

Country Link
CN (1) CN107369451B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108898164A (en) * 2018-06-11 2018-11-27 南京理工大学 A kind of chirping of birds automatic identifying method based on Fusion Features
CN110120224B (en) * 2019-05-10 2023-01-20 平安科技(深圳)有限公司 Method and device for constructing bird sound recognition model, computer equipment and storage medium
CN110335613B (en) * 2019-05-28 2021-07-09 广东工业大学 Bird identification method adopting pickup for real-time detection
CN113707158A (en) * 2021-08-02 2021-11-26 南昌大学 Power grid harmful bird seed singing recognition method based on VGGish migration learning network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708860A (en) * 2012-06-27 2012-10-03 昆明信诺莱伯科技有限公司 Method for establishing judgment standard for identifying bird type based on sound signal
CN104658538A (en) * 2013-11-18 2015-05-27 中国计量学院 Mobile bird recognition method based on birdsong

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9058384B2 (en) * 2012-04-05 2015-06-16 Wisconsin Alumni Research Foundation System and method for identification of highly-variable vocalizations
US9177559B2 (en) * 2012-04-24 2015-11-03 Tom Stephenson Method and apparatus for analyzing animal vocalizations, extracting identification characteristics, and using databases of these characteristics for identifying the species of vocalizing animals
CN102930870B (en) * 2012-09-27 2014-04-09 福州大学 Bird voice recognition method using anti-noise power normalization cepstrum coefficients (APNCC)
US8670986B2 (en) * 2012-10-04 2014-03-11 Medical Privacy Solutions, Llc Method and apparatus for masking speech in a private environment
CN103117061B (en) * 2013-02-05 2016-01-20 广东欧珀移动通信有限公司 A kind of voice-based animals recognition method and device
CN103489446B (en) * 2013-10-10 2016-01-06 福州大学 Based on the twitter identification method that adaptive energy detects under complex environment
CN103474072B (en) * 2013-10-11 2016-06-01 福州大学 Utilize the quick anti-noise chirping of birds sound recognition methods of textural characteristics and random forest
CN103985385A (en) * 2014-05-30 2014-08-13 安庆师范学院 Method for identifying Batrachia individual information based on spectral features
CN104102923A (en) * 2014-07-16 2014-10-15 西安建筑科技大学 Nipponia nippon individual recognition method based on MFCC algorithm
CN104882144B (en) * 2015-05-06 2018-10-30 福州大学 Animal sounds recognition methods based on sonograph bicharacteristic
CN106504762B (en) * 2016-11-04 2023-04-14 中南民族大学 Bird community number estimation system and method

Also Published As

Publication number Publication date
CN107369451A (en) 2017-11-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant