CN109741759A - An acoustic automatic detection method for specific bird species
- Publication number: CN109741759A
- Application number: CN201811566250.6A
- Authority: CN (China)
- Prior art keywords: potential, signal, sound, segment, channel
- Prior art date
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Catching Or Destruction (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Description
Technical Field
The invention belongs to the field of ecological monitoring and acoustic signal recognition, and in particular relates to an automatic acoustic detection method for specific bird species.
Background Art
Birds are widely distributed vocal animals. Compared with other animal communities, their habits are easier for humans to observe, and they respond sensitively to small changes in the ecological environment. Most ecologists therefore regard birds as ideal species for monitoring ecological change.
In recent years, however, the continued expansion of urbanization and the ongoing damage that human activity causes to the ecological environment have led to a large-scale decline in both the number and the diversity of birds. This is a loss to global species diversity, and because birds are indicator species for woodland vegetation communities, their sharp decline also upsets the ecological balance. Bird protection is therefore a focus of attention in ecology, and the monitoring of rare birds is a top priority. Traditional point counts are costly and intrude on habitats. To assess bird activity more quickly and non-invasively, effective automatic bird detection is increasingly urgent.
Extracting features from field-recorded bird sound signals by acoustic signal analysis, based on the songs of bird species, is the basis for subsequent large-scale data analysis and for building bird sound recognition models. In real, complex environments, bird species classification methods based on feature parameters are highly sensitive to environmental noise and other interfering sources, so many researchers are actively exploring methods for recognizing bird calls in noise. Chinese patent CN102930870A discloses a bird sound recognition method using anti-noise power-normalized cepstral coefficients: it first denoises the sound power spectrum by multi-band spectral subtraction, then extracts anti-noise power-normalized cepstral coefficients from the denoised power spectrum, and finally uses a support vector machine (SVM) to recognize the sounds of 34 bird species under different environments and signal-to-noise ratios. Chinese patent CN103489446A discloses a bird song recognition method based on adaptive energy detection in complex environments: the sound signal first passes through energy detection, Mel-scale wavelet packet decomposition subband cepstral coefficient (WPSCC) anti-noise features are extracted from the frames selected by the detector, and an SVM classifies 15 kinds of bird song in noisy environments. Liu Zhao et al. describe a bird sound recognition algorithm for noisy environments based on random forests and large-scale acoustic feature extraction (Liu Zhao, Zhang Yuchen, Hu Hailong. Simulation of bird sound recognition in noisy environments with random forest and large-scale acoustic features [J]. System Simulation Technology, 2017(4): 359-362.). Zhang Saihua et al. first extract potential bird song segments with an acoustic event detection process based on a Gaussian mixture model (GMM), then extract Mel-subband parameterized features from each segment, and use an SVM to classify 11 kinds of bird song recorded in the field (Zhang Saihua, Zhao Zhao, Xu Zhiyong, Zhang Yi. Automatic bird song recognition based on Mel subband parameterized features [J]. Journal of Computer Applications, 2017, 37(4): 1111-1115.).
Most of the studies above either add only a single kind of noise in their experiments or do not suppress noise at all. The field acoustic environment, however, is very complex and contains many kinds of environmental interference, so single-channel audio enhancement as the only front-end processing cannot meet the requirements of acoustic monitoring in real, complex field conditions.
Summary of the Invention
The technical problem to be solved by the present invention is to provide an automatic acoustic detection method for specific bird species.
The technical solution that achieves the object of the present invention is an automatic acoustic detection method for specific bird species, comprising the following steps:
Step 1. Collect continuous field bird sound monitoring signals, segment them automatically, and then extract the potential song segments of the specific bird species;
Step 2. Apply adaptive signal denoising and enhancement to each potential song segment obtained in step 1;
Step 3. Extract feature parameters from each potential song segment enhanced in step 2 and build the feature set of potential song segments;
Step 4. Complete the acoustic detection of the specific bird species by combining the feature set obtained in step 3 with a recognition algorithm from machine learning.
Compared with the prior art, the present invention has the following notable advantages: 1) it collects continuous bird sound monitoring data with a multi-element stereo microphone array, so the data carry rich temporal and spatial information and bird species can be monitored over large spatial and temporal scales; 2) by detecting potential song events of the specific bird species with a Gaussian mixture model, followed by post-processing based on the energy, song duration and frequency distribution of the candidate sound events, it achieves robust detection and automatic segmentation; 3) its microphone-array-based adaptive denoising and enhancement raises the signal-to-noise ratio of sound events and thereby the accuracy of species identification; 4) the method is convenient and easy to implement.
The present invention is described in further detail below with reference to the accompanying drawings.
Brief Description of the Drawings
FIG. 1 is a flow chart of the automatic acoustic detection method for specific bird species according to the present invention.
FIG. 2 is a flow chart of the detection of potential song events of the specific bird species according to the present invention.
FIG. 3 is a schematic diagram of adaptive time-delay estimation for a P-element stereo microphone array.
FIG. 4 is a structural block diagram of the generalized sidelobe canceller for a P-element stereo microphone array.
FIG. 5 is a schematic diagram of adaptive time-delay estimation for a 4-element stereo microphone array.
FIG. 6 is a structural block diagram of the generalized sidelobe canceller for a 4-element stereo microphone array.
Detailed Description
With reference to FIG. 1, the automatic acoustic detection method for specific bird species according to the present invention comprises the following steps:
Step 1. Collect continuous field bird sound monitoring signals, segment them automatically, and then extract the potential song segments of the specific bird species.
Further, with reference to FIG. 2, step 1 is specifically:
Step 1-1. Collect multi-channel continuous field bird sound monitoring signals with a multi-element stereo microphone array and apply pre-emphasis to them, which compensates for the excessive attenuation of high-frequency components and at the same time suppresses low-frequency wind noise;
Step 1-2. Apply framing, windowing and the fast Fourier transform to the signals processed in step 1-1 to obtain the power spectrogram;
Step 1-3. Set the lower and upper frequency limits to $f_L$ and $f_H$, and determine the short-time logarithmic energy $le(l)$ of each frame:
$le(l) = \log_{10}(e(l))$
where
$e(l) = \sum_{i=N_L}^{N_H} |S(i,l)|^2$
Here $l$ is the frame index, $i$ is the frequency index, $S(i,l)$ is the short-time Fourier transform at the time-frequency point $(i,l)$, $N_L$ and $N_H$ are the frequency bin indices corresponding to $f_L$ and $f_H$, and $e(l)$ is the short-time energy of frame $l$;
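Steps 1-1 to 1-3 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the pre-emphasis coefficient `alpha`, the Hamming window, the frame length and the hop size are all assumed values that the source does not specify, and the function name `frame_log_energy` is hypothetical.

```python
import numpy as np

def frame_log_energy(x, fs, f_lo, f_hi, frame_len=512, hop=256, alpha=0.97):
    """Return (le, power): band-limited log energy per frame and the power spectrogram.

    alpha, frame_len, hop and the Hamming window are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    # Step 1-1: pre-emphasis y[n] = x[n] - alpha * x[n-1] boosts high frequencies
    y = np.append(x[0], x[1:] - alpha * x[:-1])
    # Step 1-2: framing, windowing and FFT
    n_frames = 1 + (len(y) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = y[idx] * np.hamming(frame_len)
    spec = np.fft.rfft(frames, axis=1)           # S(i, l), one row per frame l
    power = np.abs(spec) ** 2
    # Step 1-3: sum the power over the bins between f_lo and f_hi (N_L..N_H)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    e = power[:, band].sum(axis=1)
    return np.log10(e + 1e-12), power            # small floor avoids log10(0)

# Usage: half a second of silence followed by a 3 kHz tone
fs = 16000
t = np.arange(8000) / fs
x = np.concatenate([np.zeros(8000), np.sin(2 * np.pi * 3000 * t)])
le, power = frame_log_energy(x, fs, f_lo=2000, f_hi=5000)
```

Frames that contain the tone end up with a much larger `le` than the silent frames, which is the contrast the mixture model in the next step relies on.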
Step 1-4. Model the distribution of the frame log energies with a Gaussian mixture model containing two Gaussian components; the two components represent the probability density functions of the set of potential song event frames and of the set of environmental noise frames, respectively;
Step 1-5. For each frame, decide by comparing posterior probabilities whether it belongs to a potential song segment or to an environmental noise segment, which yields several potential song segments. Specifically:
For each frame, compare the posterior probability that it belongs to the set of potential song event frames with the posterior probability that it belongs to the set of environmental noise frames. If the former is greater, the frame is assigned to a potential song segment, and the other frames that are contiguous in time with it and satisfy the same condition are assigned to the same segment. This yields several potential song segments, which together form the set $D = \{AE_1, AE_2, \ldots, AE_K\}$, where $K$ is the number of potential song segments;
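The two-component mixture and the posterior comparison of steps 1-4 and 1-5 might look like the sketch below. It fits the mixture with a small hand-written EM loop so that the example needs only NumPy; the initialisation, the iteration count and the rule "the component with the higher mean is the song component" are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def fit_gmm2(v, n_iter=100):
    """EM for a 1-D two-component Gaussian mixture; returns means and responsibilities."""
    v = np.asarray(v, dtype=float)
    mu = np.array([v.min(), v.max()])            # assumed init: quietest and loudest frame
    var = np.full(2, v.var() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability of each component for every frame
        pdf = np.exp(-0.5 * (v[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = pi * pdf
        resp /= resp.sum(axis=1, keepdims=True) + 1e-300
        # M-step: update weights, means and variances
        nk = resp.sum(axis=0) + 1e-12
        pi = nk / len(v)
        mu = (resp * v[:, None]).sum(axis=0) / nk
        var = (resp * (v[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return mu, resp

def potential_segments(le):
    """Group contiguous frames whose song posterior exceeds the noise posterior."""
    mu, resp = fit_gmm2(le)
    song = int(np.argmax(mu))                    # assumption: louder component = song events
    is_song = resp[:, song] > resp[:, 1 - song]
    segments, start = [], None
    for l, flag in enumerate(is_song):
        if flag and start is None:
            start = l
        elif not flag and start is not None:
            segments.append((start, l - 1))
            start = None
    if start is not None:
        segments.append((start, len(is_song) - 1))
    return segments                              # list of (first_frame, last_frame)

# Usage: two loud bursts in a quiet background
base = np.array([0.0] * 10 + [5.0] * 5 + [0.0] * 10 + [5.0] * 3 + [0.0] * 2)
le_demo = base + 0.1 * np.sin(np.arange(base.size))
segs = potential_segments(le_demo)
```

Each returned pair marks one candidate segment $AE_k$ by its first and last frame index.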
Step 1-6. Compute the logarithmic energy $E_{AE_k}$ of each potential song segment $AE_k$ obtained in step 1-5, and take the largest logarithmic energy $ME$ among them:
$ME = \max_{1 \le k \le K} E_{AE_k}$
For the $k$-th potential song segment, if $ME - E_{AE_k} \ge q$, the segment is regarded as too weak to be of much value for ecological research and is removed, where $q$ is a threshold preset according to the actual situation and is expressed in dB;
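The weak-segment rule of step 1-6 reduces to a one-line comparison once the per-segment log energies are available. The sketch below assumes those energies have already been computed in dB by whatever formula is used for $E_{AE_k}$; the function name is hypothetical and the default threshold follows the 20 dB value the source gives for $q$.

```python
def drop_weak_segments(segment_energy_db, q_db=20.0):
    """Return the indices k of segments kept, i.e. those with ME - E_k < q."""
    me = max(segment_energy_db)                  # ME: strongest segment
    return [k for k, e in enumerate(segment_energy_db) if me - e < q_db]

# Usage: the segments at -35 dB and -30 dB are 25 dB and 20 dB below the maximum
kept = drop_weak_segments([-10.0, -35.0, -29.9, -30.0])
```

A segment exactly $q$ dB below the strongest one is removed, matching the "greater than or equal to" condition in the text.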
Step 1-7. Based on an existing song database for the specific bird species, obtain by statistical analysis the upper and lower thresholds of the song duration of its song segments, namely the longest song duration $t_H$ and the shortest song duration $t_L$, and convert them with the signal sampling rate $f_s$ into the maximum song length $n_H$ and the minimum song length $n_L$ in samples:
$n_H = f_s \times t_H$
$n_L = f_s \times t_L$
For each potential song segment retained in step 1-6, obtain its length $T$ as:
$T = \text{frame length} \times \text{number of frames in the potential song segment}$
Remove every potential song segment whose length $T$ is less than $n_L$ or greater than $n_H$;
Step 1-8. Based on an existing song database for the specific bird species, obtain by statistical analysis the frequency range of its song segments and, for the potential song segments retained in step 1-7, set the data outside this frequency range to zero.
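Steps 1-7 and 1-8 can be illustrated with two small helpers. The sketch assumes segments are given as `(first_frame, last_frame)` pairs and that the out-of-band zeroing is applied to a power spectrogram; the source does not say on which representation the zeroing is performed, so that choice, like the function names, is an assumption.

```python
import numpy as np

def filter_by_duration(segments, frame_len, fs, t_lo, t_hi):
    """Keep segments whose length T (in samples) lies within [n_L, n_H]."""
    n_lo, n_hi = fs * t_lo, fs * t_hi            # n_L = fs * t_L, n_H = fs * t_H
    kept = []
    for first, last in segments:
        T = frame_len * (last - first + 1)       # T = frame length x number of frames
        if n_lo <= T <= n_hi:
            kept.append((first, last))
    return kept

def zero_out_of_band(power, fs, frame_len, f_lo, f_hi):
    """Zero every spectrogram bin outside the species' frequency range."""
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    out = np.array(power, dtype=float)           # copy so the input stays untouched
    out[:, (freqs < f_lo) | (freqs > f_hi)] = 0.0
    return out

# Usage with illustrative thresholds (0.1 s to 5 s, 2 kHz to 5 kHz)
kept = filter_by_duration([(0, 1), (5, 40), (50, 500)],
                          frame_len=512, fs=16000, t_lo=0.1, t_hi=5.0)
masked = zero_out_of_band(np.ones((3, 257)), fs=16000, frame_len=512,
                          f_lo=2000, f_hi=5000)
```

Note that $T$ as defined in the text multiplies the frame length by the frame count, so with overlapping frames it overestimates the true duration; the sketch follows the text as written.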
Further, $q$ in step 1-6 is set to 20 dB.
Step 2. Apply adaptive signal denoising and enhancement to each potential song segment obtained in step 1.
Further, the adaptive signal denoising and enhancement that step 2 applies to each potential song segment obtained in step 1 is specifically:
Assume that the multi-element stereo microphone array is a P-element stereo microphone array, and number its channels 1, 2, 3, ..., P in a fixed order;
Step 2-1. With reference to FIG. 3, estimate the sound source direction for each potential song segment by adaptive filtering, specifically:
Step 2-1-1. For the P-channel data of one potential song segment, let the P channel signals be $m_1(n), m_2(n), m_3(n), \ldots, m_P(n)$, $n = 1, 2, 3, \ldots, L_m$, where $L_m$ is the signal length. Taking the channel 1 signal as the reference signal, construct the snapshots $x_k$ of the channel 2 signal:
$x_k = [m_2(k), m_2(k+1), \ldots, m_2(k+L-1)]^T$;
where the subscript $k = 1, 2, \ldots, L_m - L + 1$ denotes the $k$-th snapshot, $L$ is the filter length, and the superscript $T$ denotes transposition;
Step 2-1-2. Compute the autocorrelation matrix $R_{xx}$ of the snapshots, where $K = L_m - L + 1$ is the number of snapshots;
Step 2-1-3. Compute the cross-correlation matrix $r_{xd}$, where $D$ is the filter center point;
Step 2-1-4. Compute the weight vector $w_1$:
$w_1 = R_{xx}^{-1} r_{xd}$;
Step 2-1-5. Detect the peak of the weight vector $w_1$ obtained in step 2-1-4 and let $z$ be the abscissa of the peak; the time delay in samples between the channel 1 signal and the channel 2 signal is then $d_1 = z - D$;
Step 2-1-6. Repeat steps 2-1-1 to 2-1-5 to obtain the time delay in samples $d_c$ between the channel 1 signal and the signal of channel $c+1$, $c = 1, 2, \ldots, P-1$;
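The delay estimation of step 2-1 can be sketched as a Wiener-filter fit followed by a peak search. Several details are assumptions, because the corresponding formulas are not reproduced in this text: the sample-average form of $R_{xx}$ and $r_{xd}$, the definition $D = (L-1)/2$ for the filter center, the small diagonal loading, and the sign convention (a positive result means `sig` lags `ref`). The function name is hypothetical.

```python
import numpy as np

def estimate_delay(ref, sig, L=65):
    """Delay of `sig` relative to `ref` in samples, from the peak of a Wiener filter."""
    ref = np.asarray(ref, dtype=float)
    sig = np.asarray(sig, dtype=float)
    D = (L - 1) // 2                             # assumed filter center point
    K = len(sig) - L + 1                         # number of snapshots
    # Row k holds the snapshot x_k = [sig(k), sig(k+1), ..., sig(k+L-1)]
    X = np.lib.stride_tricks.sliding_window_view(sig, L)
    d = ref[D:D + K]                             # reference sample paired with each snapshot
    Rxx = X.T @ X / K                            # sample autocorrelation matrix
    rxd = X.T @ d / K                            # sample cross-correlation vector
    # w = Rxx^{-1} rxd; a small diagonal loading keeps the solve well conditioned
    w = np.linalg.solve(Rxx + 1e-6 * np.eye(L), rxd)
    z = int(np.argmax(np.abs(w)))                # peak position
    return z - D

# Usage: white noise and copies shifted by +5 and -3 samples
rng = np.random.default_rng(0)
ref = rng.standard_normal(2000)
lag_pos = estimate_delay(ref, np.roll(ref, 5), L=33)
lag_neg = estimate_delay(ref, np.roll(ref, -3), L=33)
```

The filter length bounds the largest delay that can be detected: only lags between $-D$ and $L-1-D$ fall inside the weight vector.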
Step 2-2. With reference to FIG. 4, adaptively enhance the potential song segment with a generalized sidelobe canceller, specifically:
Step 2-2-1. Compute the main channel signal $d(k)$:
$d(k) = w_c^T m(k)$;
where $w_c = [w_{c1}, w_{c2}, \ldots, w_{cP}]^T$ is the static weight vector, $w_{c1}, w_{c2}, \ldots, w_{cP}$ are the weights of the individual channels with $w_{c1} + w_{c2} + \ldots + w_{cP} = 1$, and $m(k) = [m_1(k), m_2(k-d_1), \ldots, m_P(k-d_{P-1})]^T$, $k = 1, 2, \ldots, L_m$;
Step 2-2-2. Compute the auxiliary channel signal $e(k)$:
$e(k) = W_S^T m(k)$
where $W_S$ is the blocking matrix of dimension $P \times (P-1)$;
Step 2-2-3. Compute the enhanced, clean potential song segment signal $y(k)$:
$y(k) = d(k) - v^T(k) e(k)$;
where $v(k)$ is the dynamic weight vector of the adaptive interference canceller.
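A compact sketch of the generalized sidelobe canceller in step 2-2 is given below. The text fixes the structure (static weights summing to one, a $P \times (P-1)$ blocking matrix, an adaptive canceller) but not the details, so the uniform static weights, the adjacent-difference blocking matrix, the NLMS update for $v(k)$, its step size `mu` and the circular shift used for alignment are all assumptions. Delays are taken in the text's convention, where the aligned channel is $m_c(k - d)$.

```python
import numpy as np

def gsc_enhance(channels, delays, mu=0.1, eps=1e-8):
    """Generalized sidelobe canceller; channels has shape (P, Lm), delays[0] is 0."""
    m = np.asarray(channels, dtype=float)
    P, Lm = m.shape
    # Time alignment: aligned[c, k] = m_c(k - d_c) (circular shift for brevity)
    aligned = np.stack([np.roll(m[c], delays[c]) for c in range(P)])
    w_c = np.full(P, 1.0 / P)                    # assumed uniform static weights, sum = 1
    # Assumed blocking matrix: adjacent-channel differences, every column sums to zero
    W_s = np.zeros((P, P - 1))
    for j in range(P - 1):
        W_s[j, j], W_s[j + 1, j] = 1.0, -1.0
    d = w_c @ aligned                            # main channel d(k)
    e = W_s.T @ aligned                          # auxiliary channels e(k), shape (P-1, Lm)
    v = np.zeros(P - 1)                          # dynamic weight vector v(k)
    y = np.empty(Lm)
    for k in range(Lm):
        y[k] = d[k] - v @ e[:, k]                # y(k) = d(k) - v^T(k) e(k)
        # NLMS update (an assumed adaptation rule): reduce the power of y
        v += mu * y[k] * e[:, k] / (e[:, k] @ e[:, k] + eps)
    return y

# Usage: three noise-free channels carrying the same tone with different delays
s = np.sin(2 * np.pi * 0.05 * np.arange(200))
delays = [0, -3, 2]
channels = np.stack([np.roll(s, -dly) for dly in delays])
y = gsc_enhance(channels, delays)
```

With perfectly aligned, noise-free channels the auxiliary signals are zero and the target passes through unchanged, which is the distortionless behaviour the blocking matrix is meant to guarantee. The sign convention of `delays` has to match whatever delay estimator produced them.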
Further, for a four-element array, the adaptive signal denoising and enhancement that step 2 applies to each potential song segment obtained in step 1 is specifically:
Assume that the multi-element stereo microphone array is a stereo microphone array with P = 4 elements, and number its channels 1, 2, 3, 4 in a fixed order;
Step 2-1. With reference to FIG. 5, estimate the sound source direction for each potential song segment by adaptive filtering, specifically:
Step 2-1-1. For the 4-channel data of one potential song segment, let the 4 channel signals be $m_1(n), m_2(n), m_3(n), m_4(n)$, $n = 1, 2, 3, \ldots, L_m$, where $L_m$ is the signal length. Taking the channel 4 signal as the reference signal, construct the snapshots $x_k$ of the channel 1 signal:
$x_k = [m_1(k), m_1(k+1), \ldots, m_1(k+L-1)]^T$;
where the subscript $k = 1, 2, \ldots, L_m - L + 1$ denotes the $k$-th snapshot, $L$ is the filter length, and the superscript $T$ denotes transposition;
Step 2-1-2. Compute the autocorrelation matrix $R_{xx}$ of the snapshots, where $K = L_m - L + 1$ is the number of snapshots;
Step 2-1-3. Compute the cross-correlation matrix $r_{xd}$, where $D$ is the filter center point;
Step 2-1-4. Compute the weight vector $w_1$:
$w_1 = R_{xx}^{-1} r_{xd}$;
Step 2-1-5. Detect the peak of the weight vector $w_1$ obtained in step 2-1-4 and let $z$ be the abscissa of the peak; the time delay in samples between the channel 4 signal and the channel 1 signal is then $d_1 = z - D$;
Step 2-1-6. Repeat steps 2-1-1 to 2-1-5 to obtain the time delays in samples between the channel 4 signal and the channel 2 and channel 3 signals, denoted $d_2$ and $d_3$ respectively;
Step 2-2. With reference to FIG. 6, adaptively enhance the potential song segment with a generalized sidelobe canceller, specifically:
Step 2-2-1. Compute the main channel signal $d(k)$:
$d(k) = w_c^T m(k)$;
where $w_c = [w_{c4}, w_{c1}, w_{c2}, w_{c3}]^T$ is the static weight vector, $w_{c1}, \ldots, w_{c4}$ are the weights of the individual channels with $w_{c1} + w_{c2} + w_{c3} + w_{c4} = 1$, and $m(k) = [m_4(k), m_1(k-d_1), m_2(k-d_2), m_3(k-d_3)]^T$, $k = 1, 2, \ldots, L_m$;
Step 2-2-2. Compute the auxiliary channel signal $e(k)$:
$e(k) = W_S^T m(k)$
where $W_S$ is the blocking matrix of dimension $4 \times 3$;
Step 2-2-3. Compute the enhanced, clean potential song segment signal $y(k)$:
$y(k) = d(k) - v^T(k) e(k)$;
where $v(k)$ is the weight vector of the adaptive interference canceller.
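For the four-element case, a concrete choice of static weights and blocking matrix may make the structure easier to picture. The source does not list their entries, so the values below are only one common assumption: uniform weights and a blocking matrix built from adjacent-channel differences, written in the channel order $[m_4, m_1, m_2, m_3]$ used for $m(k)$.

```latex
w_c = \tfrac{1}{4}\,[1,\ 1,\ 1,\ 1]^T, \qquad
W_S = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix}, \qquad
W_S^T \mathbf{1} = \mathbf{0}
```

Because every column of this $W_S$ sums to zero, a song component that is identical in all time-aligned channels cancels in $W_S^T m(k)$, so the auxiliary channels carry only noise and interference for the adaptive canceller to subtract from $d(k)$.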
Step 3. Extract feature parameters from each potential song segment enhanced in step 2 and build the feature set of potential song segments.
Further, step 3 is specifically:
Step 3-1. Compute the power spectrogram of each enhanced potential song segment as in step 1-2;
Step 3-2. Place a bank of Mel band-pass filters within the frequency range of the song segments of the specific bird species, pass the power spectrum of each potential song segment through the filter bank, and obtain the output of every filter;
Step 3-3. Take the logarithm of the filter outputs and apply the discrete cosine transform to obtain the Mel cepstral coefficient feature parameters;
Step 3-4. Combine the Mel cepstral coefficient feature parameters of all potential song segments to build the feature set of potential song segments.
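Steps 3-2 and 3-3 correspond to a standard Mel cepstral coefficient computation restricted to the species' frequency band, which can be sketched as follows. The triangular filter shape, the Mel formula $2595 \log_{10}(1 + f/700)$, the number of filters, the number of coefficients and the unnormalised DCT-II are common conventions assumed here; the source does not specify any of them.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs, f_lo, f_hi):
    """Triangular filters spaced evenly on the Mel scale between f_lo and f_hi."""
    edges = mel_to_hz(np.linspace(hz_to_mel(f_lo), hz_to_mel(f_hi), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    fb = np.zeros((n_filters, freqs.size))
    for j in range(n_filters):
        lo, ctr, hi = edges[j:j + 3]
        rising = (freqs - lo) / (ctr - lo)
        falling = (hi - freqs) / (hi - ctr)
        fb[j] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mfcc(power, fb, n_coeffs=13):
    """power: (n_frames, n_bins) power spectrogram; returns (n_frames, n_coeffs)."""
    log_out = np.log(power @ fb.T + 1e-12)       # log of each filter output
    n_filters = fb.shape[0]
    # Unnormalised DCT-II basis applied across the filter outputs
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs),
                                    np.arange(n_filters) + 0.5) / n_filters)
    return log_out @ basis.T

# Usage: 20 filters between 2 kHz and 6 kHz for 512-point frames at 16 kHz
fb = mel_filterbank(20, 512, 16000, 2000, 6000)
feats = mfcc(np.ones((3, 257)), fb)
```

Stacking the per-frame (or per-segment averaged) coefficient vectors of all potential song segments then gives the feature set of step 3-4; how frames are pooled within a segment is not stated in the text.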
Step 4. Complete the acoustic detection of the specific bird species by combining the feature set of potential song segments obtained in step 3 with a recognition algorithm from machine learning.
Further, the recognition algorithm in step 4 is a support vector machine.
Further, step 4 is specifically:
The Mel cepstral coefficient features in an existing song feature database for the specific bird species serve as training samples, and the feature set of potential song segments serves as the input samples of the support vector machine; the decision of the support vector machine then automatically detects the specific bird species.
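The train-then-decide flow of step 4 is sketched below with a deliberately small linear support vector machine trained by batch subgradient descent on the hinge loss. This is a stand-in chosen so the example needs only NumPy; the source does not state the kernel or solver, and in practice a library implementation such as scikit-learn's `SVC` would normally be used. The two-dimensional toy vectors stand in for Mel cepstral feature vectors, and all names and hyperparameters are illustrative assumptions.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Linear SVM via subgradient descent on the L2-regularised hinge loss; y in {-1, +1}."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, dim = X.shape
    w, b = np.zeros(dim), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1.0             # samples inside or beyond the margin
        grad_w = lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / n
        grad_b = -y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def detect(features, w, b):
    """Return +1 for 'target species' and -1 for 'other' for each feature vector."""
    return np.where(np.asarray(features, dtype=float) @ w + b >= 0.0, 1, -1)

# Training samples: database features of the target species (+1) and of other sounds (-1)
X_train = [[2, 2], [3, 2], [2, 3], [-2, -2], [-3, -2], [-2, -3]]
y_train = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X_train, y_train)

# Input samples: features of the potential song segments to be classified
labels = detect([[2.5, 2.5], [-2.5, -2.5]], w, b)
```

Segments labelled +1 are the detections of the specific bird species; everything else is discarded as another sound source.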
In summary, the automatic acoustic detection method for specific bird species according to the present invention detects potential song events of the specific bird species with a GMM and applies post-processing based on the energy, duration and frequency distribution of the candidate sound events, which achieves robust detection and automatic segmentation. In addition, its microphone-array-based adaptive denoising and enhancement markedly improves the signal-to-noise ratio of sound events and thereby the accuracy of species identification, enabling acoustic detection of specific bird species in natural field environments. This is of great significance for the ecological protection of rare wild bird species and for related ecological research.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811566250.6A CN109741759B (en) | 2018-12-21 | 2018-12-21 | Acoustic automatic detection method for specific bird species |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109741759A (en) | 2019-05-10 |
CN109741759B (en) | 2020-07-31 |
Family ID: 66360805
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811566250.6A Active CN109741759B (en) | 2018-12-21 | 2018-12-21 | Acoustic automatic detection method for specific bird species |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109741759B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101447190A (en) * | 2008-06-25 | 2009-06-03 | 北京大学深圳研究生院 | Voice enhancement method employing combination of nesting-subarray-based post filtering and spectrum-subtraction |
CN102708860A (en) * | 2012-06-27 | 2012-10-03 | 昆明信诺莱伯科技有限公司 | Method for establishing judgment standard for identifying bird type based on sound signal |
CN102800322A (en) * | 2011-05-27 | 2012-11-28 | 中国科学院声学研究所 | Method for estimating noise power spectrum and voice activity |
US20140226838A1 (en) * | 2013-02-13 | 2014-08-14 | Analog Devices, Inc. | Signal source separation |
CN106504762A (en) * | 2016-11-04 | 2017-03-15 | 中南民族大学 | Bird community number estimation system and method |
CN107393542A (en) * | 2017-06-28 | 2017-11-24 | 北京林业大学 | A kind of birds species identification method based on binary channels neutral net |
CN107393549A (en) * | 2017-07-21 | 2017-11-24 | 北京华捷艾米科技有限公司 | Delay time estimation method and device |
CN107644650A (en) * | 2017-09-29 | 2018-01-30 | 山东大学 | A kind of improvement sound localization method based on progressive serial orthogonalization blind source separation algorithm and its realize system |
CN108694953A (en) * | 2017-04-07 | 2018-10-23 | 南京理工大学 | A kind of chirping of birds automatic identifying method based on Mel sub-band parameter features |
-
2018
- 2018-12-21 CN CN201811566250.6A patent/CN109741759B/en active Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101447190A (en) * | 2008-06-25 | 2009-06-03 | 北京大学深圳研究生院 | Voice enhancement method employing combination of nesting-subarray-based post filtering and spectrum-subtraction |
CN102800322A (en) * | 2011-05-27 | 2012-11-28 | 中国科学院声学研究所 | Method for estimating noise power spectrum and voice activity |
CN102708860A (en) * | 2012-06-27 | 2012-10-03 | 昆明信诺莱伯科技有限公司 | Method for establishing judgment standard for identifying bird type based on sound signal |
US20140226838A1 (en) * | 2013-02-13 | 2014-08-14 | Analog Devices, Inc. | Signal source separation |
CN106504762A (en) * | 2016-11-04 | 2017-03-15 | 中南民族大学 | Bird community number estimation system and method |
CN108694953A (en) * | 2017-04-07 | 2018-10-23 | 南京理工大学 | An automatic birdsong identification method based on Mel sub-band parameter features |
CN107393542A (en) * | 2017-06-28 | 2017-11-24 | 北京林业大学 | A bird species identification method based on a dual-channel neural network |
CN107393549A (en) * | 2017-07-21 | 2017-11-24 | 北京华捷艾米科技有限公司 | Delay time estimation method and device |
CN107644650A (en) * | 2017-09-29 | 2018-01-30 | 山东大学 | An improved sound localization method based on a progressive serial orthogonalization blind source separation algorithm, and a system implementing it |
Non-Patent Citations (2)
Title |
---|
ZHAO ZHAO et al.: "Automated bird acoustic event detection and robust species classification", 《ECOLOGICAL INFORMATICS》 * |
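WU YE (吴烨): "Research on edge virtual bridge technology for bird-sound sensor network data centers" (面向鸟声传感网数据中心的边缘虚拟网桥技术研究), 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) * |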
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110335613A (en) * | 2019-05-28 | 2019-10-15 | 广东工业大学 | A Bird Recognition Method Using Pickup Real-time Detection |
CN110335613B (en) * | 2019-05-28 | 2021-07-09 | 广东工业大学 | A method for bird identification using real-time detection of pickups |
CN111276151A (en) * | 2020-01-20 | 2020-06-12 | 北京正和恒基滨水生态环境治理股份有限公司 | Bird sound identification system and identification method |
CN111276151B (en) * | 2020-01-20 | 2023-04-07 | 北京正和恒基滨水生态环境治理股份有限公司 | Bird sound identification system and identification method |
CN113314127A (en) * | 2021-04-23 | 2021-08-27 | 广州大学 | Space orientation-based bird song recognition method, system, computer device and medium |
CN113314127B (en) * | 2021-04-23 | 2023-10-10 | 广州大学 | Bird song recognition method, system, computer equipment and media based on spatial orientation |
CN115116461A (en) * | 2022-07-22 | 2022-09-27 | 南京理工大学 | Double-channel speech enhancement-based birdsong species identification method |
CN117724042A (en) * | 2024-02-18 | 2024-03-19 | 百鸟数据科技(北京)有限责任公司 | Method and system for positioning bird song sound source based on acoustic bispectrum |
CN117724042B (en) * | 2024-02-18 | 2024-04-19 | 百鸟数据科技(北京)有限责任公司 | Method and system for positioning bird song sound source based on acoustic bispectrum |
CN118173102A (en) * | 2024-05-15 | 2024-06-11 | 百鸟数据科技(北京)有限责任公司 | Bird voiceprint recognition method in complex scene |
CN118173102B (en) * | 2024-05-15 | 2024-07-16 | 百鸟数据科技(北京)有限责任公司 | Bird voiceprint recognition method in complex scene |
Also Published As
Publication number | Publication date |
---|---|
CN109741759B (en) | 2020-07-31 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109741759B (en) | Acoustic automatic detection method for specific bird species | |
CN106653032B (en) | Animal sound detection method based on multi-band energy distribution in low signal-to-noise ratio environment | |
CN105513605B (en) | Speech enhancement system and speech enhancement method of mobile phone microphone | |
Potamitis et al. | On automatic bioacoustic detection of pests: the cases of Rhynchophorus ferrugineus and Sitophilus oryzae | |
CN109767785A (en) | Environmental noise recognition and classification method based on convolutional neural network | |
CN108694953A (en) | A kind of chirping of birds automatic identifying method based on Mel sub-band parameter features | |
CN105469785A (en) | Voice activity detection method in communication-terminal double-microphone denoising system and apparatus thereof | |
CN111292762A (en) | Single-channel voice separation method based on deep learning | |
CN109884591B (en) | Microphone array-based multi-rotor unmanned aerial vehicle acoustic signal enhancement method | |
CN105261359A (en) | Noise elimination system and method of mobile phone microphones | |
CN107731235B (en) | Method and device for extracting and classifying sound pulse characteristics of sperm whales and long fin pilot whales | |
CN111261189A (en) | Vehicle sound signal feature extraction method | |
Himawan et al. | Deep Learning Techniques for Koala Activity Detection. | |
CN110211596B (en) | Method for detecting Whistle signal of cetacea animal based on Mel subband spectral entropy | |
CN111414832A (en) | Real-time online recognition and classification system based on whale dolphin low-frequency underwater acoustic signals | |
CN112735468A (en) | MFCC-based automobile seat motor abnormal noise detection method | |
Lostanlen et al. | Long-distance detection of bioacoustic events with per-channel energy normalization | |
CN107886050A (en) | Utilize time-frequency characteristics and the Underwater targets recognition of random forest | |
Berger et al. | Bird audio detection - DCASE 2018 | |
Castro et al. | Automatic manatee count using passive acoustics | |
Patti et al. | Methods for classification of nocturnal migratory bird vocalizations using Pseudo Wigner-Ville Transform | |
CN100525727C (en) | Method for measuring and identifying bat kind using with bat echo location acoustic | |
Zhang et al. | Automatic bioacoustics noise reduction method based on a deep feature loss network | |
May et al. | Generalization of supervised learning for binary mask estimation | |
Zhang et al. | Environmental sound recognition using double-level energy detection | |
Legal Events
Code | Title | Date | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||