CN106228979B - Method for extracting and identifying abnormal sound features in public places - Google Patents
- Publication number
- CN106228979B CN201610674982.1A
- Authority
- CN
- China
- Prior art keywords
- abnormal sound
- esmd
- abnormal
- signal
- decomposition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Abstract
The invention relates to a method for extracting and identifying abnormal sound features in public places. It improves extreme-point symmetric mode decomposition (ESMD); the improved method is called D-ESMD for short. Its characteristics are as follows. A random T-distribution sequence is added to the abnormal sound of the public place, reducing the influence of public-place background noise on the extraction of abnormal sound features. To address the poor decomposition effect of the original ESMD on abnormal sounds, a symmetric midpoint interpolation method is proposed to replace the extreme-value midpoint odd-even interpolation method, improving the decomposition efficiency and the recognition rate for abnormal sounds. To address the shortcomings of the original ESMD in selecting effective decomposition modes, complexity detection based on the permutation entropy algorithm is performed on the modes obtained by ESMD decomposition, and the effective modal components of the abnormal sound are obtained adaptively. The method can fully describe the features of the abnormal sound, obtain a better classification and recognition result, extract the abnormal sound features more accurately, and is more robust to environmental background noise.
Description
Technical Field
The invention belongs to the technical field of audio signal feature extraction and pattern recognition, and particularly relates to a method for extracting and recognizing abnormal sound features in public places.
Background
Public places such as squares, bus stations and subways have heavy pedestrian traffic and cover wide areas, and their security has always been a major concern of governments and the public in all countries. At present, monitoring technology based mainly on video surveillance plays an active role in public-place security, but video surveillance suffers from blind spots, blurred images in rainy weather and similar problems. Abnormal events are often accompanied by abnormal sounds such as screams, gunshots, breaking glass and explosions, so combining audio monitoring with video monitoring has become a development direction in public-place security monitoring. Existing audio monitoring systems only perform simple sound collection and transmission and lack effective identification of abnormal sounds, because no breakthrough has been made in the core theory and technology of audio monitoring. The recognition of abnormal sounds in public places is a core technology of an audio monitoring system, so research on this technology has important social significance and research value.
At present, extracting abnormal sound features of public places with the Extreme-point Symmetric Mode Decomposition (ESMD) method has the following problems. ① The abnormal sound of a public place consists of an abnormal sound signal and a background noise signal, and the background noise can mask local features of the abnormal sound. When ESMD is used to decompose the abnormal sound of a public place, the resulting modal components necessarily contain background noise components, so the abnormal sound features are biased. ② When decomposing a signal, ESMD constructs 1, 2, 3 or more interpolation curves through the extreme-value midpoints to improve the decomposition effect, giving the ESMD-I, ESMD-II and ESMD-III methods. The interpolation method strongly affects modal decomposition; a comparison of the three methods shows that, as the number of interpolation lines increases, the number of modes decreases, the degree of symmetry decreases, the amplitude variation increases and the decomposition efficiency improves. ③ ESMD uses the number of remaining extreme points as the decomposition termination condition, so when abnormal sounds of public places containing background noise are decomposed, modes are retained without judging whether they are effective abnormal sound components. ④ ESMD lets the number of screening iterations vary within [K_min, K_max], decomposes the abnormal sound signal repeatedly with different numbers of screening iterations, and finally computes the optimal number of screening iterations by the least-squares principle, so decomposing abnormal sound signals with ESMD is time-consuming.
In summary, the ESMD decomposition technique has room for improvement.
Disclosure of Invention
To address the defects of the prior art, the invention provides a method for extracting and identifying abnormal sound features in public places based on an improved ESMD (D-ESMD) decomposition technique. By adding noise to the input signal of ESMD and improving the internal interpolation method, the termination condition for mode decomposition and the number of screening iterations for modal components, the method obtains the features of public-place abnormal sounds at different scales.
A method for extracting and identifying abnormal sound features in public places comprises the following specific steps:
Step 1: Input the abnormal sound to be identified in a public place and preprocess it.
Step 2: Decompose the abnormal sound signal to be identified with the improved extreme-point symmetric mode decomposition (D-ESMD) method to obtain modal components of each order, where each modal component contains the features of the abnormal sound signal in a different frequency band.
Step 3: Calculate the energy ratio of each order of modal component obtained in step 2 relative to the original abnormal sound signal, combine the energy ratios into a vector and normalize it to serve as the feature vector of the abnormal sound signal to be identified.
Step 4: Judge whether the feature vector is valid; if not, return to step 3; if yes, go to step 5.
Step 5: Identify the abnormal sound to be identified in the public place as follows: first, randomly select a certain number of training samples of each class from an established abnormal sound library, obtain their feature vectors through step 2 and step 3, and establish an SVM classification model; then classify the feature vector of the abnormal sound to be identified with the established SVM classification model to obtain the classification and recognition result.
The D-ESMD decomposition method builds on the extreme-point symmetric mode decomposition (ESMD) method: a random T-distribution noise sequence is added to the abnormal sound to be identified in the public place; a symmetric midpoint interpolation method replaces the extreme-value midpoint odd-even interpolation method of ESMD; permutation entropy values are calculated for the decomposed modal components; and the number of screening iterations for modal components is improved. Complexity detection of each mode is thereby completed, and the effective modal components of the abnormal sound are obtained adaptively.
The abnormal sound library comprises explosion sound, scream sound, gunshot sound and glass breaking sound.
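As an illustration of step 3, the following minimal Python sketch computes an energy-ratio feature vector from already-decomposed modal components. The function names are hypothetical, and the normalization (scaling the ratios so that they sum to 1) is an assumption, since the text only says "normalization processing".

```python
from typing import List, Sequence


def energy(signal: Sequence[float]) -> float:
    """Energy of a discrete signal: the sum of its squared samples."""
    return sum(s * s for s in signal)


def energy_ratio_features(modes: Sequence[Sequence[float]],
                          original: Sequence[float]) -> List[float]:
    """Energy ratio of each mode relative to the original signal, as a vector."""
    total = energy(original)
    if total == 0.0:
        raise ValueError("original signal has zero energy")
    ratios = [energy(m) / total for m in modes]
    norm = sum(ratios)
    # Assumption: normalize so the entries sum to 1; the source does not
    # specify which normalization is used.
    return [r / norm for r in ratios] if norm > 0.0 else ratios
```

The resulting vector has one entry per modal order and could be passed to any classifier that accepts fixed-length feature vectors, provided every sample yields the same number of modes.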
Specifically, the D-ESMD decomposition method comprises the following specific processes:
Step 2.1: determine the number N of times T-distributed random noise is added;
Step 2.2: let the abnormal sound signal to be identified be X; add a random T-distribution sequence to it to obtain the noise-added abnormal sound signal $X_i$;
Step 2.3: connect the adjacent extreme points of the noise-added abnormal sound signal $X_i$, denote the midpoints of the line segments as $F_i$, and supplement the left and right boundary points $F_0$ and $F_n$; construct an interpolation curve $L^*$ through the n+1 extreme-value midpoints, using the symmetric midpoint interpolation method in place of the ESMD extreme-value midpoint odd-even interpolation method;
Step 2.4: take $X_i - L^*$ as input and repeat step 2.3 until the number of screening iterations reaches its maximum, obtaining the first-order modal component $M_1^i$; calculate the permutation entropy of the modal component; if the permutation entropy is larger than a predetermined threshold θ, the component is regarded as an abnormal sound modal component, otherwise it is regarded as a noise component;
Step 2.5: if the modal component $M_1^i$ is an abnormal sound modal component, take $X_i - M_1^i$ as the input signal and repeat steps 2.3 to 2.4 until the modal component $M_n^i$ obtained by decomposition is a noise component;
Step 2.6: if i<N, let i = i+1 and repeat steps 2.2 to 2.5, adding a different T-distribution noise signal each time, until N decompositions have been performed; take the overall average of all modal components $M_k^i$ as the final modal component $M_k$ of the signal to be decomposed:
$$M_k = \frac{1}{N}\sum_{i=1}^{N} M_k^i$$
In the above formula, k is the order of the modal component and N is the number of times noise is added.
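The outer loop of steps 2.1, 2.2 and 2.6 (add noise, decompose, average) can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: `decompose` is a hypothetical callable standing in for steps 2.3 to 2.5, and the degrees of freedom `dof` and noise amplitude `scale` are illustrative parameters that the text does not specify. Student t samples are drawn as a standard normal divided by the square root of a scaled chi-square variate.

```python
import math
import random
from typing import Callable, List

Signal = List[float]


def student_t_noise(length: int, dof: float, scale: float,
                    rng: random.Random) -> Signal:
    """Draw `length` scaled samples from a Student t distribution."""
    samples = []
    for _ in range(length):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square with `dof` d.o.f.
        samples.append(scale * z / math.sqrt(chi2 / dof))
    return samples


def ensemble_modes(x: Signal,
                   decompose: Callable[[Signal], List[Signal]],
                   n_runs: int,
                   dof: float = 5.0,
                   scale: float = 0.1,
                   seed: int = 0) -> List[Signal]:
    """Average the k-th mode over n_runs noise-added decompositions."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        noise = student_t_noise(len(x), dof, scale, rng)  # fresh noise each run
        noisy = [a + b for a, b in zip(x, noise)]
        runs.append(decompose(noisy))
    # Assumption: runs may yield different mode counts, so only the orders
    # present in every run are averaged.
    n_modes = min(len(r) for r in runs)
    return [[sum(r[k][t] for r in runs) / n_runs for t in range(len(x))]
            for k in range(n_modes)]
```

How to handle runs that produce different numbers of modes is a design choice the text leaves open; truncating to the common orders is only one option.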
Specifically, the symmetric midpoint interpolation method comprises the following steps:
Step 3.1: let the input signal be y; find all maximum points $y_{max}$ and minimum points $y_{min}$ of y;
Step 3.2: connect all adjacent extreme points and find the extreme-value midpoints $y_{mean}$:
$y_{mean} = (y_{max} + y_{min})/2$
Step 3.3: find the symmetric midpoints $y_m$ of adjacent extreme-value midpoints, and interpolate $y_m$ by cubic spline interpolation to obtain the final interpolation curve.
Specifically, the maximum number of screening iterations in step 2.4 is preferably 12.
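The point construction in steps 3.1 to 3.3 can be sketched in Python as below. The sketch assumes that the "symmetric midpoint" of adjacent extreme-value midpoints means the midpoint between two consecutive extreme-value midpoints; the text does not define it further, so this reading is an assumption. Boundary handling and the cubic spline itself are omitted; a spline through the returned points (for example with SciPy's `CubicSpline`) would complete step 3.3.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (sample index, value)


def local_extrema(y: Sequence[float]) -> List[Point]:
    """Interior local maxima and minima of y, in time order (step 3.1)."""
    points = []
    for i in range(1, len(y) - 1):
        is_max = y[i - 1] < y[i] > y[i + 1]
        is_min = y[i - 1] > y[i] < y[i + 1]
        if is_max or is_min:
            points.append((float(i), float(y[i])))
    return points


def midpoints(points: Sequence[Point]) -> List[Point]:
    """Midpoint of every segment joining two consecutive points."""
    return [((t0 + t1) / 2.0, (v0 + v1) / 2.0)
            for (t0, v0), (t1, v1) in zip(points, points[1:])]


def symmetric_midpoints(y: Sequence[float]) -> List[Point]:
    """Midpoints of adjacent extreme-value midpoints (assumed reading of step 3.3)."""
    y_mean = midpoints(local_extrema(y))  # step 3.2: extreme-value midpoints
    return midpoints(y_mean)              # step 3.3: points to be spline-interpolated
```

For the zigzag `[0, 2, 0, -2, 0, 2, 0]` the extrema lie at indices 1, 3 and 5, the extreme-value midpoints at (2, 0) and (4, 0), and the single symmetric midpoint at (3, 0).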
Specifically, the specific calculation process of the permutation entropy is as follows:
Assume a time series signal x(i), i = 1, 2, …, N, of length N. Delayed reconstruction of the signal gives the following sequence of vectors:
$$X(i) = \{x(i),\ x(i+l),\ \ldots,\ x(i+(m-1)l)\}, \quad i = 1, 2, \ldots, N-(m-1)l$$
where l is the time delay and m is the reconstruction dimension. Arranging the m elements of X(i) in ascending order gives:
$$X'_i = \{x(i+(j_1-1)l) \le x(i+(j_2-1)l) \le \cdots \le x(i+(j_m-1)l)\}$$
Thus each vector X(i) has a permutation sequence:
$$S_g = \{j_1, j_2, j_3, \ldots, j_m\}$$
In the formula, j denotes the index of the column in which each element of the reconstructed component is located.
There are m! different permutations. Calculating the probabilities $p_1, p_2, \ldots, p_k$ ($k \le m!$) with which each permutation appears among the X(i), the normalized permutation entropy is:
$$H = -\frac{1}{\ln(m!)}\sum_{g=1}^{k} p_g \ln p_g$$
where N is the time series length, m is the reconstruction dimension and l is the time delay.
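The permutation entropy calculation described above can be sketched in Python as follows. This is a minimal sketch using the standard definition of normalized permutation entropy (Shannon entropy of the ordinal-pattern frequencies divided by ln(m!)); the function name and default arguments are illustrative.

```python
import math
from collections import Counter
from typing import Sequence


def permutation_entropy(x: Sequence[float], m: int = 3, delay: int = 1) -> float:
    """Normalized permutation entropy of x, in the range [0, 1]."""
    n_vectors = len(x) - (m - 1) * delay
    if m < 2 or delay < 1 or n_vectors < 1:
        raise ValueError("series too short for the chosen m and delay")
    patterns = Counter()
    for i in range(n_vectors):
        # Delay-reconstructed vector X(i) with m elements.
        window = [x[i + j * delay] for j in range(m)]
        # Ordinal pattern: the indices that sort the window in ascending order.
        pattern = tuple(sorted(range(m), key=lambda j: window[j]))
        patterns[pattern] += 1
    # Shannon entropy of the observed pattern probabilities.
    h = -sum((c / n_vectors) * math.log(c / n_vectors)
             for c in patterns.values())
    return h / math.log(math.factorial(m))
```

A strictly increasing series produces a single ordinal pattern and therefore an entropy of 0, while a series in which all m! patterns occur equally often gives 1.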
The beneficial effects are as follows:
When the invention decomposes abnormal sounds in public places with D-ESMD, a random T-distribution noise sequence is added to the public-place abnormal sound signal to be decomposed, which reduces at the source the decomposition deviation caused by background noise and thereby greatly improves the ability to recognize abnormal sounds in public places. In addition, taking into account the characteristics of public-place abnormal sounds and background noise, the invention provides a D-ESMD method for extracting and identifying public-place abnormal sound features, which decomposes the abnormal sound into a series of modal components, each with a single frequency component. At the theoretical level, the interpolation method inside ESMD, the termination condition for mode decomposition and the number of screening iterations for modal components are improved, so that the modal components obtained by decomposition reflect the features of public-place abnormal sounds at different scales.
Drawings
FIG. 1: flow chart of the method for extracting and identifying abnormal sound features in public places provided by the invention;
FIG. 2: simulated signal decomposed by the ESMD interpolation method;
FIG. 3: simulated signal decomposed by the improved interpolation method provided by the invention;
FIG. 4: comparison of the Receiver Operating Characteristic (ROC) curves of the invention and other abnormal sound feature extraction methods.
Detailed Description
The invention is explained in further detail below with reference to the drawings.
The core technology of the invention is a D-ESMD decomposition method. The D-ESMD decomposition method is an improvement based on the ESMD decomposition method, and the improvement points are as follows:
First, an ESMD decomposition method based on the T distribution is adopted to weaken the background noise components in the modal components, so that the features of abnormal sounds are extracted better. Specifically:
A random T-distribution sequence is added to the sound signal to be identified, which weakens the background noise component in the modal components, reduces at the source the decomposition deviation caused by background noise, and improves the ability to extract abnormal sound features. The specific process is as follows:
Suppose the abnormal sound signal of a public place is X(t), which generally consists of the real abnormal sound signal x(t) and the background noise signal N(t), that is:
$$X(t) = x(t) + N(t)$$
When ESMD is used to decompose X(t), the modes obtained likewise contain abnormal sound signal components m(t) and background noise signal components c(t), that is:
$$X(t) = \sum_{j=1}^{n} m_j(t) + \sum_{j=1}^{n} c_j(t) + r(t)$$
In the formula, n is the number of modal components and r(t) is the decomposition residue.
After k different T noise sequences $n_i(t)$ are added to the signal X(t), the following series of equations is obtained:
$$X(t) + n_1(t) = m_{11}(t) + m_{12}(t) + \cdots + m_{1n}(t) + c_{11}(t) + c_{12}(t) + \cdots + c_{1n}(t) + r_1(t)$$
$$X(t) + n_2(t) = m_{21}(t) + m_{22}(t) + \cdots + m_{2n}(t) + c_{21}(t) + c_{22}(t) + \cdots + c_{2n}(t) + r_2(t)$$
………
$$X(t) + n_i(t) = m_{i1}(t) + m_{i2}(t) + \cdots + m_{in}(t) + c_{i1}(t) + c_{i2}(t) + \cdots + c_{in}(t) + r_i(t)$$
………
$$X(t) + n_k(t) = m_{k1}(t) + m_{k2}(t) + \cdots + m_{kn}(t) + c_{k1}(t) + c_{k2}(t) + \cdots + c_{kn}(t) + r_k(t)$$
Adding the k equations gives:
$$kX(t) + \sum_{i=1}^{k} n_i(t) = \sum_{i=1}^{k}\sum_{j=1}^{n} m_{ij}(t) + \sum_{i=1}^{k}\sum_{j=1}^{n} c_{ij}(t) + \sum_{i=1}^{k} r_i(t)$$
As can be seen from the above formula, when k → ∞ the terms $k \cdot N(t) + n_1(t) + n_2(t) + \cdots + n_k(t)$ and $c_{ij}(t)$ all approach zero. Substituting X(t) = x(t) + N(t) and dividing by k, the above equation becomes:
$$x(t) = \frac{1}{k}\sum_{i=1}^{k}\sum_{j=1}^{n} m_{ij}(t) + \frac{1}{k}\sum_{i=1}^{k} r_i(t)$$
As can be seen from the above formula, when random T-distribution noise sequences are added k times to the abnormal sound of a public place and the modes of each order obtained by ESMD decomposition are averaged, the background noise component c(t) is eliminated and the influence of public-place background noise on the abnormal sound decomposition is reduced.
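One ingredient of this argument can be checked numerically: the average of k independent zero-mean T-distributed noise sequences shrinks as k grows. The Python sketch below measures the RMS value of that average; it does not test the stronger claim about the background noise terms. The degrees of freedom (`dof = 5`), sequence length and seed are illustrative assumptions.

```python
import math
import random


def t_sample(dof: float, rng: random.Random) -> float:
    """One Student t sample: standard normal over sqrt(chi-square / dof)."""
    return rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(dof / 2.0, 2.0) / dof)


def rms_of_ensemble_mean(k: int, length: int = 2000,
                         dof: float = 5.0, seed: int = 1) -> float:
    """RMS of the pointwise mean of k independent t-noise sequences."""
    rng = random.Random(seed)
    mean = [0.0] * length
    for _ in range(k):
        for t in range(length):
            mean[t] += t_sample(dof, rng) / k
    return math.sqrt(sum(v * v for v in mean) / length)
```

For a distribution with finite variance, the RMS of the mean is expected to fall roughly as 1/sqrt(k), so `rms_of_ensemble_mean(100)` should be about a tenth of `rms_of_ensemble_mean(1)`.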
Second, symmetric midpoint interpolation is adopted in place of extreme-value midpoint odd-even interpolation, improving the decomposition efficiency and decomposition accuracy of ESMD at the source.
The symmetric midpoint interpolation method comprises the following steps:
Step 3.1: find all maximum points $y_{max}$ and minimum points $y_{min}$ of the original signal;
Step 3.2: connect all adjacent extreme points and find the extreme-value midpoints $y_{mean}$:
$y_{mean} = (y_{max} + y_{min})/2$
Step 3.3: find the symmetric midpoints $y_m$ of adjacent extreme-value midpoints, and interpolate $y_m$ by cubic spline interpolation to obtain the final interpolation curve.
A simulated signal z is decomposed using both symmetric midpoint interpolation and extreme-value midpoint odd-even interpolation. The simulated signal z is assumed to consist of three sinusoidal signals of different frequencies and amplitudes, as follows:
$$z = \sin(20\pi t) + 1.5\cos(40\pi t) + 2.5\cos(80\pi t)$$
As shown in FIG. 2, when the ESMD interpolation method is used to decompose the simulated signal, the generated modes are distorted, and their amplitudes deviate considerably from the original signal. FIG. 3 shows the simulated signal decomposed by the improved interpolation method provided by the invention, which effectively alleviates the distortion caused by the endpoint ambiguity of ESMD interpolation.
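For readers who wish to reproduce the comparison, the simulated signal can be generated with the short Python sketch below. The sampling rate (`fs = 1000` Hz) and the duration are assumptions, since the text does not state them.

```python
import math
from typing import List


def simulated_signal(duration: float = 1.0, fs: float = 1000.0) -> List[float]:
    """Samples of z(t) = sin(20*pi*t) + 1.5*cos(40*pi*t) + 2.5*cos(80*pi*t)."""
    n = int(round(duration * fs))
    return [math.sin(20 * math.pi * k / fs)
            + 1.5 * math.cos(40 * math.pi * k / fs)
            + 2.5 * math.cos(80 * math.pi * k / fs)
            for k in range(n)]
```

The three components have frequencies of 10 Hz, 20 Hz and 40 Hz, so any sampling rate comfortably above 80 Hz satisfies the Nyquist criterion.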
Third, complexity detection based on the permutation entropy algorithm is performed on the modal components obtained by ESMD decomposition; the result serves as the criterion for distinguishing abnormal sound from background noise, and the effective abnormal sound components are obtained adaptively.
The specific calculation process of the permutation entropy is as follows:
Assume a time series signal x(i), i = 1, 2, …, N, of length N. Delayed reconstruction of the signal gives the following sequence of vectors:
$$X(i) = \{x(i),\ x(i+l),\ \ldots,\ x(i+(m-1)l)\}, \quad i = 1, 2, \ldots, N-(m-1)l$$
where l is the time delay and m is the reconstruction dimension. Arranging the m elements of X(i) in ascending order gives:
$$X'_i = \{x(i+(j_1-1)l) \le x(i+(j_2-1)l) \le \cdots \le x(i+(j_m-1)l)\}$$
Thus each vector X(i) has a permutation sequence:
$$S_g = \{j_1, j_2, j_3, \ldots, j_m\}$$
In the formula, j denotes the index of the column in which each element of the reconstructed component is located.
There are m! different permutations. Calculating the probabilities $p_1, p_2, \ldots, p_k$ ($k \le m!$) with which each permutation appears among the X(i), the normalized permutation entropy is:
$$H = -\frac{1}{\ln(m!)}\sum_{g=1}^{k} p_g \ln p_g$$
where N is the time series length, m is the reconstruction dimension and l is the time delay. According to the experimental results, the reconstruction dimension m is generally chosen between 3 and 7. The time delay has little influence on the permutation entropy and can generally be set to 1.
In the invention, a mode is selected according to whether its permutation entropy H is larger than the threshold θ, where the modes are the modal components at different frequency scales obtained by decomposing the public-place abnormal sound signal with the added random T-distribution sequence. Experiments show that abnormal sound features are extracted well when θ is in the range 0.25 to 0.35.
Fourth, the number of screening iterations for modal components
The optimal number of screening iterations for the modes was determined through a large number of experiments; the preferred value is 12.
The invention uses the above improvements to extract and identify abnormal sound features in public places. As shown in FIG. 1, the method mainly comprises three parts: decomposition, feature extraction and identification of the abnormal sound to be identified in a public place.
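The threshold rule can be sketched as a small Python function that, given the permutation entropies of successive modal components, counts how many leading modes are kept before the first one judged to be noise, mirroring the stopping condition of step 2.5. The function name and the default `theta = 0.3` (chosen inside the stated 0.25 to 0.35 range) are illustrative assumptions.

```python
from typing import Sequence


def count_effective_modes(entropies: Sequence[float], theta: float = 0.3) -> int:
    """Number of leading modes whose permutation entropy exceeds theta."""
    kept = 0
    for h in entropies:
        if h <= theta:  # not larger than theta: treated as noise, stop
            break
        kept += 1
    return kept
```

For entropies `[0.8, 0.6, 0.4, 0.2, 0.5]` the function returns 3: decomposition stops at the fourth mode, so the fifth value is never considered.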
The method comprises the following specific steps:
Step 1: Input the abnormal sound signal to be identified in a public place and preprocess it.
Step 2: Decompose the abnormal sound signal to be identified into a series of modal components with the improved extreme-point symmetric mode decomposition (D-ESMD) method, where each order of modal component contains the features of the abnormal sound signal in a different frequency band.
Step 3: Calculate the energy ratio of each order of modal component obtained in step 2 relative to the original abnormal sound signal, combine the energy ratios into a vector and normalize it to serve as the feature vector of the abnormal sound signal to be identified.
Step 4: Judge whether the feature vector is valid; if not, return to step 3; if yes, go to step 5.
Step 5: Identify the abnormal sound to be identified in the public place as follows: first, randomly select a certain number of training samples of each class from an established abnormal sound library, obtain their feature vectors through step 2 and step 3, and establish an SVM classification model; then classify the feature vector of the abnormal sound to be identified with the established SVM classification model to obtain the classification and recognition result.
The D-ESMD procedure used to extract the features of the abnormal sound to be identified in a public place is as follows:
Step 2.1: determine the number N of times T-distributed random noise is added;
Step 2.2: let the abnormal sound signal to be identified be X; add a random T-distribution sequence to it to obtain the noise-added abnormal sound signal $X_i$;
Step 2.3: connect the adjacent extreme points of the noise-added abnormal sound signal $X_i$, denote the midpoints of the line segments as $F_i$, and supplement the left and right boundary points $F_0$ and $F_n$; construct an interpolation curve $L^*$ through the n+1 extreme-value midpoints, using the symmetric midpoint interpolation method in place of the ESMD extreme-value midpoint odd-even interpolation method;
Step 2.4: take $X_i - L^*$ as input and repeat step 2.3 until the number of screening iterations reaches the maximum of 12, obtaining the first-order modal component $M_1^i$; calculate the permutation entropy of the modal component; if the permutation entropy is larger than a predetermined threshold θ, the component is regarded as an abnormal sound modal component, otherwise it is regarded as a noise component;
Step 2.5: if the modal component $M_1^i$ is an abnormal sound modal component, take $X_i - M_1^i$ as the input signal and repeat steps 2.3 to 2.4 until the modal component $M_n^i$ obtained by decomposition is a noise component;
Step 2.6: if i<N, let i = i+1 and repeat steps 2.2 to 2.5, adding a different T-distribution noise signal each time, until N decompositions have been performed; take the overall average of all modal components $M_k^i$ as the final modal component $M_k$ of the signal to be decomposed:
$$M_k = \frac{1}{N}\sum_{i=1}^{N} M_k^i$$
In the above formula, k is the order of the modal component and N is the number of times noise is added.
FIG. 4 compares the ROC curves of the invention and several other abnormal sound feature extraction methods. ESMD is the extreme-point symmetric mode decomposition method, EEMD is the ensemble empirical mode decomposition method, SaSEEMD is an ensemble empirical mode decomposition method based on the alpha-stable distribution, and ELMD is the ensemble local mean decomposition method. D-ESMD is the improved ESMD decomposition method provided by the invention.
Claims (4)
1. A method for extracting and identifying abnormal sound features in public places, characterized by comprising: decomposition, feature extraction and identification of the abnormal sound to be identified in a public place; the specific steps are as follows:
Step 1: input the abnormal sound to be identified in a public place and preprocess it;
Step 2: decompose the abnormal sound signal to be identified with the improved extreme-point symmetric mode decomposition (D-ESMD) method to obtain modal components of each order, where each modal component contains the features of the abnormal sound signal in a different frequency band;
Step 3: calculate the energy ratio of each order of modal component obtained in step 2 relative to the original abnormal sound signal, combine the energy ratios into a vector and normalize it to serve as the feature vector of the abnormal sound signal to be identified;
Step 4: judge whether the feature vector is valid; if not, return to step 3; if yes, go to step 5;
Step 5: identify the abnormal sound to be identified in the public place as follows: first, randomly select a certain number of training samples of each class from an established abnormal sound library, obtain their feature vectors through step 2 and step 3, and establish an SVM classification model; then classify the feature vector of the abnormal sound to be identified with the established SVM classification model to obtain the classification and recognition result;
the D-ESMD decomposition method builds on the extreme-point symmetric mode decomposition (ESMD) method: a random T-distribution noise sequence is added to the abnormal sound to be identified in the public place, a symmetric midpoint interpolation method replaces the extreme-value midpoint odd-even interpolation method of ESMD, permutation entropy values are calculated for the decomposed modal components, and the number of screening iterations for modal components is improved, whereby complexity detection of each mode is completed and the effective modal components of the abnormal sound are obtained adaptively;
the abnormal sound library comprises explosion sounds, screams, gunshots and glass-breaking sounds;
the D-ESMD decomposition method comprises the following specific processes:
step 2.1, determining the number N of times of adding T distributed random noise;
step 2.2, supposing that the abnormal sound signal to be identified is X, adding a random T distribution sequence into the sound signal to be identified to obtain a noise-added abnormal sound signal Xi;
Step 2.3, the abnormal sound signal X after the noise addition is obtainediConnecting adjacent extreme points, and marking the middle point of the line segment as FiSupplement left and right boundary points F0And FnAn interpolation curve L is constructed for n +1 extremum midpoints by adopting a symmetric midpoint interpolation method to replace an ESMD extremum midpoint parity interpolation method*;
Step 2.4 reaction of Xi-L*As input, repeating the above step 2.3 until the screening frequency reaches the maximum value to obtain the first-order modal component M1 iCalculating the value of the permutation entropy of the modal components; if the value of the permutation entropy of the signal is larger than a predetermined threshold value theta, the signal is regarded as an abnormal sound modal component, otherwise, the signal is regarded as a noise component;
step 2.5 if modal component M1 iFor abnormal sound modal component, X is addedi-M1 iRepeating the steps 2.3-2.4 as input signals until the modal component M is obtained by decompositionn iIs a noise component;
step 2.6, if i < N, setting i = i + 1 and repeating steps 2.2 to 2.5 with a different t-distributed noise signal each time, until N decompositions have been performed; the ensemble average of all modal components $M_k^i$ is then taken as the final modal component $M_k$ of the signal to be decomposed:
$$M_k = \frac{1}{N}\sum_{i=1}^{N} M_k^i$$
In the above formula, k is the order of the modal component and N is the number of noise additions.
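The noise-assisted averaging of steps 2.1 to 2.6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `ensemble_modes` and `toy_decompose`, the degrees of freedom `df`, the noise amplitude `noise_scale`, and the rule of truncating every trial to the smallest mode count are all hypothetical, and `toy_decompose` is only a moving-average stand-in for steps 2.3 to 2.5, not ESMD.

```python
import numpy as np

def toy_decompose(x, window=5):
    """Hypothetical stand-in for steps 2.3-2.5: split x into a fast and a slow part."""
    kernel = np.ones(window) / window
    slow = np.convolve(x, kernel, mode="same")  # smoothed trend
    fast = x - slow                             # residual oscillation
    return [fast, slow]

def ensemble_modes(x, n_trials=20, df=3, noise_scale=0.1,
                   decompose=toy_decompose, seed=0):
    """Average the k-th mode over n_trials decompositions of x plus t-distributed noise."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    trials = []
    for _ in range(n_trials):
        # Step 2.2: a fresh Student-t noise sequence for every trial
        # (scaling by the signal's standard deviation is an assumption).
        noise = noise_scale * np.std(x) * rng.standard_t(df, size=x.size)
        trials.append(decompose(x + noise))
    # Assumption: keep only the orders present in every trial.
    n_modes = min(len(modes) for modes in trials)
    # Step 2.6: ensemble average of the k-th mode across all trials.
    return [np.mean([modes[k] for modes in trials], axis=0)
            for k in range(n_modes)]
```

Passing a real decomposition routine as `decompose` would replace the stand-in; the averaging loop itself does not depend on how the modes are produced.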
2. The method for extracting and identifying abnormal sound features in public places according to claim 1, wherein the symmetric midpoint interpolation method comprises the following steps:
step 3.1, letting the input signal be y, finding all maximum points $y_{max}$ and minimum points $y_{min}$ of y;
step 3.2, connecting all adjacent extreme points and finding the extremum midpoints $y_{mean}$:
$$y_{mean} = (y_{max} + y_{min})/2$$
step 3.3, finding the symmetric midpoints $y_m$ of adjacent extremum midpoints, and interpolating $y_m$ by cubic spline interpolation to obtain the final interpolation curve.
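Steps 3.1 to 3.3 can be sketched in Python as below. The claim does not define the "symmetric midpoint" in detail, so this sketch assumes it is the midpoint of two adjacent extremum midpoints; that reading, the boundary handling (holding the first and last values at the signal ends), and the function names `local_extrema` and `symmetric_midpoint_curve` are assumptions made for illustration only.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def local_extrema(y):
    """Indices of interior local maxima and minima (sign change of the first difference)."""
    d = np.diff(y)
    return np.where(d[:-1] * d[1:] < 0)[0] + 1

def symmetric_midpoint_curve(t, y):
    """Interpolation curve through the midpoints of adjacent extremum midpoints."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = local_extrema(y)                      # step 3.1
    if idx.size < 3:
        raise ValueError("need at least three extrema")
    # Step 3.2: midpoints of the segments joining adjacent extrema.
    t_mean = (t[idx[:-1]] + t[idx[1:]]) / 2
    y_mean = (y[idx[:-1]] + y[idx[1:]]) / 2
    # Step 3.3 (assumed reading): midpoints of adjacent extremum midpoints.
    t_m = (t_mean[:-1] + t_mean[1:]) / 2
    y_m = (y_mean[:-1] + y_mean[1:]) / 2
    # Assumed boundary handling: hold the end values at the signal boundaries.
    t_knots = np.concatenate(([t[0]], t_m, [t[-1]]))
    y_knots = np.concatenate(([y_m[0]], y_m, [y_m[-1]]))
    return CubicSpline(t_knots, y_knots)(t)
```

For a sine wave riding on a constant offset, the curve returned by this sketch stays close to the offset, which is the local mean that a screening iteration would subtract.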
3. The method for extracting and identifying abnormal sound features in public places according to claim 1, wherein the maximum number of screening iterations in step 2.4 is preferably 12.
4. The method for extracting and identifying abnormal sound features in public places according to claim 1, wherein the permutation entropy is calculated as follows:
for a time series x(i), i = 1, 2, …, N, of length N, delayed reconstruction yields the vectors:
$$X(i) = \{x(i),\ x(i+l),\ \ldots,\ x(i+(m-1)l)\}$$
where l is the time delay and m is the reconstruction dimension; arranging the m elements of X(i) in ascending order gives:
$$X'_i = \{x(i+(j_1-1)l) \le x(i+(j_2-1)l) \le \cdots \le x(i+(j_m-1)l)\}$$
thus each vector X(i) has a permutation sequence:
$$S_g = \{j_1, j_2, j_3, \ldots, j_m\}$$
where j is the index of the column in which each element of the reconstructed vector is located;
m! different permutations are possible; calculating the probabilities $p_1, p_2, \ldots, p_k$ (k ≤ m!) with which each permutation appears among the vectors X(i), the normalized permutation entropy is:
$$H_p = -\frac{1}{\ln(m!)}\sum_{g=1}^{k} p_g \ln p_g$$
where m is the reconstruction dimension; as above, N is the time series length and l is the time delay.
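The permutation entropy calculation of claim 4 can be sketched in Python as follows. The sketch uses the conventional normalization by ln(m!), which is an assumption about the exact formula of the claim; the function name `permutation_entropy` and the default values m = 3 and l = 1 are likewise illustrative choices, not values taken from the patent.

```python
import math
from collections import Counter

import numpy as np

def permutation_entropy(x, m=3, l=1):
    """Normalized permutation entropy with reconstruction dimension m and time delay l."""
    x = np.asarray(x, dtype=float)
    n_vectors = x.size - (m - 1) * l            # number of reconstructed vectors X(i)
    if n_vectors < 1:
        raise ValueError("series too short for the chosen m and l")
    # The ordinal pattern of each X(i) is the index order that sorts it ascending.
    counts = Counter(
        tuple(np.argsort(x[i:i + (m - 1) * l + 1:l], kind="stable"))
        for i in range(n_vectors)
    )
    # Relative frequency of every pattern that actually occurs.
    p = np.array(list(counts.values()), dtype=float) / n_vectors
    # Shannon entropy of the pattern distribution, divided by its maximum ln(m!).
    return float(-(p * np.log(p)).sum() / math.log(math.factorial(m)))
```

A strictly increasing series contains a single ordinal pattern and therefore scores 0, while white noise spreads almost evenly over all m! patterns and scores close to 1; a threshold such as θ in step 2.4 would sit between these extremes.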
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610674982.1A CN106228979B (en) | 2016-08-16 | 2016-08-16 | Method for extracting and identifying abnormal sound features in public places |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106228979A (en) | 2016-12-14 |
CN106228979B (en) | 2020-01-10 |
Family
ID=57552521
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610674982.1A Active CN106228979B (en) | 2016-08-16 | 2016-08-16 | Method for extracting and identifying abnormal sound features in public places |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106228979B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106683687B (en) * | 2016-12-30 | 2020-02-14 | 杭州华为数字技术有限公司 | Abnormal sound classification method and device |
CN107525671B (en) * | 2017-07-28 | 2020-12-18 | 中国科学院电工研究所 | Method for separating and identifying compound fault characteristics of transmission chain of wind turbine generator |
CN107527617A (en) * | 2017-09-30 | 2017-12-29 | 上海应用技术大学 | Monitoring method, apparatus and system based on voice recognition |
CN108182950B (en) * | 2017-12-28 | 2021-05-28 | 重庆大学 | Improved method for decomposing and extracting abnormal sound characteristics of public places through empirical wavelet transform |
CN109258509B (en) * | 2018-11-16 | 2023-05-02 | 太原理工大学 | Intelligent monitoring system and method for abnormal sound of live pigs |
CN110910897B (en) * | 2019-12-05 | 2023-06-09 | 四川超影科技有限公司 | Feature extraction method for motor abnormal sound recognition |
CN111461090B (en) * | 2020-06-17 | 2020-09-22 | 杭州云智声智能科技有限公司 | Sound vibration signal processing method and system based on environment sample basic cloud model |
CN112906578B (en) * | 2021-02-23 | 2023-09-05 | 北京建筑大学 | Method for denoising bridge time sequence displacement signal |
CN114710419B (en) * | 2022-02-21 | 2023-07-28 | 上海交通大学 | Equipment working state single-point monitoring method and device based on switching power supply sound and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102522082A (en) * | 2011-12-27 | 2012-06-27 | 重庆大学 | Recognizing and locating method for abnormal sound in public places |
CN103730109A (en) * | 2014-01-14 | 2014-04-16 | 重庆大学 | Method for extracting characteristics of abnormal noise in public places |
CN105125204A (en) * | 2015-07-31 | 2015-12-09 | 华中科技大学 | Electrocardiosignal denoising method based on ESMD (extreme-point symmetric mode decomposition) method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9779725B2 (en) * | 2014-12-11 | 2017-10-03 | Mediatek Inc. | Voice wakeup detecting device and method |
Non-Patent Citations (1)
Title |
---|
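Wavelet threshold denoising method for fault data based on CEEMD and permutation entropy; Zhou Taotao et al.; Journal of Vibration and Shock (《振动与冲击》); 2015-12-15; Vol. 34, No. 23; p. 208, sections 1.3-1.4; pp. 209-211, sections 2 and 4 * |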
Also Published As
Publication number | Publication date |
---|---|
CN106228979A (en) | 2016-12-14 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106228979B (en) | Method for extracting and identifying abnormal sound features in public places | |
CN109768985B (en) | Intrusion detection method based on flow visualization and machine learning algorithm | |
Zhai et al. | Insulator fault detection based on spatial morphological features of aerial images | |
CN109800700B (en) | Underwater acoustic signal target classification and identification method based on deep learning | |
CN108009628B (en) | Anomaly detection method based on generation countermeasure network | |
CN110767216B (en) | Voice recognition attack defense method based on PSO algorithm | |
CN102663426B (en) | Face identification method based on wavelet multi-scale analysis and local binary pattern | |
Ding et al. | Autospeech: Neural architecture search for speaker recognition | |
CN107527617A (en) | Monitoring method, apparatus and system based on voice recognition | |
WO2018006797A1 (en) | System and method for detecting keyboard pressing content by using acoustic signal | |
CN102779510B (en) | Speech emotion recognition method based on feature space self-adaptive projection | |
CN104795064B (en) | The recognition methods of sound event under low signal-to-noise ratio sound field scape | |
CN106328120B (en) | Method for extracting abnormal sound features of public places | |
CN105718889A (en) | Human face identity recognition method based on GB(2D)2PCANet depth convolution model | |
US20160335553A1 (en) | Accoustic Context Recognition using Local Binary Pattern Method and Apparatus | |
CN110197665A (en) | A kind of speech Separation and tracking for police criminal detection monitoring | |
CN103778913A (en) | Pathological voice recognition method | |
CN109410919A (en) | A kind of intelligent home control system | |
CN106792706A (en) | A kind of illegal invasion detection method based on wireless network signal spectrum non-stationary property | |
CN107454084B (en) | Nearest neighbor intrusion detection algorithm based on hybrid zone | |
CN113159181B (en) | Industrial control system anomaly detection method and system based on improved deep forest | |
CN111191548B (en) | Discharge signal identification method and identification system based on S transformation | |
CN111667836B (en) | Text irrelevant multi-label speaker recognition method based on deep learning | |
Park et al. | A noise robust audio fingerprint extraction technique for mobile devices using gradient histograms | |
CN112489330A (en) | Warehouse anti-theft alarm method |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |