CN106228979B - Method for extracting and identifying abnormal sound features in public places - Google Patents

Info

Publication number
CN106228979B
Authority
CN
China
Prior art keywords
abnormal sound
esmd
abnormal
signal
decomposition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610674982.1A
Other languages
Chinese (zh)
Other versions
CN106228979A (en)
Inventor
李伟红
田真真
龚卫国
王伟冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University
Priority to CN201610674982.1A
Publication of CN106228979A
Application granted
Publication of CN106228979B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L15/08 Speech classification or search
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention relates to a method for extracting and identifying abnormal sound features in public places. It improves extreme-point symmetric mode decomposition (ESMD); the improved method is called D-ESMD for short. Its characteristics are as follows: a random T-distribution sequence is added to the abnormal sound signal to reduce the influence of background noise in public places on feature extraction; to address the poor decomposition effect of the original ESMD on abnormal sounds, a symmetric midpoint interpolation method replaces the extreme-value midpoint odd-even interpolation method, which improves decomposition efficiency and recognition rate; to address the shortcomings of the original ESMD in selecting effective decomposition modes, the complexity of each mode obtained by ESMD decomposition is checked with the permutation entropy algorithm, so that the effective modal components of the abnormal sound are obtained adaptively. The method describes the characteristics of abnormal sounds more fully, extracts them more accurately, yields better classification and recognition results, and is more robust to environmental background noise.

Description

Method for extracting and identifying abnormal sound features in public places
Technical Field
The invention belongs to the technical field of audio signal feature extraction and pattern recognition, and particularly relates to a method for extracting and recognizing abnormal sound features in public places.
Background
Public places such as squares, bus stations and subways are characterized by heavy pedestrian traffic and large areas, and their security has long been a concern of governments and the public in every country. At present, surveillance technology based mainly on video monitoring plays an active role in public security, but video monitoring suffers from blind spots, blurred images in rainy weather and similar problems. Abnormal events are often accompanied by abnormal sounds such as screams, gunshots, breaking glass and explosions, so combining audio monitoring with video monitoring has become a development direction in public-place security monitoring. Existing audio monitoring systems only perform simple sound collection and transmission and lack effective identification of abnormal sounds, because the core theory and technology of audio monitoring have not yet achieved a breakthrough. Abnormal sound recognition in public places is the core technology of an audio monitoring system, so research on it has important social significance and research value.
At present, extracting abnormal sound features in public places with the Extreme-point Symmetric Mode Decomposition (ESMD) method has the following problems. ① A sound recorded in a public place consists of an abnormal sound signal and a background noise signal, and the background noise can mask local features of the abnormal sound. When ESMD decomposes such a sound, the resulting modal components necessarily contain background noise components, which biases the abnormal sound features. ② To improve the decomposition, ESMD constructs 1, 2, 3 or more interpolation curves through the extreme-value midpoints, giving the ESMD-I, ESMD-II and ESMD-III methods. The interpolation method strongly affects the modal decomposition: comparing the three methods shows that as the number of interpolation curves increases, the number of modes decreases, the degree of symmetry decreases, the amplitude variation is enhanced and the decomposition efficiency improves. ③ ESMD uses the number of remaining extreme points as the decomposition termination condition. When it decomposes abnormal sounds that contain background noise, it does not judge whether a decomposed mode is effective, so low-frequency noise modes are retained. ④ The number of sifting iterations in ESMD varies within the interval $[K_{\min}, K_{\max}]$; the abnormal sound signal is decomposed repeatedly with different sifting counts and the optimal count is finally selected by the least-squares principle, so ESMD takes a long time to decompose abnormal sound signals.
In summary, the ESMD decomposition technique has room for improvement.
Disclosure of Invention
To address the defects of the prior art, the invention provides a method for extracting and identifying abnormal sound features in public places based on an improved ESMD (D-ESMD) decomposition technique. It adds noise to the input signal of ESMD and improves the internal interpolation method, the criterion for terminating mode decomposition and the number of sifting iterations for modal components, so as to obtain the features of abnormal sounds in public places at different scales.
A method for extracting and identifying abnormal sound features in public places comprises the following specific steps:
Step 1: Input the abnormal sound to be identified in a public place and preprocess it.
Step 2: Decompose the abnormal sound signal to be identified with the improved extreme-point symmetric mode decomposition (D-ESMD) method to obtain the modal components of each order; each modal component contains the features of the abnormal sound signal in a different frequency band.
Step 3: Calculate the energy ratio of each order of modal component obtained in step 2 relative to the original abnormal sound signal, combine the energy ratios into a vector and normalize it; the result serves as the feature vector of the abnormal sound signal to be identified.
Step 4: Judge whether the feature vector is valid; if not, go to step 3; if so, go to step 5.
Step 5: Identify the abnormal sound in the public place: first, randomly select a certain number of training samples of each class from an established abnormal sound library, obtain their feature vectors through steps 2 and 3 and build an SVM classification model; then classify the feature vector of the abnormal sound to be identified with the SVM classification model to obtain the classification and recognition result.
The D-ESMD decomposition method builds on the ESMD decomposition method as follows: a random T-distribution noise sequence is added to the abnormal sound to be identified; a symmetric midpoint interpolation method replaces ESMD's extreme-value midpoint odd-even interpolation method; the number of sifting iterations for modal components is improved; and the permutation entropy of each decomposed modal component is calculated to check the complexity of each mode, so that the effective modal components of the abnormal sound are obtained adaptively.
The abnormal sound library comprises explosion, scream, gunshot and glass-breaking sounds.
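The feature vector of step 3 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, and the normalization is assumed to be a division by the sum of the ratios, since the text only says that the vector is normalized.

```python
def energy(samples):
    """Sum of squared samples."""
    return sum(s * s for s in samples)


def energy_ratio_features(signal, modes):
    """Energy of each mode relative to the original signal, scaled to sum to 1."""
    total = energy(signal)
    if total == 0:
        raise ValueError("signal has zero energy")
    ratios = [energy(mode) / total for mode in modes]
    norm = sum(ratios)
    # Assumption: L1 normalization; the text does not name the normalization used.
    return [r / norm for r in ratios] if norm > 0 else ratios


signal = [1.0, 2.0]
modes = [[1.0, 0.0], [0.0, 2.0]]  # toy "modal components" that sum to the signal
features = energy_ratio_features(signal, modes)
```

In practice `modes` would be the modal components returned by the decomposition in step 2, and `features` would be passed to the SVM classifier of step 5.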
Specifically, the D-ESMD decomposition method comprises the following specific processes:
Step 2.1: Determine the number N of times that T-distributed random noise is added.
Step 2.2: Let the abnormal sound signal to be identified be $X$; add a random T-distribution sequence to it to obtain the noise-added abnormal sound signal $X_i$.
Step 2.3: Find all extreme points of the noise-added abnormal sound signal $X_i$, connect adjacent extreme points and denote the midpoints of the line segments as $F_i$; supplement the left and right boundary points $F_0$ and $F_n$; construct an interpolation curve $L^*$ through the $n+1$ extreme-value midpoints, using the symmetric midpoint interpolation method in place of ESMD's extreme-value midpoint odd-even interpolation method.
Step 2.4: Take $X_i - L^*$ as input and repeat step 2.3 until the number of sifting iterations reaches its maximum, which gives the first-order modal component $M_1^i$. Calculate the permutation entropy of the modal component; if it is greater than a predetermined threshold $\theta$, the component is regarded as an abnormal sound modal component, otherwise as a noise component.
Step 2.5: If the modal component $M_1^i$ is an abnormal sound modal component, take $X_i - M_1^i$ as the input signal and repeat steps 2.3 to 2.4 until the decomposed modal component $M_n^i$ is a noise component.
Step 2.6: If $i < N$, let $i = i + 1$ and repeat steps 2.2 to 2.5, adding a different T-distribution noise signal each time, until N decompositions have been performed. Take the ensemble average of all the modal components $M_k^i$ obtained and use the result as the final modal component $M_k$ of the signal to be decomposed:
$$M_k = \frac{1}{N}\sum_{i=1}^{N} M_k^i$$
In the above formula, k is the order of the modal component and N is the number of times noise is added.
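The outer loop of steps 2.1 to 2.6 can be sketched as below. The single-run decomposition and the noise generator are passed in as callables because they are described separately; `decompose` and `make_noise` are hypothetical names, and keeping only the mode orders shared by all runs is an assumption, since the text does not say what happens when runs produce different numbers of modes.

```python
def ensemble_average(runs):
    """Average the k-th mode over all runs: M_k = (1/N) * sum_i M_k^i."""
    n_runs = len(runs)
    # Assumption: runs may yield different mode counts; keep the orders all runs share.
    n_modes = min(len(modes) for modes in runs)
    averaged = []
    for k in range(n_modes):
        length = len(runs[0][k])
        averaged.append(
            [sum(run[k][t] for run in runs) / n_runs for t in range(length)]
        )
    return averaged


def d_esmd_ensemble(signal, decompose, make_noise, n_trials):
    """Outline of steps 2.1-2.6: add fresh noise, decompose, then average the modes."""
    runs = []
    for _ in range(n_trials):
        noise = make_noise(len(signal))  # a new noise sequence for each trial
        noisy = [s + e for s, e in zip(signal, noise)]
        runs.append(decompose(noisy))    # hypothetical: returns a list of modes
    return ensemble_average(runs)


# Toy usage: a stand-in "decomposition" that splits the input into two equal halves.
toy_modes = d_esmd_ensemble(
    [2.0, 4.0],
    decompose=lambda x: [[v / 2 for v in x], [v / 2 for v in x]],
    make_noise=lambda n: [0.0] * n,
    n_trials=3,
)
```

Separating the ensemble loop from the decomposition keeps the averaging formula testable on its own.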
Specifically, the symmetric midpoint interpolation method specifically comprises the following steps:
Step 3.1: Let the input signal be $y$; find all maximum points $y_{\max}$ and minimum points $y_{\min}$ of $y$.
Step 3.2: Connect all adjacent extreme points and find the extreme-value midpoints $y_{mean}$:
$y_{mean} = (y_{\max} + y_{\min})/2$
Step 3.3: Find the symmetric midpoints $y_m$ of adjacent extreme-value midpoints, and interpolate $y_m$ with the cubic spline interpolation method to obtain the final interpolation curve.
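The point construction in steps 3.1 to 3.3 can be sketched as follows. Two things here are assumptions: the "symmetric midpoint" is read as the midpoint of two adjacent extreme-value midpoints, and extrema are detected as strict interior local maxima and minima. The cubic spline fit through the resulting points is left out of this sketch.

```python
def local_extrema(y):
    """Indices of strict interior local maxima and minima, in order."""
    idx = []
    for i in range(1, len(y) - 1):
        is_max = y[i - 1] < y[i] > y[i + 1]
        is_min = y[i - 1] > y[i] < y[i + 1]
        if is_max or is_min:
            idx.append(i)
    return idx


def midpoints(points):
    """Midpoint of every pair of adjacent (t, value) points."""
    return [((t0 + t1) / 2, (v0 + v1) / 2)
            for (t0, v0), (t1, v1) in zip(points, points[1:])]


y = [0, 1, 0, -1, 0, 2, 0]
extrema = [(i, y[i]) for i in local_extrema(y)]  # step 3.1
y_mean = midpoints(extrema)   # step 3.2: midpoints of adjacent extreme points
y_m = midpoints(y_mean)       # step 3.3 (assumed reading): midpoints of adjacent midpoints
```

The points in `y_m` would then be the knots of the cubic spline that forms the interpolation curve.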
Specifically, the maximum number of sifting iterations in step 2.4 is preferably 12.
Specifically, the specific calculation process of the permutation entropy is as follows:
Given a time series $x(i)$, $i = 1, 2, \ldots, N$, of length N, delay reconstruction yields the following sequence of vectors:
$$X(i) = \{x(i),\ x(i+l),\ \ldots,\ x(i+(m-1)l)\}$$
where $l$ is the time delay and $m$ is the reconstruction dimension. Arranging the $m$ elements of $X(i)$ in ascending order gives:
$$X_i' = \{x(i+(j_1-1)l) \le x(i+(j_2-1)l) \le \cdots \le x(i+(j_m-1)l)\}$$
Thus each vector $X(i)$ has a permutation sequence:
$$S_g = \{j_1, j_2, j_3, \ldots, j_m\}$$
where $j$ denotes the index of the column in which each element of the reconstructed component is located.
There are $m!$ different permutations in total. Let $p_1, p_2, \ldots, p_k$ ($k \le m!$) be the probabilities with which the permutations occur among the vectors $X(i)$; the normalized permutation entropy is then:
$$H = -\frac{1}{\ln(m!)}\sum_{g=1}^{k} p_g \ln p_g$$
where N is the time series length, m is the reconstruction dimension and l is the time delay.
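A minimal sketch of the normalized permutation entropy described above, using only the Python standard library. The function name and the tie-breaking rule (equal values ordered by position) are assumptions; the text does not specify how ties are handled.

```python
import math
from collections import Counter


def permutation_entropy(x, m=3, l=1):
    """Normalized permutation entropy with reconstruction dimension m and delay l."""
    if m < 2:
        raise ValueError("m must be at least 2")
    n_vectors = len(x) - (m - 1) * l
    if n_vectors < 1:
        raise ValueError("series too short for the chosen m and l")
    counts = Counter()
    for i in range(n_vectors):
        window = [x[i + j * l] for j in range(m)]
        # Ordinal pattern: positions sorted by value (ties broken by position).
        pattern = tuple(sorted(range(m), key=lambda j: window[j]))
        counts[pattern] += 1
    probs = [c / n_vectors for c in counts.values()]
    h = sum(p * math.log(1.0 / p) for p in probs)  # equals -sum(p * ln p)
    return h / math.log(math.factorial(m))         # divide by ln(m!) to normalize


flat = permutation_entropy([1, 2, 3, 4, 5])                # only one ordinal pattern occurs
mixed = permutation_entropy([4, 7, 9, 10, 6, 11, 3], m=2)  # two patterns, 4 rising and 2 falling
```

A monotone series gives an entropy of 0, and the value approaches 1 as all $m!$ patterns become equally likely.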
The beneficial effects are as follows:
When the invention decomposes abnormal sounds in public places with D-ESMD, a random T-distribution noise sequence is added to the abnormal sound signal to be decomposed, which reduces at the source the decomposition deviation caused by background noise and thereby greatly improves the ability to recognize abnormal sounds in public places. In addition, the invention takes the characteristics of both abnormal sounds and background noise in public places into account and provides the D-ESMD method for extracting and identifying abnormal sound features, which decomposes an abnormal sound into a series of modal components, each with a single frequency component. In terms of theory, the interpolation method inside ESMD, the criterion for terminating mode decomposition and the number of sifting iterations for modal components are improved, so that the decomposed modal components reflect the features of abnormal sounds in public places at different scales.
Drawings
FIG. 1: flow chart of the method for extracting and identifying abnormal sound features in public places provided by the invention;
FIG. 2: simulated signal decomposed with the ESMD interpolation method;
FIG. 3: simulated signal decomposed with the improved interpolation method provided by the invention;
FIG. 4: comparison of the receiver operating characteristic (ROC) curves of the invention and other abnormal sound feature extraction methods.
Detailed Description
The invention is explained in further detail below with reference to the drawings.
The core technology of the invention is the D-ESMD decomposition method, which improves on the ESMD decomposition method in the following respects:
First, a T-distribution-based ESMD decomposition is adopted to weaken the background noise components in the modal components, so that the features of abnormal sounds are extracted better. Specifically:
A random T-distribution sequence is added to the sound signal to be identified, which weakens the background noise component in the modal components, reduces at the source the decomposition deviation caused by background noise, and improves the ability to extract abnormal sound features. The process is as follows:
Suppose the abnormal sound signal of the public place is $X(t)$, which generally consists of the real abnormal sound signal $x(t)$ and the background noise signal $N(t)$, that is:
X(t)=x(t)+N(t)
When ESMD is used to decompose $X(t)$, each resulting mode also contains an abnormal sound signal component $m_j(t)$ and a background noise signal component $c_j(t)$, that is:
$$X(t) = \sum_{j=1}^{n}\left[m_j(t) + c_j(t)\right] + r(t)$$
In the formula, n is the number of modal components and r(t) is the decomposition residue.
After k different T-distribution noise sequences $n_i(t)$ are added to the signal $X(t)$, the decompositions can be written as:
$$X(t) + n_1(t) = m_{11}(t) + m_{12}(t) + \cdots + m_{1n}(t) + c_{11}(t) + c_{12}(t) + \cdots + c_{1n}(t) + r_1(t)$$
$$X(t) + n_2(t) = m_{21}(t) + m_{22}(t) + \cdots + m_{2n}(t) + c_{21}(t) + c_{22}(t) + \cdots + c_{2n}(t) + r_2(t)$$
$$\vdots$$
$$X(t) + n_i(t) = m_{i1}(t) + m_{i2}(t) + \cdots + m_{in}(t) + c_{i1}(t) + c_{i2}(t) + \cdots + c_{in}(t) + r_i(t)$$
$$\vdots$$
$$X(t) + n_k(t) = m_{k1}(t) + m_{k2}(t) + \cdots + m_{kn}(t) + c_{k1}(t) + c_{k2}(t) + \cdots + c_{kn}(t) + r_k(t)$$
Adding the k equations gives:
$$k\,X(t) + \sum_{i=1}^{k} n_i(t) = \sum_{i=1}^{k}\sum_{j=1}^{n} m_{ij}(t) + \sum_{i=1}^{k}\sum_{j=1}^{n} c_{ij}(t) + \sum_{i=1}^{k} r_i(t)$$
As this equation shows, when $k \to \infty$ the term $k\,N(t) + n_1(t) + n_2(t) + \cdots + n_k(t)$ and the terms in $c_{ij}(t)$ all approach zero. Substituting $X(t) = x(t) + N(t)$ and dividing by k, the equation becomes:
$$x(t) = \frac{1}{k}\sum_{i=1}^{k}\sum_{j=1}^{n} m_{ij}(t) + \frac{1}{k}\sum_{i=1}^{k} r_i(t)$$
As this equation shows, when random T-distribution noise sequences are added k times to the abnormal sound in a public place and each order of modes obtained by ESMD decomposition is averaged, the background noise component c(t) is eliminated and the influence of background noise in public places on abnormal sound decomposition is reduced.
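One part of the argument above is easy to check numerically: the point-wise average of k independent zero-mean noise sequences shrinks as k grows. The sketch below illustrates only that part, not the claim about the background noise $N(t)$. It assumes that "T distribution" means Student's t-distribution; the degrees of freedom, sequence length and seed are arbitrary illustrative choices.

```python
import math
import random


def student_t_noise(n, dof=5.0, scale=1.0, rng=random):
    """n samples of Student's t noise: Z / sqrt(V / dof), Z ~ N(0, 1), V ~ chi-square(dof)."""
    samples = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        v = rng.gammavariate(dof / 2.0, 2.0)  # chi-square(dof) as Gamma(shape dof/2, scale 2)
        samples.append(scale * z / math.sqrt(v / dof))
    return samples


def rms(samples):
    """Root mean square of a sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mean_noise_rms(k, n=200, seed=0):
    """RMS of the point-wise average of k independent noise sequences."""
    rng = random.Random(seed)
    total = [0.0] * n
    for _ in range(k):
        total = [a + b for a, b in zip(total, student_t_noise(n, rng=rng))]
    return rms([a / k for a in total])


single = mean_noise_rms(1)
averaged = mean_noise_rms(400)  # expected to be far smaller than `single`
```

For noise with finite variance the RMS of the average falls roughly as $1/\sqrt{k}$, which is why the number of noise additions N trades accuracy against computation time.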
Second, symmetric midpoint interpolation replaces extreme-value midpoint odd-even interpolation, which improves the efficiency and accuracy of ESMD decomposition at the source.
The symmetric midpoint interpolation method comprises the following steps:
Step 3.1: Find all maximum points $y_{\max}$ and minimum points $y_{\min}$ of the original signal.
Step 3.2: Connect all adjacent extreme points and find the extreme-value midpoints $y_{mean}$:
$y_{mean} = (y_{\max} + y_{\min})/2$
Step 3.3: Find the symmetric midpoints $y_m$ of adjacent extreme-value midpoints, and interpolate $y_m$ with the cubic spline interpolation method to obtain the final interpolation curve.
A simulated signal z is decomposed with symmetric midpoint interpolation and with extreme-value midpoint odd-even interpolation. The simulated signal z consists of three sinusoidal signals with different frequencies and amplitudes, as follows:
z=sin(20*π*t)+1.5cos(40*π*t)+2.5cos(80*π*t)
As shown in FIG. 2, when the ESMD interpolation method is used to decompose the simulated signal, the resulting modes are distorted and deviate considerably in amplitude from the original signal. FIG. 3 shows the simulated signal decomposed with the improved interpolation method provided by the invention, which effectively alleviates the distortion caused by the endpoint ambiguity of ESMD interpolation.
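The simulated signal can be generated as follows for anyone wishing to repeat the comparison. The sampling rate and duration are assumptions, since the text does not state them; the three components have frequencies of 10 Hz, 20 Hz and 40 Hz.

```python
import math


def simulated_signal(t):
    """z(t) = sin(20*pi*t) + 1.5*cos(40*pi*t) + 2.5*cos(80*pi*t)."""
    return (math.sin(20 * math.pi * t)
            + 1.5 * math.cos(40 * math.pi * t)
            + 2.5 * math.cos(80 * math.pi * t))


fs = 1000                             # assumed sampling rate in Hz; not given in the text
t_axis = [i / fs for i in range(fs)]  # one second of samples (assumed duration)
z = [simulated_signal(t) for t in t_axis]
```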
Third, the complexity of the modal components obtained by ESMD decomposition is checked with the permutation entropy algorithm; the result serves as the criterion for distinguishing abnormal sound from background noise, so that the effective abnormal sound components are obtained adaptively.
The specific calculation process of the permutation entropy is as follows:
Given a time series $x(i)$, $i = 1, 2, \ldots, N$, of length N, delay reconstruction yields the following sequence of vectors:
$$X(i) = \{x(i),\ x(i+l),\ \ldots,\ x(i+(m-1)l)\}$$
where $l$ is the time delay and $m$ is the reconstruction dimension. Arranging the $m$ elements of $X(i)$ in ascending order gives:
$$X_i' = \{x(i+(j_1-1)l) \le x(i+(j_2-1)l) \le \cdots \le x(i+(j_m-1)l)\}$$
Thus each vector $X(i)$ has a permutation sequence:
$$S_g = \{j_1, j_2, j_3, \ldots, j_m\}$$
where $j$ denotes the index of the column in which each element of the reconstructed component is located.
There are $m!$ different permutations in total. Let $p_1, p_2, \ldots, p_k$ ($k \le m!$) be the probabilities with which the permutations occur among the vectors $X(i)$; the normalized permutation entropy is then:
$$H = -\frac{1}{\ln(m!)}\sum_{g=1}^{k} p_g \ln p_g$$
where N is the time series length, m is the reconstruction dimension and l is the time delay. According to the experimental results, the reconstruction dimension m is generally chosen between 3 and 7. The time delay has little influence on the permutation entropy and can generally be set to 1.
In the invention, a mode is selected by judging whether the permutation entropy H of each modal component at a different frequency scale, obtained by decomposing the public-place abnormal sound signal with the added random T-distribution sequence, is greater than the threshold θ. Experiments show that abnormal sound features are extracted well when θ is in the range 0.25 to 0.35.
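The threshold rule can be sketched as below, assuming the permutation entropy of each mode has already been computed. The function name is hypothetical, and the default θ = 0.3 is simply the middle of the stated 0.25 to 0.35 range, not a value given in the text.

```python
def select_effective_modes(modes, entropies, theta=0.3):
    """Keep leading modes whose permutation entropy exceeds theta; stop at the first that does not."""
    kept = []
    for mode, h in zip(modes, entropies):
        if h <= theta:  # treated as a noise component, which ends the decomposition
            break
        kept.append(mode)
    return kept


# Toy usage with mode labels in place of sample arrays.
kept = select_effective_modes(["M1", "M2", "M3", "M4"], [0.8, 0.5, 0.2, 0.6])
```

Stopping at the first component at or below θ mirrors the rule that decomposition continues only until a noise component is obtained, so later components are not examined.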
Fourth, the number of sifting iterations for modal components
The optimal number of sifting iterations was determined through a large number of experiments; the preferred value is 12.
Using the above improvements, the invention realizes the extraction and identification of abnormal sound features in public places. As shown in FIG. 1, the method mainly comprises three parts: decomposition, feature extraction and identification of the abnormal sound to be identified in a public place.
The method comprises the following specific steps:
Step 1: Input the abnormal sound signal to be identified in a public place and preprocess it.
Step 2: Decompose the abnormal sound signal to be identified into a series of modal components with the improved extreme-point symmetric mode decomposition (D-ESMD) method; each order of modal component contains the features of the abnormal sound signal in a different frequency band.
Step 3: Calculate the energy ratio of each order of modal component obtained in step 2 relative to the original abnormal sound signal, combine the energy ratios into a vector and normalize it; the result serves as the feature vector of the abnormal sound signal to be identified.
Step 4: Judge whether the feature vector is valid; if not, go to step 3; if so, go to step 5.
Step 5: Identify the abnormal sound in the public place: first, randomly select a certain number of training samples of each class from an established abnormal sound library, obtain their feature vectors through steps 2 and 3 and build an SVM classification model; then classify the feature vector of the abnormal sound to be identified with the SVM classification model to obtain the classification and recognition result.
The D-ESMD procedure for extracting the features of the abnormal sound to be identified in a public place is as follows:
Step 2.1: Determine the number N of times that T-distributed random noise is added.
Step 2.2: Let the abnormal sound signal to be identified be $X$; add a random T-distribution sequence to it to obtain the noise-added abnormal sound signal $X_i$.
Step 2.3: Find all extreme points of the noise-added abnormal sound signal $X_i$, connect adjacent extreme points and denote the midpoints of the line segments as $F_i$; supplement the left and right boundary points $F_0$ and $F_n$. Construct an interpolation curve $L^*$ through the $n+1$ extreme-value midpoints, using the symmetric midpoint interpolation method in place of ESMD's extreme-value midpoint odd-even interpolation method.
Step 2.4: Take $X_i - L^*$ as input and repeat step 2.3 until the number of sifting iterations reaches the maximum of 12, which gives the first-order modal component $M_1^i$. Calculate the permutation entropy of the modal component; if it is greater than a predetermined threshold $\theta$, the component is regarded as an abnormal sound modal component, otherwise as a noise component.
Step 2.5: If the modal component $M_1^i$ is an abnormal sound modal component, take $X_i - M_1^i$ as the input signal and repeat steps 2.3 to 2.4 until the decomposed modal component $M_n^i$ is a noise component.
Step 2.6: If $i < N$, let $i = i + 1$ and repeat steps 2.2 to 2.5, adding a different T-distribution noise signal each time, until N decompositions have been performed. Take the ensemble average of all the modal components $M_k^i$ obtained and use the result as the final modal component $M_k$ of the signal to be decomposed:
$$M_k = \frac{1}{N}\sum_{i=1}^{N} M_k^i$$
In the above formula, k is the order of the modal component and N is the number of times noise is added.
FIG. 4 compares the ROC curves of the invention and several other abnormal sound feature extraction methods. ESMD is extreme-point symmetric mode decomposition, EEMD is ensemble empirical mode decomposition, SaSEEMD is ensemble empirical mode decomposition based on the alpha distribution, and ELMD is ensemble local mean decomposition. D-ESMD is the improved ESMD decomposition method provided by the invention.

Claims (4)

1. A method for extracting and identifying abnormal sound features in public places, characterized by comprising decomposition, feature extraction and identification of the abnormal sound to be identified in a public place; the specific steps are as follows:
Step 1: input the abnormal sound to be identified in a public place and preprocess it;
Step 2: decompose the abnormal sound signal to be identified with the improved extreme-point symmetric mode decomposition (D-ESMD) method to obtain the modal components of each order, each modal component containing the features of the abnormal sound signal in a different frequency band;
Step 3: calculate the energy ratio of each order of modal component obtained in step 2 relative to the original abnormal sound signal, combine the energy ratios into a vector and normalize it, the result serving as the feature vector of the abnormal sound signal to be identified;
Step 4: judge whether the feature vector is valid; if not, go to step 3; if so, go to step 5;
Step 5: identify the abnormal sound in the public place: first, randomly select a certain number of training samples of each class from an established abnormal sound library, obtain their feature vectors through steps 2 and 3 and build an SVM classification model; then classify the feature vector of the abnormal sound to be identified with the SVM classification model to obtain the classification and recognition result;
the D-ESMD decomposition method builds on the ESMD decomposition method as follows: a random T-distribution noise sequence is added to the abnormal sound to be identified in a public place, a symmetric midpoint interpolation method replaces ESMD's extreme-value midpoint odd-even interpolation method, the number of sifting iterations for modal components is improved, and the permutation entropy of each decomposed modal component is calculated to check the complexity of each mode, so that the effective modal components of the abnormal sound are obtained adaptively;
the abnormal sound library comprises explosion, scream, gunshot and glass-breaking sounds;
the specific process of the D-ESMD decomposition method is as follows:
Step 2.1: determine the number N of times that T-distributed random noise is added;
Step 2.2: let the abnormal sound signal to be identified be $X$, and add a random T-distribution sequence to it to obtain the noise-added abnormal sound signal $X_i$;
Step 2.3: find all extreme points of the noise-added abnormal sound signal $X_i$, connect adjacent extreme points and denote the midpoints of the line segments as $F_i$; supplement the left and right boundary points $F_0$ and $F_n$; construct an interpolation curve $L^*$ through the $n+1$ extreme-value midpoints, using the symmetric midpoint interpolation method in place of ESMD's extreme-value midpoint odd-even interpolation method;
Step 2.4: take $X_i - L^*$ as input and repeat step 2.3 until the number of sifting iterations reaches its maximum, which gives the first-order modal component $M_1^i$; calculate the permutation entropy of the modal component; if it is greater than a predetermined threshold $\theta$, the component is regarded as an abnormal sound modal component, otherwise as a noise component;
Step 2.5: if the modal component $M_1^i$ is an abnormal sound modal component, take $X_i - M_1^i$ as the input signal and repeat steps 2.3 to 2.4 until the decomposed modal component $M_n^i$ is a noise component;
Step 2.6: if $i < N$, let $i = i + 1$ and repeat steps 2.2 to 2.5, adding a different T-distribution noise signal each time, until N decompositions have been performed; take the ensemble average of all the modal components $M_k^i$ obtained and use the result as the final modal component $M_k$ of the signal to be decomposed:
$$M_k = \frac{1}{N}\sum_{i=1}^{N} M_k^i$$
In the above formula, k is the order of the modal component and N is the number of times noise is added.
2. The method for extracting and identifying the abnormal sound features in the public places according to claim 1, wherein the symmetrical midpoint interpolation method comprises the following specific steps:
Step 3.1: let the input signal be $y$, and find all maximum points $y_{\max}$ and minimum points $y_{\min}$ of $y$;
Step 3.2: connect all adjacent extreme points and find the extreme-value midpoints $y_{mean}$:
$y_{mean} = (y_{\max} + y_{\min})/2$
Step 3.3: find the symmetric midpoints $y_m$ of adjacent extreme-value midpoints, and interpolate $y_m$ with the cubic spline interpolation method to obtain the final interpolation curve.
3. The method for extracting and identifying the abnormal sound features in the public places according to claim 1, wherein the maximum number of screening (sifting) iterations in step 2.4 is preferably 12.
4. The method for extracting and identifying the abnormal sound features in the public places according to claim 1, wherein the specific calculation process of the permutation entropy is as follows:
assuming a time series signal x(i) of length N, i = 1, 2, …, N, delay reconstruction of the signal yields the following vectors:
X(i) = {x(i), x(i+l), …, x(i+(m-1)*l)}
where l is the time delay and m is the reconstruction dimension; arranging the m elements of X(i) in ascending order gives:
X'(i) = {x(i+(j_1-1)*l) ≤ x(i+(j_2-1)*l) ≤ … ≤ x(i+(j_m-1)*l)}
thus each vector X(i) yields a permutation sequence:
S_g = {j_1, j_2, j_3, …, j_m}
where each j is the index of the column in which the corresponding element of the reconstructed vector is located;
there are m! different possible permutations in total; calculating the probabilities p_1, p_2, …, p_k of the k ≤ m! permutations that appear among the vectors X(i), the normalized permutation entropy is:
H_p = -(1/ln(m!)) * Σ_{g=1..k} p_g * ln(p_g)
where N is the time series length, m is the reconstruction dimension and l is the time delay.
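The permutation entropy calculation of claim 4 can be sketched in plain Python as below. This is an illustrative reading of the claim rather than the inventors' code: the function name `permutation_entropy` and the defaults `m=3`, `l=1` are assumptions, ties are broken by original position, and indices are 0-based where the claim counts from 1.

```python
import math
from collections import Counter


def permutation_entropy(x, m=3, l=1):
    """Normalized permutation entropy of the sequence x, in the range [0, 1]."""
    n_vectors = len(x) - (m - 1) * l
    if n_vectors < 1:
        raise ValueError("series too short for the chosen m and l")
    patterns = Counter()
    for i in range(n_vectors):
        # Delay-reconstructed vector X(i) = {x(i), x(i+l), ..., x(i+(m-1)l)}.
        window = [x[i + j * l] for j in range(m)]
        # Ordinal pattern: the column indices that sort the vector ascending.
        pattern = tuple(sorted(range(m), key=window.__getitem__))
        patterns[pattern] += 1
    # Relative frequency of each pattern that actually occurs.
    probs = [count / n_vectors for count in patterns.values()]
    entropy = sum(-p * math.log(p) for p in probs)
    # Normalize by ln(m!), the entropy of a uniform distribution over patterns.
    return entropy / math.log(math.factorial(m))
```

A strictly increasing series produces a single ordinal pattern and therefore an entropy of 0, while a series whose patterns are spread evenly over all m! permutations approaches 1. In step 2.4 the returned value would be compared with the threshold θ.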
CN201610674982.1A 2016-08-16 2016-08-16 Method for extracting and identifying abnormal sound features in public places Active CN106228979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610674982.1A CN106228979B (en) 2016-08-16 2016-08-16 Method for extracting and identifying abnormal sound features in public places

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610674982.1A CN106228979B (en) 2016-08-16 2016-08-16 Method for extracting and identifying abnormal sound features in public places

Publications (2)

Publication Number Publication Date
CN106228979A CN106228979A (en) 2016-12-14
CN106228979B true CN106228979B (en) 2020-01-10

Family

ID=57552521

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610674982.1A Active CN106228979B (en) 2016-08-16 2016-08-16 Method for extracting and identifying abnormal sound features in public places

Country Status (1)

Country Link
CN (1) CN106228979B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106683687B (en) * 2016-12-30 2020-02-14 杭州华为数字技术有限公司 Abnormal sound classification method and device
CN107525671B (en) * 2017-07-28 2020-12-18 中国科学院电工研究所 Method for separating and identifying compound fault characteristics of transmission chain of wind turbine generator
CN107527617A (en) * 2017-09-30 2017-12-29 上海应用技术大学 Monitoring method, apparatus and system based on voice recognition
CN108182950B (en) * 2017-12-28 2021-05-28 重庆大学 Improved method for decomposing and extracting abnormal sound characteristics of public places through empirical wavelet transform
CN109258509B (en) * 2018-11-16 2023-05-02 太原理工大学 Intelligent monitoring system and method for abnormal sound of live pigs
CN110910897B (en) * 2019-12-05 2023-06-09 四川超影科技有限公司 Feature extraction method for motor abnormal sound recognition
CN111461090B (en) * 2020-06-17 2020-09-22 杭州云智声智能科技有限公司 Sound vibration signal processing method and system based on environment sample basic cloud model
CN112906578B (en) * 2021-02-23 2023-09-05 北京建筑大学 Method for denoising bridge time sequence displacement signal
CN114710419B (en) * 2022-02-21 2023-07-28 上海交通大学 Equipment working state single-point monitoring method and device based on switching power supply sound and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102522082A (en) * 2011-12-27 2012-06-27 重庆大学 Recognizing and locating method for abnormal sound in public places
CN103730109A (en) * 2014-01-14 2014-04-16 重庆大学 Method for extracting characteristics of abnormal noise in public places
CN105125204A (en) * 2015-07-31 2015-12-09 华中科技大学 Electrocardiosignal denoising method based on ESMD (extreme-point symmetric mode decomposition) method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9779725B2 (en) * 2014-12-11 2017-10-03 Mediatek Inc. Voice wakeup detecting device and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
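Wavelet threshold denoising method for fault data based on CEEMD and permutation entropy (基于CEEMD和排列熵的故障数据小波阈值降噪方法); Zhou Taotao (周涛涛) et al.; Journal of Vibration and Shock (《振动与冲击》); 2015-12-15; Vol. 34, No. 23; p. 208, Sections 1.3-1.4; pp. 209-211, Sections 2 and 4 *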

Also Published As

Publication number Publication date
CN106228979A (en) 2016-12-14

Similar Documents

Publication Publication Date Title
CN106228979B (en) Method for extracting and identifying abnormal sound features in public places
CN109768985B (en) Intrusion detection method based on flow visualization and machine learning algorithm
Zhai et al. Insulator fault detection based on spatial morphological features of aerial images
CN109800700B (en) Underwater acoustic signal target classification and identification method based on deep learning
CN108009628B (en) Anomaly detection method based on generation countermeasure network
CN110767216B (en) Voice recognition attack defense method based on PSO algorithm
CN102663426B (en) Face identification method based on wavelet multi-scale analysis and local binary pattern
Ding et al. Autospeech: Neural architecture search for speaker recognition
CN107527617A (en) Monitoring method, apparatus and system based on voice recognition
WO2018006797A1 (en) System and method for detecting keyboard pressing content by using acoustic signal
CN102779510B (en) Speech emotion recognition method based on feature space self-adaptive projection
CN104795064B (en) The recognition methods of sound event under low signal-to-noise ratio sound field scape
CN106328120B (en) Method for extracting abnormal sound features of public places
CN105718889A (en) Human face identity recognition method based on GB(2D)2PCANet depth convolution model
US20160335553A1 (en) Accoustic Context Recognition using Local Binary Pattern Method and Apparatus
CN110197665A (en) A kind of speech Separation and tracking for police criminal detection monitoring
CN103778913A (en) Pathological voice recognition method
CN109410919A (en) A kind of intelligent home control system
CN106792706A (en) A kind of illegal invasion detection method based on wireless network signal spectrum non-stationary property
CN107454084B (en) Nearest neighbor intrusion detection algorithm based on hybrid zone
CN113159181B (en) Industrial control system anomaly detection method and system based on improved deep forest
CN111191548B (en) Discharge signal identification method and identification system based on S transformation
CN111667836B (en) Text irrelevant multi-label speaker recognition method based on deep learning
Park et al. A noise robust audio fingerprint extraction technique for mobile devices using gradient histograms
CN112489330A (en) Warehouse anti-theft alarm method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant