CN113792596A - Acoustic classification method and system based on preprocessing ensemble learning - Google Patents
Acoustic classification method and system based on preprocessing ensemble learning
- Publication number
- CN113792596A (application CN202110913549.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- normal
- sound
- training
- sound data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F2218/08: Feature extraction (G06F2218/00, aspects of pattern recognition specially adapted for signal processing)
- G01M13/045: Acoustic or vibration analysis (G01M13/04, bearings; G01M13/00, testing of machine parts)
- G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches (G06F18/24, classification techniques)
- G06N3/045: Combinations of networks (G06N3/04, neural network architecture)
- G06N3/08: Learning methods (G06N3/02, neural networks)
- G06F2218/12: Classification; Matching (G06F2218/00, aspects of pattern recognition specially adapted for signal processing)
Abstract
The invention relates to an acoustic classification method and system based on preprocessing ensemble learning. The method comprises the following steps: a sound collection device is deployed at the object under inspection to receive the sound signals it emits; the collected sound signals are processed with several preprocessing algorithms to obtain the corresponding feature map sets; the same neural network algorithm is trained and modeled on each feature map set; and the results of the resulting models are fused by weighting and voting to decide whether the state of the inspected object is normal, which greatly improves the robustness of the model. The invention detects whether the state of the object is normal from the sound it emits, allowing rapid, nondestructive inspection and avoiding the subjective influence of manual listening checks. Detection efficiency is improved and detection time is shortened, achieving fast, efficient, lossless detection of the object's state.
Description
Technical Field
The invention relates to the technical field of detection, and in particular to an acoustic classification method and system based on preprocessing ensemble learning.
Background
In recent years, acoustic defect detection has been widely used in industrial production as a form of nondestructive testing. On most production lines, however, inspection is still performed by trained workers who listen to the equipment: detection efficiency is low, and the results depend on the workers' mental, physical, and working state, making them highly subjective and insufficiently stable. An acoustic classification method based on preprocessing ensemble learning is therefore urgently needed.
Disclosure of Invention
To overcome the defects of prior art methods, the invention provides an acoustic classification method and system based on preprocessing ensemble learning that can perform nondestructive, rapid, online classification and detection of the object under inspection.
To address this technical problem, the invention adopts the following scheme:
in a first aspect, the present invention provides an acoustic classification method based on preprocessing ensemble learning, including the following steps:
step S1, placing a sound collection device at the object to be detected to collect two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
step S2, dividing the two classes of data sets into a training set D and a test set T, where the training set D contains training normal sound samples D1 and training abnormal sound samples D2, and the test set T contains test normal samples T1 and test abnormal samples T2;
step S3, applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
step S4, inputting the generated n pairs of feature data sets {D11, D12, ..., D1n} and {D21, ..., D2n} into n homogeneous binary classifiers {M1, M2, ..., Mn} based on convolutional neural networks for training and modeling;
step S5, testing the n trained classifiers with the test set, counting the number of times samples are misjudged as qualified, and obtaining the weight of each model;
step S6, during real-time classification, computing the variously preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the output results of the n models according to the output formula to obtain the classification result.
Further, the step S1 specifically includes:
step S11, placing a microphone at the object to be detected and, according to whether the object is normal, continuously collecting normal and abnormal sound data sets in the working state; the recordings are stored in segments of time interval T0 at sampling frequency fc, so each collected sound segment has duration T0 and data length N = T0*fc;
step S12, computing the kurtosis of each collected sound segment and differencing the kurtosis values; after the difference operation, determining the signal starting point S0 using a threshold (10-30) together with a fixed signal length N0, and truncating the data according to S0 and N0, finally obtaining the truncated D1 and D2 data sets:
Ki = ((xi - μ)/σ)^4
where Ki is the kurtosis of the ith point, xi is the ith sample point, μ is the mean of the data, and σ is its standard deviation.
Further, the step S2 specifically includes:
dividing two types of sound data sets D1 and D2 into a training set and a test set respectively;
the training set is denoted T = [{(D1_1,0), (D1_2,0), ..., (D1_m,0)}, {(D2_1,1), (D2_2,1), ..., (D2_m,1)}] and the test set P = [{(D1_1,0), (D1_2,0), ..., (D1_j,0)}, {(D2_1,1), (D2_2,1), ..., (D2_j,1)}], where label 0 denotes normal sound data and label 1 abnormal sound data; the training set contains 2m sound samples and the test set 2j.
Further, the step S3 specifically includes:
step S31, applying windowing and framing to each truncated sound segment, with frame length h; the window function is as follows:
where w is the window sequence and n is the nth point;
after windowing each sound segment frame by frame, a short-time Fourier transform of Nf ∈ [512, 1024, 4096] points is performed to obtain the time-frequency spectrum feature map data set Fstft:
Fstft = {Γ1(W*D1_1), ..., Γi(W*D1_i)}
where Fstft is the time-frequency spectrum, i is the frame index, D1_i is the ith frame data sequence of the first class of sound data, and Γ denotes the short-time Fourier transform;
step S32, applying an exponential operation to the obtained time-frequency data Fstft to adjust the differences in energy amplitude between frequency bands, increasing the amplitude of the frequencies of interest, and obtaining the corresponding amplitude feature map Fv; the adjustment formula is as follows:
where fi denotes the ith frequency amplitude in each frame's spectrum, α ∈ (0, 3) is the amplitude adjustment factor, and F(fi) is the amplitude-adjusted spectrum of each frame;
step S33, extracting multichannel Mel features from the time-frequency spectrum feature data using a Mel filter bank to obtain the Mel spectrum feature map data Fmel:
Fmel = Fstft * Hmel
where Hmel is the Mel transformation matrix;
step S34, splitting each frame's spectrum in the time-frequency data at the frequencies fa and fb into bands (0-fa, fa-fb, and above), splicing the effective bands front to back so that only the effective frequency bands of the time-frequency spectrum are retained, and obtaining the spliced time-frequency spectrum feature map data.
Further, the step S4 specifically includes:
step S41, applying the above four preprocessing methods to the original sound data sets D1 and D2, each preprocessing method generating several feature map data sets, n groups in total;
step S42, feeding the n (n odd) groups of preprocessed feature training set data into n binary classification convolutional networks of the same type, and training and constructing n binary classification models {M1, M2, ..., Mn}.
Further, the step S5 specifically includes:
testing the n trained binary classifiers with the n groups of test set data obtained by the above preprocessing to obtain the system's misjudgment rates {r1, r2, ..., rn} for normal sound, and counting the number v of the n models that judge each prediction as normal; each model's weight is obtained from its misjudgment rate on normal predictions:
where wi is the result weight of the ith model for normal predictions, e is the natural constant, and ri is the misjudgment rate of the ith model on normal predictions;
given the number v of the n models judging a prediction as normal, the probability p that all models judge the prediction as normal is:
the output probability Mout with which the final system predicts normal is:
where pi is the probability that the ith model judges the prediction normal.
Further, the step S6 specifically includes:
step S61, during real-time detection, truncating the collected sound data t according to the signal starting point S0 and length N0, and applying the 4 kinds of feature extraction (n in total) to the truncated data to obtain the n feature data {t1, t2, ..., tn};
step S62, feeding each processed feature into the corresponding trained binary classifier to obtain each model's output;
step S63, using the weighting formula to obtain the system's final classification result for the currently collected working state of the equipment to be detected.
In a second aspect, the present invention provides an acoustic classification system based on pre-processing ensemble learning, including:
a data acquisition module for placing a sound collection device at the object to be detected to collect two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
a sample classification module for dividing the two classes of data sets into a training set D and a test set T, where the training set D contains training normal sound samples D1 and training abnormal sound samples D2, and the test set T contains test normal samples T1 and test abnormal samples T2;
a feature extraction module for applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
a training modeling module for inputting the generated n pairs of feature data sets {D11, D12, ..., D1n} and {D21, ..., D2n} into n homogeneous binary classifiers {M1, M2, ..., Mn} based on convolutional neural networks for training and modeling;
a test statistics module for testing the n trained classifiers with the test set, counting the number of times samples are misjudged as qualified, and obtaining the weight of each model;
a real-time classification module for computing, during real-time classification, the variously preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the output results of the n models according to the output formula to obtain the classification result.
In a third aspect, the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the acoustic classification method based on pre-processing ensemble learning when executing the program.
In a fourth aspect, the present invention provides a non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the acoustic classification method based on pre-processing ensemble learning.
The beneficial effects of this patent are:
By learning in advance the sound information of the object to be detected in its working state, the invention realizes online detection of the object's working condition. The detection speed can be raised to dozens of times that of manual listening inspection, greatly improving detection efficiency. Fusing the results of several AI models that learn different feature maps also improves the stability and interference resistance of traditional online detection.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description are obviously only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of the training process of the acoustic classification method based on pre-processing ensemble learning of the present invention;
FIG. 2 is a flow chart of a testing process of the acoustic classification method based on pre-processing ensemble learning of the present invention;
FIG. 3 is a flow chart of the detection process of the acoustic classification method based on pre-processing ensemble learning of the present invention;
FIG. 4 is a block diagram of an acoustic classification system based on pre-processing ensemble learning according to the present invention;
fig. 5 is a block diagram of an electronic device according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It should be noted that the described embodiments are only a part of the embodiments of the present invention, not all of them; all other embodiments obtained by those skilled in the art from these embodiments without inventive work fall within the protection scope of the present invention.
Example 1
As shown in fig. 1-3, the present invention provides an acoustic classification method based on preprocessing ensemble learning, comprising the following steps:
step S1, placing a sound collection device at the object to be detected to collect two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
step S2, dividing the two classes of data sets into a training set D and a test set T, where the training set D contains training normal sound samples D1 and training abnormal sound samples D2, and the test set T contains test normal samples T1 and test abnormal samples T2;
step S3, applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
step S4, inputting the generated n pairs of feature data sets {D11, D12, ..., D1n} and {D21, ..., D2n} into n homogeneous binary classifiers {M1, M2, ..., Mn} based on convolutional neural networks for training and modeling;
step S5, testing the n trained classifiers with the test set, counting the number of times samples are misjudged as qualified, and obtaining the weight of each model;
step S6, during real-time classification, computing the variously preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the output results of the n models according to the output formula to obtain the classification result.
The step S1 specifically includes:
step S11, placing a microphone at the object to be detected to acquire sound signals in the working state, and collecting normal and abnormal sound data sets according to whether the object's running state is normal; the recordings are stored in segments of time interval T0 at sampling frequency fc, so each collected sound segment has duration T0 and data length N = T0*fc;
step S12, computing the kurtosis of each collected sound segment and differencing the kurtosis values. After the difference operation, the signal starting point S0 is determined using a threshold (10-30), the first position exceeding the threshold being taken as the starting point, together with a fixed signal length N0; the data is truncated according to S0 and N0, finally obtaining the truncated D1 and D2 data sets:
Ki = ((xi - μ)/σ)^4
where Ki is the kurtosis of the ith point, xi is the ith sample point, μ is the mean of the data, and σ is its standard deviation.
The step S2 specifically includes:
dividing the two classes of sound data sets D1 and D2 into a training set and a test set respectively, numbering and labeling them correspondingly;
the training set is denoted T = [{(D1_1,0), (D1_2,0), ..., (D1_m,0)}, {(D2_1,1), (D2_2,1), ..., (D2_m,1)}] and the test set P = [{(D1_1,0), (D1_2,0), ..., (D1_j,0)}, {(D2_1,1), (D2_2,1), ..., (D2_j,1)}], where label 0 denotes normal sound data and label 1 abnormal sound data; the training set contains 2m sound samples and the test set 2j.
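A minimal sketch of the labeling convention of step S2, under the assumption that whatever samples are not drawn for training form the test set:

```python
import random

def split_and_label(d1, d2, m, seed=0):
    """Step S2 sketch: build the labeled training set T (2m samples)
    and test set P from the truncated normal (d1) and abnormal (d2) clips."""
    rng = random.Random(seed)
    d1, d2 = list(d1), list(d2)
    rng.shuffle(d1)
    rng.shuffle(d2)
    train = [(x, 0) for x in d1[:m]] + [(x, 1) for x in d2[:m]]  # label 0 = normal
    test = [(x, 0) for x in d1[m:]] + [(x, 1) for x in d2[m:]]   # label 1 = abnormal
    return train, test
```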
The step S3 specifically includes:
step S31, applying windowing and framing to each sound segment, with frame length h; the window function is as follows:
where w is the window sequence and n is the nth point;
after windowing each sound segment frame by frame, a short-time Fourier transform of Nf ∈ [512, 1024, 4096] points is performed to obtain the time-frequency spectrum feature map data set Fstft:
Fstft = {Γ1(W*D1_1), ..., Γi(W*D1_i)}
where Fstft is the time-frequency spectrum, i is the frame index, D1_i is the ith frame data sequence of the first class of sound data, and Γ denotes the short-time Fourier transform;
step S32, applying an exponential operation to the obtained time-frequency spectrum data Fstft to adjust the differences in energy amplitude between frequency bands, obtaining the corresponding feature map data Fv; the adjustment formula is as follows:
where fi denotes the ith frequency amplitude in each frame's spectrum, α ∈ (0, 3) is the amplitude adjustment factor, and F(fi) is the amplitude-adjusted spectrum of each frame;
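The adjustment formula itself is not reproduced in the text; a plain power law F(fi) = fi^α, consistent with the stated exponential operation and the factor α ∈ (0, 3), is assumed in this sketch:

```python
def amplitude_adjust(f_stft, alpha=0.5):
    """Step S32 sketch: exponentially adjust band energies. alpha < 1
    compresses large amplitudes, alpha > 1 boosts them (assumed form)."""
    return f_stft ** alpha   # Fv, same shape as Fstft
```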
step S33, extracting multichannel Mel features from the time-frequency spectrum feature data using a Mel filter bank to obtain the Mel spectrum feature map data Fmel:
Fmel = Fstft * Hmel
where Hmel is the Mel transformation matrix;
step S34, splitting each frame's spectrum in the time-frequency data at the frequencies fa and fb into bands (0-fa, fa-fb, and above), splicing them front to back, and obtaining the spliced time-frequency spectrum feature map data.
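A sketch of the band split-and-splice of step S34; which bands count as effective is an assumption (here the two bands below fb are kept):

```python
import numpy as np

def splice_bands(f_stft, freqs, f_a, f_b):
    """Step S34 sketch: cut each frame's spectrum at fa and fb and
    splice the retained bands front to back."""
    ia, ib = np.searchsorted(freqs, [f_a, f_b])
    low = f_stft[:ia, :]            # band 0 .. fa
    mid = f_stft[ia:ib, :]          # band fa .. fb
    return np.vstack([low, mid])    # spliced time-frequency feature map
```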
The step S4 specifically includes:
step S41, applying the above four preprocessing methods to the sound data sets D1 and D2, each preprocessing method generating several feature map data sets, n groups in total;
step S42, feeding all n (n odd) groups of feature training set data and their labels into n binary classification convolutional networks of the same type for training, constructing n binary classification models {M1, M2, ..., Mn}.
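A sketch of steps S41-S42 in PyTorch. The patent fixes only that the n binary classifiers are convolutional networks of the same type; the layer sizes, optimizer, and epoch count below are illustrative assumptions:

```python
import torch
import torch.nn as nn

class BinaryCNN(nn.Module):
    """One of the n homogeneous binary classifiers {M1, ..., Mn}."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, x):
        # sigmoid output = predicted probability of label 1 (abnormal)
        return torch.sigmoid(self.net(x))

def train_models(feature_sets, labels, epochs=20):
    """Train one CNN per preprocessed feature group (n models in total).
    feature_sets: n arrays of shape (samples, H, W); labels: 0/1 per sample."""
    models = []
    for feats in feature_sets:
        model = BinaryCNN()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.BCELoss()
        x = torch.as_tensor(feats, dtype=torch.float32).unsqueeze(1)
        y = torch.as_tensor(labels, dtype=torch.float32).unsqueeze(1)
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        models.append(model)
    return models
```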
The step S5 specifically includes:
testing the n trained binary classifiers with the n groups of test set data obtained by the above preprocessing to obtain the system's misjudgment rates {r1, r2, ..., rn} for normal sound, and counting the number v of the n models that judge each prediction as normal; each model's weight is obtained from its misjudgment rate on normal predictions:
where wi is the result weight of the ith model for normal predictions, e is the natural constant, and ri is the misjudgment rate of the ith model on normal predictions;
given the number v of the n models judging a prediction as normal, the probability p that all models judge the prediction as normal is:
the output probability Mout with which the final system predicts normal is:
where pi is the probability that the ith model judges the prediction normal.
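The weight, probability, and output formulas are not reproduced in the text. The sketch below therefore assumes one plausible reading: weights formed from e^(-ri) and normalized (consistent with the mention of the natural constant e), and a weighted sum Mout = sum(wi * pi) over the models' normal probabilities:

```python
import numpy as np

def model_weights(r):
    """Step S5 sketch (assumed form): a larger misjudgment rate ri on
    normal predictions gives a smaller weight wi; weights sum to 1."""
    w = np.exp(-np.asarray(r, dtype=float))
    return w / w.sum()

def ensemble_output(p, w):
    """Assumed output formula: Mout = sum(wi * pi) over the n models'
    probabilities pi of judging the prediction normal."""
    return float(np.dot(w, p))
```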
The step S6 specifically includes:
step S61, during real-time detection, truncating the collected sound data according to the signal starting point S0 and length N0, and applying the 4 kinds of feature extraction (n in total) to the truncated data to obtain the n feature data {t1, t2, ..., tn};
step S62, feeding each processed feature into the corresponding trained binary classifier to obtain each model's output;
step S63, using the weighting formula to obtain the system's final classification result for the currently collected working state of the equipment to be detected.
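Putting the pieces together, a sketch of the real-time pipeline of step S6, reusing the helpers from the sketches above; the 0.5 decision threshold on Mout is an assumed choice:

```python
import torch

def classify_realtime(x, s0, n0, preprocessors, models, w, thresh=0.5):
    """Step S6 sketch: truncate the incoming clip at the stored onset S0
    with length N0, run the n preprocessing pipelines, score with the n
    trained classifiers, and fuse the normal probabilities by weight."""
    seg = x[s0:s0 + n0]
    probs = []
    for pre, model in zip(preprocessors, models):
        feat = torch.as_tensor(pre(seg), dtype=torch.float32)[None, None]
        p_abnormal = float(model(feat))   # model outputs P(label 1 = abnormal)
        probs.append(1.0 - p_abnormal)    # pi: probability of 'normal'
    m_out = ensemble_output(probs, w)     # weighted fusion, sketch above
    return "normal" if m_out >= thresh else "abnormal"
```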
Example 2
As shown in fig. 4, the present invention provides an acoustic classification system based on preprocessing ensemble learning, which includes a data acquisition module, a sample classification module, a feature extraction module, a training modeling module, a test statistics module, and a real-time classification module;
the data acquisition module is used for placing a sound collection device at the object to be detected to collect two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
the sample classification module is used for dividing the two classes of data sets into a training set D and a test set T, where the training set D contains training normal sound samples D1 and training abnormal sound samples D2, and the test set T contains test normal samples T1 and test abnormal samples T2;
the feature extraction module is used for applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
the training modeling module is used for inputting the generated n pairs of feature data sets {D11, D12, ..., D1n} and {D21, ..., D2n} into n homogeneous binary classifiers {M1, M2, ..., Mn} based on convolutional neural networks for training and modeling;
the test statistics module is used for testing the n trained classifiers with the test set, counting the number of times samples are misjudged as qualified, and obtaining the weight of each model;
the real-time classification module is used for computing, during real-time classification, the variously preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the output results of the n models according to the output formula to obtain the classification result.
Other features in this embodiment are the same as those in embodiment 1, and therefore are not described herein again.
Example 3
As shown in fig. 5, based on the same concept, the present invention provides an electronic device; the electronic device is a server, and the server may include: a processor 810, a communication interface 820, a memory 830 and a communication bus 840, where the processor 810, the communication interface 820 and the memory 830 communicate with each other via the communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to perform the steps of the acoustic classification method based on preprocessing ensemble learning as described in the embodiments above.
In addition, the logic instructions in the memory 830 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied as a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Example 4
Based on the same concept, the present invention also provides a non-transitory computer-readable storage medium storing a computer program, the computer program comprising at least one code section, which is executable by a master control device to control the master control device to implement the steps of the acoustic classification method based on pre-processing ensemble learning according to the embodiments.
By learning in advance the sound information of the object to be detected in its working state, the invention realizes online detection and classification of the object's working condition. Applying the invention can raise the detection speed to dozens of times that of manual listening inspection, greatly improving detection efficiency. Fusing the results of several AI models that learn different feature maps also gives better stability and interference resistance than traditional online detection methods.
Although the present invention has been described in detail with reference to the embodiments, it will be apparent to those skilled in the art that modifications, equivalents, improvements, and the like can be made in the technical solutions of the foregoing embodiments or in some of the technical features of the foregoing embodiments, but those modifications, equivalents, improvements, and the like are all within the spirit and principle of the present invention.
Claims (10)
1. An acoustic classification method based on preprocessing ensemble learning is characterized by comprising the following steps:
step S1, placing a sound collection device at the object to be detected to collect two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
step S2, dividing the two classes of data sets into a training set D and a test set T, where the training set D contains training normal sound samples D1 and training abnormal sound samples D2, and the test set T contains test normal samples T1 and test abnormal samples T2;
step S3, applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
step S4, inputting the generated n pairs of feature data sets {D11, D12, ..., D1n} and {D21, ..., D2n} into n homogeneous binary classifiers {M1, M2, ..., Mn} based on convolutional neural networks for training and modeling;
step S5, testing the n trained classifiers with the test set, counting the number of times samples are misjudged as qualified, and obtaining the weight of each model;
step S6, during real-time classification, computing the variously preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the output results of the n models according to the output formula to obtain the classification result.
2. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S1 specifically includes:
step S11, placing a microphone at the object to be detected and, according to whether the object is normal, continuously collecting normal and abnormal sound data sets in the working state; the recordings are stored in segments of time interval T0 at sampling frequency fc, so each collected sound segment has duration T0 and data length N = T0*fc;
step S12, computing the kurtosis of each collected sound segment and differencing the kurtosis values; after the difference operation, determining the signal starting point S0 using a threshold (10-30) together with a fixed signal length N0, and truncating the data according to S0 and N0, finally obtaining the truncated D1 and D2 data sets.
3. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S2 specifically includes:
dividing the two classes of sound data sets D1 and D2 into a training set and a test set respectively;
the training set is denoted T = [{(D1_1,0), (D1_2,0), ..., (D1_m,0)}, {(D2_1,1), (D2_2,1), ..., (D2_m,1)}] and the test set P = [{(D1_1,0), (D1_2,0), ..., (D1_j,0)}, {(D2_1,1), (D2_2,1), ..., (D2_j,1)}], where label 0 denotes normal sound data and label 1 abnormal sound data; the training set contains 2m sound samples and the test set 2j.
4. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S3 specifically includes:
step S31, applying windowing and framing to each sound segment, with frame length h, the window function being as follows:
where w is the window sequence and n is the nth point;
after windowing each sound segment frame by frame, performing a short-time Fourier transform of Nf ∈ [512, 1024, 4096] points to obtain the time-frequency spectrum feature map data set Fstft:
Fstft = {Γ1(W*D1_1), ..., Γi(W*D1_i)}
where Fstft is the time-frequency spectrum, i is the frame index, D1_i is the ith frame data sequence of the first class of sound data, and Γ denotes the short-time Fourier transform;
step S32, applying an exponential operation to the obtained time-frequency spectrum data Fstft to adjust the differences in energy amplitude between frequency bands, obtaining the corresponding feature map data Fv, the adjustment formula being as follows:
where fi denotes the ith frequency amplitude in each frame's spectrum, α ∈ (0, 3) is the amplitude adjustment factor, and F(fi) is the amplitude-adjusted spectrum of each frame;
step S33, extracting multichannel Mel features from the time-frequency spectrum feature data using a Mel filter bank to obtain the Mel spectrum feature map data Fmel:
Fmel = Fstft * Hmel
where Hmel is the Mel transformation matrix.
5. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S4 specifically includes:
step S41, applying the four preprocessing methods to the sound data sets D1 and D2, each preprocessing method generating several feature map data sets, n groups in total;
step S42, feeding all n groups of feature training set data into n binary classification convolutional networks of the same type to construct n binary classification models {M1, M2, ..., Mn}, where n is an odd number.
6. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S5 specifically includes:
testing the n trained binary classifiers with the test set T on the n groups of feature data obtained by preprocessing to obtain the system's misjudgment rates {r1, r2, ..., rn} for normal predictions, and counting the number v of the n models that judge each prediction as normal; each model's weight is obtained from its misjudgment rate on normal predictions:
where wi is the result weight of the ith model for normal predictions, e is the natural constant, and ri is the misjudgment rate of the ith model on normal predictions;
given the number v of the n models judging a prediction as normal, the probability p that all models judge the prediction as normal is:
the output probability Mout with which the final system predicts normal is:
where pi is the probability that the ith model judges the prediction normal.
7. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S6 specifically includes:
step S61, during real-time detection, truncating the collected sound data t according to the signal starting point S0 and length N0, and applying the 4 kinds of feature extraction (n in total) to the truncated data to obtain the n feature data {t1, t2, ..., tn};
step S62, feeding each processed feature into the corresponding trained binary classifier to obtain each model's output;
step S63, using the weighting formula to obtain the system's final classification result for the currently collected working state of the equipment to be detected.
8. An acoustic classification system based on pre-processing ensemble learning, comprising:
a data acquisition module for placing a sound collection device at the object to be detected to collect two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
a sample classification module for dividing the two classes of data sets into a training set D and a test set T, where the training set D contains training normal sound samples D1 and training abnormal sound samples D2, and the test set T contains test normal samples T1 and test abnormal samples T2;
a feature extraction module for applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
a training modeling module for inputting the generated n pairs of feature data sets {D11, D12, ..., D1n} and {D21, ..., D2n} into n homogeneous binary classifiers {M1, M2, ..., Mn} based on convolutional neural networks for training and modeling;
a test statistics module for testing the n trained classifiers with the test set, counting the number of times samples are misjudged as qualified, and obtaining the weight of each model;
a real-time classification module for computing, during real-time classification, the variously preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the output results of the n models according to the output formula to obtain the classification result.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the acoustic classification method based on pre-processing ensemble learning according to any of claims 1 to 7.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the acoustic classification method based on pre-processing ensemble learning according to any one of claims 1 to 7.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202110913549.XA | 2021-08-10 | 2021-08-10 | Acoustic classification method and system based on preprocessing ensemble learning
Publications (1)

Publication Number | Publication Date
---|---
CN113792596A | 2021-12-14
Family

ID=78875812

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202110913549.XA | Acoustic classification method and system based on preprocessing ensemble learning | 2021-08-10 | 2021-08-10
Patent Citations (3)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN107967917A | 2016-10-19 | 2018-04-27 | * | Vehicle exterior audio classification by neural network machine learning
CN110189769A | 2019-05-23 | 2019-08-30 | * | Abnormal sound detection method based on the coupling of multiple convolutional neural network models
CN113140229A | 2021-04-21 | 2021-07-20 | * | Sound detection method based on a neural network, and industrial acoustic detection system and method
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20211214