CN113792596A - Acoustic classification method and system based on preprocessing ensemble learning


Info

Publication number: CN113792596A
Application number: CN202110913549.XA
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Prior art keywords: data, normal, sound, training, sound data
Inventors: 周松斌, 万智勇, 刘忆森
Applicant / current assignee: Institute of Intelligent Manufacturing of Guangdong Academy of Sciences
Priority / filing date: 2021-08-10
Publication date: 2021-12-14
Priority to CN202110913549.XA

Classifications

    • G06F2218/08 — Aspects of pattern recognition specially adapted for signal processing; feature extraction
    • G01M13/045 — Testing of machine parts; bearings; acoustic or vibration analysis
    • G06F18/241 — Pattern recognition; analysing; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045 — Computing arrangements based on biological models; neural networks; combinations of networks
    • G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
    • G06F2218/12 — Aspects of pattern recognition specially adapted for signal processing; classification; matching


Abstract

The invention relates to an acoustic classification method and system based on preprocessing ensemble learning. In the method, a sound acquisition device is deployed at the object to be detected to receive the sound signal it emits; the collected sound signals are processed with several preprocessing algorithms to obtain the corresponding feature map sets; the same neural network algorithm is trained and modelled on each feature map set; and the results of the resulting models are fused by weighting and voting to finally identify whether the state of the detected object is normal, which greatly improves the robustness of the model. Because the invention judges the state of the object under test from the sound it emits, the object can be inspected rapidly and nondestructively, and the subjective influence of manual auscultation is avoided. Detection efficiency is improved and detection time is shortened, achieving fast, efficient and lossless detection of the object state.

Description

Acoustic classification method and system based on preprocessing ensemble learning
Technical Field
The invention relates to the technical field of detection, in particular to an acoustic classification method and system based on preprocessing ensemble learning.
Background
In recent years, acoustic defect detection, as one form of nondestructive testing, has been widely used in industrial production. However, most production lines still rely on trained workers listening to and checking the equipment. This manual inspection is inefficient, and its results are affected by the worker's mental state, operating state and working state, making it highly subjective and insufficiently stable. An acoustic classification method based on preprocessing ensemble learning is therefore urgently needed.
Disclosure of Invention
In order to overcome the defects of the prior-art methods, the invention provides an acoustic classification method and system based on preprocessing ensemble learning, which can perform rapid, nondestructive online classification and detection of the object under test.
To address this technical problem, the invention adopts the following scheme:
in a first aspect, the present invention provides an acoustic classification method based on preprocessing ensemble learning, including the following steps:
step S1, placing a sound collection device at the object to be detected and collecting two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
step S2, dividing the two classes of data sets into a training set D and a test set T respectively, wherein the training set D comprises training normal sound samples D1 and training abnormal sound samples D2, and the test set T comprises test normal samples T1 and test abnormal samples T2;
step S3, applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
step S4, inputting the generated n feature data sets of the two classes, {D11, D12, ..., D1n} and {D21, ..., D2n}, into n homogeneous convolutional-neural-network binary classifiers {M1, M2, ..., Mn} for training and modeling;
step S5, testing the n trained classifiers with the test set, counting the number of times a sample is misjudged as qualified, and obtaining the respective weight of each model;
step S6, in the real-time classification stage, computing the various preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the outputs of the n models according to the output formula to obtain the classification result.
Further, the step S1 specifically includes:
step S11, placing a microphone at the object to be detected and, while it operates, continuously acquiring normal and abnormal sound data sets according to whether the object is normal; storing the recordings in segments of duration T0 at a sampling frequency fc, so that each collected sound record has time length T0 and data length N = T0 * fc;
step S12, computing the kurtosis of each collected sound record and taking the difference of the kurtosis values; after the difference operation, determining the signal starting point S0 using a threshold (10-30) together with a fixed signal length N0, and intercepting the data according to the starting point S0 and N0 to finally obtain the intercepted D1 and D2 data sets:
Ki = (xi - x̄)^4 / σ^4
in the formula, Ki is the kurtosis at the ith point, xi is the ith sample point, x̄ is the mean of the data, and σ is the standard deviation of the data.
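As an illustration of step S12, the following Python sketch locates the starting point from the difference of the point-wise kurtosis and cuts a fixed-length segment; the exponential form of the kurtosis follows the definitions above, while the specific threshold and segment length are assumptions rather than values fixed by the patent.

import numpy as np

def intercept_segment(x, threshold=20.0, seg_len=8192):
    """Kurtosis-based interception of one sound record (step S12).

    threshold corresponds to the (10-30) range stated above; seg_len plays
    the role of the fixed signal length N0. Both values are illustrative."""
    mu = x.mean()
    sigma = x.std() + 1e-12
    k = ((x - mu) / sigma) ** 4              # point-wise kurtosis K_i
    dk = np.abs(np.diff(k))                  # difference of the kurtosis values
    above = np.nonzero(dk > threshold)[0]
    s0 = int(above[0]) if above.size else 0  # signal starting point S0
    return x[s0:s0 + seg_len]

# Building the intercepted D1 / D2 data sets from lists of raw recordings:
# D1 = np.stack([intercept_segment(x) for x in normal_recordings])
# D2 = np.stack([intercept_segment(x) for x in abnormal_recordings])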
Further, the step S2 specifically includes:
dividing the two classes of sound data sets D1 and D2 into a training set and a test set respectively;
the training set is denoted T = [{(D11,0),(D12,0),...,(D1m,0)}, {(D21,1),(D22,1),...,(D2m,1)}] and the test set P = [{(D11,0),(D12,0),...,(D1j,0)}, {(D21,1),(D22,1),...,(D2j,1)}], where label 0 indicates normal sound data and label 1 indicates abnormal sound data; the training set contains 2m sound records and the test set contains 2j.
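A minimal sketch of the labelling and splitting in step S2 (label 0 for normal, 1 for abnormal); the shuffling and the choice of m are assumptions, since the patent does not fix a split ratio.

import numpy as np

def make_train_test(d1, d2, m, seed=0):
    """Split the two sound classes into training and test sets (step S2)."""
    rng = np.random.default_rng(seed)
    d1 = [d1[i] for i in rng.permutation(len(d1))]
    d2 = [d2[i] for i in rng.permutation(len(d2))]
    train = [(x, 0) for x in d1[:m]] + [(x, 1) for x in d2[:m]]  # 2m samples
    test = [(x, 0) for x in d1[m:]] + [(x, 1) for x in d2[m:]]   # the rest
    return train, test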
Further, the step S3 specifically includes:
step S31, windowing and framing each intercepted sound record, with each frame of length h; the window function is a window sequence w(n) defined over the h points of a frame, where n is the point index;
after windowing each sound record frame by frame, a short-time Fourier transform of Nf ∈ [512, 1024, 4096] points is performed to obtain the time-frequency spectrum feature map data set Fstft:
Fstft = {Γ1(W*D11), ..., Γi(W*D1i)}
in the formula, Fstft is the time-frequency spectrum, i is the frame index, D1i is the ith frame data sequence of the first class of sound data, and Γ is the short-time Fourier transform;
step S32, applying an exponential operation to the obtained time-frequency spectrum data Fstft to adjust the differences between the energy amplitudes of the frequency bands and increase the amplitude of the frequencies of interest, giving the corresponding feature map Fv; the adjustment formula is:
F(fi) = fi^α
in the formula, fi is the ith frequency amplitude in each frame spectrum, α ∈ (0,3) is the amplitude adjustment factor, and F(fi) is the amplitude-adjusted spectrum of each frame;
step S33, extracting multichannel Mel features from the time-frequency spectrum feature data with a Mel filter bank to obtain the Mel spectrum feature map data Fmel:
Fmel = Fstft * Hmel
in the formula, Hmel is the Mel transformation matrix;
step S34, splitting each frame spectrum in the time-frequency spectrum data at the frequencies fa and fb into the bands 0-fa, fa-fb and the remainder, splicing the retained bands front to back so that only the effective frequency bands of the time-frequency spectrum are kept, and obtaining the spliced time-frequency spectrum feature map data.
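The four preprocessing methods of steps S31-S34 can be sketched as follows; librosa is used here for the STFT and the Mel filter bank, and the values of alpha, n_mels, f_a and f_b are illustrative assumptions (the patent only constrains alpha to (0,3) and leaves fa, fb as design parameters).

import numpy as np
import librosa

def preprocess_all(x, fs, n_fft=1024, alpha=0.5, n_mels=64, f_a=2000.0, f_b=8000.0):
    """Return the four feature maps of step S3 for one intercepted segment x."""
    # S31: framing + windowing + short-time Fourier transform -> F_stft
    f_stft = np.abs(librosa.stft(x, n_fft=n_fft, window="hann"))

    # S32: exponential (power-law) amplitude adjustment with factor alpha -> F_v
    f_v = f_stft ** alpha

    # S33: multichannel Mel features through a Mel filter bank -> F_mel
    h_mel = librosa.filters.mel(sr=fs, n_fft=n_fft, n_mels=n_mels)
    f_mel = h_mel @ f_stft

    # S34: split each frame spectrum at f_a and f_b and splice the kept bands
    freqs = librosa.fft_frequencies(sr=fs, n_fft=n_fft)
    band_low = f_stft[(freqs >= 0) & (freqs < f_a), :]
    band_mid = f_stft[(freqs >= f_a) & (freqs < f_b), :]
    f_splice = np.concatenate([band_low, band_mid], axis=0)

    return f_stft, f_v, f_mel, f_splice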
Further, the step S4 specifically includes:
step S41, applying the four preprocessing methods above to the original sound data sets D1 and D2, each preprocessing method generating one or more kinds of feature map data, giving n groups in total;
step S42, feeding the n (n odd) groups of preprocessed feature training set data into n binary-classification convolutional networks of the same type, and training them to construct the n binary classification models {M1, M2, ..., Mn}.
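The n homogeneous binary classifiers {M1, ..., Mn} of step S4 can be realised, for example, as small convolutional networks sharing one architecture; the Keras layer sizes and training hyper-parameters below are assumptions, since the patent does not specify a concrete network.

import tensorflow as tf

def build_binary_cnn(input_shape):
    """One of the n homogeneous two-class CNN models M_i (step S42)."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),        # (freq, time, 1) feature map
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(label 1 = abnormal)
    ])

# One model per preprocessed feature group (n groups, n odd):
# models = []
# for x_train, y_train in feature_groups:
#     m = build_binary_cnn(x_train.shape[1:])
#     m.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
#     m.fit(x_train, y_train, epochs=20, batch_size=32, verbose=0)
#     models.append(m)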
Further, the step S5 specifically includes:
testing the n trained binary classifiers with the n groups of test set data obtained by the above preprocessing, obtaining the misjudgment rate {r1, r2, ..., rn} of the system on normal sounds and counting the number v of the n models that judge 'normal' in each prediction; the respective weight of each model is then obtained from its misjudgment rate on normal predictions, where wi, the weight of the ith model's result on normal predictions, is computed from its misjudgment rate ri using the natural constant e, so that a lower misjudgment rate yields a larger weight;
with v the number of the n models judged normal in each prediction, the proportion p of the models that judge normal in each prediction is:
p = v / n
the output probability Mout with which the final system predicts 'normal' is the weighted fusion of the per-model results, combining the probability pi that the ith model judges 'normal' with the weights wi.
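The weight and fusion formulas of step S5 appear only as images in the original publication; the sketch below therefore uses an exponential down-weighting of the misjudgment rate (consistent with the stated dependence on the natural constant e and on ri) and a normalised weighted average of the per-model probabilities — both are assumptions about the exact formulas, not the patent's literal expressions.

import numpy as np

def fuse_outputs(p_normal, r):
    """Weighted voting fusion of the n model outputs (steps S5/S6).

    p_normal: per-model probabilities of 'normal' for one sample, shape (n,)
    r:        per-model misjudgment rates on normal test samples, shape (n,)
    Returns the fused probability M_out and the vote fraction p = v / n."""
    p_normal = np.asarray(p_normal, dtype=float)
    r = np.asarray(r, dtype=float)
    w = np.exp(-r)                    # assumed weight: lower r_i -> larger w_i
    w = w / w.sum()                   # normalise the weights
    v = int((p_normal > 0.5).sum())   # number of models voting 'normal'
    p_vote = v / p_normal.size        # p = v / n
    m_out = float(w @ p_normal)       # assumed fusion: weighted average
    return m_out, p_vote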
Further, the step S6 specifically includes:
step S61, during real-time detection, intercepting the collected sound data t according to the signal starting point S0 and the length N0, and applying the 4 kinds of feature extraction (n extractions in total) to the intercepted data to obtain the n feature data {t1, t2, ..., tn};
step S62, feeding each processed feature datum into its corresponding trained binary classifier to obtain the output of each model;
step S63, using the weighting formula to obtain the system's final classification result for the device under test in the currently acquired working state.
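Putting the pieces together, one real-time classification pass of step S6 might look like the following; it reuses intercept_segment, preprocess_all and fuse_outputs sketched above, and the 0.5 decision threshold is an assumption.

import numpy as np

def classify_realtime(t_raw, fs, models, r, threshold=0.5):
    """Intercept, preprocess, score with the n classifiers and fuse (step S6)."""
    seg = intercept_segment(t_raw)          # interception by S0 and N0
    feats = preprocess_all(seg, fs)         # feature data {t1, ..., tn}
    p_abnormal = np.array([
        float(m.predict(f[np.newaxis, ..., np.newaxis], verbose=0)[0, 0])
        for m, f in zip(models, feats)
    ])
    p_normal = 1.0 - p_abnormal             # label 0 = normal
    m_out, _ = fuse_outputs(p_normal, r)
    return "normal" if m_out >= threshold else "abnormal"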
In a second aspect, the present invention provides an acoustic classification system based on pre-processing ensemble learning, including:
a data acquisition module, configured to place a sound acquisition device at the object to be detected and collect two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
a sample classification module, configured to divide the two classes of data sets into a training set D and a test set T respectively, wherein the training set D comprises training normal sound samples D1 and training abnormal sound samples D2, and the test set T comprises test normal samples T1 and test abnormal samples T2;
a feature extraction module, configured to apply 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
a training modeling module, configured to input the generated n feature data sets of the two classes, {D11, D12, ..., D1n} and {D21, ..., D2n}, into n homogeneous convolutional-neural-network binary classifiers {M1, M2, ..., Mn} for training and modeling;
a test statistics module, configured to test the n trained classifiers with the test set, count the number of times a sample is misjudged as qualified, and obtain the respective weight of each model;
a real-time classification module, configured to compute, in the real-time classification stage, the various preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, input them into the n trained binary classifiers, and weight the outputs of the n models according to the output formula to obtain the classification result.
In a third aspect, the present invention provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the acoustic classification method based on pre-processing ensemble learning when executing the program.
In a fourth aspect, the present invention provides a non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the acoustic classification method based on pre-processing ensemble learning.
The beneficial effects of this patent are as follows:
The invention realizes online detection of the working condition of the object to be detected by learning in advance the sound information of the object in its working state. The detection speed can be raised to dozens of times that of manual listening inspection, greatly improving detection efficiency. In addition, fusing the results of multiple AI models that learn different feature maps improves the stability and interference resistance of conventional online detection.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flow chart of the training process of the acoustic classification method based on pre-processing ensemble learning of the present invention;
FIG. 2 is a flow chart of a testing process of the acoustic classification method based on pre-processing ensemble learning of the present invention;
FIG. 3 is a flow chart of the detection process of the acoustic classification method based on pre-processing ensemble learning of the present invention;
FIG. 4 is a block diagram of an acoustic classification system based on pre-processing ensemble learning according to the present invention;
fig. 5 is a block diagram of an electronic device according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It should be noted that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments, and all other embodiments obtained by those skilled in the art without any inventive work based on the embodiments of the present invention belong to the protection scope of the present invention.
Example 1
As shown in fig. 1-3, the present invention provides an acoustic classification method based on preprocessing ensemble learning, comprising the following steps:
step S1, placing a sound collection device at the object to be detected and collecting two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
step S2, dividing the two classes of data sets into a training set D and a test set T respectively, wherein the training set D comprises training normal sound samples D1 and training abnormal sound samples D2, and the test set T comprises test normal samples T1 and test abnormal samples T2;
step S3, applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
step S4, inputting the generated n feature data sets of the two classes, {D11, D12, ..., D1n} and {D21, ..., D2n}, into n homogeneous convolutional-neural-network binary classifiers {M1, M2, ..., Mn} for training and modeling;
step S5, testing the n trained classifiers with the test set, counting the number of times a sample is misjudged as qualified, and obtaining the respective weight of each model;
step S6, in the real-time classification stage, computing the various preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the outputs of the n models according to the output formula to obtain the classification result.
The step S1 specifically includes:
step S11, placing a microphone at the object to be detected to acquire sound signals in the working state, and acquiring normal and abnormal sound data sets according to whether the running state of the object is normal; storing the recordings in segments of duration T0 at a sampling frequency fc, so that each collected sound record has time length T0 and data length N = T0 * fc.
Step S12, computing the kurtosis of each collected sound record and taking the difference of the kurtosis values. After the difference operation, the signal starting point S0 is determined using a threshold (10-30), the first position exceeding the threshold being taken as the starting point, together with a fixed signal length N0; the data are intercepted according to the starting point S0 and N0 to finally obtain the intercepted D1 and D2 data sets:
Ki = (xi - x̄)^4 / σ^4
in the formula, Ki is the kurtosis at the ith point, xi is the ith sample point, x̄ is the mean of the data, and σ is the standard deviation of the data.
The step S2 specifically includes:
dividing the two classes of sound data sets D1 and D2 into a training set and a test set respectively, and numbering and labeling them correspondingly;
the training set is denoted T = [{(D11,0),(D12,0),...,(D1m,0)}, {(D21,1),(D22,1),...,(D2m,1)}] and the test set P = [{(D11,0),(D12,0),...,(D1j,0)}, {(D21,1),(D22,1),...,(D2j,1)}], where label 0 represents normal sound data and label 1 represents abnormal sound data. The training set contains 2m sound records and the test set contains 2j.
The step S3 specifically includes:
step S31, windowing and framing each sound record, with each frame of length h; the window function is a window sequence w(n) defined over the h points of a frame, where n is the point index;
after windowing each sound record frame by frame, a short-time Fourier transform of Nf ∈ [512, 1024, 4096] points is performed to obtain the time-frequency spectrum feature map data set Fstft:
Fstft = {Γ1(W*D11), ..., Γi(W*D1i)}
in the formula, Fstft is the time-frequency spectrum, i is the frame index, D1i is the ith frame data sequence of the first class of sound data, and Γ is the short-time Fourier transform;
step S32, applying an exponential operation to the obtained time-frequency spectrum data Fstft to adjust the differences between the energy amplitudes of the frequency bands, obtaining the corresponding feature map data Fv; the adjustment formula is:
F(fi) = fi^α
in the formula, fi is the ith frequency amplitude in each frame spectrum, α ∈ (0,3) is the amplitude adjustment factor, and F(fi) is the amplitude-adjusted spectrum of each frame;
step S33, extracting multichannel Mel features from the time-frequency spectrum feature data with a Mel filter bank to obtain the Mel spectrum feature map data Fmel:
Fmel = Fstft * Hmel
in the formula, Hmel is the Mel transformation matrix;
step S34, splitting each frame spectrum in the time-frequency spectrum data at the frequencies fa and fb into the bands 0-fa, fa-fb and the remainder, and splicing the retained bands front to back to obtain the spliced time-frequency spectrum feature map data.
The step S4 specifically includes:
step S41, applying the above four preprocessing methods to the sound data sets D1 and D2, each preprocessing method generating one or more kinds of feature map data, giving n groups in total;
step S42, feeding all n (n odd) groups of feature training set data together with their labels into n binary-classification convolutional networks of the same type for training, and constructing the n binary classification models {M1, M2, ..., Mn}.
The step S5 specifically includes:
testing the n trained binary classifiers with the n groups of test set data obtained by the above preprocessing, obtaining the misjudgment rate {r1, r2, ..., rn} of the system on normal sounds and counting the number v of the n models that judge 'normal' in each prediction; the respective weight of each model is then obtained from its misjudgment rate on normal predictions, where wi, the weight of the ith model's result on normal predictions, is computed from its misjudgment rate ri using the natural constant e, so that a lower misjudgment rate yields a larger weight;
with v the number of the n models judged normal in each prediction, the proportion p of the models that judge normal in each prediction is:
p = v / n
the output probability Mout with which the final system predicts 'normal' is the weighted fusion of the per-model results, combining the probability pi that the ith model judges 'normal' with the weights wi.
The step S6 specifically includes:
step S61, during real-time detection, intercepting the collected sound data according to the signal starting point S0 and the length N0, and applying the 4 kinds of feature extraction (n extractions in total) to the intercepted data to obtain the n feature data {t1, t2, ..., tn};
step S62, feeding each processed feature datum into its corresponding trained binary classifier to obtain the output of each model;
step S63, using the weighting formula to obtain the system's final classification result for the device under test in the currently acquired working state.
Example 2
As shown in fig. 4, the present invention provides an acoustic classification system based on preprocessing ensemble learning, which includes a data acquisition module, a sample classification module, a feature extraction module, a training modeling module, a test statistics module, and a real-time classification module;
the data acquisition module is used for placing a sound acquisition device at the object to be detected and acquiring two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
the sample classification module is used for dividing the two classes of data sets into a training set D and a test set T respectively, the training set D comprising training normal sound samples D1 and training abnormal sound samples D2, and the test set T comprising test normal samples T1 and test abnormal samples T2;
the feature extraction module is used for applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
the training modeling module is used for inputting the generated n feature data sets of the two classes, {D11, D12, ..., D1n} and {D21, ..., D2n}, into n homogeneous convolutional-neural-network binary classifiers {M1, M2, ..., Mn} for training and modeling;
the test statistics module is used for testing the n trained classifiers with the test set, counting the number of times a sample is misjudged as qualified, and obtaining the respective weight of each model;
the real-time classification module is used for computing, in the real-time classification stage, the various preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the outputs of the n models according to the output formula to obtain the classification result.
Other features in this embodiment are the same as those in embodiment 1, and therefore are not described herein again.
Example 3
As shown in fig. 5, based on the same concept, the present invention provides an electronic device. The electronic device is a server, and the server may include a processor 810, a communication interface 820, a memory 830 and a communication bus 840, where the processor 810, the communication interface 820 and the memory 830 communicate with each other via the communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to perform the steps of the acoustic classification method based on preprocessing ensemble learning described in the embodiments above.
In addition, the logic instructions in the memory 830 may be implemented as software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
Example 4
Based on the same concept, the present invention also provides a non-transitory computer-readable storage medium storing a computer program, the computer program comprising at least one code section, which is executable by a master control device to control the master control device to implement the steps of the acoustic classification method based on pre-processing ensemble learning according to the embodiments.
The invention realizes online detection and classification of the working condition of the object to be detected by learning in advance the sound information of the object in its working state. Applying the invention can raise the detection speed to dozens of times that of manual listening inspection, greatly improving detection efficiency. Moreover, by fusing the results of multiple AI models that learn different feature maps, it offers better stability and interference resistance than conventional online detection methods.
Although the present invention has been described in detail with reference to the embodiments, it will be apparent to those skilled in the art that modifications, equivalents and improvements can still be made to the technical solutions of the foregoing embodiments or to some of their technical features, and such modifications, equivalents and improvements all fall within the spirit and principle of the present invention.

Claims (10)

1. An acoustic classification method based on preprocessing ensemble learning is characterized by comprising the following steps:
step S1, placing a sound collection device at the object to be detected and collecting two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
step S2, dividing the two classes of data sets into a training set D and a test set T respectively, wherein the training set D comprises training normal sound samples D1 and training abnormal sound samples D2, and the test set T comprises test normal samples T1 and test abnormal samples T2;
step S3, applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
step S4, inputting the generated n feature data sets of the two classes, {D11, D12, ..., D1n} and {D21, ..., D2n}, into n homogeneous convolutional-neural-network binary classifiers {M1, M2, ..., Mn} for training and modeling;
step S5, testing the n trained classifiers with the test set, counting the number of times a sample is misjudged as qualified, and obtaining the respective weight of each model;
step S6, in the real-time classification stage, computing the various preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the outputs of the n models according to the output formula to obtain the classification result.
2. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S1 specifically includes:
step S11, placing a microphone at the object to be detected and, while it operates, continuously acquiring normal and abnormal sound data sets according to whether the object is normal; storing the recordings in segments of duration T0 at a sampling frequency fc, so that each collected sound record has time length T0 and data length N = T0 * fc;
step S12, computing the kurtosis of each collected sound record and taking the difference of the kurtosis values; after the difference operation, determining the signal starting point S0 using a threshold (10-30) together with a fixed signal length N0, and intercepting the data according to the starting point S0 and N0 to finally obtain the intercepted D1 and D2 data sets:
Ki = (xi - x̄)^4 / σ^4
in the formula, Ki is the kurtosis at the ith point, xi is the ith sample point, x̄ is the mean of the data, and σ is the standard deviation of the data.
3. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S2 specifically includes:
dividing two types of sound data sets D1 and D2 into a training set and a test set respectively;
the training set is denoted T = [{(D11,0),(D12,0),...,(D1m,0)}, {(D21,1),(D22,1),...,(D2m,1)}] and the test set P = [{(D11,0),(D12,0),...,(D1j,0)}, {(D21,1),(D22,1),...,(D2j,1)}], wherein label 0 represents normal sound data and label 1 represents abnormal sound data; the training set contains 2m sound records and the test set contains 2j.
4. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S3 specifically includes:
step S31, windowing and framing each sound record, with each frame of length h; the window function is a window sequence w(n) defined over the h points of a frame, where n is the point index;
after windowing each sound record frame by frame, a short-time Fourier transform of Nf ∈ [512, 1024, 4096] points is performed to obtain the time-frequency spectrum feature map data set Fstft:
Fstft = {Γ1(W*D11), ..., Γi(W*D1i)}
in the formula, Fstft is the time-frequency spectrum, i is the frame index, D1i is the ith frame data sequence of the first class of sound data, and Γ is the short-time Fourier transform;
step S32, applying an exponential operation to the obtained time-frequency spectrum data Fstft to adjust the differences between the energy amplitudes of the frequency bands, obtaining the corresponding feature map data Fv; the adjustment formula is:
F(fi) = fi^α
in the formula, fi is the ith frequency amplitude in each frame spectrum, α ∈ (0,3) is the amplitude adjustment factor, and F(fi) is the amplitude-adjusted spectrum of each frame;
step S33, extracting multichannel Mel features from the time-frequency spectrum feature data with a Mel filter bank to obtain the Mel spectrum feature map data Fmel:
Fmel = Fstft * Hmel
in the formula, Hmel is the Mel transformation matrix;
step S34, splitting each frame spectrum in the time-frequency spectrum data at the frequencies fa and fb into frequency bands, and splicing the retained bands front to back to obtain the spliced time-frequency spectrum feature map data.
5. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S4 specifically includes:
step S41, applying the four preprocessing methods to the sound data sets D1 and D2, each preprocessing method generating one or more kinds of feature map data, giving n groups in total;
step S42, feeding all n groups of feature training set data into n binary-classification convolutional networks of the same type to construct the n binary classification models {M1, M2, ..., Mn}, where n is an odd number.
6. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S5 specifically includes:
testing the n trained binary classifiers with the test set T on the n groups of feature data obtained by preprocessing, obtaining the misjudgment rate {r1, r2, ..., rn} of the system on normal predictions and counting the number v of the n models that judge 'normal' in each prediction; the respective weight of each model is then obtained from its misjudgment rate on normal predictions, where wi, the weight of the ith model's result on normal predictions, is computed from its misjudgment rate ri using the natural constant e, so that a lower misjudgment rate yields a larger weight;
with v the number of the n models judged normal in each prediction, the proportion p of the models that judge normal in each prediction is:
p = v / n
the output probability Mout with which the final system predicts 'normal' is the weighted fusion of the per-model results, combining the probability pi that the ith model judges 'normal' with the weights wi.
7. The acoustic classification method based on pre-processing ensemble learning of claim 1, wherein the step S6 specifically includes:
step S61, during real-time detection, intercepting the collected sound data t according to the signal starting point S0 and the length N0, and applying the 4 kinds of feature extraction (n extractions in total) to the intercepted data to obtain the n feature data {t1, t2, ..., tn};
step S62, feeding each processed feature datum into its corresponding trained binary classifier to obtain the output of each model;
step S63, using the weighting formula to obtain the system's final classification result for the device under test in the currently acquired working state.
8. An acoustic classification system based on pre-processing ensemble learning, comprising:
a data acquisition module for placing a sound acquisition device at the object to be detected and acquiring two classes of sound data sets, normal sound data D1 and abnormal sound data D2;
a sample classification module for dividing the two classes of data sets into a training set D and a test set T respectively, the training set D comprising training normal sound samples D1 and training abnormal sound samples D2, and the test set T comprising test normal samples T1 and test abnormal samples T2;
a feature extraction module for applying 4 feature extraction methods to D1, D2, T1 and T2 to generate {D11, D12, ..., D1n}, {D21, ..., D2n}, {T11, ..., T1n} and {T21, ..., T2n};
a training modeling module for inputting the generated n feature data sets of the two classes, {D11, D12, ..., D1n} and {D21, ..., D2n}, into n homogeneous convolutional-neural-network binary classifiers {M1, M2, ..., Mn} for training and modeling;
a test statistics module for testing the n trained classifiers with the test set, counting the number of times a sample is misjudged as qualified, and obtaining the respective weight of each model;
a real-time classification module for computing, in the real-time classification stage, the various preprocessed feature data {t1, t2, ..., tn} of the collected sound data t, inputting them into the n trained binary classifiers, and weighting the outputs of the n models according to the output formula to obtain the classification result.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of the acoustic classification method based on pre-processing ensemble learning according to any of claims 1 to 7.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the acoustic classification method based on pre-processing ensemble learning according to any one of claims 1 to 7.
CN202110913549.XA, filed 2021-08-10 — Acoustic classification method and system based on preprocessing ensemble learning — Pending, CN113792596A (en)

Priority Applications (1)

Application number: CN202110913549.XA; priority/filing date: 2021-08-10; title: Acoustic classification method and system based on preprocessing ensemble learning

Publications (1)

Publication number: CN113792596A; publication date: 2021-12-14

Family

ID=78875812

Country Status (1)

Country: CN — CN113792596A (en)

Citations (3)

* Cited by examiner, † Cited by third party

  • CN107967917A — priority date 2016-10-19, published 2018-04-27, 福特全球技术公司: Vehicle-surroundings audio classification learned by neural network machine learning
  • CN110189769A — priority date 2019-05-23, published 2019-08-30, 复钧智能科技(苏州)有限公司: Abnormal sound detection method based on the coupling of multiple convolutional neural network models
  • CN113140229A — priority date 2021-04-21, published 2021-07-20, 上海泛德声学工程有限公司: Sound detection method based on neural network, industrial acoustic detection system and method


Legal Events

  • PB01 — Publication
  • SE01 — Entry into force of request for substantive examination
  • RJ01 — Rejection of invention patent application after publication

Application publication date: 20211214