CN114464174A - Brake noise classification and identification method based on deep learning - Google Patents
- Publication number
- CN114464174A CN114464174A CN202210028155.0A CN202210028155A CN114464174A CN 114464174 A CN114464174 A CN 114464174A CN 202210028155 A CN202210028155 A CN 202210028155A CN 114464174 A CN114464174 A CN 114464174A
- Authority
- CN
- China
- Prior art keywords
- deep learning
- layer
- brake noise
- classification
- brake
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Abstract
The invention relates to a brake noise classification and identification method based on deep learning, which comprises the following steps: 1) acquiring original sound signals through a sound acquisition device and classifying and labeling them; 2) extracting a corresponding time-frequency feature map from each labeled original sound signal; 3) constructing and training a deep learning classification model; and 4) inputting the time-frequency feature map of the sound signal to be recognized into the trained deep learning classification model to obtain a recognition result. Compared with the prior art, the method uses deep learning to classify and identify brake noise when processing brake-noise test data, can replace the human ear in identifying the type of brake noise, saves substantial labor and time cost, shortens the development cycle and capital investment, and achieves high identification accuracy and stability.
Description
Technical Field
The invention relates to the technical field of brake noise identification, and in particular to a brake noise classification and identification method based on deep learning.
Background
The squeal and judder noises generated during automobile braking are among the important contributors to urban traffic noise pollution: they not only degrade riding comfort but also adversely affect the surrounding environment. Brake-noise testing is an important part of automobile testing, but current test software can only intercept an abnormal brake-noise signal and cannot accurately identify which kind of brake noise it is. Classifying brake noise usually requires professional evaluators to identify it by ear and then judge the intensity and occurrence rate of the noise from factors such as its decibel level and frequency, so as to guide production.
Therefore, there is a need for a method that uses intelligent techniques, rather than human listeners, to identify the type of brake noise.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a brake noise classification and identification method based on deep learning.
The purpose of the invention can be realized by the following technical scheme:
a brake noise classification and identification method based on deep learning comprises the following steps:
1) acquiring original sound signals through a sound acquisition device and carrying out classification and labeling;
2) extracting a corresponding time-frequency feature map from the labeled original sound signal;
3) constructing a deep learning classification model and training;
4) inputting the time-frequency feature map of the sound signal to be recognized into the trained deep learning classification model to obtain a recognition result.
Further, in step 1), the original sound signal is brake noise data collected upon triggering by the brake pedal signal at each braking, specifically an acceleration signal of the brake or a microphone signal.
Furthermore, the length of the original sound signal is kept within 10 seconds; a signal longer than 10 seconds is cut into multiple segments.
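The segmentation rule above can be sketched as follows; the 10-second limit comes from the description, while the helper name and the sample rate in the example are illustrative:

```python
import numpy as np

def split_signal(signal, sample_rate, max_seconds=10):
    """Cut a recording into segments of at most `max_seconds`,
    as described for sound signals longer than 10 seconds."""
    max_len = int(max_seconds * sample_rate)
    return [signal[i:i + max_len] for i in range(0, len(signal), max_len)]

# e.g. a 25-second recording at 1 kHz yields segments of 10 s, 10 s and 5 s
segments = split_signal(np.zeros(25_000), sample_rate=1_000)
```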
Further, the step 2) is specifically as follows:
the method comprises the steps of pre-emphasizing an original sound signal, compensating loss of high-frequency components through high-pass filtering, obtaining the frequency spectrum characteristics of each time frame of the sound signal through framing, windowing and fast Fourier transform in sequence, obtaining the Mel frequency spectrum of each time frame based on human auditory sense through a Mel filter bank on the frequency spectrum characteristics of each time frame, combining the Mel frequency spectrums of multi-frame sound fragments to obtain a Mel frequency spectrogram, and storing the Mel frequency spectrogram as a picture with the same resolution ratio, namely a time-frequency characteristic graph.
Furthermore, the number of points for framing, windowing and fast Fourier transform is set to 1024, the number of overlapping points is set to 512, and the window function used for windowing is a Hamming window.
Furthermore, the spectral features are obtained by taking the logarithm of the squared Fourier-transform spectrum magnitude plus a small offset, the small offset being 10⁻¹².
Further, the number of the mel filters is 32 or 64.
Further, the resolution is set to 224 × 224 × 3.
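A minimal NumPy sketch of the feature-extraction steps above: the FFT size (1024), overlap (512), Hamming window, 10⁻¹² offset and 32-filter Mel bank follow the stated values, while the pre-emphasis coefficient 0.97 and the triangular Mel filterbank construction are common-practice assumptions not specified here:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the Mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        for k in range(l, c):
            fbank[m - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fbank[m - 1, k] = (r - k) / max(r - c, 1)
    return fbank

def mel_spectrogram(signal, sr, n_fft=1024, hop=512, n_mels=32, eps=1e-12):
    # 1) Pre-emphasis: first-order high-pass compensating high-frequency loss.
    sig = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2) Framing with 50% overlap, Hamming window, FFT per frame.
    n_frames = 1 + max(0, len(sig) - n_fft) // hop
    win = np.hamming(n_fft)
    frames = np.stack([sig[i * hop:i * hop + n_fft] * win
                       for i in range(n_frames)])
    # 3) Log power spectrum with the small 1e-12 offset, then Mel filtering.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    logmel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + eps)
    return logmel.T  # shape (n_mels, n_frames)
```

Each 1024-point frame yields 513 spectral bins, which the 32-filter Mel bank compresses to 32 dimensions per frame, matching the dimensionality reduction described in the embodiment.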
Further, in step 3), the deep learning classification model is specifically a deep neural network whose network structure comprises five convolutional layers, a fully connected layer, a softmax layer and a classification output layer. A normalization layer, an activation-function layer and a pooling layer are added after each convolutional layer, and a dropout layer is added between the pooling layer of the last convolutional layer and the fully connected layer. The normalization layers use batch normalization, the activation function is ReLU, and the pooling is max pooling.
Furthermore, the collected brake noise data, including background noise under various working conditions, are labeled with their noise type after manual identification. Each class of brake noise should have no fewer than 1000 time-frequency feature maps, balanced across classes as far as possible, for subsequent training of the deep learning classification model.
Compared with the prior art, the invention has the following advantages:
according to the intelligent brake noise classification and identification method, the brake noise is identified by adopting a deep learning method, the time-frequency characteristic diagram of the noise is used as the input of the deep learning training model, the problem that the brake noise characteristic dimensions are different in different time lengths can be effectively solved, the brake noise can be automatically classified and identified, the identification speed is high, manual participation is not needed, the labor and the time cost are saved, and meanwhile, higher identification accuracy and scientificity are achieved.
Drawings
Fig. 1 is a flow chart of an embodiment of an intelligent brake noise classification and identification system according to the present invention.
FIG. 2 is a diagram illustrating the acquisition of an audio profile of noise according to the present invention.
Fig. 3 is a diagram of the deep learning network structure of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings; the described embodiments are a part, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
The invention is described in detail below with reference to the figures and specific embodiments.
Examples
As shown in fig. 1, the present invention provides an intelligent brake noise classification and recognition system. After an original sound signal is acquired by a sound acquisition device, a corresponding time-frequency feature map is extracted from it, and the extracted feature map is analyzed with a trained deep learning classification model to obtain the category of brake noise. For each acquired acceleration or microphone signal of the brake, the time-frequency features of each frame of the sound segment are obtained by framing; the features of the multiple frames are combined and saved as pictures of identical resolution, i.e. the time-frequency feature maps. The original sound signals are labeled and their time-frequency feature maps extracted, and the labeled feature maps are used to train the deep learning model, yielding the trained deep learning classification model.
The sound collection device of this embodiment uses the brake pedal signal to trigger and end collection, stores the signal in an audio format such as wav or mp3, and automatically cuts signals longer than 10 seconds into multiple audio segments.
The time-frequency feature map is extracted as the input of the deep learning model; the specific process is shown in fig. 2. The original sound signal is first pre-emphasized, compensating the loss of high-frequency components through high-pass filtering. The spectrum of each time frame of the sound signal is then obtained through framing, windowing and fast Fourier transform, and passed through a Mel filter bank to obtain the Mel spectrum of each time frame, modeled on human hearing. For example, the number of points for framing, windowing and fast Fourier transform can be set to 1024 with a Hamming window, giving a 513-dimensional spectrum for each frame; the spectral features are obtained by squaring the spectrum magnitude of each frame, adding a small offset of 10⁻¹² and taking the logarithm. After passing through a Mel filter bank of 32 filters, the Mel spectral features are 32-dimensional, which greatly reduces the dimensionality of the time-frequency features while retaining Mel features modeled on human hearing. The Mel spectra of the multi-frame sound segments are combined into a Mel spectrogram, which is saved as a picture with a resolution of 224 × 224 × 3.
The collected brake noise data (including background noise) under various working conditions are manually identified and then labeled with their noise type. Time-frequency feature maps are extracted from the labeled noise data in the feature-extraction manner described above and put in one-to-one correspondence with their classification labels. Each class of brake noise should have no fewer than 1000 time-frequency feature maps, balanced across classes as far as possible, for subsequent training of the deep learning classification model.
The network structure of the deep learning classification model adopts a deep neural network comprising five convolutional layers and a fully connected layer, as shown in fig. 3, with a 224 × 224 × 3 picture as input. A normalization layer, an activation-function layer and a pooling layer are added after each convolutional layer, and a dropout layer, a fully connected layer, a softmax layer and a classification output layer follow the last pooling layer. The normalization layers use batch normalization, the activation function is ReLU, and the pooling is max pooling.
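A minimal PyTorch sketch of such a network, assuming 3 × 3 kernels, the channel widths shown and a dropout rate of 0.5, none of which are specified in the description:

```python
import torch
import torch.nn as nn

class BrakeNoiseCNN(nn.Module):
    """Five conv blocks (conv -> batch norm -> ReLU -> max pool),
    then dropout and a fully connected classification layer."""

    def __init__(self, n_classes):
        super().__init__()
        blocks, in_ch = [], 3
        for out_ch in (16, 32, 64, 128, 256):  # channel widths are assumptions
            blocks += [
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(),
                nn.MaxPool2d(2),
            ]
            in_ch = out_ch
        self.features = nn.Sequential(*blocks)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(256 * 7 * 7, n_classes)  # 224 -> 7 after five 2x pools

    def forward(self, x):
        x = self.features(x)                       # (N, 256, 7, 7) for 224x224 input
        x = self.dropout(torch.flatten(x, 1))
        return self.fc(x)                          # softmax is applied in the loss
```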
The deep learning classification model is built with an open-source deep network framework. The labeled time-frequency feature maps are fed into the model in batches for iterative computation; training ends when the loss function reaches a minimum or the validation accuracy no longer increases, and the model is then saved. The batch size is determined by the hardware and the number of iteration epochs by the specific training conditions; in this embodiment the batch size is 32 and the number of epochs is 8.
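The training procedure can be sketched as follows; the batch size of 32 and 8 epochs follow the embodiment, while the Adam optimizer, learning rate and early-stopping patience are illustrative assumptions:

```python
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, epochs=8, patience=2):
    """Train in batches, stopping early when validation accuracy
    stops improving, as described in the embodiment."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    best_acc, stale = 0.0, 0
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:          # loader yields batches (e.g. size 32)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        # Evaluate on the validation set; stop when accuracy stops improving.
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for x, y in val_loader:
                correct += (model(x).argmax(1) == y).sum().item()
                total += y.numel()
        acc = correct / total
        if acc > best_acc:
            best_acc, stale = acc, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_acc
```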
The trained deep learning classification model is used to classify the time-frequency feature map of the brake noise to be detected, yielding the brake-noise category and the probability of each class; the class with the maximum probability is the identified type. A threshold can be set: when the maximum probability is below the threshold, manual rechecking is required; when it is above the threshold, the identification is considered accurate and no manual recheck is needed.
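The thresholded decision described above can be sketched as follows; the threshold value of 0.6 and the class labels are purely illustrative:

```python
import numpy as np

def classify_with_threshold(probs, labels, threshold=0.6):
    """Return the most probable class; flag predictions whose maximum
    probability falls below `threshold` for manual rechecking."""
    idx = int(np.argmax(probs))
    return labels[idx], float(probs[idx]), bool(probs[idx] >= threshold)

label, p, confident = classify_with_threshold(
    np.array([0.1, 0.7, 0.2]), ["squeal", "moan", "judder"])
```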
The method is based on mature deep learning image recognition technology and uses the time-frequency feature map of the noise as the input of the deep learning training model, which ensures identification accuracy and iteration speed and effectively solves the problem that brake-noise features have different dimensions for different durations.
The foregoing examples merely illustrate the principles and effects of the present invention and are not intended to limit its scope. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical ideas of the present invention shall be covered by the claims of the present invention.
Claims (10)
1. A brake noise classification and identification method based on deep learning is characterized by comprising the following steps:
1) acquiring original sound signals through a sound acquisition device and carrying out classification and labeling;
2) extracting a corresponding time-frequency feature map from the labeled original sound signal;
3) constructing a deep learning classification model and training;
4) inputting the time-frequency feature map of the sound signal to be recognized into the trained deep learning classification model to obtain a recognition result.
2. The brake noise classification and identification method based on deep learning according to claim 1, wherein in step 1) the original sound signal is brake noise data collected upon triggering by the brake pedal signal at each braking, specifically an acceleration signal of the brake or a microphone signal.
3. The method according to claim 1, wherein the length of the original sound signal is within 10 seconds, and the brake noise signal is cut into multiple segments when its length exceeds 10 seconds.
4. The brake noise classification and identification method based on deep learning according to claim 1, wherein the step 2) is specifically as follows:
the method comprises the steps of pre-emphasizing an original sound signal, compensating loss of high-frequency components through high-pass filtering, obtaining the frequency spectrum characteristics of each time frame of the sound signal through framing, windowing and fast Fourier transform in sequence, obtaining the Mel frequency spectrum of each time frame based on human auditory sense through a Mel filter bank on the frequency spectrum characteristics of each time frame, combining the Mel frequency spectrums of multi-frame sound fragments to obtain a Mel frequency spectrogram, and storing the Mel frequency spectrogram as a picture with the same resolution ratio, namely a time-frequency characteristic graph.
5. The method according to claim 4, wherein the number of points for framing, windowing and fast Fourier transform is set to 1024, the number of overlapping points is set to 512, and the window function used for windowing is a Hamming window.
6. The method for classifying and identifying brake noise based on deep learning according to claim 4, wherein the spectral features are obtained by taking the logarithm of the squared Fourier-transform spectrum magnitude plus a small offset, the small offset being 10⁻¹².
7. The method as claimed in claim 4, wherein the number of the Mel filters is 32 or 64.
8. The method according to claim 4, wherein the resolution is set to 224 × 224 × 3.
9. The method according to claim 1, wherein in step 3) the deep learning classification model is a deep neural network whose network structure comprises five convolutional layers, a fully connected layer, a softmax layer and a classification output layer; a normalization layer, an activation-function layer and a pooling layer are added after each convolutional layer, and a dropout layer is added between the pooling layer of the last convolutional layer and the fully connected layer; the normalization layers use batch normalization, the activation function is ReLU, and the pooling is max pooling.
10. The brake noise classification and identification method based on deep learning according to claim 2, wherein the collected brake noise data, including background noise under various working conditions, are labeled with their noise type after manual identification, and each class of brake noise has no fewer than 1000 time-frequency feature maps, balanced across classes as far as possible, for subsequent training of the deep learning classification model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210028155.0A CN114464174A (en) | 2022-01-11 | 2022-01-11 | Brake noise classification and identification method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210028155.0A CN114464174A (en) | 2022-01-11 | 2022-01-11 | Brake noise classification and identification method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114464174A (en) | 2022-05-10 |
Family
ID=81409942
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210028155.0A Pending CN114464174A (en) | 2022-01-11 | 2022-01-11 | Brake noise classification and identification method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114464174A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115471709A | 2022-09-28 | 2022-12-13 | 刘鹏 | Directional signal intelligent analysis platform |
CN117033983A | 2023-10-10 | 2023-11-10 | 山东科技大学 | Unmanned ship self-noise detection and identification method and system |
CN117033983B | 2023-10-10 | 2024-01-30 | 山东科技大学 | Unmanned ship self-noise detection and identification method and system |
- 2022-01-11 CN CN202210028155.0A patent/CN114464174A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||