CN112735448A - Sound detection method and system based on target detection - Google Patents

Sound detection method and system based on target detection

Info

Publication number
CN112735448A
Authority
CN
China
Prior art keywords
sound
target
detection
model
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011480987.3A
Other languages
Chinese (zh)
Inventor
鲍亭文
朱小芹
王旻轩
刘展
金超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Cyberinsight Technology Co ltd
Original Assignee
Beijing Cyberinsight Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Cyberinsight Technology Co ltd
Priority to CN202011480987.3A
Publication of CN112735448A
Legal status: Pending (current)

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Emergency Alarm Devices (AREA)

Abstract

According to the sound detection method and system based on target detection, a target detection algorithm is applied to the spectrogram of the sound signal to recognize the specific form of the target sound on the spectrogram. No noise reduction of the sound is required, the method resists various kinds of environmental noise without producing misjudgments, and the generalization of the model is improved. The model does not need to be retrained for target sounds that appear in different frequency bands or with different forms: the trained model generalizes to target sounds of the same type that appear in different frequency bands, at different sound pressure levels and with slightly different spectral forms, and can be applied to any target sound that exhibits a characteristic spectral shape.

Description

Sound detection method and system based on target detection
Technical Field
The present application relates to a sound detection method and system based on target detection, and belongs to the technical field of sound signal detection.
Background
Wind turbine blades are among the key components by which a wind turbine generator set converts wind energy into mechanical energy; they are the basis for achieving a high wind energy utilization coefficient and good economic benefit, and their condition directly affects the performance and generating efficiency of the whole machine. Frequent blade operation and maintenance work and blade accidents also seriously affect the overall benefit of a wind farm. While the blades sweep through the wind, recognizing certain target sounds can assist in determining the fault type and fault location. In existing scenarios, lightning recognition methods based on sound signals mostly rely on detecting a sudden high-energy signal, or combine the sound with other types of signals such as images and currents to judge and monitor the occurrence of lightning. Methods based on sudden high-energy signals cannot, using a threshold alone, distinguish lightning from other sounds with similar characteristics that may occur in the environment, such as impact sounds and blasting sounds. Methods based on multiple signal types recognize lightning well, but installing several monitoring devices makes monitoring expensive and maintenance more complicated, and such a method only suits the single scenario of lightning recognition.
For detecting faults that produce a whistle, such as a blocked drain hole or leading-edge erosion, the existing approach is to extract the whistle's shape or features and recognize it by clustering or by polynomial-fitting correlation methods. This type of approach can recognize whistles of the same shape, but when a fault develops, or a whistle with a new, not entirely identical shape appears in a different frequency range, it cannot maintain good recognition accuracy. The frequency band of the whistle is related to the fault location and its shape to the fault type, both of which change over time. In addition, such methods place high demands on data quality and feature extraction and generalize poorly to unknown noise in unknown environments.
Chinese patent application 201710419138.9 sets an energy threshold for a specific frequency range of the collected sound samples and determines that thunder has occurred when the threshold is exceeded; this simple rule based on frequency-domain energy is a single criterion and cannot avoid misjudging other short, high-energy noises in the environment. Chinese patent application 201910331781.5 collects lightning information comprehensively through image, temperature, humidity and electromagnetic-field measuring devices; the device first triggers data collection through an optical detector and then collects the image, electric-field, magnetic-field and other information. The application does not describe whether thunder is judged after data acquisition, so it can be understood that lightning is judged only by light intensity; this criterion is also single, cannot distinguish other short, high-brightness signals in the environment, and collecting several kinds of signals makes monitoring expensive. Chinese patent application 201510115347.5 extracts the spectral curve of the blade whistle, fits a polynomial to reconstruct the whistle's shape, and recognizes the whistle through the correlation between the signal and the reconstructed model; this requires an effective extraction method for the target sound, is highly sensitive to noise and, because sound signals in real scenarios contain various environmental noises, often generalizes poorly. Moreover, the polynomial fit of the features does not generalize well to the way whistles and other fault sounds change over time across the whole life cycle of the turbine, and its stability is not high. Chinese patent application 201910603546.9 splits the wind turbine sound signal into frames, extracts features from the framed signal for two-stage clustering, and judges faults from the periodicity of the category labels. This requires selecting effective features for the fault characteristics and cannot recognize target sounds such as whistles and thunder whose frequency range and sound intensity may vary. In addition, it uses binary classification, which does not distinguish previously unseen fault characteristics well and cannot identify the fault type of an abnormal state.
The schemes in the prior art can each detect only a single target sound, the same method cannot be reused to detect other target sounds, and sound-based schemes are all easily affected by the various environmental noises present during collection.
Disclosure of Invention
The invention aims to provide a sound detection method and system based on target detection that use only the sound signal and accurately recognize the target sound by performing target detection on the spectrogram of the signal. The method uses a single monitoring signal and generalizes well; it accurately recognizes the form of the target sound in the spectrogram, and its accuracy is barely affected by changes in the frequency band of the target sound or by the various noises appearing in the environment.
The application relates to a sound detection method based on target detection, which comprises a training process and a prediction process, wherein the training process comprises the following steps:
(1.1) collecting a plurality of groups of historical data with target sounds and carrying out quality screening on the data;
(1.2) carrying out spectrum conversion on the data to obtain a spectrogram;
(1.3) converting the spectrogram into a picture and storing the picture;
(1.4) marking the position of the target sound on the spectrogram by using a target detection marking tool;
(1.5) training a target detection model on the labeled picture data;
the prediction process comprises the following steps:
(2.1) carrying out data quality screening and spectrum conversion on the collected audio signal to be detected;
(2.2) converting the frequency spectrum of the data to be detected into a picture;
(2.3) predicting the generated picture by using the trained target detection model;
and (2.4) when the model identifies that the picture contains the target sound, calculating the frequency band and the time period of the target sound and outputting them as the result.
In step (2.3), after prediction the model outputs the number, probabilities and box positions of the detected targets. After step (2.3), the method can further comprise: generating a detection factor from the detection result of the model; the detection factor is a mapping of whether the model has identified the target sound and of the probability or frequency of its occurrence.
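As one possible reading of the detection factor described above, a minimal sketch follows. The maximum-probability mapping, the zero value when nothing is detected, and the data layout are illustrative assumptions; the application only states that the factor is a mapping of whether the target sound is identified and of its probability or frequency of occurrence.

def detection_factor(per_picture_detections):
    """per_picture_detections: one list per picture, each entry a (probability, box)
    pair produced by the target detection model (assumed layout, not specified
    in the application)."""
    probabilities = [p for detections in per_picture_detections
                     for (p, _box) in detections]
    if not probabilities:
        return 0.0               # no target sound identified in any picture
    return max(probabilities)    # e.g. map to the maximum detection probability

For example, detection_factor([[], [(0.82, (0.2, 0.1, 0.5, 0.4))]]) returns 0.82, while detection_factor([[], []]) returns 0.0.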
The application also relates to a sound detection system based on target detection, which comprises a sound sensor, a machine-end hardware device and an operation module, wherein the operation module runs the above sound detection method.
The sound detection system may further comprise a station-side server, and the operation module can be arranged on the station-side server; the machine-end hardware device may include an edge hardware data acquisition system.
According to the sound detection method and system based on target detection, a target detection algorithm is applied to the spectrogram of the sound signal to recognize the specific form of the target sound on the spectrogram. No noise reduction of the sound is required, the method resists various kinds of environmental noise without producing misjudgments, and the generalization of the model is improved. The model does not need to be retrained for target sounds that appear in different frequency bands or with different forms: the trained model generalizes to target sounds of the same type that appear in different frequency bands, at different sound pressure levels and with slightly different spectral forms, and can be applied to any target sound that exhibits a characteristic spectral shape.
Drawings
Fig. 1 is a schematic flow chart of a sound detection method based on object detection according to the present application.
Fig. 2 is a schematic diagram of different forms of thunder samples identified in an embodiment of the present application.
FIG. 3 is a schematic diagram of whistle samples of different frequency bands identified in an embodiment of the present application.
FIG. 4 is a schematic illustration of different forms of whistle samples identified in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more apparent, embodiments of the present application will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
In the present application, sound signals, which may or may not contain the aerodynamic noise of the turbine blades sweeping through the wind, are collected periodically by audio equipment, and the target sounds contained in them are recognized through further analysis of the signals. The sound detection system based on target detection comprises a sound sensor for collecting the sounds of the turbine and its environment, a machine-end hardware device, and application software running on a station-side server. The machine-end hardware device may further include an edge hardware data acquisition system for acquiring operational data and/or environmental data of the system components.
The method is suitable for any target sound that has a characteristic shape in the time-frequency domain, i.e. on the spectrogram of the audio. For example, a whistle appears as an S-shape; its width, height and exact shape differ between turbines, but the S-shaped characteristic is preserved overall, so the method applies to whistles. As another example, thunder is the sound produced by the random discharge of lightning; its spectrum takes several different forms depending on the distance of the lightning, but its overall shape remains consistent, namely the frequency range of the sound narrows from wide to narrow for some time after the strong onset, so the method applies to thunder as well. A lifted protective film on the blade leading edge produces a whistle-like sound, and recognizing thunder can indicate lightning-strike damage before the blade fails severely. These target sounds each have a specific form on the spectrogram, for example the whistle is roughly S-shaped and thunder roughly triangular, so the target sounds can be recognized and located by converting the spectrum into an image and detecting targets in that image.
The sound detection method based on target detection comprises a training process and a prediction process, as shown in fig. 1. The training process comprises the following steps:
(1.1) collecting a plurality of groups of historical data with target sounds;
(1.2) performing quality screening on the data; the screening method and criteria may differ for different target sounds, and screening may be done manually or with a machine learning method, mainly to ensure that the spectral form of the target sound is not completely masked by noise;
(1.3) performing spectrum conversion on the data to obtain a spectrogram; usable transforms include, but are not limited to, the Short-Time Fourier Transform (STFT), the mel spectrum, etc.;
(1.4) converting the spectrogram into pictures and storing them; if the sound signal is long, it can be split into several smaller pictures using a sliding window (a sketch of steps (1.3)-(1.4) follows this list);
(1.5) marking the position of the target sound on the spectrogram by using a target detection marking tool;
(1.6) training a target detection model on the labeled picture data; models include, but are not limited to, YOLO, SSD, R-CNN, AttentionNet, etc. Depending on the size of the historical data set, weights pretrained on a public data set can be used as initial weights.
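The following is a minimal sketch of steps (1.3)-(1.4): converting an audio file into spectrogram pictures with a sliding window. It assumes the librosa and matplotlib libraries are used; the STFT parameters, the 10 s window with 5 s hop, and the file naming are illustrative choices, not values specified by the application.

import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt

def audio_to_spectrogram_pictures(wav_path, out_prefix, window_s=10.0, hop_s=5.0,
                                  n_fft=2048, hop_length=512):
    y, sr = librosa.load(wav_path, sr=None)          # keep the original sampling rate
    win = int(window_s * sr)
    hop = int(hop_s * sr)
    picture_paths = []
    for i, start in enumerate(range(0, max(len(y) - win, 1), hop)):
        segment = y[start:start + win]
        stft = librosa.stft(segment, n_fft=n_fft, hop_length=hop_length)
        spec_db = librosa.amplitude_to_db(np.abs(stft), ref=np.max)   # dB spectrogram
        fig, ax = plt.subplots(figsize=(6, 4))
        librosa.display.specshow(spec_db, sr=sr, hop_length=hop_length,
                                 x_axis="time", y_axis="hz", ax=ax)
        ax.set_axis_off()                            # save the picture only, no axes
        out_path = f"{out_prefix}_{i:04d}.png"
        fig.savefig(out_path, bbox_inches="tight", pad_inches=0)
        plt.close(fig)
        picture_paths.append(out_path)
    return picture_paths

The saved pictures can then be annotated with any target detection labeling tool (step (1.5)) and used to train the detector (step (1.6)).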
The prediction process comprises the following steps:
(2.1) carrying out data quality screening and spectrum conversion on the collected audio signal to be detected, using the same methods as in the training process; if the sampling rate of the data to be predicted differs from that used in training, resampling can be used to unify the sampling rates;
(2.2) converting the spectrum of the data to be detected into a picture, keeping the format, size and so on of the generated picture consistent with the pictures used during training; during prediction the picture can be stored to disk or simply cached in memory for use;
(2.3) predicting on the generated picture with the trained target detection model, which outputs the number, probabilities and box positions of the detected targets;
(2.4) generating a detection factor from the model's result; the detection factor can be a mapping of whether the model has identified the target sound and of the probability, maximum probability or frequency of its occurrence. The specific mapping is chosen according to the purpose of the target sound detection, and when the sliding window splits the audio signal into several pictures, the recognition results of all the pictures are considered together in the mapping;
and (2.5) when the model identifies the form of the target sound in the picture, calculating the frequency band and the time period of the target sound from the position of the predicted box in the spectrogram and outputting them as the result (a sketch of this conversion follows).
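A minimal sketch of step (2.5), converting a predicted bounding box on the spectrogram picture back into a frequency band and a time period, is given below. It assumes the picture spans the whole spectrogram segment, the box is expressed in normalized coordinates with the origin at the top left, and the spectrogram covers 0 Hz to half the sampling rate; these are illustrative assumptions rather than choices fixed by the application.

def box_to_band_and_period(box, segment_duration_s, sr, segment_offset_s=0.0):
    """Map a normalized (x_min, y_min, x_max, y_max) detection box to
    (frequency band in Hz, time period in seconds). Assumed conventions:
    x is time, y is frequency with row 0 at the highest frequency."""
    x_min, y_min, x_max, y_max = box
    t_start = segment_offset_s + x_min * segment_duration_s
    t_end = segment_offset_s + x_max * segment_duration_s
    f_high = (1.0 - y_min) * (sr / 2.0)
    f_low = (1.0 - y_max) * (sr / 2.0)
    return (f_low, f_high), (t_start, t_end)

# Example: a box detected in a 10 s segment that starts at 30 s of a 16 kHz recording.
band, period = box_to_band_and_period((0.2, 0.1, 0.5, 0.4), 10.0, 16000, 30.0)
# band -> (4800.0, 7200.0) Hz, period -> (32.0, 35.0) s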
Examples
The labeled thunder and whistle data are used to train a thunder recognition model and a whistle recognition model, respectively. The models are then used to monitor a brand-new wind farm; thunder samples of different forms are recognized, as shown in fig. 2, and whistle samples of different frequency bands and different forms are recognized, as shown in figs. 3 and 4. In this embodiment, training is completed with only 20 labeled pictures, using weights pretrained on a public data set as the initial weights; the model is the YOLO v3 target detector. Each of the two models has been validated on at least 5 wind farms over several months, achieving a recognition accuracy greater than 95% and a false alarm rate less than 3% without retraining.
Although the embodiments disclosed in the present application are described above, the descriptions are only for the convenience of understanding the present application, and are not intended to limit the present application. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims.

Claims (7)

1. A sound detection method based on target detection, comprising a training process and a prediction process, characterized in that the training process comprises the following steps:
(1.1) collecting a plurality of groups of historical data with target sounds and carrying out quality screening on the data;
(1.2) carrying out spectrum conversion on the data to obtain a spectrogram;
(1.3) converting the spectrogram into a picture and storing the picture;
(1.4) marking the position of the target sound on the spectrogram by using a target detection marking tool;
(1.5) training a target detection model on the labeled picture data;
the prediction process comprises the following steps:
(2.1) carrying out data quality screening and spectrum conversion on the collected audio signal to be detected;
(2.2) converting the frequency spectrum of the data to be detected into a picture;
(2.3) predicting the generated picture by using the trained target detection model;
and (2.4) when the model identifies that the picture contains the target sound, calculating the frequency band and the time period of the target sound and outputting them as the result.
2. The sound detection method according to claim 1, characterized in that: in step (2.3), after prediction, the model outputs the number, probabilities and box positions of the detected targets.
3. The sound detection method according to claim 1 or 2, characterized in that: after step (2.3), the method can further comprise the following step: generating a detection factor according to the detection result of the model.
4. The sound detection method according to claim 2 or 3, characterized in that: the detection factor is a mapping of whether the model has identified the target sound and of the probability or frequency of its occurrence.
5. A sound detection system based on target detection, comprising a sound sensor, a machine-end hardware device and an operation module, characterized in that: the operation module runs the sound detection method according to any one of claims 1 to 4.
6. The sound detection system of claim 5, wherein: the sound detection system further comprises a station-side server, and the operation module is arranged on the station-side server.
7. The sound detection system according to claim 5 or 6, characterized in that: the machine-end hardware device comprises an edge hardware data acquisition system.
CN202011480987.3A 2020-12-15 2020-12-15 Sound detection method and system based on target detection Pending CN112735448A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011480987.3A CN112735448A (en) 2020-12-15 2020-12-15 Sound detection method and system based on target detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011480987.3A CN112735448A (en) 2020-12-15 2020-12-15 Sound detection method and system based on target detection

Publications (1)

Publication Number Publication Date
CN112735448A true CN112735448A (en) 2021-04-30

Family

ID=75602388

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011480987.3A Pending CN112735448A (en) 2020-12-15 2020-12-15 Sound detection method and system based on target detection

Country Status (1)

Country Link
CN (1) CN112735448A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102494894A (en) * 2011-11-17 2012-06-13 高丙团 Audio monitoring and fault diagnosis system for wind generating set and audio monitoring and fault diagnosis method for same
CN104677623A (en) * 2015-03-16 2015-06-03 西安交通大学 On-site acoustic diagnosis method and monitoring system for wind turbine blade failure
US10504504B1 (en) * 2018-12-07 2019-12-10 Vocalid, Inc. Image-based approaches to classifying audio data
US20200184991A1 (en) * 2018-12-05 2020-06-11 Pascal Cleve Sound class identification using a neural network
CN111306010A (en) * 2020-04-17 2020-06-19 北京天泽智云科技有限公司 Method and system for detecting lightning damage of fan blade
CN111306008A (en) * 2019-12-31 2020-06-19 远景智能国际私人投资有限公司 Fan blade detection method, device, equipment and storage medium


Similar Documents

Publication Publication Date Title
JP7199608B2 (en) Methods and apparatus for inspecting wind turbine blades, and equipment and storage media therefor
CN112727710B (en) Wind field thunderbolt density statistical method and system based on audio signals
CN111161756B (en) Method for extracting and identifying abnormal whistle contour in wind sweeping sound signal of fan blade
CN112201260B (en) Transformer running state online detection method based on voiceprint recognition
CN108169639B (en) Method for identifying switch cabinet fault based on parallel long-time and short-time memory neural network
CN110259648B (en) Fan blade fault diagnosis method based on optimized K-means clustering
CN111306010B (en) Method and system for detecting lightning damage of fan blade
WO2023245990A1 (en) Classification and abnormality detection method for electrical devices of power plant on basis of multi-information fusion
CN112560673A (en) Thunder detection method and system based on image recognition
KR102314824B1 (en) Acoustic event detection method based on deep learning
CN115618205A (en) Portable voiceprint fault detection system and method
CN115932659A (en) Transformer fault detection method based on voiceprint characteristics
CN115467787A (en) Motor state detection system and method based on audio analysis
CN114694640A (en) Abnormal sound extraction and identification method and device based on audio frequency spectrogram
CN112735448A (en) Sound detection method and system based on target detection
Zhu et al. Wind turbine blade fault detection by acoustic analysis: Preliminary results
CN114242085A (en) Fault diagnosis method and device for rotating equipment
CN117093938A (en) Fan bearing fault detection method and system based on deep learning
CN116230013A (en) Transformer fault voiceprint detection method based on x-vector
CN110346032A (en) A kind of Φ-OTDR vibration signal end-point detecting method combined based on constant false alarm with zero-crossing rate
EP4254261A1 (en) Blade fault diagnosis method, apparatus and system, and storage medium
CN114093385A (en) Unmanned aerial vehicle detection method and device
CN108106717A (en) A kind of method based on voice signal identification set state
CN111833905B (en) System and method for detecting quality of marked character based on audio analysis
CN112926626B (en) Fan blade fault detection method based on sparse Bayesian learning and power spectrum separation

Legal Events

Code - Title
PB01 - Publication
SE01 - Entry into force of request for substantive examination
RJ01 - Rejection of invention patent application after publication (application publication date: 20210430)