CN109087655A

CN109087655A - Traffic road sound monitoring and abnormal sound recognition system

Info

Publication number: CN109087655A
Application number: CN201810851609.8A
Authority: CN
Inventors: 罗丽燕; 覃泓铭; 王玫; 周陬; 邓小芳; 刘争红; 韦金泉
Original assignee: Guilin University of Electronic Technology
Current assignee: Guilin University of Electronic Technology
Priority date: 2018-07-30
Filing date: 2018-07-30
Publication date: 2018-12-25

Abstract

The invention discloses a traffic road sound monitoring and abnormal sound recognition system comprising a sound collection end and a server end connected through a network. The sound collection end includes a sound pickup, a sound card, a GPS positioning module, a data processing module, and a wireless communication module. The sound pickup and the data processing module can be installed on designated road sections as required. Traffic road sound is monitored in real time, and abnormal sound data are transmitted to the server end for further identification only when an abnormal sound is detected, which greatly reduces the volume of transmitted data. Multiple feature values of the abnormal sound signal are extracted and then identified and classified with a neural network. The abnormal sound data are classified with a deep convolutional neural network (CNN), which is well suited to sound identification and classification; according to the invention, this greatly improves training efficiency and recognition accuracy, so that abnormal sounds generated on traffic roads are identified efficiently.

Description

A traffic road sound monitoring and abnormal sound recognition system

Technical field

The present invention relates to the technical field of sound monitoring and recognition, and specifically to a traffic road sound monitoring and abnormal sound recognition system.

Background technique

An abnormal sound is a sound that should not occur in a given normal environment. Abnormal sounds in public places generally include explosions, collisions, screams, and gunshots. Abnormal sounds on a traffic road can reflect the occurrence of traffic accidents and emergencies. Monitoring traffic road sound makes it possible to understand the traffic conditions of a given road; when an abnormal situation occurs, identifying the abnormal sound allows the nature of that situation to be analyzed. In addition, the monitoring results can be used to determine which road sections are prone to traffic accidents or congestion.

Road monitoring systems are an important component of intelligent transportation. Existing traffic road monitoring systems usually provide only audio and video monitoring, without event recognition. When an abnormal event occurs, it is usually identified either by manually replaying the surveillance video or by an image processing algorithm. Manual identification is tedious and time-consuming, while image-based identification is computationally intensive and algorithmically complex. It is therefore important to monitor traffic road sound and, when an abnormal sound occurs, to identify it in order to determine the type of abnormal event.

Summary of the invention

The purpose of the present invention is to overcome the above problems of traffic road monitoring systems by proposing a traffic road sound monitoring and abnormal sound recognition system. Sound is collected by a sound pickup and a sound card, and location information is obtained by a positioning module. The data processing module sends abnormal sound data and location data to the server end over a wireless network. The server end extracts multiple features from the abnormal sound data, such as MFCC, short-time energy, and short-time zero-crossing rate, inputs the extracted feature data into a deep convolutional neural network, and compares them against an abnormal sound feature database to obtain the type of the abnormal sound. The abnormal sound type is combined with the location information so that the location where the abnormal sound occurred and its type are displayed on a map.

To achieve the above object, the traffic road sound monitoring and abnormal sound recognition system of the present invention includes a sound collection end and a server end connected through a network. The sound collection end includes a sound pickup, a sound card, a GPS positioning module, a data processing module, and a wireless communication module.

The sound pickup is used to collect sound signals;

The sound card is used to perform analog-to-digital conversion of the sound signal;

The GPS positioning module is used to obtain the location information sent by satellites;

The data processing module is used to perform preliminary detection on the collected sound signal to determine whether it is an abnormal sound, and to send the detected abnormal sound data together with the location data to the server end;

The wireless communication module is used to provide the data processing module with a communication network for transmitting data;

The server end is used to extract multiple features, such as MFCC, short-time energy, and short-time zero-crossing rate, from the abnormal sound data; to input the extracted feature data into a deep convolutional neural network and compare them against the abnormal sound feature database; and to identify and output the type of the abnormal sound by combining the matching rates of the three features. The abnormal sound type is combined with the location information so that the location where the abnormal sound occurred and its type are displayed on a map.

Further, the data processing module performs preliminary detection on the collected sound signal to determine whether it is an abnormal sound, with the following steps:

1) Framing: process the sound signal in units of frames;

2) Apply a fast Fourier transform (FFT) to the sound signal to obtain the spectrum of each frame;

3) Calculate the signal power of each frame;

4) Calculate the decibel value of the sound from the power of each frame;

5) Make a decision on the decibel value of each frame: a frame above 40 dB is judged to be an abnormal sound (according to the noise standard, sound above 40 dB is judged to be noise).

Further, the server end performs MFCC feature extraction on the abnormal sound data, comprising the following steps:

1) Pre-emphasis: boost the high-frequency part of the sound signal so that the signal spectrum becomes flatter;

2) Framing: process the sound signal in units of frames;

3) Windowing: apply a window to each frame to increase the continuity at the left and right ends of the frame;

4) Fast Fourier transform: obtain the spectrum of each frame;

5) Mel filtering: pass the spectrum obtained from the fast Fourier transform through a Mel filter bank to convert it into a Mel spectrum that reflects human auditory characteristics;

6) Logarithm: calculate the logarithmic energy of each filter bank output;

7) Discrete cosine transform (DCT): transform the logarithmic energies to obtain the Mel cepstral coefficients;

8) Dynamic difference parameter extraction: describe the dynamic features with the difference spectrum of the static features, which effectively improves the recognition performance of the system;

9) Take the 2nd to 13th coefficients after the discrete cosine transform, together with the dynamic difference parameters, as the MFCC feature.

Further, the server end performs short-time energy feature extraction on the abnormal sound data, comprising the following steps:

1) Framing: process the sound signal in units of frames;

2) Windowing: apply a window to each frame to increase the continuity at the left and right ends of the frame;

3) Absolute value: calculate the amplitude of every sample in each frame;

4) Calculate the short-time energy of every sample in each frame and sum over all samples;

5) Take the short-time energy value of each frame as one short-time energy feature.
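A minimal sketch of the short-time energy steps above, assuming a Hamming window and treating the energy of a sample as its squared amplitude (the usual definition; the source does not give the formula). The frame length and hop are illustrative values.

```python
import math

def short_time_energy(samples, frame_len=256, hop=128):
    """Return one energy value per frame: sum of squared windowed amplitudes."""
    # Step 2: Hamming window (assumed choice of window).
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    energies = []
    # Step 1: framing; an incomplete trailing frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Steps 3-4: amplitude of each windowed sample, squared and summed.
        energies.append(sum(abs(s * w) ** 2 for s, w in zip(frame, window)))
    # Step 5: each frame contributes one feature value.
    return energies
```

Because the energy is quadratic in amplitude, a frame that is ten times louder yields a value one hundred times larger.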

Further, the server end performs short-time zero-crossing rate feature extraction on the abnormal sound data, comprising the following steps:

1) Framing: process the sound signal in units of frames;

3) Determine whether two adjacent samples in each frame have different algebraic signs; if so, a zero crossing has occurred;

4) Count the number of zero crossings in each frame;

5) Take the number of zero crossings in each frame as one zero-crossing rate feature.
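The zero-crossing steps above amount to counting sign changes between neighbouring samples in each frame. A small sketch, in which the frame length and the convention that a sample equal to zero counts as non-negative are assumptions:

```python
def zero_crossings(samples, frame_len=256, hop=256):
    """Return the number of sign changes in each frame."""
    counts = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # A crossing occurs when adjacent samples have different signs;
        # zero is treated as non-negative (assumed convention).
        counts.append(sum(1 for a, b in zip(frame, frame[1:])
                          if (a >= 0) != (b >= 0)))
    return counts
```

For example, `zero_crossings([1, -1, 1, -1, 1, 1, 1, 1], frame_len=4, hop=4)` returns `[3, 0]`: the alternating frame crosses zero three times, the constant frame never.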

Further, the server end identifies the abnormal sound signal using a deep convolutional neural network, comprising the following steps:

1) Establish an abnormal sound feature database;

2) Input the extracted MFCC, short-time energy, and zero-crossing rate feature data of the abnormal sound into the deep convolutional neural network;

3) Compare the extracted MFCC feature data with the MFCC feature data in the database, and have the classifier identify the type with the highest matching rate;

4) Compare the extracted short-time energy feature data with the short-time energy feature data in the database, and have the classifier identify the type with the highest matching rate;

5) Compare the extracted short-time zero-crossing rate feature data with the short-time zero-crossing rate feature data in the database, and have the classifier identify the type with the highest matching rate;

6) Combine the matching rates and identified types of the three features above to obtain the best recognition result.
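The source does not say how the three per-feature results are combined in step 6. One simple possibility, shown here purely as an assumed illustration, is to average the matching rate of each sound type over the three features and pick the type with the highest average. The function name `fuse_results` and the dictionary layout are hypothetical.

```python
def fuse_results(per_feature_scores):
    """Average matching rates per sound type across features; return the best.

    per_feature_scores: {feature_name: {sound_type: matching_rate}}
    (hypothetical layout; the classifier outputs are assumed to be rates in 0..1).
    """
    totals = {}
    for scores in per_feature_scores.values():
        for sound_type, rate in scores.items():
            totals[sound_type] = totals.get(sound_type, 0.0) + rate
    n = len(per_feature_scores)
    averaged = {t: total / n for t, total in totals.items()}
    best = max(averaged, key=averaged.get)
    return best, averaged[best]
```

Other fusion rules, such as majority voting over the three top-ranked types or weighting MFCC more heavily, would fit the same description equally well.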

Beneficial effects of the present invention: in this traffic road sound monitoring and abnormal sound recognition system, the sound pickup and the data processing module can be installed on designated road sections as required. Traffic road sound is monitored in real time, and abnormal sound data are transmitted to the server end for further identification only when an abnormal sound is detected, which greatly reduces the volume of transmitted data. Multiple feature values of the abnormal sound signal are extracted and then identified and classified with a neural network. The abnormal sound data are classified with a deep convolutional neural network (CNN), which is well suited to sound identification and classification; this greatly improves training efficiency and recognition accuracy, so that abnormal sounds generated on traffic roads are identified efficiently.

Description of the drawings

Fig. 1 is a structural diagram of the traffic road sound monitoring and abnormal sound recognition system of the present invention;

Fig. 2 is a schematic diagram of the sound collection end in the system of the present invention;

Fig. 3 is a schematic diagram of the data processing module in the system of the present invention;

Fig. 4 is a schematic diagram of server-side data processing in the system of the present invention;

Fig. 5 is a schematic diagram of the MFCC feature extraction steps in the system of the present invention;

Fig. 6 is a schematic diagram of the short-time energy feature extraction steps in the system of the present invention;

Fig. 7 is a schematic diagram of the short-time zero-crossing rate feature extraction steps in the system of the present invention.

Specific embodiment

The content of the present invention is further described below with reference to an embodiment and the drawings, which do not limit the invention.

Embodiment

As shown in the system structure diagram of Fig. 1, the traffic road sound monitoring and abnormal sound recognition system of the present invention includes a sound collection end 7 and a server end 6 connected through a network. The sound collection end 7 includes a sound pickup 1, a sound card 2, a GPS positioning module 3, a data processing module 4, and a wireless communication module 5.

The sound pickup 1 collects and amplifies the sound signal and passes it to the sound card 2. The sound card 2 converts the collected sound signal (an analog signal) into a digital signal and passes the digital signal to the data processing module 4. The GPS positioning module 3 receives the location information sent by satellites and passes it to the data processing module 4. The data processing module 4 performs preliminary detection on the received sound signal to determine whether an abnormal sound is present; if one is detected, the abnormal sound is sent to the server end 6 through the wireless communication module 5. At the same time, the module processes the received GPS positioning information, extracts the two-dimensional position data, and sends them to the server end 6 through the wireless communication module 5. The server end 6 extracts multiple features from the received abnormal sound data and classifies them with a deep convolutional neural network. Finally, using the received location information, the location where the abnormal sound occurred and the classification result are presented on a map.

As shown in Fig. 2, in the sound pickup 1 a microphone collects the sound signal, which is an analog signal. The signal is amplified first by an operational amplifier (first stage) and then by an automatic gain control (AGC) amplifier (second stage), and the amplified analog signal is output. The two-stage amplifying circuit effectively controls the strength of the sound signal and prevents momentary high-decibel sounds from affecting downstream equipment. In the sound card 2, the amplified analog signal is sampled, quantized, and encoded, and a PCM digital signal that the data processing module 4 can recognize is output.

As shown in the data processing schematic of Fig. 3, the data processing module 4 first divides the received sound signal into frames and processes the signal in units of frames. It then applies a fast Fourier transform (FFT) to each frame to obtain the corresponding spectrum, calculates the signal power of each frame, and calculates the decibel value of the sound from that power. Finally, it makes a decision on the decibel value of each frame: a value above 40 dB is judged to be an abnormal sound (according to the noise standard, sound above 40 dB is judged to be noise). Note in particular that when an abnormal sound occurs, the decibel values of several consecutive frames of the sound signal all exceed 40 dB. Meanwhile, two-dimensional position coordinates are extracted from the GPS positioning information. The extracted abnormal sound data and two-dimensional position coordinate data are packaged and transmitted to the server end 6 over the network provided by the wireless communication module 5.
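The remark that an abnormal sound keeps several consecutive frames above 40 dB suggests a run-length check on the per-frame decisions. The sketch below assumes the per-frame decibel values are already available and uses a hypothetical `min_frames=3`, since the source says only "several" frames.

```python
def abnormal_segments(frame_db_values, threshold_db=40.0, min_frames=3):
    """Return (start, end) frame index pairs, end exclusive, for every run of
    at least min_frames consecutive frames above threshold_db."""
    segments, start = [], None
    for i, level in enumerate(frame_db_values):
        if level > threshold_db:
            if start is None:
                start = i          # a run of loud frames begins here
        else:
            if start is not None and i - start >= min_frames:
                segments.append((start, i))
            start = None           # the run (if any) has ended
    # Close a run that lasts until the final frame.
    if start is not None and len(frame_db_values) - start >= min_frames:
        segments.append((start, len(frame_db_values)))
    return segments
```

With `[30, 45, 30, 50, 55, 60, 35]` the isolated loud frame at index 1 is ignored and the function returns `[(3, 6)]`, so only the sustained event would be packaged and sent.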

As shown in the server-side data processing schematic of Fig. 4, the server end 6 receives the data packet sent by the data processing module 4 and unpacks it to obtain the sound data and the two-dimensional position coordinates. The sound data are first saved locally and then filtered; multiple features are then extracted from the sound signal, and the feature data are finally input into the deep convolutional neural network for identification and classification, which yields the classification result. According to the classification result, the sound is saved to the corresponding library to build up the abnormal sound database; for example, a traffic accident collision sound is saved to the traffic accident collision sound database. The two-dimensional position coordinates are first saved locally and then converted from geographic latitude and longitude into the corresponding map coordinates; the embodiment of the present invention calls Baidu/Google Maps. Finally, the classification result and the position coordinates are combined to present on the map the location where the abnormal sound occurred and its category.

Further, as described above, multiple features are extracted from the sound signal. An abnormal sound is an aperiodic, non-stationary random signal, and a single feature in the time domain or the frequency domain cannot describe it fully. Over a short interval, generally 10 ms to 30 ms, an abnormal sound signal can be regarded as short-time stationary. Based on this property, multiple features can be extracted from the time and frequency domains of the sound signal, and identifying with multiple features can improve the recognition rate. Multi-feature extraction of the abnormal sound signal covers three features: MFCC, short-time energy, and short-time zero-crossing rate.
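The source does not define the packet format exchanged between the data processing module and the server. As a hypothetical illustration of the pack and unpack steps, the sketch below puts latitude, longitude, and a sample count in a fixed little-endian header followed by 16-bit PCM samples; the layout and the names `pack_event` and `unpack_event` are assumptions.

```python
import struct

# Hypothetical header: latitude (float64), longitude (float64), sample count (uint32).
HEADER = struct.Struct("<ddI")

def pack_event(lat, lon, pcm_samples):
    """Collection side: bundle 2-D coordinates with 16-bit PCM samples."""
    body = struct.pack("<%dh" % len(pcm_samples), *pcm_samples)
    return HEADER.pack(lat, lon, len(pcm_samples)) + body

def unpack_event(packet):
    """Server side: recover the coordinates and the PCM samples."""
    lat, lon, n = HEADER.unpack_from(packet, 0)
    samples = list(struct.unpack_from("<%dh" % n, packet, HEADER.size))
    return lat, lon, samples
```

A production format would typically also carry a timestamp, a device identifier, and a checksum, none of which the source describes.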

Further, as shown in Fig. 5, MFCC feature extraction comprises the following steps:

2) Framing: process the sound signal in units of frames;

Further, as shown in Fig. 6, short-time energy feature extraction comprises the following steps:

1) Framing: process the sound signal in units of frames;

Further, as shown in Fig. 7, short-time zero-crossing rate feature extraction comprises the following steps:

1) Framing: process the sound signal in units of frames;

3) Determine whether two adjacent samples in each frame have different algebraic signs; if so, a zero crossing has occurred;

4) Count the number of zero crossings in each frame;

Further, identification of the abnormal sound signal by the deep convolutional neural network comprises the following steps:

1) Establish an abnormal sound feature database;

2) Input the extracted MFCC, short-time energy, and zero-crossing rate feature data of the abnormal sound into the deep convolutional neural network;

Claims

1. A traffic road sound monitoring and abnormal sound recognition system, characterized in that it comprises a sound collection end and a server end connected through a network;

The sound collection end includes a sound pickup, a sound card, a GPS positioning module, a data processing module, and a wireless communication module;

The sound pickup is used to collect sound signals;

The server end is used to extract three kinds of feature data, namely MFCC, short-time energy, and short-time zero-crossing rate, from the abnormal sound data; to input the extracted feature data into a deep convolutional neural network and compare them against the abnormal sound feature database; and to identify and output the type of the abnormal sound by combining the matching rates of the three features. The abnormal sound type is combined with the location information so that the location where the abnormal sound occurred and its type are displayed on a map.

2. The traffic road sound monitoring and abnormal sound recognition system according to claim 1, characterized in that the data processing module performs preliminary detection on the collected sound signal to determine whether it is an abnormal sound, with the following steps:

1) Framing: process the sound signal in units of frames;

2) Apply a fast Fourier transform (FFT) to the sound data to obtain the spectrum of each frame;

3) Calculate the signal power of each frame;

5) Make a decision on the decibel value of each frame: a frame above 40 dB is judged to be an abnormal sound.

3. The traffic road sound monitoring and abnormal sound recognition system according to claim 1, characterized in that the server end performs MFCC feature extraction on the abnormal sound data, comprising the following steps:

1) Pre-emphasis: boost the high-frequency part of the signal so that the signal spectrum becomes flatter;

2) Framing: process the signal in units of frames;

4. The traffic road sound monitoring and abnormal sound recognition system according to claim 1, characterized in that the server end performs short-time energy feature extraction on the abnormal sound data, comprising the following steps:

1) Framing: process the signal in units of frames;

5. The traffic road sound monitoring and abnormal sound recognition system according to claim 1, characterized in that the server end performs zero-crossing rate feature extraction on the abnormal sound data, comprising the following steps:

1) Framing: process the signal in units of frames;

4) Count the number of zero crossings in each frame;

6. The traffic road sound monitoring and abnormal sound recognition system according to claim 1, characterized in that the server end identifies the abnormal sound signal using a deep convolutional neural network, comprising the following steps:

1) Establish an abnormal sound feature database;