CN104112453A - Audio preprocessing system - Google Patents
- Publication number
- CN104112453A (application CN201410141017.9A)
- Authority
- CN
- China
- Prior art keywords
- frame
- audio
- audio signal
- audio frequency
- emphasis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses an audio preprocessing system that aims to obtain a more accurate audio signal and to facilitate subsequent audio processing steps. The system comprises six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. A/D conversion converts the analog audio signal into a digital audio signal so that a computer can process it further. Many characteristics of the audio signal are concentrated in the high band, yet the higher the frequency, the smaller the spectral value, so the high-frequency part of the audio undergoes pre-emphasis to raise the high-frequency resolution of the audio signal. Although the audio signal is a non-linear, time-varying signal, it is stationary over short intervals, and the framing step captures these short-time characteristics. The framed audio signal must then be windowed, which reduces discontinuity between frames. Endpoint detection is a necessary step: only by finding the start point and end point of the audio signal can the effective audio segment be located for further analysis and recognition.
Description
Technical field
The present invention relates to the field of audio information processing, and more specifically to an audio preprocessing system for obtaining a more accurate audio signal.
Background art
Audio recognition technology has developed considerably and is applied in many fields, such as office and business systems, manufacturing, telecommunications and medical care. However, high recognition rates are usually obtained in relatively clean acoustic environments; in noisy environments the recognition rate drops sharply.
To address this problem, the invention discloses an audio preprocessing system. Before audio features are extracted, the collected speech signal must first be preprocessed by sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. A/D conversion converts the analog speech signal into a digital speech signal so that a computer can process it further. Many features of speech are concentrated in the high band, yet the spectral value becomes smaller as the frequency rises, so the high-frequency part of the speech is emphasized to improve its high-frequency resolution. Although the speech signal is a non-linear, time-varying signal, it is stationary over short intervals, and framing captures these short-time characteristics. The framed speech signal must be windowed, which reduces discontinuity between frames. Endpoint detection is the necessary basis of a speech recognition system: only by finding the start point and end point of the speech signal can the effective speech segment be located for further analysis and recognition.
1. Sampling
Sampling and A/D conversion can be implemented with MATLAB toolbox functions.
2. Pre-emphasis
Pre-emphasis flattens the signal spectrum, which facilitates analysis. The speech signal to be pre-emphasized is usually passed through a first-order high-pass filter, expressed as follows:
$$H(z) = 1 - \alpha z^{-1}, \quad 0.9 < \alpha < 0.98$$
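The transfer function above can be read in the time domain as a first-order difference. The following is a standard derivation rather than text from the patent; $x[n]$ and $y[n]$ are names introduced here for the input and output samples:

$$y[n] = x[n] - \alpha\, x[n-1]$$

$$\left|H(e^{j\omega})\right|^2 = 1 + \alpha^2 - 2\alpha\cos\omega$$

The gain is therefore $1-\alpha$ at $\omega = 0$ and $1+\alpha$ at $\omega = \pi$. With an illustrative $\alpha = 0.97$ (an assumed value inside the stated range), low frequencies are attenuated to about 0.03 while the highest frequencies are boosted to about 1.97, which is the high-frequency emphasis the text describes.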
3. Framing
A frame is generally 20-30 ms long. The frame shift is the part where a frame overlaps its adjacent frame; to keep the characteristics from changing too much between frames, the frame shift is generally taken as 1/3 or 1/2 of the frame length.
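As a worked example of these figures, assume a sampling rate of $f_s = 16\,\text{kHz}$ and a frame duration of $T = 25\,\text{ms}$; both values are illustrative and not taken from the patent. The frame length in samples is

$$N = f_s \cdot T = 16000 \times 0.025 = 400$$

so a frame shift of 1/2 of the frame length is 200 samples and a frame shift of 1/3 is about 133 samples. If consecutive frames start $S$ samples apart, a signal of $L \ge N$ samples yields

$$\left\lfloor \frac{L - N}{S} \right\rfloor + 1$$

complete frames.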
4. Windowing
Windowing a speech signal means multiplying the original speech signal by a window function. Commonly used window functions are the rectangular window, the Hanning window and the Hamming window. The three window functions are, in order:
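The patent's own formulas are not present in this text. As an assumption, the common textbook definitions for a window of length $N$, with $0 \le n \le N-1$ and $w(n) = 0$ elsewhere, are:

$$w_{\text{rect}}(n) = 1$$

$$w_{\text{Hann}}(n) = 0.5\left[1 - \cos\!\left(\frac{2\pi n}{N-1}\right)\right]$$

$$w_{\text{Hamming}}(n) = 0.54 - 0.46\cos\!\left(\frac{2\pi n}{N-1}\right)$$

Some references use $N$ instead of $N-1$ in the denominator; which form the patent intended cannot be determined from the text.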
Summary of the invention
The invention discloses an audio preprocessing system whose purpose is to obtain a more accurate audio signal and to facilitate subsequent audio processing steps.
The invention adopts the following technical solution: an audio preprocessing system comprising six steps, namely sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. The A/D conversion step converts the analog audio signal into a digital audio signal so that a computer can process it further. Many features of the audio signal are concentrated in the high band, yet the spectral value becomes smaller as the frequency rises, so the high-frequency part of the audio undergoes pre-emphasis to improve the high-frequency resolution of the audio signal. Although the audio signal is a non-linear, time-varying signal, it is stationary over short intervals, and the framing step captures these short-time characteristics. The framed audio signal must undergo the windowing step, which reduces discontinuity between frames. Endpoint detection is a necessary step of the invention: only by finding the start point and end point of the audio signal can the effective audio segment be located for further analysis and recognition.
The implementation of the invention further comprises the following technical solutions:
The sampling and A/D conversion steps can be implemented with MATLAB toolbox functions.
The pre-emphasis step flattens the signal spectrum, which facilitates analysis. The audio signal to be pre-emphasized is usually passed through a first-order high-pass filter, expressed as follows:
$$H(z) = 1 - \alpha z^{-1}, \quad 0.9 < \alpha < 0.98$$
The framing step generally takes 20-30 ms as one frame. The frame shift is the part where a frame overlaps its adjacent frame; to keep the characteristics from changing too much between frames, the frame shift is generally taken as 1/3 or 1/2 of the frame length.
The windowing step multiplies the original audio signal by a window function. Commonly used window functions are the rectangular window, the Hanning window and the Hamming window. The three window functions are, in order:
Short-time energy alone or zero-crossing rate alone is not sufficient to accurately locate the audio segment that is actually to be analyzed, so the two are combined: short-time energy and zero-crossing rate are used jointly to perform the endpoint detection step.
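The patent does not give formulas for the two measures. As an assumption, common textbook definitions for the $m$-th frame $x_m(n)$ of length $N$ are:

$$E_m = \sum_{n=0}^{N-1} x_m(n)^2$$

$$Z_m = \frac{1}{2}\sum_{n=1}^{N-1} \left|\operatorname{sgn}\big(x_m(n)\big) - \operatorname{sgn}\big(x_m(n-1)\big)\right|$$

Here $E_m$ is the short-time energy and $Z_m$ counts the sign changes within the frame; the symbols $E_m$, $Z_m$ and $x_m$ are introduced for illustration only.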
The advantages and beneficial effects of the invention are as follows:
1. By preprocessing the audio signal, the invention obtains a more accurate audio signal, which facilitates the subsequent steps of audio signal processing.
2. The preprocessing efficiency of the audio signal is improved.
Brief description of the drawings
Fig. 1 is a schematic diagram of the steps performed by the present invention;
Fig. 2 is a flow chart of the endpoint detection step.
Detailed description
The implementation of the invention is further described below with reference to Fig. 1:
An audio preprocessing system comprises six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. The A/D conversion step converts the analog audio signal into a digital audio signal so that a computer can process it further. Many features of the audio signal are concentrated in the high band, yet the spectral value becomes smaller as the frequency rises, so the high-frequency part of the audio undergoes pre-emphasis to improve the high-frequency resolution of the audio signal. Although the audio signal is a non-linear, time-varying signal, it is stationary over short intervals, and the framing step captures these short-time characteristics. The framed audio signal must undergo the windowing step, which reduces discontinuity between frames. Endpoint detection is a necessary step of the invention: only by finding the start point and end point of the audio signal can the effective audio segment be located for further analysis and recognition.
The sampling and A/D conversion steps can be implemented with MATLAB toolbox functions.
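The patent names MATLAB toolbox functions but does not say which ones. As a language-neutral illustration of what sampling and A/D conversion do, here is a minimal Python sketch that samples a continuous-time function and quantizes it uniformly. The function name `sample_and_quantize`, the 16-bit depth, the 8 kHz rate and the 440 Hz test tone are all assumptions made for this example.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, bits=16):
    """Sample a continuous-time function and quantize it to signed integers.

    signal: callable t -> amplitude in [-1.0, 1.0]; it stands in for the analog input.
    """
    max_code = 2 ** (bits - 1) - 1              # 32767 for 16 bits
    n_samples = int(round(duration_s * sample_rate_hz))
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                  # sampling instant in seconds
        x = max(-1.0, min(1.0, signal(t)))      # clip to the converter's input range
        codes.append(int(round(x * max_code)))  # uniform quantization
    return codes

# Hypothetical input: a 440 Hz tone sampled at 8 kHz for 10 ms.
tone = lambda t: 0.5 * math.sin(2 * math.pi * 440 * t)
pcm = sample_and_quantize(tone, duration_s=0.01, sample_rate_hz=8000)
```

The resulting list of integers plays the role of the digital audio signal that the later steps operate on.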
The pre-emphasis step flattens the signal spectrum, which facilitates analysis. The audio signal to be pre-emphasized is usually passed through a first-order high-pass filter, expressed as follows:
$$H(z) = 1 - \alpha z^{-1}, \quad 0.9 < \alpha < 0.98$$
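A minimal Python sketch of this filter follows. It applies the time-domain form y[n] = x[n] - α·x[n-1]. The default α = 0.97 is an assumed value inside the stated range, and treating the sample before the first one as zero is also an assumption; the patent specifies neither.

```python
def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], the time-domain form of H(z) = 1 - alpha * z^-1."""
    if not samples:
        return []
    out = [float(samples[0])]                   # assumption: x[-1] is taken as 0
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out

# A constant (low-frequency) input is almost cancelled,
# while an alternating (high-frequency) input is amplified.
flat = pre_emphasis([1.0, 1.0, 1.0], alpha=0.95)
alternating = pre_emphasis([1.0, -1.0, 1.0], alpha=0.95)
```

Comparing `flat` and `alternating` shows the high-pass behavior: slowly varying content shrinks toward 1 - α, rapidly varying content grows toward 1 + α.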
The framing step generally takes 20-30 ms as one frame. The frame shift is the part where a frame overlaps its adjacent frame; to keep the characteristics from changing too much between frames, the frame shift is generally taken as 1/3 or 1/2 of the frame length.
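The framing step can be sketched in Python as below. Note one assumption in particular: this sketch treats the frame shift as the distance between the starts of consecutive frames, which is the usual convention. With a shift of 1/2 this coincides with the overlap described in the text, but with 1/3 the two readings differ. The function name, the 25 ms default and the 16 kHz rate are illustrative choices.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=25, shift_fraction=0.5):
    """Split a sample list into overlapping frames of equal length.

    shift_fraction is the step between frame starts as a fraction of the
    frame length (assumed interpretation of the "frame shift").
    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(sample_rate_hz * frame_ms / 1000)
    hop = max(1, int(frame_len * shift_fraction))
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames

# Hypothetical input: 0.1 s of audio at 16 kHz (1600 samples).
frames = split_into_frames(list(range(1600)), sample_rate_hz=16000)
```

Padding the last partial frame with zeros instead of dropping it is an equally reasonable design; the patent does not say which to use.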
The windowing step multiplies the original audio signal by a window function. Commonly used window functions are the rectangular window, the Hanning window and the Hamming window. The three window functions are, in order:
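The window formulas themselves are not present in this text. The Python sketch below therefore uses the common textbook definitions, with N - 1 in the denominator, as an assumption; the function names are illustrative.

```python
import math

def rectangular_window(length):
    """w(n) = 1 for every sample in the frame."""
    return [1.0] * length

def hanning_window(length):
    """w(n) = 0.5 * (1 - cos(2*pi*n / (N - 1)))."""
    if length == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]

def hamming_window(length):
    """w(n) = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]

def apply_window(frame, window):
    """Multiply a frame by a window of the same length, sample by sample."""
    if len(frame) != len(window):
        raise ValueError("frame and window must have the same length")
    return [x * w for x, w in zip(frame, window)]

# The Hamming window tapers the frame edges toward 0.08 and leaves the center near 1.
windowed = apply_window([1.0] * 5, hamming_window(5))
```

Tapering the edges is what reduces the discontinuity between frames that the text mentions; the rectangular window leaves the frame unchanged and so provides no such smoothing.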
The implementation of the endpoint detection step is further described below with reference to Fig. 2:
Short-time energy alone or zero-crossing rate alone is not sufficient to accurately locate the audio segment that is actually to be analyzed, so the two are combined: short-time energy and zero-crossing rate are used jointly for endpoint detection. Taking a speech signal as an example, a speech signal generally contains silent segments, unvoiced segments and voiced segments. Voiced segments are the effective speech produced by vocal-cord vibration and contain the most energy. Unvoiced segments are speech produced by air striking within the oral cavity and contain the next most energy. Silent segments contain the least energy. Experiments also show that the zero-crossing rate is highest in unvoiced segments and lower in voiced segments.
During detection, four thresholds are set in total: two low thresholds, Tel (energy low threshold) and Tzl (zero-crossing-rate low threshold), and two high thresholds, Teh (energy high threshold) and Tzh (zero-crossing-rate high threshold). The low thresholds are set to small values; they are sensitive to the signal and easily exceeded. The high thresholds are set to large values, which the signal reaches only when it has considerable strength. A signal that exceeds only a low threshold may therefore be short-lived noise; only a signal that reaches the high thresholds is determined to be speech. The specific flow is shown in Fig. 2. When the energy of the speech signal exceeds Teh and the zero-crossing rate exceeds Tzh, the speech segment has formally begun. The end of the speech segment is determined when the energy falls below Tel and the zero-crossing rate falls below Tzl. Because these thresholds are set, misjudgments are reduced.
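The start and end rules in this paragraph can be sketched in Python as a small state machine over per-frame energy and zero-crossing-rate values. Fig. 2 is not available here, so reading "exceeds Teh and ... exceeds Tzh" as a logical AND is an assumption, as are the function name and the threshold values in the example.

```python
def detect_endpoints(energies, zcrs, tel, teh, tzl, tzh):
    """Return (start_frame, end_frame) pairs, with end_frame exclusive.

    A segment starts when energy > teh and zcr > tzh (assumed AND reading),
    and ends when energy < tel and zcr < tzl.
    """
    n_frames = min(len(energies), len(zcrs))
    segments = []
    start = None                                # None means "currently outside speech"
    for i in range(n_frames):
        e, z = energies[i], zcrs[i]
        if start is None:
            if e > teh and z > tzh:             # high thresholds: speech begins
                start = i
        elif e < tel and z < tzl:               # low thresholds: speech ends
            segments.append((start, i))
            start = None
    if start is not None:                       # signal ended while still in speech
        segments.append((start, n_frames))
    return segments

# Hypothetical per-frame measurements and thresholds.
energies = [0.1, 0.2, 5.0, 6.0, 1.5, 0.1, 0.1]
zcrs = [0.05, 0.05, 0.40, 0.30, 0.20, 0.02, 0.02]
segments = detect_endpoints(energies, zcrs, tel=1.0, teh=4.0, tzl=0.1, tzh=0.25)
```

Using a high threshold to enter and a low threshold to leave gives hysteresis: frame 4 in the example stays inside the segment even though its energy has dropped below Teh, because it has not yet fallen below Tel.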
Any similar technical solution that uses the technical solution of the present invention, or that a person skilled in the art designs under its inspiration, and that achieves the above technical effects falls within the scope of protection of the present invention.
Claims (6)
1. An audio preprocessing system, characterized by six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection, implemented as follows: the A/D conversion step converts the analog audio signal into a digital audio signal so that a computer can process it further; the high-frequency part of the audio undergoes pre-emphasis; the framing step obtains the short-time characteristics of the audio signal; the framed audio signal must undergo the windowing step; and the endpoint detection step finds the start point and end point of the audio signal.
2. The audio preprocessing system according to claim 1, characterized in that the sampling and A/D conversion steps can be implemented with MATLAB toolbox functions.
3. The audio preprocessing system according to claim 1, characterized in that the pre-emphasis step is implemented by passing the audio signal to be pre-emphasized through a first-order high-pass filter.
4. The audio preprocessing system according to claim 1, characterized in that the framing step generally takes 20-30 ms as one frame; the frame shift is the part where a frame overlaps its adjacent frame; and, to keep the characteristics from changing too much between frames, the frame shift is generally taken as 1/3 or 1/2 of the frame length.
5. The audio preprocessing system according to claim 1, characterized in that the windowing step multiplies the original audio signal by a window function.
6. The audio preprocessing system according to claim 1, characterized in that the endpoint detection step is carried out using short-time energy and zero-crossing rate jointly.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410141017.9A CN104112453A (en) | 2014-04-09 | 2014-04-09 | Audio preprocessing system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410141017.9A CN104112453A (en) | 2014-04-09 | 2014-04-09 | Audio preprocessing system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104112453A true CN104112453A (en) | 2014-10-22 |
Family
ID=51709211
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410141017.9A Pending CN104112453A (en) | 2014-04-09 | 2014-04-09 | Audio preprocessing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104112453A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109507510A (en) * | 2018-11-28 | 2019-03-22 | 深圳桓轩科技有限公司 | A kind of transformer fault diagnosis system |
CN114007176A (en) * | 2020-10-09 | 2022-02-01 | 上海又为智能科技有限公司 | Audio signal processing method, apparatus and storage medium for reducing signal delay |
CN114007176B (en) * | 2020-10-09 | 2023-12-19 | 上海又为智能科技有限公司 | Audio signal processing method, device and storage medium for reducing signal delay |
CN112786071A (en) * | 2021-01-13 | 2021-05-11 | 国家电网有限公司客户服务中心 | Data annotation method for voice segments of voice interaction scene |
CN112863546A (en) * | 2021-01-21 | 2021-05-28 | 安徽理工大学 | Belt conveyor health analysis method based on audio characteristic decision |
CN113327590A (en) * | 2021-04-15 | 2021-08-31 | 中标软件有限公司 | Speech recognition method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104112453A (en) | Audio preprocessing system | |
US10090005B2 (en) | Analog voice activity detection | |
WO2018145584A1 (en) | Voice activity detection method and voice recognition method | |
CN103117067B (en) | Voice endpoint detection method under low signal-to-noise ratio | |
CN103886871A (en) | Detection method of speech endpoint and device thereof | |
US9454976B2 (en) | Efficient discrimination of voiced and unvoiced sounds | |
TWI569263B (en) | Method and apparatus for signal extraction of audio signal | |
US20140067388A1 (en) | Robust voice activity detection in adverse environments | |
CN104992714A (en) | Motor abnormal sound detection method | |
CN107274911A (en) | A kind of similarity analysis method based on sound characteristic | |
CN104409078A (en) | Abnormal noise detection and recognition system | |
CN103297590A (en) | Method and system for achieving equipment unlocking based on voice frequency | |
CN108364637A (en) | A kind of audio sentence boundary detection method | |
CN109377982B (en) | Effective voice obtaining method | |
CN108735230B (en) | Background music identification method, device and equipment based on mixed audio | |
WO2020186695A1 (en) | Voice information batch processing method and apparatus, computer device, and storage medium | |
EP3503093A1 (en) | Method for associating a device with a speaker in a gateway, corresponding computer program computer and apparatus | |
Song et al. | Feature extraction and classification for audio information in news video | |
CN104240705A (en) | Intelligent voice-recognition locking system for safe box | |
CN205451769U (en) | Wear speech recognition system of smart machine and wear smart machine | |
CN104766610A (en) | Voice recognition system and method based on vibration | |
CN205029873U (en) | Detection device based on time domain analysis | |
CN106653040A (en) | Voice audio signal sampling processing method | |
CN103997381B (en) | The identification of examination hall cheating signal intelligent and evidence obtaining method of reducing | |
CN107833582B (en) | Arc length-based voice signal endpoint detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20141022 |