CN104112453A

CN104112453A - Audio preprocessing system

Info

Publication number: CN104112453A
Application number: CN201410141017.9A
Authority: CN
Inventors: Inventor name not published
Original assignee: Tianjin Siboke Technology Development Co Ltd
Current assignee: Tianjin Siboke Technology Development Co Ltd
Priority date: 2014-04-09
Filing date: 2014-04-09
Publication date: 2014-10-22

Abstract

The invention discloses an audio preprocessing system that aims to obtain more accurate audio signals and to simplify subsequent audio processing steps. The system comprises six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. A/D conversion converts the analog audio signal into a digital audio signal so that a computer can process it further. Many features of an audio signal are concentrated in the high-frequency band, yet the higher the frequency, the smaller the spectral value. The high-frequency part of the audio therefore undergoes pre-emphasis to raise the high-frequency resolution of the signal. Although the audio signal is a non-linear, time-varying signal, it is stationary over short intervals, and the framing step captures these short-time characteristics. The framed audio signal must then be windowed, which reduces discontinuity between frames. Endpoint detection is a necessary step: only by finding the start point and end point of the audio signal can the effective audio segment be located for further analysis and recognition.

Description

An audio preprocessing system

Technical field

The present invention relates to the field of audio information processing, and more specifically to an audio preprocessing system for obtaining a more accurate audio signal.

Background art

Audio recognition technology has developed considerably and is applied in many fields, such as office and business systems, manufacturing, telecommunications and medical care. However, high recognition rates are usually obtained in relatively clean acoustic environments; in noisy environments the recognition rate drops sharply.

To address this problem, the invention discloses an audio preprocessing system. Before audio features are extracted, the collected speech signal must first be preprocessed by sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. A/D conversion converts the analog speech signal into a digital audio signal so that a computer can process it further. Because many features of speech are concentrated in the high-frequency band, yet the spectral value decreases as the frequency rises, the high-frequency part of the speech is emphasized to improve its high-frequency resolution. Although the speech signal is a non-linear, time-varying signal, it is stationary over short intervals, and framing captures its short-time characteristics. The framed speech signal must be windowed, which reduces discontinuity between frames. Endpoint detection is an essential basis of a speech recognition system: only by finding the start point and end point of the speech signal can the effective speech segment be located for further analysis and recognition.

1. Sampling

Sampling and A/D conversion can be implemented with MATLAB toolbox functions.

2. Pre-emphasis

Pre-emphasis flattens the signal spectrum, which facilitates analysis. Typically, the speech signal to be pre-emphasized is passed through a first-order high-pass filter, expressed as follows:

$$H(z) = 1 - \alpha z^{-1}, \qquad 0.9 < \alpha < 0.98$$
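In the time domain, the filter above corresponds to the difference equation y[n] = x[n] - αx[n-1]. A minimal Python sketch follows. The function name `pre_emphasize`, the default α = 0.97 (one value inside the stated range) and the choice to pass the first sample through unchanged are assumptions made for illustration, not details taken from the patent.

```python
def pre_emphasize(samples, alpha=0.97):
    """Apply the first-order high-pass filter H(z) = 1 - alpha * z^-1.

    Time-domain form: y[n] = x[n] - alpha * x[n-1].
    Assumption: the sample before the first one is treated as 0,
    so the first sample passes through unchanged.
    """
    if not 0.9 < alpha < 0.98:
        raise ValueError("alpha is expected to lie in the open range (0.9, 0.98)")
    emphasized = []
    previous = 0.0
    for x in samples:
        emphasized.append(x - alpha * previous)
        previous = x
    return emphasized
```

A constant (purely low-frequency) input is almost entirely suppressed after the first sample, which is the behavior expected of a high-pass filter.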

3. Framing

Frames are generally 20-30 ms long. The frame shift is the part where a frame overlaps its neighboring frame; to keep the characteristics from changing too much between frames, the frame shift is generally set to 1/3 or 1/2 of the frame length.
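The framing step can be sketched in Python as below. The function name `split_frames` and its parameters are hypothetical. The sketch interprets the frame shift as the offset between the starts of successive frames, which is the usual convention; with a shift of 1/2 the frame length, that offset and the overlap are the same. Dropping trailing samples that do not fill a whole frame is also an assumption.

```python
def split_frames(samples, sample_rate, frame_ms=25, shift_ratio=0.5):
    """Split a sample sequence into overlapping fixed-length frames.

    frame_ms:    frame duration in milliseconds (20-30 ms per the text).
    shift_ratio: frame shift as a fraction of the frame length (1/3 or 1/2).
    Assumption: trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    hop = max(1, int(frame_len * shift_ratio))
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames
```

For example, at an 8 kHz sampling rate a 25 ms frame holds 200 samples, and a shift ratio of 1/2 starts a new frame every 100 samples.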

4. Windowing

Windowing a speech signal means multiplying the original speech signal by a window function. Commonly used window functions are the rectangular window, the Hanning window and the Hamming window. The three window functions, in that order, are as follows:
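The window formulas themselves are not reproduced in this text. For reference, the standard textbook definitions for a frame of N samples, with 0 ≤ n ≤ N-1 and w(n) = 0 elsewhere, are given below. They are assumed here and may differ in notation from the formulas in the original filing.

```latex
\begin{align*}
w_{\mathrm{rect}}(n) &= 1 \\
w_{\mathrm{hann}}(n) &= 0.5 - 0.5\cos\!\left(\frac{2\pi n}{N-1}\right) \\
w_{\mathrm{hamm}}(n) &= 0.54 - 0.46\cos\!\left(\frac{2\pi n}{N-1}\right)
\end{align*}
```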

Summary of the invention

The invention discloses an audio preprocessing system whose purpose is to obtain a more accurate audio signal and to facilitate subsequent audio processing steps.

The invention is realized by the following technical solution: an audio preprocessing system comprising six steps, namely sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. The A/D conversion step converts the analog audio signal into a digital audio signal so that a computer can process it further. Because many features of an audio signal are concentrated in the high-frequency band, yet the spectral value decreases as the frequency rises, the high-frequency part of the audio undergoes pre-emphasis to improve the high-frequency resolution of the signal. Although the audio signal is a non-linear, time-varying signal, it is stationary over short intervals, and the framing step captures its short-time characteristics. The framed audio signal must undergo the windowing step, which reduces discontinuity between frames. Endpoint detection is a necessary step of the invention: only by finding the start point and end point of the audio signal can the effective audio segment be located for further analysis and recognition.

The implementation of the invention further comprises the following technical solutions:

The sampling and A/D conversion steps can be implemented with MATLAB toolbox functions.

The pre-emphasis step flattens the signal spectrum, which facilitates analysis. Typically, the audio signal to be pre-emphasized is passed through a first-order high-pass filter, expressed as follows:

$$H(z) = 1 - \alpha z^{-1}, \qquad 0.9 < \alpha < 0.98$$

The framing step generally uses frames of 20-30 ms. The frame shift is the part where a frame overlaps its neighboring frame; to keep the characteristics from changing too much between frames, the frame shift is generally set to 1/3 or 1/2 of the frame length.

The windowing step multiplies the original audio signal by a window function. Commonly used window functions are the rectangular window, the Hanning window and the Hamming window. The three window functions, in that order, are as follows:
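The windowing step can be sketched in Python as follows. Because the window formulas are not reproduced in this text, the sketch assumes the standard textbook Hamming window, w(n) = 0.54 - 0.46 cos(2πn / (N-1)). The function names `hamming_window` and `apply_window` are illustrative only.

```python
import math


def hamming_window(n_samples):
    """Return Hamming window coefficients for a frame of n_samples.

    Assumption: the standard textbook form
    w(n) = 0.54 - 0.46 * cos(2 * pi * n / (N - 1)), for 0 <= n <= N - 1.
    """
    if n_samples == 1:
        return [1.0]  # avoids division by zero for a single-sample frame
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
        for n in range(n_samples)
    ]


def apply_window(frame, window):
    """Multiply a frame by a window function, sample by sample."""
    if len(frame) != len(window):
        raise ValueError("frame and window must have the same length")
    return [x * w for x, w in zip(frame, window)]
```

The window tapers each frame toward its edges, which is what reduces the discontinuity between adjacent frames.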

Short-time energy or the zero-crossing rate alone is not sufficient to accurately locate the audio segment that actually needs to be analyzed, so the endpoint detection step combines the two, using short-time energy and the zero-crossing rate jointly.

The advantages and beneficial effects of the invention are as follows:

1. By preprocessing the audio signal, the invention obtains a more accurate audio signal, which facilitates the subsequent steps of audio signal processing.

2. The preprocessing efficiency of the audio signal is improved.

Brief description of the drawings

Fig. 1 is a schematic diagram of the steps of the invention;

Fig. 2 is a flow chart of the endpoint detection step.

Embodiment

The implementation of the invention is further described below with reference to Fig. 1:

An audio preprocessing system comprises six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. The A/D conversion step converts the analog audio signal into a digital audio signal so that a computer can process it further. Because many features of an audio signal are concentrated in the high-frequency band, yet the spectral value decreases as the frequency rises, the high-frequency part of the audio undergoes pre-emphasis to improve the high-frequency resolution of the signal. Although the audio signal is a non-linear, time-varying signal, it is stationary over short intervals, and the framing step captures its short-time characteristics. The framed audio signal must undergo the windowing step, which reduces discontinuity between frames. Endpoint detection is a necessary step of the invention: only by finding the start point and end point of the audio signal can the effective audio segment be located for further analysis and recognition.

$$H(z) = 1 - \alpha z^{-1}, \qquad 0.9 < \alpha < 0.98$$

The implementation of the endpoint detection step is further described below with reference to Fig. 2:

Short-time energy or the zero-crossing rate alone is not sufficient to accurately locate the audio segment that actually needs to be analyzed, so the two are combined, and endpoint detection uses short-time energy and the zero-crossing rate jointly. A speech audio signal is taken as an example. A typical speech signal comprises silent segments, unvoiced segments and voiced segments. Voiced segments are the effective speech produced by vocal cord vibration and contain the most energy. Unvoiced segments are speech produced by air striking the inside of the oral cavity and contain the next most energy. Silent segments contain the least energy. Experiments also show that the zero-crossing rate is highest in unvoiced segments and lower in voiced segments.
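The two per-frame features can be sketched in Python as below. The patent does not give formulas for them, so the definitions here are assumptions: short-time energy is taken as the sum of squared samples in a frame, and the zero-crossing rate as the fraction of adjacent sample pairs whose signs differ. The function names are illustrative.

```python
def short_time_energy(frame):
    """Short-time energy of one frame.

    Assumption: defined as the sum of squared sample values.
    """
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Zero-crossing rate of one frame.

    Assumption: the fraction of adjacent sample pairs whose signs differ,
    with zero counted as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)
```

Under these definitions, a loud voiced frame yields high energy with a low zero-crossing rate, while a noise-like unvoiced frame yields lower energy with a high zero-crossing rate, matching the behavior described above.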

Four thresholds are set for detection: two low thresholds, Tel (energy low threshold) and Tzl (zero-crossing-rate low threshold), and two high thresholds, Teh (energy high threshold) and Tzh (zero-crossing-rate high threshold). The low thresholds have small values, are sensitive to the signal and are easily exceeded. The high thresholds have large values and are reached only when the signal is fairly strong. A signal that exceeds only the low thresholds may therefore be brief noise; only a signal that reaches the high thresholds is determined to be speech. The specific flow is shown in Fig. 2. When the energy of the speech signal exceeds Teh and the zero-crossing rate exceeds Tzh, the speech segment has formally begun. The end of the speech segment is determined when the energy falls below Tel and the zero-crossing rate falls below Tzl. Because these thresholds are set, misjudgments are reduced.
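The start and end rules described above can be sketched as a small two-state loop in Python. This is an illustration under stated assumptions, not the flow of Fig. 2, which is not reproduced in this text. The function name `detect_endpoints`, the per-frame input lists, the half-open (start, end) frame-index output and the handling of a segment still open at the end of the input are all hypothetical choices.

```python
def detect_endpoints(energies, zcrs, tel, teh, tzl, tzh):
    """Locate speech segments from per-frame energy and zero-crossing rate.

    A segment starts at the first frame whose energy exceeds teh and whose
    zero-crossing rate exceeds tzh. It ends at the first later frame whose
    energy is below tel and whose zero-crossing rate is below tzl.
    Returns a list of (start, end) frame indices, end exclusive.
    Assumption: a segment still open at the end of the input is closed there.
    """
    if len(energies) != len(zcrs):
        raise ValueError("energies and zcrs must have the same length")
    segments = []
    start = None  # None means we are currently outside a speech segment
    for i, (energy, zcr) in enumerate(zip(energies, zcrs)):
        if start is None:
            if energy > teh and zcr > tzh:
                start = i
        elif energy < tel and zcr < tzl:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments
```

Using a high threshold to enter a segment and a low threshold to leave it gives hysteresis: a frame that only briefly exceeds the low thresholds never opens a segment, and a frame that dips between the low and high thresholds does not close one.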

Any similar technical solution that uses the technical solution of the invention, or that a person skilled in the art designs under the inspiration of that solution, and that achieves the above technical effects falls within the scope of protection of the invention.

Claims

1. An audio preprocessing system, characterized by comprising six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection, wherein the steps are implemented as follows: the A/D conversion step converts an analog audio signal into a digital audio signal so that a computer can process it further; pre-emphasis is applied to the high-frequency part of the audio; the framing step obtains the short-time characteristics of the audio signal; the framed audio signal must undergo the windowing step; and the endpoint detection step finds the start point and end point of the audio signal.

2. The audio preprocessing system according to claim 1, characterized in that: the sampling and A/D conversion steps can be implemented with MATLAB toolbox functions.

3. The audio preprocessing system according to claim 1, characterized in that: the pre-emphasis step is implemented by passing the audio signal to be pre-emphasized through a first-order high-pass filter.

4. The audio preprocessing system according to claim 1, characterized in that: the framing step generally uses frames of 20-30 ms; the frame shift is the part where a frame overlaps its neighboring frame; and, to keep the characteristics from changing too much between frames, the frame shift is generally set to 1/3 or 1/2 of the frame length.

5. The audio preprocessing system according to claim 1, characterized in that: the windowing step multiplies the original audio signal by a window function.

6. The audio preprocessing system according to claim 1, characterized in that: the endpoint detection step is implemented using short-time energy and the zero-crossing rate jointly.