CN104112453A - Audio preprocessing system - Google Patents

Audio preprocessing system

Info

Publication number
CN104112453A
Authority
CN
China
Prior art keywords
frame
audio
audio signal
audio frequency
emphasis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410141017.9A
Other languages
Chinese (zh)
Inventor
Not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Siboke Technology Development Co Ltd
Original Assignee
Tianjin Siboke Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Siboke Technology Development Co Ltd
Priority to CN201410141017.9A
Publication of CN104112453A
Legal status: Pending

Abstract

The invention discloses an audio preprocessing system that aims to obtain more accurate audio signals and to simplify subsequent audio processing steps. The system comprises six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. A/D conversion converts the analog audio signal into a digital audio signal so that a computer can process it further. Many characteristics of the audio signal are concentrated in the high band, yet the higher the frequency, the smaller the spectrum value. The high-frequency part of the audio therefore undergoes pre-emphasis to raise the high-frequency resolution of the signal. Although the audio signal is a non-linear time-varying signal, it is short-time stationary, and the framing step makes it possible to obtain its short-time characteristics. The framed signal must then be windowed, which reduces discontinuity between frames. Endpoint detection is a necessary step: only by finding the start point and end point of the audio signal can an effective audio clip be found for further analysis and identification.

Description

An audio preprocessing system
Technical field
The present invention relates to the field of audio information processing, and more specifically to an audio preprocessing system for obtaining a more accurate audio signal.
Background art
Audio recognition technology has developed considerably and is applied in many fields, such as office and business systems, manufacturing, telecommunications and medical care. However, high recognition rates are usually obtained in relatively clean acoustic environments; in noisy environments the recognition rate drops sharply.
To address this problem, the invention discloses an audio preprocessing system. Before audio features are extracted, the collected speech signal must first be preprocessed, which comprises sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. A/D conversion converts the analog speech signal into a digital speech signal so that a computer can process it further. Many features of speech are concentrated in the high band, yet the higher the frequency, the smaller the spectrum value, so the high-frequency part of the speech must be emphasized to improve its high-frequency resolution. Although the speech signal is a non-linear time-varying signal, it is short-time stationary, and framing makes it possible to obtain its short-time characteristics. The framed speech signal must be windowed, which reduces the discontinuity between frames. Endpoint detection of the speech signal is the necessary basis of a speech recognition system: only by finding the start point and end point of the speech signal can the effective speech segment be found for further analysis and recognition.
1. Sampling
Sampling and A/D conversion can be implemented with MATLAB toolbox functions.
2. Pre-emphasis
Pre-emphasis flattens the signal spectrum, which facilitates analysis. The speech signal to be pre-emphasized is usually passed through a first-order high-pass filter, expressed as follows:
$$H(z) = 1 - \alpha z^{-1}, \qquad 0.9 < \alpha < 0.98$$
3. Framing
A frame is generally 20-30 ms long. The frame shift is the part where a frame overlaps its adjacent frame; to keep the characteristics from changing too much between frames, 1/3 or 1/2 of the frame length is generally taken as the frame shift.
4. Windowing
Windowing the speech signal means multiplying the original speech signal by a window function. Commonly used window functions are the rectangular window, the Hanning window and the Hamming window. The three window functions are, in order, as follows:
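The window formulas themselves are not reproduced in this text. For reference, the standard textbook definitions for a window of length $N$ are given below; the symbols $w(n)$, $n$ and $N$ are assumptions made here, since the patent's own notation is unavailable.

```latex
\begin{align}
w_{\mathrm{rect}}(n) &= 1, & 0 \le n \le N-1 \\
w_{\mathrm{hann}}(n) &= 0.5\left[1 - \cos\!\left(\frac{2\pi n}{N-1}\right)\right], & 0 \le n \le N-1 \\
w_{\mathrm{hamm}}(n) &= 0.54 - 0.46\cos\!\left(\frac{2\pi n}{N-1}\right), & 0 \le n \le N-1
\end{align}
```

In each case $w(n) = 0$ outside $0 \le n \le N-1$. The Hanning and Hamming windows taper toward the frame edges, which is what reduces the discontinuity between frames.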
Summary of the invention
The invention discloses an audio preprocessing system whose purpose is to obtain a more accurate audio signal and to simplify subsequent audio processing steps.
The present invention adopts the following technical solution: an audio preprocessing system comprising six steps, namely sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. The A/D conversion step converts the analog audio signal into a digital audio signal so that a computer can process it further. Many features of the audio signal are concentrated in the high band, yet the higher the frequency, the smaller the spectrum value, so the high-frequency part of the audio must be pre-emphasized to improve the high-frequency resolution of the audio signal. Although the audio signal is a non-linear time-varying signal, it is short-time stationary, and the framing step makes it possible to obtain its short-time characteristics. The framed audio signal must undergo the windowing step, which reduces the discontinuity between frames. Endpoint detection is a necessary step of the invention: only by finding the start point and end point of the audio signal can the effective audio segment be found for further analysis and recognition.
The implementation of the present invention also includes the following technical solutions:
The sampling and A/D conversion steps can be implemented with MATLAB toolbox functions.
The pre-emphasis step flattens the signal spectrum, which facilitates analysis. The audio signal to be pre-emphasized is usually passed through a first-order high-pass filter, expressed as follows:
$$H(z) = 1 - \alpha z^{-1}, \qquad 0.9 < \alpha < 0.98$$
The framing step generally takes 20-30 ms as one frame. The frame shift is the part where a frame overlaps its adjacent frame; to keep the characteristics from changing too much between frames, 1/3 or 1/2 of the frame length is generally taken as the frame shift.
The windowing step multiplies the original audio signal by a window function. Commonly used window functions are the rectangular window, the Hanning window and the Hamming window. The three window functions are, in order, as follows:
Short-time energy alone or zero-crossing rate alone is not sufficient to accurately locate the audio segment that actually needs to be analyzed, so the two are combined: the endpoint detection step uses short-time energy and zero-crossing rate together.
The advantages and beneficial effects of the present invention are as follows:
1. By preprocessing the audio signal, the present invention obtains a more accurate audio signal, which simplifies the subsequent steps of audio signal processing.
2. It improves the preprocessing efficiency of the audio signal.
Brief description of the drawings
Fig. 1 is a schematic diagram of the steps of the present invention;
Fig. 2 is a flow chart of the endpoint detection step.
Embodiment
The implementation of the present invention is further described below with reference to Fig. 1:
An audio preprocessing system comprises six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection. The steps are implemented as follows. The A/D conversion step converts the analog audio signal into a digital audio signal so that a computer can process it further. Many features of the audio signal are concentrated in the high band, yet the higher the frequency, the smaller the spectrum value, so the high-frequency part of the audio must be pre-emphasized to improve the high-frequency resolution of the audio signal. Although the audio signal is a non-linear time-varying signal, it is short-time stationary, and the framing step makes it possible to obtain its short-time characteristics. The framed audio signal must undergo the windowing step, which reduces the discontinuity between frames. Endpoint detection is a necessary step of the invention: only by finding the start point and end point of the audio signal can the effective audio segment be found for further analysis and recognition.
The sampling and A/D conversion steps can be implemented with MATLAB toolbox functions.
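The patent names only MATLAB toolbox functions for this step and does not say which ones. As a rough illustration of what sampling and A/D conversion do, the following NumPy sketch samples a sine tone and quantizes it to 16-bit integers. The function name `sample_and_quantize`, the 8000 Hz sampling rate and the 16-bit depth are assumptions for illustration, not values taken from the patent.

```python
import numpy as np

def sample_and_quantize(freq_hz, duration_s, fs=8000):
    """Sample a sine tone at fs Hz and quantize it to signed 16-bit integers.

    The sine stands in for an analog input; fs and the 16-bit depth are
    illustrative assumptions, not parameters given in the patent.
    """
    n = np.arange(int(round(duration_s * fs)))      # sample indices
    analog = np.sin(2 * np.pi * freq_hz * n / fs)   # amplitudes in [-1, 1]
    max_level = 2 ** 15 - 1                         # largest 16-bit value (32767)
    return np.round(analog * max_level).astype(np.int16)

digital = sample_and_quantize(freq_hz=440, duration_s=0.01)
```

With these settings a 10 ms tone yields 80 integer samples, which is the kind of digital signal the later steps operate on.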
The pre-emphasis step flattens the signal spectrum, which facilitates analysis. The audio signal to be pre-emphasized is usually passed through a first-order high-pass filter, expressed as follows:
$$H(z) = 1 - \alpha z^{-1}, \qquad 0.9 < \alpha < 0.98$$
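In the time domain, the first-order high-pass filter above corresponds to the difference equation y[n] = x[n] - αx[n-1]. A minimal NumPy sketch follows; the function name `pre_emphasis`, the default α = 0.97 (chosen from inside the stated range) and the handling of the first sample are assumptions.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], the time-domain form of H(z) = 1 - alpha*z^-1."""
    x = np.asarray(signal, dtype=float)
    if x.size == 0:
        return x
    # The first sample has no predecessor, so it is passed through unchanged
    # (an assumption; the patent does not specify the boundary handling).
    return np.append(x[0], x[1:] - alpha * x[:-1])

emphasized = pre_emphasis([1.0, 1.0, 1.0], alpha=0.95)
```

A constant (zero-frequency) input is attenuated to 1 - α after the first sample, while rapid sample-to-sample changes pass through largely intact, which is the high-frequency boost the text describes.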
The framing step generally takes 20-30 ms as one frame. The frame shift is the part where a frame overlaps its adjacent frame; to keep the characteristics from changing too much between frames, 1/3 or 1/2 of the frame length is generally taken as the frame shift.
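The framing step can be sketched as slicing the signal into overlapping rows. The function name `frame_signal` and its parameters are hypothetical. The sketch treats the frame shift as the step between successive frame starts; this equals the overlap described in the text only when the shift is 1/2 of the frame length, so other ratios are an interpretation rather than the patent's definition.

```python
import numpy as np

def frame_signal(signal, fs, frame_ms=25, shift_ratio=0.5):
    """Split a 1-D signal into overlapping frames, one frame per row.

    frame_ms is the frame length (20-30 ms in the text); shift_ratio is the
    frame shift as a fraction of the frame length (1/3 or 1/2 in the text).
    Trailing samples that do not fill a whole frame are dropped.
    """
    x = np.asarray(signal, dtype=float)
    frame_len = int(round(fs * frame_ms / 1000))
    hop = max(1, int(round(frame_len * shift_ratio)))
    if x.size < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (x.size - frame_len) // hop
    # Row i holds the sample indices i*hop ... i*hop + frame_len - 1.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

frames = frame_signal(np.arange(1000), fs=8000)
```

At 8000 Hz a 25 ms frame is 200 samples and the half-length shift is 100 samples, so a 1000-sample signal yields 9 frames.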
The windowing step multiplies the original audio signal by a window function. Commonly used window functions are the rectangular window, the Hanning window and the Hamming window. The three window functions are, in order, as follows:
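Since the formulas are not reproduced in this text, the following sketch relies on NumPy's built-in `np.hanning` and `np.hamming` (and `np.ones` for the rectangular window) rather than on the patent's own expressions. The `WINDOWS` table and the function name `apply_window` are assumptions for illustration.

```python
import numpy as np

# Map each window name used in the text to a NumPy function that
# returns the window samples for a given length.
WINDOWS = {
    "rectangular": np.ones,
    "hanning": np.hanning,
    "hamming": np.hamming,
}

def apply_window(frames, kind="hamming"):
    """Multiply every frame (one per row) by the chosen window function."""
    frames = np.asarray(frames, dtype=float)
    window = WINDOWS[kind](frames.shape[-1])
    return frames * window  # broadcasts the window across all rows

windowed = apply_window(np.ones((2, 5)), kind="hamming")
```

The rectangular window leaves frames unchanged, whereas the Hanning and Hamming windows scale the frame edges down toward 0 and 0.08 respectively, which softens the jump between neighbouring frames.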
The implementation of the endpoint detection step is further described below with reference to Fig. 2:
Short-time energy alone or zero-crossing rate alone is not sufficient to accurately locate the audio segment that actually needs to be analyzed, so the two are combined: endpoint detection uses short-time energy and zero-crossing rate together. Taking a speech signal as an example: a speech signal generally comprises silent segments, unvoiced segments and voiced segments. Voiced segments are the effective speech produced by vocal cord vibration and contain the most energy; unvoiced segments are produced by the impact of air in the oral cavity and contain the second most energy; silent segments contain the least energy. Experiments also show that unvoiced segments have the highest zero-crossing rate, while voiced segments have a lower one.
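The two per-frame features can be computed as follows. This is a sketch under common textbook definitions, which the patent does not spell out: energy as the sum of squared samples in a frame, and zero-crossing rate as the count of sign changes in a frame. The function names are hypothetical.

```python
import numpy as np

def short_time_energy(frames):
    """Sum of squared samples in each frame (one frame per row)."""
    frames = np.asarray(frames, dtype=float)
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Number of sign changes between consecutive samples in each frame."""
    frames = np.asarray(frames, dtype=float)
    signs = np.where(frames >= 0, 1, -1)  # treat zero as positive
    return np.sum(signs[:, 1:] != signs[:, :-1], axis=1)

demo = np.array([[1.0, -1.0, 1.0, -1.0],   # rapidly alternating frame
                 [2.0, 2.0, 2.0, 2.0]])    # constant, higher-amplitude frame
energy = short_time_energy(demo)
zcr = zero_crossing_rate(demo)
```

In the demo, the alternating frame has the higher zero-crossing count and the constant frame has the higher energy, mirroring the contrast between unvoiced and voiced segments described above.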
During detection, four thresholds are set in total: two low thresholds, Tel (energy low threshold) and Tzl (zero-crossing rate low threshold), and two high thresholds, Teh (energy high threshold) and Tzh (zero-crossing rate high threshold). The low thresholds are set to small values; they are sensitive to the signal and easily exceeded. The high thresholds are set to large values, which the signal reaches only when it is fairly strong. A signal that exceeds only the low thresholds may therefore be brief noise; only a signal that reaches the high thresholds can be identified as speech. The detailed flow is shown in Fig. 2. When the energy of the speech signal exceeds Teh and the zero-crossing rate exceeds Tzh, the speech segment has formally begun. The end of the speech segment is determined when the energy falls below Tel and the zero-crossing rate falls below Tzl. Because these thresholds are set, misjudgments are relatively reduced.
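Fig. 2 is not reproduced here, so the control flow below is an assumption built only from the text: a segment starts at the first frame whose energy exceeds Teh and whose zero-crossing rate exceeds Tzh, and ends at the first later frame whose energy is below Tel and whose zero-crossing rate is below Tzl. The function name `detect_endpoints` and the sample values are hypothetical.

```python
def detect_endpoints(energy, zcr, tel, teh, tzl, tzh):
    """Return (start_frame, end_frame) pairs, with end_frame exclusive.

    energy and zcr are per-frame sequences of equal length; tel/teh are the
    low/high energy thresholds and tzl/tzh the low/high zero-crossing
    thresholds named in the text.
    """
    segments = []
    start = None  # index of the frame where the current segment began
    for i, (e, z) in enumerate(zip(energy, zcr)):
        if start is None:
            # Outside a segment: both high thresholds must be exceeded.
            if e > teh and z > tzh:
                start = i
        elif e < tel and z < tzl:
            # Inside a segment: end only when both values drop below the low thresholds.
            segments.append((start, i))
            start = None
    if start is not None:
        # The signal ended while still inside a segment.
        segments.append((start, len(energy)))
    return segments

energy = [0.1, 0.2, 5.0, 6.0, 0.8, 4.0, 0.1, 0.1]
zcr = [1, 1, 9, 8, 2, 7, 1, 1]
segments = detect_endpoints(energy, zcr, tel=0.5, teh=3.0, tzl=3, tzh=5)
```

The gap between the low and high thresholds gives hysteresis: in the sample data, frame 4 dips below the high thresholds but stays above Tel, so the segment is not cut short and a single segment covering frames 2 to 5 is reported.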
Any similar technical solution that uses the technical solution of the present invention, or that a person skilled in the art designs under its inspiration, and that achieves the above technical effect, falls within the protection scope of the present invention.

Claims (6)

1. An audio preprocessing system, characterized by six steps: sampling, A/D conversion, pre-emphasis, framing, windowing and endpoint detection, implemented as follows: the A/D conversion step converts the analog audio signal into a digital audio signal so that a computer can process it further; the high-frequency part of the audio undergoes pre-emphasis; the framing step obtains the short-time characteristics of the audio signal; the framed audio signal must undergo the windowing step; and the endpoint detection step finds the start point and end point of the audio signal.
2. The audio preprocessing system according to claim 1, characterized in that the sampling and A/D conversion steps can be implemented with MATLAB toolbox functions.
3. The audio preprocessing system according to claim 1, characterized in that the pre-emphasis step passes the audio signal to be pre-emphasized through a first-order high-pass filter.
4. The audio preprocessing system according to claim 1, characterized in that the framing step generally takes 20-30 ms as one frame, the frame shift is the part where a frame overlaps its adjacent frame, and, to keep the characteristics from changing too much between frames, 1/3 or 1/2 of the frame length is generally taken as the frame shift.
5. The audio preprocessing system according to claim 1, characterized in that the windowing step multiplies the original audio signal by a window function.
6. The audio preprocessing system according to claim 1, characterized in that the endpoint detection step is implemented using short-time energy and zero-crossing rate together.
CN201410141017.9A 2014-04-09 2014-04-09 Audio preprocessing system Pending CN104112453A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410141017.9A CN104112453A (en) 2014-04-09 2014-04-09 Audio preprocessing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410141017.9A CN104112453A (en) 2014-04-09 2014-04-09 Audio preprocessing system

Publications (1)

Publication Number Publication Date
CN104112453A true CN104112453A (en) 2014-10-22

Family

ID=51709211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410141017.9A Pending CN104112453A (en) 2014-04-09 2014-04-09 Audio preprocessing system

Country Status (1)

Country Link
CN (1) CN104112453A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109507510A (en) * 2018-11-28 2019-03-22 深圳桓轩科技有限公司 A kind of transformer fault diagnosis system
CN112786071A (en) * 2021-01-13 2021-05-11 国家电网有限公司客户服务中心 Data annotation method for voice segments of voice interaction scene
CN112863546A (en) * 2021-01-21 2021-05-28 安徽理工大学 Belt conveyor health analysis method based on audio characteristic decision
CN113327590A (en) * 2021-04-15 2021-08-31 中标软件有限公司 Speech recognition method
CN114007176A (en) * 2020-10-09 2022-02-01 上海又为智能科技有限公司 Audio signal processing method, apparatus and storage medium for reducing signal delay

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109507510A (en) * 2018-11-28 2019-03-22 深圳桓轩科技有限公司 A kind of transformer fault diagnosis system
CN114007176A (en) * 2020-10-09 2022-02-01 上海又为智能科技有限公司 Audio signal processing method, apparatus and storage medium for reducing signal delay
CN114007176B (en) * 2020-10-09 2023-12-19 上海又为智能科技有限公司 Audio signal processing method, device and storage medium for reducing signal delay
CN112786071A (en) * 2021-01-13 2021-05-11 国家电网有限公司客户服务中心 Data annotation method for voice segments of voice interaction scene
CN112863546A (en) * 2021-01-21 2021-05-28 安徽理工大学 Belt conveyor health analysis method based on audio characteristic decision
CN113327590A (en) * 2021-04-15 2021-08-31 中标软件有限公司 Speech recognition method

Similar Documents

Publication Publication Date Title
CN104112453A (en) Audio preprocessing system
US10090005B2 (en) Analog voice activity detection
WO2018145584A1 (en) Voice activity detection method and voice recognition method
CN103117067B (en) Voice endpoint detection method under low signal-to-noise ratio
CN103886871A (en) Detection method of speech endpoint and device thereof
US9454976B2 (en) Efficient discrimination of voiced and unvoiced sounds
TWI569263B (en) Method and apparatus for signal extraction of audio signal
US20140067388A1 (en) Robust voice activity detection in adverse environments
CN104992714A (en) Motor abnormal sound detection method
CN107274911A (en) A kind of similarity analysis method based on sound characteristic
CN104409078A (en) Abnormal noise detection and recognition system
CN103297590A (en) Method and system for achieving equipment unlocking based on voice frequency
CN108364637A (en) A kind of audio sentence boundary detection method
CN109377982B (en) Effective voice obtaining method
CN108735230B (en) Background music identification method, device and equipment based on mixed audio
WO2020186695A1 (en) Voice information batch processing method and apparatus, computer device, and storage medium
EP3503093A1 (en) Method for associating a device with a speaker in a gateway, corresponding computer program computer and apparatus
Song et al. Feature extraction and classification for audio information in news video
CN104240705A (en) Intelligent voice-recognition locking system for safe box
CN205451769U (en) Wear speech recognition system of smart machine and wear smart machine
CN104766610A (en) Voice recognition system and method based on vibration
CN205029873U (en) Detection device based on time domain analysis
CN106653040A (en) Voice audio signal sampling processing method
CN103997381B (en) The identification of examination hall cheating signal intelligent and evidence obtaining method of reducing
CN107833582B (en) Arc length-based voice signal endpoint detection method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20141022