CN203288240U

CN203288240U - Speech endpoint detection system based on DSP

Info

Publication number: CN203288240U
Application number: CN201320097898XU
Authority: CN
Inventors: 张梅
Original assignee: Anhui University of Science and Technology
Current assignee: Anhui University of Science and Technology
Priority date: 2013-03-04
Filing date: 2013-03-04
Publication date: 2013-11-13
Anticipated expiration: 2023-03-04

Abstract

The utility model discloses a DSP-based speech endpoint detection system. The system comprises a TMS320VC5416 DSP as the core processing unit, a TLV320AIC23 speech codec chip, a PC, a power supply circuit, a reset circuit and a clock circuit. The signal output of the TLV320AIC23 chip is connected to the signal input of the TMS320VC5416 DSP. The TMS320VC5416 DSP is connected to the PC through a TL16C550 asynchronous serial transceiver and a MAX232 level conversion chip. The collected speech signal is input through the TLV320AIC23 chip, converted into a digital signal and sent to the TMS320VC5416 DSP, which runs a speech endpoint detection algorithm based on a fuzzy neural network. The processed data is then sent to the PC by serial communication through the TL16C550 and the MAX232. An SRAM memory and a FLASH memory are also added to the system. Because the system uses the TMS320VC5416 DSP as its core chip and a fuzzy neural network to implement the speech endpoint detection algorithm, it offers good adaptability, high real-time performance and strong robustness.

Description

A DSP-based speech endpoint detection system

Technical field

The utility model relates generally to endpoint detection of speech signals, and in particular to a DSP-based speech endpoint detection system.

Background art

Speech recognition technology is gradually moving into practical application, and the stability and robustness of speech recognition have become a focus of speech recognition research. A speech recognition system of practical value must be able to adapt to a variety of noise environments. Current speech recognition systems do not yet meet this requirement: their performance degrades sharply in noise, and one of the main reasons is that speech endpoint detection is not accurate enough. An accurate, reliable and robust speech endpoint detection algorithm is therefore indispensable in a speech recognition system. An ideal endpoint detection algorithm should be accurate, reliable, robust, adaptive and capable of real-time operation. Of these properties, robustness is the hardest to achieve. Designing a robust endpoint detection algorithm for noisy backgrounds therefore remains a problem to be solved.

Traditional speech detection methods, such as those based on energy and zero-crossing rate, are the classical approaches. In quiet conditions or at low noise levels they discriminate satisfactorily, but when the voiced and unvoiced sounds in the speech signal are comparable in level to the noise, it becomes very difficult to distinguish speech from noise.
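The classical energy and zero-crossing-rate approach mentioned above can be sketched as follows. This is a minimal illustration, not the method of the utility model: the frame length, hop size, all three thresholds and the function names are assumptions chosen for the example.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len, advancing by hop (partial tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def find_endpoints(samples, frame_len, hop, energy_high, energy_low, zcr_thresh):
    """Return (first, last) indices of frames judged to be speech, or None.

    A frame counts as speech if its energy is high (typical of voiced sounds),
    or if its energy is moderate and its zero-crossing rate is high
    (typical of unvoiced sounds). All thresholds are illustrative assumptions.
    """
    active = []
    for i, frame in enumerate(frame_signal(samples, frame_len, hop)):
        energy = short_time_energy(frame)
        zcr = zero_crossing_rate(frame)
        if energy > energy_high or (energy > energy_low and zcr > zcr_thresh):
            active.append(i)
    return (active[0], active[-1]) if active else None


# Synthetic input: silence, a 200 Hz tone sampled at 8 kHz, then silence.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
signal = [0.0] * 400 + tone + [0.0] * 400
endpoints = find_endpoints(signal, frame_len=100, hop=100,
                           energy_high=1.0, energy_low=0.1, zcr_thresh=0.3)
```

With fixed thresholds like these, added noise of comparable energy would push silent frames over the limits, which is the weakness the paragraph above describes.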

To cope with noise, methods that combine speech enhancement with endpoint detection have appeared recently. Instead of performing endpoint detection directly, these methods work in two steps: speech enhancement followed by endpoint detection. Speech enhancement is a technique that improves the clarity and intelligibility of speech corrupted by additive noise. After enhancement, the signal-to-noise ratio of the noisy speech is raised and the characteristic features of the speech stand out, which improves the success rate of endpoint detection. Although the above methods have each achieved some improvement in their own experimental settings, their performance in strong noise, the choice of thresholds during implementation, and the avoidance of the effects of impulsive interference noise all need further research and verification.

In recent years, researchers have proposed various characteristic parameters, and parameters derived from them, that can distinguish speech from noise, in order to improve the noise immunity of the algorithms. Cepstral coefficients, the autocorrelation function, short-time band variance and information entropy have all been applied to endpoint detection, and more often several of these techniques are combined for speech detection.
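As an illustration of one such parameter, the sketch below computes a spectral form of information entropy for a single frame. The text names the feature only; the specific formulation (entropy of the normalized power spectrum, computed with a direct DFT) and the function names are assumptions made for this example.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of DFT bins 0..N/2, computed directly (fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Entropy (in nats) of the power spectrum treated as a probability distribution."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent frame: no distribution to measure
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A pure tone concentrates power in one bin (low entropy);
# an impulse spreads power evenly over all bins (maximum entropy).
tone_entropy = spectral_entropy([math.sin(2 * math.pi * 4 * t / 32) for t in range(32)])
impulse_entropy = spectral_entropy([1.0] + [0.0] * 31)
```

Frames with harmonic structure, such as voiced speech, tend to score lower than broadband noise, which is why a measure of this kind can serve as a speech/noise feature.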

On the hardware side, DSPs from TI are most often chosen as the core chip for implementing speech endpoint detection. In the low-end, low-cost range, TI's TMS320C2000 (C24x and C28x) DSP series is widely used in automobiles, large equipment, hard disk drives, modems and consumer electronics. The mid-range processor family consists mainly of TI's TMS320C5000 (C54x and C55x) DSPs. Processors at this level achieve higher performance through higher clock rates and more complex architectures. They raise performance considerably while reducing power consumption, and are therefore often used in portable equipment such as mobile phones, wireless devices, digital cameras, audio/video players and digital hearing aids. In the high-end processor family, TI's TMS320C6000 (C62x, C64x and C67x) DSPs achieve still better performance through advanced architectures such as very long instruction word (VLIW). Their demands on program space and power consumption, however, are severe, so this class of processor is usually applied in high-end video systems, radar systems, communication base stations and high-bit-rate real-time video coding systems. In summary, in terms of both cost and performance, the mid-range processor family can meet the design requirements for implementing a speech detection system on a DSP platform.

Summary of the utility model

The purpose of the utility model is to remedy the shortcomings of the prior art by providing a DSP-based speech endpoint detection system that can detect the endpoints of a speech signal and send the detection data to a PC, with advantages such as high detection accuracy and fast, timely response.

To achieve this purpose, the utility model adopts the following technical solution:

A DSP-based speech endpoint detection system, characterized in that it comprises a TMS320VC5416 DSP as the core processing unit, a TLV320AIC23 speech codec chip and a PC. A speech endpoint detection algorithm based on a fuzzy neural network is implemented in assembly language in the TMS320VC5416 DSP. The signal output of the TLV320AIC23 chip is connected to the signal input of the TMS320VC5416 DSP, and the signal output of the TMS320VC5416 DSP communicates with the PC through a TL16C550 asynchronous serial transceiver and a MAX232 level conversion chip. A power supply circuit, a reset circuit and a clock circuit are also connected as peripherals of the TMS320VC5416 DSP. The speech signal collected by the system is input through the TLV320AIC23 chip, converted into a digital signal and sent to the TMS320VC5416 DSP for the endpoint detection computation; the data processed by the TMS320VC5416 DSP is then transferred to the PC by serial communication through the TL16C550 asynchronous serial transceiver and the MAX232 level conversion chip. The system is further expanded with one SRAM memory and one FLASH memory: the SRAM is used to store voice data, and the FLASH is used to store the program for offline operation.

The DSP is the TMS320VC5416, a low-power, high-performance fixed-point DSP produced by TI, suited to speech processing, wired and wireless communication, portable information systems and similar applications.

The speech codec module is the TLV320AIC23 chip developed by TI. It converts external analog speech into a digital signal for input to the DSP, and also converts internal digital speech into analog output.

The SRAM memory is the IS61LV6416 from ICS, with a capacity of 64K × 16 bit; the FLASH memory is the SST39VF400A from SST, with a capacity of 256K × 16 bit.

Communication between the TMS320VC5416 and the PC is implemented with the TL16C550 asynchronous serial transceiver and the MAX232. The TL16C550 performs parallel/serial data conversion and tasks such as setting the baud rate for serial transmission, and the MAX232 performs level conversion.
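The baud rate setting mentioned above can be illustrated with the usual 16550-style divisor calculation, in which the baud rate equals the UART input clock divided by 16 times a 16-bit divisor. The sketch below is a host-side calculation only; the 1.8432 MHz clock and the 9600 baud target are assumed example values, since the document does not state the clock or baud rate actually used.

```python
def uart_divisor(clock_hz, baud):
    """Divisor for a 16550-style UART: baud = clock_hz / (16 * divisor)."""
    divisor = round(clock_hz / (16 * baud))
    if not 1 <= divisor <= 0xFFFF:
        raise ValueError("divisor does not fit the 16-bit divisor latch")
    return divisor


def divisor_latch_bytes(divisor):
    """Split the divisor into (low byte, high byte) for the two latch registers."""
    return divisor & 0xFF, (divisor >> 8) & 0xFF


def actual_baud(clock_hz, divisor):
    """Baud rate actually produced by a given divisor."""
    return clock_hz / (16 * divisor)


# Assumed example values, not taken from the document.
CLOCK_HZ = 1_843_200
divisor = uart_divisor(CLOCK_HZ, 9600)
latch_low, latch_high = divisor_latch_bytes(divisor)
```

On the real device the two bytes would be written to the divisor latch registers while the divisor latch access bit in the line control register is set; that register-level sequence is omitted here.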

The principle of the utility model is as follows:

The speech signal is input from MIC or LINE IN and collected by the TLV320AIC23, converted into a digital signal, and sent to the TMS320VC5416 DSP, which runs the speech endpoint detection algorithm based on a fuzzy neural network. The basic idea of the algorithm is as follows: wavelet analysis is first used to extract features from the speech signal; these features are then fed as inputs to the fuzzy neural network, which processes them and finally determines whether the signal is speech or noise.
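The two stages described above, wavelet feature extraction followed by a fuzzy decision, can be sketched roughly as below. The document does not specify the wavelet, the features, the network structure or its parameters, so everything concrete here is an assumption for illustration: the Haar wavelet, sub-band energies as features, Gaussian membership functions, product rule firing and the two hand-written rules. In an actual fuzzy neural network the rule parameters would be learned from labeled data.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    detail = [(a - b) / math.sqrt(2) for a, b in pairs]
    return approx, detail


def wavelet_band_energies(frame, levels=3):
    """Energies of the detail bands (finest first) plus the final approximation band."""
    energies = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


def gaussian_membership(x, center, width):
    """Degree (0..1) to which x belongs to a fuzzy set centered at center."""
    return math.exp(-((x - center) / width) ** 2)


def fuzzy_speech_score(features, rules):
    """Weighted average of rule outputs, weighted by each rule's firing strength.

    Each rule is (centers, widths, output); its firing strength is the product
    of the per-feature Gaussian memberships.
    """
    numerator = denominator = 0.0
    for centers, widths, output in rules:
        strength = 1.0
        for x, c, w in zip(features, centers, widths):
            strength *= gaussian_membership(x, c, w)
        numerator += strength * output
        denominator += strength
    return numerator / denominator if denominator > 0.0 else 0.0


# Hypothetical hand-set rules over log-compressed band energies:
# "all bands near 0 -> noise (0.0)" and "all bands near 2 -> speech (1.0)".
RULES = [
    ([0.0] * 4, [1.0] * 4, 0.0),
    ([2.0] * 4, [1.0] * 4, 1.0),
]


def frame_features(frame):
    """Log-compressed wavelet band energies used as inputs to the fuzzy stage."""
    return [math.log10(1.0 + e) for e in wavelet_band_energies(frame)]
```

A frame would be classified by computing `fuzzy_speech_score(frame_features(frame), RULES)` and comparing the score with a threshold such as 0.5, itself an assumed value.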

The data processed by the TMS320VC5416 DSP is then transferred to the PC by serial communication through the TL16C550 and the MAX232.

The utility model has the following beneficial effects:

(1) The utility model uses the TMS320VC5416 DSP as its core chip, giving it high reliability, real-time performance and adaptability.

(2) A fuzzy neural network is used to implement the speech endpoint detection algorithm, giving advantages such as good adaptability and strong robustness.

(3) The TMS320VC5416 DSP is connected to the PC through the TL16C550 and the MAX232, enabling real-time communication.

Description of the drawings

Fig. 1 is a structural block diagram of the utility model.

Fig. 2 is the circuit diagram of the connection between the TLV320AIC23 and the TMS320VC5416 of the utility model.

Embodiment

As shown in Fig. 1, a DSP-based speech endpoint detection system comprises a TMS320VC5416 DSP 1 as the core processing unit, a TLV320AIC23 speech codec chip 2 and a PC 10. The signal output of the TLV320AIC23 chip 2 is connected to the signal input of the TMS320VC5416 DSP 1, and the TMS320VC5416 DSP 1 is connected to the PC 10 through a TL16C550 asynchronous serial transceiver 8 and a MAX232 level conversion chip 9. A power supply circuit 3, a reset circuit 4 and a clock circuit 5 are also connected as peripherals of the TMS320VC5416 DSP 1. The speech signal collected by the system is input through the TLV320AIC23 chip 2, converted into a digital signal and sent to the TMS320VC5416 DSP 1, which runs the speech endpoint detection algorithm based on a fuzzy neural network; the data processed by the TMS320VC5416 DSP 1 is then transferred to the PC 10 by serial communication through the TL16C550 asynchronous serial transceiver 8 and the MAX232 level conversion chip 9. The system is further expanded with one SRAM memory 6 and one FLASH memory 7: the SRAM is used to store voice data, and the FLASH is used to store the program for offline operation.

Claims

1. A DSP-based speech endpoint detection system, characterized in that: it comprises a TMS320VC5416 DSP as the core processing unit, a TLV320AIC23 speech codec chip and a PC; the signal output of the TLV320AIC23 chip is connected to the signal input of the TMS320VC5416 DSP, and the signal output of the TMS320VC5416 DSP communicates with the PC through a TL16C550 asynchronous serial transceiver and a MAX232 level conversion chip; a power supply circuit, a reset circuit and a clock circuit are also connected as peripherals of the TMS320VC5416 DSP; the speech signal collected by the system is input through the TLV320AIC23 chip, converted into a digital signal and sent to the TMS320VC5416 DSP for the endpoint detection computation, and the data processed by the TMS320VC5416 DSP is then transferred to the PC by serial communication through the TL16C550 asynchronous serial transceiver and the MAX232 level conversion chip; the system is further expanded with one SRAM memory and one FLASH memory, the SRAM memory being used to store voice data.