CN107274901A

CN107274901A - A far-field voice interaction device

Info

Publication number: CN107274901A
Application number: CN201710680172.1A
Authority: CN
Inventors: 徐坤
Original assignee: Huzhou Golden Soft Electronic Technology Co Ltd
Current assignee: Huzhou Golden Soft Electronic Technology Co Ltd
Priority date: 2017-08-10
Filing date: 2017-08-10
Publication date: 2017-10-20

Abstract

The invention provides a far-field voice interaction device comprising a voice pickup module, a front-end amplification module, a processor, a first memory, a second memory, a wireless communication module, an indicator lamp, and a power module. The voice pickup module picks up voice signals in the environment. The front-end amplification module filters and amplifies the voice signals picked up by the voice pickup module. The first memory stores low-level instructions such as the speech algorithm code. The second memory stores upper-level instructions such as the configuration information of peripheral intelligent hardware devices and the activation words. The processor executes the instructions in the first memory and the second memory. The wireless communication module connects to the peripheral intelligent hardware devices. The indicator lamp is an LED. The power module supplies power to each of the above components. The invention adopts a highly integrated, cost-effective design and provides functions such as omnidirectional wake-up, sound source localization, directional pickup, noise suppression, dereverberation, echo cancellation, and far-field speech recognition.

Description

A far-field voice interaction device

Technical field

The present invention relates to the field of artificial intelligence technology, and more particularly to a far-field voice interaction device.

Background technology

Intelligent hardware refers to novel intelligent terminal products and services that are built on platform-level software and hardware, are characterized by new-generation information technologies such as intelligent sensing, interconnection, human-machine interaction, new display technology, and big data processing, and take newly designed, new-material hardware as their carrier. As the technology advances, the supporting infrastructure improves, and the application service market matures, the product forms of intelligent hardware have extended from smartphones to smart wearables, smart homes, intelligent vehicles, medical and health care, intelligent unmanned systems, and so on, becoming the point at which information technology merges with traditional industries.

At present, intelligent hardware products are widely criticized in many scenarios because the voice interaction experience falls short of expectations. The main reason is a change in the usage scenario: when a user switches from Siri on a mobile phone to a device such as a smart speaker, the environment the microphone faces changes completely, much like the difference between two people whispering to each other and shouting to each other. Voice interaction is limited by multiple complicating factors, such as background noise, interfering speech from other people, echo, and reverberation, which lead to the obvious pain points of short recognition distance and low recognition rate.

Summary of the invention

(1) Technical problem solved

To solve the above technical problem, the invention provides a far-field voice interaction device that adopts a highly integrated, cost-effective design and provides functions such as omnidirectional wake-up, sound source localization, directional pickup, noise suppression, dereverberation, echo cancellation, and far-field speech recognition.

(2) Technical solution

A far-field voice interaction device, comprising a voice pickup module, a front-end amplification module, a processor, a first memory, a second memory, a wireless communication module, an indicator lamp, and a power module;

The voice pickup module picks up voice signals in the environment;

The front-end amplification module filters and amplifies the voice signals picked up by the voice pickup module;

The first memory stores low-level instructions such as the speech algorithm code, and the user cannot modify the low-level instructions;

The second memory stores upper-level instructions such as the configuration information of peripheral intelligent hardware devices and the activation words, and the user can modify the upper-level instructions;

The processor executes the instructions in the first memory and the second memory;

The wireless communication module connects to the peripheral intelligent hardware devices;

The indicator lamp is an LED;

The power module supplies power to each of the above components.

Further, the voice pickup module is a microphone array, and the number of microphones in the array is 6.

Further, the microphone array is arranged on the PCB in a ring with a diameter of 8 cm.

Further, the first memory is DDR3, and the speech algorithms stored in the first memory include voice activity detection, voice wake-up, echo cancellation, and low-SNR and reverberation processing.

Further, the second memory is eMMC.

Further, the processor is a Cypress CYW43438.

Further, the wireless communication module is one or a combination of an infrared module, a Bluetooth module, and a Wi-Fi module.

(3) Beneficial effects

The invention provides a far-field voice interaction device that adopts a highly integrated, cost-effective design and provides functions such as omnidirectional wake-up, sound source localization, directional pickup, noise suppression, dereverberation, echo cancellation, and far-field speech recognition. It is widely applicable to intelligent hardware devices such as smart speakers, DOT, and TV boxes.

Brief description of the drawings

Fig. 1 is a system block diagram of the far-field voice interaction device of the present invention.

Fig. 2 is a schematic diagram of the PCB layout of the far-field voice interaction device of the present invention.

Fig. 3 is a flow chart of the speech algorithms of the far-field voice interaction device of the present invention.

Embodiment

The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Embodiment 1：

As shown in Fig. 1, a far-field voice interaction device comprises a voice pickup module, a front-end amplification module, a processor, a first memory, a second memory, a wireless communication module, an indicator lamp, and a power module;

The voice pickup module picks up voice signals in the environment;

The front-end amplification module filters and amplifies the voice signals picked up by the voice pickup module;

The first memory stores low-level instructions such as the speech algorithm code, and the user cannot modify the low-level instructions;

The second memory stores upper-level instructions such as the configuration information of peripheral intelligent hardware devices and the activation words, and the user can modify the upper-level instructions;

The processor executes the instructions in the first memory and the second memory;

The indicator lamp is an LED;

The power module supplies power to each of the above components.

Embodiment 2：

The operating principle of the device is described with reference to Fig. 2 and Fig. 3.

The voice pickup module is a microphone array. A microphone array is a system composed of a certain number of acoustic sensors in a certain spatial configuration, used to sample and process the spatial characteristics of a sound field. Linear, circular, and spherical microphone arrays do not differ greatly in principle; their spatial configurations differ, so the spatial ranges they can resolve also differ. For example, in sound source localization, a linear array has only one-dimensional information and can resolve only 180 degrees; a circular array is a planar array with two-dimensional information and can resolve 360 degrees; a spherical array is a three-dimensional array with three-dimensional information and can resolve 360 degrees of azimuth and 180 degrees of elevation. Furthermore, the more microphones there are, the higher the accuracy of locating the speaker, but the difference in localization accuracy shows up mainly at longer interaction distances; if the interaction distance is short, the difference in localization performance between 5 and 8 microphones is small. In addition, the more microphones there are, the more finely the beams can partition space and the higher the pickup quality in noisy environments, but in an ordinary quiet indoor environment the difference in recognition rate between 5 and 8 microphones is small. At the same time, the more microphones there are, the higher the cost.

Taking these factors together, the microphone array uses 6 microphones arranged on the PCB in a ring with a diameter of 8 cm. This provides both high localization accuracy and high pickup quality over 360 degrees while keeping development cost moderate and favoring a miniaturized device.
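
The geometry described above can be illustrated with a minimal sketch. It assumes the 6 microphones are evenly spaced on the 8 cm ring, a far-field (plane-wave) source, and a speed of sound of 343 m/s; the function names are hypothetical and none of these details are stated in the text.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
NUM_MICS = 6            # number of microphones, from the text
DIAMETER_M = 0.08       # 8 cm ring, from the text


def mic_positions(num_mics=NUM_MICS, diameter=DIAMETER_M):
    """Return (x, y) coordinates of microphones evenly spaced on a ring (assumed layout)."""
    radius = diameter / 2.0
    return [
        (radius * math.cos(2 * math.pi * k / num_mics),
         radius * math.sin(2 * math.pi * k / num_mics))
        for k in range(num_mics)
    ]


def arrival_delays(azimuth_deg, positions, c=SPEED_OF_SOUND):
    """Plane-wave delay of each microphone relative to the array centre, in seconds.

    A negative value means the wavefront reaches that microphone before the centre.
    """
    theta = math.radians(azimuth_deg)
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector pointing toward the source
    return [-(x * ux + y * uy) / c for x, y in positions]
```

Under these assumptions the largest delay between two opposite microphones is 0.08 / 343 ≈ 0.233 ms. Because the pattern of delays is different for every azimuth around the ring, a planar circular array can in principle resolve the full 360 degrees.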

The first memory is DDR3, a memory product of the SDRAM family that offers higher operating efficiency and lower voltage than DDR2, with lower power consumption and heat generation.

The speech algorithms stored in the DDR3 include voice activity detection, voice wake-up, echo cancellation, and low-SNR and reverberation processing.

Voice activity detection determines when speech is present in the environment and when it is not. Subsequent speech signal processing is performed only on the valid speech segments extracted in this step, which significantly reduces the amount of computation and also reduces cases such as noise being misrecognized as speech.
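
As an illustration only, the sketch below uses a simple frame-energy threshold to pick out speech segments. The text does not say which detection method the device uses; the frame length, threshold, and hangover values are assumptions, and the function names are hypothetical.

```python
import math


def frame_energy_db(frame):
    """Mean power of one frame in dB relative to a full-scale amplitude of 1.0."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def detect_speech_segments(samples, frame_len=160, threshold_db=-35.0, hangover=2):
    """Return (start, end) sample ranges judged to contain speech.

    A frame is active when its energy exceeds threshold_db. A segment stays
    open for up to `hangover` quiet frames so that short pauses do not split it.
    """
    segments = []
    start = None
    quiet = 0
    num_frames = len(samples) // frame_len
    for i in range(num_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if frame_energy_db(frame) > threshold_db:
            if start is None:
                start = i * frame_len
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                # Close the segment at the end of the last active frame.
                segments.append((start, (i - quiet + 1) * frame_len))
                start = None
                quiet = 0
    if start is not None:
        segments.append((start, (num_frames - quiet) * frame_len))
    return segments
```

Only the returned sample ranges would be passed on to later stages, which is where the reduction in computation comes from.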

Voice wake-up is the main trigger for human-machine interaction. It operates after voice activity detection has detected a speech signal and determines whether the speech signal contains a pre-stored activation word. If it does, the subsequent speech signal continues to be recognized; otherwise, the subsequent speech is not processed.
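
The gating behavior can be sketched as follows. Text strings stand in for the output of a keyword spotter, which is an assumption made for illustration; the text does not describe how the activation word is actually detected, and the function name is hypothetical.

```python
def gate_by_activation_word(utterances, activation_words):
    """Keep only the speech that follows an activation word; drop everything else.

    `utterances` are text strings standing in for spotted speech (hypothetical).
    Returns a list of (activation_word, remaining_text) pairs.
    """
    words = [w.lower() for w in activation_words]
    accepted = []
    for text in utterances:
        lowered = text.lower()
        for w in words:
            pos = lowered.find(w)
            if pos != -1:
                # Everything after the activation word goes on to recognition.
                remainder = text[pos + len(w):].strip(" ,.")
                accepted.append((w, remainder))
                break
    return accepted
```

An utterance that contains no stored activation word produces no entry, which corresponds to the subsequent speech not being processed.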

Echo cancellation is a term from full-duplex communication; here it means that the device can pick up sound while it is playing audio. The difficulty of echo cancellation is that it must be balanced and traded off against the acoustic design of intelligent hardware devices such as smart speakers.
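
The text does not name the echo cancellation algorithm. One common approach, shown here purely as an illustrative stand-in, is a normalized least-mean-squares (NLMS) adaptive filter that estimates the loudspeaker echo from the playback signal and subtracts it from the microphone signal. The filter length and step size are assumptions.

```python
def nlms_echo_cancel(mic, reference, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic signal.

    mic       -- microphone samples (echo plus any near-end speech)
    reference -- samples being played by the loudspeaker
    Returns the echo-reduced signal, one sample per input sample.
    """
    w = [0.0] * taps    # adaptive filter coefficients
    buf = [0.0] * taps  # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # estimated echo
        e = d - y                                   # echo-reduced output
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out
```

When only echo is present, the output decays toward zero as the filter converges; with a talker present, the talker's speech would remain in the output. A real device would also need to handle double-talk and much longer echo paths, which this sketch does not.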

Low SNR and reverberation. Sometimes the background noise in the environment is very loud, for example when a television is on at home or inside a car, so the speech quality deteriorates; that is, the signal-to-noise ratio drops. In addition, the reverberation formed by wall reflections in a home environment has a non-negligible effect on speech quality. To enhance the speech signal and improve its SNR, a deep neural network is used to model, by regression fitting, the complex relationship between noisy speech and clean speech. This method is based on a minimum mean-square-error criterion on the log power spectrum, and multi-frame expansion greatly helps to improve the quality and continuity of the enhanced speech.
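
The sketch below illustrates only the three ingredients named above: log-power-spectrum features, multi-frame expansion of the network input, and the mean-square-error criterion. The network itself is omitted; the naive DFT, the context width, and all function names are assumptions made for illustration.

```python
import cmath
import math


def log_power_spectrum(frame):
    """Log power spectrum of one frame via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(math.log(abs(acc) ** 2 + 1e-10))
    return spectrum


def expand_context(features, context=2):
    """Concatenate each frame with `context` neighbours on each side (edge frames repeat)."""
    last = len(features) - 1
    expanded = []
    for i in range(len(features)):
        row = []
        for j in range(i - context, i + context + 1):
            row.extend(features[min(max(j, 0), last)])
        expanded.append(row)
    return expanded


def mse(predicted, target):
    """Mean squared error between two equally shaped lists of log-power frames."""
    total = sum((p - t) ** 2 for pf, tf in zip(predicted, target) for p, t in zip(pf, tf))
    count = sum(len(tf) for tf in target)
    return total / count
```

In such a setup, the expanded noisy features would be the network input, the clean log power spectrum of the centre frame would be the regression target, and `mse` would be the training loss. Feeding neighbouring frames gives the model temporal context, which is one plausible reason multi-frame expansion helps continuity.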

The second memory is an eMMC memory, which integrates a MultiMediaCard interface, flash memory, and a controller in a single small BGA package. This favors device miniaturization, and the interface rate reaches up to 52 MB per second.

The processor is a Cypress CYW43438 chip, which integrates IEEE 802.11a/b/g/n/ac WLAN and Bluetooth. Its highly integrated, cost-effective design enables small-form-factor Internet of Things product designs.

The wireless communication module is one or a combination of an infrared module, a Bluetooth module, and a Wi-Fi module. Since the processor in this embodiment already integrates Bluetooth and Wi-Fi, only an infrared module needs to be added.

Far-field speech recognition requires a combination of software and hardware. On the hardware side, the circular 6-microphone array and the filtering and signal amplification circuit, together with sound source localization and adaptive beamforming speech enhancement, complete far-field pickup and preliminary noise filtering. On the software side, the Cypress CYW43438 processor runs the speech algorithm code stored in the DDR3 memory to perform voice activity detection, voice wake-up, echo cancellation, and low-SNR and reverberation processing on the speech signal, extracting a clear activation word from the speech signal in the environment. According to individual needs, the user connects several peripheral intelligent hardware devices via infrared, Bluetooth, or Wi-Fi and sets multiple activation words; the configuration information and activation words are stored in the eMMC memory. When the processor receives a valid activation word, the LED indicator flashes and the processor controls the corresponding peripheral intelligent hardware device to respond.
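
The final step, from a valid activation word to a device response, can be sketched as a lookup in the user configuration. The JSON layout, the example activation words, and the action strings are all hypothetical; the text only states that configuration information and activation words are stored in eMMC and that the LED flashes while the corresponding device responds.

```python
import json

# Hypothetical configuration layout, invented for illustration.
CONFIG_JSON = """
{
  "devices": [
    {"activation_word": "hello speaker", "name": "smart speaker", "link": "wifi"},
    {"activation_word": "hello tv", "name": "TV box", "link": "infrared"}
  ]
}
"""


def load_activation_table(config_text):
    """Build a lookup table from activation word to device entry."""
    config = json.loads(config_text)
    return {d["activation_word"].lower(): d for d in config["devices"]}


def handle_activation(word, table):
    """Return the actions taken for a detected word (empty if it is not a valid activation word)."""
    device = table.get(word.lower())
    if device is None:
        return []
    return ["flash_led", "notify:{}:{}".format(device["name"], device["link"])]
```

Keeping this table in user-modifiable storage, separate from the algorithm code, matches the split between the second memory and the first memory described earlier.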

In actual use, the far-field voice interaction device provided by the present invention has a sensitivity > -42 dBV @ 94 dB, 1 kHz; achieves 360° sound source localization in the horizontal plane with a localization accuracy of ±10°; and provides dynamic noise suppression > 20 dB and a signal-to-noise ratio > 65 dB. The wake-up distance reaches 20 m, with a wake-up rate > 96% at 3 m and > 91% at 5 m. The recognition distance reaches 5 m, with a recognition rate > 95% at 2 m and > 90% at 5 m. The device supports interruption at any time and continuous wake-up, fully meeting the deployment requirements of general indoor intelligent hardware devices.

The above embodiments merely describe preferred embodiments of the present invention and do not limit its concept or scope. All variations and improvements made to the technical solution of the present invention by a person of ordinary skill in the art, without departing from the design concept of the present invention, shall fall within the protection scope of the present invention. The technical content for which protection is sought is fully set out in the claims.

Claims

1. A far-field voice interaction device, characterized by comprising a voice pickup module, a front-end amplification module, a processor, a first memory, a second memory, a wireless communication module, an indicator lamp, and a power module;

The voice pickup module picks up voice signals in the environment;

The second memory stores upper-level instructions such as the configuration information of peripheral intelligent hardware devices and the activation words, and the user can modify the upper-level instructions;

The indicator lamp is an LED;

The power module supplies power to each of the above components.

2. The far-field voice interaction device according to claim 1, characterized in that the voice pickup module is a microphone array, and the number of microphones in the array is 6.

3. The far-field voice interaction device according to claim 2, characterized in that the microphone array is arranged on the PCB in a ring with a diameter of 8 cm.

4. The far-field voice interaction device according to claim 1, characterized in that the first memory is DDR3, and the speech algorithms stored in the first memory include voice activity detection, voice wake-up, echo cancellation, and low-SNR and reverberation processing.

5. The far-field voice interaction device according to claim 1, characterized in that the second memory is eMMC.

6. The far-field voice interaction device according to claim 1, characterized in that the processor is a Cypress CYW43438.

7. The far-field voice interaction device according to claim 1, characterized in that the wireless communication module is one or a combination of an infrared module, a Bluetooth module, and a Wi-Fi module.