CN107274901A - A kind of far field voice interaction device - Google Patents

A kind of far field voice interaction device

Info

Publication number
CN107274901A
Authority
CN
China
Prior art keywords
voice
memory
far field
module
interaction device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201710680172.1A
Other languages
Chinese (zh)
Inventor
徐坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huzhou Golden Soft Electronic Technology Co Ltd
Original Assignee
Huzhou Golden Soft Electronic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huzhou Golden Soft Electronic Technology Co Ltd filed Critical Huzhou Golden Soft Electronic Technology Co Ltd
Priority to CN201710680172.1A priority Critical patent/CN107274901A/en
Publication of CN107274901A publication Critical patent/CN107274901A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention provides a far-field voice interaction device comprising a voice pickup module, a front-end amplification module, a processor, a first memory, a second memory, a wireless communication module, an indicator lamp and a power module. The voice pickup module identifies voice signals in the environment. The front-end amplification module filters and amplifies the voice signals picked up by the voice pickup module. The first memory stores low-level instructions such as the speech algorithm code. The second memory stores upper-level instructions such as the configuration information of peripheral smart hardware devices and the activation words. The processor executes the instructions in the first memory and the second memory. The wireless communication module connects to the peripheral smart hardware devices. The indicator lamp is an LED. The power module supplies power to each of the above components. The invention adopts a highly integrated, cost-effective design and provides functions such as omnidirectional wake-up, sound source direction finding, directional pickup, noise suppression, dereverberation, echo cancellation and far-field speech recognition.

Description

A kind of far field voice interaction device
Technical field
The present invention relates to the field of artificial intelligence, and more particularly to a far-field voice interaction device.
Background
Smart hardware is a new class of intelligent terminal products and services that builds on platform-level software and hardware, is characterized by new-generation information technologies such as intelligent sensing and interconnection, human-computer interaction, new display technology and big data processing, and takes newly designed hardware made with new materials as its carrier. As the technology advances, the supporting infrastructure improves and the application service market matures, the product forms of smart hardware have extended from smartphones to smart wearables, smart homes, smart vehicles, medical and health devices, intelligent unmanned systems and more, becoming a junction where information technology merges with traditional industries.
At present, smart hardware products are criticized in many scenarios because the voice interaction experience is unsatisfactory. The main reason is the change in the usage scenario: when a user switches from Siri on a mobile phone to something like a smart speaker, the environment that the microphone faces changes completely, much like the difference between two people whispering to each other and shouting to each other. Voice interaction is limited by multiple complicating factors such as background noise, interfering voices from other people, echo and reverberation, which lead to obvious pain points: a short recognition distance and a low recognition rate.
Summary of the invention
(1) Technical problem solved
To solve the above technical problem, the invention provides a far-field voice interaction device that adopts a highly integrated, cost-effective design and provides functions such as omnidirectional wake-up, sound source direction finding, directional pickup, noise suppression, dereverberation, echo cancellation and far-field speech recognition.
(2) Technical solution
A far-field voice interaction device comprises a voice pickup module, a front-end amplification module, a processor, a first memory, a second memory, a wireless communication module, an indicator lamp and a power module;
The voice pickup module identifies voice signals in the environment;
The front-end amplification module filters and amplifies the voice signals picked up by the voice pickup module;
The first memory stores low-level instructions such as the speech algorithm code, and the user cannot change the low-level instructions;
The second memory stores upper-level instructions such as the configuration information of peripheral smart hardware devices and the activation words, and the user can modify the upper-level instructions;
The processor executes the instructions in the first memory and the second memory;
The wireless communication module connects to the peripheral smart hardware devices;
The indicator lamp is an LED;
The power module supplies power to each of the above components.
Further, the voice pickup module is a microphone array, and the number of microphones in the array is 6.
Further, the microphone array is arranged on the PCB in a ring 8 cm in diameter.
Further, the first memory is DDR3, and the speech algorithms stored in the first memory include voice activity detection, voice wake-up, echo cancellation, and processing for low signal-to-noise ratio and reverberation.
Further, the second memory is eMMC.
Further, the processor is a Cypress CYW43438.
Further, the wireless communication module is one of, or a combination of, an infrared module, a Bluetooth module and a Wi-Fi module.
(3) Beneficial effects
The invention provides a far-field voice interaction device that adopts a highly integrated, cost-effective design, provides functions such as omnidirectional wake-up, sound source direction finding, directional pickup, noise suppression, dereverberation, echo cancellation and far-field speech recognition, and is widely applicable to smart hardware devices such as smart speakers, DOT and TV boxes.
Brief description of the drawings
Fig. 1 is a system block diagram of the far-field voice interaction device of the invention.
Fig. 2 is a schematic diagram of the PCB layout of the far-field voice interaction device of the invention.
Fig. 3 is a flow chart of the speech algorithms of the far-field voice interaction device of the invention.
Detailed description
The embodiments of the invention are described in further detail below with reference to the drawings.
Embodiment 1:
As shown in Fig. 1, a far-field voice interaction device comprises a voice pickup module, a front-end amplification module, a processor, a first memory, a second memory, a wireless communication module, an indicator lamp and a power module;
The voice pickup module identifies voice signals in the environment;
The front-end amplification module filters and amplifies the voice signals picked up by the voice pickup module;
The first memory stores low-level instructions such as the speech algorithm code, and the user cannot change the low-level instructions;
The second memory stores upper-level instructions such as the configuration information of peripheral smart hardware devices and the activation words, and the user can modify the upper-level instructions;
The processor executes the instructions in the first memory and the second memory;
The wireless communication module connects to the peripheral smart hardware devices;
The indicator lamp is an LED;
The power module supplies power to each of the above components.
Embodiment 2:
The operating principle of the device is explained with reference to Fig. 2 and Fig. 3.
The voice pickup module is a microphone array. A microphone array is a system made up of a certain number of acoustic sensors in a certain spatial configuration, used to sample and process the spatial characteristics of a sound field. Linear, circular and spherical microphone arrays do not differ much in principle; they differ in spatial configuration, and therefore in the spatial range they can resolve. For example, in sound source localization a linear array has only one-dimensional information and can resolve only 180 degrees; a circular array is a planar array with two-dimensional information and can resolve 360 degrees; a spherical array is a three-dimensional array with three-dimensional information and can resolve 360 degrees of azimuth and 180 degrees of pitch. In addition, the more microphones there are, the more accurately the speaker can be located, but the difference in localization accuracy shows up mainly at long interaction distances: if the interaction distance is short, the localization results of 5 and 8 microphones differ little. Likewise, the more microphones there are, the finer the spatial regions the beams can separate and the higher the pickup quality in noisy environments, but in an ordinary quiet indoor environment the recognition rates of 5 and 8 microphones differ little. More microphones also mean higher cost.
Taking all this into account, the microphone array uses 6 microphones arranged in a ring 8 cm in diameter on the PCB. This combines 360-degree omnidirectional localization accuracy with high pickup quality while keeping development cost moderate and favoring a compact device.
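The geometry described above can be sketched in a few lines. The following is a minimal illustration, not the patent's implementation: it places 6 microphones evenly on an 8 cm ring and computes the far-field arrival-time offsets that sound source localization and beamforming rely on. The speed of sound (343 m/s), the even spacing and the function names are assumptions introduced here.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the patent


def ring_positions(num_mics=6, diameter_m=0.08):
    """Return (x, y) coordinates of microphones evenly spaced on a ring."""
    radius = diameter_m / 2.0
    return [
        (radius * math.cos(2 * math.pi * k / num_mics),
         radius * math.sin(2 * math.pi * k / num_mics))
        for k in range(num_mics)
    ]


def far_field_delays(positions, azimuth_deg):
    """Arrival-time offset (seconds) of each microphone relative to the
    array centre for a plane wave arriving from the given azimuth."""
    ux = math.cos(math.radians(azimuth_deg))
    uy = math.sin(math.radians(azimuth_deg))
    # A microphone closer to the source hears the wavefront earlier,
    # so its offset is negative.
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in positions]


mics = ring_positions()
delays = far_field_delays(mics, azimuth_deg=0.0)
max_lag = max(delays) - min(delays)  # largest inter-microphone delay
```

For a source on the axis through two opposite microphones, the largest delay equals the ring diameter divided by the speed of sound, roughly 0.23 ms here. That is under four samples at an assumed 16 kHz sampling rate, which is why small arrays of this kind usually need sub-sample delay estimation.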
The first memory is DDR3, a memory in the SDRAM family that offers higher operating efficiency and a lower voltage than DDR2, with lower power consumption and heat generation.
The speech algorithms stored in the DDR3 include voice activity detection, voice wake-up, echo cancellation, and processing for low signal-to-noise ratio and reverberation.
Voice activity detection determines when speech is present in the environment and when it is not. Subsequent speech signal processing is carried out only on the valid speech segments cut out in this step, which significantly reduces the amount of computation and also reduces problems such as noise being misrecognized as speech.
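The patent does not say which detection method is used. As a rough illustration of the idea only, the sketch below uses a simple short-time energy threshold, a swapped-in textbook technique; the frame length, threshold and function names are assumptions.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def detect_speech_frames(samples, frame_len=160, threshold=0.01):
    """Return one boolean per full frame: True where energy exceeds the threshold."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        flags.append(frame_energy(samples[start:start + frame_len]) > threshold)
    return flags


def speech_segments(flags):
    """Collapse per-frame flags into (first_frame, end_frame_exclusive) runs."""
    segments, start = [], None
    for i, active in enumerate(flags):
        if active and start is None:
            start = i
        elif not active and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(flags)))
    return segments


# Two silent frames, two loud frames, one silent frame.
signal = [0.0] * 320 + [0.5] * 320 + [0.0] * 160
segments = speech_segments(detect_speech_frames(signal))
```

Only the frames inside the returned segments would be passed to the later stages, which is where the saving in computation comes from.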
Voice wake-up is the main way of triggering human-computer interaction. It runs after voice activity detection has detected a speech signal and determines whether the signal contains a pre-stored activation word. If it does, the subsequent speech signal continues to be recognized; otherwise the subsequent speech is not processed.
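The gating logic can be sketched as follows. This is a simplified illustration that works on already-recognized text; a real wake-up stage spots the activation word in the audio itself. The class name `WakeGate` and the example phrases are hypothetical.

```python
class WakeGate:
    """Pass speech on for recognition only when an activation word is present."""

    def __init__(self, activation_words):
        # Activation words are compared case-insensitively.
        self.activation_words = [w.lower() for w in activation_words]

    def process(self, utterance):
        """Return the text following the activation word, or None when the
        utterance contains no activation word and should be ignored."""
        text = utterance.lower()
        for word in self.activation_words:
            idx = text.find(word)
            if idx != -1:
                return text[idx + len(word):].strip(" ,.!?")
        return None


gate = WakeGate(["hello device"])
command = gate.process("Hello device, turn on the light.")
ignored = gate.process("turn on the light")
```

Here `command` holds the text to be recognized further, while `ignored` is `None` because no activation word was heard.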
Echo cancellation is a term from full-duplex communication; here it means that the device can pick up sound while it is playing sound. The difficulty of echo cancellation is that it must be balanced and traded off against the acoustic design of the smart hardware device, such as a smart speaker.
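The patent does not name the echo cancellation algorithm. A common textbook choice is a normalized least-mean-squares (NLMS) adaptive filter, sketched here under that assumption: the filter learns how the loudspeaker signal appears at the microphone and subtracts the estimate. The tap count, step size and the three-tap echo path are illustrative values.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic signal.

    far_end: samples sent to the loudspeaker; mic: samples from the microphone.
    Returns the residual signal and the learned filter weights.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # what remains after removing the estimate
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Synthetic check: the microphone hears only the echo of the played signal.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.6, 0.3, 0.1]  # hypothetical loudspeaker-to-microphone response
mic = [
    sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(far_end))
]
residual, weights = nlms_echo_cancel(far_end, mic)
```

In this noise-free setup the learned weights approach the echo path and the residual decays toward zero. With a real loudspeaker and enclosure the echo path is longer and nonlinear, which is one reason the acoustic design trade-off mentioned above matters.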
Low signal-to-noise ratio and reverberation. Sometimes the background noise in the environment is very loud, for example at home with the television on or inside a car, which degrades speech quality, that is, lowers the signal-to-noise ratio. In addition, the reverberation formed by reflections from walls in a home environment has a considerable effect on speech quality. To enhance the speech signal and raise its signal-to-noise ratio, a deep neural network is used to model, by regression, the complex relationship between noisy speech and clean speech. The method is based on a minimum mean square error criterion on the log-power spectrum, and multi-frame expansion greatly helps the quality and continuity of the enhanced speech.
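The network itself is not described in the patent, so the sketch below covers only the pieces the text names: log-power spectrum features, multi-frame expansion of the input, and the mean square error objective. The direct DFT, the context width and all function names are assumptions made for illustration; a regression network would map each spliced noisy vector to the clean log-power spectrum of the centre frame.

```python
import math


def log_power_spectrum(frame):
    """Log power of DFT bins 0..N/2 of a real frame, using a direct DFT."""
    n = len(frame)
    feats = []
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        feats.append(math.log(re * re + im * im + 1e-10))  # small floor avoids log(0)
    return feats


def splice_context(features, context=2):
    """Multi-frame expansion: concatenate each frame with `context`
    neighbours on both sides, repeating the edge frames at the boundaries."""
    spliced = []
    last = len(features) - 1
    for i in range(len(features)):
        window = []
        for j in range(i - context, i + context + 1):
            window.extend(features[min(max(j, 0), last)])
        spliced.append(window)
    return spliced


def mean_squared_error(predicted, target):
    """Mean square error between log-power spectra over all bins and frames."""
    total = sum((p - t) ** 2
                for pf, tf in zip(predicted, target)
                for p, t in zip(pf, tf))
    count = sum(len(tf) for tf in target)
    return total / count


frames = [[1.0, 0, 0, 0, 0, 0, 0, 0], [0, 1.0, 0, 0, 0, 0, 0, 0], [0.0] * 8]
features = [log_power_spectrum(f) for f in frames]
inputs = splice_context(features, context=1)
```

Splicing neighbouring frames gives the network temporal context, which is the usual explanation for the improved continuity the text mentions.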
The second memory is eMMC, which combines a MultiMediaCard interface, flash memory and a host controller, all in one small BGA package. This favors device miniaturization, and the interface rate is up to 52 MB per second.
The processor is a Cypress CYW43438 chip, which integrates IEEE 802.11a/b/g/n/ac WLAN and Bluetooth. Its highly integrated, cost-effective design makes small-form-factor Internet of Things products possible.
The wireless communication module is one of, or a combination of, an infrared module, a Bluetooth module and a Wi-Fi module. Since the processor in this embodiment already integrates Bluetooth and Wi-Fi, only an infrared module needs to be added.
Far-field speech recognition requires hardware and software to work together. On the hardware side, the annular 6-microphone array and the filtering and signal amplification circuit complete far-field pickup and preliminary noise filtering through sound source localization and adaptive-beamforming speech enhancement. On the software side, the Cypress CYW43438 processor runs the speech algorithm code stored in the DDR3 memory to apply voice activity detection, voice wake-up, echo cancellation, and low signal-to-noise ratio and reverberation processing to the speech signal, extracting a clear activation word from the ambient speech signal. According to their needs, users connect several peripheral smart hardware devices over infrared, Bluetooth or Wi-Fi and set multiple activation words; the configuration information and activation words are stored in the eMMC memory. When the processor receives a valid activation word, the LED indicator flashes and the processor controls the corresponding peripheral smart hardware device to respond.
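The last two steps, user-configured bindings and the response to a valid activation word, can be sketched as a small dispatch table. This is an illustrative model only: the class names, the log strings and the example device are hypothetical, and real control would go through the infrared, Bluetooth or Wi-Fi drivers.

```python
from dataclasses import dataclass, field

SUPPORTED_LINKS = ("infrared", "bluetooth", "wifi")


@dataclass
class DeviceConfig:
    name: str
    link: str  # one of SUPPORTED_LINKS


@dataclass
class InteractionController:
    # User-editable bindings (activation word -> device), standing in for
    # the configuration the patent stores in the eMMC memory.
    bindings: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def register(self, activation_word, device):
        """Bind an activation word to a peripheral device."""
        if device.link not in SUPPORTED_LINKS:
            raise ValueError(f"unsupported link: {device.link}")
        self.bindings[activation_word.lower()] = device

    def handle(self, recognized_text):
        """Flash the LED and address the bound device when a valid
        activation word is present; otherwise do nothing."""
        text = recognized_text.lower()
        for word, device in self.bindings.items():
            if word in text:
                self.log.append("LED flash")
                self.log.append(f"{device.link} -> {device.name}")
                return device
        return None


controller = InteractionController()
controller.register("hello tv", DeviceConfig(name="tv box", link="infrared"))
target = controller.handle("Hello TV, volume up")
```

Keeping the bindings in a separate, user-editable table mirrors the split the patent draws between fixed low-level algorithm code and modifiable upper-level configuration.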
In actual use, the far-field voice interaction device provided by the invention has a sensitivity > -42 dBV at 94 dB, 1 kHz; it achieves 360° sound source localization in the horizontal plane with an accuracy of ±10°; dynamic noise suppression > 20 dB; signal-to-noise ratio > 65 dB; a wake-up distance of up to 20 m, with a wake-up rate > 96% at 3 m and > 91% at 5 m; and a recognition distance of up to 5 m, with a recognition rate > 95% at 2 m and > 90% at 5 m. It supports arbitrary interruption and continuous wake-up, fully meeting the deployment needs of general indoor smart hardware devices.
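To put the decibel figures in linear terms, the short calculation below converts them. It assumes that "94 dB" means 94 dB SPL, which corresponds to a sound pressure of approximately 1 Pa, and that the suppression figure is an amplitude ratio; neither reading is stated explicitly in the patent.

```python
def dbv_to_volts(level_dbv):
    """Convert a level in dBV (decibels relative to 1 V RMS) to volts RMS."""
    return 10 ** (level_dbv / 20.0)


def db_to_amplitude_ratio(gain_db):
    """Convert a decibel figure to a linear amplitude ratio."""
    return 10 ** (gain_db / 20.0)


# Output voltage at the sensitivity floor, per pascal of sound pressure
# (assumes 94 dB SPL is about 1 Pa).
sensitivity_v_per_pa = dbv_to_volts(-42.0)

# 20 dB of noise suppression expressed as an amplitude reduction factor.
noise_reduction_factor = db_to_amplitude_ratio(20.0)
```

Under these assumptions, -42 dBV is about 7.9 mV per pascal, and 20 dB of suppression reduces noise amplitude by a factor of 10.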
The invention provides a far-field voice interaction device that adopts a highly integrated, cost-effective design, provides functions such as omnidirectional wake-up, sound source direction finding, directional pickup, noise suppression, dereverberation, echo cancellation and far-field speech recognition, and is widely applicable to smart hardware devices such as smart speakers, DOT and TV boxes.
The above embodiments merely describe preferred embodiments of the invention and do not limit its concept or scope. Without departing from the design concept of the invention, all variations and improvements made to its technical solution by a person of ordinary skill in the art fall within the scope of protection of the invention. The technical content for which protection is claimed is set out in full in the claims.

Claims (7)

1. A far-field voice interaction device, characterized in that it comprises a voice pickup module, a front-end amplification module, a processor, a first memory, a second memory, a wireless communication module, an indicator lamp and a power module;
The voice pickup module identifies voice signals in the environment;
The front-end amplification module filters and amplifies the voice signals picked up by the voice pickup module;
The first memory stores low-level instructions such as the speech algorithm code, and the user cannot change the low-level instructions;
The second memory stores upper-level instructions such as the configuration information of peripheral smart hardware devices and the activation words, and the user can modify the upper-level instructions;
The processor executes the instructions in the first memory and the second memory;
The wireless communication module connects to the peripheral smart hardware devices;
The indicator lamp is an LED;
The power module supplies power to each of the above components.
2. The far-field voice interaction device according to claim 1, characterized in that the voice pickup module is a microphone array, and the number of microphones in the array is 6.
3. The far-field voice interaction device according to claim 2, characterized in that the microphone array is arranged on the PCB in a ring 8 cm in diameter.
4. The far-field voice interaction device according to claim 1, characterized in that the first memory is DDR3, and the speech algorithms stored in the first memory include voice activity detection, voice wake-up, echo cancellation, and processing for low signal-to-noise ratio and reverberation.
5. The far-field voice interaction device according to claim 1, characterized in that the second memory is eMMC.
6. The far-field voice interaction device according to claim 1, characterized in that the processor is a Cypress CYW43438.
7. The far-field voice interaction device according to claim 1, characterized in that the wireless communication module is one of, or a combination of, an infrared module, a Bluetooth module and a Wi-Fi module.
CN201710680172.1A 2017-08-10 2017-08-10 A kind of far field voice interaction device Withdrawn CN107274901A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710680172.1A CN107274901A (en) 2017-08-10 2017-08-10 A kind of far field voice interaction device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710680172.1A CN107274901A (en) 2017-08-10 2017-08-10 A kind of far field voice interaction device

Publications (1)

Publication Number Publication Date
CN107274901A true CN107274901A (en) 2017-10-20

Family

ID=60079768

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710680172.1A Withdrawn CN107274901A (en) 2017-08-10 2017-08-10 A kind of far field voice interaction device

Country Status (1)

Country Link
CN (1) CN107274901A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108122563A (en) * 2017-12-19 2018-06-05 北京声智科技有限公司 Improve voice wake-up rate and the method for correcting DOA
CN108228577A (en) * 2018-01-31 2018-06-29 北京百度网讯科技有限公司 Translation on line method, apparatus, equipment and computer-readable medium
CN108289267A (en) * 2018-04-14 2018-07-17 北京智网时代科技有限公司 Eliminate echo cancelling device, method, speaker, the voice frequency sender of TV interference
CN108335697A (en) * 2018-01-29 2018-07-27 北京百度网讯科技有限公司 Minutes method, apparatus, equipment and computer-readable medium
CN108461083A (en) * 2018-03-23 2018-08-28 北京小米移动软件有限公司 Electronic equipment mainboard, audio-frequency processing method, device and electronic equipment
CN108538305A (en) * 2018-04-20 2018-09-14 百度在线网络技术(北京)有限公司 Audio recognition method, device, equipment and computer readable storage medium
CN109192219A (en) * 2018-09-11 2019-01-11 四川长虹电器股份有限公司 The method for improving microphone array far field pickup based on keyword
CN109474853A (en) * 2018-11-27 2019-03-15 深圳Tcl新技术有限公司 A kind of television set wake-up circuit and the television set with it
CN109856593A (en) * 2018-12-21 2019-06-07 南京理工大学 Intelligent miniature array sonic transducer and its direction-finding method towards sound source direction finding
CN109935226A (en) * 2017-12-15 2019-06-25 上海擎语信息科技有限公司 A kind of far field speech recognition enhancing system and method based on deep neural network
CN113299319A (en) * 2021-05-25 2021-08-24 华晨鑫源重庆汽车有限公司 Voice recognition module and recognition method based on edge AI chip

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109935226A (en) * 2017-12-15 2019-06-25 上海擎语信息科技有限公司 A kind of far field speech recognition enhancing system and method based on deep neural network
CN108122563A (en) * 2017-12-19 2018-06-05 北京声智科技有限公司 Improve voice wake-up rate and the method for correcting DOA
CN108122563B (en) * 2017-12-19 2021-03-30 北京声智科技有限公司 Method for improving voice awakening rate and correcting DOA
CN108335697A (en) * 2018-01-29 2018-07-27 北京百度网讯科技有限公司 Minutes method, apparatus, equipment and computer-readable medium
CN108228577A (en) * 2018-01-31 2018-06-29 北京百度网讯科技有限公司 Translation on line method, apparatus, equipment and computer-readable medium
CN108461083A (en) * 2018-03-23 2018-08-28 北京小米移动软件有限公司 Electronic equipment mainboard, audio-frequency processing method, device and electronic equipment
CN108289267A (en) * 2018-04-14 2018-07-17 北京智网时代科技有限公司 Eliminate echo cancelling device, method, speaker, the voice frequency sender of TV interference
CN108538305A (en) * 2018-04-20 2018-09-14 百度在线网络技术(北京)有限公司 Audio recognition method, device, equipment and computer readable storage medium
US11074924B2 (en) 2018-04-20 2021-07-27 Baidu Online Network Technology (Beijing) Co., Ltd. Speech recognition method, device, apparatus and computer-readable storage medium
CN109192219A (en) * 2018-09-11 2019-01-11 四川长虹电器股份有限公司 The method for improving microphone array far field pickup based on keyword
CN109192219B (en) * 2018-09-11 2021-12-17 四川长虹电器股份有限公司 Method for improving far-field pickup of microphone array based on keywords
CN109474853A (en) * 2018-11-27 2019-03-15 深圳Tcl新技术有限公司 A kind of television set wake-up circuit and the television set with it
CN109474853B (en) * 2018-11-27 2021-11-09 深圳Tcl新技术有限公司 Television wake-up circuit and television with same
CN109856593A (en) * 2018-12-21 2019-06-07 南京理工大学 Intelligent miniature array sonic transducer and its direction-finding method towards sound source direction finding
CN109856593B (en) * 2018-12-21 2023-01-03 南京理工大学 Sound source direction-finding-oriented miniature intelligent array type acoustic sensor and direction-finding method thereof
CN113299319A (en) * 2021-05-25 2021-08-24 华晨鑫源重庆汽车有限公司 Voice recognition module and recognition method based on edge AI chip


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20171020
