CN107825433A - A card robot for recognizing children's voice commands

Info

Publication number: CN107825433A
Application number: CN201711024260.2A
Authority: CN
Inventors: 王冬
Original assignee: Anhui Shuo Wei Intelligent Technology Co Ltd
Current assignee: Anhui Shuo Wei Intelligent Technology Co Ltd
Priority date: 2017-10-27
Filing date: 2017-10-27
Publication date: 2018-03-23

Abstract

The invention discloses a card robot for recognizing children's voice commands, comprising a voice acquisition module, a voiceprint feature extraction module, a model matching module, a semantic processing module and an execution terminal. The voice acquisition module includes a dual-MIC voice collection unit and an audio processing unit; the model matching module includes a model building unit and a model database; the semantic processing module includes a semantic command matching unit and an execution unit, and the execution unit transmits commands to the execution terminal for execution. Children's speech is identified and distinguished through voiceprint feature extraction, overcoming the inaccurate pronunciation and disfluent sentences typical of a child's voice, so that the card robot accurately recognizes voice commands.

Description

A card robot for recognizing children's voice commands

Technical field

The present invention relates to the field of artificial intelligence, and specifically to a card robot for recognizing children's voice commands.

Background art

With the development of artificial intelligence, AI technologies have entered many fields. Educational robots are gradually entering schools and homes, and more intelligent voice-command control technology is becoming increasingly popular. Existing patent CN106297815A provides a method of echo cancellation in a speech recognition scenario. The method uses dual digital microphone channels and obtains the microphone input audio data and the loudspeaker output audio data simultaneously in a digital audio signal processing module. The right-channel data of the loudspeaker output is copied into the right channel of the microphone input audio data to form synthesized microphone input audio data, which is supplied to an upper-layer echo cancellation module. The echo cancellation module (AEC) processes the left and right channels of the synthesized microphone input audio data and outputs voice input audio data that a speech recognition module can use, allowing the device to recognize external voice commands. The advantage of this technique is that it uses dual microphones for noise reduction, eliminating the influence on voice collection of noise from the device itself and from the environment. However, for robot human-machine interaction, and command recognition in particular, simple dual-MIC noise reduction can only reduce the influence of some external factors and cannot produce the high-quality voice information required.

Existing patent CN106557653A provides an intelligent medical guidance system and method for mobile healthcare. The patient's chief complaint is collected with a mobile terminal and sent over the mobile Internet to a cloud server. Relying on a medical knowledge base, the system analyzes and processes the chief complaint using keyword extraction, text word segmentation, fuzzy matching and ranking-based recommendation techniques, automatically generates a disease screening result, and combines it with doctor information to generate an intelligent medical guidance result that is fed back to the mobile terminal. If the patient has not ended the inquiry or the diagnostic result is not yet optimal, the patient is prompted in an intelligently guided manner to supplement the chief-complaint information, and the disease screening result and the intelligent medical guidance result are progressively optimized until the patient ends the inquiry or the diagnostic result is optimal. The advantage of this technique is that it realizes semantic recognition through keyword extraction, which is practical in the medical field, allowing the device to recognize external voice commands more accurately and improving the human-machine voice interaction experience. However, because the method relies on keyword extraction, it does not achieve semantic recognition for children's voices: a child's voice is immature, pronunciation is nonstandard and sentences are imprecise, so keyword extraction will be inaccurate.

Summary of the invention

The object of the present invention is to provide a card robot for recognizing children's voice commands, so as to solve the problem raised in the background art above: existing devices cannot recognize children's speech well and therefore cannot properly execute children's voice commands.

To achieve the above object, the present invention provides the following technical solution:

A card robot for recognizing children's voice commands includes a voice acquisition module, a voiceprint feature extraction module, a model matching module, a semantic processing module and an execution terminal;

The voice acquisition module includes a dual-MIC voice collection unit and an audio processing unit, and the voice acquisition module is electrically connected to the audio processing unit;

The voiceprint feature extraction module is electrically connected to the audio processing unit and is used to extract the voiceprint features from the processed voice information;

The model matching module includes a model building unit and a model database, and the model building unit is electrically connected to the voiceprint feature extraction module and the model database;

The semantic processing module includes a semantic command matching unit and an execution unit; the semantic command matching unit is electrically connected to the model building unit and the execution unit, and the execution unit is electrically connected to the execution terminal.

Preferably, the audio processing unit is used to parse the collected voice signal into a pure waveform file, and to perform silence removal and framing on the waveform file.

Preferably, the execution terminal is a card robot.

Compared with the prior art, the beneficial effects of the present invention are:

The present invention performs noise reduction, silence removal and framing on children's speech through the audio processing unit, thereby improving recognition of children's speech. Voiceprint feature extraction identifies and distinguishes children's speech, and the semantics are determined by comparison against the models in a large database, so that the card robot can still accurately recognize voice commands even when a child's pronunciation is inaccurate and sentences are disfluent.

Brief description of the drawings

Fig. 1 is a structural schematic diagram of a card robot for recognizing children's voice commands according to the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

As shown in Fig. 1, a card robot for recognizing children's voice commands includes a voice acquisition module, a voiceprint feature extraction module, a model matching module, a semantic processing module and an execution terminal. The voice acquisition module includes a dual-MIC voice collection unit and an audio processing unit, and the voice acquisition module is electrically connected to the audio processing unit. The dual-MIC voice collection unit performs preliminary noise reduction while collecting voice information, improving the quality of the collected voice information. The audio processing unit converts the voice information collected by the dual MICs into a pure waveform file and performs secondary noise reduction, silence removal and framing on it.
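The patent does not state how silence removal and framing are carried out. The following is only a minimal sketch of those two steps, assuming an amplitude threshold for trimming leading and trailing silence and fixed-length overlapping frames. The function names `trim_silence` and `split_frames`, the threshold, and the frame and hop lengths are illustrative assumptions, not part of the disclosure.

```python
def trim_silence(samples, threshold):
    """Drop leading and trailing samples whose magnitude is below threshold.

    Assumption: silence is detected by a simple amplitude threshold.
    """
    voiced = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not voiced:
        return []
    return samples[voiced[0]:voiced[-1] + 1]


def split_frames(samples, frame_len, hop_len):
    """Split samples into fixed-length frames that advance by hop_len.

    A trailing partial frame is discarded.
    """
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


if __name__ == "__main__":
    # Hypothetical waveform: silence, a short voiced burst, then silence.
    waveform = [0.0] * 5 + [0.5, -0.4, 0.6, -0.5, 0.3, 0.2] + [0.0] * 4
    voiced = trim_silence(waveform, threshold=0.1)
    print(split_frames(voiced, frame_len=4, hop_len=2))
```

In practice the frame length would be chosen in milliseconds relative to the sampling rate; the tiny values here only keep the example readable.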

The voiceprint feature extraction module is electrically connected to the audio processing unit. It performs feature extraction on the framed voice information according to the number of frames; the extracted features include characteristic wave bands that reflect information such as the timbre, tone and pitch accuracy of the child's speech.
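The patent names timbre, tone and pitch accuracy as the information carried by the features but does not say how they are computed. As a stand-in only, the sketch below computes two common per-frame descriptors, short-time energy and zero-crossing rate, to illustrate frame-by-frame extraction. These descriptors and the function names are assumptions made for illustration, not the features of the disclosure.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def extract_features(frames):
    """Return one (energy, zero-crossing rate) pair per frame."""
    return [(short_time_energy(f), zero_crossing_rate(f)) for f in frames]


if __name__ == "__main__":
    # Hypothetical frames: one rapidly alternating, one constant.
    frames = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, 1.0, 1.0]]
    for energy, zcr in extract_features(frames):
        print(energy, zcr)
```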

The model matching module includes a model building unit and a model database, and the model building unit is electrically connected to the voiceprint feature extraction module and the model database. The model database is a large database of feature combinations, built from data entered at an early stage and from backups of the features extracted later; each card robot periodically transmits the data in its database to a central information repository in the backend. The model building unit builds a model from the extracted characteristic wave bands according to a preset order such as timbre, tone and pitch accuracy, and the model is compared and matched against the model database.
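The comparison rule used against the model database is not described. One simple possibility, shown here only as a sketch, is nearest-neighbour matching: the model is an ordered tuple (timbre, tone, pitch accuracy) and the closest stored entry by Euclidean distance is accepted if it lies within a tolerance. The numeric encoding, the database contents, the function name `match_model` and the `max_distance` parameter are all hypothetical.

```python
import math


def match_model(model, model_db, max_distance):
    """Return the id of the stored model closest to `model`.

    `model` and every database entry are tuples in the same preset
    order (timbre, tone, pitch accuracy). Returns None when the
    closest entry is farther away than max_distance.
    """
    best_id, best_dist = None, float("inf")
    for model_id, reference in model_db.items():
        dist = math.dist(model, reference)
        if dist < best_dist:
            best_id, best_dist = model_id, dist
    return best_id if best_dist <= max_distance else None


if __name__ == "__main__":
    # Hypothetical database of reference models.
    model_db = {
        "model_sing": (0.9, 0.2, 0.5),
        "model_dance": (0.1, 0.8, 0.3),
    }
    print(match_model((0.85, 0.25, 0.5), model_db, max_distance=0.5))
```

Returning `None` for a distant input lets the caller ignore speech that matches no stored model instead of forcing a wrong command.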

The semantic processing module includes a semantic command matching unit and an execution unit; the semantic command matching unit is electrically connected to the model building unit and the execution unit, and the execution unit is electrically connected to the execution terminal. The semantic command matching unit finds the corresponding service command according to the matched model, for example command information such as singing, dancing or telling a story, and passes the command information to the execution unit. The execution unit then controls the corresponding execution terminal, i.e. the card robot, to execute the child's voice command.
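The flow from matched model to executed command can be sketched as a lookup followed by a dispatch. The table contents, the `CardRobot` class and the function names below are hypothetical; the patent only states that the matched model maps to a service command such as singing, dancing or telling a story, which the execution unit forwards to the card robot.

```python
# Hypothetical table mapping matched model ids to service commands.
COMMAND_TABLE = {
    "model_sing": "sing",
    "model_dance": "dance",
    "model_story": "tell_story",
}


class CardRobot:
    """Stand-in execution terminal that records the commands it receives."""

    def __init__(self):
        self.log = []

    def execute(self, command):
        self.log.append(command)
        return f"executing: {command}"


def match_command(model_id, command_table):
    """Semantic command matching: look up the command for a matched model."""
    return command_table.get(model_id)


def run_command(model_id, command_table, terminal):
    """Execution unit: forward the matched command to the terminal.

    Returns None without touching the terminal when no command matches.
    """
    command = match_command(model_id, command_table)
    if command is None:
        return None
    return terminal.execute(command)


if __name__ == "__main__":
    robot = CardRobot()
    print(run_command("model_story", COMMAND_TABLE, robot))
```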

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the appended claims.

Claims

1. A card robot for recognizing children's voice commands, characterized by comprising:

A voice acquisition module, a voiceprint feature extraction module, a model matching module, a semantic processing module and an execution terminal;

The voice acquisition module includes a dual-MIC voice collection unit and an audio processing unit, and the voice acquisition module is electrically connected to the audio processing unit;

The voiceprint feature extraction module is electrically connected to the audio processing unit and is used to extract the voiceprint features from the processed voice information;

The model matching module includes a model building unit and a model database, and the model building unit is electrically connected to the voiceprint feature extraction module and the model database;

The semantic processing module includes a semantic command matching unit and an execution unit; the semantic command matching unit is electrically connected to the model building unit and the execution unit, and the execution unit is electrically connected to the execution terminal.

2. The card robot for recognizing children's voice commands according to claim 1, characterized in that the audio processing unit is used to parse the collected voice signal into a pure waveform file, and to perform silence removal and framing on the waveform file.

3. The card robot for recognizing children's voice commands according to claim 1, characterized in that the execution terminal is a card robot.