CN107825433A - A card robot for recognizing children's voice commands - Google Patents
A card robot for recognizing children's voice commands
- Publication number
- CN107825433A (application CN201711024260.2A)
- Authority
- CN
- China
- Prior art keywords
- unit
- module
- model
- robot
- card machine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
- B25J11/0005—Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
The invention discloses a card robot for recognizing children's voice commands, comprising a voice acquisition module, a voiceprint feature extraction module, a model fitting module, a semantic processing module and an execution terminal. The voice acquisition module comprises a dual-microphone (dual-MIC) voice collection unit and an audio processing unit; the model fitting module comprises a model building unit and a model database; the semantic processing module comprises a semantic instruction matching unit and an execution unit, and the execution unit transmits the instruction to the execution terminal for execution. Children's speech is identified and distinguished by voiceprint feature extraction, so that the card robot can accurately recognize voice commands even when a child's pronunciation is inaccurate and the sentences are not fluent.
Description
Technical field
The present invention relates to the field of artificial intelligence, and specifically to a card robot for recognizing children's voice commands.
Background art
With the development of artificial intelligence, the technology has entered many fields. Educational robots are gradually entering schools and homes, and more intelligent voice-command control is becoming increasingly popular. Existing patent CN106297815A provides a method of echo cancellation in speech recognition scenarios. The method uses a dual digital microphone channel. The microphone input audio data and the loudspeaker output audio data are obtained simultaneously in the digital audio signal processing module, and the right-channel data of the loudspeaker output is copied into the right channel of the microphone input audio data to form synthesized microphone input audio data. The synthesized data is supplied to the upper-layer echo cancellation module (AEC), which processes its left and right channels and outputs voice input audio data that the speech recognition module can use, allowing the device to recognize external voice commands. The advantage of this technique is that noise reduction with dual microphones eliminates the influence of the device's own noise and of external noise on voice collection. However, in human-robot interaction, and in command recognition in particular, simple dual-MIC noise reduction only reduces the influence of some external factors and cannot produce the high-quality voice information that is required.
Another existing patent, CN106557653A, provides an intelligent medical guidance system and method for mobile healthcare. A mobile terminal collects the patient's chief complaint and sends it over the mobile Internet to a cloud server. Relying on a medical knowledge base, the server analyzes and processes the chief complaint using keyword extraction, text word segmentation, fuzzy matching and ranking-based recommendation, automatically generates a disease screening result, and combines it with doctor information to generate an intelligent guidance result that is fed back to the mobile terminal. If the patient has not ended the inquiry, or the diagnostic result is not yet optimal, the patient is prompted through intelligent guidance to supplement the chief complaint, and the disease screening result and the guidance result are then gradually optimized until the patient ends the inquiry or the diagnostic result is optimal. The advantage of this technique is that it achieves semantic recognition through keyword extraction, which is practical in the medical field, so that the device recognizes external voice commands more accurately and the experience of human-machine voice interaction is improved. However, because the method relies on keyword extraction, it does not achieve semantic recognition for children's voices. Children's voices are immature, their pronunciation is non-standard, and their sentences are imprecise, so keyword extraction becomes inaccurate.
Summary of the invention
The object of the present invention is to provide a card robot for recognizing children's voice commands, so as to solve the problem raised in the background art above: existing devices cannot recognize children's speech well and therefore cannot execute children's voice commands well.
To achieve the above object, the present invention provides the following technical solution:
The card robot comprises a voice acquisition module, a voiceprint feature extraction module, a model fitting module, a semantic processing module and an execution terminal;
The voice acquisition module comprises a dual-MIC voice collection unit and an audio processing unit, and the voice acquisition module is electrically connected to the audio processing unit;
The voiceprint feature extraction module is electrically connected to the audio processing unit and is used to extract the voiceprint features from the processed voice information;
The model fitting module comprises a model building unit and a model database, and the model building unit is electrically connected to the voiceprint feature extraction module and the model database;
The semantic processing module comprises a semantic instruction matching unit and an execution unit; the semantic instruction matching unit is electrically connected to the model building unit and the execution unit, and the execution unit is electrically connected to the execution terminal.
Preferably, the audio processing unit is used to parse the collected voice signal into a pure waveform file and to perform silence removal and framing on that waveform file.
Preferably, the execution terminal is a card robot.
Compared with the prior art, the beneficial effects of the present invention are as follows:
The audio processing unit applies noise reduction, silence removal and framing to children's speech, which improves children's speech recognition. Children's speech is identified and distinguished by voiceprint feature extraction, and the semantics are determined by comparison against the models in a large database, so that the card robot can still accurately recognize voice commands even when a child's pronunciation is inaccurate and the sentences are not fluent.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the card robot for recognizing children's voice commands according to the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawing. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, fall within the scope of protection of the present invention.
As shown in Fig. 1, a card robot for recognizing children's voice commands comprises a voice acquisition module, a voiceprint feature extraction module, a model fitting module, a semantic processing module and an execution terminal. The voice acquisition module comprises a dual-MIC voice collection unit and an audio processing unit, and the voice acquisition module is electrically connected to the audio processing unit. The dual-MIC voice collection unit performs preliminary noise reduction while collecting the voice information, which improves the quality of the collected voice information. The audio processing unit converts the voice information collected by the dual MICs into a pure waveform file and applies secondary noise reduction, silence removal and framing to it.
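The silence removal and framing steps of the audio processing unit can be sketched as follows. This is only an illustrative sketch, not the implementation described in the patent: the function names, the amplitude threshold, and the frame and hop lengths are all assumptions, the sketch trims only leading and trailing silence, and the two noise-reduction stages are not shown.

```python
from typing import List


def trim_silence(samples: List[float], threshold: float = 0.02) -> List[float]:
    """Drop leading and trailing samples quieter than `threshold` (assumed value)."""
    voiced = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not voiced:
        return []  # the whole signal is silence
    return samples[voiced[0]:voiced[-1] + 1]


def split_frames(samples: List[float], frame_len: int = 400, hop: int = 160) -> List[List[float]]:
    """Cut the signal into overlapping frames; a trailing partial frame is discarded."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames
```

With a 16 kHz signal, the assumed defaults correspond to 25 ms frames taken every 10 ms, a common choice in speech processing; the patent itself does not specify these values.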
The voiceprint feature extraction module is electrically connected to the audio processing unit. It performs feature extraction according to the frame number of the framed voice information. The extracted features include characteristic wave segments that reflect information such as the timbre, tone and pitch accuracy of the child's voice.
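Per-frame feature extraction of this kind can be illustrated with a small sketch. The patent does not say which features are computed, so the three used here (short-time energy, zero-crossing rate and an autocorrelation pitch estimate) are stand-ins chosen for illustration, and the function names and the 100 to 600 Hz pitch search range are assumptions.

```python
import math
from typing import Dict, List


def short_time_energy(frame: List[float]) -> float:
    """Mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def zero_crossing_rate(frame: List[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def estimate_pitch_hz(frame: List[float], sample_rate: int,
                      fmin: float = 100.0, fmax: float = 600.0) -> float:
    """Pick the lag with the largest autocorrelation inside the assumed pitch range.

    Returns 0.0 when no lag has a positive autocorrelation (for example, silence).
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


def frame_features(frame: List[float], sample_rate: int) -> Dict[str, float]:
    """Bundle the illustrative features for one frame."""
    return {
        "energy": short_time_energy(frame),
        "zcr": zero_crossing_rate(frame),
        "pitch_hz": estimate_pitch_hz(frame, sample_rate),
    }
```

A practical system would more likely use spectral features such as cepstral coefficients; the simple time-domain features above are used only because they need no external libraries.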
The model fitting module comprises a model building unit and a model database, and the model building unit is electrically connected to the voiceprint feature extraction module and the model database. The model database is a large database of feature combinations. It is built from data entered at an early stage and from backups of features extracted later, and each card robot periodically transmits the data in its database to a central information repository on the back end. The model building unit builds a model from the extracted characteristic wave segments in a preset order, such as timbre, tone and pitch accuracy, and then compares and matches that model against the models in the model database.
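The comparison of a newly built model against the model database can be sketched as a nearest-match search. The patent does not state how models are represented or compared, so everything here is an assumption: each model is treated as a fixed-length feature vector in a preset order, cosine similarity is the comparison measure, and the names `match_model` and `min_similarity` are hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_model(features: List[float],
                database: Dict[str, List[float]],
                min_similarity: float = 0.9) -> Optional[str]:
    """Return the label of the most similar stored model, or None if nothing is close enough."""
    best_label, best_score = None, min_similarity
    for label, stored in database.items():
        score = cosine_similarity(features, stored)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label
```

Returning `None` below the threshold lets the later stages ignore an utterance that matches no stored model instead of forcing it onto the nearest one.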
The semantic processing module comprises a semantic instruction matching unit and an execution unit. The semantic instruction matching unit is electrically connected to the model building unit and the execution unit, and the execution unit is electrically connected to the execution terminal. Based on the matched model, the semantic instruction matching unit finds the corresponding service command, such as singing, dancing or telling a story, and passes this command information to the execution unit. The execution unit then controls the corresponding execution terminal, that is, the card robot, to carry out the child's voice command.
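The hand-off from the matched model to the execution terminal can be sketched as a lookup followed by a dispatch. This is a minimal illustration under stated assumptions: the command table, the command strings and the `CardRobotTerminal` class are all hypothetical, and the terminal here only records the commands it receives.

```python
from typing import Dict, List, Optional

# Hypothetical table mapping a matched model label to a service command.
SERVICE_COMMANDS: Dict[str, str] = {
    "sing": "PLAY_SONG",
    "dance": "START_DANCE",
    "story": "TELL_STORY",
}


class CardRobotTerminal:
    """Stand-in for the execution terminal; it only records what it was asked to do."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def run(self, command: str) -> None:
        self.executed.append(command)


def match_instruction(model_label: Optional[str]) -> Optional[str]:
    """Semantic instruction matching: look up the service command for a matched model."""
    if model_label is None:
        return None
    return SERVICE_COMMANDS.get(model_label)


def execute(model_label: Optional[str], terminal: CardRobotTerminal) -> bool:
    """Execution unit: forward the command to the terminal; report whether anything ran."""
    command = match_instruction(model_label)
    if command is None:
        return False
    terminal.run(command)
    return True
```

For example, `execute("sing", CardRobotTerminal())` forwards `PLAY_SONG` to the terminal, while an unmatched label leaves the terminal untouched.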
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the present invention. The scope of the present invention is defined by the appended claims.
Claims (3)
- 1. A card robot for recognizing children's voice commands, characterized by comprising: a voice acquisition module, a voiceprint feature extraction module, a model fitting module, a semantic processing module and an execution terminal; the voice acquisition module comprises a dual-MIC voice collection unit and an audio processing unit, and the voice acquisition module is electrically connected to the audio processing unit; the voiceprint feature extraction module is electrically connected to the audio processing unit and is used to extract the voiceprint features from the processed voice information; the model fitting module comprises a model building unit and a model database, and the model building unit is electrically connected to the voiceprint feature extraction module and the model database; the semantic processing module comprises a semantic instruction matching unit and an execution unit, the semantic instruction matching unit is electrically connected to the model building unit and the execution unit, and the execution unit is electrically connected to the execution terminal.
- 2. The card robot for recognizing children's voice commands according to claim 1, characterized in that the voice processing unit is used to parse the collected voice signal into a pure waveform file and to perform silence removal and framing on that waveform file.
- 3. The card robot for recognizing children's voice commands according to claim 1, characterized in that the execution terminal is a card robot.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711024260.2A CN107825433A (en) | 2017-10-27 | 2017-10-27 | A card robot for recognizing children's voice commands |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711024260.2A CN107825433A (en) | 2017-10-27 | 2017-10-27 | A card robot for recognizing children's voice commands |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107825433A (en) | 2018-03-23 |
Family
ID=61650731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711024260.2A Pending CN107825433A (en) | 2017-10-27 | 2017-10-27 | A card robot for recognizing children's voice commands |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107825433A (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20010025770A (en) * | 1999-09-01 | 2001-04-06 | 배명진 | On the Real-Time Fairy Tale Narration System with Parent's Voice Color |
CN103177722A (en) * | 2013-03-08 | 2013-06-26 | 北京理工大学 | Tone-similarity-based song retrieval method |
CN103956162A (en) * | 2014-04-04 | 2014-07-30 | 上海元趣信息技术有限公司 | Voice recognition method and device oriented towards child |
CN104978957A (en) * | 2014-04-14 | 2015-10-14 | 美的集团股份有限公司 | Voice control method and system based on voiceprint identification |
CN106782522A (en) * | 2015-11-23 | 2017-05-31 | 宏碁股份有限公司 | Sound control method and speech control system |
CN106128467A (en) * | 2016-06-06 | 2016-11-16 | 北京云知声信息技术有限公司 | Method of speech processing and device |
CN106297790A (en) * | 2016-08-22 | 2017-01-04 | 深圳市锐曼智能装备有限公司 | The voiceprint service system of robot and service control method thereof |
CN106683676A (en) * | 2017-03-13 | 2017-05-17 | 安徽朗巴智能科技有限公司 | Voice recognition system for robot control |
CN106782521A (en) * | 2017-03-22 | 2017-05-31 | 海南职业技术学院 | A kind of speech recognition system |
Non-Patent Citations (1)
Title |
---|
曾鹏: "《软交换应用系统设计与实现》", 30 November 2016 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109035911A (en) * | 2018-08-10 | 2018-12-18 | 安徽爱依特科技有限公司 | A kind of child's mutual education robot system based on instruction of swiping the card |
CN111916083A (en) * | 2020-08-20 | 2020-11-10 | 绍兴市麦芒智能科技有限公司 | Intelligent device voice instruction recognition algorithm through big data acquisition |
CN111916083B (en) * | 2020-08-20 | 2023-08-22 | 北京基智科技有限公司 | Intelligent equipment voice instruction recognition algorithm through big data acquisition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110491382B (en) | Speech recognition method and device based on artificial intelligence and speech interaction equipment | |
CN108564942B (en) | Voice emotion recognition method and system based on adjustable sensitivity | |
CN102723078B (en) | Emotion speech recognition method based on natural language comprehension | |
CN110992932B (en) | Self-learning voice control method, system and storage medium | |
CN107657017A (en) | Method and apparatus for providing voice service | |
CN109256150A (en) | Speech emotion recognition system and method based on machine learning | |
CN101064104A (en) | Emotion voice creating method based on voice conversion | |
CN104252861A (en) | Video voice conversion method, video voice conversion device and server | |
CN107845381A (en) | A kind of method and system of robot semantic processes | |
US11587561B2 (en) | Communication system and method of extracting emotion data during translations | |
WO2022048404A1 (en) | End-to-end virtual object animation generation method and apparatus, storage medium, and terminal | |
CN110070855A (en) | A kind of speech recognition system and method based on migration neural network acoustic model | |
CN117637097A (en) | Method and system for generating electronic medical record based on outpatient service dialogue of large model | |
CN110349565B (en) | Auxiliary pronunciation learning method and system for hearing-impaired people | |
CN107825433A (en) | A card robot for recognizing children's voice commands | |
CN104679733B (en) | A kind of voice dialogue interpretation method, apparatus and system | |
CN103035252A (en) | Chinese speech signal processing method, Chinese speech signal processing device and hearing aid device | |
CN114283822A (en) | Many-to-one voice conversion method based on gamma pass frequency cepstrum coefficient | |
CN113837907A (en) | Man-machine interaction system and method for English teaching | |
CN109300478A (en) | A kind of auxiliary Interface of person hard of hearing | |
CN113763925A (en) | Speech recognition method, speech recognition device, computer equipment and storage medium | |
CN115757860A (en) | Music emotion label generation method based on multi-mode fusion | |
CN114550701A (en) | Deep neural network-based Chinese electronic larynx voice conversion device and method | |
Lane et al. | Local word discovery for interactive transcription | |
CN111768773A (en) | Intelligent decision-making conference robot |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | CB02 | Change of applicant information | Address after: Room 803, room F1, two, innovation industrial park, No. 2800, new avenue of innovation, Hefei high tech Zone, Anhui; Applicant after: Anhui Shuo Wei Intelligent Technology Co., Ltd.; Address before: 230088, H2, building 374, two innovation industrial park, 2800 innovation Avenue, Hefei hi tech Zone, Anhui; Applicant before: Anhui Shuo Wei Intelligent Technology Co., Ltd. |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180323 |