CN107919126A - A kind of intelligent speech interactive system - Google Patents
A kind of intelligent speech interactive system
- Publication number
- CN107919126A (application CN201711194068.8A)
- Authority
- CN
- China
- Prior art keywords
- module
- sound
- model
- interactive system
- storehouse
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
- G10L15/05—Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention discloses an intelligent speech interaction system and relates to the technical field of voice interaction. The system comprises a sound acquisition module, an FPGA preprocessing module, and an intelligent interaction center. The FPGA preprocessing module includes an endpoint detection unit, which is electrically connected to a sound preprocessing unit and a feature extraction unit. The intelligent interaction center includes a control unit, which is electrically connected to a storage unit, a speech recognition module, a semantic understanding module, an interaction processing module, a speech synthesis module, a feedback module, and a loudspeaker. The acoustic model library includes an HMM model database and an ANN model database; the language model library includes an N-gram model database and a rule-based model database. The feedback module feeds back the recognition information, and the control unit presents that information to the user while controlling and changing the acoustic matching model and the language model library, which improves the recognition accuracy of the interactive system.
Description
Technical field
The invention belongs to the technical field of voice interaction, and in particular relates to an intelligent speech interaction system.
Background art
As artificial intelligence enters daily life, people's understanding of voice interaction keeps deepening, and their expectations of voice interaction systems in artificial intelligence keep rising.
A problem with current voice interaction systems is that when the system gives an irrelevant answer during an interaction, or clearly fails to recognize or understand the user's speech, the user is often left with no recourse, which greatly degrades human-computer interaction performance.
Summary of the invention
The object of the invention is to provide an intelligent speech interaction system in which the control unit presents the recognition information to the user while controlling and changing the acoustic matching model and the language model. This makes the voice interaction adaptive and solves the problem that speech recognition errors cannot be handled during voice interaction.
To solve the above technical problem, the invention is realized by the following technical solution:
The invention is an intelligent speech interaction system comprising a sound acquisition module, an FPGA preprocessing module, and an intelligent interaction center. The FPGA preprocessing module includes an endpoint detection unit, which is electrically connected to a sound preprocessing unit and a feature extraction unit; the sound preprocessing unit is electrically connected to the sound acquisition module. The intelligent interaction center includes a control unit, which is electrically connected to a storage unit, a speech recognition module, a semantic understanding module, an interaction processing module, a speech synthesis module, a feedback module, and a loudspeaker. The storage unit is electrically connected to the speech recognition module, the semantic understanding module, and the interaction processing module. The speech synthesis module is electrically connected to the semantic understanding module and the interaction processing module. The storage unit holds a language model library, an acoustic model library, a semantic dictionary database, and a response information library. The acoustic model library includes an HMM model database and an ANN model database; the language model library includes an N-gram model database and a rule-based model database.
Preferably, the sound acquisition module is a microphone that collects the sound signal; the sound preprocessing unit applies anti-aliasing filtering, A/D conversion, and framing and windowing to the sound signal.
Preferably, the endpoint detection unit performs endpoint detection based on frequency band variance.
Preferably, the control unit includes an ARM microcontroller; an acoustic model selection circuit and a language model library selection circuit are integrated on the control unit.
Preferably, the response information library stores response mappings for situational dialogues; the semantic dictionary database stores sentence semantic mappings.
Preferably, the feature extraction unit extracts MFCC parameter features.
Preferably, the feedback module contains a memory, which stores the text information passed from the interaction processing module, the speech information passed from the speech synthesis module, and feedback commands.
Preferably, when performing speech recognition, the speech recognition module obtains a language model from the language model library and an acoustic model from the acoustic model library; when performing semantic understanding, the semantic understanding module obtains semantic mappings from the semantic dictionary database; when performing interaction processing, the interaction processing module obtains response mappings from the response information library.
The invention has the following advantages:
1. The invention uses the feedback module to feed back recognition information, and the control unit presents that information to the user while controlling and changing the acoustic matching model and the language model library, which improves the recognition accuracy of the interactive system.
2. The invention uses an FPGA module for sound preprocessing, which relieves the intelligent interaction center of the parallel processing load and improves voice interaction efficiency.
Of course, a product implementing the invention need not achieve all of the above advantages at the same time.
Brief description of the drawings
To explain the technical solutions of the embodiments of the invention more clearly, the drawing needed for describing the embodiments is briefly introduced below. The drawing described below shows only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from it without creative effort.
Fig. 1 is a structural diagram of the system of the invention.
Detailed description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawing. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, the invention is an intelligent speech interaction system comprising a sound acquisition module, an FPGA preprocessing module, and an intelligent interaction center. The FPGA preprocessing module includes an endpoint detection unit, which is electrically connected to a sound preprocessing unit and a feature extraction unit; the sound preprocessing unit is electrically connected to the sound acquisition module. The intelligent interaction center includes a control unit, which is electrically connected to a storage unit, a speech recognition module, a semantic understanding module, an interaction processing module, a speech synthesis module, a feedback module, and a loudspeaker. The storage unit is electrically connected to the speech recognition module, the semantic understanding module, and the interaction processing module. The speech synthesis module is electrically connected to the semantic understanding module and the interaction processing module. The storage unit holds a language model library, an acoustic model library, a semantic dictionary database, and a response information library. The acoustic model library includes an HMM model database and an ANN model database; the language model library includes an N-gram model database and a rule-based model database.
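To make the N-gram model database more concrete, the following is a minimal sketch of a statistical language model of the kind such a database could hold. It assumes a bigram model with add-one smoothing; the patent does not specify the N-gram order or any smoothing method, and the names `train_bigram` and `bigram_prob` and the toy corpus are hypothetical.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (hypothetical helper)."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded[:-1])                 # every token that acts as a context
        bigrams.update(zip(padded[:-1], padded[1:]))
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return unigrams, bigrams, vocab

def bigram_prob(prev, word, unigrams, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

# Toy corpus, assumed purely for illustration
corpus = [["turn", "on", "the", "light"], ["turn", "off", "the", "light"]]
uni, bi, vocab = train_bigram(corpus)
p_seen = bigram_prob("the", "light", uni, bi, vocab)
p_unseen = bigram_prob("the", "turn", uni, bi, vocab)
```

A recognizer could use such probabilities to prefer word sequences that were seen in training over acoustically similar but unlikely ones; a rule-based model would instead accept or reject sequences against a hand-written grammar.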
The sound acquisition module is a microphone that collects the sound signal. The sound preprocessing unit applies anti-aliasing filtering, A/D conversion, and framing and windowing to the sound signal.
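Anti-aliasing filtering and A/D conversion happen in hardware, but the framing and windowing step can be sketched in software. The example below is only an illustration: the 16 kHz sample rate, 25 ms frame length, 10 ms hop, and the choice of a Hamming window are assumptions, since the patent does not state them, and the function names are hypothetical.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def window_frames(frames):
    """Multiply every frame by a Hamming window to reduce spectral leakage."""
    win = hamming(len(frames[0]))
    return [[s * w for s, w in zip(frame, win)] for frame in frames]

# Assumed values: 16 kHz audio, 400-sample (25 ms) frames, 160-sample (10 ms) hop
signal = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
frames = window_frames(frame_signal(signal, 400, 160))
```

Overlapping frames keep the signal approximately stationary within each frame while still tracking changes over time, which is what the later endpoint detection and feature extraction steps rely on.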
The endpoint detection unit performs endpoint detection based on frequency band variance.
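The idea behind band-variance endpoint detection is that speech frames have a strongly uneven spectrum, while silence or flat noise does not, so the variance of the spectral amplitudes separates the two. The sketch below is a simplified illustration under stated assumptions: each DFT bin is treated as its own band, a fixed threshold is used, and the function names are hypothetical. The patent does not describe the band layout or the decision rule.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), adequate for a sketch)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def band_variance(frame):
    """Variance of the spectral magnitudes across bins of one frame."""
    spec = magnitude_spectrum(frame)
    mean = sum(spec) / len(spec)
    return sum((v - mean) ** 2 for v in spec) / len(spec)

def detect_endpoints(frames, threshold):
    """Return (first, last) indices of frames above the threshold, or None."""
    active = [i for i, f in enumerate(frames) if band_variance(f) > threshold]
    return (active[0], active[-1]) if active else None

# Synthetic input: silence, then a pure tone, then silence
n = 64
silence = [0.0] * n
tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
frames = [silence, silence, tone, tone, tone, silence]
endpoints = detect_endpoints(frames, threshold=1.0)
```

In practice the threshold would typically be estimated from leading noise-only frames rather than fixed, and short gaps inside an utterance would be smoothed over.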
The control unit includes an ARM microcontroller. An acoustic model selection circuit and a language model library selection circuit are integrated on the control unit.
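The patent describes the selection as circuits on the control unit and does not give a selection policy. Purely as an illustration, the logic could be sketched in software as follows, assuming a simple round-robin policy that moves to the next model combination whenever the user rejects a recognition result. The class `ModelSelector` and its method names are hypothetical.

```python
from dataclasses import dataclass

ACOUSTIC_MODELS = ("HMM", "ANN")
LANGUAGE_MODELS = ("N-gram", "rule-based")

@dataclass
class ModelSelector:
    acoustic_index: int = 0
    language_index: int = 0

    def current(self):
        """Return the (acoustic, language) model pair currently selected."""
        return ACOUSTIC_MODELS[self.acoustic_index], LANGUAGE_MODELS[self.language_index]

    def on_feedback(self, user_confirmed):
        """Keep the models on confirmation; otherwise advance to the next pair."""
        if user_confirmed:
            return self.current()
        self.language_index = (self.language_index + 1) % len(LANGUAGE_MODELS)
        if self.language_index == 0:  # language models exhausted: switch acoustic model
            self.acoustic_index = (self.acoustic_index + 1) % len(ACOUSTIC_MODELS)
        return self.current()

selector = ModelSelector()
first = selector.current()
second = selector.on_feedback(user_confirmed=False)
third = selector.on_feedback(user_confirmed=False)
```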
The response information library stores response mappings for situational dialogues. The semantic dictionary database stores sentence semantic mappings.
The feature extraction unit extracts MFCC parameter features.
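MFCC extraction conventionally takes the power spectrum of a windowed frame, applies a bank of triangular filters spaced on the mel scale, takes the logarithm of each filter's energy, and applies a DCT. The following is a compact sketch of that standard recipe, not the patent's FPGA implementation. The filter count (20), coefficient count (12), and sample rate (8 kHz) are assumed values, and all function names are hypothetical.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2 / n
            for k in range(n // 2 + 1)]

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    points = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * p / sample_rate)) for p in points]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    spec = power_spectrum(frame)
    energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                for row in mel_filterbank(num_filters, len(frame), sample_rate)]
    # DCT-II decorrelates the log filterbank energies; coefficient 0 is skipped
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(energies))
            for c in range(1, num_coeffs + 1)]

# Assumed input: one 256-sample frame of a 1 kHz tone sampled at 8 kHz
frame = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(256)]
coeffs = mfcc(frame, sample_rate=8000)
```

A production system would use an FFT instead of the naive DFT and would usually append energy and delta coefficients to the feature vector.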
The feedback module contains a memory. The memory stores the text information passed from the interaction processing module, the speech information passed from the speech synthesis module, and feedback commands.
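One way to picture what the feedback memory holds is a small record type with the three items listed above. This is a hypothetical data layout for illustration only; the field names, the class names, and the example command value `"retry"` are assumptions, not part of the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FeedbackRecord:
    recognized_text: str                    # text from the interaction processing module
    synthesized_speech: bytes               # audio from the speech synthesis module
    feedback_command: Optional[str] = None  # assumed values such as "confirm" or "retry"

@dataclass
class FeedbackMemory:
    records: List[FeedbackRecord] = field(default_factory=list)

    def store(self, text, speech):
        """Save the text and speech for the latest interaction."""
        self.records.append(FeedbackRecord(text, speech))

    def attach_command(self, command):
        """Attach the user's feedback command to the most recent record."""
        self.records[-1].feedback_command = command
        return self.records[-1]

memory = FeedbackMemory()
memory.store("turn on the light", b"\x00\x01")
latest = memory.attach_command("retry")
```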
When performing speech recognition, the speech recognition module obtains a language model from the language model library and an acoustic model from the acoustic model library. When performing semantic understanding, the semantic understanding module obtains semantic mappings from the semantic dictionary database. When performing interaction processing, the interaction processing module obtains response mappings from the response information library.
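The lookups described above can be sketched as two dictionary queries against the storage unit. This is a deliberately simplified illustration that assumes exact-match lookup on normalized text; the dictionary contents, the intent labels, the fallback reply, and the function names are all hypothetical.

```python
# Hypothetical contents of the storage unit
STORAGE = {
    "semantic_dictionary": {
        "turn on the light": "LIGHT_ON",
        "what time is it": "ASK_TIME",
    },
    "response_library": {
        "LIGHT_ON": "The light is now on.",
        "ASK_TIME": "Let me check the time.",
    },
}

def understand(text, storage=STORAGE):
    """Semantic understanding: look up the semantic mapping of a recognized sentence."""
    return storage["semantic_dictionary"].get(text.lower().strip())

def respond(intent, storage=STORAGE):
    """Interaction processing: look up the response mapping for an intent."""
    return storage["response_library"].get(intent, "Sorry, I did not understand that.")

reply = respond(understand("Turn on the light"))
fallback = respond(understand("sing a song"))
```

The fallback branch is where the feedback path matters: an unmatched sentence is the case in which the control unit would present the recognition result to the user and try a different model combination.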
It is worth noting that the units in the above system embodiment are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional units serve only to distinguish them from one another and do not limit the scope of protection of the invention.
In addition, those of ordinary skill in the art will appreciate that all or part of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc.
The preferred embodiments disclosed above are intended only to help explain the invention. They do not describe every detail exhaustively, nor do they limit the invention to the specific embodiments described. Obviously, many modifications and variations can be made in light of this specification. These embodiments were chosen and described in detail in order to better explain the principles and practical application of the invention, so that those skilled in the art can understand and use it well. The invention is limited only by the claims, together with their full scope and equivalents.
Claims (8)
1. An intelligent speech interaction system, characterized by comprising a sound acquisition module, an FPGA preprocessing module, and an intelligent interaction center;
the FPGA preprocessing module includes an endpoint detection unit; the endpoint detection unit is electrically connected to a sound preprocessing unit and a feature extraction unit respectively; the sound preprocessing unit is electrically connected to the sound acquisition module;
the intelligent interaction center includes a control unit; the control unit is electrically connected to a storage unit, a speech recognition module, a semantic understanding module, an interaction processing module, a speech synthesis module, a feedback module, and a loudspeaker respectively; the storage unit is electrically connected to the speech recognition module, the semantic understanding module, and the interaction processing module respectively; the speech synthesis module is electrically connected to the semantic understanding module and the interaction processing module respectively; the storage unit holds a language model library, an acoustic model library, a semantic dictionary database, and a response information library;
the acoustic model library includes an HMM model database and an ANN model database; the language model library includes an N-gram model database and a rule-based model database.
2. The intelligent speech interaction system according to claim 1, characterized in that the sound acquisition module is a microphone; the microphone collects the sound signal; the sound preprocessing unit applies anti-aliasing filtering, A/D conversion, and framing and windowing to the sound signal.
3. The intelligent speech interaction system according to claim 1, characterized in that the endpoint detection unit performs endpoint detection based on frequency band variance.
4. The intelligent speech interaction system according to claim 1, characterized in that the control unit includes an ARM microcontroller; an acoustic model selection circuit is integrated on the control unit; a language model library selection circuit is integrated on the control unit.
5. The intelligent speech interaction system according to claim 1, characterized in that the response information library stores response mappings for situational dialogues; the semantic dictionary database stores sentence semantic mappings.
6. The intelligent speech interaction system according to claim 1, characterized in that the feature extraction unit extracts MFCC parameter features.
7. The intelligent speech interaction system according to claim 1, characterized in that the feedback module contains a memory; the memory stores the text information passed from the interaction processing module, the speech information passed from the speech synthesis module, and feedback commands.
8. The intelligent speech interaction system according to claim 1, characterized in that, when performing speech recognition, the speech recognition module obtains a language model from the language model library and an acoustic model from the acoustic model library; when performing semantic understanding, the semantic understanding module obtains semantic mappings from the semantic dictionary database; when performing interaction processing, the interaction processing module obtains response mappings from the response information library.
Priority Applications (1)
Application Number | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201711194068.8A | CN107919126A (en) | 2017-11-24 | 2017-11-24 | A kind of intelligent speech interactive system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107919126A (en) | 2018-04-17 |
Family
ID=61896908
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201711194068.8A | Pending | CN107919126A (en) | 2017-11-24 | 2017-11-24 | A kind of intelligent speech interactive system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107919126A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109018778A (en) * | 2018-08-31 | 2018-12-18 | 深圳市研本品牌设计有限公司 | Rubbish put-on method and system based on speech recognition |
CN109147768A (en) * | 2018-09-13 | 2019-01-04 | 云南电网有限责任公司 | A kind of audio recognition method and system based on deep learning |
CN109388792A (en) * | 2018-09-30 | 2019-02-26 | 珠海格力电器股份有限公司 | Text processing method, device, equipment, computer equipment and storage medium |
CN109616095A (en) * | 2018-12-12 | 2019-04-12 | 安徽讯呼信息科技有限公司 | A kind of AI intelligent voice system |
CN110459203A (en) * | 2018-05-03 | 2019-11-15 | 百度在线网络技术(北京)有限公司 | A kind of intelligent sound guidance method, device, equipment and storage medium |
CN111326141A (en) * | 2018-12-13 | 2020-06-23 | 南京硅基智能科技有限公司 | Method for processing and acquiring human voice data |
CN112397067A (en) * | 2020-11-13 | 2021-02-23 | 重庆长安工业(集团)有限责任公司 | Voice control terminal of weapon equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080091429A1 (en) * | 2006-10-12 | 2008-04-17 | International Business Machines Corporation | Enhancement to viterbi speech processing algorithm for hybrid speech models that conserves memory |
CN103730116A (en) * | 2014-01-07 | 2014-04-16 | 苏州思必驰信息科技有限公司 | System and method for achieving intelligent home device control on smart watch |
CN106056207A (en) * | 2016-05-09 | 2016-10-26 | 武汉科技大学 | Natural language-based robot deep interacting and reasoning method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107919126A (en) | A kind of intelligent speech interactive system | |
CN107767863B (en) | Voice awakening method and system and intelligent terminal | |
CN107644643A (en) | A kind of voice interactive system and method | |
CN105590626B (en) | Continuous voice man-machine interaction method and system | |
CN107134279A (en) | A kind of voice awakening method, device, terminal and storage medium | |
CN105469789A (en) | Voice information processing method and voice information processing terminal | |
CN110277088B (en) | Intelligent voice recognition method, intelligent voice recognition device and computer readable storage medium | |
CN105786798A (en) | Natural language intention understanding method in man-machine interaction | |
CN110459222A (en) | Sound control method, phonetic controller and terminal device | |
CA2151371A1 (en) | Recursive finite state grammar | |
CN108039175B (en) | Voice recognition method and device and server | |
CN105446146A (en) | Intelligent terminal control method based on semantic analysis, system and intelligent terminal | |
CN110444210A (en) | A kind of method of speech recognition, the method and device for waking up word detection | |
CN107767861A (en) | voice awakening method, system and intelligent terminal | |
CN101847405A (en) | Speech recognition equipment and method, language model generation device and method and program | |
US20200265843A1 (en) | Speech broadcast method, device and terminal | |
CN110211589B (en) | Awakening method and device of vehicle-mounted system, vehicle and machine readable medium | |
WO2023222089A1 (en) | Item classification method and apparatus based on deep learning | |
WO2023222090A1 (en) | Information pushing method and apparatus based on deep learning | |
CN111930912A (en) | Dialogue management method, system, device and storage medium | |
CN113157240A (en) | Voice processing method, device, equipment, storage medium and computer program product | |
CN111081254A (en) | Voice recognition method and device | |
CN109065076B (en) | Audio label setting method, device, equipment and storage medium | |
CN113593565A (en) | Intelligent home device management and control method and system | |
CN108231074A (en) | A kind of data processing method, voice assistant equipment and computer readable storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180417 |