CN103117058B - Multi-speech-engine switching system and method based on a smart TV platform - Google Patents
Multi-speech-engine switching system and method based on a smart TV platform
- Publication number
- CN103117058B
- Authority
- CN
- China
- Prior art keywords
- speech
- speech engine
- module
- engine
- interface
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The present invention relates to smart TV software platforms and discloses a multi-speech-engine switching method based on a smart TV platform. The method automatically finds the speech engine with the highest current recognition efficiency and switches to it, improving the user's voice interaction experience. The method can be summarized as follows: when the user runs a speech application and uses its speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface and sends the speech data to each speech engine module; it then records and compares the response time each speech engine module takes to return a recognition result, and switches to the speech engine module with the shortest response time. The invention also discloses a corresponding switching system, suitable for smart TVs, that provides fast speech recognition.
Description
Technical field
The present invention relates to smart TV software platforms, and specifically to a multi-speech-engine switching system and method based on a smart TV platform.
Background art
As television terminals become more intelligent and networked, the content available on smart TVs has grown greatly and their functions have diversified, so operating a TV has become more frequent and more complicated. Applying speech recognition technology to smart TVs greatly simplifies user operation and substantially improves the user experience. Because speech recognition consumes a large amount of system resources, smart TVs currently implement speech recognition by connecting to a cloud server over the network.
The speech recognition engine on the server consists of a speech detection module, a feature extraction module, and a recognition search module. The speech detection module detects and preprocesses the speech signal: the TV sends the collected raw speech data to this module, which converts it to a standard data format (for example, 8 kHz, 16-bit) and uses an efficient signal detection algorithm to determine the start and end points of the speech. The feature extraction module receives the detected audio data stream and extracts from it a stream of feature vectors of the speech signal. Speech features are obtained with digital signal processing techniques that extract the information best reflecting the essential attributes of the speech signal. In this module, the speech signal undergoes pre-emphasis, framing, windowing, spectral transformation, cepstral transformation, differencing, and similar processing, finally yielding feature vectors of roughly several dozen dimensions. The recognition search module matches the received features of the unknown speech signal against the engine's acoustic model library, dictionary/lexicon, and recognition grammar information to obtain the word sequence that best fits the unknown speech features. The process can be briefly described as follows. By looking up the dictionary/lexicon, the word sequence of a sentence can be decomposed into a sequence of phonemes. Combined with the acoustic model, this phoneme sequence yields acoustic model unit sequence information that better reflects its essential attributes. The feature vectors of the raw speech are then matched against the acoustic model unit sequences of all possible sentence candidates, their matching probabilities are calculated, and the acoustic model unit sequence with the maximum a posteriori probability is selected. From this unit sequence the corresponding word sequence is obtained, which is the word sequence the engine outputs to the TV.
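The first stages of the feature extraction chain mentioned above (pre-emphasis, framing, windowing) can be sketched as follows. This is a minimal illustration, not the engine's actual implementation: the function names, the pre-emphasis coefficient of 0.97, the Hamming window, and the 25 ms frame with a 10 ms hop are all assumptions chosen because they are common defaults; the patent only names the processing steps and the 8 kHz example format.

```python
import math


def pre_emphasis(samples, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is an assumed, commonly used value. The first sample
    is passed through unchanged.
    """
    if not samples:
        return []
    rest = [samples[i] - alpha * samples[i - 1] for i in range(1, len(samples))]
    return [samples[0]] + rest


def frame_signal(samples, frame_len, hop):
    """Split the signal into overlapping frames of frame_len samples.

    Trailing samples that do not fill a whole frame are dropped.
    """
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def hamming(frame):
    """Apply a Hamming window (an assumed window choice) to one frame."""
    n = len(frame)
    if n == 1:
        return list(frame)
    return [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
```

At 8 kHz, a 25 ms frame is 200 samples and a 10 ms hop is 80 samples, so `frame_signal(pre_emphasis(samples), 200, 80)` followed by `hamming` on each frame would produce the windowed frames that later transform stages operate on.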
However, a server hosts multiple speech recognition engines. If a single fixed engine is always used for speech recognition, the speech recognition efficiency of the smart TV cannot improve, and the user's voice interaction experience suffers. How to find the currently most efficient engine among multiple speech recognition engines and switch to it is therefore a problem that urgently needs solving in voice interaction applications.
Summary of the invention
The technical problem to be solved by the present invention is to propose a multi-speech-engine switching system and method based on a smart TV platform that automatically finds the speech engine with the highest current recognition efficiency and switches to it, improving the user's voice interaction experience.
The solution adopted by the present invention to the above technical problem is a multi-speech-engine switching system based on a smart TV platform, comprising a speech engine selection module and at least two speech engine modules. All speech engine modules are encapsulated behind a unified speech engine interface and are connected to the speech engine selection module through that interface. The speech engine selection module is connected to the speech application through a voice application interface.
Further, each speech engine module obtains, through the speech engine interface, the speech data sent by the speech engine selection module, recognizes the speech data, and returns the recognition result to the speech engine selection module. When the speech application uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface, sends it to each speech engine module through the speech engine interface, receives the recognition results returned by all speech engine modules, records and compares the response time each module takes to return its result, and switches to the speech engine module with the shortest response time, so that the speech application can call the speech engine module with the highest recognition efficiency.
Further, switching to the speech engine module with the shortest response time means that the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules.
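The structure described above (engine modules behind one unified interface, a selection module in front of them, and switching as connect-one, disconnect-the-rest) can be sketched as follows. All class and method names here are hypothetical and chosen for illustration; `EchoEngine` is a stand-in, since the patent does not specify how a real engine module talks to its recognizer.

```python
from abc import ABC, abstractmethod


class SpeechEngine(ABC):
    """Hypothetical unified speech engine interface wrapping every engine module."""

    @abstractmethod
    def recognize(self, audio: bytes) -> str:
        """Return the recognition result for the given speech data."""


class EchoEngine(SpeechEngine):
    """Stand-in engine for illustration; a real module would call a recognizer."""

    def __init__(self, name):
        self.name = name

    def recognize(self, audio):
        return f"{self.name}:{len(audio)} bytes"


class EngineSelector:
    """Hypothetical selection module sitting between the application and the engines."""

    def __init__(self, engines):
        if len(engines) < 2:
            raise ValueError("the system comprises at least two speech engine modules")
        self._engines = dict(engines)
        self._connected = set(self._engines)  # initially connected to every engine

    def connected(self):
        return sorted(self._connected)

    def switch_to(self, name):
        """Connect to the chosen engine and disconnect from all others."""
        if name not in self._engines:
            raise KeyError(name)
        self._connected = {name}

    def recognize(self, audio):
        """Voice application interface: unchanged no matter which engine is active."""
        if len(self._connected) != 1:
            raise RuntimeError("no single engine has been selected yet")
        (name,) = self._connected
        return self._engines[name].recognize(audio)
```

In this sketch the application only ever calls `EngineSelector.recognize`, so a call to `switch_to` changes which engine answers without changing anything the application sees.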
In addition, the present invention proposes a corresponding multi-speech-engine switching method based on a smart TV platform, comprising:
A. when the user runs a speech application and uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface;
B. the speech engine selection module sends the speech data to each speech engine module through the speech engine interface;
C. each speech engine module recognizes the speech data and returns the recognition result to the speech engine selection module;
D. the speech engine selection module records and compares the response time each speech engine module takes to return its recognition result, and switches to the speech engine module with the shortest response time.
Further, in step D, switching to the speech engine module with the shortest response time means that the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules.
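Steps B through D can be sketched as a small timing loop. This is a simplified illustration under stated assumptions: engines are modeled as plain callables, the function names are hypothetical, and the engines are queried one after another for clarity, whereas the patent's selection module sends the data to all of them.

```python
import time


def measure_response_times(engines, audio):
    """Steps B and C: send the same audio to each engine and time the reply.

    `engines` maps a hypothetical engine name to a callable taking the
    speech data and returning a recognition result.
    Returns {name: (elapsed_seconds, result)}.
    """
    timings = {}
    for name, recognize in engines.items():
        start = time.perf_counter()
        result = recognize(audio)
        timings[name] = (time.perf_counter() - start, result)
    return timings


def fastest_engine(timings):
    """Step D: pick the engine whose recognition result came back soonest."""
    return min(timings, key=lambda name: timings[name][0])
```

Given `timings = measure_response_times(engines, audio)`, the selection module would switch to `fastest_engine(timings)` and drop its connections to the rest.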
The beneficial effects of the present invention are as follows. By comparing the response times (that is, the recognition speeds) with which the speech engine modules return recognition results and switching to the module with the shortest response time, the speech application can call the speech engine module with the highest recognition efficiency, which improves the overall efficiency of speech recognition. Moreover, because the connection between the speech application and the speech engine selection module (the voice application interface) remains unchanged, the speech application does not need to know which specific speech engine module has been switched to when a switch occurs, which ensures the stability and continuity of speech recognition.
Brief description of the drawings
Fig. 1 is an architecture diagram of the multi-speech-engine switching system based on a smart TV platform according to the present invention;
Fig. 2 is a flowchart of the multi-speech-engine switching method based on a smart TV platform according to the present invention.
Detailed description
The principle of the present invention is as follows. Because the speech engine modules in the system differ in performance, they process speech data at different speeds. A speech engine selection module can therefore be provided to record and compare the response time each speech engine module takes to process the speech data, find the module with the shortest processing time and the fastest response, and switch the connection to that module. Because the application interface between the speech engine selection module and the speech application never changes, introducing the selection module also addresses the system's stability problem.
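The selection rule in this principle can be written compactly. The notation is an assumption introduced here for illustration and does not appear in the patent: $t_0$ is the moment the selection module sends the speech data, $t_i$ is the moment engine module $i$ (of $N$ modules) returns its recognition result, and $T_i$ is that module's response time.

$$
T_i = t_i - t_0, \qquad i^{*} = \arg\min_{1 \le i \le N} T_i
$$

For example, with measured response times $T_1 = 0.8\,\text{s}$, $T_2 = 0.5\,\text{s}$, and $T_3 = 1.1\,\text{s}$, the rule gives $i^{*} = 2$, so the selection module would switch its connection to the second engine module.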
Referring to Fig. 1, the multi-speech-engine switching system based on a smart TV platform according to the present invention comprises a speech engine selection module and multiple speech engine modules. All speech engine modules are encapsulated behind a unified speech engine interface and are connected to the speech engine selection module through that interface. The speech engine selection module is connected to the speech application through a voice application interface.
Each speech engine module obtains, through the speech engine interface, the speech data sent by the speech engine selection module, recognizes the speech data, and returns the recognition result to the speech engine selection module. When the speech application uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface, sends it to each speech engine module through the speech engine interface, receives the recognition results returned by all speech engine modules, records and compares the response time each module takes to return its result, and switches to the speech engine module with the shortest response time, so that the speech application can call the speech engine module with the highest recognition efficiency.
Fig. 2 shows the flow of the corresponding switching method, which comprises the following steps:
A. When the user runs a speech application and uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface; this speech data is the sound source signal captured by the smart TV's voice capture device;
B. The speech engine selection module sends the speech data to each speech engine module through the speech engine interface; because the modules are encapsulated behind a unified speech engine interface, each speech engine module can receive the same speech data simultaneously;
C. Each speech engine module recognizes the speech data and returns the recognition result to the speech engine selection module;
D. The speech engine selection module records and compares the response time each speech engine module takes to return its recognition result, and switches to the speech engine module with the shortest response time: it connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules. After this, the speech application can perform fast speech recognition by calling the speech engine module with the shortest response time, improving the user's voice interaction experience.
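Step B notes that every engine module can receive the same speech data simultaneously. One way to sketch that concurrent dispatch is with a thread pool, as below. This is an illustrative assumption rather than the patent's mechanism: engines are modeled as plain callables, `timed_call` and `race_engines` are hypothetical names, and each engine's response time is measured inside its own worker thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(recognize, audio):
    """Run one engine and return (elapsed_seconds, recognition_result)."""
    start = time.perf_counter()
    text = recognize(audio)
    return time.perf_counter() - start, text


def race_engines(engines, audio):
    """Send the same audio to every engine in parallel.

    `engines` maps a hypothetical engine name to a callable.
    Returns (name_of_fastest_engine, {name: elapsed_seconds}).
    """
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {
            name: pool.submit(timed_call, recognize, audio)
            for name, recognize in engines.items()
        }
        # Wait for all engines, as the selection module receives every result.
        elapsed = {name: future.result()[0] for name, future in futures.items()}
    return min(elapsed, key=elapsed.get), elapsed
```

The name returned by `race_engines` is the module the selection module would stay connected to in step D; waiting for every result mirrors the description, although an implementation could equally stop at the first reply.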
Claims (2)
1. A multi-speech-engine switching system based on a smart TV platform, characterized by comprising: a speech engine selection module and at least two speech engine modules; all speech engine modules are encapsulated behind a unified speech engine interface and are connected to the speech engine selection module through the speech engine interface; the speech engine selection module is connected to the speech application through a voice application interface;
the speech engine module is configured to obtain, through the speech engine interface, the speech data sent by the speech engine selection module, recognize the speech data, and return the recognition result to the speech engine selection module; the speech engine selection module is configured to, when the speech application uses the speech recognition function, obtain the collected speech data through the voice application interface, send the speech data to each speech engine module through the speech engine interface, receive the recognition results returned by all speech engine modules, record and compare the response time each speech engine module takes to return its recognition result, and switch to the speech engine module with the shortest response time, so that the speech application can call the speech engine module with the highest recognition efficiency;
switching to the speech engine module with the shortest response time means that the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules.
2. A multi-speech-engine switching method based on a smart TV platform, applied in the system of claim 1, characterized by comprising:
A. when the user runs a speech application and uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface;
B. the speech engine selection module sends the speech data to each speech engine module through the speech engine interface;
C. each speech engine module recognizes the speech data and returns the recognition result to the speech engine selection module;
D. the speech engine selection module records and compares the response time each speech engine module takes to return its recognition result, and switches to the speech engine module with the shortest response time;
in step D, switching to the speech engine module with the shortest response time means that the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210558320.XA CN103117058B (en) | 2012-12-20 | 2012-12-20 | Multi-speech-engine switching system and method based on a smart TV platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210558320.XA CN103117058B (en) | 2012-12-20 | 2012-12-20 | Multi-speech-engine switching system and method based on a smart TV platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103117058A (en) | 2013-05-22 |
CN103117058B (en) | 2015-12-09 |
Family
ID=48415416
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210558320.XA Active CN103117058B (en) | 2012-12-20 | 2012-12-20 | Multi-speech-engine switching system and method based on a smart TV platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103117058B (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103336687B (en) * | 2013-06-17 | 2016-09-14 | 深圳市金立通信设备有限公司 | The changing method of a kind of application interface and terminal |
CN103714814A (en) * | 2013-12-11 | 2014-04-09 | 四川长虹电器股份有限公司 | Voice introducing method of voice recognition engine |
CN104795069B (en) * | 2014-01-21 | 2020-06-05 | 腾讯科技(深圳)有限公司 | Speech recognition method and server |
CN105609102B (en) * | 2014-11-21 | 2021-03-16 | 中兴通讯股份有限公司 | Voice engine parameter configuration method and device |
CN107018228B (en) * | 2016-01-28 | 2020-03-31 | 中兴通讯股份有限公司 | Voice control system, voice processing method and terminal equipment |
CN107526512B (en) * | 2017-08-31 | 2020-11-20 | 联想(北京)有限公司 | Switching method and system for electronic equipment |
CN107657031A (en) * | 2017-09-28 | 2018-02-02 | 四川长虹电器股份有限公司 | Method based on android system management intelligent sound box voice technical ability |
CN109036427B (en) * | 2018-09-25 | 2021-01-26 | 苏宁智能终端有限公司 | Method and system for dynamically configuring voice recognition service |
CN111179934A (en) * | 2018-11-12 | 2020-05-19 | 奇酷互联网络科技(深圳)有限公司 | Method of selecting a speech engine, mobile terminal and computer-readable storage medium |
CN109410926A (en) * | 2018-11-27 | 2019-03-01 | 恒大法拉第未来智能汽车(广东)有限公司 | Voice method for recognizing semantics and system |
CN109493862B (en) * | 2018-12-24 | 2021-11-09 | 深圳Tcl新技术有限公司 | Terminal, voice server determination method, and computer-readable storage medium |
CN109949816A (en) * | 2019-02-14 | 2019-06-28 | 安徽云之迹信息技术有限公司 | Robot voice processing method and processing device, cloud server |
CN109947651B (en) * | 2019-03-21 | 2022-08-02 | 上海智臻智能网络科技股份有限公司 | Artificial intelligence engine optimization method and device |
CN110708365A (en) * | 2019-09-23 | 2020-01-17 | 杭州迪普科技股份有限公司 | Data receiver selection method and device |
CN113450785B (en) * | 2020-03-09 | 2023-12-19 | 上海擎感智能科技有限公司 | Implementation method, system, medium and cloud server for vehicle-mounted voice processing |
CN113593535A (en) * | 2021-06-30 | 2021-11-02 | 青岛海尔科技有限公司 | Voice data processing method and device, storage medium and electronic device |
CN114446279A (en) * | 2022-02-18 | 2022-05-06 | 青岛海尔科技有限公司 | Voice recognition method, voice recognition device, storage medium and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1323435A (en) * | 1998-10-02 | 2001-11-21 | 国际商业机器公司 | System and method for providing network coordinated conversational services |
CN1429019A (en) * | 2001-12-18 | 2003-07-09 | 松下电器产业株式会社 | TV set with sound discrimination function and its control method |
CN1633679A (en) * | 2001-12-29 | 2005-06-29 | 摩托罗拉公司 | Method and apparatus for multi-level distributed speech recognition |
CN1723487A (en) * | 2002-12-13 | 2006-01-18 | 摩托罗拉公司 | Method and apparatus for selective speech recognition |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6480819B1 (en) * | 1999-02-25 | 2002-11-12 | Matsushita Electric Industrial Co., Ltd. | Automatic search of audio channels by matching viewer-spoken words against closed-caption/audio content for interactive television |
Also Published As
Publication number | Publication date |
---|---|
CN103117058A (en) | 2013-05-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103117058B (en) | Multi-speech-engine switching system and method based on a smart TV platform | |
CN103093755B (en) | Based on terminal and mutual network household electric appliance control method and the system of internet voice | |
CN103440867B (en) | Audio recognition method and system | |
CN102855872A (en) | Method and system for controlling household appliance on basis of voice interaction between terminal and internet | |
CN102855874B (en) | Method and system for controlling household appliance on basis of voice interaction of internet | |
CN110473546B (en) | Media file recommendation method and device | |
WO2020238209A1 (en) | Audio processing method, system and related device | |
CN104867492A (en) | Intelligent interaction system and method | |
JP6783339B2 (en) | Methods and devices for processing audio | |
US11457061B2 (en) | Creating a cinematic storytelling experience using network-addressable devices | |
CN107018228B (en) | Voice control system, voice processing method and terminal equipment | |
CN102831892A (en) | Toy control method and system based on internet voice interaction | |
WO2022037526A1 (en) | Speech recognition method, apparatus, electronic device and storage medium | |
CN103730115A (en) | Method and device for detecting keywords in voice | |
JP2019091429A (en) | Method and apparatus for processing information | |
CN110992955A (en) | Voice operation method, device, equipment and storage medium of intelligent equipment | |
CN102847325A (en) | Toy control method and system based on voice interaction of mobile communication terminal | |
CN113889113A (en) | Sentence dividing method and device, storage medium and electronic equipment | |
CN103095927A (en) | Displaying and voice outputting method and system based on mobile communication terminal and glasses | |
EP3059731A1 (en) | Method and apparatus for automatically sending multimedia file, mobile terminal, and storage medium | |
CN111833857A (en) | Voice processing method and device and distributed system | |
CN101588415A (en) | Voice service method and voice service system | |
CN113936655A (en) | Voice broadcast processing method and device, computer equipment and storage medium | |
CN110619876A (en) | Voice processing method and device based on power transmission mobile application | |
CN111583916A (en) | Voice recognition method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |