CN103117058B - Multi-speech-engine switching system and method based on a smart TV platform - Google Patents

Multi-speech-engine switching system and method based on a smart TV platform

Info

Publication number
CN103117058B
Authority
CN
China
Prior art keywords
speech
speech engine
module
engine
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210558320.XA
Other languages
Chinese (zh)
Other versions
CN103117058A (en
Inventor
陈冠霖
赵波
刘贤洪
杨金峰
毕端
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd
Priority to CN201210558320.XA
Publication of CN103117058A
Application granted
Publication of CN103117058B
Legal status: Active
Anticipated expiration

Abstract

The present invention relates to smart TV software platforms and discloses a multi-speech-engine switching method based on a smart TV platform. The method automatically finds the speech engine with the highest current recognition efficiency and switches to it, improving the user's voice interaction experience. The method may be summarized as follows: when the user runs a speech application and uses its speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface and sends the speech data to each speech engine module; it then records and compares the response time each speech engine module takes to return a recognition result, and switches to the speech engine module with the shortest response time. The invention also discloses a corresponding switching system, which is suitable for smart TVs and provides fast speech recognition.

Description

Multi-speech-engine switching system and method based on a smart TV platform
Technical field
The present invention relates to smart TV software platforms, and specifically to a multi-speech-engine switching system and method based on a smart TV platform.
Background art
As television terminals become more intelligent and networked, the content available on smart TVs has grown enormously and their functions have become more diverse, so operating the TV has become more frequent and more complicated. Applying speech recognition technology to smart TVs greatly simplifies the user's operating procedure and substantially improves the user experience. Because speech recognition consumes substantial system resources, smart TVs at present generally implement speech recognition by connecting to a cloud server over the network.
The speech recognition engine on the server that implements the speech recognition function consists of a speech detection module, a feature extraction module, and a recognition search module. The speech detection module detects and preprocesses the voice signal: the TV sends the collected raw voice data to this module, which converts it to a standard data format (for example, 8 kHz, 16-bit) and uses an efficient signal detection algorithm to determine the start point and end point of the speech.
The feature extraction module receives the audio data stream after detection and extracts from it the feature vector stream of the voice signal. Speech features are obtained by using digital signal processing to extract from the voice signal the information that best reflects its essential attributes. In this module the voice signal undergoes pre-emphasis, framing, windowing, a transform step, cepstral transformation, and differencing, finally yielding feature vectors of a few dozen dimensions.
The recognition search module matches the received features of the unknown voice signal against the engine's acoustic model library, dictionary/lexicon, and recognition grammar information to obtain the word sequence that best fits the unknown speech features. The process can be described briefly as follows: by looking up the dictionary/lexicon, the word sequence of a sentence can be decomposed into a phoneme sequence. Combining this phoneme sequence with the acoustic model yields acoustic model unit sequence information that better reflects its essential attributes. The feature vectors of the original speech are then matched against the acoustic model unit sequences of all candidate sentences, the matching probabilities are computed, and the acoustic model unit sequence with the maximum a posteriori probability is selected. From this unit sequence the corresponding word sequence is obtained, and this word sequence is what the engine outputs to the TV.
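The first three feature extraction steps named above (pre-emphasis, framing, windowing) can be illustrated with a minimal sketch. The 8 kHz sample rate comes from the text; the pre-emphasis coefficient of 0.97, the 25 ms frame length, the 10 ms hop, the choice of a Hamming window, and all function names are assumptions chosen for illustration and are not stated in the patent.

```python
import math


def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not samples:
        return []
    out = [float(samples[0])]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def frame_signal(samples, frame_len, hop):
    """Split the signal into overlapping frames; a trailing partial frame is dropped."""
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames


def hamming(frame_len):
    """Hamming window coefficients (an assumed window type)."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
            for n in range(frame_len)]


def windowed_frames(samples, sample_rate=8000, frame_ms=25, hop_ms=10):
    """Pre-emphasize, frame, and window a mono signal given as a list of numbers."""
    frame_len = sample_rate * frame_ms // 1000
    hop = sample_rate * hop_ms // 1000
    window = hamming(frame_len)
    emphasized = pre_emphasis(samples)
    return [[s * w for s, w in zip(frame, window)]
            for frame in frame_signal(emphasized, frame_len, hop)]
```

With these assumed settings, one second of 8 kHz audio yields 98 frames of 200 samples each. The later transform, cepstral, and differencing stages are left out of the sketch.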
Moreover, because the server hosts multiple speech recognition engines, always using a single fixed engine for speech recognition does not help improve the smart TV's speech recognition efficiency and leads to a poor voice interaction experience. How to find the currently most efficient speech recognition engine among multiple engines and switch to it is therefore a pressing problem in voice interaction applications.
Summary of the invention
The technical problem to be solved by the present invention is to provide a multi-speech-engine switching system and method based on a smart TV platform that automatically finds the speech engine with the highest current recognition efficiency and switches to it, improving the user's voice interaction experience.
The solution adopted by the present invention to solve the above technical problem is a multi-speech-engine switching system based on a smart TV platform, comprising a speech engine selection module and at least two speech engine modules. All speech engine modules are encapsulated behind a unified speech engine interface and connect to the speech engine selection module through that interface. The speech engine selection module is connected to the speech application through the voice application interface.
Further, each speech engine module obtains from the speech engine interface the speech data sent by the speech engine selection module, recognizes the speech data, and returns the recognition result to the speech engine selection module. When the speech application uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface, sends it to each speech engine module through the speech engine interface, receives the recognition results returned by all speech engine modules, records and compares the response time each speech engine module takes to return its result, and switches to the speech engine module with the shortest response time, so that the speech application can call the speech engine module with the highest recognition efficiency.
Further, switching to the speech engine module with the shortest response time means that the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules.
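The structure described above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not define any programming interface, so the class names `SpeechEngine`, `EchoEngine`, and `EngineSelector`, the method names, and the `connected` flag are all hypothetical.

```python
from abc import ABC, abstractmethod


class SpeechEngine(ABC):
    """Hypothetical unified speech engine interface wrapping every engine module."""

    def __init__(self, name):
        self.name = name
        self.connected = False

    def connect(self):
        self.connected = True

    def disconnect(self):
        self.connected = False

    @abstractmethod
    def recognize(self, speech_data: bytes) -> str:
        """Return the recognition result for the given speech data."""


class EchoEngine(SpeechEngine):
    """Stand-in engine that returns a fixed text (for illustration only)."""

    def __init__(self, name, text):
        super().__init__(name)
        self.text = text

    def recognize(self, speech_data):
        return self.text


class EngineSelector:
    """Hypothetical speech engine selection module."""

    def __init__(self, engines):
        if len(engines) < 2:
            raise ValueError("the system requires at least two speech engine modules")
        self.engines = list(engines)
        self.active = None
        for engine in self.engines:
            engine.connect()  # initially reachable through the unified interface

    def switch_to(self, engine):
        """Connect to the chosen engine and disconnect from all others."""
        for candidate in self.engines:
            if candidate is engine:
                candidate.connect()
            else:
                candidate.disconnect()
        self.active = engine

    def recognize(self, speech_data):
        """Voice application interface: stays the same whichever engine is active."""
        if self.active is None:
            raise RuntimeError("no speech engine has been selected yet")
        return self.active.recognize(speech_data)
```

In this sketch the speech application only ever calls `EngineSelector.recognize`, which mirrors the idea that the application does not need to know which engine module is currently connected.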
In addition, the present invention provides a corresponding multi-speech-engine switching method based on a smart TV platform, comprising:
A. when the user runs a speech application and uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface;
B. the speech engine selection module sends the speech data to each speech engine module through the speech engine interface;
C. each speech engine module recognizes the speech data and returns the recognition result to the speech engine selection module;
D. the speech engine selection module records and compares the response time each speech engine module takes to return its recognition result, and switches to the speech engine module with the shortest response time.
Further, in step D, switching to the speech engine module with the shortest response time means that the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules.
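Steps B to D can be sketched as a small timing loop. This is a simplified illustration, not the patented implementation: the function name `pick_fastest`, the representation of engines as a dictionary of callables, and the injectable `clock` parameter are assumptions, and the engines are queried one after another here purely to keep the example short.

```python
import time


def pick_fastest(engines, speech_data, clock=time.perf_counter):
    """Send the same speech data to every engine and pick the quickest responder.

    engines: dict mapping an engine name to a callable(speech_data) -> text
             (a hypothetical stand-in for the speech engine modules).
    Returns (fastest_name, timings, results).
    """
    timings = {}
    results = {}
    for name, recognize in engines.items():
        start = clock()
        results[name] = recognize(speech_data)   # step C: engine returns its result
        timings[name] = clock() - start          # step D: record the response time
    fastest = min(timings, key=timings.get)      # step D: shortest response time wins
    return fastest, timings, results
```

Passing the clock in as a parameter is a design choice for testability; in normal use the default `time.perf_counter` measures elapsed wall-clock time.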
The beneficial effects of the present invention are as follows. By comparing the response time (that is, the recognition speed) with which each speech engine module returns its recognition result and switching to the module with the shortest response time, the speech application can call the speech engine module with the highest recognition efficiency, which improves the overall efficiency of speech recognition. In addition, because the connection carrier between the speech application and the speech engine selection module (the voice application interface) remains unchanged, the speech application does not need to know which speech engine module has been switched to when a switch occurs, which ensures the stability and continuity of speech recognition.
Brief description of the drawings
Fig. 1 is an architecture diagram of the multi-speech-engine switching system based on a smart TV platform according to the present invention;
Fig. 2 is a flowchart of the multi-speech-engine switching method based on a smart TV platform according to the present invention.
Detailed description
The principle of the present invention is as follows: because the speech engine modules in the system differ in performance, some process speech data faster than others. A speech engine selection module can therefore be provided to record and compare the response time of each speech engine module when processing speech data, find the module with the shortest processing time and the fastest response, and switch the connection to that module. Moreover, because introducing the speech engine selection module leaves the application interface between it and the speech application unchanged, the system's stability problem is addressed as well.
Referring to Fig. 1, the multi-speech-engine switching system based on a smart TV platform according to the present invention comprises a speech engine selection module and multiple speech engine modules. All speech engine modules are encapsulated behind a unified speech engine interface and connect to the speech engine selection module through that interface. The speech engine selection module is connected to the speech application through the voice application interface.
Each speech engine module obtains from the speech engine interface the speech data sent by the speech engine selection module, recognizes the speech data, and returns the recognition result to the speech engine selection module. When the speech application uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface, sends it to each speech engine module through the speech engine interface, receives the recognition results returned by all speech engine modules, records and compares the response time each speech engine module takes to return its result, and switches to the speech engine module with the shortest response time, so that the speech application can call the speech engine module with the highest recognition efficiency.
Fig. 2 shows the flow of the corresponding switching method, which comprises the following steps:
A. when the user runs a speech application and uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface; this speech data comes from the sound source signal captured by the smart TV's voice capture device;
B. the speech engine selection module sends the speech data to each speech engine module through the speech engine interface; because a unified speech engine interface is used for encapsulation, every speech engine module can receive the same speech data at the same time;
C. each speech engine module recognizes the speech data and returns the recognition result to the speech engine selection module;
D. the speech engine selection module records and compares the response time each speech engine module takes to return its recognition result, and switches to the speech engine module with the shortest response time: the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules. After this, the speech application achieves fast speech recognition by calling the speech engine module with the shortest response time, which improves the user's voice interaction experience.
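Because step B notes that every engine module can receive the same speech data at the same time, the flow can also be sketched with concurrent dispatch. The following is a minimal illustration using Python's standard thread pool; the names `timed_call` and `dispatch_and_switch`, the dictionary of callables standing in for engine modules, and the `connections` map that models "connected" versus "disconnected" are all assumptions, not part of the patent.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(recognize, speech_data):
    """Run one engine and return (recognition_text, response_time_in_seconds)."""
    start = time.perf_counter()
    text = recognize(speech_data)
    return text, time.perf_counter() - start


def dispatch_and_switch(engines, speech_data):
    """Send the same data to all engines at once, then keep only the fastest.

    engines: dict mapping an engine name to a callable(speech_data) -> text
             (hypothetical stand-ins for the speech engine modules).
    Returns (fastest_name, connections, text_from_fastest_engine).
    """
    # Step B: one worker per engine so all engines start on the same data together.
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {name: pool.submit(timed_call, fn, speech_data)
                   for name, fn in engines.items()}
        # Step C: collect every engine's result and measured response time.
        outcomes = {name: future.result() for name, future in futures.items()}
    # Step D: the engine with the shortest response time stays connected.
    fastest = min(outcomes, key=lambda name: outcomes[name][1])
    connections = {name: name == fastest for name in engines}
    return fastest, connections, outcomes[fastest][0]
```

The sketch waits for all engines to answer before comparing times, matching the description that the selection module receives the results returned by all engine modules.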

Claims (2)

1. A multi-speech-engine switching system based on a smart TV platform, characterized by comprising: a speech engine selection module and at least two speech engine modules; all speech engine modules are encapsulated behind a unified speech engine interface and are connected to the speech engine selection module through the speech engine interface; the speech engine selection module is connected to the speech application through the voice application interface;
the speech engine module is configured to obtain from the speech engine interface the speech data sent by the speech engine selection module, recognize the speech data, and return the recognition result to the speech engine selection module; the speech engine selection module is configured to, when the speech application uses the speech recognition function, obtain the collected speech data through the voice application interface, send the speech data to each speech engine module through the speech engine interface, receive the recognition results returned by all speech engine modules, record and compare the response time each speech engine module takes to return its recognition result, and switch to the speech engine module with the shortest response time, so that the speech application can call the speech engine module with the highest recognition efficiency;
switching to the speech engine module with the shortest response time means: the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules.
2. A multi-speech-engine switching method based on a smart TV platform, applied in the system of claim 1, characterized by comprising:
A. when the user runs a speech application and uses the speech recognition function, the speech engine selection module obtains the collected speech data through the voice application interface;
B. the speech engine selection module sends the speech data to each speech engine module through the speech engine interface;
C. each speech engine module recognizes the speech data and returns the recognition result to the speech engine selection module;
D. the speech engine selection module records and compares the response time each speech engine module takes to return its recognition result, and switches to the speech engine module with the shortest response time;
in step D, switching to the speech engine module with the shortest response time means: the speech engine selection module connects, through the speech engine interface, to the speech engine module with the shortest response time and at the same time disconnects from the other speech engine modules.
CN201210558320.XA 2012-12-20 2012-12-20 Multi-speech-engine switching system and method based on a smart TV platform Active CN103117058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210558320.XA CN103117058B (en) 2012-12-20 2012-12-20 Multi-speech-engine switching system and method based on a smart TV platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210558320.XA CN103117058B (en) 2012-12-20 2012-12-20 Multi-speech-engine switching system and method based on a smart TV platform

Publications (2)

Publication Number Publication Date
CN103117058A CN103117058A (en) 2013-05-22
CN103117058B true CN103117058B (en) 2015-12-09

Family

ID=48415416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210558320.XA Active CN103117058B (en) 2012-12-20 2012-12-20 Multi-speech-engine switching system and method based on a smart TV platform

Country Status (1)

Country Link
CN (1) CN103117058B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103336687B (en) * 2013-06-17 2016-09-14 深圳市金立通信设备有限公司 The changing method of a kind of application interface and terminal
CN103714814A (en) * 2013-12-11 2014-04-09 四川长虹电器股份有限公司 Voice introducing method of voice recognition engine
CN104795069B (en) * 2014-01-21 2020-06-05 腾讯科技(深圳)有限公司 Speech recognition method and server
CN105609102B (en) * 2014-11-21 2021-03-16 中兴通讯股份有限公司 Voice engine parameter configuration method and device
CN107018228B (en) * 2016-01-28 2020-03-31 中兴通讯股份有限公司 Voice control system, voice processing method and terminal equipment
CN107526512B (en) * 2017-08-31 2020-11-20 联想(北京)有限公司 Switching method and system for electronic equipment
CN107657031A (en) * 2017-09-28 2018-02-02 四川长虹电器股份有限公司 Method based on android system management intelligent sound box voice technical ability
CN109036427B (en) * 2018-09-25 2021-01-26 苏宁智能终端有限公司 Method and system for dynamically configuring voice recognition service
CN111179934A (en) * 2018-11-12 2020-05-19 奇酷互联网络科技(深圳)有限公司 Method of selecting a speech engine, mobile terminal and computer-readable storage medium
CN109410926A (en) * 2018-11-27 2019-03-01 恒大法拉第未来智能汽车(广东)有限公司 Voice method for recognizing semantics and system
CN109493862B (en) * 2018-12-24 2021-11-09 深圳Tcl新技术有限公司 Terminal, voice server determination method, and computer-readable storage medium
CN109949816A (en) * 2019-02-14 2019-06-28 安徽云之迹信息技术有限公司 Robot voice processing method and processing device, cloud server
CN109947651B (en) * 2019-03-21 2022-08-02 上海智臻智能网络科技股份有限公司 Artificial intelligence engine optimization method and device
CN110708365A (en) * 2019-09-23 2020-01-17 杭州迪普科技股份有限公司 Data receiver selection method and device
CN113450785B (en) * 2020-03-09 2023-12-19 上海擎感智能科技有限公司 Implementation method, system, medium and cloud server for vehicle-mounted voice processing
CN113593535A (en) * 2021-06-30 2021-11-02 青岛海尔科技有限公司 Voice data processing method and device, storage medium and electronic device
CN114446279A (en) * 2022-02-18 2022-05-06 青岛海尔科技有限公司 Voice recognition method, voice recognition device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1323435A (en) * 1998-10-02 2001-11-21 国际商业机器公司 System and method for providing network coordinated conversational services
CN1429019A (en) * 2001-12-18 2003-07-09 松下电器产业株式会社 TV set with sound discrimination function and its control method
CN1633679A (en) * 2001-12-29 2005-06-29 摩托罗拉公司 Method and apparatus for multi-level distributed speech recognition
CN1723487A (en) * 2002-12-13 2006-01-18 摩托罗拉公司 Method and apparatus for selective speech recognition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6480819B1 (en) * 1999-02-25 2002-11-12 Matsushita Electric Industrial Co., Ltd. Automatic search of audio channels by matching viewer-spoken words against closed-caption/audio content for interactive television

Also Published As

Publication number Publication date
CN103117058A (en) 2013-05-22

Similar Documents

Publication Publication Date Title
CN103117058B (en) Multi-speech-engine switching system and method based on a smart TV platform
CN103093755B (en) Based on terminal and mutual network household electric appliance control method and the system of internet voice
CN103440867B (en) Audio recognition method and system
CN102855872A (en) Method and system for controlling household appliance on basis of voice interaction between terminal and internet
CN102855874B (en) Method and system for controlling household appliance on basis of voice interaction of internet
CN110473546B (en) Media file recommendation method and device
WO2020238209A1 (en) Audio processing method, system and related device
CN104867492A (en) Intelligent interaction system and method
JP6783339B2 (en) Methods and devices for processing audio
US11457061B2 (en) Creating a cinematic storytelling experience using network-addressable devices
CN107018228B (en) Voice control system, voice processing method and terminal equipment
CN102831892A (en) Toy control method and system based on internet voice interaction
WO2022037526A1 (en) Speech recognition method, apparatus, electronic device and storage medium
CN103730115A (en) Method and device for detecting keywords in voice
JP2019091429A (en) Method and apparatus for processing information
CN110992955A (en) Voice operation method, device, equipment and storage medium of intelligent equipment
CN102847325A (en) Toy control method and system based on voice interaction of mobile communication terminal
CN113889113A (en) Sentence dividing method and device, storage medium and electronic equipment
CN103095927A (en) Displaying and voice outputting method and system based on mobile communication terminal and glasses
EP3059731A1 (en) Method and apparatus for automatically sending multimedia file, mobile terminal, and storage medium
CN111833857A (en) Voice processing method and device and distributed system
CN101588415A (en) Voice service method and voice service system
CN113936655A (en) Voice broadcast processing method and device, computer equipment and storage medium
CN110619876A (en) Voice processing method and device based on power transmission mobile application
CN111583916A (en) Voice recognition method, device, equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant