CN104506906B

CN104506906B - Voice interaction assistance method and system based on TV scene elements and a voice assistant

Info

Publication number: CN104506906B
Application number: CN201410634282.0A
Authority: CN
Inventors: 黄海兵
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2014-11-12
Filing date: 2014-11-12
Publication date: 2019-01-18
Anticipated expiration: 2034-11-12
Also published as: CN104506906A

Abstract

The present invention relates to a voice interaction assistance method and system based on TV scene elements and a voice assistant. The TV playback software and the voice assistant run independently. The voice assistant obtains the scene information of the running TV playback software and matches the speech recognition conversion result against the obtained scene information; for the matched scene information, the TV playback software then performs the operation according to the scene element information, the scene state information, and the voice information. Because the method and system operate on and use the real-time scene information of the TV, they make voice-controlled television genuinely intelligent. At the same time, because the voice assistant runs separately from the TV playback software, one voice assistant can work with multiple TV playback software programs, which greatly saves system resources. In addition, the speech engine becomes easier to update and improve, which promotes the development of voice technology toward intelligence.

Description

Voice interaction assistance method and system based on TV scene elements and a voice assistant

Technical field

The present invention relates to a voice interaction assistance method and system, and more particularly to a voice interaction assistance method and system based on TV scene elements and a voice assistant.

Background art

Although emerging technologies such as smartphones and the Internet have greatly changed how people work and live, in the home the television still holds an irreplaceable position as an information medium. As science and technology have advanced, television technology has also progressed considerably and has now reached the smart stage, with smart TVs increasingly widely used in daily life. With the development of voice technology, voice-controlled TVs are likewise becoming part of people's lives. At present, voice-controlled TVs typically embed voice software in the TV playback module to perform voice control, and most can operate only on specific, predefined items. Because scene information changes as the TV software runs, existing embedded approaches cannot operate on or make use of the TV's real-time scene information. In addition, when multiple TV playback software programs are loaded on a smart TV platform, each one needs its own complex embedded voice development before it can be used with voice. Each program also occupies a large amount of memory when loaded, and loading several at once requires so much memory that system performance suffers. As speech recognition accuracy rises, speech engines grow larger and voice control becomes more intelligent, which requires the speech engine itself to be continually updated and developed; embedding the voice function clearly limits the development of voice control.

Summary of the invention

The technical problem solved by the present invention is to construct a voice interaction assistance method and system based on TV scene elements and a voice assistant, overcoming the technical problems that the prior art cannot operate on or use the TV's real-time scene information, degrades system performance, and limits the development of voice control on TVs.

The technical solution of the present invention is to provide a voice interaction assistance method based on TV scene elements and a voice assistant, involving TV playback software and a voice assistant that run independently of each other. The voice interaction assistance method includes the following steps:

Obtaining scene information: the voice assistant obtains the scene information of the running TV playback software; the scene information includes scene element information;

Inputting voice: the voice assistant collects voice information and performs speech recognition conversion on it;

Matching and executing: the voice assistant matches the speech recognition conversion result against the obtained scene information; if the scene element information of the running TV playback software and the speech recognition result are identical or similar in their related information, the voice assistant transmits the matched scene element information to the TV playback software, and the TV playback software executes the item corresponding to that scene element information.

In a further technical solution of the present invention, the TV playback software and the voice assistant establish a communication connection through a reserved interface of the TV playback software, or through a private protocol.

In a further technical solution of the present invention, the TV playback software includes multiple independently running TV playback software programs, and the voice assistant cooperates with the currently active one.

In a further technical solution of the present invention, the method further involves a network server: the voice assistant uploads the collected scene information to the network server, the network server matches the scene information against pre-stored information, and the matched information is transmitted to the voice assistant.

In a further technical solution of the present invention, being identical or similar in the related information means that the related information is identical or similar in pronunciation, text, text meaning, category, or operation information, or that parts of the information on each side of the match are identical or similar in pronunciation, text, text meaning, category, or operation information.

Another technical solution of the present invention is to construct a voice interaction assistance system based on TV scene elements and a voice assistant, including TV playback software and a voice assistant that run independently of each other. The TV playback software includes an acquisition module that collects scene information, a communication module that communicates with the voice assistant, and an execution module. The voice assistant includes an information acquisition module that obtains the scene information of the running TV playback software, a voice acquisition module that collects voice information, a speech recognition module that performs speech recognition conversion, a matching module, and a transmission module. The information acquisition module obtains the scene information of the running TV playback software; the scene information includes scene element information. The voice acquisition module collects voice information, and the speech recognition module performs speech recognition conversion on it. The matching module matches the speech recognition conversion result against the obtained scene information. If the scene element information of the running TV playback software and the speech recognition result are identical or similar in their related information, the transmission module transmits the matched scene element information to the TV playback software, and the execution module executes the item corresponding to that scene element information.

In a further technical solution of the present invention, the TV playback software includes a first information output module, or the voice assistant includes a second information output module.

The technical effect of the present invention is as follows. A voice interaction assistance method and system based on TV scene elements and a voice assistant are constructed, including TV playback software and a voice assistant that run independently of each other. The voice assistant obtains the scene information of the running TV playback software, and the scene information includes scene element information. The voice assistant collects voice information and performs speech recognition conversion on it. The voice assistant matches the speech recognition conversion result against the obtained scene information; if the scene element information of the running TV playback software and the speech recognition result are identical or similar in their related information, the voice assistant transmits the matched scene element information to the TV playback software, which executes the item corresponding to that scene element information. For the matched scene information, the TV playback software performs the operation according to the scene element information, the scene state information, and the voice information. Because the method and system operate on and use the TV's real-time scene information, they make voice-controlled television genuinely intelligent. At the same time, because the voice assistant runs separately from the TV playback software, one voice assistant can work with multiple TV playback software programs, which greatly saves system resources. In addition, the speech engine becomes easier to update and improve, which promotes the development of voice technology toward intelligence.

Brief description of the drawings

Fig. 1 is a schematic structural diagram of the present invention.

Fig. 2 is a schematic structural diagram of a preferred embodiment of the present invention.

Detailed description of the embodiments

The technical solution of the present invention is further described below with reference to specific embodiments.

As shown in Fig. 1, a specific embodiment of the present invention provides a voice interaction assistance method based on TV scene elements and a voice assistant, involving TV playback software 1 and a voice assistant 2 that run independently of each other. The voice interaction assistance method includes the following steps:

Obtaining scene information: the voice assistant 2 obtains the scene information of the running TV playback software 1; the scene information includes scene element information.

The specific implementation is as follows. The voice assistant 2 can obtain the scene information of the running TV playback software 1 in two ways. In the first, the TV playback software 1 collects its own running scene information in the background; this collection method is comprehensive, accurate, and fast, and is the preferred way. In the second, the voice assistant 2 collects the scene information of the running TV playback software 1 through a reserved interface of the TV playback software 1; in this case, how much information can be collected depends on what the reserved interface offers. Scene information collected by the TV playback software 1 is transmitted by the TV playback software 1 to the voice assistant 2, which completes the acquisition. When the voice assistant 2 collects the scene information through the reserved interface, the collection itself is the acquisition. The scene information includes scene element information. The scene element information includes the visual information presented on the running details interface, specifically the text information, picture information, video titles, and so on of the running interface; the text information on the running details interface is the most important. The scene state information mainly includes the operation information that the running interface involves, for example playing video, playing music, or running a game. In a specific embodiment, most of the collected element information is usually converted into text information.
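The scene information described above can be pictured as a small data model. The following is only a minimal sketch under stated assumptions: the class names `SceneElement` and `SceneInfo`, their fields, and the helper `element_texts` are hypothetical and do not come from the patent, which does not prescribe any data format. The helper illustrates the remark that collected element information is mostly converted into text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SceneElement:
    """One item shown on the running details interface (all fields hypothetical)."""
    title: str                      # text information, e.g. a program title
    category: str = ""              # category the element belongs to
    people: List[str] = field(default_factory=list)  # hosts or production personnel
    station: str = ""               # TV station the element belongs to


@dataclass
class SceneInfo:
    elements: List[SceneElement]    # scene element information
    operations: List[str]           # scene state information, e.g. "play video"


def element_texts(scene: SceneInfo) -> List[str]:
    """Flatten the collected element information into plain text strings."""
    texts: List[str] = []
    for element in scene.elements:
        texts.append(element.title)
        if element.category:
            texts.append(element.category)
        texts.extend(element.people)
        if element.station:
            texts.append(element.station)
    return texts
```

Keeping scene state information (`operations`) separate from scene element information mirrors the distinction the description draws between what is displayed and what can be done on the current interface.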

Inputting voice: the voice assistant 2 collects voice information and performs speech recognition conversion on it.

The specific implementation is as follows: voice information is input through an external voice input device; the voice assistant 2 collects the voice information and then performs speech recognition conversion on it. In a specific embodiment, the speech recognition conversion result includes text information and may also involve operation information. For example, for the utterance "Open Happy Camp" (快乐大本营), the speech recognition conversion result involves operation information as well as text information.
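To make the distinction between operation information and text information concrete, here is a minimal sketch that splits an already-recognized transcript into the two parts. It assumes an English transcript and a tiny, hypothetical operation vocabulary (`OPERATION_WORDS`); the patent does not specify how the recognition result is structured, and real speech recognition is outside the scope of this sketch.

```python
from typing import NamedTuple, Optional

# Hypothetical operation vocabulary; a real assistant would use a richer grammar.
OPERATION_WORDS = ("open", "play", "pause", "select")


class RecognitionResult(NamedTuple):
    operation: Optional[str]  # operation information, if any was spoken
    text: str                 # remaining text information


def split_recognition(transcript: str) -> RecognitionResult:
    """Separate a leading operation word from the rest of the transcript."""
    words = transcript.strip().split()
    if words and words[0].lower() in OPERATION_WORDS:
        return RecognitionResult(words[0].lower(), " ".join(words[1:]))
    return RecognitionResult(None, " ".join(words))
```

With this sketch, the utterance "Open Happy Camp" yields the operation `open` and the text `Happy Camp`, while an utterance without a known operation word is treated as text only.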

Matching and executing: the voice assistant 2 matches the speech recognition conversion result against the obtained scene information; if the scene element information of the running TV playback software 1 and the speech recognition result are identical or similar in their related information, the voice assistant 2 transmits the matched scene element information to the TV playback software 1, and the TV playback software 1 executes the item corresponding to that scene element information.

The specific implementation is as follows. The voice assistant 2 matches the speech recognition conversion result against the obtained scene information, mainly on the pronunciation, text, text meaning, or operation information of their respective related information. The scene element information includes one or more of: the title of the scene element, the category the scene element belongs to, the production personnel involved, and the content the scene element relates to. Being identical or similar in the related information includes the related information being identical or similar in pronunciation, text, text meaning, category, or operation information. For example, if the current scene element information is "Happy Camp" (快乐大本营), matching can be done on the pronunciation or text of "Happy Camp", on its category (Happy Camp is a variety show), on its hosts, on the TV station it belongs to, and so on. Alternatively, parts of the information on each side of the match may be identical or similar in pronunciation, text, text meaning, category, or operation information. For example, if the current scene element information is 快乐大本营, its parts 快乐 ("happy") and 大本营 ("base camp") can be used for matching: if the speech recognition result contains 快乐 or 大本营, 快乐大本营 can still be matched as related. Once a match is found, the voice assistant 2 transmits the matched scene element information to the TV playback software 1, which executes the item corresponding to that scene element information. For example, if the scene element information shows the program "Happy Camp", then after a match the voice assistant 2 transmits the "Happy Camp" information to the TV playback software 1, which executes the "Happy Camp" program; execution includes operations such as selecting and clicking.
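The two matching modes described above (whole-information matching and partial-information matching) might be sketched as follows. This is an illustration under stated assumptions, not the patented matching algorithm: it works on text only, uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for "similar", uses English word overlap as a stand-in for matching on parts such as 快乐 and 大本营, and the dictionary keys and the 0.8 threshold are hypothetical. Matching on pronunciation or text meaning is not covered.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional

FIELDS = ("title", "category", "host", "station")  # hypothetical keys


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two strings as 'identical or similar' above a similarity ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match_element(query: str, elements: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return the first scene element related to the recognized text, else None."""
    # Mode 1: the whole query is identical or similar to a whole field.
    for element in elements:
        for key in FIELDS:
            value = element.get(key, "")
            if value and similar(query, value):
                return element
    # Mode 2: part of the query matches part of the title (word overlap).
    query_words = set(query.lower().split())
    for element in elements:
        if query_words & set(element["title"].lower().split()):
            return element
    return None
```

In this sketch the returned element plays the role of the matched scene element information that the voice assistant would then transmit to the TV playback software for execution.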

As shown in Fig. 1, in a preferred embodiment of the present invention, when the TV playback software 1 and the voice assistant 2 exchange information, they establish a communication connection in one of two ways: through a reserved interface of the TV playback software 1, or through a private protocol. Correspondingly, the voice assistant 2 obtains the collected running scene information in one of two ways: the TV playback software 1 transmits it to the voice assistant 2, or the voice assistant 2 collects it directly from the TV playback software 1. In the first case, the TV playback software 1 collects its running scene information, establishes a communication connection with the voice assistant 2, and then transmits the collected running scene information to the voice assistant 2. In the second case, the voice assistant 2 establishes a communication connection with the TV playback software 1 through an interface that the TV playback software 1 reserves, and directly collects the running scene information of the TV playback software 1. Much current software reserves communication interfaces for particular functions: for example, some software reserves an interface for reading text aloud to elderly users who cannot see clearly, and some reserves an auxiliary operation interface for blind users. The voice assistant 2 can establish its communication connection with the TV playback software 1 through such functional interfaces. Alternatively, the voice assistant 2 establishes the connection through a private protocol: a private protocol for communication between the voice assistant 2 and the TV playback software 1 is constructed, and the communication connection between them is realized over it.
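The description does not define the private protocol, so the following is purely a hypothetical illustration of what a message format between the two independently running programs could look like: newline-terminated JSON with a `type` field. The message types `scene_info` and `execute` and the function names are assumptions made for this sketch.

```python
import json
from typing import Any, Dict, Tuple


def encode_message(msg_type: str, payload: Dict[str, Any]) -> bytes:
    """Serialize one protocol message as a newline-terminated UTF-8 JSON line."""
    body = json.dumps({"type": msg_type, "payload": payload}, ensure_ascii=False)
    return (body + "\n").encode("utf-8")


def decode_message(raw: bytes) -> Tuple[str, Dict[str, Any]]:
    """Parse one protocol message back into its type and payload."""
    message = json.loads(raw.decode("utf-8"))
    return message["type"], message["payload"]


# Hypothetical message types:
#   "scene_info": TV playback software -> voice assistant (collected scene information)
#   "execute":    voice assistant -> TV playback software (matched scene element)
```

A line-oriented text format is chosen here only because it is easy to inspect; any transport (local socket, inter-process messaging of the host system) could carry such messages.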

As shown in Fig. 1, in a preferred embodiment of the present invention, the TV playback software 1 includes multiple independently running TV playback software programs, and the voice assistant 2 cooperates with the currently active one. The specific implementation is as follows. If only one TV playback software 1 is running in the current environment, the voice assistant 2 cooperates with it. If multiple TV playback software 1 programs are running in the current system environment, the voice assistant 2 obtains the current TV playback software 1 from the system, for example the Android system, then establishes a communication connection with it and carries out the related work.
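The selection rule above can be sketched as a small function. This is a sketch under assumptions: the identifiers are plain strings, and `foreground_id` stands for whatever the host system (for example Android) reports as the current program; how that value is queried is not specified in the description and is not shown here.

```python
from typing import List, Optional


def select_target(running: List[str], foreground_id: Optional[str]) -> Optional[str]:
    """Pick the TV playback software the voice assistant should cooperate with."""
    if not running:
        return None              # nothing to cooperate with
    if len(running) == 1:
        return running[0]        # only one program is running
    if foreground_id in running:
        return foreground_id     # several are running: use the current one
    return None                  # current program is not a known playback program
```

Returning `None` when the system's current program is not among the known playback programs is a design choice of this sketch; the description does not cover that case.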

As shown in Fig. 2, in a preferred embodiment of the present invention, the system further includes a network server 3. The voice assistant 2 uploads the collected scene information to the network server 3, the network server 3 matches the scene information against pre-stored information, and the matched information is transmitted to the voice assistant 2. For example, if the scene information is "If You Are the One" (非诚勿扰), the network server 3 has pre-stored information related to "If You Are the One", such as its introduction, information about its host, and links to its songs. The network server 3 transmits this related information to the voice assistant 2, and the voice assistant 2 organizes it into an information list. The list can be displayed directly for the user to view, play, and so on; it can also be transmitted to the TV playback software 1 and displayed there, or transmitted to a mobile terminal and displayed there.
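The server-side lookup and the information list assembled by the voice assistant might look like the sketch below. The in-memory dictionary stands in for the network server's pre-stored information, the values are placeholders rather than real data, and the function name `lookup_related` and the entry keys are hypothetical.

```python
from typing import Dict, List

# Placeholder store standing in for the network server's pre-stored information.
PRESTORED: Dict[str, Dict[str, str]] = {
    "If You Are the One": {
        "introduction": "<introduction text>",
        "host": "<host information>",
        "song_link": "<song link>",
    },
}


def lookup_related(scene_titles: List[str],
                   store: Dict[str, Dict[str, str]]) -> List[Dict[str, str]]:
    """Match uploaded scene titles against the store and build an information list."""
    results: List[Dict[str, str]] = []
    for title in scene_titles:
        for kind, value in store.get(title, {}).items():
            results.append({"title": title, "kind": kind, "value": value})
    return results
```

Each entry in the returned list carries its title and kind, so the same list could be shown by the voice assistant, by the TV playback software, or on a mobile terminal.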

As shown in Fig. 1, a specific embodiment of the present invention constructs a voice interaction assistance system based on TV scene elements and a voice assistant, including TV playback software 1 and a voice assistant 2 that run independently of each other. The TV playback software 1 includes an acquisition module 11 that collects scene information, a communication module 12 that communicates with the voice assistant, and an execution module 13. The voice assistant 2 includes an information acquisition module 21 that obtains the scene information of the running TV playback software 1, a voice acquisition module 22 that collects voice information, a speech recognition module 23 that performs speech recognition conversion, a matching module 24, and a transmission module 25. The information acquisition module 21 obtains the scene information of the running TV playback software 1; the scene information includes scene element information. The voice acquisition module 22 collects voice information, and the speech recognition module 23 performs speech recognition conversion on it. The matching module 24 matches the speech recognition conversion result against the obtained scene information. If the scene element information of the running TV playback software 1 and the speech recognition result are related in pronunciation, text, text meaning, or operation information, the transmission module 25 transmits the matched scene element information to the TV playback software 1, and the execution module 13 executes the item corresponding to that scene element information.
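The division of labour between the modules can be sketched as two cooperating classes. This is an illustrative sketch, not the patented implementation: both classes live in one process, a direct method call stands in for the communication module 12, the recognizer is an injected stand-in function rather than a real speech engine, and matching is reduced to a case-insensitive substring test. All class and method names are hypothetical.

```python
from typing import Callable, List, Optional


class TVPlaybackSoftware:
    def __init__(self, titles: List[str]) -> None:
        self._titles = list(titles)
        self.executed: List[str] = []

    def collect_scene(self) -> List[str]:
        """Role of acquisition module 11: report the currently shown titles."""
        return list(self._titles)

    def execute(self, title: str) -> None:
        """Role of execution module 13: act on the matched element."""
        self.executed.append(title)


class VoiceAssistant:
    def __init__(self, tv: TVPlaybackSoftware, recognize: Callable[[bytes], str]) -> None:
        self.tv = tv
        self.recognize = recognize  # stand-in for modules 22 and 23

    def handle(self, audio: bytes) -> Optional[str]:
        scene = self.tv.collect_scene()       # role of information acquisition module 21
        text = self.recognize(audio).lower()  # speech recognition conversion result
        for title in scene:                   # role of matching module 24
            if title.lower() in text:
                self.tv.execute(title)        # role of transmission module 25
                return title
        return None
```

Because the assistant only depends on `collect_scene` and `execute`, the same assistant object could be pointed at a different playback program, which reflects the resource-saving argument made for running the two independently.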

As shown in Fig. 1, the specific implementation of the present invention is as follows. The information acquisition module 21 can obtain the scene information of the running TV playback software 1 in two ways. In the first, the TV playback software 1 collects its own running scene information in the background; this collection method is comprehensive, accurate, and fast, and is the preferred way. In the second, the voice assistant 2 collects the scene information of the running TV playback software 1 through a reserved interface of the TV playback software 1; in this case, how much information can be collected depends on what the reserved interface offers. Scene information collected by the TV playback software 1 is transmitted by the TV playback software 1 to the voice assistant 2, which completes the acquisition. When the voice assistant 2 collects the scene information through the reserved interface, the collection itself is the acquisition. The scene information includes scene element information. The scene element information includes the visual information presented on the running details interface, specifically the text information, picture information, video titles, and so on of the running interface; the text information on the running details interface is the most important. The scene state information mainly includes the operation information that the running interface involves, for example playing video, playing music, or running a game. In a specific embodiment, most of the collected element information is usually converted into text information.

Voice information is input through an external voice input device; the voice acquisition module 22 collects the voice information, and the speech recognition module 23 then performs speech recognition conversion on it. In a specific embodiment, the speech recognition conversion result includes text information and may also involve operation information. For example, for the utterance "Open Happy Camp" (快乐大本营), the speech recognition conversion result involves operation information as well as text information.

The matching module 24 matches the speech recognition conversion result against the obtained scene information, mainly on the pronunciation, text, text meaning, or operation information of their respective related information. The scene element information includes one or more of: the title of the scene element, the category the scene element belongs to, the production personnel involved, and the content the scene element relates to. Being identical or similar in the related information includes the related information being identical or similar in pronunciation, text, text meaning, category, or operation information. For example, if the current scene element information is "Happy Camp" (快乐大本营), matching can be done on the pronunciation or text of "Happy Camp", on its category (Happy Camp is a variety show), on its hosts, on the TV station it belongs to, and so on. Alternatively, parts of the information on each side of the match may be identical or similar in pronunciation, text, text meaning, category, or operation information. For example, if the current scene element information is 快乐大本营, its parts 快乐 ("happy") and 大本营 ("base camp") can be used for matching: if the speech recognition result contains 快乐 or 大本营, 快乐大本营 can still be matched as related. Once a match is found, the transmission module 25 transmits the matched scene element information to the TV playback software 1, and the execution module 13 executes the item corresponding to that scene element information. For example, if the scene element information shows the program "Happy Camp", then after a match the voice assistant 2 transmits the "Happy Camp" information to the TV playback software 1, and the execution module 13 executes the "Happy Camp" program; execution includes operations such as selecting and clicking.

As shown in Fig. 1, in a preferred embodiment of the present invention, the TV playback software 1 includes multiple independently running TV playback software programs, and the voice assistant 2 cooperates with the currently active one. The specific implementation is as follows. If only one TV playback software 1 is running in the current environment, the voice assistant 2 cooperates with it. If multiple TV playback software 1 programs are running in the current system environment, the voice assistant 2 obtains the current TV playback software 1 from the system, for example the Android system, then establishes a communication connection with it and carries out the related work.

As shown in Fig. 2, in a preferred embodiment of the present invention, the system further includes a network server 3. The voice assistant 2 uploads the collected scene information to the network server 3, the network server 3 matches the scene information against pre-stored information, and the matched information is transmitted to the voice assistant 2. For example, if the scene information is "If You Are the One" (非诚勿扰), the network server 3 has pre-stored information related to "If You Are the One", such as its introduction, information about its host, and links to its songs. The network server 3 transmits this related information to the voice assistant 2, and the voice assistant 2 organizes it into an information list. The second information output module 26 can display the list directly for the user to view, play, and so on; the list can also be transmitted to the TV playback software 1 and displayed by the first information output module 14, or transmitted to a mobile terminal and displayed there.

The technical effect of the present invention is as follows. A voice interaction assistance method and system based on TV scene elements and a voice assistant is constructed, comprising TV playback software 1 and a voice assistant 2 that run independently of each other. The voice assistant 2 obtains the scene information of the running TV playback software 1, the scene information including scene element information. The voice assistant 2 acquires voice information, performs speech recognition conversion on it, and matches the speech recognition conversion result against the obtained scene information. If the scene element information of the running TV playback software 1 is related to the speech recognition result in pronunciation, text, textual meaning, or operation information, the voice assistant 2 transmits the matched scene element information to the TV playback software 1, and the TV playback software 1 executes the item corresponding to that scene element information. For the matched scene information, the TV playback software 1 carries out the operation according to the scene element information, the scene state information, and the voice information. Because the television is operated and used according to its real-time scene information, voice-controlled television becomes genuinely intelligent. At the same time, because the voice assistant 2 runs separately from and independently of the TV playback software 1, one voice assistant 2 can work with multiple TV playback software 1 programs, which greatly saves system resources. In addition, the speech engine can be conveniently updated and improved, which promotes the development of voice technology toward intelligence.
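The obtain, match, and execute flow summarized above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class and method names are hypothetical, speech recognition is assumed to have already produced `recognized_text`, and only textual similarity is modeled (using the standard-library `difflib.SequenceMatcher`), whereas the description also allows matching on pronunciation, textual meaning, and operation information. The `threshold` value of 0.6 is an arbitrary assumption.

```python
from __future__ import annotations

from difflib import SequenceMatcher


class TVPlaybackSoftware:
    """Stand-in for TV playback software 1 (hypothetical interface)."""

    def __init__(self, scene_elements: list[str]):
        self.scene_elements = scene_elements  # e.g. labels visible on the current screen
        self.executed = []  # records which items were executed

    def get_scene_info(self) -> list[str]:
        """Expose the current scene element information to the assistant."""
        return list(self.scene_elements)

    def execute(self, element: str) -> None:
        """Execute the item corresponding to the matched scene element."""
        self.executed.append(element)


class VoiceAssistant:
    """Stand-in for voice assistant 2, running independently of the TV software."""

    def __init__(self, tv: TVPlaybackSoftware, threshold: float = 0.6):
        self.tv = tv
        self.threshold = threshold  # assumed similarity cut-off, not from the patent

    def handle(self, recognized_text: str) -> str | None:
        """Match recognized text against scene elements and trigger execution."""
        best, best_score = None, 0.0
        for element in self.tv.get_scene_info():
            score = SequenceMatcher(
                None, recognized_text.lower(), element.lower()
            ).ratio()
            if score > best_score:
                best, best_score = element, score
        if best is not None and best_score >= self.threshold:
            self.tv.execute(best)  # the TV software, not the assistant, performs the action
            return best
        return None
```

Because the assistant in this sketch only depends on `get_scene_info` and `execute`, the same assistant object could be pointed at a different playback program that offers the same two calls, which mirrors the one-assistant, many-programs arrangement described above.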

The above is a further detailed description of the present invention in conjunction with specific preferred embodiments, and the specific implementation of the invention shall not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the inventive concept, and all of these shall be regarded as falling within the protection scope of the invention.

Claims

1. A voice interaction assistance method based on TV scene elements and a voice assistant, involving TV playback software and a voice assistant, wherein the TV playback software and the voice assistant run independently, and the TV playback software and the voice assistant establish a communication connection either through a reserved interface of the TV playback software or through a proprietary protocol, characterized in that the voice interaction assistance method comprises the following steps:

Obtaining scene information: the voice assistant obtains the scene information of the running TV playback software, the scene information including scene element information; the voice assistant obtains the scene information of the running TV playback software in one of two ways: in one way, the TV playback software collects its own running scene information in the background; in the other way, the voice assistant collects the scene information of the running TV playback software through the reserved interface of the TV playback software; the scene element information includes the visual information presented on the running detail interface;

Inputting voice: the voice assistant acquires voice information and performs speech recognition conversion on the voice information;

Matching and executing: the voice assistant matches the speech recognition conversion result against the obtained scene information; if the scene element information of the running TV playback software and the speech recognition result are identical or similar in related information, the voice assistant transmits the matched scene element information to the TV playback software, and the TV playback software executes the item corresponding to the scene element information.

2. The voice interaction assistance method based on TV scene elements and a voice assistant according to claim 1, characterized in that the TV playback software includes a plurality of independently running TV playback software programs, and the voice assistant works in cooperation with the currently active TV playback software.

3. The voice interaction assistance method based on TV scene elements and a voice assistant according to claim 1, characterized by further including a network server, wherein the voice assistant uploads the acquired scene information to the network server, and the network server matches the scene information against pre-stored information and transmits the matched information to the voice assistant.

4. The voice interaction assistance method based on TV scene elements and a voice assistant according to claim 1, characterized in that being identical or similar in related information includes being identical or similar in pronunciation, text, textual meaning, category, or operation information, or partial information of each of the two matching parties being identical or similar in pronunciation, text, textual meaning, category, or operation information.

5. A voice interaction assistance system based on TV scene elements and a voice assistant, characterized by including TV playback software and a voice assistant, wherein the TV playback software and the voice assistant run independently, and the TV playback software and the voice assistant establish a communication connection either through a reserved interface of the TV playback software or through a proprietary protocol; the TV playback software includes an acquisition module for collecting scene information, a communication module for communicating with the voice assistant, and an execution module; the voice assistant includes an information obtaining module for obtaining the scene information of the running TV playback software, a voice acquisition module for acquiring voice information, a speech recognition module for performing speech recognition conversion, a matching module, and a transmission module; the information obtaining module obtains the scene information of the running TV playback software, the scene information including scene element information; the voice acquisition module acquires voice information, and the speech recognition module performs speech recognition conversion on the voice information; the matching module matches the speech recognition conversion result against the obtained scene information; if the scene element information of the running TV playback software and the speech recognition result are identical or similar in related information, the transmission module transmits the matched scene element information to the TV playback software, and the execution module executes the item corresponding to the scene element information; the voice assistant obtains the scene information of the running TV playback software in one of two ways: in one way, the TV playback software collects its own running scene information in the background; in the other way, the voice assistant collects the scene information of the running TV playback software through the reserved interface of the TV playback software; the scene element information includes the visual information presented on the running detail interface.

6. The voice interaction assistance system based on TV scene elements and a voice assistant according to claim 5, characterized in that the TV playback software includes a plurality of independently running TV playback software programs, and the voice assistant works in cooperation with the currently active TV playback software.

7. The voice interaction assistance system based on TV scene elements and a voice assistant according to claim 5, characterized by further including a network server, wherein the voice assistant uploads the acquired scene information to the network server, and the network server matches the scene information against pre-stored information and transmits the matched information to the voice assistant.

8. The voice interaction assistance system based on TV scene elements and a voice assistant according to claim 7, characterized in that the TV playback software includes a first information output module, or the voice assistant includes a second information output module.