CN104506944A

CN104506944A - Voice interaction assisting method and system based on television scene and voice assistant

Info

Publication number: CN104506944A
Application number: CN201410634174.3A
Authority: CN
Inventors: 黄海兵
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2014-11-12
Filing date: 2014-11-12
Publication date: 2015-04-08
Anticipated expiration: 2034-11-12
Also published as: CN104506944B

Abstract

The invention relates to a voice interaction assisting method and system based on a television scene and a voice assistant. The television playback software and the voice assistant run independently of each other. The method comprises the following steps: the voice assistant acquires scene information from the running television playback software; the voice assistant matches the speech recognition conversion result against the acquired scene information; and the television playback software executes the matched scene information according to the scene element information, the scene state information and the voice information. Because the method and system operate on the real-time scene information of the television, voice-controlled television becomes genuinely intelligent. Because the voice assistant runs separately from the television playback software, one voice assistant can work with multiple pieces of television playback software, which greatly saves system resources. In addition, the speech engine can be updated and improved conveniently, which promotes the development of intelligent voice technology.

Description

Voice interaction assisting method and system based on television scene and voice assistant

Technical field

The present invention relates to a voice interaction assisting method and system, and in particular to a voice interaction assisting method and system based on a television scene and a voice assistant.

Background art

Although emerging technologies such as smartphones and the Internet have greatly changed how people work and live, the television still holds an irreplaceable position for delivering information in the home. With advances in science and technology, television technology has also developed significantly and has now reached the intelligent stage, and smart televisions are ever more widely used in daily life. With the development of voice technology, voice-controlled televisions are also becoming part of everyday life. At present, voice-controlled televisions typically perform voice control through voice software embedded in the television playback module. Most such software can only operate on specific, predefined operation items. Because scene information changes in real time as the television software runs, the existing embedded approach cannot operate on or make use of the real-time scene information of the television. In addition, when multiple pieces of television playback software are loaded on a smart television platform, each of them needs its own complex embedded voice development before it can be used. Loading this software also occupies a large amount of memory, particularly when several pieces of television playback software are loaded at the same time, which degrades system performance. As speech recognition becomes more capable, speech engines grow larger and voice control grows more intelligent, which requires the speech engine itself to be continually updated and developed. Embedding the voice function in each piece of software clearly and severely limits the development of voice control.

Summary of the invention

The technical problem solved by the present invention is to provide a voice interaction assisting method and system based on a television scene and a voice assistant, overcoming the problems in the prior art that the real-time scene information of the television cannot be operated on or used, that system performance is degraded, and that the development of voice control on televisions is restricted.

The technical solution of the present invention is to provide a voice interaction assisting method based on a television scene and a voice assistant, involving television playback software and a voice assistant that run independently of each other. The method comprises the following steps:

Obtaining scene information: the voice assistant obtains the scene information of the running television playback software, the scene information comprising scene element information or scene state information;

Inputting voice: the voice assistant collects voice information and performs speech recognition conversion on it;

Matching and executing: the voice assistant matches the speech recognition conversion result against the obtained scene information. If the scene element information of the running television playback software and the speech recognition result are the same or similar in related information, the voice assistant sends the matched scene element information to the television playback software, which executes the item corresponding to that scene element information. If the scene state information of the running television playback software and the speech recognition result are the same or similar in related information, the voice assistant calls the pre-built scene state template for that item, then sends the corresponding information of the scene state template to the television playback software according to the voice information, and the television playback software executes the item corresponding to that information.

In a further technical solution of the present invention, the television playback software and the voice assistant establish a communication connection either through a reserved interface of the television playback software or through a dedicated protocol.

In a further technical solution of the present invention, the television playback software comprises multiple independently running pieces of television playback software, and the voice assistant works with the one that is currently active.

In a further technical solution of the present invention, a network server is also included: the voice assistant uploads the collected scene information to the network server, the network server matches the scene information against pre-stored information, and the matched information is sent to the voice assistant.

In a further technical solution of the present invention, being the same or similar in related information includes the related information being the same or similar in pronunciation, wording, word meaning, category or operation information, or part of the information of each matching party being the same or similar in pronunciation, wording, word meaning, category or operation information.

The technical solution of the present invention also provides a voice interaction assisting system based on a television scene and a voice assistant, comprising television playback software and a voice assistant that run independently of each other. The television playback software comprises an acquisition module that collects scene information, a communication module that communicates with the voice assistant, and an execution module. The voice assistant comprises an information acquisition module that obtains the scene information of the running television playback software, a voice collection module that collects voice information, a speech recognition module that performs speech recognition conversion, a matching module, and a transmission module. The information acquisition module obtains the scene information of the running television playback software, the scene information comprising scene element information or scene state information. The voice collection module collects voice information, and the speech recognition module performs speech recognition conversion on it. The matching module matches the speech recognition conversion result against the obtained scene information. If the scene element information of the running television playback software and the speech recognition result are the same or similar in related information, the transmission module sends the matched scene element information to the television playback software, and the execution module executes the item corresponding to that scene element information. If the scene state information of the running television playback software and the speech recognition result are the same or similar in related information, the voice assistant calls the pre-built scene state template for that item, the transmission module sends the corresponding information of the scene state template to the television playback software according to the voice information, and the execution module executes the item corresponding to that information.

In a further technical solution of the present invention, the television playback software comprises a first information output module, or the voice assistant comprises a second information output module.

The technical effect of the present invention is as follows. A voice interaction assisting method and system based on a television scene and a voice assistant are provided, comprising television playback software and a voice assistant that run independently of each other. The voice assistant obtains the scene information of the running television playback software, the scene information comprising scene element information or scene state information; the voice assistant collects voice information and performs speech recognition conversion on it; and the voice assistant matches the speech recognition conversion result against the obtained scene information. If the scene element information of the running television playback software and the speech recognition result are the same or similar in related information, the voice assistant sends the matched scene element information to the television playback software, which executes the item corresponding to that scene element information. If the scene state information of the running television playback software and the speech recognition result are the same or similar in related information, the voice assistant calls the pre-built scene state template for that item, then sends the corresponding information of the scene state template to the television playback software according to the voice information, and the television playback software executes the item corresponding to that information. In short, the television playback software and the voice assistant run independently; the voice assistant obtains the scene information of the running television playback software and matches the speech recognition conversion result against it; and the television playback software then executes the matched scene information according to the scene element information, the scene state information and the voice information. Because the method and system operate on the real-time scene information of the television, voice-controlled television becomes genuinely intelligent. Because the voice assistant runs separately from the television playback software, one voice assistant can be used with multiple pieces of television playback software, which greatly saves system resources. In addition, the speech engine can be conveniently updated and improved, which promotes the development of intelligent voice technology.

Brief description of the drawings

Fig. 1 is a structural schematic diagram of the present invention.

Fig. 2 is a structural schematic diagram of a preferred embodiment of the present invention.

Detailed description of the embodiments

The technical solution of the present invention is further described below with reference to specific embodiments.

As shown in Fig. 1, a specific embodiment of the present invention provides a voice interaction assisting method based on a television scene and a voice assistant, involving television playback software 1 and a voice assistant 2 that run independently of each other. The method comprises the following steps:

Obtaining scene information: the voice assistant 2 obtains the scene information of the running television playback software 1, the scene information comprising scene element information or scene state information.

The specific implementation is as follows. The voice assistant 2 can obtain the scene information of the running television playback software 1 in two ways. In the first, the television playback software 1 collects its own running scene information in the background; this way of collecting is comprehensive, accurate and fast, and is the preferred one. In the second, the voice assistant 2 collects the scene information of the running television playback software 1 through a reserved interface of the television playback software 1; here the extent of the information that can be collected depends on the functions of the reserved interface. Scene information collected by the television playback software 1 itself is sent by the television playback software 1 to the voice assistant 2, which completes the acquisition. When the voice assistant 2 collects the scene information through the reserved interface of the television playback software 1, the collection is itself the acquisition of the scene information. The scene information comprises scene element information or scene state information. The scene element information comprises the visual information presented on the running interface, specifically its text information, picture information, video titles and so on; the text information on the running interface is the most important. The scene state information mainly comprises the operation information that the running interface involves, for example operation information related to playing video, playing music or operating a game. In specific embodiments, the collected element information is usually converted into text information.
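The scene information described above can be pictured as a small record holding text-converted scene elements plus one scene state. The following is only a minimal sketch under assumptions: the names `SceneInfo` and `collect_scene`, the state label `playing_video`, and the sample titles are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class SceneInfo:
    """Hypothetical snapshot of what the playback software is showing and doing."""
    elements: List[str] = field(default_factory=list)  # scene element information, as text
    state: str = ""                                     # scene state information


def collect_scene(visible_texts: Iterable[str],
                  picture_captions: Iterable[str],
                  video_titles: Iterable[str],
                  state: str) -> SceneInfo:
    """Convert the collected interface items to text and drop blanks and duplicates."""
    elements: List[str] = []
    for text in [*visible_texts, *picture_captions, *video_titles]:
        text = text.strip()
        if text and text not in elements:  # keep first occurrence, preserve order
            elements.append(text)
    return SceneInfo(elements=elements, state=state)


scene = collect_scene(["Happy Camp", " "], [], ["Happy Camp", "If You Are the One"],
                      "playing_video")
print(scene.elements, scene.state)
```

Keeping elements as plain text mirrors the remark that collected element information is usually converted into text before matching.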

Inputting voice: the voice assistant 2 collects voice information and performs speech recognition conversion on it.

The specific implementation is as follows. Voice information is input through an external voice input device; the voice assistant 2 collects the voice information and then performs speech recognition conversion on it. In specific embodiments, the speech recognition conversion result comprises text information and may also involve operation information. For example, for the utterance "open Happy Camp", the speech recognition conversion result involves operation information ("open") and also comprises text information ("Happy Camp").
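To illustrate a recognition result that carries both operation information and text information, the sketch below splits an already-transcribed utterance into the two parts. It assumes English word order and a small hypothetical list of operation words (`OPERATION_WORDS`); the patent does not specify how this split is made, and real speech recognition is outside the scope of the example.

```python
from typing import NamedTuple, Optional

# Hypothetical operation vocabulary; the patent does not enumerate one.
OPERATION_WORDS = ("open", "play", "pause")


class Recognized(NamedTuple):
    operation: Optional[str]  # operation information, if any
    text: str                 # remaining text information


def split_recognized(transcript: str) -> Recognized:
    """Treat a leading operation word as operation information, the rest as text."""
    words = transcript.strip().split()
    if words and words[0].lower() in OPERATION_WORDS:
        return Recognized(words[0].lower(), " ".join(words[1:]))
    return Recognized(None, " ".join(words))


print(split_recognized("Open Happy Camp"))
```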

Matching and executing: the voice assistant 2 matches the speech recognition conversion result against the obtained scene information. If the scene element information of the running television playback software 1 and the speech recognition result are the same or similar in related information, the voice assistant 2 sends the matched scene element information to the television playback software 1, which executes the item corresponding to that scene element information. If the scene state information of the running television playback software 1 and the speech recognition result are the same or similar in related information, the voice assistant 2 calls the pre-built scene state template for that item, then sends the corresponding information of the scene state template to the television playback software 1 according to the voice information, and the television playback software 1 executes the item corresponding to that information.

The specific implementation is as follows. The voice assistant 2 matches the speech recognition conversion result against the obtained scene information, mainly on the pronunciation, wording, word meaning or operation information of each side's related information. The scene element information comprises one or more of the following: the name of the scene element, the category it belongs to, the producer associated with it, and the content it relates to. Being the same or similar in related information includes the related information being the same or similar in pronunciation, wording, word meaning, category or operation information. For example, if the current scene element information is "Happy Camp", matching can be performed on the pronunciation or wording of "Happy Camp", or on the category it belongs to (for instance, "Happy Camp" is a variety show); matching can also be performed on its host, on the television station it belongs to, and so on. Alternatively, part of the information of each matching party may be the same or similar in pronunciation, wording, word meaning, category or operation information. For example, if the current scene element information is "Happy Camp", its parts "Happy" and "Camp" can each be used for matching: if the speech recognition result contains "Happy" or "Camp", "Happy Camp" can likewise be matched as related. Once a match is found, the voice assistant 2 sends the matched scene element information to the television playback software 1, which executes the item corresponding to that scene element information. For example, if the scene element information includes a displayed program "Happy Camp", then after a match the voice assistant 2 sends the "Happy Camp" information to the television playback software 1, which executes that program; execution includes operations such as selecting and clicking.
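The matching rules described above (whole-text similarity, partial matches, and matches on category or host) can be sketched as follows. This is a rough illustration under assumptions, not the patent's matching algorithm: `is_related`, `match_elements` and the 0.8 threshold are hypothetical, text similarity from Python's standard `difflib` stands in for "similar wording", and pronunciation or word-meaning similarity is not modelled at all.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Sequence


def is_related(spoken: str, element: str,
               metadata: Sequence[str] = (), threshold: float = 0.8) -> bool:
    """Return True if the spoken text is the same as or similar to a scene element."""
    spoken_l, element_l = spoken.lower().strip(), element.lower().strip()
    if not spoken_l:
        return False
    # 1. Whole text is the same or similar in wording.
    if SequenceMatcher(None, spoken_l, element_l).ratio() >= threshold:
        return True
    # 2. Partial information matches: any word of the element name was spoken.
    if set(spoken_l.split()) & set(element_l.split()):
        return True
    # 3. Related information such as category, host or station was spoken.
    return any(m.lower() in spoken_l for m in metadata)


def match_elements(spoken: str, elements: Dict[str, Sequence[str]]) -> List[str]:
    """Return the names of all scene elements related to the spoken text."""
    return [name for name, meta in elements.items() if is_related(spoken, name, meta)]


on_screen = {"Happy Camp": ("variety show",), "Evening News": ("news",)}
print(match_elements("play the variety show", on_screen))
```

A practical matcher would likely ignore very common words in step 2, since a shared word such as "the" would otherwise relate unrelated titles.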

If the scene state information of the running television playback software 1 and the speech recognition result are the same or similar in related information, the voice assistant 2 calls the pre-built scene state template for that item, then sends the corresponding information of the scene state template to the television playback software 1 according to the voice information, and the television playback software 1 executes the item corresponding to that information. For example, if the currently collected scene state information is "playing If You Are the One", the voice assistant 2 calls the pre-built video playback template, which contains the operation information involved in video playback, such as "play", "fast forward", "rewind", "volume up", "volume down", "contrast up" and "contrast down". If the speech recognition result contains "increase the volume", it is understood from its meaning as "volume up"; the voice assistant 2 then sends "volume up" to the television playback software 1, and the television playback software 1 performs the volume-up operation.
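A scene state template can be pictured as a table that maps each operation available in a given state to the phrasings that should trigger it. The sketch below is an assumption-laden illustration: the state label `playing_video`, the operation identifiers and the listed phrasings are hypothetical, and plain substring lookup stands in for the meaning-based understanding described in the text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pre-built template for the "video is playing" scene state.
VIDEO_TEMPLATE: Dict[str, Tuple[str, ...]] = {
    "play": ("play", "resume"),
    "fast_forward": ("fast forward", "skip ahead"),
    "rewind": ("rewind", "go back"),
    "volume_up": ("volume up", "increase volume", "louder"),
    "volume_down": ("volume down", "decrease volume", "quieter"),
    "contrast_up": ("contrast up", "increase contrast"),
    "contrast_down": ("contrast down", "decrease contrast"),
}

SCENE_STATE_TEMPLATES = {"playing_video": VIDEO_TEMPLATE}


def resolve_operation(scene_state: str, spoken: str) -> Optional[str]:
    """Pick the template operation whose phrasing appears in the spoken text."""
    template = SCENE_STATE_TEMPLATES.get(scene_state)
    if template is None:
        return None  # no template was built for this scene state
    spoken_l = spoken.lower()
    for operation, phrases in template.items():
        if any(phrase in spoken_l for phrase in phrases):
            return operation
    return None


print(resolve_operation("playing_video", "please increase volume"))
```

Because the template is chosen by scene state, the same utterance can resolve differently, or not at all, when the playback software is in another state.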

As shown in Fig. 1, in a preferred embodiment of the present invention, the television playback software 1 and the voice assistant 2 establish a communication connection for transmitting information in one of two ways: through a reserved interface of the television playback software 1, or through a dedicated protocol. Correspondingly, the voice assistant 2 obtains the collected running scene information in one of two ways: the television playback software 1 transmits it to the voice assistant 2, or the voice assistant 2 collects it directly from the television playback software 1. When the television playback software 1 collects its own running scene information, the television playback software 1 establishes a communication connection with the voice assistant 2 and then transmits the collected running scene information to it. Alternatively, the voice assistant 2 establishes a communication connection with the television playback software 1 through a reserved interface of the television playback software 1 and collects the running scene information directly from it. At present, most software reserves communication interfaces for certain specific functions. For example, some software reserves a read-aloud interface for elderly users who cannot see the screen clearly, and some software reserves an assisted-operation interface for blind users. The voice assistant 2 establishes a communication connection with the television playback software 1 through such functional interfaces. The voice assistant 2 and the television playback software 1 can also establish a communication connection through a dedicated protocol: a dedicated protocol for communication between the voice assistant 2 and the television playback software 1 is defined, and the communication connection between the two is realized through it.
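The patent does not define the dedicated protocol, so the following is purely a hypothetical sketch of what its messages might look like: newline-terminated JSON with a `type` field, where `scene_info` flows from the playback software to the voice assistant and `execute` flows back. The message names and fields are assumptions made for illustration, and the transport (socket, inter-process call, and so on) is left out.

```python
import json
from typing import Any, Dict, Tuple

# Hypothetical message types for an assistant <-> playback software protocol.
KNOWN_TYPES = {"scene_info", "execute"}


def encode_message(msg_type: str, payload: Dict[str, Any]) -> bytes:
    """Serialize one message as a single UTF-8 JSON line."""
    if msg_type not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {msg_type}")
    line = json.dumps({"type": msg_type, "payload": payload}, ensure_ascii=False)
    return (line + "\n").encode("utf-8")


def decode_message(raw: bytes) -> Tuple[str, Dict[str, Any]]:
    """Parse one JSON line and reject message types the protocol does not know."""
    msg = json.loads(raw.decode("utf-8"))
    if msg.get("type") not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {msg.get('type')}")
    return msg["type"], msg.get("payload", {})


wire = encode_message("execute", {"item": "Happy Camp"})
print(decode_message(wire))
```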

As shown in Fig. 1, in a preferred embodiment of the present invention, the television playback software comprises multiple independently running pieces of television playback software, and the voice assistant works with the one that is currently active. The specific implementation is as follows. The television playback software 1 is one of multiple independently running pieces of television playback software, and the voice assistant 2 works with the currently active television playback software 1. If only one piece of television playback software 1 is running in the current environment, the voice assistant 2 works with it. If multiple pieces of television playback software 1 are running in the current system environment, the voice assistant 2 queries the current system, for example the Android system, to determine which television playback software 1 is currently active, then establishes a communication connection with that television playback software 1 and carries out the related work.
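The selection rule in this embodiment (use the only running player, otherwise the one the system reports as current) can be sketched as below. The `PlayerApp` record and its `foreground` flag are hypothetical stand-ins for whatever the operating system reports; no real Android API is called here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PlayerApp:
    """Hypothetical view of one piece of playback software as reported by the system."""
    name: str
    running: bool
    foreground: bool  # assumed indicator of the currently active software


def pick_active_player(apps: Iterable[PlayerApp]) -> Optional[PlayerApp]:
    """Choose the playback software the voice assistant should work with."""
    running = [app for app in apps if app.running]
    if len(running) == 1:
        return running[0]          # only one candidate, no need to ask further
    for app in running:
        if app.foreground:
            return app             # several running: take the currently active one
    return None                    # nothing suitable to connect to


apps = [PlayerApp("PlayerA", True, False), PlayerApp("PlayerB", True, True)]
print(pick_active_player(apps).name)
```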

As shown in Fig. 2, in a preferred embodiment of the present invention, a network server 3 is also included. The voice assistant 2 uploads the collected scene information to the network server 3, the network server 3 matches the scene information against pre-stored information, and the matched information is sent to the voice assistant 2. For example, if the scene information is "If You Are the One" and the network server 3 has pre-stored related information about "If You Are the One", such as its introduction, information about its host, and links to its songs, the network server 3 transmits this related information to the voice assistant 2. The voice assistant 2 organizes the information into an information list, which it can display directly for the user to operate on, for example to view or play items. The list can also be transmitted to the television playback software 1 and displayed by the television playback software 1 for the user to operate on, or transmitted to a mobile terminal and displayed there for the user to operate on.
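The server-side lookup and the assistant-side information list can be sketched with an in-memory dictionary standing in for the network server 3. Everything here is assumed for illustration: the store contents are placeholders, the function names are hypothetical, and no network call is made.

```python
from typing import Dict, List

# Placeholder for information pre-stored on the network server (hypothetical content).
SAMPLE_STORE: Dict[str, List[Dict[str, str]]] = {
    "if you are the one": [
        {"kind": "introduction", "title": "Program introduction"},
        {"kind": "host", "title": "Host information"},
        {"kind": "song", "title": "Song link"},
    ],
}


def lookup_related(scene_text: str,
                   store: Dict[str, List[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Server side: match uploaded scene information against pre-stored entries."""
    return list(store.get(scene_text.strip().lower(), []))


def to_info_list(items: List[Dict[str, str]]) -> List[str]:
    """Assistant side: organize the returned items into a numbered list for display."""
    return [f"{i + 1}. [{item['kind']}] {item['title']}" for i, item in enumerate(items)]


for line in to_info_list(lookup_related("If You Are the One", SAMPLE_STORE)):
    print(line)
```

The same list could equally be handed to the playback software or a mobile terminal for display, as the embodiment describes.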

As shown in Fig. 1, a specific embodiment of the present invention provides a voice interaction assisting system based on a television scene and a voice assistant, comprising television playback software 1 and a voice assistant 2 that run independently of each other. The television playback software 1 comprises an acquisition module 11 that collects scene information, a communication module 12 that communicates with the voice assistant, and an execution module 13. The voice assistant 2 comprises an information acquisition module 21 that obtains the scene information of the running television playback software 1, a voice collection module 22 that collects voice information, a speech recognition module 23 that performs speech recognition conversion, a matching module 24, and a transmission module 25. The information acquisition module 21 obtains the scene information of the running television playback software 1, the scene information comprising scene element information or scene state information. The voice collection module 22 collects voice information, and the speech recognition module 23 performs speech recognition conversion on it. The matching module 24 matches the speech recognition conversion result against the obtained scene information. If the scene element information of the running television playback software 1 and the speech recognition result are related in pronunciation, wording, word meaning or operation information, the transmission module 25 sends the matched scene element information to the television playback software 1, and the execution module 13 executes the item corresponding to that scene element information. If the scene state information of the running television playback software 1 and the speech recognition result are related in pronunciation, wording, word meaning or operation information, the voice assistant 2 calls the pre-built scene state template for that item, the transmission module 25 sends the corresponding information of the scene state template to the television playback software 1 according to the voice information, and the execution module 13 executes the item corresponding to that information.
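How the modules of the two independent components fit together can be sketched with two small classes. This is a simplified illustration under assumptions: the class and method names are hypothetical, speech recognition is replaced by an already-transcribed string, matching is reduced to substring checks, and direct method calls stand in for the communication link between the two programs.

```python
from typing import Dict, List, Optional, Sequence


class PlaybackSoftware:
    """Stands in for television playback software 1."""

    def __init__(self, elements: Sequence[str], state: str) -> None:
        self.elements = list(elements)
        self.state = state
        self.executed: List[str] = []

    def acquire_scene(self) -> Dict[str, object]:
        # Role of acquisition module 11: report current elements and state.
        return {"elements": list(self.elements), "state": self.state}

    def execute(self, item: str) -> None:
        # Role of execution module 13: carry out the requested item.
        self.executed.append(item)


class VoiceAssistant:
    """Stands in for voice assistant 2."""

    def __init__(self, player: PlaybackSoftware,
                 templates: Dict[str, Dict[str, Sequence[str]]]) -> None:
        self.player = player
        self.templates = templates  # pre-built scene state templates

    def handle(self, transcript: str) -> Optional[str]:
        scene = self.player.acquire_scene()        # information acquisition module 21
        spoken = transcript.lower()                # output of recognition module 23
        for element in scene["elements"]:          # matching module 24: scene elements
            if element.lower() in spoken:
                self.player.execute(element)       # transmission module 25
                return element
        template = self.templates.get(scene["state"], {})
        for operation, phrases in template.items():  # matching module 24: scene state
            if any(p in spoken for p in phrases):
                self.player.execute(operation)
                return operation
        return None


tv = PlaybackSoftware(["Happy Camp"], "playing_video")
assistant = VoiceAssistant(tv, {"playing_video": {"volume_up": ["increase volume"]}})
print(assistant.handle("open Happy Camp"), assistant.handle("increase volume"))
```

Because the assistant only depends on `acquire_scene` and `execute`, the same assistant object could be pointed at a different piece of playback software, which is the resource-saving arrangement the patent argues for.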

As shown in Fig. 1, the system is implemented as follows. The information acquisition module 21 can obtain the scene information of the running television playback software 1 in two ways. In the first, the television playback software 1 collects its own running scene information in the background; this way of collecting is comprehensive, accurate and fast, and is the preferred one. In the second, the voice assistant 2 collects the scene information of the running television playback software 1 through a reserved interface of the television playback software 1; here the extent of the information that can be collected depends on the functions of the reserved interface. Scene information collected by the television playback software 1 itself is sent by the television playback software 1 to the voice assistant 2, which completes the acquisition. When the voice assistant 2 collects the scene information through the reserved interface of the television playback software 1, the collection is itself the acquisition of the scene information. The scene information comprises scene element information or scene state information. The scene element information comprises the visual information presented on the running interface, specifically its text information, picture information, video titles and so on; the text information on the running interface is the most important. The scene state information mainly comprises the operation information that the running interface involves, for example operation information related to playing video, playing music or operating a game. In specific embodiments, the collected element information is usually converted into text information.

Voice information is input through an external voice input device; the voice collection module 22 collects the voice information, and the speech recognition module 23 then performs speech recognition conversion on it. In specific embodiments, the speech recognition conversion result comprises text information and may also involve operation information. For example, for the utterance "open Happy Camp", the speech recognition conversion result involves operation information and also comprises text information.

The matching module 24 matches the speech recognition conversion result against the obtained scene information, mainly on the pronunciation, wording, word meaning or operation information of each side's related information. The scene element information comprises one or more of the following: the name of the scene element, the category it belongs to, the producer associated with it, and the content it relates to. Being the same or similar in related information includes the related information being the same or similar in pronunciation, wording, word meaning, category or operation information. For example, if the current scene element information is "Happy Camp", matching can be performed on the pronunciation or wording of "Happy Camp", or on the category it belongs to (for instance, "Happy Camp" is a variety show); matching can also be performed on its host, on the television station it belongs to, and so on. Alternatively, part of the information of each matching party may be the same or similar in pronunciation, wording, word meaning, category or operation information. For example, if the current scene element information is "Happy Camp", its parts "Happy" and "Camp" can each be used for matching: if the speech recognition result contains "Happy" or "Camp", "Happy Camp" can likewise be matched as related. Once a match is found, the transmission module 25 sends the matched scene element information to the television playback software 1, and the execution module 13 executes the item corresponding to that scene element information. For example, if the scene element information includes a displayed program "Happy Camp", then after a match the voice assistant 2 sends the "Happy Camp" information to the television playback software 1, and the execution module 13 executes that program; execution includes operations such as selecting and clicking.

If the scene state information of the running television playback software 1 and the speech recognition result are identical or similar in related information, the voice assistant 2 calls the pre-built scene state template for that item; the transmission module 25 then sends the corresponding scene state template information to the television playback software 1 according to the voice information, and the execution module 13 executes the item corresponding to that template information. For example, if the currently collected scene state information is "'If You Are the One' is playing", the voice assistant 2 calls the pre-built video playback module, which contains the operation information involved in video playback, such as "play", "fast forward", "rewind", "volume up", "volume down", "contrast up", and "contrast down". If the speech recognition result contains "increase volume", which by its meaning should be understood as "volume up", the transmission module 25 sends "volume up" to the television playback software 1, and the execution module 13 performs the volume-up operation.
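
A minimal sketch of the scene state template lookup might look like the following. The template contents, the synonym phrases, the state key `"playing"`, and the function name `resolve_operation` are assumptions made for illustration; the description only states that a pre-built template lists the operations valid in a given state and that the utterance is interpreted by its meaning.

```python
from typing import Optional

# Hypothetical pre-built template: canonical operation -> phrases with the same meaning.
VIDEO_PLAYBACK_TEMPLATE = {
    "play": ["play", "resume"],
    "fast forward": ["fast forward", "skip ahead"],
    "rewind": ["rewind", "go back"],
    "volume up": ["volume up", "increase volume", "increase the volume", "louder"],
    "volume down": ["volume down", "decrease volume", "decrease the volume", "quieter"],
    "contrast up": ["contrast up", "increase contrast"],
    "contrast down": ["contrast down", "decrease contrast"],
}

# Hypothetical mapping from scene state to its template.
SCENE_STATE_TEMPLATES = {"playing": VIDEO_PLAYBACK_TEMPLATE}


def resolve_operation(scene_state: str, recognized: str) -> Optional[str]:
    """Map recognized speech to a canonical operation valid in the given scene state."""
    template = SCENE_STATE_TEMPLATES.get(scene_state)
    if template is None:
        return None
    text = recognized.lower()
    pairs = [(phrase, op) for op, phrases in template.items() for phrase in phrases]
    # Try longer phrases first so a short phrase does not shadow a more specific one.
    for phrase, op in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        if phrase in text:
            return op
    return None
```

Here `resolve_operation("playing", "increase volume")` yields `"volume up"`, which is the value the transmission module would forward. An utterance that matches nothing in the template, or a scene state with no template, yields `None`.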

As shown in Figure 1, in a preferred embodiment of the present invention, the television playback software 1 comprises multiple independently running television playback programs, and the voice assistant 2 works in cooperation with the currently active one. The specific implementation is as follows. If only one television playback software 1 is running in the current environment, the voice assistant 2 works with that software. If multiple instances of television playback software 1 are running in the current system environment, the voice assistant 2 obtains the currently active television playback software 1 through the current system, for example the Android system, then establishes a communication connection with it and carries out the related work.
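
The selection rule in this embodiment can be sketched as below. The `PlaybackApp` record and its `active` flag are hypothetical; how the operating system actually reports the active application is not specified in the description and is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlaybackApp:
    """Hypothetical descriptor of one running television playback program."""
    name: str
    active: bool = False  # assumed to be reported by the operating system


def select_target(running: list) -> Optional[PlaybackApp]:
    """Pick the playback program the voice assistant should connect to."""
    if not running:
        return None
    if len(running) == 1:
        # Only one program is running, so it is the target regardless of the flag.
        return running[0]
    # Several programs are running: use the one the system marks as active.
    for app in running:
        if app.active:
            return app
    return None
```

For example, given `[PlaybackApp("Player A"), PlaybackApp("Player B", active=True)]`, the function returns the "Player B" entry, and the assistant would then open its communication connection to that program.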

As shown in Figure 2, in a preferred embodiment of the present invention, the system further comprises a network server 3. The voice assistant 2 uploads the collected scene information to the network server 3; the network server 3 matches the scene information against pre-stored information and sends the matched information to the voice assistant 2. For example, if the scene information is "If You Are the One" and the network server 3 has pre-stored related information, such as an introduction to the program, information about its host, and links to its songs, the network server 3 transmits this related information to the voice assistant 2. The voice assistant 2 organizes the information into a list and displays it directly through the second information output module 26 for the user to operate on, including viewing and playing. The information may also be transmitted to the television playback software 1 and displayed through the first information output module 14 for the user, or transmitted to a mobile terminal and displayed there for the user.
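
The server-side lookup and the list building on the assistant side could be sketched as follows. An in-memory dictionary stands in for the network server's pre-stored data, and the keys, placeholder values, and function names are assumptions for illustration only; no real network protocol is implied.

```python
# Hypothetical pre-stored data on the network server, keyed by normalized scene information.
PRESTORED_INFO = {
    "example show": {
        "introduction": "Placeholder introduction text.",
        "host": "Placeholder host profile.",
        "songs": ["Placeholder song link 1", "Placeholder song link 2"],
    },
}


def lookup_related_info(scene_info: str) -> dict:
    """Server side: match uploaded scene information against the pre-stored entries."""
    return PRESTORED_INFO.get(scene_info.strip().lower(), {})


def build_info_list(related: dict) -> list:
    """Assistant side: flatten the matched information into (label, value) rows for display."""
    items = []
    for label, value in related.items():
        values = value if isinstance(value, list) else [value]
        items.extend((label, v) for v in values)
    return items
```

The resulting list of rows is what an output module, whether in the voice assistant, the playback software, or a mobile terminal, would render for the user.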

The technical effect of the present invention is as follows. A voice interaction assisting method and system based on a television scene and a voice assistant is constructed, comprising television playback software 1 and a voice assistant 2 that run independently of each other. The voice assistant 2 acquires the scene information of the running television playback software 1; the scene information comprises scene element information or scene state information. The voice assistant 2 collects voice information, performs speech recognition conversion on it, and matches the speech recognition conversion result against the acquired scene information. If the scene element information of the running television playback software 1 and the speech recognition result are related in pronunciation, characters, word meaning, or operation information, the voice assistant 2 sends the matched scene element information to the television playback software 1, which executes the item corresponding to that scene element information. If the scene state information of the running software and the speech recognition result are related in pronunciation, characters, word meaning, or operation information, the voice assistant 2 calls the pre-built scene state template for that item and, according to the voice information, sends the corresponding scene state template information to the television playback software 1, which executes the item corresponding to that template information. In short, for the matched scene information, the television playback software 1 carries out the operation according to the scene element information, the scene state information, and the voice information. Because the method and system operate according to the real-time scene information of the television, voice-controlled television moves toward genuine intelligence. At the same time, because the television playback software 1 runs separately and independently, one voice assistant 2 can be used with multiple television playback programs, which greatly saves system resources. In addition, the speech engine can be conveniently updated and improved, which promotes the development of voice technology toward intelligence.

The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the scope of protection of the present invention.

Claims

1. A voice interaction assisting method based on a television scene and a voice assistant, comprising television playback software and a voice assistant, the television playback software and the voice assistant running independently, characterized in that the voice interaction assisting method comprises the following steps:

2. The voice interaction assisting method based on a television scene and a voice assistant according to claim 1, characterized in that the television playback software and the voice assistant establish a communication connection through a reserved interface of the television playback software, or the television playback software and the voice assistant establish a communication connection through a proprietary protocol.

3. The voice interaction assisting method based on a television scene and a voice assistant according to claim 1, characterized in that the television playback software comprises multiple independently running television playback programs, and the voice assistant works in cooperation with the currently active television playback software.

4. The voice interaction assisting method based on a television scene and a voice assistant according to claim 1, characterized by further comprising a network server, wherein the voice assistant uploads the collected scene information to the network server, and the network server matches the scene information against pre-stored information and sends the matched information to the voice assistant.

5. The voice interaction assisting method based on a television scene and a voice assistant according to claim 1, characterized in that being identical or similar in related information comprises the related information being identical or similar in pronunciation, characters, word meaning, category, or operation information, or parts of the information on each matching side being identical or similar in pronunciation, characters, word meaning, category, or operation information.

6. A voice interaction assisting system based on a television scene and a voice assistant, characterized by comprising television playback software and a voice assistant, the television playback software and the voice assistant running independently; the television playback software comprises an acquisition module that collects scene information, a communication module that communicates with the voice assistant, and an execution module; the voice assistant comprises an information obtaining module that obtains the scene information of the running television playback software, a voice acquisition module that collects voice information, a speech recognition module that performs speech recognition conversion, a matching module, and a transmission module; the information obtaining module obtains the scene information of the running television playback software, the scene information comprising scene element information or scene state information; the voice acquisition module collects voice information, and the speech recognition module performs speech recognition conversion on the voice information; the matching module matches the speech recognition conversion result against the acquired scene information; if the scene element information of the running television playback software and the speech recognition result are identical or similar in related information, the transmission module sends the matched scene element information to the television playback software, and the execution module executes the item corresponding to the scene element information; if the scene state information of the running television playback software and the speech recognition result are identical or similar in related information, the voice assistant calls the pre-built scene state template for that item, the transmission module sends the corresponding scene state template information to the television playback software according to the voice information, and the execution module executes the item corresponding to the scene state template information.

7. The voice interaction assisting system based on a television scene and a voice assistant according to claim 6, characterized in that the television playback software comprises multiple independently running television playback programs, and the voice assistant works in cooperation with the currently active television playback software.

8. The voice interaction assisting system based on a television scene and a voice assistant according to claim 6, characterized by further comprising a network server, wherein the voice assistant uploads the collected scene information to the network server, and the network server matches the scene information against pre-stored information and sends the matched information to the voice assistant.

9. The voice interaction assisting system based on a television scene and a voice assistant according to claim 8, characterized in that the television playback software comprises a first information output module, or the voice assistant comprises a second information output module.