CN112511877B - Intelligent television voice continuous conversation and interaction method - Google Patents

Intelligent television voice continuous conversation and interaction method

Info

Publication number
CN112511877B
CN112511877B (application CN202011420024.4A)
Authority
CN
China
Prior art keywords
instruction
continuous
interaction
voice
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011420024.4A
Other languages
Chinese (zh)
Other versions
CN112511877A (en)
Inventor
陈贵凤
周杰
高美军
李洋全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd filed Critical Sichuan Changhong Electric Co Ltd
Priority to CN202011420024.4A priority Critical patent/CN112511877B/en
Publication of CN112511877A publication Critical patent/CN112511877A/en
Application granted granted Critical
Publication of CN112511877B publication Critical patent/CN112511877B/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/4104Peripherals receiving signals from specially adapted client devices
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/1815Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/28Constructional details of speech recognition systems
    • G10L15/30Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42204User interfaces specially adapted for controlling a client device through a remote control device; Remote control devices therefor
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H04N21/4415Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning

Abstract

The invention discloses a method for continuous voice conversation and interaction on a smart television. A server side judges the validity of each instruction's operational intent, distinguishing valid instructions, weak-intent instructions, and rejectable instructions; the television device then applies different operations, UI displays, and continuous-dialogue interactions according to that validity. With a single wake-up, the smart television can accept continuously spoken voice commands for television control, which shortens the operation path, greatly reduces the complexity of voice use, lets the user operate the television as naturally as talking to a person, and markedly improves the user's smart-device experience.

Description

Intelligent television voice continuous conversation and interaction method
Technical Field
The invention relates to the technical field of intelligent voice interaction, and in particular to a continuous voice conversation and interaction method for a smart television.
Background
At present, voice on a smart television basically supports two interaction modes: single-round interaction, i.e. one wake-up followed by one interaction; and multi-round interaction, i.e. one wake-up followed by several interactions. In single-round interaction, sound pickup only happens after each wake-up; even multi-round interaction, which does not require activation every time, supports only a limited number of wake-free voice inputs after each activation. A user therefore has to call up the voice assistant with the activation word again and again to enter new commands, and cannot operate the television continuously and smoothly by voice. The main reason is that the microphone cannot simply stay awake: people may be talking in the room at any time, a quiet environment cannot be guaranteed, and a television that keeps recording outside sound easily produces unintended semantic interpretations and executes unexpected operations, making normal use of the television almost impossible.
As for continuous-dialogue interaction, existing smart products only realize continuous voice interaction for instructions within one fixed class of service intents. The scenes and the continuous-dialogue interaction are fixed and inflexible: they cannot interact across scenes and services, and the set of operable instructions under a continuous dialogue cannot be updated dynamically in real time.
Disclosure of Invention
To address the problem that a smart television kept in the wake-up state for a long time, continuously recording ambient sound, easily performs wrong semantic interpretation and unexpected operations and cannot sustain interaction, and the problem that current continuous-dialogue interaction is bound to fixed, inflexible scenes, the invention provides a continuous voice conversation and interaction method for a smart television. With one voice activation, the smart television can keep interacting by voice, relying on effective semantic analysis of each voice instruction. The invention defines several instruction data sets on the server side to carry continuous dialogue across multiple scenes and services, and supports dynamically adjustable rules and data sets, thereby realizing cross-scene, cross-service continuous voice dialogue. Compared with the prior art, the method defines a scene set and an instruction set in which sound pickup can continue, judges the validity of each voice instruction's intent, interacts differently with instructions of different validity, and realizes continuous-dialogue interaction across scenes and services.
The invention realizes the purpose through the following technical scheme:
A method for continuous voice conversation and interaction on a smart television comprises the following steps:
Step 1, define the valid instruction data sets of a continuous dialogue;
Define the valid scene instruction data set of a continuous dialogue: define the scenes that need customized handling of continuous-dialogue interaction, dynamically configure a valid instruction set and rules for the instructions of each scene, and give priority to judging against the instruction set of a valid scene and the continuous-dialogue interaction within that scene;
Define the valid domain data set of a continuous dialogue: define the set of voice domains and rules that can support continuous-dialogue interaction, and dynamically configure the domain data within the set;
Define the weak-semantic instruction data set: among the hundreds of voice instructions or intents supported by the valid domain data set, further distinguish strong-semantic from weak-semantic intent instructions within each domain; instructions with ambiguous intent or weak functions are classified into the weak-semantic instruction set, whose rules can be dynamically configured;
Step 2, the server side judges the validity of the instruction;
For the continuous-dialogue valid instruction set and the weak-semantic instruction set dynamically configured in Step 1, the server side judges the validity of each instruction with an intent-rejection algorithm model based on a pre-trained language model and a convolutional neural network, classifies it as a valid instruction, a weak-intent instruction, or a rejectable instruction, and issues the semantic control intent;
Step 3, display the continuous voice dialogue interaction on the television device;
Based on the validity judged in Step 2, the television end shows different interactions and UIs, giving the user different interaction states; through the different interaction effects and UI displays for each state, the user can intuitively perceive whether the function is executable, whether sound pickup can continue in the current state, and whether the conversation can simply go on.
Further, in Step 2 the judgment proceeds as follows:
A. judge the valid scene instruction data set first: determine whether the current scene is in the valid scene set; within a valid scene, analyse the instruction against that scene's instruction set first and judge its validity by semantic intent;
B. if step A determines the voice instruction is not a valid scene instruction, judge the valid domain data set: if the voice instruction is not an intent instruction within the valid domain set, judge it a rejectable instruction;
C. if step B determines the voice instruction is a valid domain instruction, judge the weak-semantic instruction set: if the voice instruction is not an intent instruction in the weak-semantic set, judge it a valid instruction; otherwise judge it a weak-intent instruction.
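The cascade in steps A to C can be sketched in a few lines. This is an illustrative stand-in only: in the patent the server resolves intents with its pre-trained-language-model + CNN rejection model, whereas the sketch below substitutes plain set-membership tests; the `Validity` labels and all parameter names are hypothetical.

```python
from enum import Enum

class Validity(Enum):
    VALID = "valid"            # triggers or maintains continuous dialogue
    WEAK_INTENT = "weak"       # guidance reply only, no operation
    REJECTABLE = "rejectable"  # silently ignored in continuous dialogue

def judge(intent: str, scene: str,
          scene_sets: dict,          # per-scene valid instruction sets
          valid_domain_intents: set, # intents belonging to valid domains
          weak_intents: set) -> Validity:
    # A. the instruction set of a valid scene is judged with priority
    if scene in scene_sets and intent in scene_sets[scene]:
        return Validity.VALID
    # B. otherwise fall back to the valid-domain data set
    if intent not in valid_domain_intents:
        return Validity.REJECTABLE
    # C. within a valid domain, split weak-semantic intents from valid ones
    if intent in weak_intents:
        return Validity.WEAK_INTENT
    return Validity.VALID
```

Note that in this sketch the weak-semantic set is a subset of the valid-domain intents, matching step C's precondition that the instruction is already a valid domain instruction.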
Further, the instruction judgment types behave as follows:
Valid instruction: in the non-continuous dialogue state, directly trigger continuous-dialogue interaction, enter the continuous pickup state, and carry out the instruction's intended control; if already in the continuous dialogue state, keep that state, keep continuous pickup and recording, and carry out the operation of the current instruction;
Weak-intent instruction: in the non-continuous dialogue state, keep the original single-round or multi-round interaction state; in the continuous dialogue state, give a guidance reply without performing any operation;
Rejectable instruction: in the non-continuous dialogue state, keep the original single-round or multi-round interaction state; in the continuous dialogue state, keep the original continuous pickup state with neither reply nor control.
Further, for a rejectable instruction: in the non-continuous dialogue state, the original interaction state is kept and the related operations are performed; in the continuous dialogue state, the current continuous dialogue state is maintained, no operation is performed, and the UI shows a not-executed state.
Further, if the instruction is a valid instruction, its intended operation is executed; if the voice is not in continuous-dialogue interaction, the device enters the continuous-dialogue UI state and starts the continuous pickup function; if already in continuous-dialogue interaction, the voice keeps the interaction UI state and continuous pickup and recording are maintained.
In a further scheme, if the instruction is a weak-intent instruction: in the non-continuous dialogue state, the original interaction state is kept and the related operations are performed; in the continuous dialogue state, the interaction state and the UI display effect are kept and only a guidance reply is given, without any operation.
The beneficial effects of the invention are:
After one voice wake-up of the smart television, voice commands can be input continuously for television control; the operation path is shortened, the complexity of voice use is greatly reduced, the user operates the television as naturally as talking to a person, and the user's smart-device experience is markedly improved.
The invention defines several valid instruction data sets under continuous dialogue, providing the data support for cross-scene, cross-service continuous-dialogue interaction; the server side judges the validity of each instruction's operational intent, distinguishing valid, weak-intent, and rejectable instructions; and the television device performs different operations, UI displays, and continuous-dialogue interactions according to that validity.
The invention is not limited to smart televisions and can be extended to other smart devices; nor is it limited to a fixed instruction set: every server-side instruction set can be flexibly allocated and configured, and even user-personalized instruction sets can later be updated dynamically to provide a more intelligent experience.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described in detail below. It should be understood that the described embodiments are merely exemplary and not restrictive of the full scope of the invention. All other embodiments derived by a person skilled in the art from the examples given here without inventive effort fall within the scope of the present invention.
In an embodiment, as shown in FIG. 1, a method for continuous voice conversation and interaction on a smart television of the present invention includes:
Step 1, define the valid instruction data sets of a continuous dialogue;
Define the valid scene instruction data set of a continuous dialogue: the smart television has functional scenes such as network video playing, local video playing, song playing, radio listening, education and learning, and game entertainment; the demand for continuous dialogue differs from scene to scene, and defining a valid scene set distinguishes which scenes take part in continuous interaction.
Define the valid scene instruction data set: define the scenes that need customized handling of continuous-dialogue interaction, dynamically configure a valid instruction set and rules for the instructions of each scene, and give priority to judging against the instruction set of a valid scene and the continuous-dialogue interaction within that scene;
scenes without customized handling are judged against the valid voice domain set.
Define the valid domain data set of a continuous dialogue: different voice instructions can be subdivided into domains such as listening to songs, watching videos, checking the weather, and listening to radio stations; by defining a valid domain set, domains with low usage or weak functions for some users are placed in the non-valid domain set, so they neither trigger continuous-dialogue interaction nor interrupt an existing continuous dialogue.
Define the valid domain data set: define the set of voice domains and rules that can support continuous-dialogue interaction, and dynamically configure the domain data within the set;
Define the weak-semantic instruction data set: among the hundreds of voice instructions or intents supported by the valid domain data set, further distinguish strong-semantic from weak-semantic intent instructions within each domain; instructions with ambiguous intent or weak functions are classified into the weak-semantic instruction set, whose rules can be dynamically configured.
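One plausible server-side representation of the three data sets, matching the "dynamically configurable" requirement with a hot-swappable dictionary; the schema, keys, and example entries below are invented for illustration and are not the patent's actual format.

```python
# Hypothetical server-side configuration holding the three data sets.
# Each set can be replaced at runtime, matching the dynamically
# configurable rules the method calls for.
CONTINUOUS_DIALOG_CONFIG = {
    # valid scene instruction data set: per-scene instruction sets,
    # checked with priority when the device reports one of these scenes
    "valid_scenes": {
        "video_app": {"pause", "resume", "fast_forward", "next_episode"},
        "music_app": {"next_song", "previous_song", "pause"},
    },
    # valid domain data set: domains whose intents may sustain a dialogue
    "valid_domains": {"video", "music", "tv_control"},
    # weak-semantic instruction data set: ambiguous or weak-function
    # intents inside valid domains (e.g. a bare celebrity name)
    "weak_semantic": {"star_name", "bare_noun"},
}

def update_set(config: dict, key: str, values) -> None:
    """Replace one data set at runtime (dynamic configuration)."""
    config[key] = set(values)
```

A later update (say, refreshing `weak_semantic` from market popularity data) is then a single `update_set` call, without redeploying the server.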
Step 2, the server side judges the validity of the instruction;
For the continuous-dialogue valid instruction set and the weak-semantic instruction set dynamically configured in Step 1, the server side judges the validity of each instruction with an intent-rejection algorithm model based on a pre-trained language model and a convolutional neural network, classifies it as a valid instruction, a weak-intent instruction, or a rejectable instruction, and issues the semantic control intent. The judgment proceeds as follows:
A. judge the valid scene instruction data set first: determine whether the current scene is in the valid scene set; within a valid scene, analyse the instruction against that scene's instruction set first and judge its validity by semantic intent;
B. if step A determines the voice instruction is not a valid scene instruction, judge the valid domain data set: if the voice instruction is not an intent instruction within the valid domain set, judge it a rejectable instruction;
C. if step B determines the voice instruction is a valid domain instruction, judge the weak-semantic instruction set: if the voice instruction is not an intent instruction in the weak-semantic set, judge it a valid instruction; otherwise judge it a weak-intent instruction.
Step 3, display the continuous voice dialogue interaction on the television device;
Based on the validity judged in Step 2, the television end shows different interactions and UIs, giving the user different interaction states. For each state, through the different interaction effects and UI displays, the user can intuitively perceive whether the function is executable, whether sound pickup can continue in the current state, and whether the conversation can simply go on.
The instruction judgment types behave as follows:
Valid instruction: in the non-continuous dialogue state, directly trigger continuous-dialogue interaction, enter the continuous pickup state, and carry out the instruction's intended control; if already in the continuous dialogue state, keep that state, keep continuous pickup and recording, and carry out the operation of the current instruction;
Weak-intent instruction: in the non-continuous dialogue state, keep the original single-round or multi-round interaction state; in the continuous dialogue state, give a guidance reply without performing any operation;
Rejectable instruction: in the non-continuous dialogue state, keep the original single-round or multi-round interaction state; in the continuous dialogue state, keep the original continuous pickup state with neither reply nor control.
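The six combinations of instruction type and dialogue state described above form a small state table. A hedged sketch, with invented state names and action strings:

```python
# (validity, in_continuous_dialog) -> (next_dialog_state, device_action)
# All labels are illustrative; the patent names behaviours, not an API.
DEVICE_BEHAVIOR = {
    ("valid", False):      ("continuous", "execute intent; start continuous pickup"),
    ("valid", True):       ("continuous", "execute intent; keep pickup/recording"),
    ("weak", False):       ("normal", "keep single/multi-round interaction"),
    ("weak", True):        ("continuous", "guidance reply only, no operation"),
    ("rejectable", False): ("normal", "keep single/multi-round interaction"),
    ("rejectable", True):  ("continuous", "no reply, no control; keep pickup"),
}

def handle(validity: str, in_continuous: bool) -> tuple:
    """Look up the device-side behaviour for a judged instruction."""
    return DEVICE_BEHAVIOR[(validity, in_continuous)]
```

A point worth noting in the table: only a valid instruction ever moves the device from the normal state into the continuous-dialogue state; weak-intent and rejectable instructions never start one, and never end one either.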
In an embodiment, as shown in FIG. 1, a method for continuous voice conversation and interaction on a smart television of the present invention includes the following steps:
Step 1, define the valid instruction data sets of a continuous dialogue;
1.1 define the valid scene instruction data set of a continuous dialogue;
Define the scenes that need customized handling of continuous-dialogue interaction; for example, a video application may be defined as a valid scene App, and the application's private global playback-control instructions in that scene form a valid instruction set AppControl.
1.2 define the valid domain data set of a continuous dialogue;
Define the set of voice domains and rules that can support continuous-dialogue interaction, and dynamically configure the instruction set; for example, define a domain set DomainA of movie/video + music + TV control, so that other domains such as weather and chit-chat are outside the valid domain set;
distinguish strong-semantic from weak-semantic instructions within the valid domains, and define a weak-semantic instruction set DomainWeak of single nouns or ambiguous instructions, such as a set Stars of singer and actor names; the configuration of DomainWeak can be adjusted dynamically according to market popularity and user preference.
Step 2, the server side judges the validity of the instruction;
After receiving a reported voice instruction, the server side processes it as follows, using the reported scene and the device side's continuous dialogue state, and issues the instruction's semantic control intent:
Step 1: judge the semantic intent of the instruction;
Step 2: judge against the valid scene instruction data set: if the current scene is the valid scene App and the current instruction is in that scene's valid instruction set AppControl, it is a valid instruction; otherwise proceed to Step 3;
Step 3: judge against the valid domain data set: if the current instruction is in the valid domain set DomainA, proceed to Step 4; otherwise it is a rejectable instruction;
Step 4: judge against the weak-semantic instruction set: if the instruction is not in the weak-semantic instruction set DomainWeak, it is a valid instruction; if it is in DomainWeak, it is a weak-intent instruction.
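Steps 1-4, using the embodiment's example sets AppControl, DomainA and DomainWeak, could look as follows. The intent strings are hypothetical placeholders, and the patent resolves intents with its rejection model rather than literal string lookup, so this is only a structural sketch:

```python
APP_CONTROL = {"pause", "play", "fast_forward"}      # AppControl: in-scene playback instructions
DOMAIN_A = {"play_movie", "play_song", "tv_volume"}  # DomainA: video + music + TV control
DOMAIN_WEAK = {"star_name"}                          # DomainWeak: bare star names (Stars)

def judge_instruction(intent: str, scene: str) -> str:
    # Step 2: the valid-scene instruction set AppControl has priority
    if scene == "App" and intent in APP_CONTROL:
        return "valid"
    # Step 3: fall back to the valid-domain set DomainA
    # (weak-semantic intents live inside the valid domains, hence the union)
    if intent not in DOMAIN_A | DOMAIN_WEAK:
        return "rejectable"  # weather, chit-chat, ... lie outside DomainA
    # Step 4: split DomainWeak members off as weak-intent instructions
    return "weak" if intent in DOMAIN_WEAK else "valid"
```

The ordering mirrors the text: a scene-level hit short-circuits the domain checks, so an in-app "pause" is judged valid even though playback control might be ambiguous outside that scene.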
Step 3, continuous voice dialogue interaction on the television device;
Different UI display states are designed on the device for continuous interaction versus normal interaction. After the recording is received and the instruction Query is reported, the television end has the validity of Query judged through Step 2 and shows the corresponding UI display state.
Determination case 1: Query is a valid instruction, and its intended operation is executed; if the voice is not in continuous-dialogue interaction, the device enters the continuous-dialogue UI state and starts the continuous pickup function; if already in continuous-dialogue interaction, the voice keeps the interaction UI state and continuous pickup and recording are maintained.
Determination case 2: Query is a weak-intent instruction; in the non-continuous dialogue state, the original interaction state is kept and the related operations are performed; in the continuous dialogue state, the interaction state and the UI display effect are kept and only a guidance reply is given, without any operation. For example, if the instruction is a star name in Stars, in the continuous dialogue state the device only replies "you can say: the movies of Query" or "you can say: the songs of Query".
Determination case 3: Query is a rejectable instruction; in the non-continuous dialogue state, the original interaction state is kept and the related operations are performed; in the continuous dialogue state, the current continuous dialogue state is maintained, no operation is performed, and the UI shows a not-executed state, for example a grayed-out display of the instruction or reply.
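The three determination cases can be condensed into one device-side handler. The dictionary keys, UI state names, and the reply template are assumptions for illustration; the patent prescribes the behaviour, not an API:

```python
def on_query_result(validity: str, in_continuous: bool, query: str) -> dict:
    """Return the UI/action decision for a judged instruction (illustrative)."""
    if validity == "valid":
        # case 1: execute and (enter or keep) continuous dialogue with pickup on
        return {"execute": True, "ui": "continuous", "pickup": True}
    if validity == "weak":
        if not in_continuous:
            # case 2, non-continuous: keep the original interaction
            return {"execute": True, "ui": "normal", "pickup": False}
        # case 2, continuous: guidance reply only, e.g. for a bare star name
        return {"execute": False, "ui": "continuous", "pickup": True,
                "reply": f"You can say: the movies of {query}"}
    # case 3: rejectable
    if not in_continuous:
        return {"execute": True, "ui": "normal", "pickup": False}
    return {"execute": False, "ui": "continuous-grayed", "pickup": True}
```

Note that in the continuous dialogue state every branch keeps `pickup` on: the microphone never closes mid-dialogue, only the visible UI and the executed operation differ.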
In the continuous dialogue state, after each instruction is executed the device enters a waiting state: the UI does not block the display and operation of the window beneath it, the pickup listening state is kept, and a voice operation can be entered at any time.
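The waiting state described above behaves like a loop that keeps draining recognized queries while pickup stays open. This sketch uses a queue and a `None` sentinel as stand-ins for the recognizer and the dialogue timeout, both of which are assumptions:

```python
import queue

def continuous_dialog_loop(recognized: "queue.Queue", process) -> None:
    """Drain recognized queries until a None sentinel arrives.

    Between instructions the loop simply blocks on the queue: pickup
    stays open (the waiting state), and the next query is processed
    as soon as it is recognized, without a new wake-up.
    """
    while True:
        query = recognized.get()  # blocking: microphone stays open
        if query is None:         # sentinel: dialogue timed out / ended
            break
        process(query)            # execute / guide / ignore per validity
```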
The above description covers only specific embodiments of the present invention, but the scope of the invention is not limited to them; any change or substitution that a person skilled in the art can easily conceive within the technical scope disclosed here shall be covered by the scope of the invention, which is therefore defined by the appended claims. The technical features of the above embodiments can be combined in any suitable manner without contradiction; to avoid unnecessary repetition, the possible combinations are not described one by one, yet any combination of the embodiments that does not depart from the spirit of the invention should likewise be considered disclosed herein.

Claims (4)

1. A method for continuous voice conversation and interaction on a smart television, characterized by comprising the following steps:
step 1, defining the valid instruction data sets of a continuous dialogue;
defining the valid scene instruction data set of a continuous dialogue: defining the scenes that need customized handling of continuous-dialogue interaction, dynamically configuring a valid scene instruction data set and rules for the instructions of each scene, and giving priority to judging against the valid scene instruction data set of a valid scene and the continuous-dialogue interaction within that scene;
defining the valid domain data set of a continuous dialogue: defining the set of voice domains and rules that can support continuous-dialogue interaction, and dynamically configuring the domain data within that set as the valid domain data set;
defining the weak-semantic instruction data set: among the hundreds of voice instructions or intents supported by the valid domain data set, further distinguishing strong-semantic from weak-semantic intent instructions within each domain, classifying instructions with ambiguous intent or weak functions into the weak-semantic instruction data set, whose rules can be dynamically configured;
step2, the server side judges the validity of the instruction;
for the continuous conversation effective instruction data set and the weak semantic instruction data set which are dynamically configured in the step1, the server side judges the effectiveness of the instruction through an intention rejection algorithm model based on a pre-training language model and a convolutional neural network, the effectiveness is divided into an effective instruction, a weak intention instruction and a rejectable instruction, and a semantic control intention is issued;
in the step2, the determination step is as follows:
A. preferentially determining the effective scene instruction data set: whether the current scene is in an effective scene set or not is judged, the effective scene instruction data set of the corresponding scene is analyzed and judged preferentially under the effective scene, and the effectiveness is judged according to the semantic intention;
B. if it is determined in step a that the voice command is not a valid scene command, a valid domain data set is determined: if the voice command is not an intention command in the effective field data set, judging that the command can be rejected;
C. and B, judging that the voice command is a valid domain command, and judging a weak semantic command data set: if the voice instruction is not an intention instruction in the weak semantic set, judging the voice instruction as an effective instruction, otherwise, judging the voice instruction as a weak intention instruction;
step 3, display of continuous voice conversation and interaction on the television device;
based on the instruction-validity determination of step 2, the television displays different interactions and UIs to present different interaction states to the user; through the distinct interaction effects and UI displays for each state, the user can intuitively perceive whether the function is executable, whether sound can still be picked up in the current state, and whether continuous conversation can be carried out directly;
the instruction determination results are handled as follows:
valid instruction: in the non-continuous-conversation state, continuous-conversation interaction is triggered directly, the continuous pickup state is entered, and the intended operation of the instruction is performed; if already in the continuous-conversation state, that state is maintained, continuous pickup and recording continue, and the current instruction is executed;
weak-intent instruction: in the non-continuous-conversation state, the original single-round or multi-round interaction state is kept; in the continuous-conversation state, a guidance reply is given without performing any operation;
rejectable instruction: in the non-continuous-conversation state, the original single-round or multi-round interaction state is kept; in the continuous-conversation state, the original continuous pickup state is kept, with no reply and no control.
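The per-type handling above is effectively a small state table keyed on (instruction type, conversation state). A minimal sketch, assuming the illustrative field names `execute`, `reply`, `continuous_conversation`, and `pickup` (none of which are defined by the patent):

```python
# Hypothetical dispatch table for the device-side behaviour described above.
# Field names are illustrative assumptions.

def handle_instruction(classification: str, in_continuous_conversation: bool) -> dict:
    """Map (instruction type, conversation state) to the television-side action."""
    if classification == "valid":
        # Valid instructions always execute and keep (or enter) continuous pickup.
        return {"execute": True, "reply": None,
                "continuous_conversation": True, "pickup": True}
    if classification == "weak_intent":
        if in_continuous_conversation:
            # Guidance reply only, no operation; continuous pickup stays on.
            return {"execute": False, "reply": "guidance",
                    "continuous_conversation": True, "pickup": True}
        # Outside continuous conversation: original single/multi-round interaction.
        return {"execute": True, "reply": None,
                "continuous_conversation": False, "pickup": False}
    # rejectable
    if in_continuous_conversation:
        # No reply and no control; the continuous pickup state is preserved.
        return {"execute": False, "reply": None,
                "continuous_conversation": True, "pickup": True}
    return {"execute": True, "reply": None,
            "continuous_conversation": False, "pickup": False}
```

Only a valid instruction can move the device from the non-continuous state into continuous pickup; weak-intent and rejectable instructions never change the conversation state.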
2. The intelligent television voice continuous conversation and interaction method as claimed in claim 1, wherein if the instruction is a rejectable instruction, in the non-continuous-conversation state the original interaction state is maintained and the related operations are performed; in the continuous-conversation state, the current continuous-conversation state is maintained, no operation is performed, and the UI shows an unexecuted state.
3. The method according to claim 1, wherein if the instruction is a valid instruction, the intended operation of the instruction is performed; if not in continuous-conversation interaction, the voice enters the continuous-conversation UI state and the continuous pickup function is started; if already in continuous-conversation interaction, the interaction UI state and the continuous pickup and recording function are maintained.
4. The intelligent television voice continuous conversation and interaction method as claimed in claim 1, wherein if the instruction is a weak-intent instruction, in the non-continuous-conversation state the original interaction state is maintained and the related operations are performed; in the continuous-conversation state, the interaction state and UI display effect are maintained, and only a guidance reply is given without performing any operation.
CN202011420024.4A 2020-12-07 2020-12-07 Intelligent television voice continuous conversation and interaction method Active CN112511877B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011420024.4A CN112511877B (en) 2020-12-07 2020-12-07 Intelligent television voice continuous conversation and interaction method

Publications (2)

Publication Number Publication Date
CN112511877A CN112511877A (en) 2021-03-16
CN112511877B true CN112511877B (en) 2021-08-27

Family

ID=74971195

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011420024.4A Active CN112511877B (en) 2020-12-07 2020-12-07 Intelligent television voice continuous conversation and interaction method

Country Status (1)

Country Link
CN (1) CN112511877B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113611316A (en) * 2021-07-30 2021-11-05 Baidu Online Network Technology (Beijing) Co., Ltd. Man-machine interaction method, device, equipment and storage medium
CN114356275B (en) * 2021-12-06 2023-12-29 Shanghai Xiaodu Technology Co., Ltd. Interactive control method and device, intelligent voice equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010066519A (en) * 2008-09-11 2010-03-25 Brother Ind Ltd Voice interactive device, voice interactive method, and voice interactive program
CN103208283A (en) * 2012-01-11 2013-07-17 Samsung Electronics Co., Ltd. Method and apparatus for executing a user function by using voice recognition
CN106921911A (en) * 2017-04-13 2017-07-04 Shenzhen Skyworth-RGB Electronics Co., Ltd. Voice acquisition method and device
CN109493856A (en) * 2017-09-12 2019-03-19 Hefei Midea Intelligent Technology Co., Ltd. Method and apparatus for recognizing voice, household electrical appliance, and machine-readable storage medium
CN110335603A (en) * 2019-07-12 2019-10-15 Sichuan Changhong Electric Co., Ltd. Multi-modal interaction method applied to television scenes
CN110503960A (en) * 2019-09-26 2019-11-26 Dazhong Wenwen (Beijing) Information Technology Co., Ltd. Real-time uploading method, apparatus, device and storage medium for speech recognition results
CN110751948A (en) * 2019-10-18 2020-02-04 Gree Electric Appliances, Inc. of Zhuhai Voice recognition method, device, storage medium and voice equipment
CN111081257A (en) * 2018-10-19 2020-04-28 Gree Electric Appliances, Inc. of Zhuhai Voice acquisition method, device, equipment and storage medium
CN111583926A (en) * 2020-05-07 2020-08-25 Gree Electric Appliances, Inc. of Zhuhai Continuous voice interaction method and device based on cooking equipment, and cooking equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100484493B1 (en) * 2002-12-12 2005-04-20 Electronics and Telecommunications Research Institute Spontaneous continuous speech recognition system and method using multiple pronunciation dictionaries
CN103680505A (en) * 2013-09-03 2014-03-26 Anhui USTC iFlytek Information Technology Co., Ltd. Voice recognition method and voice recognition system

Similar Documents

Publication Publication Date Title
KR102320708B1 (en) Video playing method and device, electronic device, and readable storage medium
CN109658932B (en) Equipment control method, device, equipment and medium
US20210125604A1 (en) Systems and methods for determining whether to trigger a voice capable device based on speaking cadence
CN109545206B (en) Voice interaction processing method and device of intelligent equipment and intelligent equipment
CN107396177B (en) Video playing method, device and storage medium
CN112511877B (en) Intelligent television voice continuous conversation and interaction method
US20140036022A1 (en) Providing a conversational video experience
US10860289B2 (en) Flexible voice-based information retrieval system for virtual assistant
CN105453025A (en) Visual confirmation for a recognized voice-initiated action
CN104599669A (en) Voice control method and device
US11270690B2 (en) Method and apparatus for waking up device
US20140028780A1 (en) Producing content to provide a conversational video experience
CN111295708A (en) Speech recognition apparatus and method of operating the same
CN112034726A (en) Scene-based control method, device, equipment and storage medium
KR20150054490A (en) Voice recognition system, voice recognition server and control method of display apparatus
JP2023515897A (en) Correction method and apparatus for voice dialogue
WO2021208392A1 (en) Voice skill jumping method for man-machine dialogue, electronic device, and storage medium
CN110136713A (en) Dialogue method and system of the user in multi-modal interaction
CN112581946A (en) Voice control method and device, electronic equipment and readable storage medium
JP2021056483A (en) Voice recognition control method, apparatus, electronic device, and readable storage medium
CN112269867A (en) Method, device, equipment and storage medium for pushing information
CN112051748A (en) Intelligent household vehicle-mounted control method, device, equipment and storage medium
CN111600782B (en) Control method and device of intelligent voice equipment, electronic equipment and storage medium
KR20210038278A (en) Speech control method and apparatus, electronic device, and readable storage medium
CN110992937A (en) Language offline recognition method, terminal and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant