CN111462744A - Voice interaction method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN111462744A
Authority
CN
China
Prior art keywords
audio
audio channel
audio information
interactive
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010256089.3A
Other languages
Chinese (zh)
Other versions
CN111462744B (en)
Inventor
何亚欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Skyworth RGB Electronics Co Ltd
Original Assignee
Shenzhen Skyworth RGB Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Skyworth RGB Electronics Co Ltd
Priority to CN202010256089.3A
Publication of CN111462744A
Priority to PCT/CN2020/127116 (WO2021196617A1)
Application granted
Publication of CN111462744B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application provides a voice interaction method, a voice interaction device, electronic equipment and a storage medium, wherein the voice interaction method comprises the following steps: after receiving the voice wake-up instruction, starting a first audio channel for transmitting interactive audio information, and setting a currently started second audio channel for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state; after receiving an interactive audio instruction, searching interactive audio information matched with the interactive audio instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing. The volume of the interactive audio information and the volume of the on-demand audio information can be respectively controlled based on different audio channels, the recognition efficiency of the interactive audio information is improved, and the human-computer interaction efficiency is further improved.

Description

Voice interaction method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of speech recognition technologies, and in particular, to a speech interaction method, apparatus, electronic device, and storage medium.
Background
In recent years, as the speech recognition technology gradually matures, the speech recognition technology is often applied to the field of smart televisions to realize speech interaction functions between the smart televisions and users, such as channel switching, volume adjustment, and turning on or off of the smart televisions based on speech.
In practice, while using a smart television, a user may carry out voice interaction with the smart television while watching a television program in order to obtain voice interaction content fed back by the smart television. Under the influence of the program being played, it is difficult for the user to distinguish the voice interaction content from the television program, which reduces the efficiency with which the user recognizes the voice interaction content and, in turn, the efficiency of interaction between the user and the smart television.
Disclosure of Invention
In view of this, an object of the embodiments of the present application is to provide a voice interaction method, apparatus, electronic device and storage medium, which can respectively control volumes of interactive audio information and on-demand audio information based on different audio channels, thereby improving recognition efficiency of the interactive audio information and further improving efficiency of human-computer interaction.
In a first aspect, an embodiment of the present application provides a voice interaction method, where the method includes:
after receiving the voice wake-up instruction, starting a first audio channel for transmitting interactive audio information, and setting a currently started second audio channel for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state;
after receiving an interactive audio instruction, searching interactive audio information matched with the interactive audio instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In one possible implementation, the voice interaction method further includes:
and after receiving the voice closing instruction, closing the first audio channel, and switching the second audio channel from the target state to the working state.
In one possible implementation, the voice interaction method further includes:
and searching interactive audio information matched with the voice awakening instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In a possible implementation, the first audio channel is further configured to transmit prompt audio information, and after the first audio channel is enabled, the method further includes:
if the prompt audio information to be played is detected, determining the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel;
and based on the transmission sequence, sequentially transmitting the prompt audio information and the interactive audio information to a playing end through the first audio channel for playing.
In a possible implementation, the first audio channel is further configured to transmit prompt audio information, and after the first audio channel is closed, the method further includes:
if the prompt audio information to be played is detected, starting a first audio channel, and setting a currently started second audio channel to be in a target state;
and transmitting the prompt audio information to a playing end through the first audio channel for playing, closing the first audio channel after the prompt audio information is played, and switching the second audio channel from a target state to a working state.
In a possible implementation, the switching the second audio channel from the target state to the working state includes:
re-enabling the second audio channel in the off state;
or,
switching the second audio channel from the bass state to a preset volume state.
In a second aspect, an embodiment of the present application provides a voice interaction apparatus, where the apparatus includes:
the first setting module is used for starting a first audio channel used for transmitting interactive audio information after receiving the voice wake-up instruction, and setting a second audio channel which is started currently and used for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state;
the searching module is used for searching the interactive audio information matched with the interactive audio instruction after receiving the interactive audio instruction;
and the first transmission module is used for transmitting the interactive audio information to a playing end through the first audio channel for playing.
In one possible implementation, the voice interaction apparatus further includes:
and the second setting module is used for closing the first audio channel after receiving the voice closing instruction and switching the second audio channel from the target state to the working state.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, the processor and the memory communicate via the bus when the electronic device is running, and the processor executes the machine-readable instructions to perform the steps of the voice interaction method according to any one of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the voice interaction method according to any one of the first aspect.
According to the voice interaction method, the voice interaction device, the electronic equipment and the storage medium, after the voice wake-up instruction is received, a first audio channel used for transmitting interactive audio information is started, and a second audio channel which is started currently and used for transmitting on-demand audio information is set to be in a target state; wherein the target state is an off state or a bass state; after the interactive audio instruction is received, the interactive audio information matched with the interactive audio instruction is searched, and the interactive audio information is transmitted to the playing end through the first audio channel to be played.
Further, the voice interaction method, the voice interaction device, the electronic device and the storage medium provided by the embodiment of the application can also determine the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel after the prompt audio information to be played is detected; and based on the transmission sequence, the prompt audio information and the interactive audio information are sequentially transmitted to the playing end through the first audio channel to be played, wherein the first audio channel is used for transmitting the prompt audio information and the interactive audio information, the number of occupied audio channels can be reduced, the utilization rate of the first audio channel is improved, and the transmission sequence of the audio information is determined based on the transmission priority of the audio information corresponding to the first audio channel, so that the transmission quality of the audio information of the first audio channel can be improved.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a flowchart of a voice interaction method provided in an embodiment of the present application;
FIG. 2 is a flowchart of another voice interaction method provided in an embodiment of the present application;
FIG. 3 is a flowchart of another voice interaction method provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a voice interaction apparatus provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
At present, while using a smart television, a user may carry out voice interaction with the smart television while watching a television program in order to obtain voice interaction content fed back by the smart television. Under the influence of the program being played, it is difficult for the user to distinguish the voice interaction content from the television program, which reduces the efficiency with which the user recognizes the voice interaction content and, in turn, the efficiency of interaction between the user and the smart television.
Based on the above problems, embodiments of the present application provide a voice interaction method, apparatus, electronic device, and storage medium, where after receiving a voice wake-up instruction, a first audio channel for transmitting interactive audio information is enabled, and a currently enabled second audio channel for transmitting on-demand audio information is set to a target state; wherein the target state is an off state or a bass state; after the interactive audio instruction is received, the interactive audio information matched with the interactive audio instruction is searched, and the interactive audio information is transmitted to the playing end through the first audio channel to be played.
The drawbacks described above were identified by the inventor through practice and careful study. Therefore, both the discovery of these problems and the solutions proposed below should be regarded as contributions made by the inventor in the course of the present application.
In order to enable a person skilled in the art to use the present disclosure, the following embodiments are given in connection with a specific application scenario "smart tv field". It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the present application is described primarily in the context of the "smart tv field," it should be understood that this is merely one exemplary embodiment.
The technical solutions in the present application will be described clearly and completely with reference to the drawings in the present application, and it should be understood that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
For the convenience of understanding the present embodiment, a detailed description will be given first of all on a voice interaction method disclosed in the embodiments of the present application.
As shown in FIG. 1, the voice interaction method provided in an embodiment of the present application includes the following steps:
S101, after receiving a voice wake-up instruction, starting a first audio channel for transmitting interactive audio information, and setting a currently started second audio channel for transmitting on-demand audio information to a target state; wherein the target state is an off state or a bass state.
In the embodiment of the application, the smart television comprises at least two audio channels: a first audio channel used for transmitting interactive audio information, and a second audio channel used for transmitting on-demand audio information, for example a TV series requested by the user. The first audio channel corresponds to a first volume and the second audio channel corresponds to a second volume, and the first volume and the second volume can be adjusted separately.
When the smart television is playing on-demand audio information and the voice interaction function between the user and the smart television has not been started, the first audio channel is in a closed state and the second audio channel is in an open state. After a voice wake-up instruction is received, the voice interaction function between the user and the smart television is started: the first audio channel is switched from the closed state to the open state, and the second audio channel is switched from the open state to the target state, so that voice interaction between the user and the smart television can take place.
The first audio channel corresponds to a first preset volume, and when the first audio channel is switched from the off state to the on state, the first volume corresponding to the first audio channel is set to be the first preset volume, where the first preset volume may be a volume pre-stored locally or a volume selected by a user according to a requirement of the user.
The target state is a closed state or a bass state. Switching the second audio channel from the open state to the target state specifically includes: switching the second audio channel from the open state to the closed state; or switching the second audio channel from the open state to the bass state. The bass state corresponds to a second preset volume, that is, the second volume corresponding to the second audio channel is set to the second preset volume, and the on-demand audio information continues to be transmitted to the playing end through the second audio channel and is played at the second volume (the second preset volume). The second preset volume is smaller than the first preset volume.
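The state switching described for step S101 can be sketched as follows. This is a minimal illustration only, not the implementation of the application: the names ChannelState, AudioChannel and on_wake_up, the mute_second flag, and the concrete preset values are all assumptions; the text only requires that the second preset volume be smaller than the first preset volume.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelState(Enum):
    CLOSED = "closed"
    OPEN = "open"   # normal working state
    BASS = "bass"   # still playing, but at a reduced volume


@dataclass
class AudioChannel:
    name: str
    state: ChannelState = ChannelState.CLOSED
    volume: int = 0


# Hypothetical preset volumes (assumed values, not taken from the application).
FIRST_PRESET = 60
SECOND_PRESET = 15


def on_wake_up(first: AudioChannel, second: AudioChannel, mute_second: bool = False) -> None:
    """S101: open the interactive channel, put the on-demand channel in the target state."""
    first.state = ChannelState.OPEN
    first.volume = FIRST_PRESET
    if mute_second:
        # Target state = closed; the stored volume is kept so it can be recovered later.
        second.state = ChannelState.CLOSED
    else:
        # Target state = bass; on-demand audio keeps playing at the second preset volume.
        second.state = ChannelState.BASS
        second.volume = SECOND_PRESET


interactive = AudioChannel("interactive")
on_demand = AudioChannel("on_demand", ChannelState.OPEN, volume=50)
on_wake_up(interactive, on_demand)
print(interactive.state.value, interactive.volume, on_demand.state.value, on_demand.volume)
```

Keeping the two volumes on separate channel objects is what allows the interactive audio to stay audible while the on-demand audio is lowered or muted.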
In the embodiment of the application, the voice wake-up instruction is received in one of the following ways:
1. Receiving specific voice information uttered by the user, such as "turn on the voice interaction function" or "let's chat".
2. Detecting that the user clicks (or long-presses) a voice interaction control key on the smart television.
3. Detecting that the user clicks (or long-presses, or slides) the voice interaction control on the display screen of the smart television.
S102, after receiving an interactive audio instruction, searching interactive audio information matched with the interactive audio instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In the embodiment of the application, the corresponding relation between the interactive audio instruction and the interactive audio information is pre-stored locally, after the interactive audio instruction is received, the interactive audio information corresponding to the interactive audio instruction is searched based on the corresponding relation, and the searched interactive audio information is transmitted to the playing end through the first audio channel to be played, wherein the playing end comprises a display screen and a sound box.
The interactive audio information comprises interactive voice information and interactive video information, the interactive voice information is transmitted to a sound box of the smart television through a first audio channel and is played at the first volume (first preset volume), and the interactive video information is transmitted to a display screen of the smart television through the first audio channel and is played.
In this embodiment of the application, the interactive audio instruction may correspond to fixed interactive audio information or to dynamic interactive audio information. For example, after the interactive audio instruction "how large is your display screen" is received, the matching fixed interactive audio information ("my screen is 55 inches") is transmitted to the playing end through the first audio channel for playing; after the interactive audio instruction "what time is it now" is received, the matching dynamic interactive audio information ("it is now three o'clock in the afternoon") is transmitted to the playing end through the first audio channel for playing.
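The locally pre-stored correspondence between interactive audio instructions and fixed or dynamic interactive audio information can be sketched as a lookup table. This is an illustrative sketch under stated assumptions: the table name RESPONSES, the function find_interactive_audio, the exact instruction strings, and the use of text in place of real audio data are all hypothetical.

```python
from datetime import datetime
from typing import Callable, Dict, Optional, Union

# A response is either fixed text or a function that builds text at lookup time.
Response = Union[str, Callable[[datetime], str]]

# Hypothetical pre-stored correspondence (assumed contents).
RESPONSES: Dict[str, Response] = {
    "how large is your display screen": "My screen is 55 inches.",
    "what time is it now": lambda now: f"The current time is {now:%H:%M}.",
}


def find_interactive_audio(instruction: str, now: Optional[datetime] = None) -> Optional[str]:
    """Return the interactive audio information matching the instruction, if any."""
    entry = RESPONSES.get(instruction.strip().lower())
    if entry is None:
        return None                            # no matching interactive audio information
    if callable(entry):
        return entry(now or datetime.now())    # dynamic interactive audio information
    return entry                               # fixed interactive audio information


print(find_interactive_audio("How large is your display screen"))
```

In a real device the returned value would be audio data handed to the first audio channel rather than a string.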
In practice, after receiving an interactive audio instruction, the processor in the smart television both feeds back the interactive audio information matched with the instruction to the user and executes the instruction. For example, after receiving the interactive audio instruction "turn down the brightness of the display screen", it transmits the interactive audio information ("brightness that is too low can easily harm the eyes") to the playing end through the first audio channel for playing, and reduces the brightness of the display screen in response to the instruction.
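The combination of spoken feedback and execution of the instruction can be sketched as a table that pairs each instruction with feedback text and an action. This is a hypothetical illustration: the Television class, the HANDLERS table, the handle function, and the brightness step of 10 are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple


class Television:
    """Stand-in for the smart television; only what the example needs."""

    def __init__(self) -> None:
        self.brightness = 50
        self.spoken: List[str] = []   # what was sent through the first audio channel

    def speak(self, text: str) -> None:
        self.spoken.append(text)


def turn_down_brightness(tv: Television) -> None:
    tv.brightness = max(0, tv.brightness - 10)   # assumed step size


# Hypothetical mapping: instruction -> (feedback text, action to execute).
HANDLERS: Dict[str, Tuple[str, Callable[[Television], None]]] = {
    "turn down the brightness of the display screen": (
        "Brightness that is too low can easily harm the eyes.",
        turn_down_brightness,
    ),
}


def handle(tv: Television, instruction: str) -> bool:
    """Feed back the matching audio information, then execute the instruction."""
    entry = HANDLERS.get(instruction)
    if entry is None:
        return False
    feedback, action = entry
    tv.speak(feedback)
    action(tv)
    return True


tv = Television()
handle(tv, "turn down the brightness of the display screen")
print(tv.brightness, tv.spoken)
```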
The voice interaction method provided by the embodiment of the application can respectively control the volumes of the interactive audio information and the on-demand audio information based on different audio channels, improves the recognition efficiency of the interactive audio information, and further improves the efficiency of man-machine interaction.
Further, the voice interaction method further comprises:
and after receiving the voice closing instruction, closing the first audio channel, and switching the second audio channel from the target state to the working state.
In the embodiment of the application, after the voice closing instruction is received, the first audio channel is switched from the open state to the closed state, and the second audio channel is switched from the target state to the working state.
Switching the second audio channel from the target state to the working state includes: re-enabling the second audio channel that is in the closed state; or switching the second audio channel from the bass state to a preset volume state.
Specifically, when the target state is the closed state, the second audio channel is switched from the closed state to the open state, and the second volume corresponding to the second audio channel is recovered; and when the target state is a bass state, restoring the second volume corresponding to the second audio channel, or setting the second volume corresponding to the second audio channel as a third preset volume, wherein the third preset volume is a volume pre-stored locally.
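The handling of the voice closing instruction, including the two ways of leaving the bass state, can be sketched as follows. This is a sketch under assumptions: the Channel class, the saved_volume field used to remember the volume before the bass state, and the third_preset parameter name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Channel:
    state: str                          # "open", "closed" or "bass"
    volume: int
    saved_volume: Optional[int] = None  # volume in use before entering the bass state


def on_voice_close(first: Channel, second: Channel, third_preset: Optional[int] = None) -> None:
    """Close the first channel and return the second channel to the working state."""
    first.state = "closed"
    if second.state == "bass":
        if third_preset is not None:
            second.volume = third_preset          # use the locally pre-stored third preset
        elif second.saved_volume is not None:
            second.volume = second.saved_volume   # recover the earlier second volume
    # If the target state was "closed", the stored volume was never changed,
    # so reopening the channel recovers the second volume as it was.
    second.state = "open"
    second.saved_volume = None


first_channel = Channel("open", 60)
second_channel = Channel("bass", 15, saved_volume=50)
on_voice_close(first_channel, second_channel)
print(first_channel.state, second_channel.state, second_channel.volume)
```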
In the embodiment of the application, the voice closing instruction is received in one of the following ways:
1. Receiving specific voice information uttered by the user, such as "turn off the voice interaction function" or "let's end the chat".
2. Detecting that the user clicks (or long-presses) a voice interaction control key on the smart television.
3. Detecting that the user clicks (or long-presses, or slides) the voice interaction control on the display screen of the smart television.
4. After an interactive audio instruction is received, no further interactive audio instruction is received within a preset time range.
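The fourth way of receiving the voice closing instruction, a timeout after the last interactive audio instruction, can be sketched with a small session object. This is an assumption-laden sketch: the class InteractionSession, its method names, and the injectable clock are hypothetical, and the length of the preset time range is not specified in the text.

```python
import time
from typing import Callable, Optional


class InteractionSession:
    """Tracks the time of the last interactive audio instruction."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s            # the preset time range (assumed unit: seconds)
        self._clock = clock
        self._last_instruction_at: Optional[float] = None

    def on_interactive_instruction(self) -> None:
        self._last_instruction_at = self._clock()

    def should_close(self) -> bool:
        """True once no new instruction has arrived within the preset time range."""
        if self._last_instruction_at is None:
            return False
        return self._clock() - self._last_instruction_at > self.timeout_s


# Usage with a fake clock so the example does not need to sleep.
fake_now = [0.0]
session = InteractionSession(timeout_s=10.0, clock=lambda: fake_now[0])
session.on_interactive_instruction()
fake_now[0] = 10.5
print(session.should_close())
```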
Further, after receiving the voice wake-up instruction, the method further includes:
and searching interactive audio information matched with the voice awakening instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In the embodiment of the application, after the voice awakening instruction is received, the interactive audio information matched with the voice awakening instruction is transmitted to the playing end through the first audio channel to be played.
As a possible implementation manner, interactive audio information corresponding to the voice wake-up instruction is pre-stored locally, and after the voice wake-up instruction is received, the interactive audio information is transmitted to the playing end through the first audio channel to be played.
For example, interactive audio information ("happy to chat with you") corresponding to the voice wakeup command is pre-stored locally, and after the voice wakeup command is received, the interactive audio information ("happy to chat with you") is played.
Further, as shown in fig. 2, the first audio channel is further configured to transmit prompt audio information, and after the first audio channel is enabled, the method further includes:
S201, if the prompt audio information to be played is detected, determining the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel.
S202, based on the transmission sequence, the prompt audio information and the interactive audio information are sequentially transmitted to a playing end through the first audio channel to be played.
With reference to steps S201 and S202, the first audio channel is used to transmit both prompt audio information and interactive audio information. During voice interaction between the smart television and the user, if prompt audio information to be played is detected, a first transmission time range corresponding to the prompt audio information to be played and a second transmission time range corresponding to the interactive audio information to be played are obtained. If the first transmission time range overlaps the second transmission time range, the transmission sequence of the two is determined based on the audio information transmission priority corresponding to the first audio channel, and the prompt audio information to be played and the interactive audio information to be played are transmitted in that sequence to the playing end through the first audio channel for playing. If the two ranges do not overlap, the prompt audio information to be played is transmitted within the first transmission time range and the interactive audio information to be played is transmitted within the second transmission time range.
For example, suppose the first transmission time range corresponding to the prompt audio information to be played is 11:30:00 to 11:30:05 on March 31, 2020, and the second transmission time range corresponding to the interactive audio information to be played is 11:30:03 to 11:30:10 on March 31, 2020. Because the two ranges overlap, the prompt audio information to be played and the interactive audio information to be played are transmitted in sequence through the first audio channel according to the audio information transmission priority corresponding to the first audio channel.
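The ordering rule of steps S201 and S202 can be sketched as follows. This is only an illustration: the text does not state which kind of audio information has the higher transmission priority, so the PRIORITY table below (prompt first) is an assumption, as are the names AudioItem, overlaps and transmission_order. Times are given as seconds after 11:30:00.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioItem:
    kind: str     # "prompt" or "interactive"
    start: float  # start of the transmission time range, in seconds
    end: float    # end of the transmission time range, in seconds


# Hypothetical priority table: a lower number is transmitted first (assumed order).
PRIORITY = {"prompt": 0, "interactive": 1}


def overlaps(a: AudioItem, b: AudioItem) -> bool:
    """True if the two transmission time ranges intersect."""
    return a.start < b.end and b.start < a.end


def transmission_order(prompt: AudioItem, interactive: AudioItem) -> List[AudioItem]:
    items = [prompt, interactive]
    if overlaps(prompt, interactive):
        # Conflict on the shared first audio channel: fall back to the priority table.
        return sorted(items, key=lambda item: PRIORITY[item.kind])
    # No conflict: each item is sent within its own time range.
    return sorted(items, key=lambda item: item.start)


prompt_item = AudioItem("prompt", 0, 5)             # 11:30:00 to 11:30:05
interactive_item = AudioItem("interactive", 3, 10)  # 11:30:03 to 11:30:10
print([item.kind for item in transmission_order(prompt_item, interactive_item)])
```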
Further, as shown in fig. 3, the first audio channel is further configured to transmit a prompt audio message, and after the first audio channel is closed, the method further includes:
S301, if the prompt audio information to be played is detected, starting a first audio channel, and setting a currently started second audio channel to be in a target state.
In the embodiment of the application, the first audio channel is used for transmitting prompt audio information and interactive audio information, when a voice interaction function between a user and the smart television is not started, the first audio channel is in a closed state, the second audio channel is in an open state, after the prompt audio information to be played is detected, the first audio channel is switched from the closed state to the open state, and the second audio channel is switched from the open state to a target state.
The first audio channel corresponds to a first preset volume, and when the first audio channel is switched from the off state to the on state, the first volume corresponding to the first audio channel is set to be the first preset volume, where the first preset volume may be a volume pre-stored locally or a volume selected by a user according to a requirement of the user.
The target state is a closed state or a bass state. Switching the second audio channel from the open state to the target state specifically includes: switching the second audio channel from the open state to the closed state; or switching the second audio channel from the open state to the bass state. The bass state corresponds to a second preset volume, that is, the second volume corresponding to the second audio channel is set to the second preset volume, and the on-demand audio information continues to be transmitted to the playing end through the second audio channel and is played at the second volume (the second preset volume). The second preset volume is smaller than the first preset volume.
S302: transmitting the prompt audio information to the playing end through the first audio channel for playing; after the prompt audio information has been played, closing the first audio channel and switching the second audio channel from the target state to a working state.
In this embodiment of the application, the prompt audio information is transmitted to the playing end through the first audio channel for playing, and each piece of prompt audio information has a corresponding playing duration. After the playing duration has elapsed, the first audio channel is switched from the open state to the closed state, and the second audio channel is switched from the target state to the working state.
Switching the second audio channel from the target state to the working state includes: re-enabling the second audio channel that is in the closed state; or switching the second audio channel from the bass state to a preset volume state.
Specifically, when the target state is the closed state, the second audio channel is switched from the closed state to the open state and its second volume is restored. When the target state is the bass state, the second volume of the second audio channel is restored, or is set to a third preset volume, which is a volume pre-stored locally.
The prompt audio information includes prompt voice information and prompt video information. The prompt voice information is transmitted through the first audio channel to a speaker of the smart television and played at the first volume (the first preset volume), and the prompt video information is transmitted through the first audio channel to a display screen of the smart television for playing.
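The prompt flow of steps S301 and S302 can be read as a small state machine over the two channels. The Python below is a minimal sketch of that reading, not the implementation of the application: the names `AudioChannel` and `play_prompt`, the `play` callback, and the numeric preset volumes are all hypothetical, and modelling a closed channel as volume 0 is an assumption made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class State(Enum):
    OFF = "off"    # closed state
    ON = "on"      # open / working state
    BASS = "bass"  # reduced-volume state


@dataclass
class AudioChannel:
    name: str
    state: State = State.OFF
    volume: int = 0


# Hypothetical preset volumes; the text only requires SECOND_PRESET < FIRST_PRESET.
FIRST_PRESET = 60
SECOND_PRESET = 15


def play_prompt(first: AudioChannel, second: AudioChannel, prompt: str,
                play: Callable[[str, int], None],
                target: State = State.BASS) -> None:
    """Sketch of S301/S302: duck or close the second channel, play, restore."""
    if target not in (State.OFF, State.BASS):
        raise ValueError("target state must be OFF or BASS")
    saved_volume = second.volume  # remembered so it can be restored afterwards

    # S301: enable the first channel and put the second channel in the target state.
    first.state, first.volume = State.ON, FIRST_PRESET
    second.state = target
    second.volume = SECOND_PRESET if target is State.BASS else 0

    # S302: play the prompt on the first channel, then close that channel
    # and return the second channel to its working state.
    play(prompt, first.volume)
    first.state, first.volume = State.OFF, 0
    second.state, second.volume = State.ON, saved_volume
```

This sketch restores the saved second volume in both branches; the alternative described above, setting a locally stored third preset volume after the bass state, would replace `saved_volume` in the last line.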
Based on the same inventive concept, an embodiment of the present application further provides a voice interaction apparatus corresponding to the voice interaction method. Because the apparatus solves the problem on a principle similar to that of the voice interaction method described above, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a voice interaction apparatus according to an embodiment of the present application, where the voice interaction apparatus includes:
the first setting module 401 is configured to, after receiving the voice wake-up instruction, enable a first audio channel used for transmitting interactive audio information, and set a currently enabled second audio channel used for transmitting on-demand audio information to a target state; wherein the target state is an off state or a bass state;
the searching module 402 is configured to search, after receiving an interactive audio instruction, interactive audio information matched with the interactive audio instruction;
the first transmission module 403 is configured to transmit the interactive audio information to a playing end through the first audio channel for playing.
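The division into modules 401 to 403 can be illustrated with a short sketch. The Python below is only one possible arrangement under stated assumptions: the class name `VoiceInteraction`, the dictionary-based lookup of interactive audio, the string channel label `"first"`, and the `play` callback are hypothetical and are not taken from the application, which does not specify how matching is performed.

```python
from typing import Callable, Dict, Optional


class VoiceInteraction:
    """Hypothetical grouping of the three modules into one object."""

    def __init__(self, responses: Dict[str, str],
                 play: Callable[[str, str], None]) -> None:
        self.responses = responses     # instruction text -> interactive audio (assumed)
        self.play = play               # play(channel_name, audio), assumed callback
        self.first_enabled = False     # first audio channel starts closed
        self.second_state = "on"       # second audio channel is currently enabled

    def on_wake_up(self, target_state: str = "bass") -> None:
        """First setting module (401): enable channel one, set channel two."""
        if target_state not in ("off", "bass"):
            raise ValueError("target state must be 'off' or 'bass'")
        self.first_enabled = True
        self.second_state = target_state

    def find_interactive_audio(self, instruction: str) -> Optional[str]:
        """Searching module (402): exact-match lookup, an assumption."""
        return self.responses.get(instruction)

    def on_interactive_instruction(self, instruction: str) -> bool:
        """First transmission module (403): send the match over channel one."""
        if not self.first_enabled:
            return False
        audio = self.find_interactive_audio(instruction)
        if audio is None:
            return False
        self.play("first", audio)
        return True
```

A caller would invoke `on_wake_up` when the wake-up instruction arrives and `on_interactive_instruction` for each later instruction; before wake-up the sketch simply ignores instructions.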
In one possible implementation, the voice interaction apparatus further includes:
and the second setting module is used for closing the first audio channel after receiving the voice closing instruction and switching the second audio channel from the target state to the working state.
In one possible implementation, the voice interaction apparatus further includes:
and the second transmission module is used for searching the interactive audio information matched with the voice wake-up instruction and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In a possible implementation manner, the first audio channel is further configured to transmit prompt audio information, and the voice interaction apparatus further includes:
the determining module is used for determining the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel if the prompt audio information to be played is detected;
and the third transmission module is used for sequentially transmitting the prompt audio information and the interactive audio information to a playing end through the first audio channel for playing based on the transmission sequence.
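The ordering performed by the determining module can be sketched as a stable sort on a per-kind priority. The Python below is an illustration under assumptions: the function name `transmission_order`, the kind labels, and the default table that places prompt audio ahead of interactive audio are hypothetical, since the text states only that a transmission priority associated with the first audio channel decides the sequence.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical priority table: a lower number is transmitted first.
DEFAULT_PRIORITY: Dict[str, int] = {"prompt": 0, "interactive": 1}


def transmission_order(items: Iterable[Tuple[str, str]],
                       priority: Dict[str, int] = DEFAULT_PRIORITY) -> List[str]:
    """Order (kind, payload) pairs by kind priority.

    sorted() is stable, so items of the same kind keep their arrival order.
    """
    ordered = sorted(items, key=lambda item: priority[item[0]])
    return [payload for _kind, payload in ordered]
```

Passing a different `priority` mapping reverses the order without changing the function, which keeps the choice of which kind goes first a configuration detail.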
In a possible implementation manner, the first audio channel is further configured to transmit prompt audio information, and the voice interaction apparatus further includes:
the third setting module is used for starting the first audio channel and setting the currently started second audio channel to be in a target state if the prompt audio information to be played is detected;
the fourth transmission module is used for transmitting the prompt audio information to a playing end through the first audio channel for playing;
and the fourth setting module is used for closing the first audio channel and switching the second audio channel from the target state to the working state after the prompt audio information is played.
In a possible implementation manner, the switching, by the second setting module, the second audio channel from the target state to the working state, or the switching, by the fourth setting module, the second audio channel from the target state to the working state includes:
re-enabling the second audio channel in the off state;
or
switching the second audio channel from the bass state to a preset volume state.
The voice interaction device provided by the embodiment of the application can respectively control the volumes of the interactive audio information and the on-demand audio information based on different audio channels, improves the recognition efficiency of the interactive audio information, and further improves the efficiency of man-machine interaction.
Referring to FIG. 5, FIG. 5 shows an electronic device 500 provided in an embodiment of the present application. The electronic device 500 includes a processor 501, a memory 502 and a bus. The memory 502 stores machine-readable instructions executable by the processor 501. When the electronic device is operating, the processor 501 and the memory 502 communicate with each other through the bus, and the processor 501 executes the machine-readable instructions to perform the steps of the voice interaction method.
Specifically, the memory 502 and the processor 501 can be general memories and processors, which are not limited in particular, and the processor 501 can execute the voice interaction method when executing the computer program stored in the memory 502.
Corresponding to the voice interaction method, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps of the voice interaction method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only specific embodiments of the present application, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art may, within the technical scope disclosed in the present application, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present application, and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of voice interaction, the method comprising:
after receiving the voice wake-up instruction, starting a first audio channel for transmitting interactive audio information, and setting a currently started second audio channel for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state;
after receiving an interactive audio instruction, searching interactive audio information matched with the interactive audio instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
2. The voice interaction method of claim 1, further comprising:
and after receiving the voice closing instruction, closing the first audio channel, and switching the second audio channel from the target state to the working state.
3. The voice interaction method of claim 1, wherein after receiving the voice wake-up instruction, the method further comprises:
and searching interactive audio information matched with the voice wake-up instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
4. The voice interaction method of claim 1, wherein the first audio channel is further configured to transmit prompt audio information, and wherein after the first audio channel is enabled, the method further comprises:
if the prompt audio information to be played is detected, determining the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel;
and based on the transmission sequence, sequentially transmitting the prompt audio information and the interactive audio information to a playing end through the first audio channel for playing.
5. The voice interaction method of claim 1, wherein the first audio channel is further configured to transmit prompt audio information, and after the first audio channel is closed, the method further comprises:
if the prompt audio information to be played is detected, starting a first audio channel, and setting a currently started second audio channel to be in a target state;
and transmitting the prompt audio information to a playing end through the first audio channel for playing, closing the first audio channel after the prompt audio information is played, and switching the second audio channel from a target state to a working state.
6. The voice interaction method according to claim 2 or 5, wherein the switching the second audio channel from the target state to the working state comprises:
re-enabling the second audio channel in the off state;
or
switching the second audio channel from the bass state to a preset volume state.
7. A voice interaction apparatus, comprising:
the first setting module is used for starting a first audio channel used for transmitting interactive audio information after receiving the voice wake-up instruction, and setting a second audio channel which is started currently and used for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state;
the searching module is used for searching the interactive audio information matched with the interactive audio instruction after receiving the interactive audio instruction;
and the first transmission module is used for transmitting the interactive audio information to a playing end through the first audio channel for playing.
8. The voice interaction apparatus of claim 7, further comprising:
and the second setting module is used for closing the first audio channel after receiving the voice closing instruction and switching the second audio channel from the target state to the working state.
9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the voice interaction method of any of claims 1 to 6.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the voice interaction method according to any one of claims 1 to 6.
CN202010256089.3A 2020-04-02 2020-04-02 Voice interaction method and device, electronic equipment and storage medium Active CN111462744B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010256089.3A CN111462744B (en) 2020-04-02 2020-04-02 Voice interaction method and device, electronic equipment and storage medium
PCT/CN2020/127116 WO2021196617A1 (en) 2020-04-02 2020-11-06 Voice interaction method and apparatus, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010256089.3A CN111462744B (en) 2020-04-02 2020-04-02 Voice interaction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111462744A (en) 2020-07-28
CN111462744B (en) 2024-01-30

Family

ID=71680542

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010256089.3A Active CN111462744B (en) 2020-04-02 2020-04-02 Voice interaction method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111462744B (en)
WO (1) WO2021196617A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113207058A (en) * 2021-05-06 2021-08-03 李建新 Audio signal transmission processing method
CN113362826A (en) * 2021-06-21 2021-09-07 艺唯科技股份有限公司 Device and method for automatically converting voice channel
WO2021196617A1 (en) * 2020-04-02 2021-10-07 深圳创维-Rgb电子有限公司 Voice interaction method and apparatus, electronic device and storage medium
CN114443197A (en) * 2022-01-24 2022-05-06 北京百度网讯科技有限公司 Interface processing method and device, electronic equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114836936A (en) * 2022-05-10 2022-08-02 海信(山东)冰箱有限公司 Clothes treatment equipment and control method thereof

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101039398A (en) * 2007-04-09 2007-09-19 海尔集团公司 Method for playing television audio and television for realizing the same
CN101945162A (en) * 2009-07-01 2011-01-12 Lg电子株式会社 Portable terminal and content of multimedia control method thereof
CN102945672A (en) * 2012-09-29 2013-02-27 深圳市国华识别科技开发有限公司 Voice control system for multimedia equipment, and voice control method
CN108363557A (en) * 2018-02-02 2018-08-03 刘国华 Man-machine interaction method, device, computer equipment and storage medium
CN108769745A (en) * 2018-06-29 2018-11-06 百度在线网络技术(北京)有限公司 Video broadcasting method and device
CN109275025A (en) * 2018-09-25 2019-01-25 四川长虹电器股份有限公司 The method for weakening background sound when realizing voice broadcast in smart television
CN110017848A (en) * 2019-04-11 2019-07-16 北京三快在线科技有限公司 Phonetic navigation method, device, electronic equipment and storage medium
CN110166550A (en) * 2019-05-22 2019-08-23 湖南康通电子股份有限公司 A kind of fixed time broadcast method and device of digit broadcasting system
CN110290475A (en) * 2019-05-30 2019-09-27 深圳米唐科技有限公司 Vehicle-mounted man-machine interaction method, system and computer readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105867718A (en) * 2015-12-10 2016-08-17 乐视网信息技术(北京)股份有限公司 Multimedia interaction method and apparatus
US10509626B2 (en) * 2016-02-22 2019-12-17 Sonos, Inc Handling of loss of pairing between networked devices
CN109151564B (en) * 2018-09-03 2021-06-29 海信视像科技股份有限公司 Equipment control method and device based on microphone
CN113763956A (en) * 2019-03-12 2021-12-07 百度在线网络技术(北京)有限公司 Interaction method and device applied to vehicle
CN111462744B (en) * 2020-04-02 2024-01-30 深圳创维-Rgb电子有限公司 Voice interaction method and device, electronic equipment and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021196617A1 (en) * 2020-04-02 2021-10-07 深圳创维-Rgb电子有限公司 Voice interaction method and apparatus, electronic device and storage medium
CN113207058A (en) * 2021-05-06 2021-08-03 李建新 Audio signal transmission processing method
CN113362826A (en) * 2021-06-21 2021-09-07 艺唯科技股份有限公司 Device and method for automatically converting voice channel
CN114443197A (en) * 2022-01-24 2022-05-06 北京百度网讯科技有限公司 Interface processing method and device, electronic equipment and storage medium
CN114443197B (en) * 2022-01-24 2024-04-09 北京百度网讯科技有限公司 Interface processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111462744B (en) 2024-01-30
WO2021196617A1 (en) 2021-10-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant