CN111462744A - Voice interaction method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN111462744A (application number CN202010256089.3A)
- Authority
- CN
- China
- Prior art keywords
- audio
- audio channel
- audio information
- interactive
- state
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The application provides a voice interaction method, a voice interaction device, electronic equipment and a storage medium, wherein the voice interaction method comprises the following steps: after receiving the voice wake-up instruction, starting a first audio channel for transmitting interactive audio information, and setting a currently started second audio channel for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state; after receiving an interactive audio instruction, searching interactive audio information matched with the interactive audio instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing. The volume of the interactive audio information and the volume of the on-demand audio information can be respectively controlled based on different audio channels, the recognition efficiency of the interactive audio information is improved, and the human-computer interaction efficiency is further improved.
Description
Technical Field
The present application relates to the field of speech recognition technologies, and in particular, to a speech interaction method, apparatus, electronic device, and storage medium.
Background
In recent years, as the speech recognition technology gradually matures, the speech recognition technology is often applied to the field of smart televisions to realize speech interaction functions between the smart televisions and users, such as channel switching, volume adjustment, and turning on or off of the smart televisions based on speech.
In practice, while using the smart television, a user can carry out voice interaction with the smart television while watching a television program and obtain voice interaction content fed back by the smart television. Because the television program is still playing, it is difficult for the user to distinguish the voice interaction content from the television program, which reduces the efficiency with which the user recognizes the voice interaction content and, in turn, the efficiency of interaction between the user and the smart television.
Disclosure of Invention
In view of this, an object of the embodiments of the present application is to provide a voice interaction method, apparatus, electronic device and storage medium, which can respectively control volumes of interactive audio information and on-demand audio information based on different audio channels, thereby improving recognition efficiency of the interactive audio information and further improving efficiency of human-computer interaction.
In a first aspect, an embodiment of the present application provides a voice interaction method, where the method includes:
after receiving the voice wake-up instruction, starting a first audio channel for transmitting interactive audio information, and setting a currently started second audio channel for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state;
after receiving an interactive audio instruction, searching interactive audio information matched with the interactive audio instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In one possible implementation, the voice interaction method further includes:
and after receiving the voice closing instruction, closing the first audio channel, and switching the second audio channel from the target state to the working state.
In one possible implementation, the voice interaction method further includes:
and searching interactive audio information matched with the voice awakening instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In a possible implementation, the first audio channel is further configured to transmit prompt audio information, and after the first audio channel is enabled, the method further includes:
if the prompt audio information to be played is detected, determining the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel;
and based on the transmission sequence, sequentially transmitting the prompt audio information and the interactive audio information to a playing end through the first audio channel for playing.
In a possible implementation, the first audio channel is further configured to transmit a prompt audio message, and after the first audio channel is closed, the method further includes:
if the prompt audio information to be played is detected, starting a first audio channel, and setting a currently started second audio channel to be in a target state;
and transmitting the prompt audio information to a playing end through the first audio channel for playing, closing the first audio channel after the prompt audio information is played, and switching the second audio channel from a target state to a working state.
In a possible implementation, the switching the second audio channel from the target state to the working state includes:
re-enabling the second audio channel in the off state;
or,
switching the second audio channel from the bass state to a preset volume state.
In a second aspect, an embodiment of the present application provides a voice interaction apparatus, where the apparatus includes:
the first setting module is used for starting a first audio channel used for transmitting interactive audio information after receiving the voice wake-up instruction, and setting a second audio channel which is started currently and used for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state;
the searching module is used for searching the interactive audio information matched with the interactive audio instruction after receiving the interactive audio instruction;
and the first transmission module is used for transmitting the interactive audio information to a playing end through the first audio channel for playing.
In one possible implementation, the voice interaction apparatus further includes:
and the second setting module is used for closing the first audio channel after receiving the voice closing instruction and switching the second audio channel from the target state to the working state.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, the processor and the memory communicate via the bus when the electronic device is running, and the processor executes the machine-readable instructions to perform the steps of the voice interaction method according to any one of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the voice interaction method according to any one of the first aspect.
According to the voice interaction method, the voice interaction device, the electronic equipment and the storage medium, after the voice wake-up instruction is received, a first audio channel used for transmitting interactive audio information is started, and a second audio channel which is started currently and used for transmitting on-demand audio information is set to be in a target state; wherein the target state is an off state or a bass state; after the interactive audio instruction is received, the interactive audio information matched with the interactive audio instruction is searched, and the interactive audio information is transmitted to the playing end through the first audio channel to be played.
Further, the voice interaction method, the voice interaction device, the electronic device and the storage medium provided by the embodiment of the application can also determine the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel after the prompt audio information to be played is detected; and based on the transmission sequence, the prompt audio information and the interactive audio information are sequentially transmitted to the playing end through the first audio channel to be played, wherein the first audio channel is used for transmitting the prompt audio information and the interactive audio information, the number of occupied audio channels can be reduced, the utilization rate of the first audio channel is improved, and the transmission sequence of the audio information is determined based on the transmission priority of the audio information corresponding to the first audio channel, so that the transmission quality of the audio information of the first audio channel can be improved.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a flowchart of a voice interaction method provided in an embodiment of the present application;
FIG. 2 is a flowchart of another voice interaction method provided in an embodiment of the present application;
FIG. 3 is a flowchart of another voice interaction method provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a voice interaction apparatus provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
At the present stage, while using the smart television, a user can carry out voice interaction with the smart television while watching a television program and obtain voice interaction content fed back by the smart television. Because the television program is still playing, it is difficult for the user to distinguish the voice interaction content from the television program, which reduces the efficiency with which the user recognizes the voice interaction content and, in turn, the efficiency of interaction between the user and the smart television.
Based on the above problems, embodiments of the present application provide a voice interaction method, apparatus, electronic device, and storage medium, where after receiving a voice wake-up instruction, a first audio channel for transmitting interactive audio information is enabled, and a currently enabled second audio channel for transmitting on-demand audio information is set to a target state; wherein the target state is an off state or a bass state; after the interactive audio instruction is received, the interactive audio information matched with the interactive audio instruction is searched, and the interactive audio information is transmitted to the playing end through the first audio channel to be played.
The above drawbacks were identified by the inventor through practice and careful study. Therefore, both the discovery of the above problems and the solutions proposed below in the present application should be regarded as contributions made by the inventor in the course of the present application.
In order to enable a person skilled in the art to use the present disclosure, the following embodiments are given in connection with a specific application scenario "smart tv field". It will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the application. Although the present application is described primarily in the context of the "smart tv field," it should be understood that this is merely one exemplary embodiment.
The technical solutions in the present application will be described clearly and completely with reference to the drawings in the present application, and it should be understood that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
For the convenience of understanding the present embodiment, a detailed description will be given first of all on a voice interaction method disclosed in the embodiments of the present application.
As shown in fig. 1, a flowchart of a voice interaction method provided in an embodiment of the present application is shown, where the voice interaction method includes the following steps:
S101, after receiving a voice wake-up instruction, starting a first audio channel for transmitting interactive audio information, and setting a currently started second audio channel for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state.
In the embodiment of the application, the smart television comprises at least two audio channels: a first audio channel used for transmitting interactive audio information, and a second audio channel used for transmitting on-demand audio information, for example, the audio of a TV series requested by the user. The first audio channel corresponds to a first volume and the second audio channel corresponds to a second volume, and the first volume and the second volume can be adjusted separately.
When the smart television is playing on-demand audio information and the voice interaction function between the user and the smart television has not been started, the first audio channel is in the closed state and the second audio channel is in the open state. After a voice wake-up instruction is received, the voice interaction function is started: the first audio channel is switched from the closed state to the open state, and the second audio channel is switched from the open state to the target state, so that voice interaction between the user and the smart television can take place.
The first audio channel corresponds to a first preset volume, and when the first audio channel is switched from the off state to the on state, the first volume corresponding to the first audio channel is set to be the first preset volume, where the first preset volume may be a volume pre-stored locally or a volume selected by a user according to a requirement of the user.
The target state is a closed state or a bass state (that is, a reduced-volume state). Switching the second audio channel from the open state to the target state specifically includes: switching the second audio channel from the open state to the closed state; or switching the second audio channel from the open state to the bass state. The bass state corresponds to a second preset volume; that is, the second volume corresponding to the second audio channel is set to the second preset volume, and the on-demand audio information continues to be transmitted to the playing end through the second audio channel and is played at the second volume (the second preset volume). The second preset volume is smaller than the first preset volume.
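The state changes of step S101 can be sketched as follows. This is only an illustrative sketch, not the implementation of the application: the names `ChannelState`, `AudioChannel`, and `on_wake`, and the two preset volume values, are assumptions introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    BASS = "bass"  # reduced-volume state


@dataclass
class AudioChannel:
    name: str
    state: ChannelState = ChannelState.CLOSED
    volume: int = 0


# Hypothetical preset values; the text only requires second < first.
FIRST_PRESET_VOLUME = 60
SECOND_PRESET_VOLUME = 10


def on_wake(first: AudioChannel, second: AudioChannel, close_second: bool) -> None:
    """Open the first channel and move the second channel to the target state."""
    first.state = ChannelState.OPEN
    first.volume = FIRST_PRESET_VOLUME
    if close_second:
        # Target state is the closed state: on-demand audio is no longer played.
        second.state = ChannelState.CLOSED
    else:
        # Target state is the bass state: on-demand audio keeps playing quietly.
        second.state = ChannelState.BASS
        second.volume = SECOND_PRESET_VOLUME


first = AudioChannel("interactive")
second = AudioChannel("on_demand", ChannelState.OPEN, 40)
on_wake(first, second, close_second=False)
```

Keeping one volume per channel is what allows the interactive audio to be louder than the on-demand audio while both are playing.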
In the embodiment of the application, the voice wake-up instruction is received in one of the following ways:
1. Receiving specific voice information uttered by the user, such as "turn on the voice interaction function" or "let's chat".
2. Detecting that the user clicks (or long-presses) a voice interaction control key on the smart television.
3. Detecting that the user clicks (or long-presses, or slides) the voice interaction control on the display screen of the smart television.
S102, after receiving an interactive audio instruction, searching interactive audio information matched with the interactive audio instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In the embodiment of the application, the corresponding relation between the interactive audio instruction and the interactive audio information is pre-stored locally, after the interactive audio instruction is received, the interactive audio information corresponding to the interactive audio instruction is searched based on the corresponding relation, and the searched interactive audio information is transmitted to the playing end through the first audio channel to be played, wherein the playing end comprises a display screen and a sound box.
The interactive audio information comprises interactive voice information and interactive video information, the interactive voice information is transmitted to a sound box of the smart television through a first audio channel and is played at the first volume (first preset volume), and the interactive video information is transmitted to a display screen of the smart television through the first audio channel and is played.
In this embodiment of the application, the interactive audio instruction may correspond to fixed interactive audio information or to dynamic interactive audio information. For example, after the interactive audio instruction "how large is your display screen" is received, the fixed interactive audio information matching it ("my screen is 55 inches") is transmitted to the playing end through the first audio channel for playing; or, after the interactive audio instruction "what time is it now" is received, the dynamic interactive audio information matching it ("it is now three o'clock in the afternoon") is transmitted to the playing end through the first audio channel for playing.
In practice, after receiving the interactive audio instruction, the processor in the smart television both feeds back the interactive audio information matching the interactive audio instruction to the user and carries out the interactive audio instruction. For example, after receiving the interactive audio instruction "turn down the brightness of the display screen", the processor transmits the interactive audio information ("brightness that is too low can easily harm your eyes") to the playing end through the first audio channel for playing, and reduces the brightness of the display screen in response to the instruction.
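A locally stored correspondence that supports both fixed and dynamic interactive audio information might look like the sketch below. The table contents, the name `find_interactive_audio`, and the normalization of the instruction text are assumptions made for illustration; the application does not specify how the correspondence is stored.

```python
from datetime import datetime
from typing import Callable, Dict, Optional, Union

# A fixed entry is a plain string; a dynamic entry is computed when requested.
Response = Union[str, Callable[[datetime], str]]

# Hypothetical correspondence table, pre-stored locally.
RESPONSES: Dict[str, Response] = {
    "how large is your display screen": "My screen is 55 inches",
    "what time is it now": lambda now: f"The current time is {now:%H:%M}",
}


def find_interactive_audio(instruction: str, now: datetime) -> Optional[str]:
    """Look up the interactive audio information matching an instruction."""
    entry = RESPONSES.get(instruction.strip().lower())
    if entry is None:
        return None  # no matching interactive audio information
    return entry(now) if callable(entry) else entry


now = datetime(2020, 3, 31, 15, 0)
print(find_interactive_audio("How large is your display screen", now))
print(find_interactive_audio("what time is it now", now))
```

The returned text would then be synthesized or mapped to stored audio and sent through the first audio channel; that step is outside this sketch.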
The voice interaction method provided by the embodiment of the application can respectively control the volumes of the interactive audio information and the on-demand audio information based on different audio channels, improves the recognition efficiency of the interactive audio information, and further improves the efficiency of man-machine interaction.
Further, the voice interaction method further comprises:
and after receiving the voice closing instruction, closing the first audio channel, and switching the second audio channel from the target state to the working state.
In the embodiment of the application, after the voice closing instruction is received, the first audio channel is switched from the open state to the closed state, and the second audio channel is switched from the target state to the working state.
Switching the second audio channel from the target state to the working state includes: re-enabling the second audio channel that is in the closed state; or switching the second audio channel from the bass state to a preset volume state.
Specifically, when the target state is the closed state, the second audio channel is switched from the closed state to the open state, and the second volume corresponding to the second audio channel is recovered; and when the target state is a bass state, restoring the second volume corresponding to the second audio channel, or setting the second volume corresponding to the second audio channel as a third preset volume, wherein the third preset volume is a volume pre-stored locally.
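The restoration rules above can be sketched as follows, assuming the volume in effect before the target state was entered has been saved. The `SecondChannel` type, its `saved_volume` field, and the string state labels are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SecondChannel:
    state: str         # "open", "closed", or "bass"
    volume: int        # volume currently in effect
    saved_volume: int  # volume in effect before entering the target state


def restore_second_channel(ch: SecondChannel, third_preset: Optional[int] = None) -> None:
    """Switch the second channel from the target state back to the working state."""
    if ch.state == "closed":
        # Re-enable the channel and recover its previous volume.
        ch.volume = ch.saved_volume
    elif ch.state == "bass":
        # Either recover the previous volume or apply the third preset volume.
        ch.volume = ch.saved_volume if third_preset is None else third_preset
    ch.state = "open"


ch = SecondChannel(state="bass", volume=10, saved_volume=40)
restore_second_channel(ch)
print(ch.state, ch.volume)  # open 40
```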
In the embodiment of the application, the voice closing instruction is received in one of the following ways:
1. Receiving specific voice information uttered by the user, such as "turn off the voice interaction function" or "let's end the chat".
2. Detecting that the user clicks (or long-presses) a voice interaction control key on the smart television.
3. Detecting that the user clicks (or long-presses, or slides) the voice interaction control on the display screen of the smart television.
4. Detecting that, after an interactive audio instruction is received, no further interactive audio instruction is received within a preset time range.
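The fourth way, closing on inactivity, reduces to a timestamp comparison. The sketch below assumes timestamps in seconds and a hypothetical function name `session_timed_out`; the length of the preset time range is not given in the text.

```python
def session_timed_out(last_instruction_at: float, now: float, timeout_s: float) -> bool:
    """Return True once no new instruction has arrived for at least timeout_s seconds."""
    return (now - last_instruction_at) >= timeout_s


# Last instruction at t=100 s, with a hypothetical 8-second preset time range.
print(session_timed_out(100.0, 103.0, timeout_s=8.0))  # False
print(session_timed_out(100.0, 108.0, timeout_s=8.0))  # True
```

In a running system the timestamps would preferably come from a monotonic clock such as `time.monotonic()`, so that wall-clock adjustments do not trigger or delay the close.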
Further, after receiving the voice wake-up instruction, the method further includes:
and searching interactive audio information matched with the voice wake-up instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
In the embodiment of the application, after the voice wake-up instruction is received, the interactive audio information matched with the voice wake-up instruction is transmitted to the playing end through the first audio channel to be played.
As a possible implementation manner, interactive audio information corresponding to the voice wake-up instruction is pre-stored locally, and after the voice wake-up instruction is received, the interactive audio information is transmitted to the playing end through the first audio channel to be played.
For example, interactive audio information ("happy to chat with you") corresponding to the voice wake-up instruction is pre-stored locally, and after the voice wake-up instruction is received, the interactive audio information ("happy to chat with you") is played.
Further, as shown in fig. 2, the first audio channel is further configured to transmit prompt audio information, and after the first audio channel is enabled, the method further includes:
S201, if the prompt audio information to be played is detected, determining the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel.
S202, based on the transmission sequence, the prompt audio information and the interactive audio information are sequentially transmitted to a playing end through the first audio channel to be played.
With reference to steps S201 and S202, the first audio channel is used for transmitting both prompt audio information and interactive audio information. During voice interaction between the smart television and the user, if prompt audio information to be played is detected, a first transmission time range corresponding to the prompt audio information to be played and a second transmission time range corresponding to the interactive audio information to be played are obtained. If the first transmission time range intersects the second transmission time range, the transmission sequence of the two is determined based on the audio information transmission priority corresponding to the first audio channel, and the prompt audio information and the interactive audio information are transmitted in that sequence to the playing end through the first audio channel for playing. If the two ranges do not intersect, the prompt audio information to be played is transmitted in the first transmission time range and the interactive audio information to be played is transmitted in the second transmission time range.
For example, if the first transmission time range corresponding to the prompt audio information to be played is from 11:30:00 to 11:30:05 on March 31, 2020, and the second transmission time range corresponding to the interactive audio information to be played is from 11:30:03 to 11:30:10 on March 31, 2020, the two ranges intersect, so the prompt audio information to be played and the interactive audio information to be played are transmitted sequentially through the first audio channel according to the audio information transmission priority corresponding to the first audio channel.
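The ordering decision of steps S201 and S202 can be sketched as an interval-overlap test followed by a priority sort. The text does not state which kind of audio information has the higher priority, so the `PRIORITY` table below (prompt first) is purely an assumption, as are the names `PendingAudio`, `ranges_intersect`, and `transmission_order`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PendingAudio:
    kind: str     # "prompt" or "interactive"
    start: float  # start of the transmission time range, in seconds
    end: float    # end of the transmission time range, in seconds


# Hypothetical priority: a lower number is transmitted first.
PRIORITY = {"prompt": 0, "interactive": 1}


def ranges_intersect(a: PendingAudio, b: PendingAudio) -> bool:
    """Two time ranges intersect when each one starts before the other ends."""
    return a.start < b.end and b.start < a.end


def transmission_order(prompt: PendingAudio, interactive: PendingAudio) -> List[PendingAudio]:
    """Order by priority when the ranges intersect, otherwise by start time."""
    items = [prompt, interactive]
    if ranges_intersect(prompt, interactive):
        return sorted(items, key=lambda p: PRIORITY[p.kind])
    return sorted(items, key=lambda p: p.start)


# Seconds after 11:30:00, using the ranges 00-05 and 03-10.
prompt = PendingAudio("prompt", 0, 5)
interactive = PendingAudio("interactive", 3, 10)
print([p.kind for p in transmission_order(prompt, interactive)])  # ['prompt', 'interactive']
```

Sending both kinds of audio through one channel is what makes an explicit ordering rule necessary; with separate channels they could simply overlap.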
Further, as shown in fig. 3, the first audio channel is further configured to transmit a prompt audio message, and after the first audio channel is closed, the method further includes:
S301, if the prompt audio information to be played is detected, starting a first audio channel, and setting a currently started second audio channel to be in a target state.
In the embodiment of the application, the first audio channel is used for transmitting prompt audio information and interactive audio information, when a voice interaction function between a user and the smart television is not started, the first audio channel is in a closed state, the second audio channel is in an open state, after the prompt audio information to be played is detected, the first audio channel is switched from the closed state to the open state, and the second audio channel is switched from the open state to a target state.
The first audio channel corresponds to a first preset volume, and when the first audio channel is switched from the off state to the on state, the first volume corresponding to the first audio channel is set to be the first preset volume, where the first preset volume may be a volume pre-stored locally or a volume selected by a user according to a requirement of the user.
The target state is a closed state or a bass state (that is, a reduced-volume state). Switching the second audio channel from the open state to the target state specifically includes: switching the second audio channel from the open state to the closed state; or switching the second audio channel from the open state to the bass state. The bass state corresponds to a second preset volume; that is, the second volume corresponding to the second audio channel is set to the second preset volume, and the on-demand audio information continues to be transmitted to the playing end through the second audio channel and is played at the second volume (the second preset volume). The second preset volume is smaller than the first preset volume.
S302: transmitting the prompt audio information to the playing end through the first audio channel for playing, and, after the prompt audio information has been played, closing the first audio channel and switching the second audio channel from the target state to the working state.
In this embodiment of the application, the prompt audio information is transmitted to the playing end through the first audio channel for playing. Each piece of prompt audio information has a corresponding playing duration; once that duration has elapsed, the first audio channel is switched from the open state to the closed state, and the second audio channel is switched from the target state to the working state.
Switching the second audio channel from the target state to the working state includes: re-enabling the second audio channel that is in the closed state; or switching the second audio channel from the bass state to a preset volume state.
Specifically, when the target state is the closed state, the second audio channel is switched from the closed state to the open state, and the second volume corresponding to the second audio channel is restored. When the target state is the bass state, the second volume corresponding to the second audio channel is restored, or is set to a third preset volume, where the third preset volume is a volume pre-stored locally.
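The return to the working state can be sketched as follows. This is a hedged illustration only: `SecondChannel`, `saved_volume`, and `restore_working_state` are hypothetical names, and the string-valued states stand in for whatever representation an implementation actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SecondChannel:
    state: str         # "open", "closed" or "bass" (assumed encoding)
    volume: int        # current volume
    saved_volume: int  # volume in use before the target state was entered


def restore_working_state(ch: SecondChannel,
                          third_preset_volume: Optional[int] = None) -> None:
    """Switch the second channel from its target state back to the working state."""
    if ch.state == "open":
        return  # already working, nothing to restore
    if ch.state == "closed":
        # Re-enable the channel and recover the earlier volume.
        ch.volume = ch.saved_volume
    elif ch.state == "bass":
        # Either recover the earlier volume or apply a locally stored preset.
        ch.volume = (ch.saved_volume if third_preset_volume is None
                     else third_preset_volume)
    else:
        raise ValueError(f"unknown channel state: {ch.state!r}")
    ch.state = "open"
```

Passing `third_preset_volume` selects the second option described for the bass state; omitting it restores the volume that was in use before the prompt.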
The prompt audio information includes prompt voice information and prompt video information. The prompt voice information is transmitted through the first audio channel to a speaker of the smart television and played at the first volume (the first preset volume), and the prompt video information is transmitted through the first audio channel to a display screen of the smart television for playing.
Based on the same inventive concept, the embodiments of the present application further provide a voice interaction apparatus corresponding to the voice interaction method. Because the apparatus solves the problem on a principle similar to that of the voice interaction method described above, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a voice interaction apparatus according to an embodiment of the present application, where the voice interaction apparatus includes:
a first setting module 401, configured to, after a voice wake-up instruction is received, enable a first audio channel used for transmitting interactive audio information, and set a currently enabled second audio channel used for transmitting on-demand audio information to a target state, wherein the target state is a closed state or a bass state;
a searching module 402, configured to, after an interactive audio instruction is received, search for interactive audio information matching the interactive audio instruction;
a first transmission module 403, configured to transmit the interactive audio information to a playing end through the first audio channel for playing.
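The division of labor among modules 401 to 403 can be sketched as one small class. This is a simplified illustration under stated assumptions: the dictionary lookup stands in for whatever matching the searching module performs, and `library`, `play`, and all method names are hypothetical rather than taken from the application.

```python
from typing import Callable, Dict, Optional


class VoiceInteractionApparatus:
    def __init__(self, library: Dict[str, str],
                 play: Callable[[str, str], None],
                 target_state: str = "bass") -> None:
        self.library = library            # instruction text -> audio id (assumed)
        self.play = play                  # play(channel_name, audio_id), assumed
        self.target_state = target_state  # "closed" or "bass"
        self.first_channel_open = False
        self.second_channel_state = "open"

    def on_wake_up(self) -> None:
        """First setting module 401: enable channel one, duck or close channel two."""
        self.first_channel_open = True
        self.second_channel_state = self.target_state

    def search(self, instruction: str) -> Optional[str]:
        """Searching module 402: find interactive audio matching the instruction."""
        return self.library.get(instruction)

    def transmit(self, audio_id: str) -> None:
        """First transmission module 403: send audio over the first channel."""
        if not self.first_channel_open:
            raise RuntimeError("first audio channel is not enabled")
        self.play("first", audio_id)

    def on_interactive_instruction(self, instruction: str) -> Optional[str]:
        audio_id = self.search(instruction)
        if audio_id is not None:
            self.transmit(audio_id)
        return audio_id
```

A caller would invoke `on_wake_up()` once the wake-up instruction is recognized and then `on_interactive_instruction(...)` for each subsequent instruction.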
In one possible implementation, the voice interaction apparatus further includes:
a second setting module, configured to, after a voice closing instruction is received, close the first audio channel and switch the second audio channel from the target state to the working state.
In one possible implementation, the voice interaction apparatus further includes:
a second transmission module, configured to search for interactive audio information matching the voice wake-up instruction and transmit the interactive audio information to a playing end through the first audio channel for playing.
In one possible implementation, the first audio channel is further configured to transmit prompt audio information, and the voice interaction apparatus further includes:
a determining module, configured to, if prompt audio information to be played is detected, determine the transmission order of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel;
a third transmission module, configured to sequentially transmit the prompt audio information and the interactive audio information to a playing end through the first audio channel for playing, based on the transmission order.
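One way the determining module might derive a transmission order is with a priority queue, sketched below. The priority values are assumptions: the text here does not state whether prompt or interactive audio ranks higher, so the example arbitrarily places prompts first and keeps arrival order within a priority level.

```python
import heapq
import itertools
from typing import List, Tuple

# Hypothetical priorities (lower value is transmitted first).
TRANSMISSION_PRIORITY = {"prompt": 0, "interactive": 1}


def transmission_order(pending: List[Tuple[str, str]]) -> List[str]:
    """Order (kind, audio_id) pairs by priority, then by arrival."""
    counter = itertools.count()  # tie-breaker that preserves arrival order
    heap = [(TRANSMISSION_PRIORITY[kind], next(counter), audio_id)
            for kind, audio_id in pending]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

The third transmission module would then send the returned items over the first audio channel one after another.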
In one possible implementation, the first audio channel is further configured to transmit prompt audio information, and the voice interaction apparatus further includes:
a third setting module, configured to, if prompt audio information to be played is detected, enable the first audio channel and set the currently enabled second audio channel to the target state;
a fourth transmission module, configured to transmit the prompt audio information to a playing end through the first audio channel for playing;
a fourth setting module, configured to close the first audio channel and switch the second audio channel from the target state to the working state after the prompt audio information has been played.
In one possible implementation, the switching of the second audio channel from the target state to the working state by the second setting module, or by the fourth setting module, includes:
re-enabling the second audio channel that is in the closed state;
or,
switching the second audio channel from the bass state to a preset volume state.
The voice interaction device provided by the embodiment of the application can respectively control the volumes of the interactive audio information and the on-demand audio information based on different audio channels, improves the recognition efficiency of the interactive audio information, and further improves the efficiency of man-machine interaction.
Referring to Fig. 5, Fig. 5 shows an electronic device 500 provided in an embodiment of the present application. The electronic device 500 includes a processor 501, a memory 502, and a bus. The memory 502 stores machine-readable instructions executable by the processor 501. When the electronic device is running, the processor 501 and the memory 502 communicate with each other through the bus, and the processor 501 executes the machine-readable instructions to perform the steps of the voice interaction method.
Specifically, the memory 502 and the processor 501 can be general memories and processors, which are not limited in particular, and the processor 501 can execute the voice interaction method when executing the computer program stored in the memory 502.
Corresponding to the voice interaction method, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps of the voice interaction method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, and are used for illustrating the technical solutions of the present application, but not limiting the same, and the scope of the present application is not limited thereto, and although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the exemplary embodiments of the present application, and are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A method of voice interaction, the method comprising:
after receiving the voice wake-up instruction, starting a first audio channel for transmitting interactive audio information, and setting a currently started second audio channel for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state;
after receiving an interactive audio instruction, searching interactive audio information matched with the interactive audio instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
2. The voice interaction method of claim 1, further comprising:
and after receiving the voice closing instruction, closing the first audio channel, and switching the second audio channel from the target state to the working state.
3. The voice interaction method of claim 1, wherein after receiving the voice wake-up command, the method further comprises:
and searching interactive audio information matched with the voice awakening instruction, and transmitting the interactive audio information to a playing end through the first audio channel for playing.
4. The voice interaction method of claim 1, wherein the first audio channel is further configured to transmit a prompt audio message, and wherein after the first audio channel is enabled, the method further comprises:
if the prompt audio information to be played is detected, determining the transmission sequence of the prompt audio information and the interactive audio information based on the audio information transmission priority corresponding to the first audio channel;
and based on the transmission sequence, sequentially transmitting the prompt audio information and the interactive audio information to a playing end through the first audio channel for playing.
5. The voice interaction method of claim 1, wherein the first audio channel is further configured to transmit a prompt audio message, and after the first audio channel is closed, the method further comprises:
if the prompt audio information to be played is detected, starting a first audio channel, and setting a currently started second audio channel to be in a target state;
and transmitting the prompt audio information to a playing end through the first audio channel for playing, closing the first audio channel after the prompt audio information is played, and switching the second audio channel from a target state to a working state.
6. The voice interaction method according to claim 2 or 5, wherein the switching the second audio channel from the target state to the working state comprises:
re-enabling the second audio channel in the off state;
or,
and switching the second audio channel from a low-tone state to a preset volume state.
7. A voice interaction apparatus, comprising:
the first setting module is used for starting a first audio channel used for transmitting interactive audio information after receiving the voice wake-up instruction, and setting a second audio channel which is started currently and used for transmitting on-demand audio information to be in a target state; wherein the target state is an off state or a bass state;
the searching module is used for searching the interactive audio information matched with the interactive audio instruction after receiving the interactive audio instruction;
and the first transmission module is used for transmitting the interactive audio information to a playing end through the first audio channel for playing.
8. The voice interaction device of claim 7, further comprising:
and the second setting module is used for closing the first audio channel after receiving the voice closing instruction and switching the second audio channel from the target state to the working state.
9. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the voice interaction method of any of claims 1 to 6.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the voice interaction method according to any one of claims 1 to 6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010256089.3A CN111462744B (en) | 2020-04-02 | 2020-04-02 | Voice interaction method and device, electronic equipment and storage medium |
PCT/CN2020/127116 WO2021196617A1 (en) | 2020-04-02 | 2020-11-06 | Voice interaction method and apparatus, electronic device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010256089.3A CN111462744B (en) | 2020-04-02 | 2020-04-02 | Voice interaction method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111462744A (en) | 2020-07-28 |
CN111462744B (en) | 2024-01-30 |
Family
ID=71680542
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202010256089.3A | Active: CN111462744B (en) | 2020-04-02 | 2020-04-02 | Voice interaction method and device, electronic equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111462744B (en) |
WO (1) | WO2021196617A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113207058A (en) * | 2021-05-06 | 2021-08-03 | 李建新 | Audio signal transmission processing method |
CN113362826A (en) * | 2021-06-21 | 2021-09-07 | 艺唯科技股份有限公司 | Device and method for automatically converting voice channel |
WO2021196617A1 (en) * | 2020-04-02 | 2021-10-07 | 深圳创维-Rgb电子有限公司 | Voice interaction method and apparatus, electronic device and storage medium |
CN114443197A (en) * | 2022-01-24 | 2022-05-06 | 北京百度网讯科技有限公司 | Interface processing method and device, electronic equipment and storage medium |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114836936A (en) * | 2022-05-10 | 2022-08-02 | 海信(山东)冰箱有限公司 | Clothes treatment equipment and control method thereof |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101039398A (en) * | 2007-04-09 | 2007-09-19 | 海尔集团公司 | Method for playing television audio and television for realizing the same |
CN101945162A (en) * | 2009-07-01 | 2011-01-12 | Lg电子株式会社 | Portable terminal and content of multimedia control method thereof |
CN102945672A (en) * | 2012-09-29 | 2013-02-27 | 深圳市国华识别科技开发有限公司 | Voice control system for multimedia equipment, and voice control method |
CN108363557A (en) * | 2018-02-02 | 2018-08-03 | 刘国华 | Man-machine interaction method, device, computer equipment and storage medium |
CN108769745A (en) * | 2018-06-29 | 2018-11-06 | 百度在线网络技术(北京)有限公司 | Video broadcasting method and device |
CN109275025A (en) * | 2018-09-25 | 2019-01-25 | 四川长虹电器股份有限公司 | The method for weakening background sound when realizing voice broadcast in smart television |
CN110017848A (en) * | 2019-04-11 | 2019-07-16 | 北京三快在线科技有限公司 | Phonetic navigation method, device, electronic equipment and storage medium |
CN110166550A (en) * | 2019-05-22 | 2019-08-23 | 湖南康通电子股份有限公司 | A kind of fixed time broadcast method and device of digit broadcasting system |
CN110290475A (en) * | 2019-05-30 | 2019-09-27 | 深圳米唐科技有限公司 | Vehicle-mounted man-machine interaction method, system and computer readable storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105867718A (en) * | 2015-12-10 | 2016-08-17 | 乐视网信息技术(北京)股份有限公司 | Multimedia interaction method and apparatus |
US10509626B2 (en) * | 2016-02-22 | 2019-12-17 | Sonos, Inc | Handling of loss of pairing between networked devices |
CN109151564B (en) * | 2018-09-03 | 2021-06-29 | 海信视像科技股份有限公司 | Equipment control method and device based on microphone |
CN113763956A (en) * | 2019-03-12 | 2021-12-07 | 百度在线网络技术(北京)有限公司 | Interaction method and device applied to vehicle |
CN111462744B (en) * | 2020-04-02 | 2024-01-30 | 深圳创维-Rgb电子有限公司 | Voice interaction method and device, electronic equipment and storage medium |
2020
- 2020-04-02: CN application CN202010256089.3A, patent CN111462744B (en), status Active
- 2020-11-06: WO application PCT/CN2020/127116, publication WO2021196617A1 (en), status Application Filing
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021196617A1 (en) * | 2020-04-02 | 2021-10-07 | 深圳创维-Rgb电子有限公司 | Voice interaction method and apparatus, electronic device and storage medium |
CN113207058A (en) * | 2021-05-06 | 2021-08-03 | 李建新 | Audio signal transmission processing method |
CN113362826A (en) * | 2021-06-21 | 2021-09-07 | 艺唯科技股份有限公司 | Device and method for automatically converting voice channel |
CN114443197A (en) * | 2022-01-24 | 2022-05-06 | 北京百度网讯科技有限公司 | Interface processing method and device, electronic equipment and storage medium |
CN114443197B (en) * | 2022-01-24 | 2024-04-09 | 北京百度网讯科技有限公司 | Interface processing method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111462744B (en) | 2024-01-30 |
WO2021196617A1 (en) | 2021-10-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111462744A (en) | Voice interaction method and device, electronic equipment and storage medium | |
EP2815290B1 (en) | Method and apparatus for smart voice recognition | |
WO2020010818A1 (en) | Video capturing method and apparatus, terminal, server and storage medium | |
WO2017193540A1 (en) | Method, device and system for playing overlay comment | |
CN114286173B (en) | Display equipment and sound and picture parameter adjusting method | |
CN103686200A (en) | Intelligent television video resource searching method and system | |
US10468004B2 (en) | Information processing method, terminal device and computer storage medium | |
US20210168460A1 (en) | Electronic device and subtitle expression method thereof | |
KR102358012B1 (en) | Speech control method and apparatus, electronic device, and readable storage medium | |
CN110769319A (en) | Standby wakeup interaction method and device | |
CN112463106A (en) | Voice interaction method, device and equipment based on intelligent screen and storage medium | |
CN110691281A (en) | Video playing processing method, terminal device, server and storage medium | |
CN109032554B (en) | Audio processing method and electronic equipment | |
US11429882B2 (en) | Method and apparatus for outputting information | |
CN113992972A (en) | Subtitle display method and device, electronic equipment and readable storage medium | |
CN109408164B (en) | Method, device and equipment for controlling screen display content and readable storage medium | |
CN111556198A (en) | Sound effect control method, terminal equipment and storage medium | |
CN108334339A (en) | A kind of bluetooth equipment driving method and device | |
CN109739462B (en) | Content input method and device | |
US20230046440A1 (en) | Video playback method and device | |
CN115278352A (en) | Video playing method, device, equipment and storage medium | |
CN112786031B (en) | Man-machine conversation method and system | |
CN113900609A (en) | Large-screen terminal interaction method, large-screen terminal and computer readable storage medium | |
CN113593582A (en) | Control method and device of intelligent device, storage medium and electronic device | |
CN112883144A (en) | Information interaction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||