CN115810356A - Voice control method, device, storage medium and electronic equipment - Google Patents
- Publication number
- CN115810356A (application CN202211443786.5A)
- Authority
- CN
- China
- Prior art keywords
- equipment
- voice
- current
- state
- interaction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Abstract
The embodiments of this application disclose a voice control method, apparatus, storage medium, and electronic device. First, when a user's voice is detected to satisfy a voice wake-up condition, if the current device is determined to be in a first preset device state, first device state information corresponding to the first preset device state is sent to candidate devices. Then, if second device state information sent by a candidate device is received, whether the current device performs the voice interaction is determined according to the first preset device state and a second preset device state corresponding to the second device state information. Because these device states reflect how the user is currently using each device, the method can determine which device the user actually intends to use for voice interaction, effectively improving the accuracy of voice control.
Description
Technical Field
The present application relates to the field of voice control technologies, and in particular, to a voice control method, apparatus, storage medium, and electronic device.
Background
With the development of voice technology and people's pursuit of intelligent living, reliance on electronic devices continues to grow. An electronic device with a voice control function commonly supports a voice interaction scenario: the user issues a voice control instruction, and the electronic device performs the corresponding operation according to that instruction. However, when a user owns multiple electronic devices with voice control functions, it is necessary to determine which electronic device should execute a given voice instruction.
Disclosure of Invention
The embodiments of this application provide a voice control method, apparatus, storage medium, and electronic device, which can accurately determine which electronic device should execute a voice instruction when a user owns multiple electronic devices with voice control functions.
In a first aspect, an embodiment of the present application provides a voice control method, where the method includes:
when a user's voice is detected to satisfy a voice wake-up condition, determining whether the current device is in a first preset device state;
if the current device is in the first preset device state, sending first device state information corresponding to the first preset device state to candidate devices, where the candidate devices and the current device are in the same multi-device scene;
and if second device state information sent by a candidate device is received, determining whether the current device performs the voice interaction according to the first preset device state and a second preset device state corresponding to the second device state information.
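As a rough illustration of the first-aspect flow above, the arbitration step can be reduced to a comparison of state priorities. The state names and numeric priorities below are assumptions made for this sketch; the application does not specify a concrete priority scheme.

```python
# Hypothetical priority table for preset device states; these values are
# illustrative assumptions, not defined by the application itself.
STATE_PRIORITY = {
    "screen_interaction": 3,  # user is actively touching the screen
    "video_playing": 2,       # device is playing media
    "screen_on": 1,           # screen lit but otherwise idle
}

def should_interact(own_state, candidate_states):
    """Return True if the current device should handle the voice interaction.

    The device with the highest-priority state wins; ties are broken in
    favour of the current device (an assumed policy for this sketch).
    """
    own = STATE_PRIORITY.get(own_state, 0)
    best_candidate = max((STATE_PRIORITY.get(s, 0) for s in candidate_states),
                         default=0)
    return own >= best_candidate

print(should_interact("screen_interaction", ["screen_on"]))  # True
print(should_interact("screen_on", ["video_playing"]))       # False
```

If no candidate reports a state within the preset time (handled later in S306), `candidate_states` is empty and the current device responds by default.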
In a second aspect, an embodiment of the present application provides a voice control method, where the method includes:
when a user's voice is detected to satisfy a voice wake-up condition, determining whether the current master device is in a first preset device state;
if the current master device is in the first preset device state and second device state information sent by a slave device is received, determining a target interaction device for the voice interaction from among the current master device and the slave devices according to first device state information corresponding to the first preset device state and the second device state information;
and controlling, based on an interaction instruction, the target interaction device to perform the voice interaction.
In a third aspect, an embodiment of the present application provides a voice control method, where the method includes:
when a user's voice is detected to satisfy the voice wake-up condition, determining whether the current slave device is in a second preset device state;
if the current slave device is in the second preset device state, sending second device state information of the current device to the master device;
and if an interaction instruction sent by the master device is received, controlling the current slave device to perform the voice interaction, where the interaction instruction is generated by the master device according to first device state information of the master device, the second device state information of the current slave device, and second device state information of other slave devices.
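The master/slave variant of the second and third aspects can be sketched as a simple selection over reported priorities: the master pools its own state with the states reported by the slaves and instructs the single best device. Device IDs, priority values, and the tie-breaking policy are illustrative assumptions, not taken from the application.

```python
def choose_interaction_target(master_id, master_priority, slave_priorities):
    """Return the device ID that should perform the voice interaction.

    slave_priorities maps each slave device ID to a priority derived from
    its reported second device state (an assumed numeric encoding).
    """
    candidates = dict(slave_priorities)
    candidates[master_id] = master_priority
    # Highest priority wins; on a tie the master is preferred (assumed policy).
    return max(candidates, key=lambda dev: (candidates[dev], dev == master_id))

print(choose_interaction_target("phone", 2, {"watch": 1, "tv": 3}))  # tv
print(choose_interaction_target("phone", 3, {"tv": 3}))              # phone
```

The returned ID is the device to which the master would send the interaction instruction described in the third aspect.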
In a fourth aspect, an embodiment of the present application provides a voice control apparatus, where the apparatus includes:
a voice wake-up module, configured to determine whether the current device is in a first preset device state when a user's voice is detected to satisfy a voice wake-up condition;
a device state sending module, configured to send first device state information corresponding to the first preset device state to candidate devices if the current device is in the first preset device state, where the candidate devices and the current device are in the same multi-device scene;
and a voice interaction determining module, configured to determine, if second device state information sent by a candidate device is received, whether the current device performs the voice interaction according to the first preset device state and a second preset device state corresponding to the second device state information.
In a fifth aspect, an embodiment of the present application provides a voice control apparatus, where the apparatus includes:
a master device voice wake-up module, configured to determine whether the current master device is in a first preset device state when a user's voice is detected to satisfy a voice wake-up condition;
a master device voice interaction determining module, configured to determine, if the current master device is in the first preset device state and second device state information sent by a slave device is received, a target interaction device for the voice interaction from among the current master device and the slave devices according to first device state information corresponding to the first preset device state and the second device state information;
and an instruction control module, configured to control, based on an interaction instruction, the target interaction device to perform the voice interaction.
In a sixth aspect, an embodiment of the present application provides a voice control apparatus, where the apparatus includes:
a slave device voice wake-up module, configured to determine whether the current slave device is in a second preset device state when a user's voice is detected to satisfy the voice wake-up condition;
a slave device state sending module, configured to send second device state information of the current device to the master device if the current slave device is in the second preset device state;
and a slave device voice interaction module, configured to control the current slave device to perform the voice interaction if an interaction instruction sent by the master device is received, where the interaction instruction is generated by the master device according to the first device state information of the master device, the second device state information of the current slave device, and second device state information of other slave devices.
In a seventh aspect, an embodiment of the present application provides a computer storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the above method.
In an eighth aspect, embodiments of the present application provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above method when executing the computer program.
The technical solutions provided by some embodiments of the present application offer at least the following benefits:
In the embodiments of the present application, when a user's voice is detected to satisfy a voice wake-up condition, it is first determined whether the current device is in a first preset device state. If the current device is in the first preset device state, first device state information corresponding to the first preset device state is sent to candidate devices, where the candidate devices and the current device are in the same multi-device scene. Finally, if second device state information sent by a candidate device is received, whether the current device performs the voice interaction is determined according to the first preset device state and a second preset device state corresponding to the second device state information. Because the device states obtained in this way reflect how the user is using each device, the method can determine which device the user actually intends to use for voice interaction, effectively improving the accuracy of voice control.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a device interaction method in the related art according to an embodiment of the present application;
Fig. 2 is a diagram illustrating an exemplary system architecture of a voice control method according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of a voice control method according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a device interaction method according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of a voice control method according to another embodiment of the present application;
Fig. 6 is a schematic flowchart of a voice control method according to another embodiment of the present application;
Fig. 7 is a block diagram of a voice control apparatus according to another embodiment of the present application;
Fig. 8 is a schematic flowchart of a voice control method according to another embodiment of the present application;
Fig. 9 is a block diagram of a voice control apparatus according to another embodiment of the present application;
Fig. 10 is a schematic flowchart of a voice control method according to another embodiment of the present application;
Fig. 11 is a block diagram of a voice control apparatus according to another embodiment of the present application;
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the features and advantages of the present application more apparent and understandable, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments given herein without creative effort shall fall within the protection scope of the present application.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following examples do not represent all implementations consistent with the present application; rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
It should be further noted that the information (including but not limited to user device information and user personal information), data (including but not limited to data for analysis, stored data, and displayed data), and signals referred to in the embodiments of the present application are all authorized by the user or fully authorized by the relevant parties, and the collection, use, and processing of such data comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the object characteristics, interaction behavior characteristics, and user information referred to in this application are all obtained with sufficient authorization.
Voice assistants are an important application of artificial intelligence on electronic devices. Through a voice assistant, an electronic device can hold intelligent dialogues and instant question-and-answer interactions with the user. The assistant can also recognize a voice command input by the user and trigger the electronic device to automatically execute the event corresponding to that command. Typically, the voice assistant stays in a sleep state, and the user must wake it up by voice before using it; only after being woken up can it receive and recognize voice commands. The voice data used for waking up is referred to as a wake-up word. For example, taking the wake-up word "Xiaobu": if the user wants to query the weather of place A using the voice assistant, the user may say the voice command "Xiaobu, weather of place A". After receiving this voice command, the electronic device is woken up based on the wake-up word "Xiaobu", recognizes the command through the voice assistant, is triggered to query the weather of place A, and reports the result to the user by voice or text.
In the related art, as technology develops, voice control is applied more and more widely. For example, many home devices now support voice control functions, typically implemented by installing a voice assistant on the device. As a result, a user may be located in an environment (for example, the user's home) that includes multiple devices supporting voice control, that is, a multi-device scene. In such a scene, if several devices share the same wake-up word, then after the user speaks that wake-up word, the voice assistants of all those devices are woken up, and all of them recognize and respond to the voice command the user speaks next.
Referring to fig. 1, fig. 1 is a diagram illustrating an apparatus interaction method in the related art according to an embodiment of the present disclosure.
As shown in fig. 1, a user's living room serves as a multi-device scene containing four devices: a sound box 101, a television 102, a mobile phone 103, and a wearable watch 104. All four devices are provided with a voice assistant, and all use the wake-up word "Xiaobu". After the user speaks a voice control command containing the wake-up word "Xiaobu", the voice assistants of the sound box 101, the television 102, the mobile phone 103, and the wearable watch 104 are all woken up, and all of them recognize and respond to the voice command.
In a multi-device scene, however, the user may need only one device to respond. For example, when the user is using a mobile phone and needs to interact with a voice assistant, the user usually wants the voice assistant in the mobile phone to be woken up and respond to the control command, because interacting with the phone is more convenient; if multiple devices respond simultaneously, the user experience is poor.
To solve this technical problem, in the embodiments of the present application, when a user's voice is detected to satisfy a voice wake-up condition, it is first determined whether the current device is in a first preset device state. If the current device is in the first preset device state, first device state information corresponding to the first preset device state is sent to candidate devices, where the candidate devices and the current device are in the same multi-device scene. Finally, if second device state information sent by a candidate device is received, whether the current device performs the voice interaction is determined according to the first preset device state and a second preset device state corresponding to the second device state information. Because the device states obtained when the wake-up condition is satisfied reflect how the user is using each device, the method can determine which device the user actually intends to use for voice interaction, effectively improving the accuracy of voice control.
Referring to fig. 2, fig. 2 is a diagram illustrating an exemplary system architecture of a voice control method according to an embodiment of the present application.
As shown in fig. 2, the system architecture may include an electronic device 201, a network 202, and a server 203. The network 202 provides a medium for communication links between the electronic device 201 and the server 203, and may include various types of wired or wireless communication links: for example, a wired communication link may use optical fiber, twisted pair, or coaxial cable, while a wireless communication link may use Bluetooth, Wireless Fidelity (Wi-Fi), or microwave.
The electronic device 201 may interact with the server 203 via the network 202 to receive messages from the server 203 or to send messages to the server 203, or the electronic device 201 may interact with the server 203 via the network 202 to receive messages or data sent by other users to the server 203. The electronic device 201 may be hardware or software. When the electronic device 201 is hardware, it may be a variety of electronic devices including, but not limited to, smart watches, smart phones, tablets, smart televisions, laptop portable computers, desktop computers, and the like. When the electronic device 201 is software, it may be installed in the electronic device listed above, and it may be implemented as multiple software or software modules (for example, for providing distributed services), or may be implemented as a single software or software module, and is not limited in this respect.
The server 203 may be a business server providing various services. The server 203 may be hardware or software. When the server 203 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 203 is software, it may be implemented as a plurality of software or software modules (for example, to provide distributed services), or may be implemented as a single software or software module, which is not limited herein.
In this embodiment of the application, there may be multiple electronic devices 201, and multiple electronic devices 201 may be in the same multi-device scene. Electronic devices 201 in the same multi-device scene may also be connected directly through the network 202, that is, they may transmit data to each other directly over the network 202. Consequently, the system architecture need not include the server 203; in other words, the server 203 is an optional component in this embodiment, and the method provided by this embodiment may also be applied to a system consisting only of electronic devices 201. This is not limited in the embodiments of the present application.
In this embodiment of the application, taking a certain electronic device 201 in the system architecture as the current device: when the current device detects that a user's voice satisfies the voice wake-up condition, it determines whether it is in a first preset device state. If so, it sends first device state information corresponding to the first preset device state to candidate devices, where the candidate devices and the current device are in the same multi-device scene. If second device state information sent by a candidate device is received, the current device determines whether it performs the voice interaction according to the first preset device state and a second preset device state corresponding to the second device state information.
It should be understood that the number of electronic devices, networks, and servers in fig. 2 is merely illustrative, and that any number of electronic devices, networks, and servers may be used, as desired for an implementation.
Referring to fig. 3, fig. 3 is a schematic flowchart of a voice control method according to an embodiment of the present application. The execution subject of this embodiment may be an electronic device that performs voice control, a processor in such an electronic device, or a voice control service in such an electronic device. For convenience of description, the specific implementation process of the voice control method is described below taking the execution subject as a processor in an electronic device.
As shown in fig. 3, the voice control method may include at least:
s302, when the condition that the voice of the user meets the voice awakening condition is monitored, whether the current equipment is in a first preset equipment state is judged.
It can be understood that, in the embodiments of the present application, the voice control method is mainly applied to a multi-device scene in which at least two electronic devices exist. The electronic devices in the same multi-device scene belong to the same device group and have the same device class (that is, no subordinate or primary-secondary relationship is distinguished between them). Data may be transmitted between the electronic devices directly, or forwarded by a server. Electronic devices can join the same multi-device scene by, for example, connecting to the same wireless access point (such as a Wi-Fi access point) or logging in to the same user account.
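The grouping rule described above (same access point, same user account) can be sketched as follows. The field names (`wifi_bssid`, `account`) are assumptions made for this illustration; the application only states the grouping criteria, not a data model.

```python
from collections import defaultdict

def group_devices(devices):
    """Group device IDs into multi-device scenes keyed by (access point, account)."""
    groups = defaultdict(list)
    for d in devices:
        # Devices sharing a Wi-Fi access point and a user account fall into
        # one multi-device scene (the criterion described in the text).
        groups[(d["wifi_bssid"], d["account"])].append(d["id"])
    return dict(groups)

scenes = group_devices([
    {"id": "phone", "wifi_bssid": "ap-1", "account": "alice"},
    {"id": "tv",    "wifi_bssid": "ap-1", "account": "alice"},
    {"id": "watch", "wifi_bssid": "ap-2", "account": "alice"},
])
print(scenes[("ap-1", "alice")])  # ['phone', 'tv']
```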
Furthermore, each electronic device in the same multi-device scene is provided with a program similar to a voice assistant. Based on the voice data collected by a microphone, this program can monitor in real time the user voice uttered by users around the electronic device and judge whether the user needs to perform voice interaction.
One way to judge whether the user needs voice interaction is to preset a voice wake-up condition in each electronic device; if the user's voice is detected to satisfy the condition, it is confirmed that the user intends to interact by voice. In the embodiments of the present application, the voice wake-up condition may be that the user's voice contains a preset wake-up word and/or that the voiceprint of the user's voice matches a preset voiceprint. After the voice assistant collects the user's voice through the microphone, it can perform wake-up word detection and/or voiceprint detection; when the voice contains the preset wake-up word and/or the voiceprint matches, the user's voice is considered to satisfy the voice wake-up condition. The voice wake-up condition may also be that the electronic device is in a preset state. For example, if the electronic device is a smart watch that stays in a screen-off state most of the time to reduce power consumption, then the smart watch being in a screen-on state may be taken as satisfying the voice wake-up condition.
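A minimal sketch of the wake-up condition check just described, assuming the transcript and the speaker's voiceprint are already available as strings. Real systems use acoustic keyword spotting and voiceprint models; here both checks are reduced to simple comparisons so only the control flow is visible, and all names are assumptions.

```python
def satisfies_wake_condition(transcript, speaker_print, wake_word="xiaobu",
                             enrolled_print=None):
    """True if the utterance contains the wake word and, when a voiceprint
    is enrolled, the speaker's voiceprint matches it."""
    # Wake-word detection, reduced to a substring check for this sketch.
    if wake_word not in transcript.lower():
        return False
    # Optional voiceprint detection (the "and/or" case in the text).
    if enrolled_print is not None and speaker_print != enrolled_print:
        return False
    return True

print(satisfies_wake_condition("Xiaobu, weather of place A", "print-1"))  # True
print(satisfies_wake_condition("weather of place A", "print-1"))          # False
print(satisfies_wake_condition("Xiaobu, play music", "print-2",
                               enrolled_print="print-1"))                 # False
```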
The device state of an electronic device changes as the user uses it or performs operations on it. A device state here may refer to a static or dynamic condition such as the placement of the device, whether the screen is lit, a standby state, or a video playing state. Because the device state is associated with the user's operations, when the user needs voice interaction in a multi-device scene, the user usually prefers the device currently being operated to respond.
Based on this idea, in the embodiments of the present application, each electronic device in the same multi-device scene first judges whether it is in a preset device state when it detects that the user's voice satisfies the voice wake-up condition.
Specifically, different types of electronic devices correspond to different preset states, so the preset device state for each electronic device can be set in advance according to its device type. If an electronic device is in its preset device state, it may currently be operated or used by the user, and the user is therefore more likely to want to interact with it. To distinguish the preset device states of different devices, in the embodiments of the present application the preset device state corresponding to the current device is called the first preset device state.
For the current device, when it detects that the user's voice satisfies the voice wake-up condition, it can judge whether it is in the first preset device state. The manner of making this judgment is not limited; for example, the current device may determine whether it is in the first preset device state from data collected by a preset sensor.
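One way the sensor-based judgment in this step could look is sketched below. The sensor fields, thresholds, and state names are purely illustrative assumptions; the application leaves the judgment mechanism open.

```python
from dataclasses import dataclass

@dataclass
class SensorSnapshot:
    """Assumed bundle of readings from a device's preset sensors."""
    screen_on: bool
    seconds_since_last_touch: float
    playing_media: bool

def current_preset_state(snap):
    """Map raw sensor readings to a preset device state, or None when idle."""
    if snap.screen_on and snap.seconds_since_last_touch < 5.0:
        return "screen_interaction"
    if snap.playing_media:
        return "video_playing"
    if snap.screen_on:
        return "screen_on"
    return None  # not in any preset device state

print(current_preset_state(SensorSnapshot(True, 1.2, False)))    # screen_interaction
print(current_preset_state(SensorSnapshot(False, 60.0, False)))  # None
```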
S304: if the current device is in the first preset device state, send first device state information corresponding to the first preset device state to candidate devices, where the candidate devices and the current device are in the same multi-device scene.
If it is determined that the current device is in the first preset device state, it may be determined that the current device is possibly a device that is being used or operated by a user, but since a plurality of electronic devices exist in a multi-device scene, other electronic devices that are also in the preset device state may also exist in the multi-device scene, so that it is convenient to determine an electronic device that the user wants to perform voice interaction from the plurality of electronic devices that are in the preset device state, after determining that the electronic device is in the preset device state, the electronic devices that are in the same multi-device scene may synchronize state information corresponding to the preset device state that the electronic device is in to the other electronic devices.
Further, since the preset device state itself is not entity data and cannot be transmitted directly, the current device may obtain first device state information representing the first preset device state and send the first device state information to at least one candidate device in the same multi-device scenario, where the candidate devices may be all electronic devices in the same multi-device scenario as the current device, or electronic devices designated by the user within that scenario.
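The synchronization step above only requires that the preset device state be encoded as transmittable data. A minimal sketch in Python, assuming a simple JSON payload whose field names (`device_id`, `preset_state`, `sent_at`) are illustrative rather than specified by the text:

```python
import json
import time

def make_state_message(device_id, preset_state):
    """Serialize a preset device state into a transmittable payload.
    Field names here are illustrative, not specified by the text."""
    return json.dumps({
        "device_id": device_id,
        "preset_state": preset_state,
        "sent_at": time.time(),
    }).encode("utf-8")

def parse_state_message(payload):
    """Recover the device state information on the receiving device."""
    return json.loads(payload.decode("utf-8"))
```

The payload could then be carried over whatever local communication channel links the devices in the multi-device scene.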
S306, if second equipment state information sent by the candidate equipment is received, whether the current equipment carries out voice interaction or not is determined according to the first preset equipment state and a second preset equipment state corresponding to the second equipment state information.
Since multiple electronic devices exist in the multi-device scene, other electronic devices in their preset device states may also exist. That is, while the current device sends the first device state information to the candidate devices, a candidate device may likewise send device state information for the preset device state it is in. To distinguish it from the first preset device state of the current device, in this embodiment of the present application the preset device state of a candidate device is denoted the second preset device state. After sending the first device state information corresponding to the first preset device state to the candidate devices, the current device may wait for a first preset time; if second device state information sent by at least one candidate device is received within the first preset time, this indicates that other electronic devices in preset device states also exist in the multi-device scene.
Further, after second device state information sent by at least one candidate device is received, the second preset device state corresponding to each candidate device may be determined from the second device state information, and whether the current device performs voice interaction is then determined by comparing the first preset device state with each second preset device state. The manner of comparison is not limited; it may follow a rule set by the user or configured at the factory, and a comparison result is obtained accordingly. If it is determined that the current device performs voice interaction, the voice control instruction corresponding to the user voice may be parsed and responded to.
Because each electronic device in the multi-device scene executes the voice control method, one electronic device for voice interaction can be determined from the plurality of electronic devices in the multi-device scene, and the experience of the user in voice interaction is improved.
Referring to fig. 4, fig. 4 is a diagram illustrating an apparatus interaction method according to an embodiment of the present disclosure.
As shown in fig. 4, the user's living room is taken as a multi-device scene containing four devices: a sound box 101, a television 102, a mobile phone 103, and a wearable watch 104. All four devices have a voice assistant installed, and the wake-up word of each is "Xiaobu". After the user speaks a user voice containing the wake-up word "Xiaobu", the voice assistants of the sound box 101, the television 102, the mobile phone 103, and the wearable watch 104 may all monitor that the user voice satisfies the voice wake-up condition. The four devices then each determine whether they are in their preset device states. If the mobile phone 103 determines that it is in the first preset device state, it sends first device state information corresponding to the first preset device state to the sound box 101, the television 102, and the wearable watch 104. If the mobile phone 103 receives second device state information sent by the sound box 101, the television 102, or the wearable watch 104, it may compare its first preset device state with the second preset device states corresponding to the received second device state information, and thereby determine whether it performs voice interaction. The sound box 101, the television 102, and the wearable watch 104 likewise determine whether they perform voice interaction, and finally the device that performs voice interaction is determined from among the sound box 101, the television 102, the mobile phone 103, and the wearable watch 104.
In fig. 4, the mobile phone 103 determines to perform voice interaction; the mobile phone 103 may then parse the voice control instruction corresponding to the user voice and respond to the voice control instruction.
Because interaction with the mobile phone is more convenient, the user often expects the voice assistant in the mobile phone to be woken up and to respond to the user's control instruction for voice interaction; if multiple devices respond simultaneously, the resulting user experience is poor.
In the embodiment of the application, when it is monitored that the user voice satisfies the voice wake-up condition, whether the current device is in a first preset device state is determined. If the current device is in the first preset device state, first device state information corresponding to the first preset device state is sent to candidate devices, where the candidate devices and the current device are in the same multi-device scene. Finally, if second device state information sent by a candidate device is received, whether the current device performs voice interaction is determined according to the first preset device state and the second preset device state corresponding to the second device state information. Thus, when the user voice satisfies the voice wake-up condition, the first preset device state of the current device and the second preset device states of the candidate devices can be obtained; since these device states reflect how the user is using each device, the device with which the user actually wants to perform voice interaction can be determined from the device states, effectively improving the accuracy of voice control.
Referring to fig. 5, fig. 5 is a schematic flow chart of a voice control method according to another embodiment of the present application. As shown in fig. 5, the voice control method may include at least:
S502, when it is monitored that the user voice satisfies the voice wake-up condition, determining whether the current device is in a first preset device state.
When a user uses or operates an electronic device, different types of electronic devices are used or operated in different ways, so the preset device states of different types of electronic devices differ. Therefore, in determining whether the current device is in the first preset device state, one feasible implementation is to first obtain the device type of the current device. The device type distinguishes different kinds of devices; for example, devices may be divided into handheld devices, wearable devices, sound box devices, television devices, and the like, and the division may be made according to user needs or directly set at the factory. The preset device states corresponding to electronic devices of different device types also differ: when the device type is a handheld device, the corresponding preset device state may be a handheld state; when the device type is a wearable device, the corresponding preset device state may be a limb-raised state; when the device type is a sound box device, the corresponding preset device state may be a music-playing state; and when the device type is a television device, the corresponding preset device state may be a video-playing state.
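The device-type-to-preset-state correspondence described above can be held in a simple lookup table. In this Python sketch the type and state identifiers are illustrative, not terms taken verbatim from the text:

```python
# Hypothetical lookup from device type to the preset device state named
# in the text; all identifiers are illustrative.
PRESET_STATE_BY_TYPE = {
    "handheld": "handheld_state",        # e.g. a smartphone being held
    "wearable": "limb_raised_state",     # e.g. a watch on a raised wrist
    "speaker": "music_playing_state",
    "television": "video_playing_state",
}

def preset_state_for(device_type):
    """Return the preset device state to check for this device type,
    or None for an unknown type."""
    return PRESET_STATE_BY_TYPE.get(device_type)
```

A device would consult this table once at startup (or after a factory/user configuration change) to know which state check applies to it.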
Further, certain state parameters of a device differ across device states, so the specified state parameter corresponding to the current device may be obtained according to the device type of the current device, where the specified state parameter may be collected by a specified sensor or similar component; finally, whether the current device is in the first preset device state is determined according to the specified state parameter.
For example, if the device type of the current device is a handheld device, such as a smartphone, then when the user is using or operating it, the handheld device is generally not occluded by a pocket or other object, is not lying completely flat, and is not perfectly still. Therefore, if the device type of the current device is a handheld device, at least one of an occlusion state parameter, a placement-angle state parameter, and a shaking state parameter corresponding to the current device may be obtained, so as to determine whether the current device is in the handheld state according to at least one of these parameters.
Specifically, if whether the current device is in the handheld state is determined according to the occlusion parameter, the placement-angle parameter, and the shaking parameter, it may first be determined whether the current device is occluded based on the occlusion state parameter, which may include an illuminance value collected by a light sensor and a proximity distance value collected by a proximity sensor: if the illuminance value is smaller than a preset illuminance value and the proximity distance value is smaller than a preset proximity distance value, the current device is determined to be occluded, that is, not in the handheld state (the first preset device state). Otherwise, whether the current device is lying flat may be determined from the placement-angle state parameter: if the angle calculated from the geomagnetic value and the acceleration value is smaller than a preset flat-lying angle, the current device is determined to be lying flat, that is, not in the handheld state (the first preset device state). Otherwise, whether the current device is shaking may be determined based on the shaking state parameter, which may include an angular velocity value collected by an angular velocity sensor, from which an average angular velocity value within a sliding time window may be calculated: if the current real-time angular velocity value is greater than a preset maximum angular velocity value, or the current real-time angular velocity value is between the preset minimum and maximum angular velocity values while the average angular velocity value is greater than a preset average angular velocity value, the current device is determined to be shaking and thus in the handheld state (the first preset device state); otherwise, the current device is determined not to be shaking, and thus not in the handheld state (the first preset device state).
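The three-step handheld check (occlusion, flat placement, shaking) can be sketched as follows. All threshold values are hypothetical placeholders, since the text does not specify them:

```python
def is_handheld(lux, proximity, tilt_deg, omega, omega_avg,
                lux_min=5.0, prox_min=2.0, flat_deg=10.0,
                omega_hi=3.0, omega_lo=0.5, omega_avg_min=1.0):
    """Sketch of the three-step handheld check described above.
    lux/proximity: light and proximity sensor readings (occlusion check);
    tilt_deg: angle from geomagnetic + acceleration values (flat check);
    omega/omega_avg: real-time and windowed-average angular velocity
    (shaking check). All thresholds are illustrative, not from the text."""
    # Step 1: occluded (e.g. in a pocket) -> not handheld.
    if lux < lux_min and proximity < prox_min:
        return False
    # Step 2: lying flat (e.g. on a table) -> not handheld.
    if tilt_deg < flat_deg:
        return False
    # Step 3: shaking check via gyroscope readings.
    if omega > omega_hi:
        return True
    if omega_lo < omega < omega_hi and omega_avg > omega_avg_min:
        return True
    return False
```

For instance, a phone in a dark pocket fails step 1 immediately, while a phone held at an angle with moderate but sustained rotation passes via the second shaking branch.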
S504, if the current device is in a first preset device state, first device state information corresponding to the first preset device state is sent to the candidate device, and the candidate device and the current device are in the same multi-device scene.
Step S504 can refer to the description of step S304, and is not described herein.
S506, if second equipment state information sent by the candidate equipment is received, comparing the priority of the first preset equipment state with the priority of the second preset equipment state corresponding to the second equipment state information, and determining whether the current equipment carries out voice interaction or not according to the priority comparison result.
In this embodiment of the present application, after second device state information sent by a candidate device is received, the first preset device state may be compared with the second preset device state corresponding to the second device state information. One feasible implementation of the comparison is to determine the priorities of the first preset device state and of each second preset device state respectively, and then determine whether the current device performs voice interaction according to the priority comparison result.
Specifically, a priority order of the preset device states corresponding to the electronic devices in the same multi-device scenario may be set in advance, according to a user instruction or at the factory. A first state priority corresponding to the first preset device state, and a second state priority corresponding to the second preset device state of the second device state information, are then determined from this preset priority order. If the first state priority is greater than the second state priority, which indicates that the current device holds the higher-priority voice interaction control right, it is determined that the current device performs voice interaction; if the first state priority is smaller than the second state priority, which indicates that another candidate device holds the higher-priority voice interaction control right, it is determined that the current device does not perform voice interaction.
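The priority comparison above can be sketched as a lookup into an ordered list. The particular order used here is an illustrative assumption, since the text leaves it to user configuration or factory settings:

```python
# Illustrative priority order (index 0 = highest); in practice this order
# would be user-configured or set at the factory, per the text above.
STATE_PRIORITY = ["handheld_state", "limb_raised_state",
                  "video_playing_state", "music_playing_state"]

def outranks_all(first_state, second_states):
    """True if the current device's preset state has strictly higher
    priority than every received second preset device state."""
    rank = {s: i for i, s in enumerate(STATE_PRIORITY)}
    worst = len(STATE_PRIORITY)  # unknown states rank lowest
    first = rank.get(first_state, worst)
    return all(first < rank.get(s, worst) for s in second_states)
```

With strict comparison, a tie (two devices in the same preset state) yields False for both; the text does not specify tie handling at this step, so that behavior is an assumption of this sketch.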
S508, if the second equipment state information sent by the candidate equipment is not received, the current equipment is determined to carry out voice interaction.
If the current device does not receive second device state information sent by any candidate device within the first preset time, this indicates that no candidate device in its preset device state exists in the multi-device scene; it may then be directly determined that the current device has the voice interaction right, and the current device is controlled to perform voice interaction.
And S510, if the current equipment is not in the first preset equipment state and receives second equipment state information sent by the candidate equipment, determining that the current equipment does not carry out voice interaction.
If the current device is not in the first preset device state but receives second device state information sent by a candidate device, this indicates that a candidate device in its preset device state exists in the multi-device scene and has the higher voice interaction priority, so the current device may be controlled not to perform voice interaction.
If the current device is not in the first preset device state and does not receive second device state information sent by any candidate device, this indicates that no electronic device in the multi-device scene has a high interaction priority. Each electronic device may then refrain from voice interaction and continue to monitor whether the user voice satisfies the wake-up condition, or the electronic device that performs voice interaction may be further screened from all electronic devices in the multi-device scene through other determination conditions.
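The branch structure of steps S504 through S510 can be summarized in a single function. The return labels are illustrative, not terms from the text, and the final branch corresponds to continuing with further determination conditions such as the general voice feature comparison of step S608:

```python
def decide_interaction(in_preset_state, candidate_states, own_outranks_all):
    """Sketch of the branch structure of steps S504-S510.
    in_preset_state: whether the current device is in its preset state;
    candidate_states: second preset device states received in time;
    own_outranks_all: result of the priority comparison against them."""
    if in_preset_state:
        if not candidate_states:                  # S508: nothing received
            return "interact"
        # S506: priority comparison against received candidate states.
        return "interact" if own_outranks_all else "no_interaction"
    if candidate_states:                          # S510: candidates outrank us
        return "no_interaction"
    # Neither side is in a preset state: fall through to other conditions.
    return "fallback_to_other_conditions"
```

Each device in the scene runs this same decision, so at most the highest-priority device answers the wake-up word.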
In the embodiment of the application, whether each electronic device in the multi-device scene is in its preset device state is determined according to its device type, and the priorities of the preset device states of the electronic devices in the multi-device scene are then compared to determine whether the current device performs voice interaction, which improves the accuracy of identifying the device for voice interaction.
Referring to fig. 6, fig. 6 is a flowchart illustrating a voice control method according to another embodiment of the present application. As shown in fig. 6, the voice control method may include at least:
S602, when it is monitored that the user voice satisfies the voice wake-up condition, determining whether the current device is in a first preset device state.
And S604, if the current equipment is in a first preset equipment state, sending first equipment state information corresponding to the first preset equipment state to candidate equipment, wherein the candidate equipment and the current equipment are in the same multi-equipment scene.
S606, if second equipment state information sent by the candidate equipment is received, whether the current equipment carries out voice interaction or not is determined according to the first preset equipment state and a second preset equipment state corresponding to the second equipment state information.
With regard to steps S602 to S606, reference may be made to the description in the above embodiments, which are not repeated herein.
S608, if the current device is not in the first preset device state and does not receive the second device state information sent by the candidate device, obtaining a first general voice characteristic value corresponding to the current device according to the user voice, and sending the first general voice characteristic value to the candidate device.
If the current device is not in the first preset device state and does not receive the second device state information sent by the candidate device, the interaction priority of all electronic devices in the multi-device scene is low, and if one electronic device still needs to be selected for voice interaction at the moment, the electronic device for voice interaction can be selected through the general voice characteristic value corresponding to each electronic device.
The general voice feature value is used to represent the wake-up priority between the sound source and a device. Since many factors affect this wake-up priority, it may be represented by general voice feature parameters; that is, the general voice feature value may be computed from multiple general voice feature parameters. It is easy to appreciate that, since a user interacting with a device by voice tends to speak close to and facing the device, the general voice feature parameters may include, but are not limited to: a distance parameter between the sound source and the device, and an orientation parameter of the device relative to the sound source.
Specifically, the distance parameter between the sound source and the device may be calculated from the audio energy of the wake-up word in the user voice: the larger the energy, the closer the distance, the smaller the distance parameter between the sound source and the device, and the higher the wake-up priority. The influence of environmental noise on the wake-word audio energy should be reduced as much as possible; a Voice Activity Detection (VAD) method may be used to segment the wake-up word and the environmental noise within the user voice containing the wake-up word, yielding the energy and duration of the wake-up word and the energy and duration of the environmental noise, from which the noise-free wake-word energy can be calculated as follows:
Here, the energy and duration of the wake-up word are denoted es and ts, and the energy and duration of the environmental noise are denoted en and tn. Then es/ts can be regarded as the power of the wake-up word and en/tn as the power of the ambient noise, so the difference es/ts − en/tn can be regarded as the power of the wake-up word without the influence of noise; the wake-word energy without the influence of noise can then be represented by this noise-free power.
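Under the notation above, the noise-compensated wake-word power can be computed directly. This sketch clamps the difference at zero, an assumption of this sketch for the case where measured noise power exceeds wake-word power:

```python
def wake_word_clean_power(es, ts, en, tn):
    """Noise-compensated wake-word power, following the derivation above:
    es/ts is the wake-word power, en/tn the noise power, and their
    difference is taken as the power of the wake-up word free of noise,
    which the text uses to represent its noise-free energy."""
    return max(es / ts - en / tn, 0.0)  # clamp: power cannot be negative
```

A louder (closer) speaker yields a larger clean power, hence a smaller distance parameter and a higher wake-up priority.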
Further, the orientation parameter of the current device relative to the sound source may be obtained by training a decision model for sound orientation on pre-recorded audio data and then inputting the user voice into the decision model, the sound-orientation result being the orientation parameter of the current device relative to the sound source. The pre-recorded audio data may include: 1. spectral feature data, selected because as the orientation angle between the sound source and the current device increases, more of the sound reaches the current device by reflection, so the high-frequency part of the user voice received by the current device is attenuated more than the low-frequency part; 2. reverberation feature data, selected because as the orientation angle between the sound source and the current device increases, the reverberation energy increases; the current device may calculate the direct-to-reverberant ratio and the autocorrelation features of the user voice, and the larger the reverberation, the more peaks appear in the autocorrelation result; 3. multi-microphone feature data, selected because if the current device has multiple microphones participating in voice pickup, the sound-direction features of the multiple microphones may be calculated to assist in deciding the orientation parameter of the current device relative to the sound source.
Further, since there may be multiple first general voice feature parameters, and different first general voice feature parameters influence the wake-up priority between the sound source and the device to different degrees, a preset first general voice feature weight may be set for each first general voice feature parameter, where the greater a parameter's influence on the wake-up priority, the larger its corresponding weight. For example, if the first general voice feature parameters include the distance parameter between the sound source and the current device and the orientation parameter of the current device relative to the sound source, the first general voice feature weight corresponding to the distance parameter may be set to 0.6, and the first general voice feature weight corresponding to the orientation parameter may be set to 0.4.
Then, after the first general voice feature parameter corresponding to the current device is obtained, a first general voice feature weight corresponding to each first general voice feature parameter may also be obtained, and then, based on each first general voice feature parameter and each first general voice feature weight, a first general voice feature value corresponding to the current device is calculated, that is, each first general voice feature parameter is multiplied by the corresponding first general voice feature weight, and each multiplication result is added as the first general voice feature value corresponding to the current device.
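The weighted combination described above is a plain weighted sum. In this sketch the parameter values are assumed to be normalized so that larger means higher wake-up priority (e.g. a closeness score derived from the distance parameter, rather than the raw distance), which is an assumption not stated in the text:

```python
def generic_feature_value(params, weights):
    """Weighted sum of general voice feature parameters, as described
    above: each parameter is multiplied by its weight and the products
    are summed. Values are assumed normalized so larger = higher
    wake-up priority."""
    return sum(value * weights.get(name, 0.0) for name, value in params.items())
```

With the example weights above (0.6 for distance, 0.4 for orientation), a device that scores 0.5 on both parameters would obtain a feature value of 0.5.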
Further, after the first general-purpose speech feature value corresponding to the current device is obtained according to the user speech, the first general-purpose speech feature value needs to be synchronized to other electronic devices in the multi-device scenario, and similarly, other electronic devices in the multi-device scenario also obtain a second general-purpose speech feature value corresponding to the user speech by using the same method as the current device, and synchronize the second general-purpose speech feature value to the current device.
S610, if the second universal voice characteristic value sent by the candidate equipment is received, whether the current equipment carries out voice interaction or not is determined according to the first universal voice characteristic value and the second universal voice characteristic value.
In this embodiment of the application, after the current device sends the first general voice feature value to all candidate devices in the multi-device scene, the current device may wait for a second preset time. If a second general voice feature value sent by at least one candidate device is received within the second preset time, the current device may determine whether to perform voice interaction according to the first general voice feature value and the second general voice feature values; that is, the first general voice feature value is compared with each second general voice feature value, and whether the current device performs voice interaction is determined according to the comparison result.
Specifically, the first general voice feature value may be compared with each second general voice feature value. If the first general voice feature value is greater than every second general voice feature value, which indicates that the current device has the highest interaction priority with the user, it is determined that the current device performs voice interaction. If the first general voice feature value is smaller than some second general voice feature value, the current device does not have the highest interaction priority, that is, another candidate device has a higher interaction priority with the user, and it is determined that the current device does not perform voice interaction. If the first general voice feature value is equal to the largest second general voice feature value, meaning that neither the current device nor the tied candidate device has a strictly higher interaction priority, it may be determined whether the current device is a preset priority interaction device, where the priority interaction device is customized by the user or set at the factory, so that when the general voice feature values of multiple devices are the same, one of them is still selected for interaction, avoiding a failed voice interaction for the user; if the current device is determined to be the preset priority interaction device, it is determined that the current device performs voice interaction.
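The three-way comparison with the preferred-device tie-break can be sketched as:

```python
def decide_by_feature_value(first, best_second, is_preferred_device):
    """Three-way comparison from the text: greater wins, smaller loses,
    and an exact tie falls back to whether this device is the preset
    priority interaction device."""
    if first > best_second:
        return True
    if first < best_second:
        return False
    return is_preferred_device
```

Here `best_second` is the largest received second general voice feature value, so a single comparison against it covers all candidate devices.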
And S612, if the second general voice feature value sent by the candidate device is not received, determining that the current device performs voice interaction.
In this embodiment of the application, after the current device sends the first general voice feature value to all candidate devices in the multi-device scene, the current device may wait for a second preset time. If no second general voice feature value sent by a candidate device is received within the second preset time, this indicates that no candidate device other than the current device in the multi-device scene has obtained a general voice feature value, and it may be directly determined that the current device performs voice interaction.
In the embodiment of the application, when it is determined that all electronic devices in a multi-device scene are not in the preset state, the general voice features acquired by all electronic devices for the voice of the user can be acquired respectively, and then the electronic devices for voice interaction are selected according to the general voice features, so that the accuracy of determining the devices for voice interaction is effectively improved.
Referring to fig. 7, fig. 7 is a block diagram of a voice control apparatus according to another embodiment of the present application. As shown in fig. 7, the voice control apparatus 700 includes:
the voice wake-up module 710 is configured to, when it is monitored that the voice of the user meets a voice wake-up condition, determine whether the current device is in a first preset device state;
the device state sending module 720 is configured to send first device state information corresponding to a first preset device state to a candidate device if the current device is in the first preset device state, where the candidate device and the current device are in the same multi-device scene;
the first voice interaction determining module 730 is configured to, if second device state information sent by the candidate device is received, determine whether the current device performs voice interaction according to the first preset device state and a second preset device state corresponding to the second device state information.
Optionally, the first voice interaction determining module 730 is further configured to compare the priority of the first preset device status with the priority of the second preset device status corresponding to the second device status information, and determine whether the current device performs voice interaction according to a result of the comparison of the priorities.
Optionally, the first voice interaction determining module 730 is further configured to determine, according to a preset device state priority order, a first state priority corresponding to a first preset device state, and determine a second state priority corresponding to a second preset device state corresponding to second device state information; if the first state priority is greater than the second state priority, determining that the current equipment carries out voice interaction; and if the priority of the first state is smaller than that of the second state, determining that the current equipment does not carry out voice interaction.
Optionally, the voice wakeup module 710 is further configured to obtain a device type of the current device, and obtain an assigned state parameter corresponding to the current device according to the device type; and judging whether the current equipment is in a first preset equipment state or not according to the specified state parameters.
Optionally, the voice wakeup module 710 is further configured to, if the device type is a handheld device, obtain an occlusion state parameter, a placement-angle state parameter, and a shaking state parameter corresponding to the current device; and determine whether the current device is in the handheld state according to the occlusion parameter, the placement-angle parameter, and the shaking parameter.
Optionally, the voice control apparatus 700 further comprises: and the second voice interaction determining module is used for determining that the current equipment carries out voice interaction if the second equipment state information sent by the candidate equipment is not received.
Optionally, the voice control apparatus 700 further includes: and the third voice interaction determining module is used for determining that the current equipment does not carry out voice interaction if the current equipment is not in the first preset equipment state and receives second equipment state information sent by the candidate equipment.
Optionally, the voice control apparatus 700 further includes: the fourth voice interaction determining module is used for acquiring a first general voice characteristic value corresponding to the current equipment according to the voice of the user and sending the first general voice characteristic value to the candidate equipment if the current equipment is not in the first preset equipment state and does not receive second equipment state information sent by the candidate equipment; and if the second universal voice characteristic value sent by the candidate equipment is received, determining whether the current equipment carries out voice interaction or not according to the first universal voice characteristic value and the second universal voice characteristic value.
Optionally, the fourth voice interaction determining module is further configured to obtain, according to the voice of the user, first general voice feature parameters corresponding to the current device and first general voice feature weights corresponding to the first general voice feature parameters; and calculating a first general voice characteristic value corresponding to the current equipment based on each first general voice characteristic parameter and each first general voice characteristic weight.
Optionally, the first generic speech feature parameters include, but are not limited to: a distance parameter between the sound source and the current device and an orientation parameter of the current device relative to the sound source.
Optionally, the fourth voice interaction determining module is further configured to determine that the current device performs voice interaction if the first general voice feature value is greater than the second general voice feature value; if the first general voice characteristic value is smaller than the second general voice characteristic value, determining that the current equipment does not carry out voice interaction; and if the first general voice characteristic value is equal to the second general voice characteristic value and the current equipment is determined to be preset priority interaction equipment, determining that the current equipment carries out voice interaction.
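The weighted feature-value computation and the comparison rule of the fourth module can be sketched together. The parameter names (`distance`, `orientation`) and weights are hypothetical examples of the general voice feature parameters described above.

```python
def general_feature_value(params: dict, weights: dict) -> float:
    """Weighted sum of the general voice feature parameters, e.g. a distance
    score and an orientation score, each with its own feature weight."""
    return sum(weights[name] * value for name, value in params.items())

def pick_by_feature_value(first: float, second: float,
                          current_is_priority_device: bool) -> bool:
    """True if the current device interacts; on a tie, the preset priority
    interaction device wins, as described for the fourth module."""
    if first > second:
        return True
    if first < second:
        return False
    return current_is_priority_device
```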
Optionally, the voice control apparatus 700 further includes: a fifth voice interaction determining module, configured to determine that the current device performs voice interaction if no second general voice feature value sent by the candidate device is received.
In an embodiment of the present application, a voice control apparatus includes: a voice wakeup module, configured to judge whether the current device is in a first preset device state when it is monitored that the user voice meets a voice wake-up condition; a device state sending module, configured to send first device state information corresponding to the first preset device state to a candidate device if the current device is in the first preset device state, where the candidate device and the current device are in the same multi-device scene; and a voice interaction determining module, configured to determine, if second device state information sent by the candidate device is received, whether the current device performs voice interaction according to the first preset device state and a second preset device state corresponding to the second device state information. When it is monitored that the user voice meets the voice wake-up condition, the first preset device state of the current device and the second preset device state of the candidate device can be obtained. Because these device states reflect how the user is currently using each device, the device on which the user actually intends to perform voice interaction can be determined from the device states, which effectively improves the accuracy of voice control.
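The decision flow of the three modules above can be summarized in a single sketch; the function name, the string labels, and the `compare_priorities` callback are hypothetical stand-ins for the modules described in the embodiments.

```python
def decide_interaction(in_preset_state: bool,
                       candidate_state_received: bool,
                       compare_priorities) -> str:
    """Device-side arbitration after the wake-up condition is met."""
    if in_preset_state and not candidate_state_received:
        return "interact"  # no competing device reported a preset state
    if in_preset_state and candidate_state_received:
        # Both sides are in preset states: compare the state priorities.
        return "interact" if compare_priorities() else "stay_silent"
    if candidate_state_received:
        return "stay_silent"  # the candidate has priority, current does not
    # Nobody is in a preset state: fall back to general voice feature values.
    return "fall_back_to_feature_values"
```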
Referring to fig. 8, fig. 8 is a schematic flowchart illustrating a voice control method according to another embodiment of the present application.
As shown in fig. 8, the voice control method includes:
S802, when it is monitored that the user voice meets the voice wake-up condition, judge whether the current master device is in a first preset device state.
In the embodiment of the application, at least two electronic devices exist in a multi-device scene, and the electronic devices in the same multi-device scene belong to the same device group. The electronic devices in a device group have a subordinate, or master-slave, relationship; that is, at least one master device and at least one slave device exist in the multi-device scene. For example, if a multi-device environment includes a sound box, a television, a mobile phone, and a wearable watch, the mobile phone, which has better data processing performance, may serve as the master device, while the sound box, the television, and the wearable watch, which have poorer data processing performance, serve as the slave devices. For convenience of description, the voice control method is first described as applied to the master device.
When the master device monitors that the user voice meets the voice wake-up condition, it may first judge whether it is in the first preset device state.
S804, if the current master device is in the first preset device state and receives second device state information sent by a slave device, determine, according to the first device state information corresponding to the first preset device state and the second device state information, a target interaction device for voice interaction from among the current master device and the slave devices.
If the current device is in the first preset device state, it may be a device that the user is currently using or operating. However, because a plurality of electronic devices exist in a multi-device scene, other electronic devices in the scene may also be in preset device states. To facilitate determining, from among the electronic devices in preset device states, the one with which the user intends to perform voice interaction, each slave device in the same multi-device scene synchronizes the state information corresponding to its preset device state to the master device after determining that it is in that state.
Therefore, after the master device receives the second device state information sent by the slave device, the target interaction device for performing voice interaction can be determined from the current master device and the current slave device according to the first device state information and the second device state information corresponding to the first preset device state. For a method for determining a target interaction device according to device state information, reference may be made to the description in the foregoing embodiments, which is not described herein again.
And S806, controlling the target interaction equipment to perform voice interaction based on the interaction instruction.
After the target interaction device is determined, the master device may generate an interaction instruction. If the target interaction device is the current master device, the current master device is directly controlled to perform voice interaction based on the interaction instruction; if the target interaction device is a slave device, the interaction instruction is sent to the target interaction device, where the interaction instruction instructs the target interaction device to perform voice interaction, that is, the target interaction device performs voice interaction after receiving the interaction instruction.
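The dispatch step S806 can be sketched as a small routing function. `send_to` is a hypothetical transport helper, and the message payload is an assumed format; the patent only specifies that the instruction tells the target device to start voice interaction.

```python
def dispatch_interaction(target_device_id: str, master_id: str, send_to) -> str:
    """Route the interaction instruction: handle locally if the master is the
    target, otherwise forward the instruction to the target slave device."""
    if target_device_id == master_id:
        return "master_interacts"  # master performs the voice interaction itself
    send_to(target_device_id, {"cmd": "start_voice_interaction"})
    return "instruction_sent"
```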
Further, if the current master device is in the first preset device state and does not receive the second device state information sent by the slave device, the current master device is controlled to perform voice interaction.
Optionally, if the current master device is not in the first preset device state, it represents that the current master device does not have the voice interaction priority in the device state, and at this time, if second device state information sent by the slave device is received, a target interaction device performing voice interaction may be determined from the slave device according to the second device state information, and the target interaction device is controlled to perform voice interaction based on the interaction instruction.
Optionally, if the current master device is not in the first preset device state and does not receive second device state information sent by any slave device, this indicates that the interaction priority of all electronic devices in the multi-device scene is low. If one electronic device still needs to be selected for voice interaction, it may be selected through the general voice feature value corresponding to each electronic device. In this case, the first general voice feature value corresponding to the current master device may be obtained according to the user voice; if a second general voice feature value sent by a slave device is received, the target interaction device for voice interaction is determined from among the current master device and the slave devices according to the first general voice feature value and the second general voice feature value, and the target interaction device is controlled to perform voice interaction based on the interaction instruction.
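The fallback arbitration on the master can be sketched as picking the device with the largest general voice feature value. The tie-break in favor of the master is an assumption for illustration; the embodiments elsewhere resolve ties via a preset priority interaction device.

```python
def pick_target(master_value: float, slave_values: dict) -> str:
    """Return 'master' or the id of the winning slave, choosing the device
    with the largest general voice feature value (ties go to the master)."""
    best_slave = max(slave_values, key=slave_values.get, default=None)
    if best_slave is None or master_value >= slave_values[best_slave]:
        return "master"
    return best_slave
```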
Optionally, if the current master device does not monitor that the user voice meets the voice wake-up condition and receives second device state information sent by the slave device, determining a target interaction device for performing voice interaction from the slave device according to the second device state information, and controlling the target interaction device to perform voice interaction based on the interaction instruction.
Optionally, if the current master device does not monitor that the voice of the user meets the voice wake-up condition and receives a second general voice feature value sent by the slave device, determining a target interaction device performing voice interaction from the slave device according to the second general voice feature value, and controlling the target interaction device to perform voice interaction based on the interaction instruction.
In the embodiment of the application, the work of determining the target interaction device according to the state information or determining the target interaction device according to the voice characteristic value is set in the master device, on one hand, the speed of determining the target interaction device can be increased due to the good performance of the master device, and on the other hand, the data processing amount of the slave device can be reduced due to the poor performance of the slave device, and the power consumption of the slave device is reduced.
Referring to fig. 9, fig. 9 is a block diagram of a voice control apparatus according to another embodiment of the present application. As shown in fig. 9, the voice control apparatus 900 includes:
the main device voice wake-up module 910 is configured to, when it is monitored that the voice of the user meets a voice wake-up condition, determine whether the current main device is in a first preset device state;
a master device voice interaction determining module 920, configured to determine, according to first device state information and second device state information corresponding to a first preset device state, a target interaction device for performing voice interaction from a current master device and a current slave device if the current master device is in the first preset device state and receives second device state information sent by the slave device;
and the instruction control module 930 is configured to control the target interaction device to perform voice interaction based on the interaction instruction.
Optionally, the master device voice interaction determining module 920 is further configured to control the current master device to perform voice interaction based on the interaction instruction if the target interaction device is the current master device; and if the target interaction equipment is the slave equipment, sending an interaction instruction to the target interaction equipment, wherein the interaction instruction is used for indicating the target interaction equipment to carry out voice interaction.
Optionally, the master device voice interaction determining module 920 is further configured to control the current master device to perform voice interaction if the current master device is in the first preset device state and does not receive the second device state information sent by the slave device.
Optionally, the master device voice interaction determining module 920 is further configured to determine, if the current master device is not in the first preset device state and receives second device state information sent by the slave device, a target interaction device for performing voice interaction from the slave device according to the second device state information; and controlling the target interaction equipment to perform voice interaction based on the interaction instruction.
Optionally, the master device voice interaction determining module 920 is further configured to, if the current master device is not in the first preset device state and does not receive the second device state information sent by the slave device, obtain a first general voice feature value corresponding to the current master device according to the user voice; if a second general voice characteristic value sent by the slave equipment is received, determining target interaction equipment for voice interaction from the current master equipment and the current slave equipment according to the first general voice characteristic value and the second general voice characteristic value; and controlling the target interaction equipment to perform voice interaction based on the interaction instruction.
Optionally, the master device voice interaction determining module 920 is further configured to determine, if it is not monitored that the user voice meets the voice wake-up condition and second device state information sent by a slave device is received, a target interaction device for voice interaction from the slave devices according to the second device state information; and control the target interaction device to perform voice interaction based on the interaction instruction.
Optionally, the master device voice interaction determining module 920 is further configured to determine, if it is not monitored that the user voice meets the voice wake-up condition and a second general voice feature value sent by a slave device is received, a target interaction device for voice interaction from the slave devices according to the second general voice feature value; and control the target interaction device to perform voice interaction based on the interaction instruction.
Referring to fig. 10, fig. 10 is a schematic flowchart illustrating a voice control method according to another embodiment of the present application.
As shown in fig. 10, the voice control method includes:
S1002, when it is monitored that the user voice meets the voice wake-up condition, judge whether the current slave device is in a second preset device state.
In the embodiment of the application, at least two electronic devices exist in a multi-device scene, and the electronic devices in the same multi-device scene belong to the same device group. The electronic devices in a device group have a subordinate, or master-slave, relationship; that is, at least one master device and at least one slave device exist in the multi-device scene. For example, if a multi-device environment includes a sound box, a television, a mobile phone, and a wearable watch, the mobile phone, which has better data processing performance, may serve as the master device, while the sound box, the television, and the wearable watch, which have poorer data processing performance, serve as the slave devices. For convenience of description, the voice control method is here described as applied to the slave device.
And S1004, if the current slave equipment is in a second preset equipment state, sending second equipment state information of the current equipment to the master equipment.
And S1006, if an interactive instruction sent by the master device is received, controlling the current slave device to perform voice interaction, wherein the interactive instruction is generated by the master device according to the first device state information of the master device, the second device state information of the current slave device and the second device state information of other slave devices.
Optionally, if the current slave device is not in the second preset device state and does not receive an interaction instruction sent by the master device, a second general voice feature value corresponding to the current slave device is obtained according to the user voice and sent to the master device; if an interaction instruction sent by the master device is then received, the current slave device is controlled to perform voice interaction, where the interaction instruction is generated by the master device according to the first general voice feature value corresponding to the master device, the second general voice feature value of the current slave device, and the second general voice feature values of other slave devices.
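The slave-side steps S1002 to S1006 can be sketched as follows. The message formats and the ordering of the fallback report are assumptions; the patent only requires that the slave report its state (or, failing that, its general voice feature value) and interact when instructed.

```python
def slave_step(in_preset_state: bool, instruction_received: bool,
               send_to_master) -> str:
    """One round of the slave-side protocol after the wake-up condition."""
    if in_preset_state:
        send_to_master({"type": "device_state"})   # S1004: report device state
    if instruction_received:
        return "interact"                          # S1006: master chose this slave
    if not in_preset_state:
        # Fallback: no preset state and no instruction yet, so report the
        # general voice feature value for the master's arbitration.
        send_to_master({"type": "feature_value"})
    return "wait"
```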
In the embodiment of the present application, the operations of determining the target interaction device according to the state information or determining the target interaction device according to the voice feature value are all set in the master device, on one hand, the speed of determining the target interaction device can be increased because the performance of the master device is good, and on the other hand, the data processing amount of the slave device can be reduced because the performance of the slave device is poor, and the power consumption of the slave device is reduced.
Referring to fig. 11, fig. 11 is a block diagram of a voice control apparatus according to another embodiment of the present application. As shown in fig. 11, the voice control apparatus 1100 includes:
the slave device voice wake-up module 1110 is configured to, when it is monitored that the voice of the user meets a voice wake-up condition, determine whether the current slave device is in a second preset device state;
a slave device status sending module 1120, configured to send second device status information of the current device to the master device if the current slave device is in a second preset device status;
the slave device voice interaction module 1130 is configured to control the current slave device to perform voice interaction if an interaction instruction sent by the master device is received, where the interaction instruction is generated by the master device according to the first device state information of the master device, the second device state information of the current slave device, and the second device state information of other slave devices.
Optionally, the slave device voice interaction module 1130 is further configured to, if the current slave device is not in the second preset device state and does not receive the interaction instruction sent by the master device, obtain a second universal voice feature value corresponding to the current slave device according to the user voice, and send the second universal voice feature value to the master device.
Optionally, the slave device voice interaction module 1130 is further configured to control the current slave device to perform voice interaction if an interaction instruction sent by the master device is received, where the interaction instruction is generated by the master device according to the first general voice feature value corresponding to the master device, the second general voice feature value of the current slave device, and the second general voice feature values of other slave devices.
Embodiments of the present application also provide a computer storage medium, which may store a plurality of instructions adapted to be loaded by a processor and to perform the steps of the method according to any of the above embodiments.
Referring to fig. 12, fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 12, the electronic device 1200 may include: at least one electronic device processor 1201, at least one network interface 1204, a user interface 1203, memory 1205, at least one communication bus 1202.
Wherein a communication bus 1202 is used to enable connective communication between these components.
The user interface 1203 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 1203 may also include a standard wired interface and a wireless interface.
The network interface 1204 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface).
The electronic device processor 1201 may include one or more processing cores. The electronic device processor 1201 connects the various parts of the electronic device 1200 using various interfaces and lines, and performs the various functions of the electronic device 1200 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 1205 and by invoking the data stored in the memory 1205. Optionally, the electronic device processor 1201 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The electronic device processor 1201 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU renders and draws the content to be displayed on the display screen; and the modem handles wireless communication. It can be understood that the modem may also not be integrated into the electronic device processor 1201 and may instead be implemented by a separate chip.
The Memory 1205 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1205 includes a non-transitory computer-readable medium (non-transitory computer-readable storage medium). The memory 1205 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 1205 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like; the storage data area may store data and the like referred to in the above respective method embodiments. The memory 1205 may also optionally be at least one storage device located remotely from the aforementioned electronic device processor 1201. As shown in fig. 12, the memory 1205 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a voice control program.
In the electronic device 1200 shown in fig. 12, the user interface 1203 is mainly used as an interface for providing input for a user, and acquiring data input by the user; the electronic device processor 1201 may be configured to invoke the voice control program stored in the memory 1205 and specifically perform the following operations:
when the situation that the voice of the user meets a voice awakening condition is monitored, judging whether the current equipment is in a first preset equipment state or not;
if the current equipment is in a first preset equipment state, sending first equipment state information corresponding to the first preset equipment state to candidate equipment, wherein the candidate equipment and the current equipment are in the same multi-equipment scene;
and if second equipment state information sent by the candidate equipment is received, determining whether the current equipment carries out voice interaction or not according to the first preset equipment state and a second preset equipment state corresponding to the second equipment state information.
In one embodiment, determining whether the current device performs voice interaction according to a first preset device state and a second preset device state corresponding to the second device state information includes: and comparing the priority of the first preset equipment state with the priority of the second preset equipment state corresponding to the second equipment state information, and determining whether the current equipment carries out voice interaction or not according to the priority comparison result.
In one embodiment, comparing the priority of the first preset device state with the priority of the second preset device state corresponding to the second device state information, and determining whether the current device performs voice interaction according to the priority comparison result, includes: determining, according to a preset device state priority order, a first state priority corresponding to the first preset device state and a second state priority corresponding to the second preset device state indicated by the second device state information; if the first state priority is higher than the second state priority, determining that the current device performs voice interaction; and if the first state priority is lower than the second state priority, determining that the current device does not perform voice interaction.
In one embodiment, the determining whether the current device is in the first preset device state includes: acquiring the equipment type of the current equipment, and acquiring the specified state parameter corresponding to the current equipment according to the equipment type; and judging whether the current equipment is in a first preset equipment state or not according to the specified state parameters.
In one embodiment, obtaining the specified state parameter corresponding to the current device according to the device type includes: if the device type is a handheld device, obtaining a shielding state parameter, a placement angle state parameter, and a shake state parameter corresponding to the current device. Judging whether the current device is in the first preset device state according to the specified state parameters includes: judging whether the current device is in a handheld state according to the shielding state parameter, the placement angle state parameter, and the shake state parameter.
In one embodiment, the method further comprises: and if the second equipment state information sent by the candidate equipment is not received, determining that the current equipment carries out voice interaction.
In one embodiment, the method further comprises: and if the current equipment is not in the first preset equipment state and receives second equipment state information sent by the candidate equipment, determining that the current equipment does not carry out voice interaction.
In one embodiment, the method further comprises: if the current equipment is not in the first preset equipment state and does not receive second equipment state information sent by the candidate equipment, obtaining a first general voice characteristic value corresponding to the current equipment according to the voice of the user, and sending the first general voice characteristic value to the candidate equipment; and if the second universal voice characteristic value sent by the candidate equipment is received, determining whether the current equipment carries out voice interaction or not according to the first universal voice characteristic value and the second universal voice characteristic value.
In one embodiment, acquiring a first general speech feature value corresponding to a current device according to a user speech includes: acquiring first general voice characteristic parameters corresponding to current equipment and first general voice characteristic weights corresponding to the first general voice characteristic parameters according to user voice; and calculating a first general voice characteristic value corresponding to the current equipment based on each first general voice characteristic parameter and each first general voice characteristic weight.
In one embodiment, the first generic speech feature parameters include, but are not limited to: a distance parameter between the sound source and the current device and an orientation parameter of the current device relative to the sound source.
In one embodiment, determining whether the current device performs voice interaction according to the first general voice feature value and the second general voice feature value includes: if the first general voice characteristic value is larger than the second general voice characteristic value, determining that the current equipment carries out voice interaction; if the first general voice characteristic value is smaller than the second general voice characteristic value, determining that the current equipment does not carry out voice interaction; and if the first general voice characteristic value is equal to the second general voice characteristic value and the current equipment is determined to be preset priority interaction equipment, determining that the current equipment carries out voice interaction.
In one embodiment, the method further includes: if no second general voice feature value sent by the candidate device is received, determining that the current device performs voice interaction.
In the electronic device 1200 shown in fig. 12, the user interface 1203 is mainly used as an interface for providing input for a user, and acquiring data input by the user; the electronic device processor 1201 may be configured to call the voice control program stored in the memory 1205, and specifically perform the following operations:
when the situation that the voice of the user meets the voice awakening condition is monitored, judging whether the current main equipment is in a first preset equipment state or not; if the current master equipment is in a first preset equipment state and receives second equipment state information sent by the slave equipment, determining target interactive equipment for voice interaction from the current master equipment and the slave equipment according to first equipment state information and the second equipment state information corresponding to the first preset equipment state; and controlling the target interaction equipment to perform voice interaction based on the interaction instruction.
In one embodiment, if the target interaction device is the current master device, controlling the current master device to perform voice interaction based on the interaction instruction; and if the target interaction equipment is the slave equipment, sending an interaction instruction to the target interaction equipment, wherein the interaction instruction is used for indicating the target interaction equipment to carry out voice interaction.
In one embodiment, the voice control method further includes: if the current master device is in the first preset device state and no second device state information sent by a slave device is received, controlling the current master device to perform voice interaction.
In one embodiment, the voice control method further includes: if the current master device is not in the first preset device state and second device state information sent by a slave device is received, determining a target interaction device for voice interaction from the slave devices according to the second device state information; and controlling the target interaction device to perform voice interaction based on the interaction instruction.
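Assuming each slave's second device state information reduces to an integer state priority (a representation the text does not fix), the selection among slaves could be sketched as:

```python
def pick_slave(slave_states: dict) -> str:
    """Pick the target interaction device among slaves.

    slave_states maps slave id -> reported state priority (higher wins).
    """
    return max(slave_states, key=slave_states.get)
```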
In one embodiment, the voice control method further includes: if the current master device is not in the first preset device state and no second device state information sent by a slave device is received, acquiring a first general voice feature value corresponding to the current master device according to the user voice; if a second general voice feature value sent by a slave device is received, determining a target interaction device for voice interaction from the current master device and the slave device according to the first general voice feature value and the second general voice feature value; and controlling the target interaction device to perform voice interaction based on the interaction instruction.
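Claim 9 below describes such a feature value as a weighted combination of per-device voice feature parameters (for example distance to, and orientation relative to, the sound source). A hedged sketch, with parameter names, normalization, and weights chosen purely for illustration:

```python
def general_voice_feature_value(params: dict, weights: dict) -> float:
    """Weighted combination of voice feature parameters into one score.

    params and weights map parameter name -> value; both are assumed to
    be normalized so that a higher score means a better pickup.
    """
    return sum(weights[name] * value for name, value in params.items())

# Example: a device that is close to and roughly facing the sound source.
score = general_voice_feature_value(
    {"distance": 0.9, "orientation": 0.7},  # hypothetical normalized readings
    {"distance": 0.6, "orientation": 0.4},  # hypothetical weights
)
```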
In one embodiment, the voice control method further includes: if it is not monitored that the user voice meets the voice wake-up condition, and second device state information sent by a slave device is received, determining a target interaction device for voice interaction from the slave devices according to the second device state information; and controlling the target interaction device to perform voice interaction based on the interaction instruction.
In one embodiment, the voice control method further includes: if it is not monitored that the user voice meets the voice wake-up condition, and a second general voice feature value sent by a slave device is received, determining a target interaction device for voice interaction from the slave devices according to the second general voice feature value; and controlling the target interaction device to perform voice interaction based on the interaction instruction.
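The master-side branches above can be collected into one selection routine. This is a sketch under stated assumptions only: booleans, integer priorities, and dicts stand in for the unspecified wake-up detection, state representation, and transport.

```python
def master_select(woke, master_in_state, master_priority,
                  slave_states, master_feature, slave_features):
    """Return the id of the device that should interact, or None.

    slave_states maps slave id -> reported state priority (int);
    slave_features maps slave id -> general voice feature value (float).
    """
    if woke and master_in_state:
        if not slave_states:
            return "master"                     # only the master is in a preset state
        candidates = {"master": master_priority, **slave_states}
        return max(candidates, key=candidates.get)
    if slave_states:                            # master not eligible, some slave is
        return max(slave_states, key=slave_states.get)
    feature_pool = dict(slave_features)         # nobody in a preset state
    if woke:
        feature_pool["master"] = master_feature
    if feature_pool:
        return max(feature_pool, key=feature_pool.get)
    return None
```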
In the electronic device 1200 shown in fig. 12, the user interface 1203 mainly serves as an interface for receiving input from the user; the processor 1201 may be configured to call the voice control program stored in the memory 1205 and specifically perform the following operations:
when it is monitored that the user voice meets the voice wake-up condition, judging whether the current slave device is in a second preset device state; if the current slave device is in the second preset device state, sending second device state information of the current slave device to the master device; and if an interaction instruction sent by the master device is received, controlling the current slave device to perform voice interaction, where the interaction instruction is generated by the master device according to first device state information of the master device, the second device state information of the current slave device, and second device state information of other slave devices.
In one embodiment, the voice control method further includes: if the current slave device is not in the second preset device state and no interaction instruction sent by the master device is received, acquiring a second general voice feature value corresponding to the current slave device according to the user voice, and sending the second general voice feature value to the master device; and if an interaction instruction sent by the master device is received, controlling the current slave device to perform voice interaction, where the interaction instruction is generated by the master device according to the first general voice feature value corresponding to the master device, the second general voice feature value of the current slave device, and second general voice feature values of other slave devices.
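The matching slave-side step could be sketched as follows, with `send` standing in for the unspecified slave-to-master channel; the slave then simply waits for an interaction instruction, and only the slave the master names performs the voice interaction.

```python
def slave_step(in_preset_state: bool, state_info, feature_value: float, send) -> None:
    """Report either state information or a general voice feature value."""
    if in_preset_state:
        send(("state", state_info))       # second device state information
    else:
        send(("feature", feature_value))  # second general voice feature value
```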
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a division of logical functions, and an actual implementation may divide differently; multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be implemented through interfaces, and indirect couplings or communication connections between devices or modules may be electrical, mechanical, or in another form.
Modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of action combinations, but those skilled in the art should understand that the present application is not limited by the described order of actions, as some steps may be performed in other orders or simultaneously. Further, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the embodiments of the application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
Having described the voice control method, apparatus, storage medium, and electronic device provided in the embodiments of the present application, those skilled in the art will recognize that variations are possible in both the embodiments and the applications of the concepts of the present application.
Claims (26)
1. A method for voice control, the method comprising:
when the situation that the voice of the user meets a voice awakening condition is monitored, judging whether the current equipment is in a first preset equipment state or not;
if the current equipment is in the first preset equipment state, sending first equipment state information corresponding to the first preset equipment state to candidate equipment, wherein the candidate equipment and the current equipment are in the same multi-equipment scene;
and if second equipment state information sent by the candidate equipment is received, determining whether the current equipment carries out voice interaction or not according to the first preset equipment state and a second preset equipment state corresponding to the second equipment state information.
2. The method according to claim 1, wherein the determining whether the current device performs voice interaction according to the first preset device state and a second preset device state corresponding to the second device state information includes:
and comparing the priority of the first preset equipment state with the priority of a second preset equipment state corresponding to the second equipment state information, and determining whether the current equipment carries out voice interaction or not according to a priority comparison result.
3. The method of claim 2, wherein the comparing the priority of the first preset device status with the priority of the second preset device status corresponding to the second device status information and determining whether the current device performs voice interaction according to a priority comparison result comprises:
determining, according to a preset device state priority sequence, a first state priority corresponding to the first preset equipment state and a second state priority corresponding to the second preset equipment state indicated by the second equipment state information;
if the first state priority is greater than the second state priority, determining that the current equipment carries out voice interaction;
and if the first state priority is smaller than the second state priority, determining that the current equipment does not carry out voice interaction.
4. The method of claim 1, wherein the determining whether the current device is in a first preset device state comprises:
acquiring the equipment type of the current equipment, and acquiring the specified state parameter corresponding to the current equipment according to the equipment type;
and judging whether the current equipment is in a first preset equipment state or not according to the specified state parameters.
5. The method according to claim 4, wherein the obtaining the specified state parameter corresponding to the current device according to the device type includes:
if the equipment type is handheld equipment, acquiring at least one of a shielding state parameter, a placing angle state parameter and a shaking state parameter corresponding to the current equipment;
the judging whether the current equipment is in a first preset equipment state according to the specified state parameters comprises:
and judging whether the current equipment is in a handheld state according to at least one of the shielding state parameter, the placing angle state parameter and the shaking state parameter.
6. The method of claim 1, further comprising:
and if the second equipment state information sent by the candidate equipment is not received, determining that the current equipment carries out voice interaction.
7. The method of claim 1, further comprising:
and if the current equipment is not in the first preset equipment state and second equipment state information sent by the candidate equipment is received, determining that the current equipment does not carry out voice interaction.
8. The method of claim 1, further comprising:
if the current equipment is not in the first preset equipment state and second equipment state information sent by the candidate equipment is not received, acquiring a first general voice characteristic value corresponding to the current equipment according to the user voice, and sending the first general voice characteristic value to the candidate equipment;
and if a second general voice characteristic value sent by the candidate equipment is received, determining whether the current equipment carries out voice interaction or not according to the first general voice characteristic value and the second general voice characteristic value.
9. The method according to claim 8, wherein the obtaining a first generic speech feature value corresponding to the current device according to the user speech includes:
acquiring first general voice characteristic parameters corresponding to the current equipment and first general voice characteristic weights corresponding to the first general voice characteristic parameters according to the user voice;
and calculating a first general voice characteristic value corresponding to the current equipment based on each first general voice characteristic parameter and each first general voice characteristic weight.
10. The method of claim 9, wherein the first generic speech feature parameter includes, but is not limited to: a distance parameter between a sound source and the current device and an orientation parameter of the current device relative to the sound source.
11. The method of claim 8, wherein the determining whether the current device is engaged in voice interaction based on the first generic speech feature value and the second generic speech feature value comprises:
if the first general voice characteristic value is larger than the second general voice characteristic value, determining that the current equipment carries out voice interaction;
if the first general voice characteristic value is smaller than the second general voice characteristic value, determining that the current equipment does not carry out voice interaction;
and if the first general voice characteristic value is equal to the second general voice characteristic value and the current equipment is determined to be preset priority interaction equipment, determining that the current equipment carries out voice interaction.
12. The method of claim 8, further comprising:
and if the second general voice characteristic value sent by the candidate equipment is not received, determining that the current equipment carries out voice interaction.
13. A method of voice control, the method comprising:
when the situation that the voice of the user meets a voice awakening condition is monitored, judging whether the current main equipment is in a first preset equipment state or not;
if the current master device is in the first preset device state and receives second device state information sent by the slave device, determining target interaction devices for voice interaction from the current master device and the slave device according to first device state information corresponding to the first preset device state and the second device state information;
and controlling the target interaction equipment to carry out voice interaction based on the interaction instruction.
14. The method of claim 13, wherein the controlling the target interactive device to perform voice interaction based on the interactive instruction comprises:
if the target interaction equipment is the current main equipment, controlling the current main equipment to carry out voice interaction based on an interaction instruction;
and if the target interaction equipment is the slave equipment, sending an interaction instruction to the target interaction equipment, wherein the interaction instruction is used for indicating the target interaction equipment to carry out voice interaction.
15. The method of claim 13, further comprising:
and if the current master equipment is in the first preset equipment state and does not receive second equipment state information sent by slave equipment, controlling the current master equipment to carry out voice interaction.
16. The method of claim 13, further comprising:
if the current master equipment is not in the first preset equipment state and second equipment state information sent by the slave equipment is received, determining target interactive equipment for voice interaction from the slave equipment according to the second equipment state information;
and controlling the target interaction equipment to carry out voice interaction based on the interaction instruction.
17. The method of claim 13, further comprising:
if the current master equipment is not in the first preset equipment state and does not receive second equipment state information sent by the slave equipment, acquiring a first general voice characteristic value corresponding to the current master equipment according to the user voice;
if a second general voice characteristic value sent by the slave equipment is received, determining target interaction equipment for voice interaction from the current master equipment and the slave equipment according to the first general voice characteristic value and the second general voice characteristic value;
and controlling the target interaction equipment to carry out voice interaction based on the interaction instruction.
18. The method of claim 13, further comprising:
if it is not monitored that the user voice meets the voice awakening condition, and second equipment state information sent by the slave equipment is received, determining target interaction equipment for voice interaction from the slave equipment according to the second equipment state information;
and controlling the target interaction equipment to carry out voice interaction based on the interaction instruction.
19. The method of claim 18, further comprising:
if it is not monitored that the user voice meets the voice awakening condition, and a second general voice characteristic value sent by the slave equipment is received, determining target interaction equipment for voice interaction from the slave equipment according to the second general voice characteristic value;
and controlling the target interaction equipment to carry out voice interaction based on the interaction instruction.
20. A method for voice control, the method comprising:
when the situation that the voice of the user meets the voice awakening condition is monitored, judging whether the current slave equipment is in a second preset equipment state or not;
if the current slave equipment is in the second preset equipment state, sending second equipment state information of the current equipment to the master equipment;
and if an interactive instruction sent by the master device is received, controlling the current slave device to perform voice interaction, wherein the interactive instruction is generated by the master device according to first device state information of the master device, second device state information of the current slave device and second device state information of other slave devices.
21. The method of claim 20, further comprising:
if the current slave equipment is not in the second preset equipment state and does not receive the interactive instruction sent by the master equipment, acquiring a second general voice characteristic value corresponding to the current slave equipment according to the user voice, and sending the second general voice characteristic value to the master equipment;
and if an interactive instruction sent by the master equipment is received, controlling the current slave equipment to carry out voice interaction, wherein the interactive instruction is generated by the master equipment according to a first general voice characteristic value corresponding to the master equipment, the second general voice characteristic value of the current slave equipment and second general voice characteristic values of other slave equipment.
22. A voice control apparatus, characterized in that the apparatus comprises:
the voice awakening module is used for judging whether the current equipment is in a first preset equipment state or not when monitoring that the voice of the user meets a voice awakening condition;
the device state sending module is used for sending first device state information corresponding to the first preset device state to a candidate device if the current device is in the first preset device state, and the candidate device and the current device are in the same multi-device scene;
and the voice interaction determining module is used for determining whether the current equipment carries out voice interaction or not according to the first preset equipment state and a second preset equipment state corresponding to the second equipment state information if the second equipment state information sent by the candidate equipment is received.
23. A voice control apparatus, characterized in that the apparatus comprises:
the main equipment voice awakening module is used for judging whether the current main equipment is in a first preset equipment state or not when monitoring that the voice of the user meets the voice awakening condition;
a master device voice interaction determining module, configured to determine, if the current master device is in the first preset device state and second device state information sent by the slave device is received, a target interaction device for voice interaction from the current master device and the slave device according to first device state information corresponding to the first preset device state and the second device state information;
and the instruction control module is used for controlling the target interaction equipment to carry out voice interaction based on the interaction instruction.
24. A voice control apparatus, characterized in that the apparatus comprises:
the slave equipment voice awakening module is used for judging whether the current slave equipment is in a second preset equipment state or not when monitoring that the voice of the user meets the voice awakening condition;
a slave device state sending module, configured to send second device state information of the current device to a master device if the current slave device is in the second preset device state;
and the slave equipment voice interaction module is used for controlling the current slave equipment to carry out voice interaction if an interaction instruction sent by the master equipment is received, wherein the interaction instruction is generated by the master equipment according to the first equipment state information of the master equipment, the second equipment state information of the current slave equipment and the second equipment state information of other slave equipment.
25. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the steps of the method according to any one of claims 1 to 12, 13 to 19 and 20 to 21.
26. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method according to any one of claims 1 to 12, 13 to 19 and 20 to 21 when executing the program.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211443786.5A CN115810356A (en) | 2022-11-17 | 2022-11-17 | Voice control method, device, storage medium and electronic equipment |
PCT/CN2023/117319 WO2024103926A1 (en) | 2022-11-17 | 2023-09-06 | Voice control methods and apparatuses, storage medium, and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211443786.5A CN115810356A (en) | 2022-11-17 | 2022-11-17 | Voice control method, device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115810356A true CN115810356A (en) | 2023-03-17 |
Family
ID=85483428
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211443786.5A Pending CN115810356A (en) | 2022-11-17 | 2022-11-17 | Voice control method, device, storage medium and electronic equipment |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115810356A (en) |
WO (1) | WO2024103926A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117133282A (en) * | 2023-03-27 | 2023-11-28 | 荣耀终端有限公司 | Voice interaction method and electronic equipment |
WO2024103926A1 (en) * | 2022-11-17 | 2024-05-23 | Oppo广东移动通信有限公司 | Voice control methods and apparatuses, storage medium, and electronic device |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10643609B1 (en) * | 2017-03-29 | 2020-05-05 | Amazon Technologies, Inc. | Selecting speech inputs |
CN109391528A (en) * | 2018-08-31 | 2019-02-26 | 百度在线网络技术(北京)有限公司 | Awakening method, device, equipment and the storage medium of speech-sound intelligent equipment |
CN111276139B (en) * | 2020-01-07 | 2023-09-19 | 百度在线网络技术(北京)有限公司 | Voice wake-up method and device |
CN113241068A (en) * | 2021-03-26 | 2021-08-10 | 青岛海尔科技有限公司 | Voice signal response method and device, storage medium and electronic device |
CN114627871A (en) * | 2022-03-22 | 2022-06-14 | 北京小米移动软件有限公司 | Method, device, equipment and storage medium for waking up equipment |
CN115810356A (en) * | 2022-11-17 | 2023-03-17 | Oppo广东移动通信有限公司 | Voice control method, device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
WO2024103926A1 (en) | 2024-05-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11443744B2 (en) | Electronic device and voice recognition control method of electronic device | |
JP6697024B2 (en) | Reduces the need for manual start / end points and trigger phrases | |
CN111192591B (en) | Awakening method and device of intelligent equipment, intelligent sound box and storage medium | |
US10777195B2 (en) | Wake command nullification for digital assistance and voice recognition technologies | |
CN112863547B (en) | Virtual resource transfer processing method, device, storage medium and computer equipment | |
CN115810356A (en) | Voice control method, device, storage medium and electronic equipment | |
EP3779968A1 (en) | Audio processing | |
CN106940997B (en) | Method and device for sending voice signal to voice recognition system | |
CN109101517B (en) | Information processing method, information processing apparatus, and medium | |
CN109672775B (en) | Method, device and terminal for adjusting awakening sensitivity | |
JP6619488B2 (en) | Continuous conversation function in artificial intelligence equipment | |
CN110097884B (en) | Voice interaction method and device | |
CN112634872A (en) | Voice equipment awakening method and device | |
CN108648754A (en) | Sound control method and device | |
CN111312243B (en) | Equipment interaction method and device | |
CN113225624B (en) | Method and device for determining time consumption of voice recognition | |
CN112420043A (en) | Intelligent awakening method and device based on voice, electronic equipment and storage medium | |
CN112740219A (en) | Method and device for generating gesture recognition model, storage medium and electronic equipment | |
CN114740744A (en) | Method and device for controlling smart home, wearable product and medium | |
CN113835670A (en) | Device control method, device, storage medium and electronic device | |
EP3901881A1 (en) | Information processing terminal, information processing apparatus, and information processing method | |
CN111833883A (en) | Voice control method and device, electronic equipment and storage medium | |
CN112260938B (en) | Session message processing method and device, electronic equipment and storage medium | |
CN109857472A (en) | Towards the exchange method and device for having screen equipment | |
CN112365899B (en) | Voice processing method, device, storage medium and terminal equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||