CN113450798A

CN113450798A - Device control method, device, storage medium, and electronic apparatus

Info

Publication number: CN113450798A
Application number: CN202110731555.3A
Authority: CN
Inventors: 郭凯
Original assignee: Qingdao Haier Technology Co Ltd; Haier Smart Home Co Ltd
Current assignee: Qingdao Haier Technology Co Ltd; Haier Smart Home Co Ltd
Priority date: 2021-06-29
Filing date: 2021-06-29
Publication date: 2021-09-28

Abstract

Embodiments of the invention provide a device control method and apparatus, a storage medium, and an electronic apparatus. The method comprises: receiving first request information sent by a first device and other request information sent by other devices, wherein the first request information comprises a target voice received by the first device and first distance information between the first device and a target object that utters the target voice, and the other request information comprises the target voice received by the other devices and other distance information between the other devices and the target object; recognizing the target voice to obtain a first recognition result, wherein the first recognition result comprises first recognition information and second recognition information, the first recognition information indicating a target instruction and the second recognition information indicating an identifier of the target object; determining a target device based on the second recognition information, the first distance information, and the other distance information; and controlling the target device to execute the operation indicated by the target instruction. The invention addresses the problem in the related art that controlling device unlocking is insufficiently secure, and thereby improves the security of device unlocking.

Description

Device control method, device, storage medium, and electronic apparatus

Technical Field

The embodiment of the invention relates to the field of artificial intelligence, in particular to a device control method, a device, a storage medium and an electronic device.

Background

Common unlocking methods include password input, fingerprint input, face scanning, screen sliding, gesture input, and voice control. Voice control, as a highly interactive intelligent control technology, has become popular in recent years. However, with ordinary voice control, anyone who speaks the correct password can unlock a household appliance, which poses a serious security problem in some scenarios.

No effective solution has yet been proposed for the problem in the related art that controlling device unlocking is insufficiently secure.

Disclosure of Invention

The embodiment of the invention provides a device control method, a device, a storage medium and an electronic device, which are used for at least solving the problem of insufficient safety of unlocking of control devices in the related art.

According to an embodiment of the present invention, there is provided an apparatus control method including: receiving first request information sent by a first device and other request information sent by other devices, wherein the first request information includes target voice received by the first device and first distance information between the first device and a target object which sends the target voice, the first distance information is determined by the first device, the other request information includes the target voice received by the other devices and other distance information between the other devices and the target object, and the other distance information is determined by the other devices; identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information and second identification information, the first identification information is used for indicating a target instruction contained in the target voice, and the second identification information is used for indicating an identifier of the target object; determining a target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information; and controlling the target device to execute the operation indicated by the target instruction based on the first identification information.

In one exemplary embodiment, recognizing the target speech to obtain a first recognition result includes: performing semantic recognition on the target voice to determine the first recognition information; acquiring a target voiceprint feature of the target voice; comparing the target voiceprint features with standard voiceprint features included in a pre-stored standard voiceprint feature library to obtain a comparison result, wherein the comparison result is used for indicating the similarity between the target voiceprint features and any voiceprint feature included in the pre-stored standard voiceprint feature library, and the standard voiceprint feature library also includes an identification of an object corresponding to the standard voiceprint features included in the standard voiceprint feature library; and obtaining the second identification information based on the comparison result.

In an exemplary embodiment, obtaining the second identification information based on the comparison result includes: and under the condition that the standard voiceprint feature library comprises the standard voiceprint features of which the similarity with the target voiceprint features exceeds a preset threshold value based on the comparison result, determining the identifier of the object corresponding to the determined standard voiceprint features as the identifier of the target object so as to obtain the second identification information.

In an exemplary embodiment, before comparing the target voiceprint features with standard voiceprint features included in a pre-stored standard voiceprint feature library to obtain a comparison result, the method further includes: acquiring standard voiceprint characteristics of a plurality of standard voices and identification of an object sending the standard voices; and correspondingly storing the standard voiceprint features and the identifications in the standard voiceprint feature library.

In one exemplary embodiment, determining the target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information includes: determining, based on the second identification information, the execution devices among the first device and the other devices that the target object is allowed to control; and determining, from the execution devices, the target device closest to the target object based on the first distance information and the other distance information.

In an exemplary embodiment, in a case that the target device is determined to be the first device, controlling the first device to perform the operation indicated by the target instruction based on the first identification information includes: under the condition that the first device is determined to be in a dormant state, sending a first control instruction to the first device based on the first identification information so as to control the first device to execute a wakeup operation, and executing an operation indicated by the target instruction after wakeup; and under the condition that the first equipment is determined to be in the non-sleep state, sending a second control instruction to the first equipment based on the first identification information so as to control the first equipment to directly execute the operation indicated by the target instruction.
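The server-side choice between the two control instructions can be sketched as follows. This is a minimal illustration only: the names `DeviceState`, `ControlInstruction`, and `build_control_instruction`, and the string labels "first" and "second", are assumptions introduced here and do not appear in the original text.

```python
from dataclasses import dataclass
from enum import Enum


class DeviceState(Enum):
    """Hypothetical device states; the text only distinguishes dormant from non-sleep."""
    DORMANT = "dormant"
    AWAKE = "awake"


@dataclass(frozen=True)
class ControlInstruction:
    kind: str                # "first": wake, then execute; "second": execute directly
    wake_first: bool         # whether the device must perform a wake-up operation
    target_instruction: str  # the operation recognized from the target voice


def build_control_instruction(state: DeviceState, target_instruction: str) -> ControlInstruction:
    """Pick the control instruction according to whether the device is dormant."""
    if state is DeviceState.DORMANT:
        return ControlInstruction("first", True, target_instruction)
    return ControlInstruction("second", False, target_instruction)


if __name__ == "__main__":
    print(build_control_instruction(DeviceState.DORMANT, "unlock"))
    print(build_control_instruction(DeviceState.AWAKE, "unlock"))
```

Carrying the target instruction inside both instruction types lets a dormant device execute the operation immediately after waking, without a second round trip to the server.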

According to another embodiment of the present invention, there is also provided an apparatus control method including: under the condition that first equipment receives target voice, determining first distance information between the first equipment and a target object which sends the target voice; sending first request information containing the target voice and the first distance information to a server to instruct the server to execute the following operations: identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information and second identification information, the first identification information is used for indicating a target instruction contained in the target voice, and the second identification information is used for indicating an identifier of the target object; and determining a target device for executing the target instruction based on the second identification information, the first distance information and other distance information contained in other request information from other devices, and controlling the target device to execute the operation indicated by the target instruction based on the first identification information.

In an exemplary embodiment, determining first distance information between the first device and the target object that utters the target voice includes: processing the voice signals received by each microphone in a microphone array included in the first device according to a predetermined algorithm, so as to determine the first distance information between the first device and the target object.

In one exemplary embodiment, the predetermined algorithm includes at least one of the following: a beamforming-based method; a method based on high-resolution spectral estimation; and a localization method based on time differences (time difference of arrival).
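As an illustration of the time-difference approach, the sketch below estimates the delay between the signals at two microphones by brute-force cross-correlation and converts it into a path-length difference. It covers only the delay-estimation step: turning delays into a distance requires the geometry of the microphone array, which the text does not specify. The function names, the 16 kHz sample rate, and the speed of sound of 343 m/s are assumptions made for the example.

```python
def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means the sound reached the `other` microphone later.
    Uses plain cross-correlation over lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += sample * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def path_difference_m(lag_samples, sample_rate_hz, speed_of_sound_mps=343.0):
    """Convert a delay in samples into a difference in travel distance (meters)."""
    return lag_samples / sample_rate_hz * speed_of_sound_mps


if __name__ == "__main__":
    mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
    mic_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]  # same pulse, arriving 3 samples later
    lag = estimate_delay_samples(mic_a, mic_b, max_lag=5)
    print(lag, path_difference_m(lag, sample_rate_hz=16000))
```

With three or more microphones at known positions, several such path differences constrain the source position, from which a device-to-speaker distance could be derived.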

In an exemplary embodiment, after sending the first request information including the target voice and the first distance information to the server, the method further includes: receiving a first control instruction sent by the server, wherein the first control instruction is sent by the server when the target device is determined to be the first device and the first device is in a dormant state; and executing a wake-up operation based on the first control instruction, and executing the operation indicated by the target instruction after wake-up; or receiving a second control instruction sent by the server, wherein the second control instruction is sent by the server when the target device is determined to be the first device and the first device is in a non-sleep state; and directly executing the operation indicated by the target instruction based on the second control instruction.
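On the device side, handling the two control instructions might look like the following minimal sketch. The `Device` class, its `log` attribute, and the "first"/"second" labels are hypothetical names chosen for illustration; the original text does not define a message format.

```python
class Device:
    """Hypothetical first device that reacts to control instructions from the server."""

    def __init__(self, asleep: bool):
        self.asleep = asleep
        self.log = []  # records the operations performed, in order

    def on_control_instruction(self, kind: str, target_instruction: str) -> None:
        if kind == "first":
            # First control instruction: wake up, then execute.
            self.asleep = False
            self.log.append("wake")
        elif kind != "second":
            raise ValueError(f"unknown control instruction kind: {kind!r}")
        # Second control instruction (or after wake-up): execute directly.
        self.log.append(f"execute:{target_instruction}")


if __name__ == "__main__":
    lock = Device(asleep=True)
    lock.on_control_instruction("first", "unlock")
    print(lock.log)
```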

According to still another embodiment of the present invention, there is also provided an apparatus control device including: a first receiving module, configured to receive first request information sent by a first device and other request information sent by other devices, where the first request information includes a target voice received by the first device and first distance information between the first device and a target object that sends the target voice, the first distance information is determined by the first device, the other request information includes the target voice received by the other devices and other distance information between the other devices and the target object, and the other distance information is determined by the other devices; the recognition module is configured to recognize the target voice to obtain a first recognition result, where the first recognition result includes first recognition information and second recognition information, the first recognition information is used to indicate a target instruction included in the target voice, and the second recognition information is used to indicate an identifier of the target object; a first determination module configured to determine a target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information; and the control module is used for controlling the target equipment to execute the operation indicated by the target instruction based on the first identification information.

According to still another embodiment of the present invention, there is also provided an apparatus control device including: the second determining module is used for determining first distance information between the first equipment and a target object which sends out the target voice under the condition that the first equipment receives the target voice; a sending module, configured to send first request information including the target voice and the first distance information to a server, so as to instruct the server to perform the following operations: identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information and second identification information, the first identification information is used for indicating a target instruction contained in the target voice, and the second identification information is used for indicating an identifier of the target object; and determining a target device for executing the target instruction based on the second identification information, the first distance information and other distance information contained in other request information from other devices, and controlling the target device to execute the operation indicated by the target instruction based on the first identification information.

According to a further embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.

According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.

According to the invention, first request information sent by first equipment and other request information sent by other equipment are received, wherein the first request information comprises target voice received by the first equipment and first distance information between the first equipment and a target object which sends the target voice and is determined by the first equipment, and the other request information comprises the target voice received by other equipment and other distance information between the other equipment and the target object and is determined by the other equipment; then, the target voice is recognized to obtain a first recognition result, wherein the first recognition result comprises a target instruction contained in the target voice and an identification of a target object sending the target voice; then, a target device for executing the target instruction is determined based on the second identification information, the first distance information and the other distance information, and the target device is controlled to execute the operation indicated by the target instruction based on the first identification information. The target device for executing the target instruction is determined by recognizing the target instruction in the target voice and the identification of the target object sending the target voice, the problem of insufficient safety of unlocking the control device in the related technology is solved, and the effect of improving the safety of unlocking the device is achieved.

Drawings

FIG. 1 is a block diagram of the hardware configuration of a mobile terminal for a device control method according to an embodiment of the present invention;

FIG. 2 is a first flowchart of a device control method according to an embodiment of the present invention;

FIG. 3 is a second flowchart of a device control method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a sound source localization algorithm according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a sound source localization algorithm according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a sound source localization algorithm according to an embodiment of the present invention;

FIG. 7 is a first flowchart of a device control method according to an embodiment of the present invention;

FIG. 8 is a second flowchart of an apparatus control method according to an embodiment of the present invention;

FIG. 9 is a first block diagram of the structure of the device control apparatus according to the embodiment of the present invention;

FIG. 10 is a second block diagram of the structure of the device control apparatus according to an embodiment of the present invention.

Detailed Description

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.

The method embodiments provided in the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, FIG. 1 is a block diagram of the hardware structure of a mobile terminal for the device control method according to an embodiment of the present invention. As shown in FIG. 1, the mobile terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microcontroller unit (MCU) or a field-programmable gate array (FPGA)) and a memory 104 for storing data. The mobile terminal may further include a transmission device 106 for communication functions and an input-output device 108. Those skilled in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the mobile terminal. For example, the mobile terminal may include more or fewer components than shown in FIG. 1, or have a different configuration from that shown in FIG. 1.

The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the device control method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of such a network may include a wireless network provided by the communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the internet. In another example, the transmission device 106 may be a Radio Frequency (RF) module, which communicates with the internet wirelessly.

In this embodiment, a device control method is provided, and fig. 2 is a first flowchart of the device control method according to the embodiment of the present invention, and as shown in fig. 2, the flowchart includes the following steps:

step S202, receiving first request information sent by a first device and other request information sent by other devices, where the first request information includes a target voice received by the first device and first distance information between the first device and a target object that sends the target voice, the first distance information is determined by the first device, the other request information includes the target voice received by the other devices and other distance information between the other devices and the target object, and the other distance information is determined by the other devices;

step S204, recognizing the target voice to obtain a first recognition result, wherein the first recognition result comprises first recognition information and second recognition information, the first recognition information is used for indicating a target instruction contained in the target voice, and the second recognition information is used for indicating an identifier of the target object;

step S206 of determining a target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information;

step S208, controlling the target device to execute the operation indicated by the target instruction based on the first identification information.
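Steps S202 to S208 can be sketched as a single server-side handler. This is a minimal illustration under stated assumptions: the `Request` and `RecognitionResult` types and the injected `recognize`, `select_target`, and `send` callables are hypothetical, and the sketch assumes that all requests in one call carry the same utterance.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Request:
    device_id: str     # device that received the target voice
    voice: bytes       # the target voice as captured by that device
    distance_m: float  # distance between that device and the target object


@dataclass(frozen=True)
class RecognitionResult:
    instruction: str           # first recognition information (target instruction)
    speaker_id: Optional[str]  # second recognition information (None if unrecognized)


def handle_requests(
    requests: Sequence[Request],
    recognize: Callable[[bytes], RecognitionResult],
    select_target: Callable[[Optional[str], Sequence[Request]], Optional[str]],
    send: Callable[[str, str], None],
) -> Optional[str]:
    """Run S204 to S208 on the requests collected in S202; return the target device id."""
    if not requests:
        return None
    result = recognize(requests[0].voice)                # S204
    target = select_target(result.speaker_id, requests)  # S206
    if target is not None:
        send(target, result.instruction)                 # S208
    return target


if __name__ == "__main__":
    sent = []
    reqs = [Request("door_lock", b"...", 2.0), Request("tv", b"...", 1.0)]
    target = handle_requests(
        reqs,
        recognize=lambda voice: RecognitionResult("unlock", "alice"),
        select_target=lambda sid, rs: min(rs, key=lambda r: r.distance_m).device_id if sid else None,
        send=lambda device, instruction: sent.append((device, instruction)),
    )
    print(target, sent)
```

Passing the recognizer and the selection rule in as callables keeps the flow of the four steps visible without committing to any particular speech or voiceprint engine.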

Through the above steps, first request information sent by the first device and other request information sent by other devices are received, wherein the first request information comprises the target voice received by the first device and first distance information, determined by the first device, between the first device and the target object that utters the target voice, and the other request information comprises the target voice received by the other devices and other distance information, determined by the other devices, between the other devices and the target object. The target voice is then recognized to obtain a first recognition result, which comprises the target instruction contained in the target voice and an identifier of the target object that utters the target voice. Next, a target device for executing the target instruction is determined based on the second identification information, the first distance information, and the other distance information, and the target device is controlled to execute the operation indicated by the target instruction based on the first identification information. Because the target device is determined by recognizing both the target instruction in the target voice and the identifier of the target object that utters it, the problem in the related art that controlling device unlocking is insufficiently secure is addressed, and the security of device unlocking is improved.

The execution subject of the above steps may be a server or a cloud server, or a processor configured on a storage device and having a human-computer interaction capability, or a processing device or a processing unit having a similar processing capability, but is not limited thereto. The following description takes the server side to perform the above operation as an example (which is only an exemplary description, and in actual operation, other devices or modules may also perform the above operation):

In the above embodiment, the server receives first request information sent by a first device and other request information (e.g., second request information) sent by other devices (e.g., a second device). The first request information includes the target voice received by the first device and first distance information between the first device and the target object that utters the target voice; the other request information (e.g., the second request information) includes the target voice received by the other devices (e.g., the second device) and other distance information (e.g., second distance information) between the other devices and the target object. The server recognizes the target voice based on the first request information and the second request information to obtain a first identification result, which includes the target instruction contained in the target voice and an identifier of the target object that utters the target voice. In practical applications, the server can perform semantic recognition on the target voice to determine the target instruction, and can determine the identifier of the target object by acquiring the voiceprint features of the target voice, so that the identity of the target object is recognized. The server then determines a target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information, and controls the target device to execute the operation indicated by the target instruction based on the first identification information. This embodiment determines the target device by recognizing both the target instruction in the target voice and the identifier of the target object that utters it, which addresses the problem in the related art that controlling device unlocking is insufficiently secure and thereby improves the security of device unlocking.

In an optional embodiment, recognizing the target voice to obtain the first recognition result includes: performing semantic recognition on the target voice to determine the first recognition information; acquiring a target voiceprint feature of the target voice; comparing the target voiceprint feature with standard voiceprint features included in a pre-stored standard voiceprint feature library to obtain a comparison result, wherein the comparison result indicates the similarity between the target voiceprint feature and any voiceprint feature included in the library, and the library also includes the identifier of the object corresponding to each standard voiceprint feature; and obtaining the second identification information based on the comparison result. In this embodiment, the server performs semantic recognition on the target voice to determine the first recognition information. In practical applications, the server pre-stores a standard voiceprint feature library containing, for example, the voiceprint features of a target object A. Optionally, multiple voiceprint features of target object A can be collected in advance, such as the voiceprint features of voices uttered by target object A at different volumes or from different angles. Multiple voiceprint features of multiple target objects can of course also be collected in advance as needed and stored in the library, together with the identifier of the object corresponding to each voiceprint feature. After obtaining the target voiceprint feature of the target voice, the server may compare it with the standard voiceprint features in the library to obtain the comparison result, and may then obtain the second identification information based on that result. For example, when the similarity reaches 90% (or 85%, or another value) or more, it may be determined that the voiceprint feature of the target voice meets the standard requirement; in practical applications, it may then be determined that the target object that utters the target voice has the authority to operate and control the target device. This embodiment identifies the target object that utters the target voice and thereby improves the security of device operation.

In an optional embodiment, obtaining the second identification information based on the comparison result includes: when it is determined, based on the comparison result, that the standard voiceprint feature library includes a standard voiceprint feature whose similarity with the target voiceprint feature exceeds a predetermined threshold, determining the identifier of the object corresponding to that standard voiceprint feature as the identifier of the target object, so as to obtain the second identification information. In this embodiment, for example, when the comparison result shows that the similarity between the target voiceprint feature and a standard voiceprint feature in the standard voiceprint feature library exceeds the predetermined threshold (e.g., 90%, 85%, or another value), the identifier of the object corresponding to that standard voiceprint feature is taken as the identifier of the target object, and the second identification information is thereby obtained. This embodiment determines the identifier of the target object from the voiceprint feature of the target voice.
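The threshold comparison described above can be sketched as follows. The original text does not state how similarity is computed; cosine similarity over fixed-length feature vectors is an assumption made here for illustration, as are the function names and the toy two-dimensional features.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(
    target: Sequence[float],
    library: Iterable[Tuple[str, Sequence[float]]],
    threshold: float = 0.9,
) -> Optional[str]:
    """Return the identifier whose stored feature is most similar to `target`,
    provided the similarity exceeds `threshold`; otherwise return None."""
    best_id, best_sim = None, threshold
    for object_id, feature in library:
        sim = cosine_similarity(target, feature)
        if sim > best_sim:
            best_id, best_sim = object_id, sim
    return best_id


if __name__ == "__main__":
    library = [("alice", [1.0, 0.0]), ("bob", [0.0, 1.0])]
    print(identify_speaker([0.99, 0.05], library))
    print(identify_speaker([1.0, 1.0], library))
```

Returning `None` when no stored feature exceeds the threshold gives later steps an explicit signal that the speaker is not recognized.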

In an optional embodiment, before comparing the target voiceprint feature with the standard voiceprint features included in a pre-stored standard voiceprint feature library to obtain a comparison result, the method further includes: acquiring standard voiceprint features of a plurality of standard voices and the identifiers of the objects that utter the standard voices; and storing the standard voiceprint features and the identifiers in correspondence in the standard voiceprint feature library. In this embodiment, before the comparison, the server first obtains the standard voiceprint features of a plurality of standard voices and the identifier of the object that utters each standard voice. For example, the standard voiceprint features may be the voiceprint features of voices uttered by target object A at different volumes or from different angles; multiple voiceprint features of multiple target objects, together with the identifier of the object corresponding to each voiceprint feature, may of course also be obtained as needed. The server then stores the standard voiceprint features and the identifiers in correspondence in the standard voiceprint feature library.
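Building the standard voiceprint feature library can be sketched as a simple in-memory store that keeps several features per object. The `VoiceprintLibrary` class and its method names are hypothetical, and the short numeric lists stand in for real voiceprint features, whose format the text does not specify.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


class VoiceprintLibrary:
    """Hypothetical store mapping each object identifier to its standard voiceprint features."""

    def __init__(self) -> None:
        self._features: Dict[str, List[Tuple[float, ...]]] = defaultdict(list)

    def enroll(self, object_id: str, feature: Sequence[float]) -> None:
        """Store one standard voiceprint feature together with its object's identifier."""
        self._features[object_id].append(tuple(feature))

    def entries(self) -> Iterator[Tuple[str, Tuple[float, ...]]]:
        """Yield (identifier, feature) pairs for comparison against a target feature."""
        for object_id, features in self._features.items():
            for feature in features:
                yield object_id, feature


if __name__ == "__main__":
    library = VoiceprintLibrary()
    library.enroll("A", [0.1, 0.9])  # e.g., object A speaking quietly
    library.enroll("A", [0.2, 0.8])  # e.g., object A speaking loudly
    library.enroll("B", [0.7, 0.3])
    for object_id, feature in library.entries():
        print(object_id, feature)
```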

In the above embodiments, the identifier of the target object is recognized from voiceprint features. Voiceprint recognition, also called speaker recognition, is a biometric identification technique similar to fingerprint recognition that determines a speaker's identity from his or her voice. Its theoretical basis is that each voice has unique characteristics by which different people's voices can be effectively distinguished. These characteristics include, specifically, the pitch spectrum and its envelope, the energy of pitch frames, and the frequencies of occurrence of pitch formants and their trajectories. Table 1 shows the characteristics of different biometric features of the human body.

TABLE 1

As can be seen from Table 1, voiceprint recognition has a number of advantages. Voiceprint recognition can be divided into two categories: speaker verification (also called speaker confirmation) and speaker identification. Speaker verification judges whether an unknown speaker is a particular designated person; speaker identification determines which of the enrolled speakers an unknown speaker is. Speaker identification is often applied to criminal investigation and crime solving, criminal tracking, national defense monitoring, personalized applications, and the like, whereas speaker verification is often applied to securities trading, banking transactions, public security evidence collection, voice-controlled locks for personal computers and automobiles, identity cards, credit card identification, and the like. The above embodiment adopts speaker identification, that is, it recognizes which user uttered the control voice.

In an optional embodiment, determining the target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information comprises: determining, based on the second identification information, the execution devices, among the first device and the other devices, that the target object is allowed to control; and determining, from the execution devices and based on the first distance information and the other distance information, the target device closest to the target object. In this embodiment, the server first determines which of the first device and the other devices the target object is allowed to control, according to the identifier of the target object indicated by the second identification information included in the first identification result. It then determines, from these execution devices, the target device closest to the target object, based on the first distance information and the other distance information (such as second distance information) included in the other request information (such as second request information).
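The two-stage selection described above (filter by permission, then pick the nearest device) might look like the following sketch. The function name `select_target_device`, the `permissions` mapping, and the device and user names are assumptions made for illustration; the embodiment does not prescribe a data model.

```python
from typing import Dict, Optional, Set


def select_target_device(
    object_id: str,
    distances: Dict[str, float],
    permissions: Dict[str, Set[str]],
) -> Optional[str]:
    """Return the permitted device closest to the target object, or None.

    distances:   device id -> distance reported by that device (metres)
    permissions: device id -> identifiers of objects allowed to control it
    """
    # Stage 1: keep only the execution devices the target object may control.
    execution_devices = [
        device for device in distances
        if object_id in permissions.get(device, set())
    ]
    if not execution_devices:
        return None
    # Stage 2: among those, choose the device with the smallest reported distance.
    return min(execution_devices, key=lambda device: distances[device])


distances = {"tv": 1.0, "fridge": 2.5, "lock": 0.5}
permissions = {"tv": {"alice"}, "fridge": {"alice", "bob"}, "lock": {"bob"}}
```

With these hypothetical inputs, `alice` is routed to the television even though the lock is nearer, because she is not permitted to control the lock.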

In an optional embodiment, in a case that the target device is determined to be the first device, controlling the first device to perform the operation indicated by the target instruction based on the first identification information includes: in a case that the first device is determined to be in a sleep state, sending a first control instruction to the first device based on the first identification information, so as to control the first device to perform a wake-up operation and to perform the operation indicated by the target instruction after waking up; and in a case that the first device is determined to be in a non-sleep state, sending a second control instruction to the first device based on the first identification information, so as to control the first device to directly perform the operation indicated by the target instruction. In this embodiment, when the server determines that the target device is the first device and that the first device is in the sleep state, the first control instruction causes the first device to wake up first and then perform the response operation corresponding to the target instruction; when the first device is in the non-sleep state, the second control instruction causes the first device to perform the operation indicated by the target instruction directly. This embodiment thereby achieves the purpose of controlling the target device.
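As a rough illustration of the branching above, the server-side choice between the two control instructions could be sketched as below. The `DeviceState` enum, the dictionary layout, and the `"wake_up"` step name are hypothetical; the embodiment only states that a first or second control instruction is sent depending on the device state.

```python
from enum import Enum
from typing import Dict, List, Union


class DeviceState(Enum):
    SLEEP = "sleep"
    NON_SLEEP = "non_sleep"


def build_control_instruction(
    state: DeviceState, target_instruction: str
) -> Dict[str, Union[str, List[str]]]:
    """Choose the control instruction to send to the first device."""
    if state is DeviceState.SLEEP:
        # First control instruction: wake up, then run the target instruction.
        return {
            "type": "first_control_instruction",
            "steps": ["wake_up", target_instruction],
        }
    # Second control instruction: run the target instruction directly.
    return {
        "type": "second_control_instruction",
        "steps": [target_instruction],
    }
```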

In this embodiment, a device control method is further provided, and fig. 3 is a second flowchart of the device control method according to the embodiment of the present invention, as shown in fig. 3, the flowchart includes the following steps:

step S302, under the condition that a first device receives a target voice, determining first distance information between the first device and a target object which sends the target voice;

step S304, sending a first request message including the target voice and the first distance information to a server, so as to instruct the server to perform the following operations: identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information and second identification information, the first identification information is used for indicating a target instruction contained in the target voice, and the second identification information is used for indicating an identifier of the target object; and determining a target device for executing the target instruction based on the second identification information, the first distance information and other distance information contained in other request information from other devices, and controlling the target device to execute the operation indicated by the target instruction based on the first identification information.

Through the above steps, when the first device receives the target voice, it determines the first distance information between itself and the target object that utters the target voice, and then sends the first request information, which includes the target voice and the first distance information, to the server, so as to instruct the server to perform the following operations: identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information indicating the target instruction contained in the target voice and second identification information indicating the identifier of the target object; determining a target device for executing the target instruction based on the second identification information, the first distance information, and other distance information contained in other request information from other devices; and controlling the target device to execute the operation indicated by the target instruction based on the first identification information. Whether the first device is the target device is thus determined by identifying both the target instruction in the target voice and the target object that utters it, which solves the problem in the related art of insufficient security in controlling device unlocking and improves the security of device unlocking.

The executing subject of the above steps may be a device, such as a smart device or other terminal devices, or a processor with human-computer interaction capability configured on a storage device, or a processing device or a processing unit with similar processing capability, etc., but is not limited thereto. The following description takes the intelligent device to perform the above operations as an example (which is only an exemplary description, and in actual operations, other devices or modules may also perform the above operations):

In the above embodiment, when a first device (for example, smart device A) receives a target voice, it determines first distance information between itself and the target object that utters the target voice. In practical applications, the first device may be provided with a microphone array and may determine the first distance information by processing, according to a predetermined algorithm, the voice signal received by each microphone in the array. The first device then sends first request information including the target voice and the first distance information to the server, so as to instruct the server to perform the following operations: identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information indicating the target instruction contained in the target voice and second identification information indicating the identifier of the target object; determining a target device for executing the target instruction based on the second identification information, the first distance information, and other distance information contained in other request information from other devices; and controlling the target device to execute the operation indicated by the target instruction based on the first identification information. Through this embodiment, whether the first device is the target device is determined by identifying the target instruction in the target voice and the identifier of the target object that utters the target voice, which solves the problem in the related art of insufficient security in controlling device unlocking and improves the security of device unlocking.
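One possible shape for the first request information is sketched below. The field names (`device_id`, `target_voice`, `distance_m`), the JSON encoding, and the base64 transport of the audio bytes are all assumptions for illustration; the embodiment only requires that the request carry the target voice and the first distance information.

```python
import base64
import json
from dataclasses import dataclass


@dataclass
class FirstRequest:
    """Hypothetical payload sent by the first device to the server."""

    device_id: str       # assumed: lets the server tell the requests apart
    target_voice: bytes  # raw audio of the target voice
    distance_m: float    # first distance information, in metres

    def to_json(self) -> str:
        """Serialize the request; audio bytes are base64-encoded for JSON."""
        return json.dumps({
            "device_id": self.device_id,
            "target_voice": base64.b64encode(self.target_voice).decode("ascii"),
            "distance_m": self.distance_m,
        })


request = FirstRequest("smart_device_a", b"\x00\x01\x02", 1.5)
payload = json.loads(request.to_json())
```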

In an optional embodiment, determining first distance information between the first device and the target object that utters the target voice includes: processing, according to a predetermined algorithm, the voice signal received by each microphone in a microphone array included in the first device, so as to determine the first distance information between the first device and the target object. In this embodiment, the first distance information may be determined, for example, by a sound source localization algorithm based on the microphone array.

In an alternative embodiment, the predetermined algorithm comprises at least one of the following methods: a beamforming-based method; a method based on high-resolution spectral estimation; and a method that performs localization based on time difference of arrival. In practical applications, sound source localization algorithms based on a microphone array fall into three categories: first, beamforming-based methods; second, methods based on high-resolution spectral estimation; and third, methods that perform localization based on Time Difference of Arrival (TDOA).
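To make the third category concrete, the following sketch estimates the time difference of arrival between two microphones by brute-force cross-correlation and converts it to an arrival angle. It is an illustrative simplification, not the algorithm of the embodiment: the speed of sound, the two-microphone geometry, and the function names are assumptions, and a single microphone pair yields only a direction; obtaining a distance would require combining several pairs.

```python
import math
from typing import Sequence

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def estimate_delay_samples(
    ref: Sequence[float], other: Sequence[float], max_lag: int
) -> int:
    """Return the lag (in samples) by which `other` trails `ref`.

    The lag maximizing the cross-correlation sum is chosen.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(ref):
            k = n + lag
            if 0 <= k < len(other):
                score += x * other[k]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(
    delay_samples: int, sample_rate: float, mic_spacing: float
) -> float:
    """Convert a delay between two microphones into an arrival angle."""
    tau = delay_samples / sample_rate
    # Clamp to [-1, 1] so measurement noise cannot break asin().
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * tau / mic_spacing))
    return math.degrees(math.asin(s))


ref = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
other = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # same pulse, 2 samples later
lag = estimate_delay_samples(ref, other, max_lag=3)
angle = arrival_angle_deg(lag, sample_rate=16000.0, mic_spacing=0.1)
```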

The beamforming-based method is widely used. Steerable beamforming based on maximum output power performs a weighted summation of the signals collected by each array element to form a beam, steers the beam by searching over the possible positions of the sound source, and adjusts the weights so that the output signal power of the microphone array is maximized. A time shift in the time domain is equivalent to a phase delay in the frequency domain. In the frequency domain, a matrix containing the auto-spectra and cross-spectra, called the Cross-Spectral Matrix (CSM), is computed first; at each frequency of interest, processing the array signals gives the energy level at each point of a given spatial scanning grid or for each signal Direction of Arrival (DOA). The array output therefore represents a sum of responses associated with the sound source distribution. The method is suitable for large microphone arrays and adapts well to different test environments. Fig. 4 is a schematic diagram of a sound source localization algorithm according to an embodiment of the invention. As shown in fig. 4, the beamforming algorithm presupposes a far-field sound source (TDOA is used for a near-field sound source), so the incident sound waves are assumed to be parallel. If the direction of incidence is perpendicular to the plane of the microphones, the sound wave reaches all microphones simultaneously; if it is not perpendicular, the situation shown in fig. 4 occurs: the sound wave reaches each microphone with a delay, and the delay is determined by the incident angle.

Fig. 5 is a schematic diagram of a sound source localization algorithm according to an embodiment of the present invention. As can be seen from fig. 5, the strength of the superimposed waveform differs with the incident angle: for example, at θ = -45 degrees there is almost no signal, at θ = 0 degrees there is a slight signal, and at θ = 45 degrees the signal is strongest. This shows that once individual microphones that have no directivity of their own are assembled into an array, the array as a whole is directional, which leads to the directivity pattern discussed next. Fig. 6 is a schematic diagram of a sound source localization algorithm according to an embodiment of the present invention. As shown in fig. 6, every microphone array is a directional array. Its directivity can be implemented simply with the time-domain Delay & Sum algorithm: controlling different delays steers the array toward different directions. Because the pointing direction is controllable, the array acts as a spatial filter. The localization area can first be divided into a grid; the signal of each microphone is then delayed in the time domain by the delay associated with each grid point, and the delayed signals are summed to calculate the sound pressure at that grid point. The relative sound pressure of every grid point is thus obtained, yielding a hologram for locating the noise source.
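The delay-and-sum scan described above can be illustrated with a small simulation. The sketch below makes several assumptions not stated in the text: a uniform linear array of four microphones spaced 5 cm apart, a single 2 kHz far-field tone, a one-dimensional grid of candidate angles instead of a two-dimensional localization area, and delays applied as phase shifts (using the equivalence between a time shift and a frequency-domain phase delay) rather than as sample shifts.

```python
import cmath
import math
from typing import Iterable, List, Sequence

SPEED_OF_SOUND = 343.0  # m/s, assumed


def mic_delays(angle_deg: float, num_mics: int, spacing: float) -> List[float]:
    """Far-field arrival delay (seconds) at each microphone of a linear array."""
    s = math.sin(math.radians(angle_deg))
    return [m * spacing * s / SPEED_OF_SOUND for m in range(num_mics)]


def delay_and_sum_power(
    observed: Sequence[complex], angle_deg: float, freq: float, spacing: float
) -> float:
    """Undo the delays expected for `angle_deg`, sum the channels, return power."""
    delays = mic_delays(angle_deg, len(observed), spacing)
    total = sum(
        x * cmath.exp(2j * math.pi * freq * tau)
        for x, tau in zip(observed, delays)
    )
    return abs(total / len(observed)) ** 2


def scan(
    observed: Sequence[complex], freq: float, spacing: float, grid: Iterable[int]
) -> int:
    """Return the grid angle whose steered output power is largest."""
    return max(grid, key=lambda a: delay_and_sum_power(observed, a, freq, spacing))


freq, spacing, num_mics = 2000.0, 0.05, 4
# Simulated narrowband observation of a source at 45 degrees.
observed = [
    cmath.exp(-2j * math.pi * freq * tau)
    for tau in mic_delays(45.0, num_mics, spacing)
]
estimated = scan(observed, freq, spacing, range(-90, 91))
```

The spacing is kept below half a wavelength at 2 kHz so that the scan has a single dominant peak; steering toward -45 degrees gives a much weaker output than steering toward the simulated source direction, in line with the behaviour described for fig. 5.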

In an optional embodiment, after sending the first request information including the target voice and the first distance information to the server, the method further includes: receiving a first control instruction sent by the server, wherein the first control instruction is sent by the server in a case that the target device is determined to be the first device; and performing a wake-up operation based on the first control instruction and performing the operation indicated by the target instruction after waking up; or, receiving a second control instruction sent by the server, wherein the second control instruction is sent by the server in a case that the target device is determined to be the first device; and directly performing the operation indicated by the target instruction based on the second control instruction. In this embodiment, after sending the first request information to the server, the first device receives either the first or the second control instruction from the server. For example, in practical applications, when the first device is in a sleep state and the server determines that the first device is the target device, the server sends the first control instruction to the first device; the first device then performs a wake-up operation based on the first control instruction and performs the operation indicated by the target instruction after waking up. Alternatively, when the first device is in an awake state and the server determines that the target device is the first device, the server sends the second control instruction, and the first device directly performs the response operation corresponding to the target instruction based on the second control instruction. This embodiment thereby achieves the purpose of controlling the first device.

It is to be understood that the above-described embodiments are only a few, but not all, embodiments of the present invention.

The present invention will be described in detail with reference to the following examples:

Fig. 7 is a first flowchart of a device control method according to an embodiment of the present invention. As shown in fig. 7, the flow includes the following steps:

S702, the smart device (corresponding to the first device) uses a microphone array to collect a voice control instruction (corresponding to the target voice) input by a user;

S704, the smart device sends the voice signal to a control server (corresponding to the server);

S706, the smart device sends the position information to the control server;

It should be noted that, after collecting the voice control instruction input by the user, the smart device calculates the position information of the user by using a microphone array positioning algorithm, and then sends the position information to the control server;

S708, the control server recognizes the instruction information by using a voice recognition technology;

S710, at the same time, the control server extracts the voiceprint features of the voice by using a voiceprint recognition technology;

S712, the control server determines, based on the voiceprint features of the voice, the identifier of the user (corresponding to the target object) who uttered the voice;

S714, the control server judges, according to the determined user identifier, whether the user has the execution authority;

S716, the control server judges whether to respond to the instruction according to the position information of the user; for example, if the distance between the user and the smart device is less than 2 meters, the control server responds to the instruction;

It should be noted that steps S708, S710, and S716 need not be executed in any particular order, and may or may not be executed simultaneously;

S718, combining the results of steps S708, S714, and S716, the control server determines the response action according to the situation; that is, in a case that the target object (i.e., the user) is determined to have the execution authority and the position information meets the requirement, the control server determines the response action corresponding to the instruction in the voice and returns the corresponding result to the smart device, so as to control the smart device to perform the corresponding operation.

In the above steps, the smart device makes an interactive response according to the result of step S714, provided that the result is that the user has the execution authority; the smart device executes the corresponding action according to the results of steps S708, S714, and S716, provided that the user is determined to have the execution authority and the position information meets the requirement.
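The decision made in step S718 from the results of steps S708, S714, and S716 can be sketched as follows. The function name `decide_response`, the set of authorized users, and the treatment of an unrecognized speaker as `None` are assumptions for illustration; the 2-meter threshold is the example value given in step S716.

```python
from typing import Optional, Set

MAX_RESPONSE_DISTANCE_M = 2.0  # example threshold from step S716


def decide_response(
    instruction: str,
    user_id: Optional[str],
    authorized_users: Set[str],
    distance_m: float,
) -> Optional[str]:
    """Return the instruction to execute, or None if no response is made."""
    # S714: the speaker must be identified and hold the execution authority.
    if user_id is None or user_id not in authorized_users:
        return None
    # S716: the speaker must be close enough to the smart device.
    if distance_m >= MAX_RESPONSE_DISTANCE_M:
        return None
    # S718: both conditions hold, so the recognized instruction is returned.
    return instruction
```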

It should be noted that the control server usually performs voiceprint registration in advance; that is, it collects voice data of the user, calculates the corresponding voiceprint features according to a voiceprint recognition algorithm, and establishes a database to store the voiceprint feature data. Fig. 8 is a second flowchart of a device control method according to an embodiment of the present invention. As shown in fig. 8, the flow includes the following steps:

S802, collecting the voiceprint of each user; in practical applications, the voiceprints of a plurality of users may be collected;

S804, the smart device uploads the voiceprint features of each user to the database of the control server;

S806, setting an unlocking password of the smart device;

S808, the smart device uploads the password to the control server.

Through the above steps, the purposes of registering the users' voiceprints and setting the unlocking password of the smart device are achieved.

Through the above embodiment, the voiceprint recognition technology reduces the number of voice wake-up steps, which improves the user experience, and also increases the security of voice wake-up; the microphone array positioning technology acquires the position information of the user, which reduces the possibility of misoperation.

Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.

In this embodiment, there is also provided a device control apparatus, and fig. 9 is a first block diagram of a structure of the device control apparatus according to the embodiment of the present invention, as shown in fig. 9, the apparatus includes:

a first receiving module 902, configured to receive first request information sent by a first device and other request information sent by other devices, where the first request information includes a target voice received by the first device and first distance information between the first device and a target object that sends the target voice, the first distance information is determined by the first device, the other request information includes the target voice received by the other devices and other distance information between the other devices and the target object, and the other distance information is determined by the other devices;

a recognition module 904, configured to recognize the target voice to obtain a first recognition result, where the first recognition result includes first recognition information and second recognition information, the first recognition information is used to indicate a target instruction included in the target voice, and the second recognition information is used to indicate an identifier of the target object;

a first determining module 906 for determining a target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information;

a control module 908 configured to control the target device to perform an operation indicated by the target instruction based on the first identification information.

In an alternative embodiment, the recognition module 904 comprises: a recognition unit, configured to perform semantic recognition on the target voice to determine the first recognition information; an acquisition unit, configured to acquire the target voiceprint feature of the target voice; a comparison unit, configured to compare the target voiceprint feature with the standard voiceprint features included in a pre-stored standard voiceprint feature library to obtain a comparison result, where the comparison result is used to indicate a similarity between the target voiceprint feature and any one of the standard voiceprint features included in the pre-stored standard voiceprint feature library, and the standard voiceprint feature library further includes an identifier of the object corresponding to each standard voiceprint feature included in the library; and an obtaining unit, configured to obtain the second recognition information based on the comparison result.

In an alternative embodiment, the obtaining unit includes: an obtaining subunit, configured to, when it is determined based on the comparison result that the standard voiceprint feature library includes a standard voiceprint feature whose similarity with the target voiceprint feature exceeds a predetermined threshold, determine an identifier of the object corresponding to the determined standard voiceprint feature as an identifier of the target object, so as to obtain the second identification information.
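The threshold comparison performed by the comparison unit and the obtaining subunit could be sketched as below. Cosine similarity, the threshold value of 0.8, the two-element feature vectors, and the choice of the single best match are assumptions; the embodiment only requires that a standard voiceprint feature whose similarity exceeds a predetermined threshold determines the identifier of the target object.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def identify(
    target: Sequence[float],
    library: Iterable[Tuple[str, Sequence[float]]],
    threshold: float = 0.8,  # hypothetical predetermined threshold
) -> Optional[str]:
    """Return the identifier of the best match above the threshold, else None."""
    best_id, best_score = None, threshold
    for object_id, feature in library:
        score = cosine_similarity(target, feature)
        if score > best_score:
            best_id, best_score = object_id, score
    return best_id


standard_features = [("user_a", [1.0, 0.0]), ("user_b", [0.0, 1.0])]
```

Returning `None` when no feature exceeds the threshold corresponds to the case in which the speaker is not enrolled, so no second identification information is produced.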

In an optional embodiment, the apparatus further comprises: an acquisition module, configured to acquire, before the target voiceprint feature is compared with the standard voiceprint features included in the pre-stored standard voiceprint feature library to obtain the comparison result, standard voiceprint features of a plurality of standard voices and an identifier of the object that utters each standard voice; and a storage module, configured to store the standard voiceprint features and the identifiers in correspondence in the standard voiceprint feature library.

In an alternative embodiment, the first determining module 906 includes: a first determination unit configured to determine, based on the second identification information, an execution device included in the first device and the other device that is permitted to be controlled by the target object; a second determining unit configured to determine the target device closest to the target object from the execution devices based on the first distance information and the other distance information.

In an alternative embodiment, the control module 908 comprises: a first control unit, configured to, in a case that the target device is determined to be the first device and the first device is determined to be in a sleep state, send a first control instruction to the first device based on the first identification information, so as to control the first device to perform a wake-up operation and to perform the operation indicated by the target instruction after waking up; and a second control unit, configured to, in a case that the first device is determined to be in a non-sleep state, send a second control instruction to the first device based on the first identification information, so as to control the first device to directly perform the operation indicated by the target instruction.

In this embodiment, there is further provided a device control apparatus, and fig. 10 is a block diagram of a structure of the device control apparatus according to the embodiment of the present invention, as shown in fig. 10, the apparatus includes:

a second determining module 1002, configured to determine, when a first device receives a target voice, first distance information between the first device and a target object that emits the target voice;

a sending module 1004, configured to send first request information including the target voice and the first distance information to a server, so as to instruct the server to perform the following operations: identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information and second identification information, the first identification information is used for indicating a target instruction contained in the target voice, and the second identification information is used for indicating an identifier of the target object; and determining a target device for executing the target instruction based on the second identification information, the first distance information and other distance information contained in other request information from other devices, and controlling the target device to execute the operation indicated by the target instruction based on the first identification information.

In an alternative embodiment, the second determining module 1002 includes: a third determining unit, configured to calculate, according to a predetermined algorithm, a voice signal received by each microphone in a microphone array included in the first device, so as to determine first distance information between the first device and the target object.

In an alternative embodiment, the predetermined algorithm comprises at least one of the following methods: a beam forming based method; a high resolution spectrum estimation based approach; and a method for positioning based on the time difference.

In an optional embodiment, the apparatus further comprises: a second receiving module, configured to receive, after the first request information including the target voice and the first distance information is sent to the server, a first control instruction sent by the server, where the first control instruction is sent by the server in a case that the target device is determined to be the first device; and an execution module, configured to perform a wake-up operation based on the first control instruction and to perform the operation indicated by the target instruction after waking up; or, the second receiving module is configured to receive a second control instruction sent by the server, where the second control instruction is sent by the server in a case that the target device is determined to be the first device, and the execution module is configured to directly perform the operation indicated by the target instruction based on the second control instruction.

It should be noted that the above modules may be implemented by software or hardware; for the latter, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or, the modules are located in different processors in any combination.

Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.

In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to: various media capable of storing computer programs, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.

In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.

For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary embodiments, and details of this embodiment are not repeated herein.

It will be apparent to those skilled in the art that the various modules or steps of the invention described above may be implemented using a general purpose computing device, they may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented using program code executable by the computing devices, such that they may be stored in a memory device and executed by the computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into various integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A device control method, characterized by comprising:

receiving first request information sent by a first device and other request information sent by other devices, wherein the first request information includes target voice received by the first device and first distance information between the first device and a target object which sends the target voice, the first distance information is determined by the first device, the other request information includes the target voice received by the other devices and other distance information between the other devices and the target object, and the other distance information is determined by the other devices;

identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information and second identification information, the first identification information is used for indicating a target instruction contained in the target voice, and the second identification information is used for indicating an identifier of the target object;

determining a target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information;

and controlling the target device to execute the operation indicated by the target instruction based on the first identification information.

2. The method of claim 1, wherein identifying the target voice to obtain a first identification result comprises:

performing semantic recognition on the target voice to determine the first identification information;

acquiring target voiceprint features of the target voice;

comparing the target voiceprint features with standard voiceprint features included in a pre-stored standard voiceprint feature library to obtain a comparison result, wherein the comparison result is used for indicating the similarity between the target voiceprint features and each standard voiceprint feature included in the standard voiceprint feature library, and the standard voiceprint feature library further includes an identifier of the object corresponding to each standard voiceprint feature;

and obtaining the second identification information based on the comparison result.

3. The method of claim 2, wherein obtaining the second identification information based on the comparison result comprises:

in a case where it is determined, based on the comparison result, that the standard voiceprint feature library includes a standard voiceprint feature whose similarity with the target voiceprint features exceeds a preset threshold, determining the identifier of the object corresponding to that standard voiceprint feature as the identifier of the target object, so as to obtain the second identification information.
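As an illustration only, the threshold comparison of claims 2 and 3 can be sketched as follows. The claims do not prescribe a similarity measure or a feature representation; the use of cosine similarity over plain numeric vectors, the choice of the best match when several entries exceed the threshold, and the names `cosine_similarity`, `identify_speaker`, and the `0.8` threshold are all assumptions made for this sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify_speaker(target_features, library, threshold=0.8):
    """Return the identifier whose standard voiceprint is most similar to the
    target voiceprint, provided the similarity exceeds the threshold.
    Returns None when no library entry exceeds the threshold."""
    best_id, best_sim = None, threshold
    for object_id, standard_features in library.items():
        sim = cosine_similarity(target_features, standard_features)
        if sim > best_sim:
            best_id, best_sim = object_id, sim
    return best_id

# Hypothetical library: identifier -> standard voiceprint features.
library = {"alice": [0.9, 0.1, 0.3], "bob": [0.1, 0.8, 0.5]}
known = identify_speaker([0.88, 0.12, 0.31], library)
unknown = identify_speaker([-0.5, -0.5, 0.1], library)
```

Here `known` resolves to an enrolled identifier, while `unknown` is `None` because no stored voiceprint is similar enough, so no second identification information would be produced.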

4. The method according to claim 2, wherein before comparing the target voiceprint features with standard voiceprint features included in a pre-stored standard voiceprint feature library to obtain a comparison result, the method further comprises:

acquiring standard voiceprint features of a plurality of standard voices and identifiers of the objects that sent the standard voices;

and storing the standard voiceprint features in the standard voiceprint feature library in association with the corresponding identifiers.

5. The method of claim 1, wherein determining a target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information comprises:

determining, based on the second identification information, execution devices, among the first device and the other devices, that the target object is allowed to control;

and determining, from the execution devices, the target device closest to the target object based on the first distance information and the other distance information.
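The two-stage selection of claim 5 (filter by permission, then pick the nearest device) might look like the following minimal sketch. The data layout is an assumption: distances are taken to be in metres and keyed by a hypothetical device name, and `permissions` is a hypothetical mapping from an object identifier to the set of devices that object may control.

```python
def select_target_device(object_id, distances, permissions):
    """Pick the permitted device nearest to the speaker.

    distances:   device name -> reported distance to the speaker (assumed metres)
    permissions: object identifier -> set of devices that object may control
    Returns None when the speaker may not control any reporting device.
    """
    allowed = permissions.get(object_id, set())
    # Stage 1: keep only execution devices the target object may control.
    candidates = {dev: dist for dev, dist in distances.items() if dev in allowed}
    if not candidates:
        return None
    # Stage 2: choose the execution device with the smallest distance.
    return min(candidates, key=candidates.get)

distances = {"tv": 1.2, "fridge": 3.5, "door_lock": 0.8}
permissions = {
    "alice": {"tv", "fridge", "door_lock"},
    "child": {"tv", "fridge"},
}
```

With these hypothetical values, `alice` would reach the door lock, whereas `child`, who is not permitted to control the lock, would be routed to the next-closest permitted device.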

6. The method of claim 1, wherein, in a case that the target device is determined to be the first device, controlling the first device to perform the operation indicated by the target instruction based on the first identification information comprises:

in a case where the first device is determined to be in a dormant state, sending a first control instruction to the first device based on the first identification information, so as to control the first device to perform a wake-up operation and to execute the operation indicated by the target instruction after waking up;

and in a case where the first device is determined to be in a non-dormant state, sending a second control instruction to the first device based on the first identification information, so as to control the first device to directly execute the operation indicated by the target instruction.
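The branching of claim 6 can be sketched as below. The claims do not define a message format, so the dictionary layout, the `DeviceState` enum, and the function name `build_control_instruction` are hypothetical and serve only to show how the two control instructions differ.

```python
from enum import Enum

class DeviceState(Enum):
    DORMANT = "dormant"
    ACTIVE = "active"

def build_control_instruction(state, target_instruction):
    """Build the control instruction the server would send to the target device.

    A dormant device receives the first control instruction (wake up, then
    execute); an active device receives the second (execute directly).
    """
    if state is DeviceState.DORMANT:
        return {"type": "first_control_instruction",
                "steps": ["wake_up", target_instruction]}
    return {"type": "second_control_instruction",
            "steps": [target_instruction]}

to_sleeping = build_control_instruction(DeviceState.DORMANT, "unlock")
to_awake = build_control_instruction(DeviceState.ACTIVE, "unlock")
```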

7. A device control method, characterized by comprising:

in a case where a first device receives a target voice, determining first distance information between the first device and a target object that sends the target voice;

sending first request information containing the target voice and the first distance information to a server to instruct the server to execute the following operations: identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information and second identification information, the first identification information is used for indicating a target instruction contained in the target voice, and the second identification information is used for indicating an identifier of the target object; and determining a target device for executing the target instruction based on the second identification information, the first distance information and other distance information contained in other request information from other devices, and controlling the target device to execute the operation indicated by the target instruction based on the first identification information.

8. The method of claim 7, wherein determining first distance information between the first device and a target object that sends the target voice comprises:

processing, according to a predetermined algorithm, the voice signals received by each microphone in a microphone array included in the first device, so as to determine the first distance information between the first device and the target object.

9. The method of claim 8, wherein the predetermined algorithm comprises at least one of:

a beamforming-based method;

a method based on high-resolution spectral estimation;

and a localization method based on arrival time differences.
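To illustrate the time-difference option only, the sketch below estimates the delay between two microphone signals by brute-force cross-correlation and converts it into a path-length difference. It is a building block under stated assumptions rather than the claimed distance computation: a full distance estimate would combine such differences across several microphones of known geometry, which the claims do not detail. The function names, the speed-of-sound constant of 343 m/s, and the synthetic signals are all assumptions.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C

def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`,
    found by maximizing the cross-correlation over [-max_lag, max_lag]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def path_difference_m(lag_samples, sample_rate_hz):
    """Convert a delay in samples into a path-length difference in metres."""
    return lag_samples / sample_rate_hz * SPEED_OF_SOUND_M_S

# Synthetic example: the same pulse reaches microphone B three samples later.
pulse = [0.0, 1.0, 0.5, -0.3, 0.0]
mic_a = pulse + [0.0] * 8
mic_b = [0.0] * 3 + pulse + [0.0] * 5
lag = estimate_delay_samples(mic_a, mic_b, max_lag=6)
extra_path = path_difference_m(lag, sample_rate_hz=16000)
```

In this synthetic case the sound travels a slightly longer path to microphone B than to microphone A; real signals would need noise handling and a more robust correlation method.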

10. The method of claim 7, wherein after sending the first request information containing the target voice and the first distance information to the server, the method further comprises:

receiving a first control instruction sent by the server, wherein the first control instruction is sent by the server under the condition that the target device is determined to be the first device; executing a wake-up operation based on the first control instruction, and executing an operation indicated by the target instruction after wake-up;

or,

receiving a second control instruction sent by the server, wherein the second control instruction is sent by the server under the condition that the target device is determined to be the first device; and directly executing the operation indicated by the target instruction based on the second control instruction.

11. A device control apparatus, characterized by comprising:

a first receiving module, configured to receive first request information sent by a first device and other request information sent by other devices, where the first request information includes a target voice received by the first device and first distance information between the first device and a target object that sends the target voice, the first distance information is determined by the first device, the other request information includes the target voice received by the other devices and other distance information between the other devices and the target object, and the other distance information is determined by the other devices;

a recognition module, configured to identify the target voice to obtain a first identification result, where the first identification result includes first identification information and second identification information, the first identification information is used to indicate a target instruction included in the target voice, and the second identification information is used to indicate an identifier of the target object;

a first determination module, configured to determine a target device for executing the target instruction based on the second identification information, the first distance information, and the other distance information;

and a control module, configured to control the target device to execute the operation indicated by the target instruction based on the first identification information.

12. A device control apparatus, characterized by comprising:

a second determining module, configured to determine, in a case where a first device receives a target voice, first distance information between the first device and a target object that sends the target voice;

a sending module, configured to send first request information including the target voice and the first distance information to a server, so as to instruct the server to perform the following operations: identifying the target voice to obtain a first identification result, wherein the first identification result comprises first identification information and second identification information, the first identification information is used for indicating a target instruction contained in the target voice, and the second identification information is used for indicating an identifier of the target object; and determining a target device for executing the target instruction based on the second identification information, the first distance information and other distance information contained in other request information from other devices, and controlling the target device to execute the operation indicated by the target instruction based on the first identification information.

13. A computer-readable storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6 or any one of claims 7 to 10.

14. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the steps of the method according to any one of claims 1 to 6 or any one of claims 7 to 10 are implemented when the computer program is executed by the processor.