CN117201526A

CN117201526A - Equipment control method and electronic equipment

Info

Publication number: CN117201526A
Application number: CN202210621685.6A
Authority: CN
Inventors: 蒋宇波
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2022-06-01
Filing date: 2022-06-01
Publication date: 2023-12-08
Also published as: WO2023231963A1

Abstract

The application provides a device control method and an electronic device. The method is applied to a first electronic device and comprises: establishing a communication connection with a second electronic device; receiving first audio from the second electronic device, the first audio being audio of a first user; acquiring a first audio feature and first user information according to the first audio, wherein the first audio feature does not contain information for identifying the first user, and the first user information is used for identifying the first user; sending a first request including the first audio feature and the first user information to a server; acquiring a first response returned by the server, wherein the first response indicates first content determined by the server according to the first audio feature and the first user information; and sending a first message including the first content to the second electronic device, so that the second electronic device plays the first content according to the first message. This scheme achieves voice control of an electronic device without increasing hardware cost on the electronic device side, while reducing the risk of leaking user privacy.

Description

Equipment control method and electronic equipment

Technical Field

The present application relates to the field of electronic devices, and in particular, to a device control method and an electronic device.

Background

Currently, with the development of Internet of Things technology, more and more Internet of Things devices are present in people's daily lives, providing an Internet of Everything service experience. Most Internet of Things devices, such as smart speakers and smart air conditioners, can work under voice control by the user.

When processing a user voice request, most Internet of Things devices can recognize only a limited set of keywords, for example voice commands for basic functions such as turning the device on and off. They cannot perform more complex voice recognition, so their application scenarios are limited and the user experience is poor. To recognize complex voice, an Internet of Things device has to rely on a cloud server. Specifically, after receiving a user voice request, the Internet of Things device may upload the user's original audio data to the cloud server, which performs voice recognition. The cloud server then indicates the recognition result to the Internet of Things device, and the device completes the corresponding task according to that result, so that the user controls the device by voice. Although this scheme can recognize and respond to more complex user voice, it requires the Internet of Things device to exchange data with the cloud server, including uploading original audio data and receiving voice recognition results. The device therefore needs a high-performance communication component, and the scheme also places higher demands on other hardware such as storage and power supply, which raises the deployment cost and difficulty on the device side. In addition, uploading the user's original audio data to the cloud server creates a risk of leaking user privacy.

Disclosure of Invention

The application provides a device control method and an electronic device, which are used to achieve voice control of an electronic device without increasing hardware cost on the electronic device side, and to reduce the risk of leaking user privacy.

In a first aspect, the present application provides a device control method, the method comprising: a first electronic device establishes a communication connection with a second electronic device; the first electronic device receives first audio from the second electronic device, wherein the first audio is audio of a first user; the first electronic device acquires a first audio feature and first user information according to the first audio, wherein the first audio feature does not include information for identifying the first user, and the first user information is used for identifying the first user; the first electronic device sends a first request to a server, wherein the first request includes the first audio feature and the first user information; the first electronic device obtains a first response returned by the server, wherein the first response indicates first content determined by the server according to the first audio feature and the first user information; the first electronic device sends a first message including the first content to the second electronic device, so that the second electronic device plays the first content according to the first message.

In this method, the first electronic device obtains, from the user audio sent by the second electronic device, the audio feature of that audio and the corresponding user information, sends both to the server, receives the content that the server returns in response to the user audio, and passes that content to the second electronic device for playing. The user thereby controls the second electronic device by voice. Because the first electronic device receives the user audio from the second electronic device, the second electronic device only needs to collect and transmit the user audio; its hardware cost does not increase and its computing power requirement stays low. The first electronic device processes the user audio into the audio feature and the user information before sending them to the server, and the audio feature omits the information in the user audio that identifies the user. Neither the original user audio nor the privacy-related information it contains is transmitted to the server, which reduces the risk of leaking user privacy. In sum, the method achieves voice control of the second electronic device without increasing hardware cost on the second electronic device side, and reduces the risk of user privacy leakage during the control process.

The second electronic device may be an Internet of Things device, and the first electronic device may be a local, mutually trusted device of that Internet of Things device. Applying the scheme therefore achieves voice control of the Internet of Things device without increasing hardware cost on the Internet of Things device side, while reducing the risk of leaking user privacy.
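The first-aspect flow can be sketched in Python roughly as follows. This is only an illustrative sketch, not the implementation described by the application: the names FirstRequest and handle_first_audio, the callable parameters, and the representation of an audio feature as a tuple of strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class FirstRequest:
    audio_feature: Tuple[str, ...]  # carries the instruction, not the speaker identity
    user_info: str                  # identifies the first user


def handle_first_audio(
    first_audio: bytes,
    extract_feature: Callable[[bytes], Tuple[str, ...]],
    identify_user: Callable[[bytes], str],
    send_to_server: Callable[[FirstRequest], str],
    send_to_second_device: Callable[[str], None],
) -> FirstRequest:
    """Run one pass of the first-aspect flow on the first electronic device."""
    # Derive the audio feature and user information locally; the raw audio is not forwarded.
    request = FirstRequest(extract_feature(first_audio), identify_user(first_audio))
    # The first response indicates the first content determined by the server.
    first_content = send_to_server(request)
    # The first message carries the first content to the second device for playback.
    send_to_second_device(first_content)
    return request


# Stub collaborators, all hypothetical, wired together for illustration.
played: List[str] = []
request = handle_first_audio(
    b"raw-audio-bytes",
    extract_feature=lambda audio: ("play", "music"),
    identify_user=lambda audio: "user-1",
    send_to_server=lambda req: f"playlist for {req.user_info}",
    send_to_second_device=played.append,
)
```

In this sketch the request object has no field for the raw audio, which mirrors the point that only the audio feature and the user information leave the first electronic device.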

In one possible design, after the first electronic device establishes a communication connection with the second electronic device, the method further includes: the first electronic device receives second audio from the second electronic device, wherein the second audio is audio of a second user; the first electronic device acquires a second audio feature and second user information according to the second audio, wherein the second audio feature is the same as the first audio feature, and the second user information is used for identifying the second user; the first electronic device sends a second request to the server, wherein the second request includes the second audio feature and the second user information; the first electronic device obtains a second response returned by the server, wherein the second response indicates second content determined by the server according to the second audio feature and the second user information; the first electronic device sends a second message including the second content to the second electronic device, so that the second electronic device plays the second content according to the second message.

In the method, because the information related to the user privacy in the original user audio is removed from the audio features, the audio features extracted by the first electronic device from the audio corresponding to the first user and the second user can be identical, and when the audio features are identical, it can be determined that the user instructions corresponding to the audio sent by the first user and the second user are identical. Therefore, the first electronic device can avoid leakage of the user privacy information by transmitting the audio feature to the server. Meanwhile, the information related to the user instruction can be sent to the server through the audio features, so that the server can determine the user instruction according to the audio features, and smooth execution of the flow of controlling the second electronic equipment is guaranteed.

In one possible design, the first content is associated with the first user, the second content is associated with the second user, and the second content is different from the first content.

In this method, when the audio from different users corresponds to the same user instruction, the server generates different content in response for each user, and the content it generates is associated with that user. The method can therefore respond to the instructions of different users in a personalized way and respond to each user's voice control more accurately, which improves the user experience.
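The personalized response described above can be illustrated with a small Python sketch of the server side. The application does not specify how the server selects content; the PREFERENCES table, the determine_content function, and the use of a string label as the audio feature are assumptions made purely for illustration.

```python
# Hypothetical per-user preferences held by the server.
PREFERENCES = {"user-1": "jazz", "user-2": "rock"}


def determine_content(audio_feature: str, user_info: str) -> str:
    """Pick content from the instruction (audio feature) and the user identity."""
    if audio_feature == "play_music":
        # Same instruction, different user: the preference lookup makes the result differ.
        genre = PREFERENCES.get(user_info, "popular")
        return f"{genre} playlist"
    return "unsupported instruction"


# Two users issue the same instruction, so the audio features are identical.
first_content = determine_content("play_music", "user-1")
second_content = determine_content("play_music", "user-2")
```

Here the audio feature alone fixes the instruction, and the user information alone drives the personalization, so the two pieces of the request play separate roles.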

In one possible design, the first content is audio content or text content.

In one possible design, the first electronic device establishing a communication connection with the second electronic device includes: the first electronic device receives a connection request sent by the second electronic device, wherein the connection request is used for requesting to establish a communication connection with the first electronic device; the first electronic device displays first prompt information; in response to a received first operation, the first electronic device sends notification information to the second electronic device, wherein the notification information is used for notifying establishment of the communication connection.

In the method, the first operation may be an operation performed by a user, and the first electronic device may display a prompt message when the second electronic device requests to establish a communication connection, and determine to establish a communication connection with the second electronic device according to the operation performed by the user. The method enables the user to flexibly control the communication connection between the first electronic equipment and the second electronic equipment, and is convenient for the user to know the connection relationship between the first electronic equipment and the second electronic equipment in time, so that the control method has higher flexibility and practicability, and the use experience of the user can be improved.
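A minimal Python sketch of this connection step follows. The function name, the callable parameters, and the message strings are hypothetical, and the behavior when the user declines (no notification is sent) is an assumption, since the text only describes the case in which the first operation is received.

```python
from typing import Callable, List


def handle_connection_request(
    device_name: str,
    show_prompt: Callable[[str], None],
    wait_for_first_operation: Callable[[], bool],
    notify_second_device: Callable[[str], None],
) -> bool:
    """Prompt the user about a connection request and notify the requester on approval."""
    # Display the first prompt information on the first electronic device.
    show_prompt(f"{device_name} requests a connection. Allow?")
    # The first operation is modeled as a yes/no answer from the user (an assumption).
    if not wait_for_first_operation():
        return False
    # Notification information tells the second device the connection is established.
    notify_second_device("connection established")
    return True


prompts: List[str] = []
notifications: List[str] = []
accepted = handle_connection_request(
    "Smart speaker", prompts.append, lambda: True, notifications.append
)
```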

In one possible design, the first electronic device acquiring a first audio feature and first user information according to the first audio includes: the first electronic device extracts the first audio feature from the first audio; the first electronic device extracts a first voiceprint from the first audio; the first electronic device acquires a first user identifier associated with the first voiceprint and takes the first user identifier as the first user information.

In this method, the electronic device extracts the audio feature directly from the user audio, which avoids transmitting the original user audio to the server and reduces the risk of leaking user privacy. The electronic device can also extract, from the user audio, a voiceprint capable of identifying the user, and determine the user information from that voiceprint. The user information can then be sent to the server so that the server determines which user issued the instruction. On the one hand, this avoids transmitting the private voiceprint information in the user audio directly between the second electronic device and the server, reducing the risk of leaking user privacy. On the other hand, the server can respond to the user in a personalized way according to the user information, which improves the voice control effect.

In one possible design, after the first electronic device extracts a first voiceprint from the first audio, before the first electronic device obtains a first user identification associated with the first voiceprint, the method further includes: the first electronic device obtains a voiceprint set corresponding to a first account from the server and at least one user identifier corresponding to at least one voiceprint contained in the voiceprint set; wherein the voiceprint set comprises the first voiceprint; the first electronic device obtaining a first user identifier associated with the first voiceprint includes: the first electronic device selects the first user identifier corresponding to the first voiceprint from the at least one user identifier.

In the method, one account number can be associated with a plurality of user identifications, and each user identification can be associated with one voiceprint. Therefore, a plurality of user identifications and corresponding voiceprints can be managed uniformly based on the account number, and subsequent use is facilitated. The first electronic device may acquire a voiceprint set from the server, determine user information corresponding to the received audio according to the voiceprint set, and send the user information to the server for subsequent processing by the server. Based on the method, the first electronic equipment can support processing of the audio instructions of different users, so that the second electronic equipment can respond to the audio instructions of different users, and therefore the scheme is wide and flexible in application scene and high in practicability.
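The selection of a user identifier from the voiceprint set can be sketched as below. The application does not state how voiceprints are compared; representing a voiceprint as a numeric vector, comparing by cosine similarity, and the 0.8 threshold are all assumptions chosen for illustration, as are the function names.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_user_id(
    voiceprint: Sequence[float],
    voiceprint_set: Dict[str, Sequence[float]],
    threshold: float = 0.8,
) -> Optional[str]:
    """Return the user identifier whose enrolled voiceprint best matches, or None."""
    best_id, best_score = None, threshold
    for user_id, enrolled in voiceprint_set.items():
        score = cosine_similarity(voiceprint, enrolled)
        if score >= best_score:
            best_id, best_score = user_id, score
    return best_id


# Hypothetical voiceprint set for the first account, as obtained from the server.
account_voiceprints = {"user-1": [1.0, 0.0, 0.0], "user-2": [0.0, 1.0, 0.0]}
matched = select_user_id([0.9, 0.1, 0.0], account_voiceprints)
unmatched = select_user_id([0.0, 0.0, 1.0], account_voiceprints)
```

Returning None for an unmatched voiceprint is one possible design choice; it leaves room for the caller to fall back to a registration flow or a non-personalized response.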

In one possible design, the first response includes information indicating a storage location of the first content; after the first electronic device obtains the first response returned by the server, before the first electronic device sends a first message including the first content to the second electronic device, the method further includes: the first electronic device determines the storage position of the first content according to the first response; the first electronic device obtains the first content from a storage location of the first content.

In the method, the first electronic device can acquire the content for responding to the user audio according to the storage position indicated by the server and send the content to the second electronic device, so that the second electronic device does not need to acquire the content, and the implementation cost and the corresponding hardware cost of the second electronic device are not increased.
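As a rough Python sketch of this design, the first electronic device can resolve the storage location carried in the first response and place the fetched content in the first message. The field names content_location and first_content, the build_first_message function, and the example location string are hypothetical.

```python
from typing import Callable, Dict


def build_first_message(
    first_response: Dict[str, str],
    fetch: Callable[[str], bytes],
) -> Dict[str, bytes]:
    """Fetch the first content from its storage location and wrap it in a first message."""
    # The first response indicates where the first content is stored.
    location = first_response["content_location"]
    # The first electronic device, not the second one, retrieves the content.
    first_content = fetch(location)
    return {"first_content": first_content}


# An in-memory dictionary stands in for the real storage (an assumption).
storage = {"store://songs/42": b"audio-bytes"}
first_message = build_first_message(
    {"content_location": "store://songs/42"}, storage.__getitem__
)
```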

In one possible design, after the first electronic device establishes a communication connection with the second electronic device, the method further includes: the first electronic device receives third audio from the second electronic device, wherein the third audio is audio of a third user; the first electronic device obtains a second voiceprint according to the third audio; the first electronic device sends a third request to the server, wherein the third request includes the second voiceprint; the first electronic device acquires third user information returned by the server, wherein the third user information is used for identifying the third user, and the third user information is associated with the second voiceprint; the first electronic device stores the third user information and the second voiceprint.

In this method, the first electronic device can assist the second electronic device in registering a new user and stores the registered user information. This avoids extra processing cost on the second electronic device side and makes it convenient to identify the user later according to the registered user information.

In one possible design, the third audio is used to register, on the server, the second voiceprint and the third user information corresponding to the second voiceprint.

In this method, the first electronic device can carry out user registration directly from the audio in which the user instructs registration. The user completes registration simply by issuing a voice registration instruction, without separately recording additional audio, so registration is more efficient.
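The registration flow can be sketched in Python as follows. FakeServer and FirstDevice are hypothetical stand-ins, the identifier format is invented for the example, and the voiceprint extractor is a placeholder rather than a real voiceprint algorithm.

```python
from typing import Callable, Dict, Tuple

Voiceprint = Tuple[int, ...]


class FakeServer:
    """Stand-in for the server: registers a voiceprint and returns user information."""

    def __init__(self) -> None:
        self.users: Dict[str, Voiceprint] = {}

    def register(self, voiceprint: Voiceprint) -> str:
        user_info = f"user-{len(self.users) + 1}"  # hypothetical identifier format
        self.users[user_info] = voiceprint
        return user_info


class FirstDevice:
    def __init__(self, server: FakeServer, extract_voiceprint: Callable[[bytes], Voiceprint]) -> None:
        self.server = server
        self.extract_voiceprint = extract_voiceprint
        self.local_users: Dict[str, Voiceprint] = {}

    def register_user(self, third_audio: bytes) -> str:
        # Obtain the second voiceprint from the third audio.
        voiceprint = self.extract_voiceprint(third_audio)
        # The third request carries the voiceprint; the server returns third user information.
        user_info = self.server.register(voiceprint)
        # Store the user information together with the voiceprint for later identification.
        self.local_users[user_info] = voiceprint
        return user_info


server = FakeServer()
# Placeholder extractor: the first four byte values stand in for a voiceprint.
device = FirstDevice(server, extract_voiceprint=lambda audio: tuple(audio[:4]))
third_user_info = device.register_user(b"register me")
```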

In one possible design, before the first electronic device receives the first audio from the second electronic device, the method further includes: the first electronic device logs in to a first account; alternatively, before the first electronic device receives the third audio from the second electronic device, the method further includes: the first electronic device logs in to a first account, wherein the second voiceprint is associated with the first account.

In this method, the first electronic device logs in to the first account before controlling the second electronic device. It can then obtain the users' voiceprints according to the first account and identify each user by voiceprint, so that it can respond to the user's audio in a personalized way.

In one possible design, the second electronic device is an internet of things device.

With the method provided by the application, voice control of the second electronic device only requires that the second electronic device support collecting and reporting user audio. The method therefore achieves voice control of Internet of Things devices while fitting their characteristics, and avoids increasing their configuration cost and computing burden.

In a second aspect, the present application provides a device control method, the method comprising: a server receives a first request from a first electronic device, wherein the first request includes a first audio feature and first user information, the first audio feature and the first user information are acquired by the first electronic device according to first audio, the first audio is audio of a first user, the first audio feature does not contain information for identifying the first user, and the first user information is used for identifying the first user; the server sends a first response to the first electronic device, wherein the first response indicates first content determined by the server according to the first audio feature and the first user information.

In one possible design, the method further comprises: the server receives a second request from the first electronic device, wherein the second request comprises a second audio feature and second user information, the second audio feature is acquired by the first electronic device according to second audio, the second audio is of a second user, the second audio feature is the same as the first audio feature, and the second user information is used for identifying the second user; the server sends a second response to the first electronic device, wherein the second response is used for indicating second content determined by the server according to the second audio feature and the second user information.

In one possible design, the first content is audio content or text content.

In one possible design, the method further comprises: the server sends a voiceprint set corresponding to a first account and at least one user identifier corresponding to at least one voiceprint contained in the voiceprint set to the first electronic device; wherein the voiceprint set comprises the first voiceprint.

In one possible design, the first response includes information indicating a storage location of the first content.

In one possible design, the method further comprises: the server receives a third request from the first electronic device, wherein the third request includes a second voiceprint determined by the first electronic device according to third audio, and the third audio is audio of a third user; the server sends third user information to the first electronic device, wherein the third user information is used to identify the third user, and the third user information is associated with the second voiceprint.

In a third aspect, the present application provides an electronic device comprising a memory and one or more processors; wherein the memory is for storing computer program code, the computer program code comprising computer instructions; the computer instructions, when executed by the one or more processors, cause the electronic device to perform the method described by the first aspect or any one of the possible designs of the first aspect, or to perform the method described by the second aspect or any one of the possible designs of the second aspect.

In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when run on a computer, causes the computer to perform the method described by the first aspect or any one of the possible designs of the first aspect, or to perform the method described by the second aspect or any one of the possible designs of the second aspect.

In a fifth aspect, the application provides a computer program product comprising a computer program or instructions which, when run on a computer, cause the computer to perform the method described by the first aspect or any one of the possible designs of the first aspect, or to perform the method described by the second aspect or any one of the possible designs of the second aspect.

The beneficial effects of the second aspect to the fifth aspect are described with reference to the beneficial effects of the first aspect, and the detailed description is not repeated here.

Drawings

Fig. 1 is a schematic diagram of a hardware architecture of an electronic device according to an embodiment of the present application;

Fig. 2 is a schematic software architecture diagram of an electronic device according to an embodiment of the present application;

Fig. 3 is a schematic architecture diagram of a device control system according to an embodiment of the present application;

Fig. 4 is a schematic diagram of a device control method according to an embodiment of the present application;

Fig. 5 is a schematic diagram of a prompt interface according to an embodiment of the present application;

Fig. 6 is a flowchart of a method for establishing a communication connection according to an embodiment of the present application;

Fig. 7 is a flowchart of a method for registering a user identifier according to an embodiment of the present application;

Fig. 8 is a schematic diagram of a registration interface according to an embodiment of the present application;

Fig. 9 is a schematic diagram of a device control method according to an embodiment of the present application;

Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings. In the description of the embodiments of the application, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features.

For ease of understanding, a description of concepts related to the application is given by way of example for reference.

The electronic device may be a device having a wireless connection function. In some embodiments of the application, the electronic device may be a device having one or more display screens.

The electronic device in some embodiments of the present application may be a portable device such as a cell phone, a tablet computer, a wearable device with wireless communication capability (e.g., a watch, bracelet, helmet, headset, etc.), a vehicle-mounted terminal device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), etc. The electronic device may also be a smart home device (e.g., a smart television, smart speaker, etc.), a smart car, a smart robot, a workshop device, a wireless terminal in self driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a flying device (e.g., a smart robot, hot air balloon, drone, aircraft), etc.

In some embodiments of the application, the electronic device may also be a portable terminal device that also contains other functions, such as personal digital assistant and/or music player functions. Exemplary embodiments of portable terminal devices include, but are not limited to, portable terminal devices running … or another operating system. The above-described portable terminal device may also be another portable terminal device, such as a laptop computer having a touch-sensitive surface (e.g., a touch panel). It should also be appreciated that in other embodiments of the present application, the electronic device described above may be a desktop computer having a touch-sensitive surface (e.g., a touch panel) instead of a portable terminal device.

It should be understood that in embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate: A alone, both A and B, and B alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of these items, including any combination of single items or plural items. For example, at least one of a, b or c may represent: a, b, c, a and b, a and c, b and c, or a, b and c, where a, b and c may each be single or multiple.
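The enumeration for "at least one of a, b or c" can be reproduced with a short Python sketch that lists every non-empty selection of the three items; the variable names are chosen only for this illustration.

```python
from itertools import combinations

items = ["a", "b", "c"]

# Every non-empty selection of the items, from single items up to all three together.
selections = [
    combo
    for size in range(1, len(items) + 1)
    for combo in combinations(items, size)
]
```

The resulting list has seven entries, in the same order as the cases listed in the text.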

Current Internet of Things devices are generally cost-sensitive, with weak computing power and limited resources, so they can support voice recognition of only simple or specific keywords, and recognition of complex voice relies on a cloud server. Relying on a cloud server in turn places higher requirements on the Internet of Things device in terms of network connection, storage, power supply and the like, which increases deployment cost and difficulty and limits practicability.

Take a smart speaker as an example. Recommending personalized audio after receiving a user voice request is a common smart speaker scenario. Because of the cost and power consumption constraints of Internet of Things devices, current smart speakers (and similar devices) mostly support voice recognition only for keywords tied to basic functions such as waking up, turning on and turning off. For music recommendation based on a user voice request, one scheme currently in use is as follows: after receiving the user voice, the smart speaker packages the corresponding original audio data and sends it to the cloud server, or sends it to a terminal device on which the application corresponding to the smart speaker is installed, and the terminal device forwards the audio data to the cloud server. The cloud server identifies the user's intention from the original audio data and generates a recommendation result, and then indicates the result either directly to the Internet of Things device or to the terminal device, which relays it to the Internet of Things device. Once the Internet of Things device obtains the recommendation result, it executes the specific recommendation task accordingly. This scheme requires the smart speaker to have a high-performance communication component such as a wireless fidelity (Wi-Fi) chip, and also places requirements on storage, power supply and other hardware, which raises cost. In addition, uploading the original voice data to the cloud creates a risk of leaking user privacy.

For music recommendation based on a user voice request, another scheme currently available is to complete the whole flow on the Internet of Things device itself, including receiving the user voice request and recognizing and responding to it. Because the whole flow runs on a single device, this scheme places high demands on the computing power and resources of the Internet of Things device, with correspondingly high hardware requirements and implementation cost.

In view of the above problems, the embodiment of the application provides a device control method and electronic device, which are used for realizing the effect of voice control of internet of things devices on the premise of not increasing the hardware requirements of the internet of things devices, reducing the risk of privacy disclosure of users and improving the use experience of the users.

Referring first to fig. 1, a description will be given of a structure of an electronic device to which the method provided by the embodiment of the present application is applicable.

As shown in fig. 1, the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a USB interface 130, a charge management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display 194, a SIM card interface 195, and the like.

The sensor module 180 may include a gyroscope sensor, an acceleration sensor, a proximity light sensor, a fingerprint sensor, a touch sensor, a temperature sensor, a pressure sensor, a distance sensor, a magnetic sensor, an ambient light sensor, a barometric pressure sensor, a bone conduction sensor, and the like.

It will be appreciated that the electronic device 100 shown in fig. 1 is merely an example and is not limiting of the electronic device, and that the electronic device may have more or fewer components than shown in the figures, may combine two or more components, or may have different configurations of components. The various components shown in fig. 1 may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.

The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). The different processing units may be separate devices or may be integrated in one or more processors. The controller may be the nerve center and command center of the electronic device 100. The controller can generate operation control signals according to instruction operation codes and timing signals to control instruction fetching and instruction execution.

A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving the efficiency of the system.

The execution of the device control method provided by the embodiment of the present application may be completed by the processor 110 controlling or calling other components, for example, calling the processing program of the embodiment of the present application stored in the internal memory 121, or calling the processing program of the embodiment of the present application stored in the third party device through the external memory interface 120, to control the wireless communication module 160 to perform data communication with other devices, thereby improving the intelligence and convenience of the electronic device 100 and improving the user experience. The processor 110 may include different devices, for example, when the CPU and the GPU are integrated, the CPU and the GPU may cooperate to execute the device control method provided by the embodiment of the present application, for example, a part of algorithms in the device control method are executed by the CPU, and another part of algorithms are executed by the GPU, so as to obtain a faster processing efficiency.

The display screen 194 is used to display images, videos, and the like. The display screen 194 includes a display panel. The display panel may employ a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, N being a positive integer greater than 1. The display screen 194 may be used to display information entered by or provided to a user as well as various graphical user interfaces (GUI). For example, the display screen 194 may display photographs, videos, web pages, or files.

In the embodiment of the present application, the display 194 may be an integral flexible display, or a tiled display composed of two rigid screens and a flexible screen located between the two rigid screens may be used.

The camera 193 (a front camera or a rear camera, or one camera that serves as both) is used to capture still images or video. In general, the camera 193 may include a lens group and a photosensitive element such as an image sensor. The lens group includes a plurality of lenses (convex lenses or concave lenses) for collecting the optical signals reflected by the object to be photographed and transmitting the collected optical signals to the image sensor. The image sensor generates an original image of the object to be photographed according to the optical signals.

The internal memory 121 may be used to store computer executable program code including instructions. The processor 110 executes various functional applications of the electronic device 100 and data processing by executing instructions stored in the internal memory 121. The internal memory 121 may include a storage program area and a storage data area. The storage program area may store codes of an operating system, an application program (such as an air conditioner control function, etc.), and the like. The storage data area may store data created during use of the electronic device 100, etc.

The internal memory 121 may also store one or more computer programs corresponding to the device control method provided in the embodiment of the present application. The one or more computer programs are stored in the internal memory 121 and configured to be executed by the one or more processors 110, the one or more computer programs including instructions that can be used to perform the various steps in the following embodiments.

In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and the like.

Of course, the code of the device control method provided by the embodiment of the application can also be stored in an external memory. In this case, the processor 110 may run the code stored in the external memory through the external memory interface 120.

The sensor module 180 may include a gyro sensor, an acceleration sensor, a proximity light sensor, a fingerprint sensor, a touch sensor, and the like.

The touch sensor is also known as a "touch panel". The touch sensor may be disposed on the display screen 194, and the touch sensor and the display screen 194 together form a touch display screen, which is also referred to as a "touch screen". The touch sensor is used to detect a touch operation acting on or near it. The touch sensor may pass the detected touch operation to the application processor to determine the touch event type. Visual output related to the touch operation may be provided through the display screen 194. In other embodiments, the touch sensor may also be disposed on a surface of the electronic device 100 at a location different from that of the display screen 194.

Illustratively, the display 194 of the electronic device 100 displays a main interface that includes icons of a plurality of applications (e.g., a camera application, a WeChat application, etc.). The user taps the icon of the camera application in the main interface; the touch sensor detects the operation and triggers the processor 110 to launch the camera application and open the camera 193. The display 194 then displays an interface of the camera application, such as a viewfinder interface.

The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.

The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover a single or multiple communication bands. Different antennas may also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed into a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.

The mobile communication module 150 may provide a solution for wireless communication including 2G/3G/4G/5G, etc., applied to the electronic device 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA), etc. The mobile communication module 150 may receive electromagnetic waves from the antenna 1, perform processes such as filtering, amplifying, and the like on the received electromagnetic waves, and transmit the processed electromagnetic waves to the modem processor for demodulation. The mobile communication module 150 can amplify the signal modulated by the modem processor, and convert the signal into electromagnetic waves through the antenna 1 to radiate. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be provided in the same device as at least some of the modules of the processor 110. In an embodiment of the present application, the mobile communication module 150 may also be used for information interaction with other devices.

The modem processor may include a modulator and a demodulator. The modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low frequency baseband signal to the baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs sound signals through audio means (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or video through the display screen 194. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be provided in the same device as the mobile communication module 150 or other functional module, independent of the processor 110.

The wireless communication module 160 may provide solutions for wireless communication applied to the electronic device 100, including wireless local area network (WLAN) (e.g., a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, and the like. The wireless communication module 160 may be one or more devices that integrate at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, frequency-modulate and amplify it, and convert it into electromagnetic waves for radiation via the antenna 2. In the embodiment of the present application, the wireless communication module 160 is configured to establish a connection with other electronic devices for data interaction. Alternatively, the wireless communication module 160 may be configured to access an access point device, send control instructions to other electronic devices, or receive data sent from other electronic devices.

In addition, the electronic device 100 may implement audio functions, such as music playing and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, the application processor, and the like. The electronic device 100 may receive input from the key 190 and generate key signal inputs related to user settings and function control of the electronic device 100. The electronic device 100 may generate a vibration alert (such as an incoming call vibration alert) using the motor 191. The indicator 192 in the electronic device 100 may be an indicator light, which may be used to indicate a state of charge or a change in power, or to indicate a message, a missed call, a notification, and the like. The SIM card interface 195 in the electronic device 100 is used to connect a SIM card. The SIM card may be inserted into or removed from the SIM card interface 195 to make contact with or separate from the electronic device 100.

It should be understood that in actual practice, electronic device 100 may include more or fewer components than those shown in FIG. 1, and embodiments of the present application are not limited. The illustrated electronic device 100 is only one example, and the electronic device 100 may have more or fewer components than shown in the figures, may combine two or more components, or may have a different configuration of components. The various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.

The software system of the electronic device 100 may employ a layered architecture, an event driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. In the embodiment of the application, an Android system with a layered architecture is taken as an example, and the software structure of the electronic equipment is illustrated.

The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. As shown in fig. 2, the software architecture can be divided into four layers, which are, from top to bottom, an application layer, an application framework layer (FWK), the Android runtime and system libraries, and a Linux kernel layer.

The application layer is the top layer of the operating system, including native applications of the operating system, such as cameras, gallery, calendar, bluetooth, music, video, information, and so forth. An application program according to an embodiment of the present application is simply referred to as an application, and is a software program capable of implementing one or more specific functions. Typically, a plurality of applications may be installed in an electronic device. Such as camera applications, mailbox applications, smart home control applications, and the like. The application mentioned below may be a system application installed when the electronic device leaves the factory, or may be a third party application downloaded from a network or acquired from other electronic devices by a user during the process of using the electronic device.

Of course, for a developer, the developer may write an application and install it to that layer. In one possible implementation, the application may be developed using Java language, by calling an application programming interface (Application Programming Interface, API) provided by the application framework layer, through which a developer may interact with the underlying layers of the operating system (e.g., kernel layer, etc.) to develop his own application.

The application framework layer provides an application programming interface (application programming interface, API) and programming framework for application programs of the application layer. The application framework layer may include some predefined functions. The application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, a notification manager, and the like.

The window manager is used for managing window programs. The window manager can obtain the size of the display screen, determine whether a status bar exists, lock the screen, capture screenshots, and the like.

The content provider is used to store and retrieve data and make such data accessible to applications. The data may include information such as files (e.g., documents, video, images, audio), text, etc.

The view system includes visual controls, such as controls that display text, pictures, documents, and the like. The view system may be used to build applications. The interface in the display window may be composed of one or more views. For example, a display interface including a text message notification icon may include a view displaying text and a view displaying a picture.

The telephony manager is for providing communication functions of the electronic device. The notification manager allows the application to display notification information in a status bar, can be used to communicate notification type messages, can automatically disappear after a short dwell, and does not require user interaction.

The Android runtime includes a core library and a virtual machine. The Android runtime is responsible for scheduling and managing the Android system.

The core library comprises two parts: one part is the functions that the Java language needs to call, and the other part is the core library of the Android system. The application layer and the application framework layer run in the virtual machine. Taking Java as an example, the virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.

The system library may include a plurality of functional modules, for example: a surface manager, a media library, a three-dimensional graphics processing library (e.g., OpenGL ES), a two-dimensional graphics engine (e.g., SGL), an image processing library, and the like. The surface manager is used to manage the display subsystem and provides fusion of two-dimensional and three-dimensional layers for multiple applications. The media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media library may support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG. The three-dimensional graphics processing library is used for three-dimensional graphics drawing, image rendering, composition, layer processing, and the like. The two-dimensional graphics engine is a drawing engine for two-dimensional drawing.

The kernel layer provides the core system services of the operating system; services such as security, memory management, process management, the network protocol stack, and the driver model are implemented based on the kernel layer. The kernel layer also acts as an abstraction layer between the hardware and the software stack. This layer contains many drivers associated with the electronic device, the main ones being a display driver, a keyboard driver (as an input device), a camera driver, an audio driver, a Bluetooth driver, a Wi-Fi driver, and the like.

It should be understood that the functional services described above are only examples, and in practical applications, the electronic device may be divided into more or fewer functional services according to other factors, or the functions of the respective services may be divided in other manners, or the functional services may not be divided, but may operate as a whole.

The following describes the scheme provided by the embodiment of the application in detail.

The scheme provided by the embodiment of the application can be applied to the field of the Internet of Things. Without increasing the hardware requirements of the internet of things device, the scheme draws on the computing power and resources of a mutually trusted device to provide the user with a secure and practical experience of controlling the internet of things device by voice. The internet of things device receives user audio data and forwards the audio data to a mutually trusted device that has sufficient computing power and resources. The mutually trusted device identifies audio features from the audio data and uploads the audio features to the server. The server identifies the user intention from the audio features and indicates the response mode corresponding to the identification result to the mutually trusted device. The mutually trusted device then indicates the response mode, or a response result obtained according to the response mode, to the internet of things device, and the internet of things device executes the corresponding task accordingly. The mutually trusted device can be a device that can be networked with the internet of things device, has high computing power, and is not resource-constrained, such as a smart screen, a computer, or a mobile phone.
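By way of illustration only, the division of labor described above can be sketched as follows. This is a minimal sketch and not the implementation of the embodiment: the class and method names (Server, TrustedDevice, IoTDevice, handle_audio, and so on) are hypothetical, and feature extraction, voiceprint matching, and intent recognition are replaced by trivial stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for the server."""

    def recommend(self, features: list, user_id: str) -> str:
        # Placeholder for intent recognition and recommendation.
        return f"playlist-for-{user_id}"


@dataclass
class TrustedDevice:
    """Hypothetical stand-in for the mutually trusted device."""
    server: Server

    def extract_features(self, audio: bytes) -> list:
        # Placeholder feature: scaled byte values, not a real audio feature.
        return [b / 255 for b in audio]

    def identify_user(self, features: list) -> str:
        # Placeholder for voiceprint matching.
        return "user-1"

    def handle_audio(self, audio: bytes) -> str:
        features = self.extract_features(audio)
        user_id = self.identify_user(features)
        # Only features and a user identifier are sent to the server,
        # not the original audio.
        return self.server.recommend(features, user_id)


@dataclass
class IoTDevice:
    """Hypothetical stand-in for the internet of things device."""
    trusted: TrustedDevice
    played: list = field(default_factory=list)

    def on_voice(self, audio: bytes) -> None:
        # Forward the audio to the trusted device and play what comes back.
        content = self.trusted.handle_audio(audio)
        self.played.append(content)
```

In this sketch the internet of things device only captures audio and plays the returned content; all recognition work is delegated to the other two roles.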

In the embodiment of the application, the mutually trusted equipment and the Internet of things equipment are accessed to the same local area network.

The scheme provided by the embodiment of the application can be applied to a device control system comprising the Internet of things device, at least one mutually trusted device and a server. The internet of things equipment is equipment supporting sound interaction, and for example, the internet of things equipment can be an intelligent sound box, an intelligent air conditioner, a sweeping robot, a central control screen or similar equipment. The mutually trusted device is an electronic device supporting local mutually trusted networking (i.e. networking with the internet of things device) and having rich computing power and resources, and for example, the mutually trusted device can be a tablet, a mobile phone, and the like. The server may be, for example, a cloud server or the like.

Fig. 3 is a schematic architecture diagram of a device control system according to an embodiment of the present application. As shown in fig. 3, the system may include an internet of things device, a mutually trusted device, and a server.

In some embodiments of the present application, as shown in fig. 3, the internet of things device may further include an audio receiving module, an audio codec module, a communication module, and an audio playing module. The following description will be given separately.

1) Audio receiving module

The audio receiving module is used for carrying out real-time recognition and data acquisition on the voice request/instruction of the user, obtaining the original audio data of the user, and sending the original audio data to the audio encoding and decoding module.

Alternatively, as shown in fig. 3, the audio receiving module may include a Microphone (MIC) for receiving a user's voice and converting the received voice into an electrical signal, resulting in original audio data, and an audio input interface (audio input). The audio input interface is used for transmitting the original audio data.

2) Audio coding and decoding module

The audio codec module is configured to, after receiving the original audio data from the audio receiving module, transcode the original audio data into the data format required for subsequent feature extraction, for example, the Moving Picture Experts Group Audio Layer III (MP3) format or the waveform audio file (WAV) format, and send the transcoded audio data to the communication module.

The audio codec module is further configured to, after receiving the audio data from the communication module, decode the audio data, convert the audio data into a playable data format (e.g., MP3, WAV format), and send the converted audio data to the audio playing module.

Alternatively, as shown in fig. 3, the audio codec module may include an Audio Encoding (AENC) unit for encoding audio data and an Audio Decoding (ADEC) unit for decoding audio data.
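As an illustrative sketch of the encoding and decoding roles, the following uses the Python standard-library `wave` module to wrap raw PCM samples in a WAV container and to unwrap them again. The function names and the default parameters (16 kHz, mono, 16-bit samples) are assumptions made for the example; the embodiment does not specify them.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 16000, channels: int = 1,
               sample_width: int = 2) -> bytes:
    """Encoding direction: wrap raw PCM samples in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)   # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def wav_to_pcm(wav_bytes: bytes) -> bytes:
    """Decoding direction: recover the raw PCM samples from WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.readframes(wav.getnframes())
```

MP3 encoding is not shown because the Python standard library does not provide an MP3 encoder.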

3) Communication module

The communication module is used for finding out the mutually trusted equipment and carrying out authorized pairing with the mutually trusted equipment so as to establish communication connection with the mutually trusted equipment. The communication module is also used for receiving the audio data from the audio encoding and decoding module and sending the audio data to the mutually trusted device, and particularly can be sent to the communication module of the mutually trusted device. The communication module is also used for receiving the audio data from the mutually-trusted device and sending the audio data from the mutually-trusted device to the audio coding and decoding module.

Optionally, as shown in fig. 3, the communication module may include a discovery unit, a pairing unit, a transmission unit and a receiving unit, where the discovery unit is configured to discover the mutually trusted device, the pairing unit is configured to perform authorized pairing with the mutually trusted device, so as to establish a communication connection with the mutually trusted device, the transmission unit is configured to transmit audio data to the mutually trusted device, and the receiving unit is configured to receive the audio data from the mutually trusted device.
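The four units can be pictured with the following in-memory sketch. It is illustrative only: the `Peer` class, its `trusted_names` authorization set, and the list standing in for the local area network are hypothetical, and no real discovery or pairing protocol is implied.

```python
from collections import deque
from typing import Optional


class Peer:
    """Hypothetical device with discovery, pairing, send and receive units."""

    def __init__(self, name: str, trusted_names: set):
        self.name = name
        self.trusted_names = trusted_names  # peers the user has authorized
        self.paired = set()
        self.inbox = deque()

    def discover(self, network: list) -> list:
        """Discovery unit: list the other peers on the same network."""
        return [p for p in network if p is not self]

    def pair(self, other: "Peer") -> bool:
        """Pairing unit: succeed only if both sides authorize each other."""
        if other.name in self.trusted_names and self.name in other.trusted_names:
            self.paired.add(other.name)
            other.paired.add(self.name)
            return True
        return False

    def send(self, other: "Peer", data: bytes) -> None:
        """Transmission unit: refuse to send to a peer that is not paired."""
        if other.name not in self.paired:
            raise PermissionError(f"{other.name} is not paired")
        other.inbox.append(data)

    def receive(self) -> Optional[bytes]:
        """Receiving unit: return the oldest pending message, if any."""
        return self.inbox.popleft() if self.inbox else None
```

The point of the sketch is the ordering constraint: audio data is transmitted only after discovery and authorized pairing have succeeded.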

4) Audio playing module

The audio playing module is used for playing the audio in real time after receiving the audio data from the audio encoding and decoding module.

Alternatively, as shown in fig. 3, the audio playing module may include an audio output interface for transmitting audio data and a Speaker (SPK) for playing audio.

In some embodiments of the present application, as shown in fig. 3, the mutually trusted device may further include a communication module, an audio feature extraction module, a voiceprint management module, a data transmission module, and a pre-download module. The following description will be given separately.

1) Communication module

The communication module is used for receiving the audio data from the Internet of things equipment and sending the received audio data to the audio feature extraction module.

The communication module is further used for sending the audio data to the internet of things device after receiving the audio data from the pre-download module, and specifically can be sent to the communication module of the internet of things device.

Optionally, as shown in fig. 3, the communication module may include a discovery unit, a pairing unit, a transmission unit and a receiving unit, where the discovery unit is configured to discover an internet of things device, the pairing unit is configured to perform authorized pairing with the internet of things device, so as to establish a communication connection with the internet of things device, the transmission unit is configured to transmit audio data to the internet of things device, and the receiving unit is configured to receive audio data from the internet of things device.

2) Audio feature extraction module

The audio feature extraction module is used for preprocessing the audio data; that is, after receiving the audio data from the communication module, it performs feature extraction on the audio data to obtain the audio features of the audio data, and sends the audio features to the voiceprint management module and the data transmission module. The audio features may be used for voiceprint recognition, and may also be used by the cloud server to recognize the user instruction (or intention). By way of example, the audio features may be mel-frequency cepstral coefficients (MFCC) and features of the spectrum, such as peaks and smoothness, processed by a shallow neural network.
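As a rough illustration of spectral features of this kind, the sketch below computes the magnitude spectrum of one short frame with a naive discrete Fourier transform and derives two values from it: the index of the spectral peak and the spectral flatness. It does not compute MFCC and involves no neural network; using spectral flatness as a reading of "smoothness" is an assumption of this example, as are the function names.

```python
import cmath
import math


def magnitude_spectrum(frame: list) -> list:
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for a short frame)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_features(frame: list) -> dict:
    mags = magnitude_spectrum(frame)
    power = [m * m + 1e-12 for m in mags]  # small epsilon avoids log(0)
    peak_bin = max(range(len(mags)), key=mags.__getitem__)
    # Spectral flatness: geometric mean over arithmetic mean of the power
    # spectrum. Close to 0 for a pure tone, close to 1 for a flat spectrum.
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return {"peak_bin": peak_bin, "flatness": geometric / arithmetic}
```

A production implementation would normally use a fast Fourier transform and a mel filter bank; the naive transform is used here only to keep the example free of third-party dependencies.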

3) Voiceprint management module

The voiceprint management module is used for carrying out voiceprint extraction according to the audio characteristics after receiving the audio characteristics from the audio characteristic extraction module, and matching the extracted voiceprints with stored voiceprint data. If the stored voiceprint data has voiceprint data matched with the extracted voiceprint, the user identification matched with the extracted voiceprint can be determined according to the corresponding relation between the stored voiceprint data and the user identification, so that the user identification corresponding to the audio data is obtained, and meanwhile, the user identification information corresponding to the voiceprint can be sent to the data transmission module. If the stored voiceprint data does not have the voiceprint data matched with the extracted voiceprint, the voiceprint and the acquired user identification corresponding to the voiceprint can be stored and reported.

Alternatively, as shown in fig. 3, the voiceprint management module may include a voiceprint extraction unit, a voiceprint matching unit, and a voiceprint storage unit. The voiceprint extraction unit is used for extracting voiceprints according to the audio characteristics; the voiceprint matching unit is used for matching the voiceprint extracted by the voiceprint extracting unit with the voiceprint data stored by the voiceprint storage unit; the voiceprint storage unit is used for storing voiceprints.
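The matching step can be sketched as a nearest-neighbor lookup over stored voiceprint vectors. The embodiment does not state how voiceprints are compared; cosine similarity, the 0.8 threshold, and the `VoiceprintStore` class below are assumptions chosen for illustration.

```python
import math
from typing import Optional


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VoiceprintStore:
    """Hypothetical voiceprint storage unit plus matching unit."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # assumed minimum similarity for a match
        self._prints = {}           # user identifier -> stored voiceprint

    def enroll(self, user_id: str, voiceprint: list) -> None:
        """Store a voiceprint together with its user identifier."""
        self._prints[user_id] = voiceprint

    def match(self, voiceprint: list) -> Optional[str]:
        """Return the best-matching user identifier, or None if no match."""
        best_id, best_score = None, self.threshold
        for user_id, stored in self._prints.items():
            score = cosine(voiceprint, stored)
            if score >= best_score:
                best_id, best_score = user_id, score
        return best_id
```

When `match` returns `None`, the caller would store the new voiceprint with an acquired user identifier via `enroll`, which corresponds to the no-match branch described above.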

4) Data transmission module

The data transmission module is used for carrying out the processes of packaging, encrypting, signing and the like on the audio characteristics and the user identification after receiving the audio characteristics from the audio characteristic extraction module and the user identification from the voiceprint management module, and uploading the data obtained by the processing to the server so that the server can identify the user instruction according to the audio characteristics and the user identification.
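The packaging and signing steps can be sketched with the Python standard library as follows. The message layout, the field names, and the use of HMAC-SHA256 with a shared key are assumptions of this example, not details of the embodiment. Encryption is deliberately left out because the standard library provides no suitable cipher; the Base64 step is only an encoding and offers no confidentiality.

```python
import base64
import hashlib
import hmac
import json


def build_request(features: list, user_id: str, key: bytes) -> bytes:
    """Package the audio features and user identifier, then sign the package."""
    payload = json.dumps({"features": features, "user_id": user_id},
                         sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    envelope = {"payload": base64.b64encode(payload).decode(), "sig": signature}
    return json.dumps(envelope).encode()


def verify_request(message: bytes, key: bytes) -> dict:
    """Receiver side: check the signature before trusting the payload."""
    envelope = json.loads(message)
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["sig"]):
        raise ValueError("signature mismatch")
    return json.loads(payload)
```

A real deployment would additionally encrypt the payload, for example by sending it over a TLS connection.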

The data transmission module is also used for receiving the response information from the server and sending the response information to the communication module or the pre-downloading module. The response information is information for responding to the voice request of the user, and the response information is information correspondingly generated after the server identifies the user instruction and can be used for indicating the service provided for the user after responding to the user instruction. The response information may contain any one or more of the following: instructions, instructional information, specific media content, instructional information for the media content, and the like. The media content may be, for example, images, audio, video, etc. When the response information is information which can be directly indicated to the internet of things equipment, such as instructions, indication information and media content, the data transmission module can directly send the response information to the communication module and then send the response information to the internet of things equipment by the communication module. When the response information is information such as indication information of the media content, the data transmission module can send the response information to the pre-downloading module, and then the pre-downloading module obtains the corresponding media content and sends the corresponding media content to the communication module.

For example, in a scenario in which the user voice indicates recommended music, the response information may be recommended music content or indication information of recommended music content, which may be, for example, a uniform resource locator (uniform resource locator, URL) or the like. When the response information is the indication information of the recommended music content, the data transmission module can send the indication information to the pre-downloading module so that the pre-downloading module can acquire the corresponding music content according to the indication information.
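
The routing rule described for the data transmission module can be sketched as follows. This is a minimal illustration only: the names `ResponseInfo` and `route_response` and the `kind` field are hypothetical, since the description names the categories of response information but does not define a message format.

```python
from dataclasses import dataclass

# Hypothetical category labels; the description does not define a wire format.
DIRECT_KINDS = {"instruction", "indication", "media_content"}
INDIRECT_KINDS = {"media_indication"}  # e.g. a URL that points at media content


@dataclass
class ResponseInfo:
    kind: str
    payload: str


def route_response(resp: ResponseInfo) -> str:
    """Return the name of the module that should receive the response."""
    if resp.kind in DIRECT_KINDS:
        # Usable by the IoT device as-is: forward via the communication module.
        return "communication_module"
    if resp.kind in INDIRECT_KINDS:
        # Only points at content: the pre-download module must fetch it first.
        return "pre_download_module"
    raise ValueError(f"unknown response kind: {resp.kind}")
```

Under these assumptions, a URL for recommended music is routed to the pre-download module, while an instruction goes straight to the communication module.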

Optionally, as shown in fig. 3, the data transmission module may include an encryption unit, a communication unit, and a decoding unit. The communication unit is used for sending the encrypted audio features and the encrypted user identifier to the server, receiving response information from the server, and sending the response information to the Internet of things device or the pre-downloading module. The decoding unit is used for decoding or decrypting the information from the server.

5) Pre-download module

The pre-downloading module is used for acquiring corresponding media content according to the response information from the data transmission module, downloading and caching the media content, and pushing the cached media content to the Internet of things equipment in a streaming media mode so that the Internet of things equipment plays the media content.

Optionally, as shown in fig. 3, the pre-download module may include a download unit, a cache unit, and a content processing unit. The download unit is used for downloading the media content indicated by the response information, the cache unit is used for caching the media content downloaded by the download unit, and the content processing unit is used for converting the media content cached by the cache unit into a required format that is convenient to transmit, display, or play. For example, the content processing unit may be used to convert audio data into a streaming format.
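
The division of work among the download unit, cache unit, and content processing unit can be sketched as below. This is a simplified illustration under stated assumptions: the class name `PreDownloader`, the injected `fetch` callable, and the fixed-size chunking used to stand in for "conversion into a streaming format" are all hypothetical and not taken from the description.

```python
from typing import Callable, Dict, Iterator


class PreDownloader:
    """Download, cache, and chunk media content (illustrative only)."""

    def __init__(self, fetch: Callable[[str], bytes], chunk_size: int = 4096):
        self._fetch = fetch                  # download unit (injected, hypothetical)
        self._cache: Dict[str, bytes] = {}   # cache unit (in memory)
        self._chunk_size = chunk_size

    def get(self, url: str) -> bytes:
        # Download only on a cache miss.
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]

    def stream(self, url: str) -> Iterator[bytes]:
        # Content processing unit: split cached content into fixed-size chunks
        # that can be pushed to the IoT device one at a time.
        data = self.get(url)
        for start in range(0, len(data), self._chunk_size):
            yield data[start:start + self._chunk_size]
```

Injecting `fetch` keeps the sketch independent of any particular network library; a second request for the same address is served from the cache.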

In some embodiments of the present application, when a device control system includes a plurality of mutually trusted devices, each mutually trusted device may employ the constituent architecture shown in fig. 3.

The server is used for identifying user instructions or intentions according to the audio characteristics from the mutually-trusted devices and determining services provided for the users in combination with the user identifications. And after determining the service provided for the user, the server indicates the related information to the mutually trusted device so that the mutually trusted device assists the Internet of things device to provide the service for the user.

Optionally, the modules shown in fig. 3 may be deployed in the application framework layer shown in fig. 2. Of course, the modules can also be deployed in other layers according to actual needs, which is not particularly limited in the embodiment of the application.

It should be noted that, the module composition shown in fig. 3 is only one implementation manner of the related modules of the internet of things device or the mutually trusted device in the embodiment of the present application. In practical applications, other implementations may be used for each device or module, including more or fewer modules, and the embodiments of the present application are not limited in detail.

The following embodiments take as an example a scenario in which a user controls an Internet of things device by voice to perform music recommendation, and introduce the scheme provided in the embodiments of the present application on that basis. For implementation of the scheme in other scenarios, reference may be made to the method provided below, which is not described one by one in the embodiments of the present application.

When the scheme provided by the embodiment of the application is applied to a scenario in which a user controls an Internet of things device by voice to perform music recommendation, the Internet of things device can, without increasing its hardware requirements, rely on the mutually trusted device to complete tasks with high computing power or high resource requirements, such as voiceprint recognition and account matching, recognition and processing of simple audio features, encryption and signing of audio features and account information, establishment of a long connection with the server, acquisition of a recommended music list, and music preloading, so that a safe and practical music recommendation experience is provided for users. The Internet of things device only needs to support functions such as audio receiving, playing, and transmitting, and no other hardware configuration needs to be added, so the cost and implementation difficulty on the Internet of things device side are low. In addition, the original audio data of the user is transmitted only within the local mutually trusted environment, and the mutually trusted device uploads audio features rather than the original audio to the server, which helps protect privacy and security.

Example one

Referring to fig. 4, a device control method provided by an embodiment of the present application includes:

S401: after the Internet of things device discovers the first mutually trusted device, it sends a connection request to the first mutually trusted device, where the connection request is used for requesting to establish a communication connection with the first mutually trusted device.

In some embodiments of the present application, the Internet of things device may discover, based on a near-end connection, other electronic devices that support establishing a communication connection with the Internet of things device. The near-end connection may be implemented through communication modes such as WiFi communication, power line communication (PLC), Bluetooth (BT) communication, and Bluetooth low energy (BLE) communication.

In some embodiments of the present application, after the Internet of things device discovers other electronic devices based on a near-end connection manner, the Internet of things device may acquire the identifier of each of the other electronic devices and determine whether the identifier meets either of the following conditions: the identifier type is a set type, or the identifier is contained in a set white list. If so, the other electronic device is determined to be usable as a mutually trusted device of the Internet of things device; otherwise, it is determined not to be usable as a mutually trusted device of the Internet of things device. The white list contains preset identification information of at least one electronic device that can be used as a mutually trusted device.
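
The identifier check above amounts to a simple either-or test. The following sketch assumes string identifiers and a hypothetical function name `is_eligible_trusted_device`; the description does not specify how identifier types or the white list are represented.

```python
from typing import AbstractSet


def is_eligible_trusted_device(
    device_id: str,
    id_type: str,
    allowed_types: AbstractSet[str],
    whitelist: AbstractSet[str],
) -> bool:
    """A device qualifies if its identifier type is a set type
    OR its identifier appears in the set white list."""
    return id_type in allowed_types or device_id in whitelist
```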

In some embodiments of the present application, after the Internet of things device discovers other electronic devices based on a near-end connection manner, it may evaluate indicators such as the computing power and resources of the other electronic devices. When the evaluation result shows that the computing power, resources, and the like of another electronic device are sufficiently high, that electronic device is determined to be usable as a mutually trusted device of the Internet of things device; otherwise, it is determined not to be usable as a mutually trusted device of the Internet of things device.

In the embodiment of the application, a request by the Internet of things device to establish a communication connection with the first mutually trusted device is a request to establish an association relationship or a mutually trusted relationship with the first mutually trusted device. After the communication connection is established between the Internet of things device and the first mutually trusted device, the two devices have an association relationship or a mutually trusted relationship.

Optionally, the above step may be performed by the communication module in the Internet of things device shown in fig. 3.

S402: after the first mutually-trusted device receives the connection request, the connection prompt information is displayed, and the connection prompt information is used for prompting a user to confirm whether communication connection is established with the Internet of things device or not.

For example, when the Internet of things device is an intelligent sound box and the first mutually trusted device is a mobile phone, the first mutually trusted device may display the prompt interface shown in fig. 5 in the form of a message popup window, and the connection prompt information included in the prompt interface may be "The intelligent sound box requests to establish a mutually trusted relationship with this device. Do you agree?". The prompt interface may further include controls for selecting whether to establish a communication connection with the Internet of things device, for example, the "yes" and "no" controls shown in fig. 5. When the user clicks the "yes" control, the first mutually trusted device may determine that the communication connection is to be established with the Internet of things device; when the user clicks the "no" control, the first mutually trusted device may determine that the communication connection is not to be established with the Internet of things device.

When the first mutually-trusted device determines that the communication connection is not established with the internet of things device, notification information for refusing to establish the communication connection can be returned to the internet of things device, and after the internet of things device receives the notification information, other mutually-trusted devices can be continuously discovered so as to establish the communication connection with other mutually-trusted devices.

S403: the first mutually trusted device receives a connection indication operation performed by the user, and determines according to the operation to establish a communication connection with the Internet of things device.

The connection indication operation is used for confirming that a communication connection is to be established with the Internet of things device.

For example, in the example shown in fig. 5, the connection instruction operation may be an operation in which the user clicks the "yes" control shown in fig. 5.

Alternatively, the steps S402 to S403 may be replaced by the following steps A1 to A2:

A1: after the Internet of things device discovers the first mutually trusted device, it prompts the user to confirm whether to establish a communication connection between the Internet of things device and the first mutually trusted device.

The Internet of things device can prompt a user to confirm whether to establish communication connection between the Internet of things device and the first mutually-trusted device in a mode of playing voice prompt.

For example, when the Internet of things device is an intelligent sound box and the first mutually trusted device is a mobile phone, the Internet of things device can play a voice prompt: "The intelligent sound box requests to establish a mutually trusted relationship with the mobile phone. Do you agree?". The user may issue the control instruction by voice. When the voice indication received by the Internet of things device is "yes", the Internet of things device determines that a communication connection can be established with the first mutually trusted device, and it can request the first mutually trusted device to establish the communication connection and then establish it. When the voice indication received by the Internet of things device is "no", the Internet of things device determines that a communication connection cannot be established with the first mutually trusted device, and it can continue to discover other mutually trusted devices.

In some embodiments of the present application, when the internet of things device has a display function, the user may also be prompted by displaying a prompting message to confirm whether to establish a communication connection between the internet of things device and the first mutually trusted device, for example, the prompting may be performed by referring to the prompting method of the first mutually trusted device in step S402, which is not described in detail herein.

A2: the Internet of things equipment receives and recognizes a voice instruction issued by a user, and determines to establish communication connection with first mutually-trusted equipment according to the voice instruction.

The voice indication is used for confirming that a communication connection is to be established between the Internet of things device and the first mutually trusted device.

Optionally, after the step S403, the following steps S404 to S405 may be further included:

S404: the first mutually trusted device obtains the mutual-assistance application associated with the Internet of things device and the voiceprint database of the family group associated with the Internet of things device.

The first mutually trusted device can acquire the mutual-assistance application from a server that provides it and then install it, so that a user can control and manage the Internet of things device in the mutual-assistance application. The family group associated with the Internet of things device can correspond to the family in whose environment the Internet of things device is located. The family group has a home account and includes at least one user identifier; each user identifier is a sub-account of the home account and corresponds to one user in the family. The voiceprint database of the family group includes the at least one user identifier and the voiceprint data of the user corresponding to each user identifier.

In the embodiment of the application, when the Internet of things device is started for the first time, an account associated with the device identifier of the Internet of things device can be automatically created as the home account, and in the subsequent process a user identifier that a user requests to register is used as a sub-account associated with the home account. The user can also control and modify the display identifier (such as the display icon or display name) of the home account. Alternatively, when the Internet of things device is started for the first time, an account associated with the device identifier of the Internet of things device can be created as the home account according to an indication of the user, and in the subsequent process a user identifier that a user requests to register is used as a sub-account associated with the home account. For example, when user A says "I want to register a home account", after receiving and recognizing the corresponding audio, the Internet of things device may play a prompt voice, such as "please state the name of the home account", to prompt the user to set the account name of the home account. The user can speak the name of the home account to be registered, such as "home of small A", and after receiving and recognizing the corresponding audio, the Internet of things device can generate an account as the corresponding home account. After the home account is generated, the Internet of things device can complete the registration process of user A through the first mutually trusted device, and after registration the first mutually trusted device can acquire and store the user identifier and the voiceprint of user A and associate them with the home account.
For example, if the user identifier registered by user A is "small A", after the first mutually trusted device obtains the user identifier, the user identifier "small A" and the corresponding voiceprint may be associated with the home account "home of small A". The home account may also be associated with the user identifiers and voiceprints of other users. For example, if user B registers a user identifier under the home account, the Internet of things device may complete the registration process of user B through the first mutually trusted device, and after registration the first mutually trusted device may acquire and store the user identifier "small B" and the voiceprint of user B, and associate them with the home account "home of small A". For the registration process, refer to the method described with fig. 7 below, which is not described in detail here.
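
The relationship among a home account, its sub-accounts, and their voiceprints can be pictured with a small data structure. This is only a sketch: the class name `HomeAccount`, the method `register_user`, and the representation of a voiceprint as a list of floats are assumptions made for illustration, not details given in the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HomeAccount:
    """A home account whose sub-accounts are user identifiers with voiceprints."""

    name: str
    # user identifier -> voiceprint (assumed here to be a feature vector)
    voiceprints: Dict[str, List[float]] = field(default_factory=dict)

    def register_user(self, user_id: str, voiceprint: List[float]) -> None:
        # Each user identifier is a distinct sub-account of the home account.
        if user_id in self.voiceprints:
            raise ValueError(f"user identifier already registered: {user_id}")
        self.voiceprints[user_id] = list(voiceprint)
```

With this structure, registering "small A" and then "small B" under "home of small A" yields a voiceprint database with two entries tied to one home account.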

In the embodiment of the application, in one possible scenario, the mutually trusted device can serve as the management end of the user identifiers and the voiceprint data, and the user identifiers and the voiceprint database can then be stored in the mutually trusted device. In this scenario, the Internet of things device can always establish a communication connection with one fixed mutually trusted device, and the voiceprint database can be stored in that mutually trusted device. Or, in this scenario, the Internet of things device can establish communication connections with a plurality of mutually trusted devices, but the voiceprint database is stored in a fixed one of them, and the other mutually trusted devices can acquire the voiceprint database from that device after networking with it. Alternatively, the mutually trusted device may synchronize the voiceprint database to the server for storage, and other mutually trusted devices may acquire the voiceprint database from the server.

For example, in this scenario, after user A registers the home account and the user identifier, the first mutually trusted device may locally store the home account registered by user A (home of small A), the user identifier associated with the home account (small A), and the voiceprint corresponding to the user identifier (that is, the voiceprint of user A).

In another possible scenario, the server may be used as a management end of the user identifier and the voiceprint data, and then the user identifier and the voiceprint database may be stored in the server, and the internet of things device may obtain the voiceprint database from the server. In this scenario, the internet of things device may establish a communication connection with one or more mutually trusted devices, each of which may obtain a voiceprint database from a server.

In this scenario, after user A registers the home account and the user identifier, the first mutually trusted device may report to the server the home account registered by user A (home of small A), the user identifier associated with the home account (small A), and the voiceprint corresponding to the user identifier (that is, the voiceprint of user A). The server may store these in the voiceprint database of the family group associated with the Internet of things device, where the voiceprint database is associated with the home account (home of small A).

In the embodiment of the application, when the first mutually trusted device serves as the device that stores the voiceprint database, it can store the voiceprint recorded by the user in the local voiceprint database while the user registers the user identifier, so that it can read the voiceprint database directly from local storage. When the server serves as the device that stores the voiceprint database, the first mutually trusted device can report the voiceprint recorded by the user to the server while the user registers the user identifier, and the server can store that voiceprint in the voiceprint database. In the subsequent process, the mutually trusted device can request the voiceprint database of the family group associated with the Internet of things device from the server. To do so, the first mutually trusted device can obtain, from the Internet of things device, the device identifier of the Internet of things device and the home account identifier of the family group associated with the Internet of things device, send the obtained device identifier and home account identifier to the server, and request the mutual-assistance application corresponding to the device identifier and the voiceprint database corresponding to the home account identifier. After receiving the request, the server can determine the mutual-assistance application associated with the Internet of things device according to the device identifier, select the corresponding voiceprint database according to the home account identifier, and return the mutual-assistance application and the voiceprint database to the first mutually trusted device.
The server may generate a voiceprint key by using context information such as a home account identifier, a mutually trusted device identifier, a request time and the like from the first mutually trusted device, encrypt a voiceprint database by using the voiceprint key, and send the encrypted voiceprint database to the first mutually trusted device. After the first mutually trusted device receives the encrypted voiceprint database, a voiceprint key can be generated by utilizing the family account identification of the family group associated with the Internet of things device, and then the voiceprint database is decrypted by utilizing the voiceprint key.
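
One way to picture deriving a voiceprint key from context information is a keyed hash over the context fields. The sketch below is an assumption-laden illustration: the description does not state the derivation algorithm, and the pre-shared `shared_secret`, the function name `derive_voiceprint_key`, and the use of HMAC-SHA256 are all choices made here for the example. Encrypting the database itself is left out because Python's standard library provides no authenticated cipher.

```python
import hashlib
import hmac


def derive_voiceprint_key(
    shared_secret: bytes,
    home_account_id: str,
    trusted_device_id: str,
    request_time: str,
) -> bytes:
    """Derive a 32-byte key from context fields (illustrative scheme only)."""
    # Length-prefix each field so ("ab", "c") and ("a", "bc") give different inputs.
    context = b"".join(
        len(part.encode("utf-8")).to_bytes(2, "big") + part.encode("utf-8")
        for part in (home_account_id, trusted_device_id, request_time)
    )
    return hmac.new(shared_secret, context, hashlib.sha256).digest()
```

Both sides obtain the same key only if they feed in identical context values, so in this sketch the mutually trusted device would need to know every field the server used.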

In some embodiments of the present application, if the first mutually trusted device determines that the user identifier and the voiceprint associated with the home account have not been registered by the user after the communication connection is established between the first mutually trusted device and the internet of things device, the first mutually trusted device may first perform a registration process shown in fig. 7 below, and then perform the following steps S406 to S415 after the registration process is performed and the user identifier and the voiceprint registered by the user are acquired.

S405: the first mutually-trusted device sends notification information to the Internet of things device, wherein the notification information is used for notifying that the Internet of things device and the first mutually-trusted device have established communication connection.

Optionally, the notification information may include a mutually trusted identifier, which is used to indicate that the first mutually trusted device has established a communication connection with the Internet of things device. The first mutually trusted device generates different mutually trusted identifiers for different Internet of things devices. The mutually trusted identifier may include the device identifier of the Internet of things device and the device identifier of the first mutually trusted device.

In some embodiments of the present application, after the first mutually trusted device establishes communication connection with the internet of things device, the mutually trusted identifier may be reported to the server, so that the server determines the first mutually trusted device corresponding to the internet of things device according to the mutually trusted identifier.

Optionally, the above step may be performed by the communication module in the first mutually trusted device shown in fig. 3.

In some embodiments of the present application, after the first mutually trusted device establishes a communication connection with the internet of things device, mutually trusted notification information may also be sent to the server, where the information is used to notify: the internet of things device has established a communication connection with the first mutually trusted device. The server may determine a relationship between the internet of things device and the first mutually trusted device according to the mutually trusted notification information.

S406: the Internet of things equipment receives a voice instruction issued by a user and collects corresponding audio, and the voice instruction is used for indicating to play music.

By way of example, taking the scenario of music recommendation as an example, the voice instruction may be a music play instruction with some intention by the user, such as "listen at random", "play popular song", "play song by XXX singer", "play song in my song list", etc.

Optionally, the Internet of things device may receive and respond to the user's voice instruction only after detecting that the user has spoken the set wake-up word. The user can therefore wake up the Internet of things device by speaking the set wake-up word before issuing a voice instruction, so that the Internet of things device receives the voice instruction promptly and accurately. Within a set duration after being woken up by the user, the Internet of things device can automatically receive voice instructions from the user without being woken up again.
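
The wake-up behaviour described above, where instructions are accepted for a set duration after the wake-up word, can be sketched as a small state holder. The class name `WakeWindow` and the choice to pass timestamps in explicitly (rather than read a clock) are assumptions for illustration.

```python
from typing import Optional


class WakeWindow:
    """Accept voice instructions only within a set duration after wake-up."""

    def __init__(self, window_seconds: float):
        self._window = window_seconds
        self._woken_at: Optional[float] = None  # time of the last wake-up word

    def on_wake_word(self, now: float) -> None:
        # Each detected wake-up word restarts the listening window.
        self._woken_at = now

    def accepts_command(self, now: float) -> bool:
        if self._woken_at is None:
            return False
        return 0 <= now - self._woken_at <= self._window
```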

Optionally, the above step may be performed by the audio receiving module in the Internet of things device shown in fig. 3.

S407: and the Internet of things equipment sends the collected audio to the first mutually-trusted equipment.

In the implementation process, the Internet of things device can transcode the collected audio into an audio digital signal in a set format and then send the audio digital signal to the first mutually trusted device, where the audio digital signal carries the voice information of the user. Optionally, this step may be performed by the audio codec module in the Internet of things device shown in fig. 3.

When the Internet of things device sends the audio digital signal to the first mutually trusted device, it can at the same time send its device identifier, or the mutually trusted identifier between the Internet of things device and the first mutually trusted device, so that the first mutually trusted device can identify the Internet of things device according to the device identifier or the mutually trusted identifier and determine the source of the audio digital signal. Optionally, this step may be performed by the communication module in the Internet of things device shown in fig. 3.

S408: after receiving the audio from the internet of things device, the first mutually trusted device extracts audio features according to the audio, extracts voiceprints according to the audio, and determines user identifications corresponding to the voiceprints according to a voiceprint database of an internet of things device associated family group, wherein the audio features are data obtained by removing information for identifying users in the audio, and the audio features are used for determining user instructions.

In some embodiments of the present application, the information for identifying the user in the original audio, that is, the information related to the privacy of the user (for example, voiceprint privacy information of the user, etc.), is removed from the audio feature, so that the first mutually trusted device can ensure that the privacy information of the user is not revealed when the audio feature is reported to the server.

For example, when user A and user B each issue the voice indication "I want to listen to the song of singer X" to the Internet of things device, the audio sent by the Internet of things device to the first mutually trusted device is, respectively, the audio corresponding to "I want to listen to the song of singer X" spoken by user A and the audio corresponding to "I want to listen to the song of singer X" spoken by user B. In this scenario, the audio features that the first mutually trusted device extracts from the audio spoken by user A are the same as those it extracts from the audio spoken by user B: in both cases they contain no privacy information of the user (user A or user B), while still allowing the instruction "I want to listen to the song of singer X" to be recognized.

In implementing this step, after receiving the audio digital signal (a time domain signal) from the Internet of things device, the first mutually trusted device may extract the audio features from it. The first mutually trusted device can first convert the audio digital signal from a time domain signal into a frequency domain signal, and then extract frequency domain features from the converted signal. The frequency domain features can be, for example, mel-frequency cepstral coefficients (MFCC), the spectral centroid, the frequency domain energy, the spectral flatness, and spectral peaks. These features remove the privacy information of the user while still reflecting the user's indication. Therefore, by extracting the audio features and reporting them to the server, the first mutually trusted device can both avoid the risk of revealing the user's privacy and ensure that the server can identify the user indication according to the audio features.
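
The time-domain to frequency-domain conversion and a few of the named features can be illustrated with a short, dependency-free sketch. It uses a naive discrete Fourier transform on a single frame and computes energy, a power-weighted spectral centroid, spectral flatness, and the peak frequency; MFCC extraction is omitted. The function names and the single-frame, power-weighted formulation are assumptions for illustration, and the sketch makes no claim about which information these features retain or remove.

```python
import cmath
import math
from typing import Dict, List


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2); fine for a short demo frame)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def frequency_features(frame: List[float], sample_rate: float) -> Dict[str, float]:
    """Compute a few frequency-domain features of one audio frame."""
    mags = magnitude_spectrum(frame)
    n = len(frame)
    power = [m * m for m in mags]
    total = sum(power)
    freqs = [k * sample_rate / n for k in range(len(mags))]
    # Spectral centroid: power-weighted mean frequency.
    centroid = sum(f * p for f, p in zip(freqs, power)) / total if total else 0.0
    # Spectral flatness: geometric mean over arithmetic mean of the power spectrum.
    eps = 1e-12  # avoids log(0) and division by zero
    geo_mean = math.exp(sum(math.log(p + eps) for p in power) / len(power))
    flatness = geo_mean / (total / len(power) + eps)
    # Spectral peak: frequency of the strongest bin.
    peak_hz = freqs[max(range(len(power)), key=power.__getitem__)]
    return {"energy": total, "centroid_hz": centroid,
            "flatness": flatness, "peak_hz": peak_hz}
```

For a pure tone that falls exactly on a DFT bin, the centroid and the peak coincide at the tone's frequency and the flatness is close to zero; a practical implementation would use a fast Fourier transform over windowed, overlapping frames.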

The first mutually trusted device can match the extracted voiceprint with voiceprint data in a voiceprint database of the internet of things equipment home group, and when voiceprint data consistent with or matched with the extracted voiceprint exists in the voiceprint database of the internet of things equipment home group, the first mutually trusted device determines that a user identifier corresponding to the voiceprint data is a user identifier corresponding to the extracted voiceprint, namely a user identifier corresponding to an audio digital signal received by the first mutually trusted device. When voiceprint data matched with the extracted voiceprints does not exist in the voiceprint database of the family group of the Internet of things equipment, the first mutually trusted equipment determines that the user identification associated with the extracted voiceprint data does not exist.
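
The matching step can be sketched as a nearest-match search with a threshold. The description does not say how voiceprints are represented or compared, so the following assumes, purely for illustration, that a voiceprint is a numeric vector, that similarity is cosine similarity, and that a hypothetical threshold of 0.8 separates a match from no match.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_user(
    voiceprint: Sequence[float],
    database: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # hypothetical value, not from the description
) -> Optional[str]:
    """Return the best-matching user identifier, or None if nothing matches."""
    best_id, best_score = None, threshold
    for user_id, enrolled in database.items():
        score = cosine_similarity(voiceprint, enrolled)
        if score >= best_score:
            best_id, best_score = user_id, score
    return best_id
```

A return value of `None` corresponds to the case in which no user identifier is associated with the extracted voiceprint.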

Optionally, the above step may be performed by the audio feature extraction module and the voiceprint management module in the first mutually trusted device shown in fig. 3.

S409: the first mutually trusted device sends request information to the server, the request information containing information of the extracted audio features and the determined user identification.

In this step, the first mutually trusted device may package, encrypt, and sign the extracted audio features and the determined user identifier to obtain the request information. The first mutually trusted device can generate a voiceprint key according to the home account of the family group associated with the Internet of things device, and encrypt the audio features and the user identifier by using the voiceprint key.

In some embodiments of the present application, when the first mutually trusted device determines that there is no user identifier associated with the extracted voiceprint data, the first mutually trusted device may send the extracted audio feature and the home account number of the internet of things device home group to the server, so that the server generates non-personalized response information according to the audio feature and the home account number.

In some embodiments of the present application, when the first mutually trusted device determines that there is no user identifier associated with the extracted voiceprint data, the first mutually trusted device may display prompt information for prompting the user to register the user identifier corresponding to the voiceprint data, and when the user operation determines to register the user identifier corresponding to the voiceprint data, the first mutually trusted device may register the user identifier associated with the voiceprint data. The specific registration method may be referred to in the description of example three below, and will not be described in detail here.

The first mutually trusted device may report the request information to the server through wired communication or wireless communication, which is not particularly limited in the embodiment of the present application.

S410: the server determines the audio characteristics and the user identification based on the received request information.

S411: the server identifies the corresponding user instruction based on the audio feature and generates response information in combination with the user instruction and the user identification, the response information being used to indicate the download address of the song.

When the voice of the user indicates "play songs in my song list", the server may determine the user identity according to the user identifier after identifying the indication according to the audio feature, and further determine the song list information corresponding to the user identifier and the download address of the songs in the song list, and then the server may indicate the download address of the songs in the song list to the mutually trusted device through the response information.
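The server-side handling in steps S410 and S411 can be sketched as follows. Everything here is hypothetical: the in-memory song lists, the example download addresses, and the `recognize_instruction` placeholder, which stands in for whatever recognition model the server actually runs on the audio features. The fallback to a default list mirrors the non-personalized response described earlier for an unknown user.

```python
# Hypothetical data standing in for the server's song-list store.
SONG_LISTS = {"small A": ["song-1", "song-2"]}
DEFAULT_SONG_LIST = ["song-3"]
DOWNLOAD_ADDRESSES = {
    "song-1": "https://music.example/song-1",
    "song-2": "https://music.example/song-2",
    "song-3": "https://music.example/song-3",
}


def recognize_instruction(audio_features):
    """Placeholder for instruction recognition from audio features."""
    return audio_features.get("instruction_label", "unknown")


def generate_response(audio_features, user_id=None):
    """Combine the recognized instruction with the user identifier."""
    instruction = recognize_instruction(audio_features)
    if instruction != "play_my_song_list":
        return {"instruction": instruction, "personalized": False, "download_addresses": []}
    personalized = user_id in SONG_LISTS
    songs = SONG_LISTS[user_id] if personalized else DEFAULT_SONG_LIST
    return {
        "instruction": instruction,
        "personalized": personalized,
        "download_addresses": [DOWNLOAD_ADDRESSES[song] for song in songs],
    }


response = generate_response({"instruction_label": "play_my_song_list"}, "small A")
```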

S412: the server indicates the response information to the first mutually trusted device.

S413: After receiving the response information, the first mutually trusted device acquires at least one song according to the response information.

As an optional implementation, after receiving the response information, the first mutually trusted device may download the song from the download address indicated by the response information and store the song, and then send the stored song to the internet of things device. Each time the first mutually trusted device has downloaded and stored a song, it may send the audio data of the song to the internet of things device and instruct the internet of things device to play the song.

For example, when the download address is the address of a server providing music content, the first mutually trusted device may establish a long connection with the server, and pre-download songs in the song list one by one in the order of songs in the song list, and cache the songs locally.

As another optional implementation manner, after receiving the response information, the first mutually trusted device may download the song from the download address indicated by the response information, and send the audio of the downloaded song to the internet of things device in real time, and at the same time instruct the internet of things device to play the received audio.

In an online song listening scenario, when the download address is the address of the server providing the music content, the first mutually trusted device may establish a long connection with the server, download the data packet of the audio data from the server in real time according to the download address, and send the downloaded data packet of the audio data to the internet of things device in real time, so that the internet of things device may play music online according to the data packet of the audio data.
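The two optional implementations above, pre-downloading with a local cache and real-time forwarding, can be contrasted in a short sketch. The `fetch_chunks` function is a hypothetical stand-in for a network download over the long connection; the chunk contents and all names are illustrative only.

```python
def fetch_chunks(address, chunk_count=3):
    """Stand-in for downloading audio data packets from a download address."""
    for index in range(chunk_count):
        yield f"{address}#chunk{index};".encode("utf-8")


def predownload(addresses, cache):
    """Download songs one by one, in song-list order, and cache them locally."""
    for address in addresses:
        cache[address] = b"".join(fetch_chunks(address))
    return cache


def stream(address, send_to_iot_device):
    """Forward each data packet to the internet of things device as it arrives."""
    sent = 0
    for chunk in fetch_chunks(address):
        send_to_iot_device(chunk)
        sent += 1
    return sent


local_cache = predownload(["song-1", "song-2"], {})
received_by_iot_device = []
packets_sent = stream("song-1", received_by_iot_device.append)
```

Pre-downloading trades storage on the mutually trusted device for smoother playback, whereas streaming keeps nothing locally but depends on the connection throughout playback.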

Alternatively, the above steps may be performed by the data transmission module and the pre-download module in the first mutually trusted device shown in fig. 3.

S414: The first mutually trusted device sends the audio data of the acquired song to the internet of things device.

The first mutually trusted device can send the audio data of the song to the internet of things device in a streaming media mode.

Alternatively, the above steps may be performed by the pre-download module and the communication module in the first mutually trusted device shown in fig. 3.

S415: After receiving the audio data of the song, the internet of things device plays the corresponding song.

The internet of things device can transcode the audio data and play the transcoded audio.

Optionally, the steps may be completed by a communication module, an audio codec module, and an audio play module in the internet of things device shown in fig. 3.

In some embodiments of the present application, after receiving the audio from the internet of things device, the mutually trusted device may identify a corresponding user instruction and a user identifier according to the audio, and send the user instruction and the user identifier to the server, where the server may directly generate corresponding response information according to the user instruction and the user identifier and return the response information to the first mutually trusted device. Based on this method, the above steps S408 to S411 may be replaced with the following steps B1 to B3:

B1: After receiving the audio from the internet of things device, the first mutually trusted device recognizes the user instruction according to the audio, extracts a voiceprint according to the audio, and determines the user identifier corresponding to the voiceprint according to the voiceprint database of the family group associated with the internet of things device.

B2: the first mutually trusted device sends request information to the server, wherein the request information comprises a user instruction and a user identification.

For example, if the audio received by the first mutually trusted device from the internet of things device is audio when the user speaks "play songs in my song list", the first mutually trusted device may identify, according to the audio, the user instruction as: songs in my song list are played. After the voiceprint is extracted from the audio and the user account corresponding to the voiceprint is obtained, the first mutually trusted device can send the user instruction and the user account carried in the request information to the server.

B3: after receiving the request information, the server generates response information according to the user instruction and the user identification in the request information, wherein the response information is used for indicating the download address of the song.

In some embodiments of the present application, the internet of things device may establish communication connections with a plurality of mutually trusted devices. The plurality of mutually trusted devices may then form a mutually trusted device group corresponding to the internet of things device, and any mutually trusted device in the group may provide computing power and resource support for the internet of things device. The mutually trusted devices in the group may share the steps performed by the mutually trusted device in the above method. For example, after each of the plurality of mutually trusted devices establishes a communication connection with the internet of things device, the steps performed by the mutually trusted device in steps S406 to S410 may be performed by any one of the plurality of mutually trusted devices, and the steps performed by the mutually trusted device in steps S411 to S416 may be performed by another one of them. That is, the process of identifying and reporting the audio features and the voiceprint, and the process of receiving and processing the response information, may be performed by two different mutually trusted devices. For another example, in a scenario in which the internet of things device has a plurality of mutually trusted devices, after querying a list of devices having a mutually trusted relationship with the current family group account, the server may send the response information to the plurality of mutually trusted devices in the device list; each mutually trusted device may process the response information after receiving it and indicate the processing result to the internet of things device. After receiving the processing result from any one mutually trusted device, the internet of things device may perform the corresponding operation and ignore the processing results from the other mutually trusted devices.
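The "act on the first result, ignore the rest" behaviour can be sketched as below. The application does not say how the internet of things device recognizes that two results answer the same request; the `request_id` used here is an assumption introduced for illustration, as are the class and method names.

```python
class IoTDevice:
    """Keeps the first processing result per request and ignores later ones."""

    def __init__(self):
        self.handled_requests = set()
        self.executed = []

    def on_result(self, request_id, device_name, result):
        """Return True if the result was acted on, False if it was ignored."""
        if request_id in self.handled_requests:
            return False
        self.handled_requests.add(request_id)
        self.executed.append((device_name, result))
        return True


speaker = IoTDevice()
first = speaker.on_result("req-1", "mobile phone", "song-1 audio")
second = speaker.on_result("req-1", "smart screen", "song-1 audio")
```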

Taking the scenario of music recommendation described in the foregoing embodiment as an example, after determining the song list information and the download address of the song in the song list, the server may distribute the download address of the song in the song list to a plurality of mutually trusted devices corresponding to the internet of things device, where each mutually trusted device may download the corresponding song according to the download address indicated by the server and push the corresponding song to the internet of things device, and after receiving the song pushed by any mutually trusted device, the internet of things device may play the song pushed by the mutually trusted device.

In this scheme, the internet of things device is responsible for collecting user audio. The mutually trusted device is responsible for shallow audio feature extraction, voiceprint recognition, and user identifier matching according to the audio collected by the internet of things device, and sends the audio features and the user identifier to the server after encrypting them. The server is responsible for identifying the user instruction based on the audio features and the user identifier and, in response to the user instruction, returning personalized recommended audio to the mutually trusted device. The mutually trusted device then pre-downloads the recommended audio and pushes it to the internet of things device, which plays the audio. By establishing a mutual trust network, compute-intensive tasks on the internet of things device, such as voiceprint recognition and audio feature extraction, can be transferred to other devices with higher computing power, namely the mutually trusted devices. The internet of things device thus relies on the mutually trusted devices for high-load work such as audio preprocessing, user identifier recognition, and pre-downloading of recommended audio, which realizes mutual assistance of computing power and resources between devices and can provide users with a more personalized music recommendation experience without increasing the hardware requirements of the internet of things device. Meanwhile, the voiceprint information corresponding to the user identifier is stored in the secure environment of the mutually trusted device, and the transmission between the mutually trusted device and the server contains only anonymized account information, so that user audio and voiceprint data are not transmitted to the server, which protects the privacy of the user.

Example two

In the embodiment of the application, the establishment of the communication connection supports the establishment of authorization after the new equipment is found and the switching of the mutually trusted equipment, and the following description is made with reference to a specific implementation flow.

Referring to fig. 6, a flow of a method for establishing a communication connection according to an embodiment of the present application may include:

S601: The internet of things device receives an instruction, issued by the user, to switch the mutually trusted device.

For example, based on the above example one, taking the internet of things device as the sound box as an example, after the user a wakes up the intelligent sound box, the user a may instruct the intelligent sound box to switch the mutually trusted device by voice. For example, the user may say "i want to switch the mutually trusted device", and after the intelligent speaker receives and recognizes the corresponding audio, it may determine that the user needs to switch the mutually trusted device.

S602: The internet of things device determines whether a connected mutually trusted device exists; if yes, step S603 is performed; otherwise, step S614 is performed.

S603: the internet of things device prompts the user of the existing mutually trusted device and whether to switch the mutually trusted device or not, and receives the user indication.

As an alternative implementation manner, the internet of things device may prompt the user with a voice mode and ask the user whether to switch the mutually trusted device, and receive and identify the voice instruction of the user, so as to determine whether to switch the mutually trusted device.

For example, when the internet of things device (for example, a smart speaker) determines that a first mutually trusted device (for example, a mobile phone) has been connected, it may play the following voice prompt: "A mobile phone is currently connected. Do you want to switch the connected mutually trusted device?", and then receive and recognize the voice instruction issued by the user. When user A says "switch", the internet of things device, after receiving and recognizing the corresponding audio, may determine that the mutually trusted device needs to be switched, cut off the connection with the first mutually trusted device (the mobile phone), and continue to discover a new mutually trusted device and connect to it. Alternatively, if the internet of things device is connected to other mutually trusted devices at the same time, the internet of things device may prompt user A to confirm whether to switch to one of the other connected mutually trusted devices (for example, "A smart screen is currently connected. Do you want to switch the mutually trusted device to the smart screen?") and perform corresponding processing according to the user instruction. When user A says "yes" or "switch", the internet of things device may switch the mutually trusted device from the mobile phone to the smart screen and use the service provided by the smart screen. When user A says "do not switch", the internet of things device, after receiving and recognizing the corresponding audio, may determine that the mutually trusted device does not need to be switched and may keep the connection with the mobile phone.

As another optional implementation, the internet of things device may display prompt information and selection controls on its display screen. The prompt information is used to inform the user of the existing mutually trusted device and to prompt the user to confirm whether to switch the mutually trusted device, and the selection controls include a control for choosing to switch the mutually trusted device and a control for choosing not to switch it. When the internet of things device receives a user operation on the control for choosing to switch, it may determine to switch the mutually trusted device; when it receives a user operation on the control for choosing not to switch, it may determine not to switch the mutually trusted device.

S604: the Internet of things equipment determines whether to switch the mutually trusted equipment according to the user indication; if yes, go to step S605, otherwise, go to step S613.
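The branching in steps S602 to S604 reduces to a small decision function. The sketch below only restates the step numbers given in the flow; the function and parameter names are hypothetical.

```python
def next_step(has_connected_device, user_confirms_switch):
    """Return the step the internet of things device proceeds to after S601."""
    if not has_connected_device:
        return "S614"  # no connected device: discover and connect to one
    if user_confirms_switch:
        return "S605"  # cut off the connection with the first mutually trusted device
    return "S613"  # keep the existing connection


step_without_device = next_step(False, False)
step_when_switching = next_step(True, True)
step_when_keeping = next_step(True, False)
```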

In some embodiments of the present application, after the internet of things device determines to switch the mutually trusted device, the server may be notified to switch the mutually trusted device. After receiving the notification, the server can generate a new voiceprint key according to the home account number of the home group associated with the internet of things device, and encrypt the stored voiceprint database of the home group associated with the internet of things device by using the voiceprint key, so that the security of the voiceprint database is improved.

S605: the internet of things device cuts off the communication connection with the connected first mutually trusted device.

Optionally, the internet of things device may cut off the communication connection with the first mutually trusted device after instructing the connected first mutually trusted device to delete the installed mutually assisted application and the stored voiceprint database of the family group associated with the internet of things device. After the first mutually-trusted device receives the indication of the internet of things device, the installed mutually-assisted application and the stored voiceprint database of the family group associated with the internet of things device can be deleted.

The first mutually trusted device may be the first mutually trusted device in example one above. The voiceprint database of the family group associated with the internet of things device may be stored in the server; that is, in example one above, after user A registers the family account and the user identifier, the family account (family of small A), the user identifier (small A) associated with the family account, and the voiceprint corresponding to the user identifier (that is, the voiceprint of user A) are stored in the voiceprint database of the family group associated with the internet of things device.

S606: the Internet of things device discovers a second mutually-trusted device, sends a connection request to the second mutually-trusted device, and the connection request is used for requesting to establish communication connection with the second mutually-trusted device.

S607: After receiving the connection request, the second mutually trusted device displays mutual-assistance prompt information, where the mutual-assistance prompt information is used to prompt the user to confirm whether to establish a communication connection with the internet of things device.

S608: The second mutually trusted device receives a mutual-assistance indication operation performed by the user, and determines, according to the operation, to establish a communication connection with the internet of things device.

S609: the second mutually trusted device determines whether a voiceprint database of the family group associated with the internet of things device has been stored, if so, step S612 is executed, otherwise, step S610 is executed.

In one possible case, after the second mutually trusted device establishes communication connection with the internet of things device, if it is determined that the voiceprint database of the family group associated with the internet of things device is stored, the internet of things device can be directly notified that the communication connection is successfully established, and relevant processing such as voiceprint matching is performed by using the voiceprint database in a subsequent process.

In another possible case, if the second mutually trusted device does not store the voiceprint database of the family group associated with the internet of things device, but the voiceprint database is already stored in the server, the second mutually trusted device may acquire the voiceprint database from the server. The voiceprint database stored in the server may be obtained and stored by the server according to the method described in step S404 of example one above.

S610: When the second mutually trusted device determines that the voiceprint database of the family group associated with the internet of things device is stored in the server, it requests from the server the mutually assisted application associated with the internet of things device and the voiceprint database of the family group associated with the internet of things device.

In example one above, if, after user A registers the family account and the user identifier, the family account (family of small A), the user identifier (small A) associated with the family account, and the voiceprint corresponding to the user identifier (that is, the voiceprint of user A) are stored in the voiceprint database of the family group associated with the internet of things device, the second mutually trusted device may request the voiceprint database from the server.

S611: The server sends the mutually assisted application associated with the internet of things device and the voiceprint database of the family group associated with the internet of things device to the second mutually trusted device.

In some embodiments of the present application, the voiceprint database obtained by the second mutually trusted device from the server is an encrypted database, and the second mutually trusted device may generate a voiceprint key according to the same key generation rule as the server by using the home account number of the home group associated with the internet of things device, and decrypt the encrypted voiceprint database obtained from the server by using the voiceprint key.

The key generation rule may be a rule pre-agreed by the mutually trusted device and the server, or a rule indicated to the mutually trusted device by the server.
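A shared key generation rule means that the server and the mutually trusted device can each derive the same voiceprint key from the home account without ever transmitting the key. The sketch below uses PBKDF2 from the Python standard library as one possible rule; the rule parameters, the salt format, and the `key_version` counter (used here so that a new key can be generated when the mutually trusted device is switched) are assumptions and are not taken from the application.

```python
import hashlib

# Hypothetical rule agreed in advance or indicated by the server.
KEY_RULE = {"hash": "sha256", "iterations": 100_000, "length": 32}


def generate_voiceprint_key(home_account, key_version, rule=KEY_RULE):
    """Derive a voiceprint key from the home account according to the rule."""
    salt = f"voiceprint-key-v{key_version}".encode("utf-8")
    return hashlib.pbkdf2_hmac(
        rule["hash"],
        home_account.encode("utf-8"),
        salt,
        rule["iterations"],
        dklen=rule["length"],
    )


server_key = generate_voiceprint_key("family of small A", key_version=2)
device_key = generate_voiceprint_key("family of small A", key_version=2)
previous_key = generate_voiceprint_key("family of small A", key_version=1)
```

Because both sides apply the same rule to the same inputs, `server_key` and `device_key` are identical, while a key derived under the earlier version no longer decrypts the re-encrypted database.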

In some embodiments of the present application, when the first mutually trusted device, to which the internet of things device was connected before the switch, serves as the device storing the voiceprint database, the second mutually trusted device may request the voiceprint database of the family group associated with the internet of things device from the first mutually trusted device, and request the mutually assisted application associated with the internet of things device from the server.
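The ways in which the second mutually trusted device can obtain the voiceprint database in steps S609 to S611 can be summarized in a short sketch. The priority between the server and the previous mutually trusted device is an assumption made for illustration; the application describes them as alternative embodiments without ranking them. All names are hypothetical.

```python
def choose_database_source(stored_locally, stored_on_server, stored_on_previous_device):
    """Pick where the second mutually trusted device gets the voiceprint database."""
    if stored_locally:
        return "local"  # S609 yes-branch: go straight to S612
    if stored_on_server:
        return "server"  # S610 and S611
    if stored_on_previous_device:
        return "previous_device"  # database held by the first mutually trusted device
    return None  # nothing available; the user would need to register first


source_when_cached = choose_database_source(True, True, False)
source_from_server = choose_database_source(False, True, False)
source_from_previous = choose_database_source(False, False, True)
```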

S612: the second mutually-trusted device sends notification information to the Internet of things device, wherein the notification information is used for notifying that the Internet of things device and the second mutually-trusted device have established communication connection.

After the internet of things device and the second mutually-trusted device establish communication connection, the method provided in step S406 and the subsequent steps in the above example one may be adopted to provide a personalized recommendation function for the user.

S613: the internet of things device remains connected with the connected first mutually trusted device.

In some embodiments of the present application, when the internet of things device determines to use the existing mutually trusted device, the step S406 and the subsequent steps in the above example one may be continuously performed, so as to provide a personalized recommendation function for the user.

S614: After discovering a connectable mutually trusted device, the internet of things device establishes a communication connection with the mutually trusted device.

The specific implementation of this step may refer to the methods described in steps S401 to S406 in the above example one, and will not be described herein. After the internet of things device establishes a communication connection with the mutually trusted device, the method described in steps S407 to S417 in the above example one may be referred to for providing services for the user.

According to the scheme, the Internet of things equipment is used as a main body, the server generates the voiceprint key and re-encrypts voiceprint data, the mutually trusted equipment deletes stored voiceprint data, applications and the like, so that hot plug type mutual assistance relation construction and switching are realized, and the flexibility is high. The internet of things equipment can support the localized management of a plurality of user identifiers under the family account by means of the storage capacity of the mutually trusted equipment.

Example three

In the above embodiments, after the mutually trusted device obtains the voiceprint database of the internet of things associated home group from the server, the user identifier and the voiceprint database corresponding to the internet of things associated home group may be managed by the internet of things device and the server. Specifically, the method can comprise two cases of registration (i.e. new establishment) and deletion of the user identification. The following description is made in connection with the specific embodiment.

Referring to fig. 7, taking the case in which the server stores the voiceprint database of the family group associated with the internet of things device as an example, the flow of a method for registering a user identifier provided in an embodiment of the present application may include:

S701: The internet of things device receives a request, issued by the user, to register a user identifier.

The user can send a request for registering the user identifier in a voice mode, and the Internet of things equipment can receive and identify the voice instruction sent by the user, so that the user is determined to need to register the user identifier.

For example, the user C may say "i want to register a new user", and after receiving and identifying the corresponding audio, the internet of things device may determine that the user C needs to register a new user identifier, and send a request for registering an account number to the mutually trusted device.

S702: the internet of things device sends a user registration request to the mutually trusted device, the request being for requesting registration of a user identity.

S703: the mutually trusted equipment displays prompt information, and the prompt information is used for prompting a user to input a user identification name and corresponding user audio.

The user identification name can be used as a display identification of the user identification, and the mutually trusted device can provide the user identification for the user in a mode of displaying the user identification name.

By way of example, taking a mutually trusted device as a mobile phone, the mutually trusted device may display, through a mutually trusted application, a user name entry interface shown in the schematic diagram (a) in fig. 8, where the interface includes prompt information (e.g., an input box and a prompt term "please input a user name") for prompting a user to enter a user identification name (i.e., a user name). User C, upon entering the user identification name (e.g., "small C"), may trigger the trusted device to display the user audio entry interface shown in diagram (b) of fig. 8, including prompt information (e.g., "please speak the following: 12345") for prompting the user to speak the setting content (e.g., "12345"), by clicking on the "next" control shown in diagram (a) of fig. 8. After the prompt information is displayed, the mutually trusted device can simultaneously instruct the Internet of things device to collect audio data corresponding to the voice of the user speaking the set content, and send the collected audio data to the mutually trusted device. For example, after the prompt information is displayed, the user C may speak "12345", and the internet of things device may collect the audio data of the corresponding user C and send the audio data to the mutually trusted device.

In some embodiments of the present application, as an optional implementation, the mutually trusted application installed in the mutually trusted device may provide a control entry for registering a user identifier, and the user may trigger the user identifier registration procedure by operating the control entry. When the mutually trusted device determines, according to the user's operation, to register a user identifier, it may display prompt information prompting the user to input the user identification name and the corresponding user audio.

In some embodiments of the present application, the internet of things device may not perform the steps S701 to S702 described above, but the mutually trusted device directly performs the step S703. For example, the user may trigger the mutually trusted device to execute the step S703 in a scanning manner, and specifically, the user may scan an identification code (for example, a two-dimensional code or the like) associated with the internet of things device by using the mutually trusted device, and after scanning, the mutually trusted device executes the step S703 to display prompt information, so as to prompt the user to register.

S704: the mutually trusted equipment receives the user identification name input by the user and takes the user identification name as the newly added user identification name.

S705: the internet of things device receives user audio.

S706: After transcoding the user audio, the internet of things device sends the user audio to the mutually trusted device.

S707: The mutually trusted device extracts voiceprint data according to the received user audio, and takes the extracted voiceprint data as the newly added voiceprint data.

In some embodiments of the present application, the functions of steps S705 to S706 may also be performed by a mutually trusted device.

S708: The mutually trusted device sends the newly added user identification name and the newly added voiceprint data to the server.

For example, after the mutually trusted device receives the audio data of the user C sent by the internet of things device, the voiceprint data of the user C can be extracted from the audio data, and the voiceprint data of the user C (as newly added voiceprint data) and the user identification name "small C" of the user C (as newly added user identification name) are sent to the server.

Specifically, the mutually trusted device may package, encrypt and sign the newly added user identifier name and the newly added voiceprint data, and then send the obtained information packet to the server, and instruct the server to register the user identifier for use.

S709: the server allocates a corresponding new user identifier for the new user identifier name and the new voiceprint data, and stores the new user identifier and the corresponding new voiceprint data into a voiceprint database of the family group associated with the Internet of things equipment.

For example, after receiving the voiceprint data of the user C and the user identification name "small C" of the user C, the server may allocate a corresponding user identification, for example, "small C", to the user C, and store the voiceprint data of the user C and the user identification "small C" in the voiceprint database of the internet of things device home group. When the home account number associated with the internet of things device is "home of small a" in the above embodiment, the user identifier associated with the home account number may be "small a" and "small B", and the voiceprint database of the internet of things device associated home group includes voiceprint data of user a and voiceprint data of user B, where the voiceprint data of user a is associated with the user identifier "small a", and the voiceprint data of user B is associated with the user identifier "small B". After the server stores the voiceprint data of the user C and the user identifier of 'small C' into a voiceprint database of the family group of the Internet of things equipment, the voiceprint data of the user C is added to the voiceprint data in the database, and the voiceprint data of the user C is associated with the user identifier of 'small C'.
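Step S709 can be pictured with a small in-memory model of the voiceprint database of one family group. This is a sketch under assumptions: the class and method names are hypothetical, the voiceprint vectors are made-up values, and reusing the user identification name as the user identifier (and rejecting duplicates) is just one simple allocation policy consistent with the "small C" example.

```python
class VoiceprintDatabase:
    """Voiceprint database of one family group, keyed by user identifier."""

    def __init__(self, family_account):
        self.family_account = family_account
        self.voiceprints = {}

    def register(self, user_name, voiceprint):
        """Allocate a user identifier and store it with the new voiceprint data."""
        user_id = user_name  # assumed policy: the identifier equals the name
        if user_id in self.voiceprints:
            raise ValueError(f"user identifier already registered: {user_id}")
        self.voiceprints[user_id] = list(voiceprint)
        return user_id


database = VoiceprintDatabase("family of small A")
database.register("small A", [0.9, 0.1, 0.2])
database.register("small B", [0.1, 0.8, 0.5])
new_user_id = database.register("small C", [0.3, 0.3, 0.9])
```

After the call for user C, the database associates three user identifiers with their voiceprint data, which is the state the server returns to the mutually trusted device in step S710.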

S710: the server returns the voiceprint database of the family group associated with the internet of things device to the mutually trusted device.

For example, the voiceprint database of the family group associated with the internet of things device may be the voiceprint database corresponding to the family account "family of small A". This voiceprint database includes the voiceprint data of user A, user B, and user C, where the voiceprint data of user A is associated with the user identifier "small A", the voiceprint data of user B is associated with the user identifier "small B", and the voiceprint data of user C is associated with the user identifier "small C".

S711: the mutually trusted device stores the received voiceprint database of the family group associated with the internet of things device.

In some embodiments of the present application, when the mutually trusted device is used as a device for storing the voiceprint database, the mutually trusted device may allocate a corresponding newly added user identifier to the newly added user identifier name and the newly added voiceprint data, and store the newly added user identifier name, the newly added user identifier, and the newly added voiceprint data locally. The mutually trusted device can update the newly added voiceprint data into a locally stored voiceprint database.

S712: the mutually trusted device sends notification information to the Internet of things device, wherein the notification information is used for notifying that the user identification is successfully registered.

As an optional implementation manner, after receiving the notification information, the internet of things device may play a prompt audio to inform the user that the user identifier has been successfully registered.

As another optional implementation manner, after updating the newly added user identifier and the corresponding newly added voiceprint data to the voiceprint database of the stored family group associated with the internet of things device, the mutually trusted device may directly display prompt information for prompting that the user identifier has been successfully registered.

In some embodiments of the present application, the steps S701 to S706 may be replaced by the following steps C1 to C4:

C1: the internet of things device receives a voice instruction issued by a user and collects the corresponding user audio, where the voice instruction is used for instructing registration of a user identifier.

C2: the internet of things device sends an account registration request to the mutually trusted device, where the request is used for requesting registration of a user identifier and carries the user audio collected by the internet of things device.

For example, after receiving and recognizing the corresponding audio, the internet of things device may determine that the user needs to register a new user identifier, and may then send an account registration request to the mutually trusted device together with the received audio. For the manner in which the internet of things device sends the received audio to the mutually trusted device, refer to the methods described in steps S407 to S408 above.

C3: the mutually trusted device displays prompt information, where the prompt information is used for prompting the user to input a user identification name.

C4: the mutually trusted device receives the user identification name input by the user and takes it as the newly added user identification name.

After executing the steps C1 to C4, each device continues to execute the steps S707 to S712, thereby realizing the registration of the user identifier.

With this method, the user's voiceprint is captured at the same time as the user issues the voice instruction, so the user does not need to record the voiceprint separately. This improves the user experience as well as the registration efficiency and practicability of the scheme.

In the embodiments of the present application, when a user identifier is to be deleted, the internet of things device may receive and recognize a user identifier deletion request issued by the user by voice, and thereby determine that the user needs to delete a user identifier. Having determined this, the internet of things device may send a request for deleting the user identifier to the mutually trusted device, where the request indicates the user identifier to be deleted. After receiving the request, the mutually trusted device determines the user identifier to be deleted according to the request, and deletes that user identifier and the voiceprint data corresponding to it from its stored voiceprint database. Meanwhile, the mutually trusted device may request the server to delete the same user identifier and corresponding voiceprint data from the voiceprint database stored on the server, so that the user identifier and the corresponding voiceprint data are deleted on both sides.
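The deletion flow can be sketched as follows. The class `VoiceprintStore`, the function `handle_delete_request`, and the request layout are hypothetical names introduced for illustration; the call on `server_store` stands in for the network request that the mutually trusted device would send to the server.

```python
class VoiceprintStore:
    """Hypothetical store of user identifier -> voiceprint data (device side or server side)."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def delete(self, user_id):
        # Remove the identifier together with its voiceprint data.
        # Returns True only if the identifier was present.
        return self.entries.pop(user_id, None) is not None


def handle_delete_request(request, local_store, server_store):
    """Mutually trusted device: delete locally, then ask the server to do the same."""
    user_id = request["user_id"]
    removed_locally = local_store.delete(user_id)
    removed_on_server = server_store.delete(user_id)  # stand-in for a server request
    return removed_locally and removed_on_server


# Made-up voiceprint values for the example.
local_store = VoiceprintStore({"small A": [0.9, 0.1], "small C": [0.5, 0.5]})
server_store = VoiceprintStore({"small A": [0.9, 0.1], "small C": [0.5, 0.5]})
deleted = handle_delete_request({"user_id": "small C"}, local_store, server_store)
```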

Optionally, the mutually trusted device in the above method may be the first mutually trusted device in the above embodiment or the second mutually trusted device in the above embodiment or other electronic devices logged in to a home account number associated with the internet of things device. The internet of things equipment can register or delete user identification by means of any mutually-trusted equipment connected with the internet of things equipment.

In this scheme, the internet of things device serves as the main entry point to the home account. With the support of the mutually trusted device, management functions such as adding and deleting user identifiers can be implemented conveniently without significantly increasing the hardware requirements on the internet of things device side, which makes the scheme easier to implement and more practical and general.

It should be noted that, the specific implementation process provided in the foregoing embodiments is merely an illustration of a process flow applicable to the embodiments of the present application, and specific implementation may refer to the description in the foregoing embodiments. In addition, the execution sequence of each step in each embodiment may be adjusted according to the actual requirement, and other steps may be added or part of steps may be reduced.

Based on the above embodiments and the same concept, the embodiments of the present application further provide a device control method, as shown in fig. 9, which may include:

S901: the first electronic device establishes a communication connection with the second electronic device.

The first electronic device may be the first mutually trusted device/the second mutually trusted device/the mutually trusted device described in the foregoing embodiment, and the second electronic device may be the internet of things device described in the foregoing embodiment.

In some embodiments of the present application, the process in which the first electronic device establishes a communication connection with the second electronic device includes the following: the first electronic device receives a connection request sent by the second electronic device, where the connection request is used for requesting establishment of a communication connection with the first electronic device; the first electronic device displays first prompt information, which is used for prompting the user to confirm whether to establish the communication connection with the second electronic device; and, in response to a received first operation, the first electronic device sends notification information to the second electronic device, where the first operation is used for confirming establishment of the communication connection with the second electronic device, and the notification information is used for notifying that the communication connection is established.

The first prompt information may be, for example, the connection prompt information described in the foregoing embodiment, and the first operation may be the connection indicating operation described in the foregoing embodiment.
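The connection procedure described above can be sketched as follows. The function `handle_connection_request`, the message fields, and the `ask_user` callback (which stands in for displaying the first prompt information and receiving the first operation) are assumptions for illustration. The source does not state what happens when the user declines, so returning `None` in that case is also an assumption.

```python
def handle_connection_request(request, ask_user):
    """First electronic device: prompt the user, then notify the requester on confirmation."""
    # First prompt information shown to the user (wording is illustrative).
    prompt = f"Establish a connection with {request['device_name']}?"
    if ask_user(prompt):
        # The first operation was received: notify the second electronic device.
        return {"type": "notification", "connected": True, "to": request["device_id"]}
    # Assumed behavior when the user does not confirm: send nothing.
    return None


# Hypothetical connection request from an internet of things device.
request = {
    "type": "connection_request",
    "device_id": "speaker-01",
    "device_name": "smart speaker",
}
accepted = handle_connection_request(request, ask_user=lambda prompt: True)
declined = handle_connection_request(request, ask_user=lambda prompt: False)
```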

In some embodiments of the present application, after the first electronic device establishes a communication connection with the second electronic device, the first electronic device may receive third audio from the second electronic device, where the third audio is the audio of a third user. After receiving the third audio, the first electronic device may obtain a second voice print according to the third audio and send a third request to the server, where the third request includes the second voice print. The first electronic device may then acquire third user information returned by the server and store the third user information and the second voice print, where the third user information is used for identifying the third user and is associated with the second voice print. Optionally, the third audio is used for registering the second voice print and the third user information corresponding to the second voice print on the server.

For example, the third user may be the user C described in the foregoing embodiment, and the second voice print may be the voice print of the user C, and the third user information may be the user identifier of the user C.

S902: the first electronic device receives first audio from the second electronic device; the first audio is the audio of the first user.

The first audio may be, for example, a voice command issued by the user in the above embodiment.

S903: the first electronic device acquires a first audio feature and first user information according to the first audio; the first audio feature does not contain information for identifying the first user, and the first user information is used for identifying the first user.

In some embodiments of the application, the first electronic device may extract the first audio feature from the first audio. The first electronic device may also extract a first voiceprint from the first audio, acquire a first user identifier associated with the first voiceprint, and take the first user identifier as the first user information. To do so, the first electronic device may acquire, from the server, a voiceprint set corresponding to a first account and at least one user identifier corresponding to at least one voiceprint included in the voiceprint set, and select the first user identifier corresponding to the first voiceprint from the at least one user identifier; wherein the voiceprint set comprises the first voiceprint. Optionally, the first account is associated with the second electronic device.

The first account may be a home account as described in the foregoing embodiment, and the voiceprint set may be a voiceprint database of a family group associated with the internet of things device as described in the foregoing embodiment.

For example, the first user may be the user A described in the foregoing embodiment, in which case the first voiceprint is the voiceprint of user A and the first user information is the user identifier of user A.
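Selecting the user identifier that corresponds to an extracted voiceprint can be sketched as follows. The application does not specify how voiceprints are compared, so the cosine-similarity matching, the threshold of 0.8, the vector representation, and the function names are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_user(voiceprint, voiceprint_set, threshold=0.8):
    """Return the user identifier whose enrolled voiceprint matches best, or None."""
    best_id, best_score = None, threshold
    for user_id, enrolled in voiceprint_set.items():
        score = cosine_similarity(voiceprint, enrolled)
        if score >= best_score:
            best_id, best_score = user_id, score
    return best_id


# Hypothetical voiceprint set obtained from the server for the first account.
voiceprint_set = {"small A": [0.9, 0.1, 0.0], "small B": [0.1, 0.9, 0.2]}
first_user_info = identify_user([0.88, 0.12, 0.01], voiceprint_set)
unknown_user = identify_user([0.0, 0.0, 1.0], voiceprint_set)
```

Only the resulting identifier, not the voiceprint itself, would then be placed in the first request as the first user information.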

S904: the first electronic device sends a first request to a server, wherein the first request includes a first audio feature and first user information.

S905: the first electronic device obtains a first response returned by the server, wherein the first response is used for indicating first content determined by the server according to the first audio feature and the first user information.

S906: the first electronic device sends a first message including the first content to the second electronic device to cause the second electronic device to play the first content according to the first message.

In some embodiments of the present application, the first response includes information indicating a storage location of the first content, and the first electronic device may determine the storage location of the first content according to the first response and obtain the first content from the storage location of the first content.

The first response may be response information described in the above embodiment, and the first content may be music described in the above embodiment.
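Steps S904 to S906, including the optional lookup of the content by its storage location, can be sketched as follows. All names, the message fields, the `store://` location string, and the in-memory `CONTENT_STORE` are hypothetical; `server_respond` is a stand-in for the real server and ignores personalization for brevity.

```python
# Hypothetical content store keyed by storage location (stands in for remote storage).
CONTENT_STORE = {"store://music/42": b"<audio bytes>"}


def build_first_request(audio_feature, user_info):
    # S904: the request carries only the audio feature and the user information.
    return {"audio_feature": audio_feature, "user_info": user_info}


def server_respond(request):
    # Stand-in for the server: the response indicates where the content is stored.
    return {"content_location": "store://music/42"}


def resolve_content(response, store=CONTENT_STORE):
    # Determine the storage location from the first response, then fetch the content.
    return store[response["content_location"]]


def build_first_message(content):
    # S906: message sent to the second electronic device for playback.
    return {"action": "play", "content": content}


request = build_first_request("play music", "small A")
response = server_respond(request)
message = build_first_message(resolve_content(response))
```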

Optionally, after the above step S901, the following steps D1 to D5 may be further performed:

D1: the first electronic device receives second audio from the second electronic device; wherein the second audio is the audio of a second user.

D2: the first electronic device acquires a second audio feature and second user information according to the second audio; wherein the second audio feature is the same as the first audio feature, and the second user information is used to identify the second user.

D3: the first electronic device sends a second request to the server, wherein the second request includes the second audio feature and the second user information.

D4: the first electronic device obtains a second response returned by the server, wherein the second response is used for indicating second content determined by the server according to the second audio feature and the second user information.

D5: the first electronic device sends a second message including the second content to the second electronic device, so that the second electronic device plays the second content according to the second message.

In some embodiments of the application, the first content is associated with the first user, the second content is associated with the second user, and the second content is different from the first content.

In some embodiments of the present application, the first content and the second content may each be audio content or text content. The audio content may be, for example, music, audio that broadcasts weather or traffic conditions, or audio that reads a novel aloud. The text content may be, for example, news content or novel content. The first content may be delivered by the server according to the first user information of the first user, and the second content may be delivered by the server according to the second user information of the second user.
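How the same audio feature can lead to different content for different users can be sketched as follows. The lookup table `CONTENT_BY_USER`, the function `determine_content`, the playlist names, and the representation of the audio feature as a command string are assumptions for illustration; the application does not describe the server's selection logic at this level.

```python
# Hypothetical server-side mapping from (audio feature, user information) to content.
CONTENT_BY_USER = {
    ("play music", "small A"): "playlist of small A",
    ("play music", "small B"): "playlist of small B",
}


def determine_content(audio_feature, user_info):
    """Pick content for a request; fall back to a default when no entry exists."""
    return CONTENT_BY_USER.get((audio_feature, user_info), "default playlist")


# The first and second requests carry the same audio feature but different user information.
first_content = determine_content("play music", "small A")
second_content = determine_content("play music", "small B")
```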

In some embodiments of the application, the first electronic device may log onto the first account prior to receiving the first audio from the second electronic device; alternatively, the first electronic device may log onto a first account prior to receiving third audio from the second electronic device, wherein the second voice print is associated with the first account. Optionally, the first account is associated with the second electronic device.

For example, the second user may be the user B described in the foregoing embodiment, and the second voice print is the voice print of the user B, and the second user information is the user identifier of the user B.

The specific implementation of each step may refer to the description of the related content in the foregoing embodiments, which is not repeated herein.

Based on the above embodiments and the same concept, the embodiments of the present application further provide an electronic device, where the electronic device is configured to implement the device control method provided by the embodiments of the present application. As shown in fig. 10, the electronic device 1000 may include: a display 1001, a memory 1002, one or more processors 1003, and one or more computer programs (not shown). The components described above may be coupled by one or more communication buses 1004.

The display 1001 is used for displaying related user interfaces such as images, videos, application interfaces, and the like. The memory 1002 has stored therein one or more computer programs (code) comprising computer instructions; the one or more processors 1003 invoke the computer instructions stored in the memory 1002 to cause the electronic device 1000 to execute the device control method provided by the embodiment of the present application.

In particular implementations, the memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as one or more disk storage devices, flash memory devices, or other non-volatile solid state storage devices. The memory 1002 may store an operating system (hereinafter referred to as a system), such as ANDROID, IOS, WINDOWS, or an embedded operating system such as LINUX. The memory 1002 may be used to store implementation programs for embodiments of the present application. The memory 1002 may also store network communication programs that may be used to communicate with one or more additional devices, one or more user devices, and one or more network devices. The one or more processors 1003 may be a general purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the present application.

In some embodiments of the present application, the first electronic device, the second electronic device, and the server described in the foregoing embodiments may all be implemented as the structure shown in fig. 10.

It should be noted that fig. 10 is merely an implementation of the electronic device 1000 according to an embodiment of the present application, and in practical application, the electronic device 1000 may further include more or fewer components, which is not limited herein.

Based on the above embodiments and the same concept, the embodiments of the present application further provide a device control system, including a first electronic device and a server, where the first electronic device is configured to perform the method performed by the first electronic device described in the above embodiments, and the server is configured to perform the method performed by the server described in the above embodiments.

Optionally, the system further includes a second electronic device, where the second electronic device is configured to perform the method performed by the second electronic device in the foregoing embodiment.

It should be noted that, the method executed by each electronic device in the system may refer to the method described in the foregoing embodiment, which is not described herein again.

Based on the above embodiments and the same idea, the embodiments of the present application further provide a computer-readable storage medium storing a computer program, which when run on a computer, causes the computer to perform the method provided by the above embodiments of the present application.

Based on the above embodiments and the same conception, the present embodiments also provide a computer program product comprising a computer program or instructions that, when run on a computer, cause the computer to perform the methods provided by the above embodiments of the present application.

The method provided by the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the method may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a user device, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by the computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., an SSD).

It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. A device control method, characterized by comprising:

a first electronic device establishes a communication connection with a second electronic device;

the first electronic device receives first audio from the second electronic device; wherein the first audio is the audio of a first user;

the first electronic device acquires a first audio feature and first user information according to the first audio; wherein the first audio feature does not include information for identifying the first user, and the first user information is used for identifying the first user;

the first electronic device sends a first request to a server, wherein the first request includes the first audio feature and the first user information;

the first electronic device obtains a first response returned by the server, wherein the first response is used for indicating first content determined by the server according to the first audio feature and the first user information;

the first electronic device sends a first message including the first content to the second electronic device, so that the second electronic device plays the first content according to the first message.

2. The method of claim 1, wherein after the first electronic device establishes a communication connection with the second electronic device, the method further comprises:

the first electronic device receives second audio from the second electronic device; wherein the second audio is the audio of a second user;

the first electronic device acquires a second audio feature and second user information according to the second audio; wherein the second audio feature is the same as the first audio feature, the second user information being used to identify the second user;

the first electronic device sends a second request to the server, wherein the second request includes the second audio feature and the second user information;

the first electronic device obtains a second response returned by the server, wherein the second response is used for indicating second content determined by the server according to the second audio feature and the second user information;

the first electronic device sends a second message including the second content to the second electronic device, so that the second electronic device plays the second content according to the second message.

3. The method of claim 2, wherein the first content is associated with the first user, the second content is associated with the second user, and the second content is different from the first content.

4. The method of any one of claims 1-3, wherein the first content is audio content or text content.

5. The method of any of claims 1-4, wherein the first electronic device establishes a communication connection with a second electronic device, comprising:

the first electronic device receives a connection request sent by the second electronic device, wherein the connection request is used for requesting to establish a communication connection with the first electronic device;

the first electronic device displays first prompt information;

the first electronic device responds to the received first operation and sends notification information to the second electronic device, wherein the notification information is used for notifying establishment of communication connection.

6. The method of any of claims 1-5, wherein the first electronic device obtaining a first audio feature and first user information from the first audio comprises:

the first electronic device extracts the first audio feature from the first audio;

the first electronic device extracts a first voiceprint from the first audio;

the first electronic device acquires a first user identifier associated with the first voiceprint and takes the first user identifier as the first user information.

7. The method of claim 6, wherein after the first electronic device extracts a first voiceprint from the first audio, before the first electronic device obtains a first user identification associated with the first voiceprint, the method further comprising:

the first electronic device obtains a voiceprint set corresponding to a first account from the server and at least one user identifier corresponding to at least one voiceprint contained in the voiceprint set; wherein the voiceprint set comprises the first voiceprint;

the first electronic device obtaining a first user identifier associated with the first voiceprint includes:

the first electronic device selects the first user identifier corresponding to the first voiceprint from the at least one user identifier.

8. The method of any of claims 1-7, wherein the first response includes information indicating a storage location of the first content; after the first electronic device obtains the first response returned by the server, before the first electronic device sends a first message including the first content to the second electronic device, the method further includes:

the first electronic device determines the storage location of the first content according to the first response;

the first electronic device obtains the first content from a storage location of the first content.

9. The method of any of claims 1-8, wherein after the first electronic device establishes a communication connection with the second electronic device, the method further comprises:

the first electronic device receives third audio from the second electronic device, wherein the third audio is audio of a third user;

the first electronic device obtains a second voice print according to the third audio;

the first electronic device sends a third request to the server, wherein the third request comprises the second voice print;

the first electronic device acquires third user information returned by the server; wherein the third user information is used for identifying the third user, and the third user information is associated with the second voice print;

the first electronic device stores the third user information and the second voice print.

10. The method of claim 9, wherein the third audio is used to register the second voice print and the third user information corresponding to the second voice print on the server.

11. The method of claim 9 or 10, wherein before the first electronic device receives the first audio from the second electronic device, the method further comprises: the first electronic device logs in to a first account; or

before the first electronic device receives the third audio from the second electronic device, the method further comprises: the first electronic device logs in to a first account, wherein the second voice print is associated with the first account.

12. The method of any one of claims 1-11, wherein the second electronic device is an internet of things device.

13. An electronic device comprising a memory and one or more processors;

wherein the memory is for storing computer program code, the computer program code comprising computer instructions; the computer instructions, when executed by the one or more processors, cause the electronic device to perform the method of any of claims 1-12.

14. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when run on a computer, causes the computer to perform the method according to any one of claims 1-12.