CN108648754B - Voice control method and device

Info

Publication number: CN108648754B
Application number: CN201810386446.0A
Authority: CN
Inventors: 王旭; 张建春; 郭峰; 刘广鑫
Original assignee: Beijing Xiaomi Mobile Software Co Ltd
Current assignee: Beijing Xiaomi Mobile Software Co Ltd
Priority date: 2018-04-26
Filing date: 2018-04-26
Publication date: 2021-09-21
Anticipated expiration: 2038-04-26
Also published as: CN108648754A

Abstract

The disclosure relates to a voice control method and device. The method comprises the following steps: receiving a voice instruction recognition result sent by a first terminal, wherein the first terminal is associated with a first user account; performing semantic processing on the voice instruction recognition result to obtain operation information, wherein the operation information comprises a second user account and operation content; when the second user account and the first user account are in a friend relationship, searching a second terminal associated with the second user account; and sending an operation instruction carrying the operation content to the second terminal, and instructing the second terminal to execute the operation content through a target application, wherein the target application is an application program for voice control. The method and the device can meet the requirement that the user interacts with the voice assistant of the friend through the voice assistant of the user, and improve user experience.

Description

Voice control method and device

Technical Field

The present disclosure relates to the field of communications technologies, and in particular, to a voice control method and apparatus.

Background

A voice assistant is an intelligent application that can run on smart devices such as mobile phones, televisions, computers, and smart speakers. It receives the user's audio signal through the device's microphone, performs semantic interpretation, and then responds quickly in the foreground, for example by chatting with the user by voice or by helping the user operate the device according to an instruction. A voice assistant must be able to wake up, listen, understand, and speak. Behind these capabilities are machine learning and data mining algorithms, speech recognition, semantic understanding, and speech synthesis technologies, supported by a cloud-based speech knowledge database.

In the related art, after receiving a voice command of a user, a voice assistant controls a user equipment to perform an operation corresponding to the voice command.

Disclosure of Invention

In order to overcome the problems in the related art, embodiments of the present disclosure provide a voice control method and apparatus. The technical scheme is as follows:

according to a first aspect of the embodiments of the present disclosure, there is provided a voice control method applied to a cloud server, the voice control method including:

receiving a voice instruction recognition result sent by a first terminal, wherein the first terminal is associated with a first user account;

performing semantic processing on the voice instruction recognition result to obtain operation information, wherein the operation information comprises a second user account and operation content;

when the second user account and the first user account are in a friend relationship, searching a second terminal associated with the second user account;

and sending an operation instruction carrying the operation content to the second terminal, wherein the operation instruction is used for instructing the second terminal to execute the operation content through a target application, and the target application is an application program for voice control.

In one embodiment, the sending the operation instruction to the second terminal includes:

judging whether the first user account has the authority of controlling the second terminal to execute the operation content through the target application;

and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

In one embodiment, the method further comprises:

receiving a friend verification request which is sent by the first terminal and carries the first user account and the second user account;

and forwarding the friend verification request to a second terminal associated with the second user account, wherein the friend verification request is used for requesting to establish a friend relationship between the second user account and the first user account.

In one embodiment, the operation information further includes at least any one or a combination of the following: the transmission time of the operation content, or the execution time of the operation content.

In one embodiment, the type of the operation content includes at least any one or a combination of the following: message content to be left, voicemail content, to-do content to be added to a calendar, or to-do reminder content.

According to a second aspect of the embodiments of the present disclosure, there is provided a voice control method applied to a first terminal, the voice control method including:

receiving a voice instruction through a target application, wherein the target application is an application program for voice control, and the first terminal is associated with a first user account;

analyzing the voice command to obtain a voice command recognition result;

and sending the voice instruction recognition result to a cloud server.

In one embodiment, the method further comprises:

acquiring a second user account;

and sending a friend verification request carrying the first user account and the second user account to the cloud server, wherein the friend verification request is used for requesting to establish a friend relationship between the second user account and the first user account.

According to a third aspect of the embodiments of the present disclosure, there is provided a voice control method applied to a second terminal, the voice control method including:

receiving an operation instruction sent by a cloud server, wherein the operation instruction comprises operation content;

executing the operation content through a target application, wherein the target application is an application program for voice control.

In one embodiment, the method further comprises:

receiving a friend verification request which is sent by the cloud server and carries the first user account and the second user account, wherein the second user account is associated with the second terminal;

and establishing a friend relationship between the first user account and the second user account according to the friend verification request.

According to a fourth aspect of the embodiments of the present disclosure, there is provided a voice control apparatus including:

the first receiving module is used for receiving a voice instruction recognition result sent by a first terminal, and the first terminal is associated with a first user account;

the processing module is used for performing semantic processing on the voice instruction recognition result to obtain operation information, wherein the operation information comprises a second user account and operation content;

the searching module is used for searching a second terminal associated with the second user account when the second user account is in a friend relationship with the first user account;

a first sending module, configured to send an operation instruction carrying the operation content to the second terminal, where the operation instruction is used to instruct the second terminal to execute the operation content through a target application, and the target application is an application program for voice control.

In one embodiment, the first sending module determines whether the first user account has a right to control the second terminal to execute the operation content through the target application; and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

In one embodiment, the apparatus further comprises:

a second receiving module, configured to receive a friend verification request that is sent by the first terminal and carries the first user account and the second user account;

and the forwarding module is used for forwarding the friend verification request to a second terminal associated with the second user account, wherein the friend verification request is used for requesting to establish a friend relationship between the second user account and the first user account.

According to a fifth aspect of the embodiments of the present disclosure, there is provided a voice control apparatus including:

the third receiving module is used for receiving a voice instruction through a target application, the target application is an application program used for voice control, and the first terminal is associated with the first user account;

the analysis module is used for analyzing the voice command to obtain a voice command recognition result;

and the second sending module is used for sending the voice instruction recognition result to a cloud server.

According to a sixth aspect of the embodiments of the present disclosure, there is provided a voice control apparatus including:

the fourth receiving module is used for receiving an operation instruction sent by the cloud server, wherein the operation instruction comprises operation content;

and the execution module is used for executing the operation content through a target application, and the target application is an application program for voice control.

According to a seventh aspect of the embodiments of the present disclosure, there is provided a voice control apparatus including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

According to an eighth aspect of the embodiments of the present disclosure, there is provided a voice control apparatus including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

analyzing the voice command to obtain a voice command recognition result;

and sending the voice instruction recognition result to a cloud server.

According to a ninth aspect of the embodiments of the present disclosure, there is provided a voice control apparatus including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

According to a tenth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of the first aspect described above.

According to an eleventh aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of the second aspect described above.

According to a twelfth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer instructions, which when executed by a processor, implement the steps of the method according to the third aspect described above.

The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects. The first terminal is associated with the first user account, the second terminal is associated with the second user account, and a friend relationship is established between the two accounts. When the cloud server performs semantic processing on the voice instruction recognition result sent by the first terminal and obtains the second user account and the operation content, it sends the operation content to the second terminal associated with the second user account, which is in a friend relationship with the first user account, and instructs the second terminal to execute the operation content through the target application. The user can thus interact with the target application on a friend's second terminal through the target application on the first terminal, which meets the user's need to interact with a friend's voice assistant through the user's own voice assistant and improves the user experience.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.

FIG. 1 is a diagram illustrating an application scenario of a voice control method according to an exemplary embodiment.

FIG. 2 is a flow chart illustrating a voice control method according to an exemplary embodiment.

FIG. 3 is a flow chart illustrating a voice control method according to an exemplary embodiment.

FIG. 4 is a flow chart illustrating a voice control method according to an exemplary embodiment.

FIG. 5 is a flow chart illustrating a voice control method according to an exemplary embodiment.

FIG. 6 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

FIG. 7 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

FIG. 8 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

FIG. 9 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

FIG. 10 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

FIG. 11 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

FIG. 12 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

FIG. 13 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

FIG. 14 is a block diagram illustrating a voice control apparatus according to an exemplary embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

In the related art, after receiving a user's voice command, a voice assistant controls the user's equipment to perform the operation corresponding to that command. However, such a voice assistant can only control the user's own device; it cannot meet the user's need to interact with a friend's voice assistant through the user's own voice assistant, which degrades the user experience.

In order to solve the above problem, an embodiment of the present disclosure provides a voice control method, which is applied to a cloud server, and the method includes: receiving a voice instruction recognition result sent by a first terminal, wherein the first terminal is associated with a first user account; performing semantic processing on the voice instruction recognition result to obtain operation information, wherein the operation information comprises a second user account and operation content; when the second user account and the first user account are in a friend relationship, searching a second terminal associated with the second user account; and sending an operation instruction carrying the operation content to the second terminal, and instructing the second terminal to execute the operation content through a target application, wherein the target application is an application program for voice control.

Referring to FIG. 1, an optional application scenario of the voice control method in the embodiment of the present disclosure is shown. The application scenario shown in FIG. 1 includes a terminal 11, a terminal 12, a network 13, and a cloud server 14, where the terminal 11 and the terminal 12 communicate through the cloud server 14. Each terminal is an electronic device, such as a smart phone, a smart speaker, a smart television, a tablet computer, a notebook computer, or a wearable device (such as a bracelet or smart glasses), that can run an application program for implementing voice control, such as a voice assistant. The cloud server 14 may be a single server or a server cluster composed of a plurality of servers. The network 13 may be, for example, a mobile communication network such as 2G/3G/4G/5G, or a wired network. It should be noted that the application scenario shown in FIG. 1 is only one possible example of an application scenario for the voice control method in the embodiment of the present disclosure, and other application scenarios may include devices not shown in FIG. 1.

The voice control method provided by the embodiment of the present disclosure can be applied to the above scenario. The first terminal is associated with the first user account, the second terminal is associated with the second user account, and a friend relationship is established between the first user account and the second user account. When the cloud server performs semantic processing on the voice instruction recognition result sent by the first terminal and obtains the second user account and the operation content, it sends the operation content to the second terminal associated with the second user account, which is in a friend relationship with the first user account, and instructs the second terminal to execute the operation content through the target application. The user can thus interact with the target application on a friend's second terminal through the target application on the first terminal, which meets the user's need to interact with a friend's voice assistant through the user's own voice assistant and improves the user experience.

Based on the above analysis, the following specific examples are proposed.

FIG. 2 is a flowchart illustrating a voice control method according to an exemplary embodiment, where an execution subject of the method may be a cloud server; as shown in FIG. 2, the method comprises the following steps 201-204:

in step 201, a voice instruction recognition result sent by a first terminal is received, and the first terminal is associated with a first user account.

For example, a user logs in a target application of a first terminal by using a first user account, and the first terminal is associated with the first user account; the user sends a voice instruction to the target application. The first terminal analyzes the voice command to obtain a voice command recognition result, and sends the voice command recognition result to the cloud server. The target application is an application for voice control, such as a voice assistant.

In step 202, semantic processing is performed on the voice instruction recognition result to obtain operation information, where the operation information includes the second user account and the operation content.

In an example, after receiving a voice instruction recognition result sent by the first terminal, the cloud server performs semantic processing on the voice instruction recognition result to obtain operation information, where the operation information includes a second user account and operation content.
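As an illustration only, the semantic processing of step 202 could be sketched with a toy rule-based parser. The disclosure does not specify how the second user account and the operation content are extracted, so the contact table `CONTACTS`, the `OperationInfo` class, the function `parse_recognition_result`, and the "tell/remind <name> ..." phrasing are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from spoken names to user accounts (not from the source).
CONTACTS = {"alice": "account_alice", "bob": "account_bob"}


@dataclass
class OperationInfo:
    second_user_account: str
    operation_content: str


def parse_recognition_result(text: str) -> Optional[OperationInfo]:
    """Toy stand-in for semantic processing of a recognition result.

    Accepts utterances such as "tell <name> <content>" or
    "remind <name> to <content>"; returns None for anything else.
    """
    match = re.match(
        r"^(?:tell|remind)\s+(\w+)\s+(?:to\s+)?(.+)$", text.strip(), re.IGNORECASE
    )
    if match is None:
        return None
    account = CONTACTS.get(match.group(1).lower())
    if account is None:
        return None  # the spoken name does not resolve to a known account
    return OperationInfo(second_user_account=account, operation_content=match.group(2))
```

A production system would presumably use a trained language-understanding model rather than a regular expression; the sketch only shows the shape of the output, namely a target account plus the content to deliver.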

In step 203, when the second user account and the first user account are in a friend relationship, a second terminal associated with the second user account is searched.

In an example, the cloud server obtains in advance a friend relationship between user accounts and an association relationship between the user accounts and the terminal; the cloud server judges whether the second user account and the first user account are in a friend relationship or not; when the second user account and the first user account are in a friend relationship, searching a second terminal associated with the second user account; and when the second user account is in non-friend relationship with the first user account, ending the process.
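The friend check and terminal lookup of step 203 can be sketched as follows. This is a minimal in-memory sketch, assuming the cloud server keeps both relationships in simple dictionaries and sets; the class name `AccountRegistry` and its methods are hypothetical, and a real server would presumably use a database.

```python
from typing import Dict, FrozenSet, List, Set


class AccountRegistry:
    """In-memory stand-in for the relationships the cloud server stores in advance."""

    def __init__(self) -> None:
        self._friends: Set[FrozenSet[str]] = set()   # unordered account pairs
        self._terminals: Dict[str, List[str]] = {}   # account -> terminal ids

    def add_friendship(self, a: str, b: str) -> None:
        self._friends.add(frozenset((a, b)))

    def bind_terminal(self, account: str, terminal_id: str) -> None:
        self._terminals.setdefault(account, []).append(terminal_id)

    def are_friends(self, a: str, b: str) -> bool:
        return frozenset((a, b)) in self._friends

    def find_terminals(self, first_account: str, second_account: str) -> List[str]:
        """Return the second account's terminals, or [] when the accounts are not friends."""
        if not self.are_friends(first_account, second_account):
            return []  # non-friend relationship: the process ends here
        return list(self._terminals.get(second_account, []))
```

Storing each friendship as a `frozenset` makes the relationship symmetric, which matches the description that either terminal may initiate the friend request.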

Illustratively, a first user logs in a target application of a first terminal by using a first user account, and a second user logs in a target application of a second terminal by using a second user account; the first terminal can request the second terminal to establish a friend relationship between the second user account and the first user account through the cloud server; or, the second terminal may request the first terminal to establish a friend relationship between the second user account and the first user account through the cloud server.

For example, the first terminal acquires a second user account, and sends a friend verification request carrying the first user account and the second user account to the cloud server, where the friend verification request is used to request establishment of a friend relationship between the second user account and the first user account. The cloud server receives a friend verification request which is sent by a first terminal and carries a first user account and a second user account, and forwards the friend verification request to a second terminal associated with the second user account, wherein the friend verification request is used for requesting to establish a friend relationship between the second user account and the first user account. The second terminal receives a friend verification request which is sent by the cloud server and carries a first user account and a second user account, the second user account is associated with the second terminal, and a friend relationship between the first user account and the second user account is established according to the friend verification request.
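The friend verification exchange described above can be sketched as three cooperating objects. All class and method names here are hypothetical, and the sketch assumes the second terminal accepts every request addressed to its own account, whereas a real client would presumably ask the second user to confirm.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class FriendVerificationRequest:
    first_user_account: str
    second_user_account: str


class SecondTerminal:
    def __init__(self, account: str) -> None:
        self.account = account
        self.friends: Set[str] = set()

    def on_friend_request(self, request: FriendVerificationRequest) -> bool:
        # Assumption: requests are auto-accepted; only the addressee is checked.
        if request.second_user_account != self.account:
            return False
        self.friends.add(request.first_user_account)
        return True


class CloudServer:
    def __init__(self) -> None:
        self._terminals: Dict[str, SecondTerminal] = {}

    def register(self, terminal: SecondTerminal) -> None:
        self._terminals[terminal.account] = terminal

    def forward_friend_request(self, request: FriendVerificationRequest) -> bool:
        """Forward the request to the terminal associated with the second account."""
        target = self._terminals.get(request.second_user_account)
        if target is None:
            return False  # no terminal is associated with that account
        return target.on_friend_request(request)
```

Here the first terminal's role is reduced to constructing the `FriendVerificationRequest` and handing it to `CloudServer.forward_friend_request`.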

In step 204, an operation instruction carrying the operation content is sent to the second terminal, where the operation instruction is used to instruct the second terminal to execute the operation content through a target application, and the target application is an application program for voice control.

Illustratively, the operation information further includes at least any one or a combination of the following: the transmission time of the operation content, or the execution time of the operation content.

For example, the type of the operation content includes at least any one or a combination of the following: message content to be left, voicemail content, to-do content to be added to a calendar, or to-do reminder content.
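One possible way to represent the operation information, including the optional times and the content types listed above, is sketched below. The names `OperationType`, `ScheduledOperation`, and `ready_to_send` are hypothetical, and the rule that an operation without a transmission time is sent immediately is an assumption rather than something the disclosure states.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class OperationType(Enum):
    MESSAGE = "message"
    VOICEMAIL = "voicemail"
    CALENDAR_TODO = "calendar_todo"
    TODO_REMINDER = "todo_reminder"


@dataclass
class ScheduledOperation:
    second_user_account: str
    operation_type: OperationType
    operation_content: str
    send_time: Optional[datetime] = None      # when the server should transmit
    execute_time: Optional[datetime] = None   # when the second terminal should act


def ready_to_send(operation: ScheduledOperation, now: datetime) -> bool:
    """Assumed rule: send at once if no transmission time is set, else wait for it."""
    return operation.send_time is None or now >= operation.send_time
```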

In an example, the cloud server sends an operation instruction carrying the operation content to the second terminal, and instructs the second terminal to execute the operation content through the target application. And after receiving the operation instruction sent by the cloud server, the second terminal executes the operation content through the target application.

For example, the implementation manner of sending the operation instruction to the second terminal includes: the cloud server judges whether the first user account has the authority of controlling the second terminal to execute the operation content through the target application; when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, an operation instruction is sent to the second terminal; and when the first user account does not have the right of controlling the second terminal to execute the operation content through the target application, ending the process.
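The permission check before sending can be sketched as a lookup followed by a conditional dispatch. The permission table keyed by account pair, the string operation types, and the injected `send` callable are assumptions made for illustration; the disclosure does not say how the rights are stored or how the instruction is transported.

```python
from typing import Callable, Dict, Set, Tuple

# (controlling account, controlled account) -> operation types that are permitted
Permissions = Dict[Tuple[str, str], Set[str]]


def send_operation_instruction(
    permissions: Permissions,
    first_account: str,
    second_account: str,
    operation_type: str,
    operation_content: str,
    send: Callable[[Dict[str, str]], None],
) -> bool:
    """Send the instruction only if the first account holds the required right."""
    allowed = permissions.get((first_account, second_account), set())
    if operation_type not in allowed:
        return False  # no right to control the second terminal: the process ends
    send(
        {
            "from_account": first_account,
            "operation_type": operation_type,
            "operation_content": operation_content,
        }
    )
    return True
```

Passing the transport in as `send` keeps the check independent of how the cloud server actually reaches the second terminal, for example a push channel.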

According to the technical solution provided by the embodiment of the present disclosure, the first terminal is associated with the first user account, the second terminal is associated with the second user account, and a friend relationship is established between the first user account and the second user account. When the cloud server performs semantic processing on the voice instruction recognition result sent by the first terminal and obtains the second user account and the operation content, it sends the operation content to the second terminal associated with the second user account, which is in a friend relationship with the first user account, and instructs the second terminal to execute the operation content through the target application. The user can thus interact with the target application on a friend's second terminal through the target application on the first terminal, which meets the user's need to interact with a friend's voice assistant through the user's own voice assistant and improves the user experience.

FIG. 3 is a flow chart illustrating a method of voice control according to an exemplary embodiment, the method being performed by a first terminal; as shown in fig. 3, the method comprises the following steps 301-303:

in step 301, a voice instruction is received by a target application, where the target application is an application program for voice control, and a first terminal is associated with a first user account.

In step 302, the voice command is analyzed to obtain a voice command recognition result.

In step 303, the voice command recognition result is sent to the cloud server.

According to this technical solution, the first terminal is associated with the first user account, parses the voice command, and sends the voice command recognition result to the cloud server, which enables the cloud server to meet the user's need to interact with a friend's target application through the user's own target application.
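Steps 301-303 on the first terminal can be sketched as a single function. The recognizer and the network call are injected as callables because the disclosure does not name a speech engine or a transport; `handle_voice_instruction` and the payload field names are hypothetical.

```python
from typing import Callable, Dict


def handle_voice_instruction(
    audio: bytes,
    first_user_account: str,
    recognize: Callable[[bytes], str],
    send_to_cloud: Callable[[Dict[str, str]], None],
) -> Dict[str, str]:
    """Receive a voice instruction, recognize it, and send the result to the cloud."""
    recognition_result = recognize(audio)  # step 302: analyze the voice command
    payload = {
        "first_user_account": first_user_account,
        "recognition_result": recognition_result,
    }
    send_to_cloud(payload)  # step 303: send the recognition result
    return payload
```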

Fig. 4 is a flow chart illustrating a voice control method according to an exemplary embodiment, an execution subject of the method may be a second terminal; as shown in fig. 4, the method comprises the following steps 401 and 402:

in step 401, an operation instruction sent by the cloud server is received, where the operation instruction includes operation content.

In step 402, the operation content is executed by the target application, which is an application program for voice control.

According to the technical scheme, the cloud server sends the operation instruction comprising the operation content to the second terminal according to the voice instruction recognition result sent by the first terminal, and the second terminal executes the operation content through the target application after receiving the operation instruction sent by the cloud server, so that the requirement that a user interacts with the target application of a friend through the target application of the user is met.
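On the second terminal, steps 401 and 402 amount to dispatching the received operation content to the matching feature of the target application. The sketch below assumes the instruction is a dictionary with `operation_type` and `operation_content` fields; the `TargetApplication` class and these field names are hypothetical.

```python
from typing import Callable, Dict, List


class TargetApplication:
    """Toy voice-control application running on the second terminal."""

    def __init__(self) -> None:
        self.messages: List[str] = []
        self.todos: List[str] = []
        # Map each supported operation type to the action that executes it.
        self._handlers: Dict[str, Callable[[str], None]] = {
            "message": self.messages.append,
            "calendar_todo": self.todos.append,
        }

    def execute(self, operation_instruction: Dict[str, str]) -> bool:
        """Execute the operation content; return False for unsupported types."""
        handler = self._handlers.get(operation_instruction.get("operation_type", ""))
        if handler is None:
            return False
        handler(operation_instruction["operation_content"])
        return True
```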

FIG. 5 is a flowchart illustrating a voice control method implemented by a first terminal, a cloud server, and a second terminal according to an exemplary embodiment. As shown in FIG. 5, on the basis of the foregoing embodiment, the voice control method according to the present disclosure may include the following steps 501-507:

in step 501, a first terminal receives a voice command through a target application, where the target application is an application program for voice control, and the first terminal is associated with a first user account.

For example, a standby offline wake-up mechanism is required so that the first terminal can detect a wake-up word (defined by the developer) through continuous listening. On the development side, a baseline model is trained on general conversational speech, and a command-word model is trained on recordings of the wake-up word. Wake-up works by computing the degree of match between the recorded audio and the trained model: if the match between the wake-up voice command and the trained command-word model reaches a threshold, the first terminal is woken up. After being woken up, the first terminal enters a listening mode. At the voice input end, the sound source is the most important input, so to minimize its distortion, techniques such as echo cancellation, noise reduction, sound source enhancement, and sound source filtering can be used to ensure its quality; the most common hardware solution is a microphone array.
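The threshold decision in the wake-up mechanism can be illustrated with a deliberately simplified matcher. Real wake-word detectors score audio against trained acoustic models; this sketch swaps in cosine similarity between two feature vectors as the "matching degree", and the function names and the default threshold of 0.9 are assumptions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_wake(
    features: Sequence[float],
    wake_word_template: Sequence[float],
    threshold: float = 0.9,
) -> bool:
    """Wake the terminal only when the matching degree reaches the threshold."""
    return cosine_similarity(features, wake_word_template) >= threshold
```

Raising the threshold reduces false wake-ups at the cost of more missed wake-ups, which is the usual trade-off when tuning such a detector.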

In step 502, the first terminal analyzes the voice command to obtain a voice command recognition result.

As an example, after the voice command is received and processed, speech recognition is performed. The first stage is Automatic Speech Recognition (ASR), whose principle depends mainly on three elements: frames, states, and phonemes. First, the sound is cut into short segments, each of which is a frame; several frames of speech form a state, every three states correspond to a phoneme, and several phonemes combine into a word. The result of speech recognition is thus obtained as output text.
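The first two structural steps of this description, cutting the signal into frames and grouping states into phonemes, can be sketched as follows. The function names, the use of overlapping frames with a hop length, and the example sizes are assumptions; how frames are assigned to states is the job of the acoustic model and is not shown.

```python
from typing import List, Sequence, Tuple


def split_into_frames(
    samples: Sequence[float], frame_length: int, hop_length: int
) -> List[List[float]]:
    """Cut a sampled signal into fixed-length frames, advancing by hop_length each time."""
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")
    frames = []
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frames.append(list(samples[start:start + frame_length]))
    return frames


def group_states_into_phonemes(
    states: Sequence[str], states_per_phoneme: int = 3
) -> List[Tuple[str, ...]]:
    """Group consecutive states so that every three states correspond to one phoneme."""
    return [
        tuple(states[i:i + states_per_phoneme])
        for i in range(0, len(states), states_per_phoneme)
    ]
```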

To recognize large amounts of speech accurately, a Hidden Markov Model (HMM) is used to construct a state network, and the path that best matches the sound is then searched for in that network. Recognition results are therefore limited to what the state network contains: to recognize arbitrary text, the network must be built large enough to contain a path for any text. This involves extensive training on large amounts of data; the more data, the higher the accuracy.
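Finding the best-matching path through an HMM state network is commonly done with the Viterbi algorithm. The disclosure does not name the search algorithm, so the following is a sketch of that standard technique on a tiny hand-written model; the state names, observation symbols, and probabilities in the usage note are invented for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def viterbi(
    observations: Sequence[str],
    states: Sequence[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> Tuple[float, List[str]]:
    """Return (probability, state path) of the most likely path for the observations."""
    if not observations:
        raise ValueError("observations must not be empty")
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best: List[Dict[str, Tuple[float, Optional[str]]]] = [
        {s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}
    ]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    probability = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return probability, path
```

For example, with states `["S1", "S2"]`, start probabilities `{"S1": 0.6, "S2": 0.4}`, and suitable transition and emission tables, `viterbi(["x", "y", "z"], ...)` returns the single most likely state sequence together with its probability. Production recognizers usually work with log-probabilities and pruning to keep the search over a large network tractable.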

In step 503, the first terminal sends a voice command recognition result to the cloud server.

In step 504, the cloud server receives a voice instruction recognition result sent by a first terminal, wherein the first terminal is associated with a first user account; and performing semantic processing on the voice instruction recognition result to obtain operation information, wherein the operation information comprises a second user account and operation content.

For example, the words or Chinese characters recognized by ASR ideally require Natural Language Understanding (NLU). However, current technology is still far from achieving NLU and has only reached the stage of Natural Language Processing (NLP). At present, NLP mainly builds a huge corpus and, through continuous training and analysis of grammar, syntax, semantics, and so on, uses statistical methods and deep learning to process natural-language semantics and achieve a simple understanding of them. For a machine to understand semantics as a human does, it would need a large amount of learning and perhaps even various sensors, so that it could truly form human-like thought and perceive objects or language as humans do, thereby achieving real emotional communication.

After semantic processing, dialog management and language synthesis must be carried out in combination with the context, including context understanding and context-based self-correction, so as to produce relatively accurate feedback. Different feedback can be formed for different scenarios and products, realizing voice interaction between human and machine. This whole process is required whether the user is controlling home devices by voice, conversing with a machine by voice, or performing a voice search through a search engine. Doing each link well has become the key to competition in the current voice market, because accurate and timely voice feedback is very important to users.

In an example, when the first terminal receives a voice instruction through the target application, the voice instruction can be directly sent to the cloud server, the cloud server analyzes the voice instruction to obtain a voice instruction recognition result, and then semantic processing is performed on the voice instruction recognition result to obtain operation information.

For example, when the first terminal receives a voice instruction through the target application, the voice instruction may be analyzed to obtain a voice instruction recognition result, and then the voice instruction recognition result is subjected to semantic processing to obtain operation information; and then the first terminal sends the operation information to the cloud server.

In step 505, when the second user account and the first user account are in a friend relationship, the cloud server searches for a second terminal associated with the second user account.

In step 506, the cloud server sends an operation instruction carrying the operation content to the second terminal, where the operation instruction is used to instruct the second terminal to execute the operation content through a target application, and the target application is an application program for voice control.

Illustratively, the sending the operation instruction to the second terminal includes: judging whether the first user account has the authority of controlling the second terminal to execute the operation content through the target application; and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

In step 507, the second terminal receives the operation instruction sent by the cloud server, where the operation instruction includes the operation content, and executes the operation content through the target application, where the target application is an application program for voice control.

According to the technical solution provided by the embodiment of the present disclosure, a first user logs in to the target application of a first terminal using a first user account, and a second user logs in to the target application of a second terminal using a second user account. The cloud server performs semantic processing on the voice instruction recognition result sent by the first terminal to obtain the second user account and the operation content, sends the operation content to the second terminal associated with the second user account that is in a friend relationship with the first user account, and instructs the second terminal to execute the operation content through the target application. The user can thus interact with the target application on a friend's second terminal through the target application on the first terminal, which meets the user's need to interact with a friend's voice assistant through the user's own voice assistant and improves user experience.
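As an illustrative sketch only, and not the implementation of the disclosure, the server-side handling of steps 505 and 506 might be expressed as follows in Python. The names CloudServer, OperationInfo, friends, terminals, permissions, and outbox, as well as the in-memory tables, are assumptions introduced for illustration; the disclosure does not specify how friend relationships, terminal associations, or permissions are stored.

```python
from dataclasses import dataclass, field


@dataclass
class OperationInfo:
    """Result of semantic processing: whom to reach and what to do."""
    second_account: str
    operation_content: str


@dataclass
class CloudServer:
    # Hypothetical in-memory stand-ins for server-side state.
    friends: set = field(default_factory=set)      # frozenset({a, b}) pairs
    terminals: dict = field(default_factory=dict)  # account -> terminal id
    permissions: set = field(default_factory=set)  # (controller, target) pairs
    outbox: list = field(default_factory=list)     # (terminal id, instruction)

    def dispatch(self, first_account: str, info: OperationInfo) -> bool:
        """Forward the operation content only to an authorized friend."""
        # Step 505: the two accounts must be in a friend relationship.
        pair = frozenset({first_account, info.second_account})
        if pair not in self.friends:
            return False
        # Step 505: look up the second terminal associated with the account.
        terminal = self.terminals.get(info.second_account)
        if terminal is None:
            return False
        # Optional check described for step 506: may the first account
        # control the second terminal through the target application?
        if (first_account, info.second_account) not in self.permissions:
            return False
        # Step 506: send the operation instruction carrying the content.
        instruction = {"operation_content": info.operation_content}
        self.outbox.append((terminal, instruction))
        return True
```

In this sketch the permission check is performed on the server before sending, matching the optional variant of step 506; an embodiment could equally perform it on the second terminal.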

In an exemplary embodiment, the voice control method related to the present disclosure may include the steps of:

Step 1) A first user logs in to the target application of a first terminal using a first user account, inputs a second user account on an operation interface of the target application, and triggers a friend verification request process; the target application is an application for voice control. The friend verification request process is as follows: the first terminal sends a friend verification request carrying the first user account and the second user account to the cloud server, wherein the friend verification request is used to request establishment of a friend relationship between the second user account and the first user account; the cloud server receives the friend verification request and forwards it to the second terminal associated with the second user account; the second terminal receives the friend verification request forwarded by the cloud server and establishes the friend relationship between the first user account and the second user account according to the friend verification request; and the second terminal sends a friend verification response to the first terminal through the cloud server.

After the first user adds the second user account of the second user as a friend through the first user account, the first user can, through the first user's voice assistant, leave a message or a voice mailbox message with the second user's voice assistant, or add message information to the second user's calendar or to-do list.

Step 2) The first user issues a voice instruction to the target application of the first terminal; the first terminal analyzes the voice instruction to obtain a voice instruction recognition result. For example, the voice instruction recognition result may be "Tell Lao Li, meeting in the VIP conference room at ten tonight", or "Xiao Ai Tongxue, tell mom 'Happy birthday!'".

Step 3) The first terminal sends the voice instruction recognition result to the cloud server.

Step 4) The cloud server performs semantic processing on the voice instruction recognition result to obtain a second user account and operation content.

For example, after performing semantic processing on the voice instruction recognition result "Tell Jungle, meeting in the VIP conference room at ten tonight", the cloud server determines that the second user account is "Jungle" and that the operation content is: write the text "Meeting in the VIP conference room at ten tonight" into the calendar or to-do list of the second terminal.

For another example, after performing semantic processing on the voice instruction recognition result "Xiao Ai Tongxue, tell mom 'Happy birthday!'", the cloud server determines that the second user account is "mom" and that the operation content is: when no time is specified, send the text "Happy birthday!" to the terminal associated with the second user account, convert the text "Happy birthday!" to speech, and play it.

Step 5) When the second user account is in a friend relationship with the first user account, the cloud server sends an operation instruction carrying the operation content to the second terminal associated with the second user account.

Step 6) the second terminal judges whether the first user account has the authority for controlling the second terminal to execute the operation content through the target application; when the first user account is judged to have the authority of controlling the second terminal to execute the operation content through the target application, the second terminal executes the operation content through the target application; and when the first user account does not have the right of controlling the second terminal to execute the operation content through the target application, ending the process.

For example, the process by which the second terminal executes the operation content through the target application includes: analyzing the operation content and identifying its type; when the type of the operation content is message content or voice mailbox content, converting the operation content into speech through Text To Speech (TTS) technology and playing the speech; and when the type of the operation content is to-do content to be added to the calendar or to-do reminder content, calling another application in the second terminal, such as the calendar, through the target application to execute the operation content.
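The permission check of step 6) and the type-based handling described above can be sketched as follows. This is a minimal illustration under stated assumptions: the type labels, the function name execute_operation, the allowed_accounts set, and the speak and add_to_calendar callbacks are hypothetical and stand in for the TTS engine and the calendar application, which the disclosure does not name.

```python
from typing import Callable, Set

# Hypothetical type labels; the disclosure lists the categories of
# operation content but does not define identifiers for them.
SPOKEN_TYPES = {"message", "voicemail"}
CALENDAR_TYPES = {"calendar_todo", "todo_reminder"}


def execute_operation(
    first_account: str,
    content_type: str,
    text: str,
    allowed_accounts: Set[str],
    speak: Callable[[str], None],
    add_to_calendar: Callable[[str], None],
) -> str:
    """Return the path taken: 'denied', 'spoken', or 'calendar'."""
    # Step 6: end the process if the first account lacks permission.
    if first_account not in allowed_accounts:
        return "denied"
    # Message or voice mailbox content: convert to speech and play.
    if content_type in SPOKEN_TYPES:
        speak(text)
        return "spoken"
    # To-do or reminder content: hand over to another application.
    if content_type in CALENDAR_TYPES:
        add_to_calendar(text)
        return "calendar"
    raise ValueError(f"unknown operation content type: {content_type!r}")
```

Passing the TTS and calendar actions in as callbacks keeps the sketch independent of any particular speech or calendar library.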

In the embodiment of the present disclosure, a user logs in to an application program for voice control, for example a voice assistant, using a user account and can add other user accounts as friends. By issuing a voice instruction to the voice assistant on the user's own terminal, the user can control the voice assistant on a friend's terminal to perform operations such as leaving a message, leaving a voice mailbox message, leaving a timed message, or adding a calendar entry or to-do item. The voice assistant therefore no longer controls only the user's own single device: once friends are added, the voice assistants on different terminals are linked, enabling multi-device voice interaction. This makes the voice assistant more powerful, meets more "personal assistant" needs, and brings a better user experience to the user.
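To make the semantic processing of step 4) more concrete, the following toy sketch extracts a second user account and operation content from a recognition result. It is only an assumption-laden illustration: the disclosure does not specify the NLP technique, and a real system would not rely on a single regular expression. The pattern, the contacts mapping, the sample sentences, and the names parse_recognition_result and OperationInfo are all hypothetical.

```python
import re
from typing import NamedTuple, Optional


class OperationInfo(NamedTuple):
    second_account: str
    operation_content: str


# Toy pattern: "tell <name>, <message>" or "tell <name> that <message>".
_TELL = re.compile(
    r"^\s*tell\s+(?P<name>[^,]+?)(?:,\s*|\s+that\s+)(?P<message>.+?)\s*$",
    re.IGNORECASE,
)


def parse_recognition_result(text: str, contacts: dict) -> Optional[OperationInfo]:
    """Map a recognized sentence to (second account, operation content).

    `contacts` maps a lower-cased spoken name to an account identifier.
    Returns None when the sentence or the name is not understood.
    """
    match = _TELL.match(text)
    if match is None:
        return None
    account = contacts.get(match.group("name").strip().lower())
    if account is None:
        return None
    return OperationInfo(account, match.group("message"))
```

A sentence that does not match the pattern, or that names an unknown contact, yields None, so the caller can fall back to ordinary single-device handling.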

The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods.

FIG. 6 is a block diagram illustrating a voice-controlled apparatus in accordance with an exemplary embodiment; the apparatus may be implemented in various ways, such as implementing all components of the apparatus in a cloud server, or implementing components of the apparatus in a coupled manner on the cloud server side; the device can implement the method related to the present disclosure through software, hardware or a combination of the two, as shown in fig. 6, the voice control device includes: a first receiving module 601, a processing module 602, a searching module 603 and a first sending module 604, wherein:

the first receiving module 601 is configured to receive a voice instruction recognition result sent by a first terminal, where the first terminal is associated with a first user account;

the processing module 602 is configured to perform semantic processing on the voice instruction recognition result to obtain operation information, where the operation information includes a second user account and operation content;

the search module 603 is configured to search for a second terminal associated with the second user account when the second user account is in a friend relationship with the first user account;

the first sending module 604 is configured to send an operation instruction carrying the operation content to the second terminal, where the operation instruction is used to instruct the second terminal to execute the operation content through a target application, and the target application is an application program for voice control.

The device provided by the embodiment of the present disclosure can be used for executing the technical scheme of the embodiment shown in fig. 2, and the execution mode and the beneficial effect are similar, which are not described herein again.

In a possible implementation manner, the first sending module 604 determines whether the first user account has a right to control the second terminal to execute the operation content through the target application; and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

In one possible implementation, as shown in fig. 7, the voice control apparatus shown in fig. 6 may further include:

the second receiving module 701 is configured to receive a friend verification request which is sent by a first terminal and carries a first user account and a second user account;

the forwarding module 702 is configured to forward a friend verification request to a second terminal associated with the second user account, where the friend verification request is used to request to establish a friend relationship between the second user account and the first user account.
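The friend verification flow handled by the second receiving module 701 and the forwarding module 702 can be sketched as below. This is an illustrative sketch under assumptions: the class names FriendRequest, SecondTerminal, and FriendServer, the auto_accept flag standing in for the second user's decision, and the choice to record the relationship on the server when the response is positive are not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class FriendRequest:
    first_account: str
    second_account: str


@dataclass
class SecondTerminal:
    account: str
    auto_accept: bool = True  # stand-in for the second user's decision

    def on_friend_request(self, request: FriendRequest) -> bool:
        """Return the friend verification response (True means accepted)."""
        return request.second_account == self.account and self.auto_accept


@dataclass
class FriendServer:
    terminals: dict = field(default_factory=dict)  # account -> SecondTerminal
    friends: set = field(default_factory=set)      # frozenset({a, b}) pairs

    def handle_friend_request(self, request: FriendRequest) -> bool:
        """Receive the request (module 701) and forward it (module 702)."""
        terminal = self.terminals.get(request.second_account)
        if terminal is None:
            return False
        accepted = terminal.on_friend_request(request)
        if accepted:
            # Assumption: the server records the relationship so that it
            # can later check it before forwarding operation instructions.
            pair = frozenset({request.first_account, request.second_account})
            self.friends.add(pair)
        # The returned value models the response relayed to the first terminal.
        return accepted
```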

FIG. 8 is a block diagram illustrating a voice-controlled apparatus in accordance with an exemplary embodiment; the apparatus may be implemented in various ways, for example, with all of the components of the apparatus being implemented in a terminal, or with components of the apparatus being implemented in a coupled manner on the terminal side; the device can implement the method related to the present disclosure through software, hardware or a combination of the two, as shown in fig. 8, the voice control device includes: a third receiving module 801, an analyzing module 802, and a second sending module 803, wherein:

the third receiving module 801 is configured to receive a voice instruction through a target application, where the target application is an application program for voice control, and the first terminal is associated with the first user account;

the analysis module 802 is configured to analyze the voice command to obtain a voice command recognition result;

the second sending module 803 is configured to send the voice command recognition result to the cloud server.
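The three modules above form a simple receive, analyze, send pipeline on the first terminal. A minimal sketch, assuming hypothetical names (FirstTerminal, recognize, send_to_cloud) and a dictionary payload whose keys are not defined by the disclosure, might look like this:

```python
from typing import Callable, Dict


class FirstTerminal:
    """Mirrors modules 801 to 803: receive, analyze, send."""

    def __init__(
        self,
        account: str,
        recognize: Callable[[bytes], str],
        send_to_cloud: Callable[[Dict[str, str]], None],
    ) -> None:
        self.account = account              # first user account
        self.recognize = recognize          # stand-in for the speech recognizer
        self.send_to_cloud = send_to_cloud  # stand-in for the network client

    def on_voice_instruction(self, audio: bytes) -> str:
        # Module 802: analyze the voice instruction.
        result = self.recognize(audio)
        # Module 803: send the recognition result to the cloud server.
        self.send_to_cloud(
            {"first_account": self.account, "recognition_result": result}
        )
        return result
```

Injecting the recognizer and the sender as callables keeps the sketch free of any specific speech recognition or networking library.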

The device provided by the embodiment of the present disclosure can be used for executing the technical scheme of the embodiment shown in fig. 3, and the execution mode and the beneficial effect are similar, which are not described herein again.

FIG. 9 is a block diagram illustrating a voice-controlled apparatus in accordance with an exemplary embodiment; the apparatus may be implemented in various ways, for example, with all of the components of the apparatus being implemented in a terminal, or with components of the apparatus being implemented in a coupled manner on the terminal side; the device can implement the method related to the present disclosure through software, hardware or a combination of the two, as shown in fig. 9, the voice control device includes: a fourth receiving module 901 and an executing module 902, wherein:

the fourth receiving module 901 is configured to receive an operation instruction sent by the cloud server, where the operation instruction includes operation content;

the execution module 902 is configured to execute the operation content by a target application, which is an application program for voice control.

The device provided by the embodiment of the present disclosure can be used for executing the technical scheme of the embodiment shown in fig. 4, and the execution mode and the beneficial effect are similar, which are not described herein again.

Fig. 10 is a block diagram of a voice control apparatus 1000 according to an exemplary embodiment, where the voice control apparatus 1000 may be implemented in various ways, for example, all components of the apparatus are implemented in a cloud server, or the components of the apparatus are implemented in a coupled manner on the side of the cloud server; the voice control apparatus 1000 includes:

a processor 1001;

a memory 1002 for storing processor-executable instructions;

wherein the processor 1001 is configured to:

In one embodiment, the processor 1001 may be further configured to:

receiving a friend verification request which is sent by a first terminal and carries a first user account and a second user account;

and forwarding a friend verification request to a second terminal associated with the second user account, wherein the friend verification request is used for requesting to establish a friend relationship between the second user account and the first user account.

Fig. 11 is a block diagram illustrating a voice control apparatus 1100 according to an exemplary embodiment, where the voice control apparatus 1100 may be implemented in various manners, such as implementing all components of the apparatus in a terminal or implementing components of the apparatus in a coupled manner on the terminal side; the voice control apparatus 1100 includes:

a processor 1101;

a memory 1102 for storing processor-executable instructions;

wherein the processor 1101 is configured to:

receiving a voice instruction through a target application, wherein the target application is an application program for voice control, and a first terminal is associated with a first user account;

analyzing the voice command to obtain a voice command recognition result;

and sending a voice instruction recognition result to the cloud server.

In one embodiment, the processor 1101 may be further configured to:

acquiring a second user account;

Fig. 12 is a block diagram illustrating a voice control apparatus 1200 according to an exemplary embodiment, where the voice control apparatus 1200 may be implemented in various manners, such as implementing all components of the apparatus in a terminal or implementing the components of the apparatus in a coupled manner on the terminal side; the voice control device 1200 includes:

a processor 1201;

a memory 1202 for storing processor-executable instructions;

wherein the processor 1201 is configured to:

and executing the operation content through the target application, wherein the target application is an application program for voice control.

In one embodiment, the processor 1201 may be further configured to:

receiving a friend verification request which is sent by a cloud server and carries a first user account and a second user account, wherein the second user account is associated with a second terminal;

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

FIG. 13 is a block diagram illustrating a voice-controlled apparatus according to an exemplary embodiment. For example, the voice control apparatus 1300 may be an electronic device such as a smart phone, a smart speaker, a smart television, a tablet computer, a notebook computer, or a wearable device (e.g., a bracelet, smart glasses, etc.) that can run an application program for implementing voice control. Referring to fig. 13, the voice control device 1300 may include one or more of the following components: a processing component 1302, a memory 1304, a power component 1306, a multimedia component 1308, an audio component 1310, an input/output (I/O) interface 1312, a sensor component 1314, and a communications component 1316.

The processing component 1302 generally controls overall operation of the voice control device 1300, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1302 may include one or more processors 1320 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 1302 can include one or more modules that facilitate interaction between the processing component 1302 and other components. For example, the processing component 1302 may include a multimedia module to facilitate interaction between the multimedia component 1308 and the processing component 1302.

The memory 1304 is configured to store various types of data to support operations at the voice control apparatus 1300. Examples of such data include instructions for any application or method operating on the voice control device 1300, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1304 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.

The power supply component 1306 provides power to the various components of the voice control device 1300. The power components 1306 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the voice control device 1300.

The multimedia component 1308 includes a screen that provides an output interface between the voice control apparatus 1300 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1308 includes a front-facing camera and/or a rear-facing camera. When the voice control apparatus 1300 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.

The audio component 1310 is configured to output and/or input audio signals. For example, the audio component 1310 includes a Microphone (MIC) configured to receive an external audio signal when the voice control apparatus 1300 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 1304 or transmitted via the communication component 1316. In some embodiments, the audio component 1310 also includes a speaker for outputting audio signals.

The I/O interface 1312 provides an interface between the processing component 1302 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor assembly 1314 includes one or more sensors for providing status assessments of various aspects of the voice control device 1300. For example, the sensor assembly 1314 can detect the open/closed state of the voice control device 1300 and the relative positioning of components, such as the display and keypad of the voice control device 1300. The sensor assembly 1314 can also detect a change in position of the voice control device 1300 or of a component of the voice control device 1300, the presence or absence of user contact with the voice control device 1300, the orientation or acceleration/deceleration of the voice control device 1300, and a change in temperature of the voice control device 1300. The sensor assembly 1314 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 1314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 1316 is configured to facilitate wired or wireless communication between the voice control apparatus 1300 and other devices. The voice control device 1300 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1316 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1316 also includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the voice control apparatus 1300 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components for performing the above-described methods.

In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as the memory 1304 comprising instructions that are executable by the processor 1320 of the voice control apparatus 1300 to perform the method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

FIG. 14 is a block diagram illustrating a voice-controlled apparatus according to an exemplary embodiment. For example, the voice control apparatus 1400 may be provided as a server. The voice control apparatus 1400 includes a processing component 1402 that further includes one or more processors, and memory resources, represented by memory 1403, for storing instructions, such as application programs, that are executable by the processing component 1402. The application programs stored in memory 1403 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1402 is configured to execute instructions to perform the above-described methods.

The voice control device 1400 may also include a power component 1406 configured to perform power management of the voice control device 1400, a wired or wireless network interface 1405 configured to connect the voice control device 1400 to a network, and an input/output (I/O) interface 1408. The voice control device 1400 may operate based on an operating system stored in the memory 1403, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

A non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of a voice control apparatus 1300 or 1400, enable the voice control apparatus 1300 or 1400 to perform a voice control method comprising:

analyzing the voice command to obtain a voice command recognition result;

and sending a voice instruction recognition result to the cloud server.

In one embodiment, the method further comprises:

acquiring a second user account;

In an embodiment of the present disclosure, there is provided a computer readable storage medium having stored thereon computer instructions that, when executed by a processor, implement a method of:

In one embodiment, sending the operation instruction to the second terminal includes:

In one embodiment, the method further comprises:

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A voice control method is applied to a cloud server and is characterized by comprising the following steps:

sending an operation instruction carrying the operation content to the second terminal, wherein the operation instruction is used for instructing the second terminal to execute the operation content through a target application, and the target application is an application program for voice control;

the sending of the operation instruction to the second terminal includes: judging whether the first user account has the authority of controlling the second terminal to execute the operation content through the target application; and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

2. The method of claim 1, further comprising:

3. The method according to claim 1, wherein the operation information further comprises at least any one or a combination of the following information: the transmission time of the operation content, or the execution time of the operation content.

4. The method according to claim 1, wherein the type of the operation content comprises at least any one or a combination of the following: message content, voice mailbox content, to-do content to be added to a calendar, or to-do reminder content.

5. A voice control method is applied to a first terminal, and is characterized by comprising the following steps:

analyzing the voice command to obtain a voice command recognition result; then, semantic processing is carried out on the voice instruction recognition result to obtain operation information;

sending the operation information to a cloud server; the operation information comprises a second user account and operation content, so that the cloud server searches a second terminal associated with the second user account when the second user account is in a friend relationship with the first user account; sending an operation instruction carrying the operation content to the second terminal, wherein the operation instruction is used for instructing the second terminal to execute the operation content through a target application, and the target application is an application program for voice control; the sending of the operation instruction to the second terminal includes: judging whether the first user account has the authority of controlling the second terminal to execute the operation content through the target application; and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

6. The method of claim 5, further comprising:

acquiring a second user account;

7. A voice control apparatus, comprising:

a first sending module, configured to send an operation instruction carrying the operation content to the second terminal, where the operation instruction is used to instruct the second terminal to execute the operation content through a target application, and the target application is an application program for voice control;

the first sending module judges whether the first user account has the authority of controlling the second terminal to execute the operation content through the target application; and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

8. The apparatus of claim 7, further comprising:

9. A voice control apparatus, comprising:

the analysis module is used for analyzing the voice command to obtain a voice command recognition result; then, semantic processing is carried out on the voice instruction recognition result to obtain operation information;

the second sending module is used for sending the operation information to the cloud server; the operation information comprises a second user account and operation content, so that the cloud server searches a second terminal associated with the second user account when the second user account is in a friend relationship with the first user account; sending an operation instruction carrying the operation content to the second terminal, wherein the operation instruction is used for instructing the second terminal to execute the operation content through a target application, and the target application is an application program for voice control; the sending of the operation instruction to the second terminal includes: judging whether the first user account has the authority of controlling the second terminal to execute the operation content through the target application; and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

10. A voice control apparatus, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

the processor is further configured to: judging whether the first user account has the authority of controlling the second terminal to execute the operation content through the target application; and when the first user account is judged to have the right of controlling the second terminal to execute the operation content through the target application, sending an operation instruction to the second terminal.

11. A voice control apparatus, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to:

12. A computer-readable storage medium having stored thereon computer instructions, which, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 4.

13. A computer-readable storage medium having stored thereon computer instructions, which when executed by a processor, perform the steps of the method of claim 5 or 6.