CN113359980A - Control method and device of multimedia equipment, electronic equipment and storage medium


Info

Publication number
CN113359980A
Authority
CN
China
Prior art keywords
target
target object
multimedia
voice
emotion
Prior art date
Legal status
Withdrawn
Application number
CN202110603027.XA
Other languages
Chinese (zh)
Inventor
石家齐
Current Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Sensetime Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Intelligent Technology Co Ltd
Priority to CN202110603027.XA priority Critical patent/CN113359980A/en
Publication of CN113359980A publication Critical patent/CN113359980A/en
Withdrawn legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue


Abstract

The present disclosure relates to a control method and apparatus for a multimedia device, an electronic device, and a storage medium. The method includes: collecting voice information of a target object; performing voice recognition on the voice information to obtain a voice recognition result; and controlling at least one multimedia device according to a target voice emotion in the voice recognition result. Embodiments of the disclosure can enrich the ways in which multimedia devices are controlled.

Description

Control method and device of multimedia equipment, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for controlling a multimedia device, an electronic device, and a storage medium.
Background
With the development of science and technology and the popularization of multimedia devices, exhibitions and shows can be presented in combination with multimedia devices. For example, in scenes such as product launches, enterprise exhibition halls, city planning halls, and theme exhibitions, presentations given together with multimedia devices can provide a better viewing experience for visitors.
Disclosure of Invention
The present disclosure proposes a technical solution for controlling a multimedia device.
According to an aspect of the present disclosure, there is provided a control method of a multimedia device, applied to an electronic device for controlling a plurality of multimedia devices, the method including:
collecting voice information of a target object;
performing voice recognition on the voice information to obtain a voice recognition result;
and controlling at least one multimedia device according to the target voice emotion in the voice recognition result.
According to this control method, voice emotion can be fused with the electronic device, so that at least one of the plurality of multimedia devices can be controlled according to voice emotion. This enriches the ways in which the electronic device controls the plurality of multimedia devices, improves the interactivity and interest between the user and the multimedia devices, and improves the user experience.
In a possible implementation manner, the controlling at least one multimedia device according to a target speech emotion in the speech recognition result includes:
determining an operation instruction corresponding to the target voice emotion and target multimedia equipment corresponding to the operation instruction;
and sending the operation instruction to the target multimedia equipment so that the target multimedia equipment executes the operation corresponding to the operation instruction.
In a possible implementation manner, the determining the operation instruction corresponding to the target speech emotion and the target multimedia device corresponding to the operation instruction includes:
determining, according to the identity recognition result, whether the target object has the control authority;
and, in response to the target object having the control authority, determining an operation instruction corresponding to the target voice emotion and a target multimedia device corresponding to the operation instruction.
According to this control method, the control authority of the target object can be determined from the identity recognition result obtained by voice recognition, and the operation instruction corresponding to the target voice emotion is sent to the target multimedia device only when the target object has the control authority. This avoids problems such as accidental triggering, instruction interference, and repeated instructions caused by other users during control of the multimedia device, and makes the control of the multimedia device more professional and intelligent.
In a possible implementation manner, the determining whether the target object has the manipulation authority according to the identification result includes:
determining a current manipulation mode;
and determining, according to the current manipulation mode and the identity recognition result, whether the target object has the control authority.
In a possible implementation manner, the determining whether the target object has the manipulation authority according to the current manipulation mode and the identification result includes:
in response to the current manipulation mode being a management and control mode and the identity recognition result indicating that the target object is identified, determining, according to the identity recognition result, whether the target object has the manipulation authority; or,
in response to the current manipulation mode being a management and control mode and the identity recognition result indicating that the target object is not identified, determining that the target object does not have the manipulation authority.
The control method provided by the embodiments of the present disclosure can enrich the control modes of the multimedia device, improve interactivity between users, increase interest, and improve the user experience.
In a possible implementation manner, the determining whether the target object has the manipulation authority according to the current manipulation mode and the identification result includes:
and determining that the target object has the control authority in response to the current control mode being an interactive mode.
In a possible implementation manner, the determining the operation instruction corresponding to the target voice emotion includes:
acquiring an operation list of the target object according to the identity recognition result, wherein the operation list includes at least one correspondence, each correspondence being between a target voice emotion and an operation instruction;
and determining an operation instruction corresponding to the target voice emotion according to the operation list.
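A minimal Python sketch of this lookup may help: a per-user operation list maps target voice emotions to operation instructions, keyed by the identity recognition result. All names here (users, emotions, instructions) are illustrative assumptions, not taken from the disclosure.

```python
# Hypothetical per-user operation lists: identity -> {emotion -> instruction}.
OPERATION_LISTS = {
    "user_a": {"excitement": "play_intro_video", "tension": "dim_lights"},
    "user_b": {"excitement": "show_fireworks"},
}

def lookup_instruction(identity, target_emotion):
    """Acquire the user's operation list, then resolve the instruction
    corresponding to the target voice emotion (None if no correspondence)."""
    operation_list = OPERATION_LISTS.get(identity, {})
    return operation_list.get(target_emotion)
```

With this sketch, an unknown user or an emotion with no stored correspondence simply resolves to `None`, so no device is controlled.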
According to the control method provided by the embodiments of the present disclosure, the operation list of the target object can be obtained according to the identity recognition result, and the operation instruction corresponding to the target voice emotion can be determined from that list. In this way, the user can customize which voice emotions control the multimedia device, enriching the control modes for the multimedia device and improving the user experience.
In one possible implementation, the method further includes:
for the target object, in response to a setting operation on the operation list, setting at least one correspondence in the operation list to obtain the operation list of the target object.
According to the control method provided by the embodiments of the present disclosure, the user's operation list can be customized; for example, the correspondence between target voice emotions and operation instructions in the operation list can be set according to the user's personal habits and preferences. This meets the user's need for personalized customization and improves the user experience.
In a possible implementation manner, the setting, in response to the setting operation for the operation list, at least one corresponding relationship in the operation list includes:
determining a target voice emotion from voice emotion options in response to a selection operation for the target voice emotion;
in response to a selection operation for an operation instruction, determining the operation instruction that corresponds to the target voice emotion;
and storing the correspondence between the target voice emotion and the operation instruction in the operation list.
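The three setting steps above can be sketched as follows; the emotion option set and the function and instruction names are assumptions made only for illustration.

```python
# Hypothetical set of selectable voice emotion options.
VOICE_EMOTION_OPTIONS = {"excitement", "joy", "anger", "tension", "surprise"}

def set_correspondence(operation_list, selected_emotion, selected_instruction):
    """Validate the emotion selected from the options, then store the
    (target voice emotion -> operation instruction) correspondence."""
    if selected_emotion not in VOICE_EMOTION_OPTIONS:
        raise ValueError(f"not a selectable voice emotion: {selected_emotion}")
    operation_list[selected_emotion] = selected_instruction
    return operation_list
```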
In a possible implementation manner, the setting, in response to the setting operation for the operation list, at least one corresponding relationship in the operation list includes:
determining a target correspondence in response to a selection operation on any correspondence in the operation list;
and, in response to a deletion operation on the target correspondence, deleting the target correspondence from the operation list to obtain the operation list of the target object.
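Deletion of a selected correspondence can be sketched just as simply; this is a hedged illustration, since the disclosure prescribes no concrete data structure.

```python
def delete_correspondence(operation_list, target_emotion):
    """Delete the selected (emotion -> instruction) correspondence from the
    operation list, if present, and return the updated list."""
    operation_list.pop(target_emotion, None)
    return operation_list
```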
According to the control method provided by the embodiments of the present disclosure, the user's operation list can be customized; for example, correspondences between unneeded target voice emotions and operation instructions can be deleted from the operation list. This avoids misoperation during actual use, allows the multimedia device to be controlled more accurately, and improves the user experience.
In a possible implementation manner, the determining a target multimedia device corresponding to the operation instruction includes:
determining, according to the identity recognition result of the target object, the multimedia devices that the target object can control;
and determining those multimedia devices as the target multimedia devices corresponding to the operation instruction.
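A sketch of this device resolution, under the assumption of a preset identity-to-devices table (all user and device names are hypothetical):

```python
# Hypothetical table of which devices each recognized object may control.
CONTROLLABLE_DEVICES = {
    "user_a": ["display_1"],
    "user_b": ["display_2", "speaker_1"],
}

def target_devices(identity):
    """Devices the recognized target object can control; these become the
    target multimedia devices for the operation instruction."""
    return CONTROLLABLE_DEVICES.get(identity, [])
```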
According to the control method provided by the embodiments of the present disclosure, multiple users can control different multimedia devices through the electronic device using the same or different voice emotions to realize the same or different operations. This enriches the control modes of the multimedia devices and allows the method to adapt to a variety of application scenarios.
According to an aspect of the present disclosure, there is provided a control apparatus for a multimedia device, applied to an electronic device for controlling a plurality of multimedia devices, the apparatus including:
the acquisition module is used for acquiring voice information of a target object;
the recognition module is used for performing voice recognition on the voice information to obtain a voice recognition result;
and the control module is used for controlling at least one multimedia device according to the target voice emotion in the voice recognition result.
According to an aspect of the present disclosure, there is provided an electronic device including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to invoke the memory-stored instructions to perform the above-described method.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
In the embodiments of the present disclosure, the electronic device for controlling a plurality of multimedia devices may collect the voice information of the target object and perform voice recognition on it to obtain a voice recognition result. The electronic device can then correspondingly control at least one multimedia device according to the target voice emotion in the voice recognition result. According to the control method and apparatus, the electronic device, and the storage medium provided by the present disclosure, voice emotion can be fused with the electronic device, so that at least one of the plurality of multimedia devices can be controlled according to voice emotion. This enriches the ways in which the electronic device controls the multimedia devices, improves the interactivity and interest between the user and the devices, and improves the user experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure. Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 illustrates a schematic diagram of a control method of a multimedia device according to an embodiment of the present disclosure;
fig. 2 illustrates a flowchart of a control method of a multimedia device according to an embodiment of the present disclosure;
fig. 3 illustrates a schematic diagram of a control method of a multimedia device according to an embodiment of the present disclosure;
fig. 4a to 4c are schematic views illustrating a control method of a multimedia device according to an embodiment of the present disclosure;
fig. 5 shows a block diagram of a control apparatus of a multimedia device according to an embodiment of the present disclosure;
FIG. 6 illustrates a block diagram of an electronic device 800 in accordance with an embodiment of the disclosure;
fig. 7 illustrates a block diagram of an electronic device 1900 in accordance with an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set consisting of A, B, and C.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
In a multimedia exhibition hall, a plurality of multimedia devices may be arranged and controlled by a connected central control system. Referring to fig. 1, for example, the system can include a plurality of multimedia devices, such as at least one display device and at least one speaker device, together with a central control device in which a central control system for controlling the display and speaker devices is deployed. In the multimedia exhibition hall, a user can control the multimedia devices through the central control device so that they cooperate with the user in corresponding demonstration and interaction. It should be noted that the multimedia exhibition hall structure shown in fig. 1 is only an example of the disclosed embodiments and is not to be construed as a limitation on them.
The embodiment of the disclosure provides a control method of a multimedia device, which can be applied to an electronic device provided with a central control system, and the electronic device can control a plurality of multimedia devices. The electronic equipment can collect the voice information of the target object and perform voice recognition on the voice information collected at the current moment to obtain a voice recognition result. And under the condition that the voice recognition result comprises the target voice emotion, correspondingly controlling at least one multimedia device in the plurality of multimedia devices according to the target voice emotion. The control method of the multimedia device provided by the embodiment of the disclosure can integrate voice emotion with the electronic device to realize control of at least one multimedia device of the plurality of multimedia devices according to the voice emotion, can enrich the control mode of the electronic device on the plurality of multimedia devices, improve the interactivity and interestingness of a user and the multimedia device, and improve user experience.
Fig. 2 shows a flowchart of a control method of a multimedia device according to an embodiment of the present disclosure, where the control method of the multimedia device may be performed by an electronic device such as a terminal device or a server, and the terminal device may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like, and the method may be implemented by a processor calling computer-readable instructions stored in a memory. Alternatively, the method may be performed by a server.
As shown in fig. 2, the control method of the multimedia device may include:
in step S21, voice information of the target object is acquired.
For example, in a multimedia exhibition hall scene, when the electronic device recognizes that a human body is present in a designated area, it may take that human body as the target object and collect the target object's voice information through a voice collection device. The voice collection device may be integrated in the electronic device or externally connected to it.
In step S22, speech recognition is performed on the speech information to obtain a speech recognition result.
In the embodiment of the disclosure, the electronic device can perform voice recognition on the voice information collected at the current moment to obtain a voice recognition result. For example, the speech recognition result may include a speech emotion recognition result and/or an identity recognition result, wherein the speech emotion recognition result may include a recognized speech emotion (for example, may include, but is not limited to, speech emotions such as excitement, joy, anger, tension, surprise, hurry, etc.), and the identity recognition result may include identity information of a recognized target object.
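One way to picture such a result is a small container holding an optional voice emotion and an optional identity; this structure is an illustrative assumption, as the disclosure does not specify a concrete data structure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpeechRecognitionResult:
    """Voice recognition result: either field may be absent."""
    emotion: Optional[str] = None   # e.g. "excitement", "tension"
    identity: Optional[str] = None  # identity of the target object, if recognized
```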
For example, the pre-trained speech recognition network may perform speech recognition on the speech information at the current moment to obtain a corresponding speech recognition result. For example, taking the example that the speech recognition result includes a speech emotion recognition result, the speech recognition network may include a speech emotion recognition network, and the speech emotion recognition network performs speech emotion recognition on the speech information acquired at the current time, so as to obtain a corresponding speech emotion recognition result. Or, taking the example that the voice recognition result includes an identity recognition result, the voice recognition network may further include an identity recognition network, and the identity recognition network performs identity recognition on the collected voice information to obtain a corresponding identity recognition result.
Taking the speech emotion recognition network as an example, the training process is briefly described as follows. A training set may be created in advance, the training set including a plurality of sample groups, where each sample group may include sample voice information and label information for that sample voice information, and the label information may include a labeled voice emotion category. Voice emotion recognition is performed on the sample voice information through the voice emotion recognition network to obtain a voice emotion recognition result; the recognition loss of the network is determined according to that recognition result and the label information of the sample voice information; and the network parameters are adjusted according to the recognition loss until the recognition loss meets the training requirement (for example, the recognition loss is smaller than a loss threshold), at which point training is complete and the trained voice emotion recognition network is obtained.
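The loop described above can be sketched with a toy one-parameter model standing in for the voice emotion recognition network; the feature/label encoding, squared-error loss, and learning rate are all assumptions made only so the sketch runs, not details from the disclosure.

```python
def train(samples, lr=0.1, loss_threshold=1e-4, max_steps=10_000):
    """samples: list of (feature, labeled score) sample groups.
    Repeats: recognize, compute the recognition loss, adjust the parameter,
    until the loss meets the training requirement (below the threshold)."""
    w = 0.0  # the toy network's single parameter
    loss = float("inf")
    for _ in range(max_steps):
        # Recognition loss: mean squared error against the label information.
        loss = sum((w * x - y) ** 2 for x, y in samples) / len(samples)
        if loss < loss_threshold:  # training requirement met
            break
        # Adjust the network parameter along the loss gradient.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
    return w, loss
```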
Similarly, the training process of the identity recognition network may refer to the training process of the speech emotion recognition network, which is not described in detail in the embodiment of the present disclosure.
In step S23, at least one multimedia device is controlled according to the target speech emotion in the speech recognition result.
For example, the target speech emotion may be a preset speech emotion for triggering a corresponding operation instruction. Whether a target voice emotion is included in the voice recognition result can be determined, and in the case that the target voice emotion is included in the voice recognition result, at least one multimedia device of the plurality of multimedia devices controlled by the electronic device can be correspondingly controlled according to the target voice emotion.
For example, the target speech emotion may correspond to an operation instruction, wherein the operation instruction may include an instruction for instructing the multimedia device to perform a corresponding operation. In the case that the speech recognition result includes the target speech emotion, the electronic device may send an operation instruction corresponding to the target speech emotion to the at least one multimedia device. The multimedia device receiving the operation instruction may perform a corresponding operation in response to the operation instruction.
Taking the case where the target voice emotion included in the voice recognition result is "tension" as an example, and assuming that the operation instruction corresponding to "tension" is an instruction to display a tension special effect, the electronic device may send this instruction to at least one multimedia device when "tension" is included in the voice recognition result, and each multimedia device receiving the instruction may display the corresponding tension special effect.
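The "tension" example can be condensed into a small dispatch step; the instruction and device names below are invented for illustration only.

```python
# Hypothetical preset correspondence for the example above.
EMOTION_TO_INSTRUCTION = {"tension": "display_tension_effect"}

def dispatch(recognized_emotion, devices, send):
    """If the recognized emotion is a target voice emotion, send its
    operation instruction to each multimedia device; otherwise do nothing."""
    instruction = EMOTION_TO_INSTRUCTION.get(recognized_emotion)
    if instruction is None:
        return None
    for device in devices:
        send(device, instruction)
    return instruction
```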
It should be noted that, in the embodiment of the present disclosure, not only the multimedia device in the exhibition hall can be controlled through the speech emotion, but also the light atmosphere, the music effect, the screen color temperature, and the like in the exhibition hall can be controlled, and the specific process may refer to the foregoing embodiment, which is not described herein again.
Therefore, the electronic device for controlling a plurality of multimedia devices can collect the voice information of the target object and perform voice recognition on it to obtain a voice recognition result, and can then correspondingly control at least one multimedia device according to the target voice emotion in that result. In this way, voice emotion is fused with the electronic device, so that at least one of the plurality of multimedia devices can be controlled according to voice emotion. This enriches the ways in which the electronic device controls the multimedia devices, improves the interactivity and interest between the user and the devices, and improves the user experience.
In a possible implementation manner, the controlling at least one multimedia device according to a target speech emotion in the speech recognition result may include:
determining an operation instruction corresponding to the target voice emotion and target multimedia equipment corresponding to the operation instruction;
and sending the operation instruction to the target multimedia equipment so that the target multimedia equipment executes the operation corresponding to the operation instruction.
For example, the correspondence between each target voice emotion and its operation instruction may be preset. After it is determined that the voice recognition result includes a target voice emotion, the operation instruction corresponding to that emotion can be determined from the preset correspondence and sent to the target multimedia device, so that the target multimedia device executes the corresponding operation upon receiving the instruction.
The target multimedia device corresponding to the operation instruction may include at least one multimedia device of the plurality of multimedia devices. For example, at least one target multimedia device corresponding to the operation instruction may be preset, or at least one target multimedia device corresponding to the target object may also be preset, and then the at least one target multimedia device corresponding to the operation instruction is determined according to the target object.
In a possible implementation manner, where the voice recognition result includes an identity recognition result, the determining the operation instruction corresponding to the target voice emotion and the target multimedia device corresponding to the operation instruction may include:
determining, according to the identity recognition result, whether the target object has the control authority;
and, in the case that the target object has the control authority, determining the operation instruction corresponding to the target voice emotion and the target multimedia device corresponding to the operation instruction.
For example, users with control authority may be preset. When the voice recognition result includes an identity recognition result for the target object, whether the target object is a user with control authority can be determined from that identity recognition result. When the target object is determined to have control authority, the operation instruction corresponding to the target voice emotion in the voice recognition result, and the target multimedia device corresponding to that instruction, can be determined; when the target object does not have control authority, the electronic device does not respond to any voice emotion of the target object.
For example, a control authority library may be established in advance, and users having control authority may be added to it. In the management and control mode, only users added to the control authority library have control authority over the multimedia devices. After the identity recognition result of the target object is obtained, for example a result indicating that the user identity is recognized as user A, user A may be searched for in the control authority library; in a case where user A is found, it is determined that user A has the control authority, and in a case where user A is not found, it is determined that user A does not have the control authority. Or, when the identity recognition result indicates that the user identity is not recognized, it may be directly determined that the target object does not have the control authority.
Or, in the interactive mode, both the users added to the control authority library and users whose identity is not recognized have control authority over the multimedia devices. For example, when the identity recognition result indicates that the user identity is recognized as user B, user B may be searched for in the control authority library; in a case where user B is found, user B is determined to have the control authority, or, in a case where user B is not found (for example, user B is a staff member of the exhibition hall who is not in the library), user B is determined not to have the control authority. When the identity recognition result indicates that the user identity is not recognized, the target object is an audience member, and it may therefore be determined that the target object has the control authority. In this way, interaction between the speaker and the audience can be realized, while unintended control by staff moving around the speaker is avoided.
For another example, referring to fig. 3, a user control authority library may be established in advance, and a corresponding relationship between the target speech emotion and the operation instruction may be defined to establish a corresponding relationship library. The electronic equipment respectively carries out voice emotion recognition processing and identity recognition processing on the collected voice information, and then an identity recognition result and a voice emotion recognition result of the target object can be obtained. And under the condition that the voice emotion recognition result comprises the target voice emotion defined in the corresponding relation library and the target object represented by the identity recognition result is inquired in the user control authority library, the operation instruction corresponding to the target voice emotion can be sent to the target multimedia equipment. And after receiving the operation instruction, the target multimedia device responds to the operation instruction to execute the operation corresponding to the operation instruction.
Therefore, the control method of the multimedia device provided by the embodiment of the present disclosure can determine the control authority of the target object in combination with the identity recognition result obtained by voice recognition, and send the operation instruction corresponding to the target voice emotion to the target multimedia device only when the target object has the control authority. Problems such as false triggering, instruction interference and repeated instructions caused by other users during control of the multimedia devices can thus be avoided, improving the professionalism and intelligence of multimedia device control.
In a possible implementation manner, the determining whether the target object has the manipulation authority according to the identification result includes:
determining a current manipulation mode;
and determining whether the target object has the control authority or not according to the current control mode and the identity recognition result.
For example, the control modes for the multimedia devices may include a management and control mode and an interactive mode: in the management and control mode, only users whose identity can be recognized can control the multimedia devices, and in the interactive mode, any user can control the multimedia devices.
Therefore, after the identity recognition result is obtained, the current control mode can be determined, and whether the target object has the control authority is determined according to the current control mode and the identity recognition result. In a case where the target object has the control authority, the operation instruction corresponding to the target voice emotion is determined and sent to the target multimedia device, so that the target multimedia device can execute the corresponding operation.
In a possible implementation manner, the determining whether the target object has the manipulation authority according to the current manipulation mode and the identification result may include:
in response to the current control mode being the management and control mode and the identity recognition result representing the identity of the target object, determining whether the target object has the control authority according to the identity recognition result; or,
in response to the current control mode being the management and control mode and the identity recognition result representing that the identity of the target object is not recognized, determining that the target object does not have the control authority.
For example, in a case where the current control mode is the management and control mode, if the identity recognition result represents that the identity of the target object is recognized, it can be determined that the target object is a staff member of the multimedia exhibition hall and has the control authority; or, if the identity recognition result represents that the identity of the target object is not recognized, it can be determined that the target object is not a staff member of the multimedia exhibition hall (for example, a visitor or an audience member) and therefore does not have the control authority.
In a possible implementation manner, when the identity recognition result represents that the identity of the target object is recognized and that identity can be found in the control authority library, it can be determined that the target object has the control authority; or, when the identity is recognized but cannot be found in the control authority library, it can be determined that the target object does not have the control authority. In this way, misoperation by unauthorized personnel can be avoided, and the control accuracy of the multimedia devices is improved.
In a possible implementation manner, the determining whether the target object has the manipulation authority according to the current manipulation mode and the identification result may include:
and determining that the target object has the control authority in response to the current control mode being an interactive mode.
For example, when the current control mode is the interactive mode, any user has the control authority to control the multimedia devices. That is, regardless of whether the identity recognition result represents that the identity of the target object is recognized or not, the target object has the control authority.
For example, in a multimedia exhibition hall, a speaker may give a presentation at the center of the stage while the audience watches from the seating area. In a scene where the speaker does not need to interact with the audience, the current control mode can be set to the management and control mode, and the speaker controls the multimedia devices through voice emotion, for example: controlling a multimedia device to display a corresponding special effect, or controlling a multimedia device to play corresponding multimedia content.
Or, in a scene where the audience and the speaker need to interact, the current operation mode can be set as the interaction mode. In the interactive mode, the electronic equipment can acquire the voice information of the speaker to obtain the voice emotion recognition result of the speaker, can also acquire the voice information of the audience to obtain the voice emotion recognition result of the audience, and further controls at least one piece of multimedia equipment according to the voice emotion recognition result of the speaker and controls at least one piece of multimedia equipment according to the voice emotion of the audience.
For example, the electronic device may collect the voice information of the speaker and control the multimedia device to display corresponding multimedia content according to the voice emotion recognition result of that voice information, so as to accompany the speaker's presentation. Meanwhile, the electronic device may also collect the voice information of the audience, and control the multimedia device to play corresponding music or display corresponding special effects according to the voice emotion recognition result of the audience's voice information, so as to adjust the atmosphere of the exhibition hall according to the audience's reaction to the presented content. For example, when the audience's voice information corresponds to a happy emotion, a corresponding cheerful special effect may be displayed on the multimedia device, or cheerful music may be played through a loudspeaker, so that the speaker intuitively receives the audience's feedback on the presented content.
Therefore, the control method of the multimedia device provided by the embodiment of the disclosure can enrich the control modes of the multimedia device, improve the interactivity between users, increase the interestingness and further improve the user experience.
In a possible implementation manner, the determining of the operation instruction corresponding to the target voice emotion includes:
acquiring an operation list of the target object according to the identity recognition result, wherein the operation list comprises at least one corresponding relation, and the corresponding relation comprises a corresponding relation between a target voice emotion and an operation instruction;
and determining an operation instruction corresponding to the target voice emotion according to the operation list.
For example, an operation list corresponding to each user may be preset, and the operation list may include a preset correspondence between a target speech emotion and an operation instruction. In the embodiment of the disclosure, after the identification result of the target object is obtained, the operation list corresponding to the target object represented by the identification result can be obtained according to the identification result of the target object, and the operation instruction corresponding to the target speech emotion is searched from the operation list.
For example, the operation lists corresponding to different users may be the same or different. For example: in the operation list of user A, the "tense" voice emotion corresponds to an instruction for displaying a tense special effect; in the operation list of user B, the "tense" voice emotion likewise corresponds to an instruction for displaying a tense special effect; in the operation list of user C, the "tense" voice emotion corresponds to an instruction for displaying a cheerful special effect; and the operation list of user D contains no entry for the "tense" voice emotion.
In a case where the identity recognition result represents that the target object is user A (assumed to be a speaker), the operation list of user A can be obtained, the "tense" voice emotion is determined from that list to correspond to the instruction for displaying the tense special effect, and the instruction is sent to the target multimedia device; or, in a case where the identity recognition result represents that the target object is user C (assumed to be an audience member), the operation list of user C can be obtained, the "tense" voice emotion is determined to correspond to the instruction for displaying the cheerful special effect, and the instruction is sent to the target multimedia device; or, in a case where the identity recognition result represents that the target object is user D (assumed to be a staff member of the exhibition hall), the operation list of user D can be obtained, it is determined that no entry for the "tense" voice emotion exists in that list, and no response is made.
Therefore, the control method of the multimedia device provided by the embodiment of the present disclosure can obtain the operation list of the target object according to the identity recognition result and determine the operation instruction corresponding to the target voice emotion from that list, so that users can customize the voice emotions used to control the multimedia devices, enriching the control modes of the multimedia devices and improving the user experience.
In one possible implementation, the method may further include:
and aiming at the target object, responding to the setting operation aiming at the operation list, and setting at least one corresponding relation in the operation list to obtain the operation list of the target object.
For example, an operation list may be set in advance for any user. For example: the operation list of the target object may be set after the target object is determined. In the process of setting the operation list, at least one corresponding relation in the operation list can be set in response to a setting operation for the operation list (for example, a setting operation for a corresponding relation between a target speech emotion and an operation instruction in the operation list), and after setting of all the corresponding relations is completed, the operation list of the target object is obtained. And the electronic equipment can determine an operation instruction corresponding to the target voice emotion according to the operation list of the target object after recognizing the target voice emotion made by the target object, and then control the target multimedia equipment according to the operation instruction.
For example, the operation list of the target object may be empty at the time of initial setting, and the electronic device may set the corresponding relationship in the operation list in response to a setting operation for the operation list; or, the operation list of the target object is not empty when being set, that is, the operation list includes a preset corresponding relationship, the electronic device may adjust the corresponding relationship in the operation list in response to the setting operation for the operation list (for example, modify the corresponding relationship between the target speech emotion and the operation instruction, or delete the corresponding relationship between the target speech emotion and the operation instruction, and the like), so as to obtain the operation list of the target object.
In a possible implementation manner, the setting, in response to the setting operation for the operation list, at least one corresponding relationship in the operation list includes:
determining a target voice emotion from voice emotion options in response to a selected operation for the target voice emotion;
in response to the selected operation aiming at the operation instruction, determining the operation instruction which has a corresponding relation with the target voice emotion;
and storing the corresponding relation between the target voice emotion and the operation instruction into the operation list.
For example, the user may select the target speech emotion through the selection operation, and select the operation instruction having the correspondence with the target speech emotion through the selection operation, and the electronic device may store the correspondence between the selected target speech emotion and the operation instruction in the operation list.
Referring to fig. 4a, the creation interface of the operation list may include a target voice emotion option box 41, an operation instruction option box 42, a creation control 43, and an operation list 44. The electronic device may, in response to a trigger operation of the user on the target voice emotion option box 41 (for example, a click or touch operation on the box), acquire and present a voice emotion option drop-down box 411, which is used to present the voice emotion options. The electronic device may determine, in response to the user's selection operation on a voice emotion option (for example, a click or touch on the option), that the selected option is the target voice emotion. After the target voice emotion is determined, the electronic device may, in response to a trigger operation on the operation instruction option box 42, acquire and present an operation instruction option drop-down box (not shown in fig. 4a), which is used to present the operation instruction options. The electronic device may determine, in response to the user's selection operation on an operation instruction option, that the selected option is the operation instruction corresponding to the target voice emotion. The electronic device may then store the correspondence between the selected target voice emotion and the selected operation instruction in the operation list in response to a trigger operation of the user on the creation control 43 (for example, a click or touch on the control).
For example, in a case where no correspondence including the target voice emotion or the operation instruction is stored in the operation list, the electronic device may directly store the correspondence between the target voice emotion and the operation instruction in the operation list. Or, in a case where a historical correspondence including the target voice emotion or the operation instruction already exists in the operation list, prompt information may be generated to prompt the user, for example: a "duplicate setting" prompt, together with a replace control, an add control, and a cancel control.
When the user wants to modify and change the history corresponding relation, the replacement control can be triggered. The electronic device can respond to the triggering operation of the user for the replacing control, and replace the historical corresponding relation including the target voice emotion or the operation instruction in the operation list by adopting the corresponding relation of the target voice emotion and the operation instruction. In this way, modification and change of the correspondence relation in the operation list by the user can be realized.
Actually, when the user wants to modify and change the corresponding relationship in the operation list, the user may also directly select the corresponding relationship to be modified in the operation list, and directly modify and change the target speech emotion or the operation instruction in the corresponding relationship.
Alternatively, in the case where the user wants to set a plurality of triggered target speech emotions for the same operation instruction, the add control may be triggered. The electronic equipment can respond to the triggering operation of the user for the adding control, and the corresponding relation between the target voice emotion and the operation instruction is stored in an operation list. It should be noted that this method is limited to the case where the historical correspondence includes the operation instruction in the correspondence to be stored currently. In this way, extension of the target speech emotion that triggers the operation instruction can be achieved.
Or, the user may trigger the cancel control when setting an error in the correspondence to be stored currently. The electronic device can respond to the triggering operation of the user for the cancel control to cancel the storage operation. Therefore, the situation that the same target voice emotion or the same operation instruction corresponds to a plurality of corresponding relations can be avoided, and the accuracy of controlling the multimedia equipment can be improved.
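The replace/add/cancel choices above can be sketched with the operation list modeled as an emotion-keyed mapping; the function name and argument convention are illustrative assumptions, not the disclosed implementation:

```python
def store_correspondence(operation_list: dict[str, str],
                         emotion: str, instruction: str,
                         on_duplicate_emotion: str = "cancel") -> bool:
    """Store a (target voice emotion, operation instruction) pair. An
    emotion already bound in the list is a conflict: 'replace' overwrites
    the historical correspondence, 'cancel' keeps it. Binding a new emotion
    to an instruction that already appears in the list simply extends the
    set of emotions that trigger it (the 'add' case). Returns True when
    the operation list changed."""
    if emotion in operation_list:
        if on_duplicate_emotion == "replace":
            operation_list[emotion] = instruction
            return True
        return False  # 'cancel': keep the existing correspondence
    operation_list[emotion] = instruction
    return True
```

Keying the mapping by emotion also enforces the constraint noted above that one target voice emotion cannot correspond to several instructions at once.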
The above manner of displaying the voice emotion option and the operation instruction option through the drop-down box is only an example of the embodiment of the present disclosure, and actually, the user may also search and select the target voice emotion and the operation instruction having a corresponding relationship with the target voice emotion directly by inputting the target voice emotion and the operation instruction in the target voice emotion option box 41 and the operation instruction option box 42.
In another example, the electronic device may integrate a voice capture device. Referring to fig. 4b, the electronic device may include a target speech emotion collection option 45, an operation instruction option box 42, a creation control 43, and an operation list 44 on an operation interface. The electronic device can respond to the triggering operation of the user on the target speech emotion acquisition option 45, acquire the speech information of the user through the speech acquisition device, perform speech emotion recognition processing on the acquired speech information to obtain a corresponding speech emotion recognition result, and can respond to the confirmation operation of the user to determine the speech emotion represented by the speech emotion recognition result as the target speech emotion. After the target voice emotion is determined, the selection operation of the operation instruction having the corresponding relationship with the target voice emotion and the creation operation of the corresponding relationship between the target voice emotion and the operation instruction may refer to the foregoing embodiments, and details of the embodiments of the present disclosure are not repeated herein.
In this way, the setting modes of the operation list can be enriched, and the accuracy of setting the correspondence between the target voice emotion and the operation instruction can be improved. This avoids situations in which, during subsequent control operations, a non-standard target voice emotion of the target object cannot be recognized or is recognized incorrectly, thereby improving the accuracy of controlling the multimedia devices by voice emotion.
It should be noted that, in the embodiment of the present disclosure, when the correspondence between the target voice emotion and the operation instruction is set, there is no limitation on the order in which the target voice emotion and the operation instruction are selected; in practice, the operation instruction may be selected first, and then the target voice emotion corresponding to that operation instruction.
Thus, the control method of the multimedia device provided by the embodiment of the present disclosure can customize an operation list of a user, for example: the corresponding relation between the target voice emotion in the operation list and the operation instruction can be set according to personal habit and hobbies of the user, personalized customization requirements of the user can be achieved, and user experience can be improved.
In a possible implementation manner, the setting at least one corresponding relationship in the operation list in response to the setting operation for the operation list may include:
responding to the selected operation aiming at any corresponding relation in the operation list, and determining a target corresponding relation;
and in response to the deletion operation aiming at the target corresponding relation, deleting the target corresponding relation from the operation list to obtain the operation list of the target object.
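The deletion flow above can be sketched in the same emotion-keyed representation; the helper name is an illustrative assumption:

```python
def delete_correspondence(operation_list: dict[str, str],
                          emotion: str) -> bool:
    """Delete the selected target correspondence (keyed by its target voice
    emotion) from the operation list; return True when an entry was removed,
    False when no such correspondence existed."""
    return operation_list.pop(emotion, None) is not None
```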
For example, the user may delete any correspondence in the operation list. For example, in response to a user's selection operation (for example, clicking or touching the corresponding relationship) for any corresponding relationship, the electronic device may determine a target corresponding relationship for the corresponding relationship, and may pop up a modification control and a deletion control in the current presentation interface, as shown in fig. 4 c. The electronic device may delete the target correspondence from the operation list in response to a trigger operation (for example, an operation such as clicking or touching on the deletion control) of the user for the deletion control, thereby obtaining an operation list of the target object. Or, the electronic device may also execute a modification operation for the target correspondence in response to a trigger operation of the user for modifying the control (for a specific process, reference may be made to the foregoing embodiment, and details of the embodiment of the present disclosure are not described here again).
Thus, the control method of the multimedia device provided by the embodiment of the present disclosure can customize an operation list of a user, for example: the corresponding relation between the target voice emotion which is not needed in the operation list and the operation instruction can be deleted, the situations of misoperation and the like in actual operation can be avoided, the multimedia equipment can be controlled more accurately, and the user experience can be improved.
In a possible implementation manner, the determining a target multimedia device corresponding to the operation instruction includes:
determining the multimedia equipment which can be controlled by the target object according to the identification result of the target object;
and determining the multimedia equipment which can be controlled by the target object as the target multimedia equipment corresponding to the operation instruction.
For example, the multimedia devices that each user can control may be preset. For example, an exhibition includes 4 multimedia devices and two speakers, user A and user D, who give the presentation together; it may be set that user A can control multimedia device 1 and multimedia device 2, and user D can control multimedia device 3 and multimedia device 4. After the identity recognition result of the target object is obtained, the multimedia devices controllable by the target object can be determined according to the identity recognition result and taken as the target multimedia devices corresponding to the operation instruction, and the operation instruction corresponding to the target voice emotion can be sent to those target multimedia devices, so that they execute the relevant processing according to the operation instruction.
For example, suppose that user A is responsible for controlling multimedia devices 1 and 2 to present multimedia content 1, and user D is responsible for controlling multimedia devices 3 and 4 to present multimedia content 2. When user A wants to display a corresponding special effect in multimedia content 1, user A may speak with the corresponding voice emotion, for example: a "happy" voice emotion when a happy special effect is to be displayed, or a "sad" voice emotion when a sad special effect is to be displayed. The electronic device can then obtain, from the currently collected voice information, the identity recognition result "user A" and the voice emotion recognition result "happy", determine that multimedia devices 1 and 2 controllable by user A are the target multimedia devices, acquire the operation list of user A, and determine from it that "happy" corresponds to the instruction for displaying the happy special effect. The electronic device sends this instruction to multimedia devices 1 and 2, which, upon receiving it, display the happy special effect in the currently displayed multimedia content 1.
When user D wants to display a corresponding special effect in multimedia content 2, user D may likewise speak with the corresponding voice emotion, for example a "happy" voice emotion when the happy special effect is to be displayed. The electronic device can obtain, from the currently collected voice information, the identity recognition result "user D" and the voice emotion recognition result "happy", determine that multimedia devices 3 and 4 controllable by user D are the target multimedia devices, acquire the operation list of user D, and determine from it that "happy" corresponds to the instruction for displaying the happy special effect. The electronic device sends this instruction to multimedia devices 3 and 4, which, upon receiving it, display the happy special effect in the currently displayed multimedia content 2.
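Combining the per-user device assignment with the per-user operation list, the routing in the two examples above can be sketched as follows; the identifiers and instruction names are illustrative assumptions:

```python
# Hypothetical presets: which devices each user may control, and each
# user's operation list, mirroring the user A / user D example above.
USER_DEVICES = {
    "user_a": ["device-1", "device-2"],
    "user_d": ["device-3", "device-4"],
}
USER_OPERATION_LISTS = {
    "user_a": {"happy": "SHOW_HAPPY_EFFECT"},
    "user_d": {"happy": "SHOW_HAPPY_EFFECT"},
}

def route(user: str, emotion: str) -> list[tuple[str, str]]:
    """Resolve the user's instruction for the recognized target voice
    emotion and pair it with each multimedia device that user controls;
    an empty result means no instruction is sent."""
    instruction = USER_OPERATION_LISTS.get(user, {}).get(emotion)
    if instruction is None:
        return []
    return [(d, instruction) for d in USER_DEVICES.get(user, [])]
```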
It should be noted that in the above example user A and user D use the same voice emotion to control their multimedia devices to display the happy special effect; in practice, different users may also use different voice emotions to control different multimedia devices to implement the same operation, for example: user A may use a "happy" voice emotion to control multimedia devices 1 and 2 to display the happy special effect, while user D uses an "excited" voice emotion to control multimedia devices 3 and 4 to display the happy special effect, which is not limited in the present disclosure.
Therefore, according to the control method of the multimedia device provided by the embodiment of the disclosure, a plurality of users can control different multimedia devices to realize the same or different operations by adopting the same or different voice emotions through the electronic device, so that the control modes of the multimedia devices can be enriched, and the control method of the multimedia device provided by the embodiment of the disclosure can adapt to various application scenes.
The control method of the multimedia device provided by the embodiments of the present disclosure can be applied to specialized exhibition hall demonstration and explanation scenarios, such as enterprise exhibition halls and city planning halls, with a presenter or instructor as the target object, so that the multimedia devices can be conveniently controlled for demonstration and management. According to the embodiments of the present disclosure, after the operation instructions for controlling each multimedia device in the scene are bound to voice emotions, an electronic device provided with a central control system performs the recognition and decision-making, so that the presenter can achieve a smooth, fully integrated presentation and interactive experience, further enhancing the sense of technology, intelligence, and fluency of the presentation. Moreover, voice emotions can be bound to user identity information: after the identity recognition result of a user is obtained through voice recognition, whether the user has control authority is judged from that result, and the user's voice emotion is responded to only when the user has control authority. This can resolve problems such as accidental triggering by multiple users, instruction interference, and repeated instructions during the demonstration, improving the professionalism of the presentation and the presenter's control precision.
It is understood that the above method embodiments of the present disclosure can be combined with one another to form combined embodiments without departing from the underlying principles and logic; for brevity, detailed descriptions are omitted here. Those skilled in the art will appreciate that, in the above methods of the specific embodiments, the order in which the steps are executed should be determined by their functions and possible inherent logic.
In addition, the present disclosure also provides a control apparatus of a multimedia device, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any of the control methods of the multimedia device provided by the present disclosure; the corresponding technical solutions and descriptions are the same as those in the method section and are not repeated.
Fig. 5 is a block diagram of a control apparatus for a multimedia device according to an embodiment of the present disclosure. The apparatus is applied to an electronic device for controlling a plurality of multimedia devices. As shown in fig. 5, the apparatus includes:
the acquisition module 51 may be configured to acquire voice information of a target object;
the recognition module 52, which may be configured to perform voice recognition on the voice information to obtain a voice recognition result;
the control module 53 may be configured to control at least one multimedia device according to the target speech emotion in the speech recognition result.
In this way, the electronic device for controlling the plurality of multimedia devices can collect the voice information of the target object and perform voice recognition on the voice information to obtain a voice recognition result. The electronic device can then control at least one multimedia device accordingly, based on the target voice emotion in the voice recognition result. The control apparatus of the multimedia device provided by the embodiments of the present disclosure fuses voice emotion with the electronic device, so that at least one of the plurality of multimedia devices can be controlled according to voice emotion. This enriches the ways in which the electronic device can control the plurality of multimedia devices, improves the interactivity and enjoyment between the user and the multimedia devices, and improves the user experience.
In one possible implementation, the control module 53 may be further configured to:
determining an operation instruction corresponding to the target voice emotion and target multimedia equipment corresponding to the operation instruction;
and sending the operation instruction to the target multimedia equipment so that the target multimedia equipment executes the operation corresponding to the operation instruction.
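A minimal sketch of the sending step is given below. The transport is abstracted as a `send_fn` callback, which is an assumption on our part: the disclosure does not specify how instructions travel from the electronic device to the multimedia devices.

```python
def send_instruction(instruction, target_devices, send_fn):
    """Forward one operation instruction to every target multimedia device.

    send_fn(device, instruction) abstracts the real transport (network,
    serial bus, ...), which is not specified by the disclosure."""
    for device in target_devices:
        send_fn(device, instruction)
    return len(target_devices)  # number of devices the instruction went to

# Example: collect the dispatched messages instead of really sending them.
sent = []
count = send_instruction(
    "show_happy_effect",
    ["multimedia_device_3", "multimedia_device_4"],
    lambda dev, op: sent.append((dev, op)),
)
```

Each target device then executes the operation corresponding to the instruction it receives.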
In a possible implementation manner, the voice recognition result includes an identity recognition result, and the control module 53 is further configured to:
determining whether the target object has the control authority or not according to the identity recognition result;
and responding to the control authority of the target object, and determining an operation instruction corresponding to the target voice emotion and target multimedia equipment corresponding to the operation instruction.
In one possible implementation, the control module 53 may be further configured to:
determining a current manipulation mode;
and determining whether the target object has the control authority or not according to the current control mode and the identity recognition result.
In one possible implementation, the control module 53 may be further configured to:
in response to the current manipulation mode being a control mode and the identity recognition result indicating that the target object is recognized, determining whether the target object has the control authority according to the identity recognition result; or,
determining that the target object does not have the control authority, in response to the current manipulation mode being a control mode and the identity recognition result indicating that the target object is not recognized.
In one possible implementation, the control module 53 may be further configured to:
and determining that the target object has the control authority in response to the current control mode being an interactive mode.
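The mode-dependent authority check described in the last few paragraphs can be sketched as below. The mode names, the authorized-user set, and the function name are hypothetical; the disclosure only defines the decision logic, not an API.

```python
# Users registered with control authority (hypothetical sample data).
AUTHORIZED_USERS = {"user_a", "user_d"}

def has_control_authority(current_mode, identity):
    """Decide control authority from the current manipulation mode and the
    identity recognition result (None means no target object recognized)."""
    if current_mode == "control":
        # Control mode: only a recognized, authorized target object may control.
        return identity is not None and identity in AUTHORIZED_USERS
    if current_mode == "interactive":
        # Interactive mode: every speaker is granted control authority.
        return True
    return False
```

This matches the behavior above: in control mode an unrecognized speaker is always refused, while in interactive mode recognition is not required.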
In one possible implementation, the control module 53 may be further configured to:
acquiring an operation list of the target object according to the identity recognition result, wherein the operation list comprises at least one corresponding relation, and the corresponding relation comprises a corresponding relation between a target voice emotion and an operation instruction;
and determining an operation instruction corresponding to the target voice emotion according to the operation list.
In one possible implementation, the apparatus may further include:
the setting module, configured to set at least one corresponding relation in the operation list in response to a setting operation for the target object, to obtain the operation list of the target object.
In one possible implementation manner, the setting module may be further configured to:
determining a target voice emotion from voice emotion options in response to a selected operation for the target voice emotion;
in response to the selected operation aiming at the operation instruction, determining the operation instruction which has a corresponding relation with the target voice emotion;
and storing the corresponding relation between the target voice emotion and the operation instruction into the operation list.
In one possible implementation manner, the setting module may be further configured to:
responding to the selected operation aiming at any corresponding relation in the operation list, and determining a target corresponding relation;
and in response to the deletion operation aiming at the target corresponding relation, deleting the target corresponding relation from the operation list to obtain the operation list of the target object.
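The setting operations above (storing a selected emotion-instruction correspondence, and deleting a selected target correspondence) could be modeled roughly as follows. The `OperationList` class and its method names are assumptions for illustration only.

```python
class OperationList:
    """Hypothetical editable operation list for one target object, holding
    voice-emotion -> operation-instruction correspondences."""

    def __init__(self):
        self._pairs = {}

    def add(self, emotion, instruction):
        # The selected emotion and selected instruction form one correspondence.
        self._pairs[emotion] = instruction

    def delete(self, emotion):
        # Remove the selected target correspondence, if present.
        self._pairs.pop(emotion, None)

    def lookup(self, emotion):
        # Later used to determine the operation instruction for an emotion.
        return self._pairs.get(emotion)
```

A user's list would be built up through such add operations and pruned through delete operations, then consulted at recognition time to map a target voice emotion to an operation instruction.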
In one possible implementation, the control module 53 may be further configured to:
determining the multimedia equipment which can be controlled by the target object according to the identification result of the target object;
and determining the multimedia equipment which can be controlled by the target object as the target multimedia equipment corresponding to the operation instruction.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the above-mentioned method. The computer readable storage medium may be a non-volatile computer readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to invoke the memory-stored instructions to perform the above-described method.
The disclosed embodiments also provide a computer program product including computer readable code, or a non-transitory computer readable storage medium carrying computer readable code; when the computer readable code runs in a processor of an electronic device, the processor executes the above method.
The electronic device may be provided as a terminal, server, or other form of device.
Fig. 6 illustrates a block diagram of an electronic device 800 in accordance with an embodiment of the disclosure. For example, the electronic device 800 may be a terminal such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, or a personal digital assistant.
Referring to fig. 6, electronic device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 800 is in an operation mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the electronic device 800. For example, the sensor assembly 814 may detect an open/closed state of the electronic device 800, the relative positioning of components, such as a display and keypad of the electronic device 800, the sensor assembly 814 may also detect a change in the position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, orientation or acceleration/deceleration of the electronic device 800, and a change in the temperature of the electronic device 800. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a Complementary Metal Oxide Semiconductor (CMOS) or Charge Coupled Device (CCD) image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as a wireless network (WiFi), a second generation mobile communication technology (2G) or a third generation mobile communication technology (3G), or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the electronic device 800 to perform the above-described methods.
Fig. 7 illustrates a block diagram of an electronic device 1900 in accordance with an embodiment of the disclosure. For example, the electronic device 1900 may be provided as a server. Referring to fig. 7, the electronic device 1900 includes a processing component 1922, which in turn includes one or more processors, and memory resources represented by memory 1932 for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in memory 1932 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 1922 is configured to execute the instructions to perform the above-described method.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, such as the Microsoft server operating system (Windows Server™), Apple's graphical-user-interface-based operating system (Mac OS X™), the multi-user, multi-process computer operating system (Unix™), the free and open-source Unix-like operating system (Linux™), the open-source Unix-like operating system (FreeBSD™), or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the electronic device 1900 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can execute the computer-readable program instructions and implement aspects of the present disclosure by utilizing the state information of the computer-readable program instructions to personalize the electronic circuitry.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The computer program product may be embodied in hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (14)

1. A control method of a multimedia device is applied to an electronic device, the electronic device is used for controlling a plurality of multimedia devices, and the method comprises the following steps:
collecting voice information of a target object;
carrying out voice recognition on the voice information to obtain a voice recognition result;
and controlling at least one multimedia device according to the target voice emotion in the voice recognition result.
2. The method of claim 1, wherein the controlling at least one multimedia device according to the target speech emotion in the speech recognition result comprises:
determining an operation instruction corresponding to the target voice emotion and target multimedia equipment corresponding to the operation instruction;
and sending the operation instruction to the target multimedia equipment so that the target multimedia equipment executes the operation corresponding to the operation instruction.
3. The method according to claim 2, wherein the voice recognition result includes an identity recognition result, and the determining the operation instruction corresponding to the target voice emotion and the target multimedia device corresponding to the operation instruction includes:
determining whether the target object has the control authority or not according to the identity recognition result;
and responding to the control authority of the target object, and determining an operation instruction corresponding to the target voice emotion and target multimedia equipment corresponding to the operation instruction.
4. The method according to claim 3, wherein the determining whether the target object has the manipulation authority according to the identification result comprises:
determining a current manipulation mode;
and determining whether the target object has the control authority or not according to the current control mode and the identity recognition result.
5. The method according to claim 4, wherein the determining whether the target object has the manipulation authority according to the current manipulation mode and the identification result comprises:
in response to the current manipulation mode being a control mode and the identity recognition result indicating that the target object is recognized, determining whether the target object has the control authority according to the identity recognition result; or,
determining that the target object does not have the control authority, in response to the current manipulation mode being a control mode and the identity recognition result indicating that the target object is not recognized.
6. The method according to claim 4, wherein the determining whether the target object has the manipulation authority according to the current manipulation mode and the identification result comprises:
and determining that the target object has the control authority in response to the current control mode being an interactive mode.
7. The method according to any one of claims 3 to 6, wherein the determining of the operation instruction corresponding to the target voice emotion comprises:
acquiring an operation list of the target object according to the identity recognition result, wherein the operation list comprises at least one corresponding relation, and the corresponding relation comprises a corresponding relation between a target voice emotion and an operation instruction;
and determining an operation instruction corresponding to the target voice emotion according to the operation list.
8. The method of claim 7, further comprising:
for the target object, in response to the setting operation for the operation list, setting at least one corresponding relation in the operation list to obtain the operation list of the target object.
9. The method according to claim 8, wherein the setting at least one corresponding relationship in the operation list in response to the setting operation for the operation list comprises:
determining a target voice emotion from voice emotion options in response to a selected operation for the target voice emotion;
in response to the selected operation aiming at the operation instruction, determining the operation instruction which has a corresponding relation with the target voice emotion;
and storing the corresponding relation between the target voice emotion and the operation instruction into the operation list.
10. The method according to claim 8, wherein the setting at least one corresponding relationship in the operation list in response to the setting operation for the operation list comprises:
responding to the selected operation aiming at any corresponding relation in the operation list, and determining a target corresponding relation;
and in response to the deletion operation aiming at the target corresponding relation, deleting the target corresponding relation from the operation list to obtain the operation list of the target object.
11. The method of claim 10, wherein the determining the target multimedia device corresponding to the operation instruction comprises:
determining the multimedia equipment which can be controlled by the target object according to the identification result of the target object;
and determining the multimedia equipment which can be controlled by the target object as the target multimedia equipment corresponding to the operation instruction.
12. A control apparatus for a multimedia device, applied to an electronic device for controlling a plurality of multimedia devices, the apparatus comprising:
the acquisition module is used for acquiring voice information of a target object;
the recognition module, configured to perform voice recognition on the voice information to obtain a voice recognition result;
and the control module is used for controlling at least one multimedia device according to the target voice emotion in the voice recognition result.
13. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method of any one of claims 1 to 11.
14. A computer readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1 to 11.
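The three-module split in claim 12 (acquisition, recognition, control) can be sketched as a simple pipeline. Everything below is an illustrative assumption: the function names, the emotion label, and the device identifiers are not from the patent, and a real recognition module would run an emotion-recognition model rather than return a constant.

```python
# Hypothetical sketch of the acquisition/recognition/control modules of claim 12.
def acquire_voice(target_object):
    # acquisition module: obtain voice information of the target object
    return {"object": target_object, "audio": b"..."}

def recognize(voice_info):
    # recognition module: voice recognition yielding a voice recognition
    # result that contains a target voice emotion (stubbed here)
    return {"object": voice_info["object"], "emotion": "happy"}

def control(recognition_result, devices):
    # control module: control at least one multimedia device according to
    # the target voice emotion in the recognition result
    emotion = recognition_result["emotion"]
    return [f"{device}:{emotion}" for device in devices]

info = acquire_voice("user_1")
result = recognize(info)
actions = control(result, ["speaker", "tv"])
print(actions)  # ['speaker:happy', 'tv:happy']
```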
CN202110603027.XA 2021-05-31 2021-05-31 Control method and device of multimedia equipment, electronic equipment and storage medium Withdrawn CN113359980A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110603027.XA CN113359980A (en) 2021-05-31 2021-05-31 Control method and device of multimedia equipment, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110603027.XA CN113359980A (en) 2021-05-31 2021-05-31 Control method and device of multimedia equipment, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113359980A true CN113359980A (en) 2021-09-07

Family

ID=77530662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110603027.XA Withdrawn CN113359980A (en) 2021-05-31 2021-05-31 Control method and device of multimedia equipment, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113359980A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113611332A (en) * 2021-10-09 2021-11-05 聊城中赛电子科技有限公司 Intelligent control switching power supply method and device based on neural network
CN113611332B (en) * 2021-10-09 2022-01-18 聊城中赛电子科技有限公司 Intelligent control switching power supply method and device based on neural network
CN114157914A (en) * 2021-11-30 2022-03-08 深圳Tcl数字技术有限公司 Multimedia playing method, device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
KR101851474B1 (en) Method, device and terminal device for changing emoticon in chat interface
CN107644646B (en) Voice processing method and device for voice processing
US20200007944A1 (en) Method and apparatus for displaying interactive attributes during multimedia playback
CN109947981B (en) Video sharing method and device
CN108495168B (en) Bullet screen information display method and device
CN109803150B (en) Interaction method and device in live video
CN109063101B (en) Video cover generation method and device
CN112738618B (en) Video recording method and device and electronic equipment
CN107147936B (en) Display control method and device for barrage
CN113359980A (en) Control method and device of multimedia equipment, electronic equipment and storage medium
CN104267881A (en) Toolbar operating method and device
CN108174269B (en) Visual audio playing method and device
CN108803892B (en) Method and device for calling third party application program in input method
CN113778301A (en) Emotion interaction method based on content service and electronic equipment
CN114029949A (en) Robot action editing method and device, electronic equipment and storage medium
CN109151544B (en) Multimedia playing and displaying method and device
CN113359981A (en) Control method and device of multimedia equipment, electronic equipment and storage medium
CN113473225A (en) Video generation method and device, electronic equipment and storage medium
CN113031781A (en) Augmented reality resource display method and device, electronic equipment and storage medium
CN112148130A (en) Information processing method and device, electronic equipment and storage medium
CN113359978A (en) Control method and device of multimedia equipment, electronic equipment and storage medium
CN108549570B (en) User interface updating method and device
CN110620960B (en) Video subtitle processing method and device
CN113378893A (en) Data management method and device, electronic equipment and storage medium
CN109361959B (en) Bullet screen control method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210907