CN111681650A - Intelligent conference control method and device

Intelligent conference control method and device

Info

Publication number
CN111681650A
Authority
CN
China
Prior art keywords
conference
instruction
target
user
voice
Prior art date
Legal status
Pending
Application number
CN201910181807.2A
Other languages
Chinese (zh)
Inventor
李帅
李硕
董金威
李林峰
何亚明
祁越
沈勇
邢东杰
王兵
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910181807.2A
Publication of CN111681650A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 Execution procedure of a spoken command

Abstract

The disclosure relates to an intelligent conference control method and device. The method comprises the following steps: receiving a voice instruction input by a first user; determining a target conference instruction matched with the voice instruction by analyzing the voice instruction; and controlling multimedia conference equipment to provide a corresponding conference function service based on the target conference instruction. The method and the device enable the multimedia conference equipment to be controlled through a voice instruction to provide conference function services, which effectively lowers the man-machine interaction threshold of the multimedia conference equipment and improves audio/video conference efficiency.

Description

Intelligent conference control method and device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an intelligent conference control method and apparatus.
Background
With the rapid development of Internet technology, audio/video conferencing has become widely used in many aspects of users' daily life, work and study.
Starting a traditional audio/video conference and its various conference functions depends on key or touch-screen operations performed by the user on the audio/video conference equipment, which places high device-operation demands on the user. If the user has not used the audio/video conference device before, or is unfamiliar with its function keys, the user may be unable to start an audio/video conference or to use the related conference functions that the device provides.
Therefore, an intelligent conference control method is needed.
Disclosure of Invention
In view of the above, the present disclosure provides an intelligent conference control method and apparatus, so that a multimedia conference device can be controlled by a voice instruction to provide a conference function service, a man-machine interaction threshold of the multimedia conference device is effectively reduced, and audio/video conference efficiency is improved.
According to a first aspect of the present disclosure, an intelligent conference control method is provided, where the method is applied to a server, and the method includes: receiving a voice instruction input by a first user; determining a target conference instruction matched with the voice instruction by analyzing the voice instruction; and controlling the multimedia conference equipment to provide corresponding conference function services based on the target conference instruction.
In one possible implementation, determining a target conference instruction matching the voice instruction by parsing the voice instruction includes: converting the voice instruction into a text instruction by performing voice recognition on the voice instruction; and determining, in a conference instruction database, the target conference instruction matched with the text instruction by performing natural language understanding on the text instruction, wherein the conference instruction database comprises a plurality of conference instructions.
In one possible implementation, the method further includes: and sending a first prompt instruction to multimedia conference equipment, wherein the first prompt instruction is used for controlling the multimedia conference equipment to perform text display on the various conference instructions so as to prompt a user to perform voice input.
In a possible implementation manner, controlling a multimedia conference device to provide a corresponding conference function service based on the target conference instruction includes: determining target multimedia conference equipment according to the voice instruction; and controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction.
In one possible implementation, the voice instruction is input by the first user based on a multimedia conference device; determining a target multimedia conference device according to the voice instruction, comprising: receiving a region identifier of a target region where the multimedia conference device is located, wherein the region identifier is sent by the multimedia conference device; and determining the multimedia conference equipment in the target area as the target multimedia conference equipment according to the area identifier of the target area.
In one possible implementation, the voice instruction is input by the first user based on a smart sound box; determining a target multimedia conference device according to the voice instruction, comprising: determining a sound box identifier corresponding to the intelligent sound box, wherein the sound box identifier is used for indicating an area identifier of a target area where the intelligent sound box is located; and taking the multimedia conference equipment in the target area as the target multimedia conference equipment.
In one possible implementation, the target conference instruction is a conference creation instruction; controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction, wherein the conference function services comprise: and controlling the target multimedia conference equipment to create a target audio/video conference through the conference creation instruction.
In one possible implementation, the method further includes: creating a join conference instruction for the target audio/video conference, wherein the join conference instruction is used for enabling a second user to join the target audio/video conference by inputting the join conference instruction through voice.
In one possible implementation, the method further includes: determining a first Media Access Control (MAC) address of a wireless Access Point (AP) in the target area according to the area identifier; determining, according to the first MAC address, a second MAC address of a device accessing the AP; and determining, according to the second MAC address, the identity information of a third user accessing the target audio/video conference.
In a possible implementation manner, the target conference instruction is a phone call instruction, and the phone call instruction includes a target user identifier; controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction, wherein the conference function services comprise: determining a fourth user corresponding to the target user identification, and determining a telephone number of the fourth user; and controlling the target multimedia conference equipment to call the fourth user through the telephone call instruction according to the telephone number of the fourth user.
In a possible implementation manner, determining a fourth user corresponding to the target user identifier includes: when determining that a plurality of users corresponding to the target user identification exist, determining the correlation between the plurality of users and the first user according to the identity information of the first user; determining a user whose relevance to the first user exceeds a threshold as the fourth user.
In one possible implementation, the method further includes: extracting voiceprint features from the voice instruction; and determining the identity information of the first user according to the voiceprint characteristics.
In a possible implementation manner, controlling, through the target conference instruction, the target multimedia conference device to provide a corresponding conference function service includes: when it is determined that the target conference instruction can be directly responded to by the target multimedia conference equipment, controlling the target multimedia conference equipment through the target conference instruction to provide the corresponding conference function service.
In one possible implementation, the method further includes: when the target conference instruction cannot be directly responded to by the target multimedia conference equipment, determining, in a conference instruction database, an associated conference instruction corresponding to the target conference instruction, wherein the associated conference instruction can be directly responded to by the target multimedia conference equipment; and sending a second prompt instruction to the target multimedia conference device, wherein the second prompt instruction is used for controlling the target multimedia conference device to perform text display and/or voice output of the associated conference instruction so as to prompt the first user to perform voice input.
According to a second aspect of the present disclosure, there is provided an intelligent conference control method, which is applied to a multimedia conference device, the method including: receiving a voice instruction input by a first user; determining a target conference instruction matched with the voice instruction by analyzing the voice instruction; and sending the target conference instruction to a server so that the server provides corresponding conference function service based on the target conference instruction.
According to a third aspect of the present disclosure, there is provided an intelligent conference control apparatus, the apparatus being applied to a server, the apparatus including: the receiving module is used for receiving a voice instruction input by a first user; the analysis module is used for determining a target conference instruction matched with the voice instruction by analyzing the voice instruction; and the control module is used for controlling the multimedia conference equipment to provide corresponding conference function services based on the target conference instruction.
According to a fourth aspect of the present disclosure, there is provided an electronic device comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the intelligent conference control method of the first aspect.
According to a fifth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the intelligent conference control method of the first aspect described above.
According to a sixth aspect of the present disclosure, there is provided an intelligent conference control apparatus, which is applied to a multimedia conference device, the apparatus including: the receiving module is used for receiving a voice instruction input by a first user; the analysis module is used for determining a target conference instruction matched with the voice instruction by analyzing the voice instruction; and the sending module is used for sending the target conference instruction to a server so that the server provides corresponding conference function service based on the target conference instruction.
According to a seventh aspect of the present disclosure, there is provided an electronic apparatus comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to execute the intelligent conference control method of the second aspect.
According to an eighth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the intelligent conference control method of the second aspect described above.
The server receives a voice instruction input by a first user, determines a target conference instruction matched with the voice instruction by analyzing the voice instruction, and controls the multimedia conference equipment to provide corresponding conference function service based on the target conference instruction, so that the multimedia conference equipment can be controlled to provide the conference function service through the voice instruction, the man-machine interaction threshold of the multimedia conference equipment is effectively reduced, and the audio/video conference efficiency is improved.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 shows a schematic flow diagram of an intelligent conference control method according to an embodiment of the present disclosure;
fig. 2 shows a schematic diagram of an intelligent conference control system of an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a conference home page in a multimedia conferencing device according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating the linkage of the GUI and VUI in the target multimedia conferencing device according to one embodiment of the present disclosure;
fig. 5 shows a schematic diagram of a DUI in a multimedia conferencing device according to an embodiment of the present disclosure;
fig. 6 shows a schematic flow chart of an intelligent conference control method according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an intelligent conference control apparatus according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an intelligent conference control apparatus according to an embodiment of the present disclosure;
fig. 9 shows a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. As will be appreciated by those skilled in the art, and/or represents at least one of the connected objects.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Fig. 1 shows a flowchart of an intelligent conference control method according to an embodiment of the present disclosure. The method can be applied to a server, and as shown in fig. 1, the method can include:
In step S11, a voice instruction input by the first user is received.
In step S12, a target conference instruction matched with the voice instruction is determined by analyzing the voice instruction.
In step S13, the multimedia conference device is controlled to provide a corresponding conference function service based on the target conference instruction.
After receiving a voice instruction input by a first user, a server in the intelligent conference control system analyzes the voice instruction to determine the user intention of the first user, determines a target conference instruction matched with the voice instruction according to the user intention of the first user, and then controls the multimedia conference equipment to provide corresponding conference function service based on the target conference instruction.
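As a reading aid, the following is a minimal Python sketch of this three-step server-side flow; every function, field, and value in it is a hypothetical stand-in introduced for illustration and is not part of the disclosed system.

```python
# A minimal, self-contained sketch of the server-side flow of Fig. 1.
# Every name and value below is hypothetical and used only for illustration.

def parse_voice_instruction(voice_audio: bytes) -> str:
    """Stand-in for step S12 (ASR + NLU matching, sketched further below)."""
    return "create_conference"  # placeholder target conference instruction

def resolve_target_device(source_metadata: dict) -> str:
    """Stand-in for mapping the instruction source to a target device."""
    return source_metadata.get("room_id", "unknown-room")

def dispatch_to_device(device_id: str, instruction: str) -> None:
    """Stand-in for step S13: control the target multimedia conference device."""
    print(f"sending {instruction!r} to multimedia conference device in {device_id}")

def handle_voice_instruction(voice_audio: bytes, source_metadata: dict) -> None:
    # Step S11: receive the voice instruction input by the first user.
    instruction = parse_voice_instruction(voice_audio)     # step S12
    device_id = resolve_target_device(source_metadata)
    dispatch_to_device(device_id, instruction)             # step S13

handle_voice_instruction(b"...", {"room_id": "room-1024"})
```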
Fig. 2 shows a schematic diagram of an intelligent conference control system according to an embodiment of the present disclosure. As shown in fig. 2, the intelligent conference control system includes: the system comprises an intelligent sound box, multimedia conference equipment, a server side and a wireless Access Point (AP). The intelligent sound box, the multimedia conference equipment and the AP are deployed in a conference room. The intelligent conference control system can realize information linkage of the intelligent sound box, the multimedia conference equipment and the AP in a conference room so as to provide corresponding conference function services for users.
The multimedia conference device may be an audio conference device, a telephone conference device, a video conference device, etc., which is not specifically limited by this disclosure.
In one possible implementation, the voice instruction is input by the first user based on the smart speaker.
When a first user in a conference room wants multimedia conference equipment in the conference room to provide conference function services, the first user can input a voice instruction to the conference control module through a smart sound box in the conference room.
In one possible implementation, the voice instruction is input by the first user based on the multimedia conference device.
In an example, the smart speaker in the intelligent conference control system may be integrated into the multimedia conference device, so that no separate smart speaker is needed. In this case, when a first user in a conference room wants the multimedia conference device in the conference room to provide a conference function service, the first user may input a voice instruction to the server through the multimedia conference device.
In one possible implementation, determining a target conference instruction matching the voice instruction by parsing the voice instruction includes: performing voice recognition on the voice instruction to convert the voice instruction into a text instruction; and determining, in a conference instruction database, a target conference instruction matched with the text instruction by performing natural language understanding on the text instruction, wherein the conference instruction database comprises a plurality of conference instructions.
In an example, the meeting instructions database can be updated periodically.
In an example, the conference instruction database may be deployed at a server, and when the multimedia conference device receives a voice instruction input by a first user, the multimedia conference device sends the voice instruction to the server, and after receiving the voice instruction, the server parses the voice instruction to determine a target conference instruction matching the voice instruction.
In an example, the conference instruction database may be deployed at the server, when the smart speaker receives a voice instruction input by the first user, the smart speaker sends the voice instruction to the server, and after receiving the voice instruction, the server parses the voice instruction to determine a target conference instruction matched with the voice instruction.
In an example, a smart speaker server corresponding to a smart speaker is added to the smart conference control system, and then the conference instruction database may be deployed in the smart speaker server. After the intelligent sound box receives the voice instruction input by the first user, the intelligent sound box sends the voice instruction to the intelligent sound box server, the intelligent sound box server analyzes the voice instruction to determine a target conference instruction matched with the voice instruction, and then the intelligent sound box server sends the target conference instruction to the server side.
The process of analyzing the voice instruction to determine the target conference instruction matched with the voice instruction is as follows: performing voice recognition on the voice instruction by using an Automatic Speech Recognition (ASR) technology to convert the voice instruction into a text instruction; performing natural language understanding on the text instruction by using a Natural Language Processing (NLP) technology to determine the user intention of the first user; and then, according to the user intention of the first user, performing content matching against the conference instructions in the conference instruction database by means of word segmentation, so as to determine the target conference instruction matched with the text instruction, that is, the target conference instruction matched with the voice instruction.
For example, when the first user inputs the voice instruction "call A", the target conference instruction matched with the voice instruction is determined to be a telephone call instruction by analyzing the voice instruction "call A"; the telephone call instruction includes the parameter "called party", and "A" is the value assigned to that parameter.
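A toy Python sketch of the word-segmentation matching step described above is given below; the ASR and NLU stages are omitted, and the conference instruction database is reduced to a hard-coded keyword table, so all names and entries are illustrative assumptions only.

```python
# Toy illustration of matching a recognized text instruction against a
# conference instruction database by word segmentation and overlap scoring.
# The database and keyword lists below are hypothetical.

CONFERENCE_INSTRUCTION_DB = {
    "create_conference": ["initiate", "video", "conference"],
    "join_conference":   ["join", "conference"],
    "phone_call":        ["call", "phone"],
}

def match_target_instruction(text_instruction: str) -> str:
    """Return the database instruction whose keywords best overlap the text."""
    tokens = set(text_instruction.lower().split())   # naive word segmentation
    best_name, best_score = "unrecognized", 0
    for name, keywords in CONFERENCE_INSTRUCTION_DB.items():
        score = len(tokens & set(keywords))
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# A text instruction as it might come out of the ASR stage:
print(match_target_instruction("initiate the video conference"))  # create_conference
```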
In one possible implementation manner, the method further includes: extracting voiceprint features from the voice instruction; and determining the identity information of the first user according to the voiceprint characteristics.
The intelligent conference control system is bound with a voiceprint feature database, which includes a plurality of user identifiers representing user identity information and the voiceprint feature corresponding to each user identifier; the voiceprint feature database may be updated periodically. After receiving the voice instruction input by the first user, the server extracts the voiceprint features of the first user from the voice instruction, matches them against the voiceprint features in the voiceprint feature database, and determines the identity information of the first user according to the matching result. Voiceprint-based user identification allows the user to use the multimedia conference equipment without actively logging in for identity authentication; for example, the user does not need to enter an account password or perform fingerprint or face authentication, so that login-free man-machine interaction can be realized and the usage efficiency of the multimedia conference equipment is improved.
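The following Python sketch illustrates the general idea of voiceprint-based identification under simplifying assumptions: feature extraction is skipped, the voiceprint feature database is an in-memory dictionary, and cosine similarity with a fixed threshold stands in for whatever matching method the system actually uses.

```python
import math

# Hypothetical voiceprint feature database: user identifier -> feature vector.
VOICEPRINT_DB = {
    "user_alice": [0.12, 0.88, 0.35],
    "user_bob":   [0.91, 0.10, 0.42],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def identify_speaker(voiceprint, threshold=0.95):
    """Return the user identifier whose stored voiceprint matches best, if any."""
    best_user, best_sim = None, 0.0
    for user_id, stored in VOICEPRINT_DB.items():
        sim = cosine_similarity(voiceprint, stored)
        if sim > best_sim:
            best_user, best_sim = user_id, sim
    return best_user if best_sim >= threshold else None

print(identify_speaker([0.13, 0.86, 0.36]))  # -> user_alice (illustrative)
```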
In one possible implementation manner, the method further includes: and sending a first prompt instruction to the multimedia conference equipment, wherein the first prompt instruction is used for controlling the multimedia conference equipment to perform text display on various conference instructions so as to prompt a user to perform voice input.
After the multimedia conference equipment in the conference room is started, the server may send a first prompt instruction to it. In response to the first prompt instruction, the multimedia conference equipment can display, as text on its conference home page, the various conference instructions included in the conference instruction database, thereby guiding the user on screen to perform voice input as needed.
Fig. 3 is a schematic diagram illustrating a conference home page in the multimedia conference device according to an embodiment of the present disclosure. As shown in fig. 3, a plurality of conference instructions in three different scenes (a video conference scene, a teleconference scene, and a conference question-and-answer scene) are displayed as text on the conference home page to prompt the user to perform voice input as needed. For example, when a user needs to create a new video conference, the user can say "initiate the video conference" by voice through the smart speaker, following the conference instruction displayed as text on the conference home page.
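One way to picture the first prompt instruction is as a small structured message carrying the conference instructions to be displayed, grouped by scene as on the conference home page of fig. 3; the JSON shape and field names below are assumptions for illustration, not a format defined by the disclosure.

```python
import json

# Hypothetical payload of a "first prompt instruction": the server asks the
# multimedia conference device to display, as text, the available conference
# instructions grouped by scene, as on the conference home page of fig. 3.
first_prompt_instruction = {
    "type": "first_prompt",
    "display": "text",
    "instructions": {
        "video_conference": ["initiate the video conference", "join conference <conference code>"],
        "teleconference":   ["call <phone number>"],
        "conference_qa":    ["how to dial an international call"],
    },
}

print(json.dumps(first_prompt_instruction, indent=2))
```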
In one possible implementation manner, controlling the multimedia conference device to provide a corresponding conference function service based on the target conference instruction includes: determining target multimedia conference equipment according to the voice instruction; and controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction.
The intelligent conference control system comprises a plurality of multimedia conference devices deployed in different areas, and after receiving a voice instruction input by a first user through an intelligent sound box, the conference control module determines a target multimedia conference device according to the voice instruction, and then controls the target multimedia conference device to provide corresponding conference function service for the first user through the target conference instruction.
There are at least the following two manners of determining the target multimedia conference device according to the voice instruction.
The first manner is as follows:
in one possible implementation, the voice instruction is input by the first user based on the multimedia conference device; determining a target multimedia conference device according to a voice instruction, comprising: receiving a region identifier of a target region where the multimedia conference device is located, wherein the region identifier is sent by the multimedia conference device; and determining the multimedia conference equipment in the target area as target multimedia conference equipment according to the area identification of the target area.
After receiving a voice instruction input by a first user, the multimedia conference equipment sends the voice instruction and the area identifier of the target area where the multimedia conference equipment is located to the server, so that the server determines the multimedia conference equipment in the target area as the target multimedia conference equipment according to the area identifier.
The second manner is as follows:
in one possible implementation, the voice instruction is input by the first user based on the smart sound box; determining a target multimedia conference device according to a voice instruction, comprising: determining a sound box identifier corresponding to the intelligent sound box, wherein the sound box identifier is used for indicating an area identifier of a target area where the intelligent sound box is located; and taking the multimedia conference equipment in the target area as target multimedia conference equipment.
In one example, the different areas may be different conference rooms, and the target area is the target conference room.
After receiving the voice instruction input by the first user, the intelligent sound box sends the voice instruction together with its sound box identifier to the server. The sound box identifier is the unique identifier of the intelligent sound box, and the intelligent conference control system binds the sound box identifier corresponding to the intelligent sound box to the area identifier of the area where the intelligent sound box is located. According to the sound box identifier, the server can determine the corresponding area identifier, thereby determine the target area where the intelligent sound box is located, and take the multimedia conference equipment in the target area as the target multimedia conference equipment that needs to provide the conference function service for the first user.
For example, the intelligent conference control system binds the speaker identifier corresponding to the intelligent speaker to a conference Room identifier (Room ID) of a conference Room in which the intelligent speaker is located. After the first user inputs a voice instruction through the intelligent sound box, the intelligent sound box sends the voice instruction and the sound box identification corresponding to the intelligent sound box to the server, and the server can determine the Room ID corresponding to the sound box identification according to the sound box identification, so as to determine a target conference Room where the intelligent sound box is located, and take the multimedia conference equipment in the target conference Room as target multimedia conference equipment which needs to provide conference function service for the first user.
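The second manner can be sketched as two table lookups, as below; the identifiers and mappings are fabricated for illustration and simply stand for the sound box identifier to Room ID binding and the Room ID to device binding described above.

```python
from typing import Optional

# Hypothetical bindings maintained by the intelligent conference control system:
# sound box identifier -> Room ID, and Room ID -> multimedia conference device.
SPEAKER_TO_ROOM = {"speaker-7f3a": "room-1024"}
ROOM_TO_DEVICE = {"room-1024": "conference-device-03"}

def resolve_target_device_by_speaker(speaker_id: str) -> Optional[str]:
    """Second manner: resolve the target device from the sound box identifier."""
    room_id = SPEAKER_TO_ROOM.get(speaker_id)      # area identifier (Room ID)
    return ROOM_TO_DEVICE.get(room_id) if room_id else None

print(resolve_target_device_by_speaker("speaker-7f3a"))  # -> conference-device-03
```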
In one possible implementation manner, controlling the target multimedia conference device to provide a corresponding conference function service through the target conference instruction includes: when it is determined that the target conference instruction can be directly responded to by the target multimedia conference equipment, controlling the target multimedia conference equipment through the target conference instruction to provide the corresponding conference function service.
For example, the first user inputs the voice instruction "call A". By analyzing this voice instruction, the target conference instruction matched with it is determined to be a telephone call instruction, with "A" as the value assigned to the parameter "called party" in the telephone call instruction. Since the telephone call instruction can be directly responded to by the target multimedia conference device, the target multimedia conference device is controlled through the telephone call instruction to call A.
In one possible implementation manner, the method further includes: when the target conference instruction cannot be directly responded to by the target multimedia conference equipment, determining, in the conference instruction database, an associated conference instruction corresponding to the target conference instruction, wherein the associated conference instruction can be directly responded to by the target multimedia conference equipment; and sending a second prompt instruction to the target multimedia conference equipment, wherein the second prompt instruction is used for controlling the target multimedia conference equipment to perform text display and/or voice output of the associated conference instruction so as to prompt the first user to perform voice input.
When the target conference instruction matched with the voice instruction input by the first user cannot be directly responded to by the target multimedia conference equipment, the server determines, in the conference instruction database, an associated conference instruction that corresponds to the target conference instruction and can be directly responded to by the target multimedia conference equipment, and then sends a second prompt instruction to the target multimedia conference equipment. The second prompt instruction is used for controlling the target multimedia conference equipment to link its Graphical User Interface (GUI), Voice User Interface (VUI) and Dialogue User Interface (DUI), and to perform text display and/or voice output of the associated conference instruction, so as to prompt the first user to perform further voice input.
Fig. 4 is a schematic diagram illustrating GUI and VUI linkage in a target multimedia conference device according to an embodiment of the present disclosure. The first user inputs the voice instruction "I want to hold a meeting", and by analyzing it the server determines that the matching target conference instruction is a conference start instruction. However, it is unclear whether the first user wants to create a new audio/video conference, join an audio/video conference already created by another user, or place a conference call to another user. Therefore, through the second prompt instruction, the server controls the target multimedia conference equipment to link the GUI, the VUI and the DUI. As shown in fig. 4, the voice instruction "I want to hold a meeting" is displayed as text, and the associated conference instructions corresponding to the target conference instruction (the conference start instruction), namely "initiate video conference", "join conference [join conference instruction]", "call [phone number]" and "conference home page", are displayed as text and/or output as voice to prompt the first user to perform further voice input.
In an example, the VUI of the target multimedia conference device may be implemented by Text-To-Speech (TTS) technology, which is not specifically limited by this disclosure.
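A hedged Python sketch of the fallback described above follows: when the target conference instruction cannot be directly responded to, associated conference instructions are looked up and packaged into a second prompt instruction. The instruction names, option lists, and message shape are assumptions used only to make the control flow concrete.

```python
# Hypothetical fallback: when the target conference instruction cannot be
# directly responded to by the target device, look up associated conference
# instructions that can be, and build a "second prompt instruction" asking the
# first user to refine the request by voice. All names are illustrative.

ASSOCIATED_INSTRUCTIONS = {
    # an ambiguous "start a meeting" intent mapped to directly answerable options
    "conference_start": [
        "initiate video conference",
        "join conference [join conference instruction]",
        "call [phone number]",
        "conference home page",
    ],
}

def build_prompt_or_execute(target_instruction: str, directly_supported: set) -> dict:
    if target_instruction in directly_supported:
        return {"type": "execute", "instruction": target_instruction}
    return {
        "type": "second_prompt",
        "output": ["text", "voice"],   # GUI/VUI/DUI linkage on the target device
        "options": ASSOCIATED_INSTRUCTIONS.get(target_instruction, []),
    }

print(build_prompt_or_execute("conference_start", {"phone_call", "create_conference"}))
```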
In one possible implementation, the target meeting instruction is a meeting creation instruction; the method for controlling the target multimedia conference equipment to provide the corresponding conference function service through the target conference instruction comprises the following steps: and controlling the target multimedia conference equipment to create the target audio/video conference through the conference creating instruction.
For example, after the first user inputs a voice instruction "initiate a video conference", the voice instruction "initiate a video conference" is analyzed, and a target conference instruction matched with the voice instruction "initiate a video conference" is determined to be a conference creation instruction, at this time, the server controls the target multimedia conference device to create the target video conference through the conference creation instruction.
In one possible implementation manner, the method further includes: and establishing a conference joining instruction for the target audio/video conference, wherein the conference joining instruction is used for instructing a second user to join the target audio/video conference by inputting the conference joining instruction through voice.
After the server side controls the target multimedia conference equipment to create the target audio/video conference according to the voice instruction input by the first user, the server side creates a join conference instruction for the target audio/video conference so that a second user who needs to participate in the target audio/video conference can join it quickly. The second user can then input the join conference instruction through the intelligent sound box or the multimedia conference equipment in the conference room where the second user is located, so as to control that multimedia conference equipment to join the target audio/video conference.
In an example, the join conference instruction may be a 6-bit digital conference code, which is not specifically limited by this disclosure.
In an example, the second user may input the join conference instruction, by voice or manually, through the multimedia conference device in the conference room where the second user is located or through the second user's mobile terminal, so as to control that multimedia conference device or the mobile terminal to join the corresponding target audio/video conference, which is not limited in this disclosure.
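Taking the 6-bit digital conference code mentioned above as the example form of the join conference instruction, a minimal sketch of creating and registering such a code might look like the following; the in-memory registry and all names are assumptions for illustration.

```python
import secrets

# Hypothetical in-memory registry mapping a 6-bit digital conference code (the
# example join conference instruction mentioned above) to the created conference.
JOIN_CODES = {}

def create_join_instruction(conference_id: str) -> str:
    """Create and register a 6-digit join code for a newly created conference."""
    code = f"{secrets.randbelow(1_000_000):06d}"
    JOIN_CODES[code] = conference_id
    return code

code = create_join_instruction("target-audio-video-conference-001")
print(code, "->", JOIN_CODES[code])
```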
In one possible implementation manner, the method further includes: determining a first MAC address of the AP in the target area according to the area identifier; determining, according to the first MAC address, a second MAC address of a device accessing the AP; and determining, according to the second MAC address, the identity information of a third user accessing the target audio/video conference.
Still taking fig. 2 as an example, the intelligent conference control system can implement information linkage between the multimedia conference equipment in the target conference room and the AP. The target area (the target conference room) is determined according to the determined area identifier (the Room ID of the target conference room), and the intelligent conference control system binds the Room ID of the target conference room to the first MAC address of the AP in the target conference room, so that the first MAC address of the AP can be determined from the Room ID. Then, by monitoring the first MAC address, the second MAC addresses of the devices accessing the AP in the target conference room are determined, and the identity information of a third user accessing the target audio/video conference on the multimedia conference equipment in the target conference room is determined according to the second MAC addresses.
In this way, the attendance of the target audio/video conference is determined through the AP in the target conference room, and the identity information of the third user accessing the target audio/video conference can further be presented on the target multimedia conference equipment through GUI/VUI linkage, so as to show the participant details of the target audio/video conference.
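The Room ID to AP MAC to client MAC to user identity chain can be pictured with the toy lookups below; all addresses, identifiers, and table names are fabricated, and a real deployment would obtain the connected-client MAC addresses from the AP itself.

```python
# Toy lookups illustrating the Room ID -> AP MAC -> connected-client MACs ->
# user identity chain described above. All data below is fabricated.
ROOM_TO_AP_MAC = {"room-1024": "aa:bb:cc:dd:ee:01"}            # first MAC address
AP_CONNECTED_CLIENTS = {                                        # second MAC addresses
    "aa:bb:cc:dd:ee:01": ["11:22:33:44:55:66", "77:88:99:aa:bb:cc"],
}
MAC_TO_USER = {
    "11:22:33:44:55:66": "user_carol",
    "77:88:99:aa:bb:cc": "user_dave",
}

def participants_in_room(room_id: str) -> list:
    """Identify third users in the target conference room via the room's AP."""
    ap_mac = ROOM_TO_AP_MAC.get(room_id)
    client_macs = AP_CONNECTED_CLIENTS.get(ap_mac, [])
    return [MAC_TO_USER[mac] for mac in client_macs if mac in MAC_TO_USER]

print(participants_in_room("room-1024"))  # -> ['user_carol', 'user_dave']
```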
In one possible implementation manner, the target conference instruction is a telephone call instruction, and the telephone call instruction includes a target user identifier; the method for controlling the target multimedia conference equipment to provide the corresponding conference function service through the target conference instruction comprises the following steps: determining a fourth user corresponding to the target user identification, and determining a telephone number of the fourth user; and controlling the target multimedia conference equipment to call the fourth user through the telephone call instruction according to the telephone number of the fourth user.
For example, when the first user inputs the voice instruction "help me call Hua", the voice instruction is analyzed and the matching target conference instruction is determined to be a telephone call instruction, with "Hua" as the target user identifier included in the telephone call instruction. The intelligent conference control system is bound with an address book database. According to the target user identifier "Hua", the server queries the address book database, determines the fourth user (the called party) corresponding to the target user identifier "Hua" and the telephone number of the fourth user, and then, according to that telephone number, controls the target multimedia conference equipment through the telephone call instruction to call the fourth user.
In a possible implementation manner, determining a fourth user corresponding to the target user identifier includes: when a plurality of users corresponding to the target user identification are determined, determining the correlation between the plurality of users and the first user according to the identity information of the first user; and determining the user with the relevance to the first user exceeding a threshold value as a fourth user.
Because the target user identifier "Hua" is input by the first user through voice, the address book database may contain multiple users whose names are homophones of "Hua". In that case, the server determines the relevance between these users and the first user according to the identity information of the first user, and determines the user whose relevance to the first user exceeds a threshold as the fourth user. A user whose relevance to the first user exceeds the threshold can be regarded as the called party the first user is most likely intending to call. Through the binding of the intelligent conference control system with the address book database, the user does not need to perform the tedious operation of looking up a telephone number and can accurately reach the called party for a call or video through a simple voice input.
In an example, the correlation between the plurality of users and the first user may be determined by a department organization relationship or a business association relationship therebetween, which is not specifically limited by the present disclosure.
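A toy sketch of the homophone disambiguation described above: among address-book entries matching the spoken target user identifier, the candidate whose relevance to the first user exceeds a threshold is chosen as the fourth user. The relevance model (same department scores highest) and all records below are assumptions for illustration.

```python
# Toy disambiguation of a spoken callee name: among address-book users matching
# the target user identifier, pick the one whose relevance to the first user
# exceeds a threshold. The relevance model and all records are fabricated.
ADDRESS_BOOK = [
    {"name": "Hua Li",   "phone": "010-0001", "department": "Engineering"},
    {"name": "Hua Wang", "phone": "010-0002", "department": "Finance"},
]

def relevance(caller: dict, candidate: dict) -> float:
    """Assumed relevance model: same department counts most (placeholder logic)."""
    return 1.0 if caller["department"] == candidate["department"] else 0.2

def pick_fourth_user(caller: dict, candidates: list, threshold: float = 0.8):
    scored = [(relevance(caller, c), c) for c in candidates]
    best_score, best = max(scored, key=lambda item: item[0])
    return best if best_score > threshold else None

first_user = {"name": "First User", "department": "Engineering"}
print(pick_fourth_user(first_user, ADDRESS_BOOK))  # -> the "Hua" in Engineering
```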
In an example, the intelligent conference control system may provide a conversational robot (Chat Bot) service, which reduces labor service costs by offering the user help with using the multimedia conference equipment through the DUI. For example, when a user asks by voice "where is the power switch of the multimedia conference device", the server determines the target conference room according to the voice instruction, determines the model information of the target multimedia conference equipment in the target conference room and the corresponding power-on instruction, then sends the power-on instruction to the target multimedia conference equipment and feeds it back to the user through the DUI, so that the user can operate the target multimedia conference equipment by inputting the power-on instruction through voice.
Fig. 5 shows a schematic diagram of a DUI in a multimedia conferencing device according to an embodiment of the present disclosure. When a user inputs the voice instruction "how to dial an international call", the server finds the conference instructions for dialing international calls in the conference instruction database ("dial a landline telephone: 000 + international area code + landline telephone number" and "dial a mobile phone: 000 + international area code + mobile phone number"), and then controls the target multimedia conference equipment to present these conference instructions through the DUI as shown in fig. 5, so that the user can continue with the subsequent voice input for dialing an international call. This provides conference help service to the user and effectively improves the user experience.
The server receives a voice instruction input by a first user, determines a target conference instruction matched with the voice instruction by analyzing the voice instruction, and controls the multimedia conference equipment to provide a corresponding conference function service based on the target conference instruction. In this way, the multimedia conference equipment can be controlled through a voice instruction to provide conference function services, the man-machine interaction threshold of the multimedia conference equipment is effectively reduced, and the audio/video conference efficiency is improved.
Fig. 6 shows a flowchart of an intelligent conference control method according to an embodiment of the present disclosure. The method may be applied to a multimedia conference device, as shown in fig. 6, and may include:
In step S61, a voice instruction input by the first user is received.
In step S62, a target conference instruction matched with the voice instruction is determined by analyzing the voice instruction.
In step S63, the target conference instruction is sent to the server, so that the server provides a corresponding conference function service based on the target conference instruction.
Compared with the intelligent conference control method shown in fig. 1, the voice instruction input by the first user can be analyzed locally on the multimedia conference device by an edge-side algorithm; after the target conference instruction is determined through the analysis, the target conference instruction is sent to the server, so that the server provides the subsequent conference function service based on the target conference instruction.
The specific process of parsing the voice command may refer to the specific process of parsing the voice command in the embodiment shown in fig. 1, which is not described herein again.
The process of providing the corresponding conference function service by the service end based on the target conference instruction may refer to a specific process of providing the corresponding conference function service by the service end based on the target conference instruction in the embodiment shown in fig. 1, and details are not described here again.
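For contrast with the server-side method of fig. 1, the following Python sketch shows the device-side variant of fig. 6, where parsing happens locally and only the resulting target conference instruction is sent to the server; the endpoint URL, payload fields, and helper names are hypothetical, and the network call is left commented out for that reason.

```python
import json
import urllib.request

# Device-side sketch of fig. 6: parse the voice instruction locally and send
# only the resulting target conference instruction to the server. The endpoint,
# payload fields, and helper names are hypothetical.

def parse_locally(voice_audio: bytes) -> str:
    """Placeholder for on-device ASR + NLU (step S62)."""
    return "create_conference"

def send_to_server(target_instruction: str, room_id: str) -> None:
    """Step S63: send the target conference instruction to the server."""
    payload = json.dumps({"instruction": target_instruction, "room_id": room_id}).encode()
    request = urllib.request.Request(
        "http://conference-server.example/instructions",   # hypothetical endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    # urllib.request.urlopen(request)  # left commented out: the endpoint is made up
    print("would POST:", payload.decode())

send_to_server(parse_locally(b"..."), "room-1024")
```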
Fig. 7 shows a schematic structural diagram of an intelligent conference control device according to an embodiment of the present disclosure. The apparatus 70 shown in fig. 7 can be applied to a server, and the apparatus 70 can be used for executing the steps of the above-mentioned method embodiment shown in fig. 1, where the apparatus 70 includes:
a receiving module 71, configured to receive a voice instruction input by a first user;
the analysis module 72 is used for determining a target conference instruction matched with the voice instruction by analyzing the voice instruction;
and the control module 73 is configured to control the multimedia conference device to provide a corresponding conference function service based on the target conference instruction.
In one possible implementation, the parsing module 72 includes:
the voice recognition submodule is used for carrying out voice recognition on the voice instruction and converting the voice instruction into a text instruction;
and the natural language understanding sub-module is used for determining a target conference instruction matched with the text instruction in a conference instruction database by performing natural language understanding on the text instruction, wherein the conference instruction database comprises a plurality of conference instructions.
In one possible implementation, the apparatus 70 further includes:
and the sending module is used for sending a first prompt instruction to the multimedia conference equipment, and the first prompt instruction is used for controlling the multimedia conference equipment to perform text display on various conference instructions so as to prompt a user to perform voice input.
In one possible implementation, the control module 73 includes:
the determining submodule is used for determining target multimedia conference equipment according to the voice instruction;
and the control submodule is used for controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction.
In one possible implementation, the voice instruction is input by the first user based on the multimedia conference device;
the determination submodule includes:
the receiving unit is used for receiving the area identification of the target area where the multimedia conference equipment is located, which is sent by the multimedia conference equipment;
and the first determining unit is used for determining the multimedia conference equipment in the target area as the target multimedia conference equipment according to the area identifier of the target area.
In one possible implementation, the voice instruction is input by the first user based on the smart sound box;
the determination submodule includes:
the second determination unit is used for determining a sound box identifier corresponding to the intelligent sound box, and the sound box identifier is used for indicating an area identifier of a target area where the intelligent sound box is located;
and the third determining unit is used for taking the multimedia conference equipment in the target area as the target multimedia conference equipment.
In one possible implementation, the target meeting instruction is a meeting creation instruction;
the control sub-module is specifically configured to:
and controlling the target multimedia conference equipment to create the target audio/video conference through the conference creating instruction.
In one possible implementation, the apparatus 70 further includes:
and the instruction creating module is used for creating a conference adding instruction for the target audio/video conference, and the conference adding instruction is used for indicating a second user to add the conference adding instruction into the target audio/video conference through voice input.
In one possible implementation, the apparatus 70 further includes:
the first determining module is used for determining a first MAC address of the AP in the target area according to the area identifier;
the second determining module is used for determining a second MAC address of the access AP according to the first MAC address;
and the third determining module is used for determining the identity information of a third user accessing the target audio/video conference according to the second MAC address.
In one possible implementation manner, the target conference instruction is a telephone call instruction, and the telephone call instruction includes a target user identifier;
the control sub-module includes:
the third determining unit is used for determining a fourth user corresponding to the target user identification and determining the telephone number of the fourth user;
and the control unit is used for controlling the target multimedia conference equipment to call the fourth user through a telephone call instruction according to the telephone number of the fourth user.
In one possible implementation manner, the third determining unit includes:
the first determining subunit is used for determining the correlation between the plurality of users and the first user according to the identity information of the first user when determining that the plurality of users corresponding to the target user identification exist;
and the second determining subunit is used for determining the user with the correlation with the first user exceeding the threshold value as the fourth user.
In one possible implementation, the apparatus 70 further includes:
the voice print extraction module is used for extracting voice print characteristics from the voice command;
and the fourth determining module is used for determining the identity information of the first user according to the voiceprint characteristics.
In one possible implementation, the control sub-module is specifically configured to:
and when the target conference instruction is determined to be directly responded by the target multimedia conference equipment, controlling the target multimedia conference equipment to provide corresponding conference function service through the target conference instruction.
In one possible implementation, the control sub-module includes:
the fourth determining unit is used for determining an associated conference instruction corresponding to the target conference instruction in the conference instruction database when the target conference instruction cannot be directly responded by the target multimedia conference equipment, and the associated conference instruction can be directly responded by the target multimedia conference equipment;
and the control unit is used for sending a second prompt instruction to the target multimedia conference equipment, and the second prompt instruction is used for controlling the target multimedia conference equipment to perform text display and/or voice output on the associated conference instruction so as to prompt the first user to perform voice input.
The apparatus 70 provided in the present disclosure can implement each step in the method embodiment shown in fig. 1, and implement the same technical effect, and is not described herein again to avoid repetition.
Fig. 8 shows a schematic structural diagram of an intelligent conference control device according to an embodiment of the present disclosure. The apparatus 80 shown in fig. 8 can be applied to a multimedia conference device, and the apparatus 80 can be used for executing the steps of the above-mentioned method embodiment shown in fig. 6, and the apparatus 80 includes:
a receiving module 81, configured to receive a voice instruction input by a first user;
the analysis module 82 is used for determining a target conference instruction matched with the voice instruction by analyzing the voice instruction;
and a sending module 83, configured to send the target conference instruction to the server, so that the server provides a corresponding conference function service based on the target conference instruction.
Fig. 9 shows a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 9, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include an internal memory, such as a Random-Access Memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one double-headed arrow is shown in fig. 9, but this does not indicate that there is only one bus or one type of bus.
The memory is used for storing the program. In particular, the program may include program code comprising computer operating instructions. The memory may include both internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the nonvolatile memory to the memory and then runs the computer program to form the intelligent conference control device on the logic level. The processor executes the program stored in the memory and specifically executes: receiving a voice instruction input by a first user; determining a target conference instruction matched with the voice instruction by analyzing the voice instruction; and controlling the multimedia conference equipment to provide corresponding conference function services based on the target conference instruction.
In one possible implementation, the processor is specifically configured to perform: performing voice recognition on the voice command to convert the voice command into a text command; and determining a target conference instruction matched with the text instruction in a conference instruction database by performing natural language understanding on the text instruction, wherein the conference instruction database comprises a plurality of conference instructions.
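As a rough illustration of this matching step, the sketch below compares the recognized text instruction against a small conference instruction database using a simple string-similarity heuristic. The example database entries, the use of difflib, and the 0.6 threshold are assumptions made for illustration; the disclosure does not prescribe a particular natural language understanding technique:

```python
# Illustrative sketch: match a recognized text instruction against a small
# conference instruction database with a string-similarity heuristic.
from difflib import SequenceMatcher

CONFERENCE_INSTRUCTION_DATABASE = [
    "create a conference",
    "join the conference",
    "call a contact",
    "end the conference",
]

def match_target_instruction(text_instruction: str, threshold: float = 0.6) -> str | None:
    """Return the database entry most similar to the text instruction, if any."""
    best_entry, best_score = None, 0.0
    for entry in CONFERENCE_INSTRUCTION_DATABASE:
        score = SequenceMatcher(None, text_instruction.lower(), entry).ratio()
        if score > best_score:
            best_entry, best_score = entry, score
    return best_entry if best_score >= threshold else None
```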
In one possible implementation, the processor is specifically configured to perform: and sending a first prompt instruction to the multimedia conference equipment, wherein the first prompt instruction is used for controlling the multimedia conference equipment to perform text display on various conference instructions so as to prompt a user to perform voice input.
In one possible implementation, the processor is specifically configured to perform: determining target multimedia conference equipment according to the voice instruction; and controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction.
In one possible implementation, the voice instruction is input by the first user based on the multimedia conference device; the processor is specifically configured to perform: receiving an area identifier of a target area where the multimedia conference device is located, wherein the area identifier is sent by the multimedia conference device; and determining the multimedia conference equipment in the target area as the target multimedia conference equipment according to the area identifier of the target area.
In one possible implementation, the voice instruction is input by the first user based on a smart speaker; the processor is specifically configured to perform: determining a speaker identifier corresponding to the smart speaker, wherein the speaker identifier is used for indicating an area identifier of a target area where the smart speaker is located; and taking the multimedia conference equipment in the target area as the target multimedia conference equipment.
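A minimal sketch of this device-resolution step is shown below. The speaker-to-area and area-to-device mapping tables are hypothetical example data; the disclosure only requires that the speaker identifier indicates the area identifier of the target area:

```python
# Sketch of resolving target multimedia conference devices from a speaker
# identifier; the mapping tables are hypothetical example data.
SPEAKER_TO_AREA = {"speaker-17": "room-301"}           # speaker id -> area id
AREA_TO_DEVICES = {"room-301": ["conf-device-301a"]}   # area id -> device ids

def resolve_target_devices(speaker_id: str) -> list[str]:
    area_id = SPEAKER_TO_AREA.get(speaker_id)  # area identifier indicated by the speaker id
    if area_id is None:
        return []
    # The multimedia conference devices in the target area become the targets.
    return AREA_TO_DEVICES.get(area_id, [])
```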
In one possible implementation, the target conference instruction is a conference creation instruction; the processor is specifically configured to perform: controlling the target multimedia conference equipment to create a target audio/video conference through the conference creation instruction.
In one possible implementation, the processor is specifically configured to perform: and establishing a conference joining instruction for the target audio/video conference, wherein the conference joining instruction is used for instructing a second user to join the target audio/video conference by inputting the conference joining instruction through voice.
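The following sketch illustrates creating a target audio/video conference and registering a conference joining instruction that a second user could later speak. The device.start_conference call, the use of a random identifier, and the join-phrase format are assumptions chosen for illustration:

```python
# Sketch of creating a target audio/video conference and registering a join
# instruction that a second user can later speak; all names are illustrative.
import uuid

def create_target_conference(device, conferences: dict) -> str:
    conference_id = str(uuid.uuid4())[:8]
    device.start_conference(conference_id)             # assumed device API
    join_phrase = f"join conference {conference_id}"   # the conference joining instruction
    conferences[join_phrase] = conference_id           # a second user may speak this phrase
    return join_phrase
```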
In one possible implementation, the processor is specifically configured to perform: determining a first media access control (MAC) address of a wireless access point (AP) in the target area according to the area identifier; determining, according to the first MAC address, a second MAC address of a device accessing the AP; and determining, according to the second MAC address, the identity information of a third user accessing the target audio/video conference.
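The Wi-Fi-based participant identification described above can be sketched as a chain of lookups: area identifier to AP MAC address (the first MAC address), AP to the MAC addresses of devices associated with it (the second MAC addresses), and device MAC address to user identity. The lookup tables below are hypothetical example data:

```python
# Sketch of Wi-Fi based participant identification via chained lookups;
# the lookup tables are hypothetical example data.
AREA_TO_AP_MAC = {"room-301": "AA:BB:CC:00:11:22"}
AP_CLIENT_MACS = {"AA:BB:CC:00:11:22": ["10:20:30:40:50:60"]}
MAC_TO_USER = {"10:20:30:40:50:60": {"user_id": "u-42", "name": "third user"}}

def identify_participants(area_id: str) -> list[dict]:
    first_mac = AREA_TO_AP_MAC.get(area_id)          # first MAC: the AP in the target area
    if first_mac is None:
        return []
    second_macs = AP_CLIENT_MACS.get(first_mac, [])  # second MACs: devices on the AP
    # Identity information of users whose devices access the AP.
    return [MAC_TO_USER[m] for m in second_macs if m in MAC_TO_USER]
```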
In one possible implementation manner, the target conference instruction is a telephone call instruction, and the telephone call instruction includes a target user identifier; the processor is specifically configured to perform: determining a fourth user corresponding to the target user identification, and determining a telephone number of the fourth user; and controlling the target multimedia conference equipment to call the fourth user through the telephone call instruction according to the telephone number of the fourth user.
In one possible implementation, the processor is specifically configured to perform: when a plurality of users corresponding to the target user identification are determined, determining the correlation between the plurality of users and the first user according to the identity information of the first user; and determining the user with the relevance to the first user exceeding a threshold value as a fourth user.
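A possible way to realize this disambiguation is sketched below. The relevance heuristic (shared department plus shared meeting history) and the 0.5 threshold are assumptions chosen for illustration; the disclosure does not fix a particular correlation measure:

```python
# Sketch of disambiguating among several users matching the spoken target user
# identifier; the relevance heuristic and threshold are assumptions.
def select_fourth_user(candidates: list[dict], first_user: dict,
                       threshold: float = 0.5) -> dict | None:
    def relevance(candidate: dict) -> float:
        score = 0.0
        if candidate.get("department") == first_user.get("department"):
            score += 0.6
        shared = set(candidate.get("meetings", [])) & set(first_user.get("meetings", []))
        score += min(0.4, 0.1 * len(shared))
        return score

    scored = [(relevance(c), c) for c in candidates]
    if not scored:
        return None
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score > threshold else None
```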
In one possible implementation, the processor is specifically configured to perform: extracting voiceprint features from the voice instruction; and determining the identity information of the first user according to the voiceprint characteristics.
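As an illustration of voiceprint-based identification, the sketch below compares an utterance embedding against enrolled embeddings using cosine similarity. The embedding values, the enrollment store, and the 0.8 threshold are assumptions; the disclosure does not specify how voiceprint features are computed or compared:

```python
# Sketch of voiceprint-based identification: compare an utterance embedding
# against enrolled embeddings with cosine similarity (all values illustrative).
import math

ENROLLED_VOICEPRINTS = {"first_user": [0.12, 0.80, 0.55]}   # hypothetical embeddings

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify_by_voiceprint(embedding: list[float], threshold: float = 0.8) -> str | None:
    best_id, best_sim = None, 0.0
    for user_id, enrolled in ENROLLED_VOICEPRINTS.items():
        sim = cosine(embedding, enrolled)
        if sim > best_sim:
            best_id, best_sim = user_id, sim
    return best_id if best_sim >= threshold else None
```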
In one possible implementation, the processor is specifically configured to perform: when it is determined that the target conference instruction can be directly responded to by the target multimedia conference equipment, controlling the target multimedia conference equipment to provide the corresponding conference function service through the target conference instruction.
In one possible implementation, the processor is specifically configured to perform: when the target conference instruction cannot be directly responded to by the target multimedia conference equipment, determining an associated conference instruction corresponding to the target conference instruction in a conference instruction database, wherein the associated conference instruction can be directly responded to by the target multimedia conference equipment; and sending a second prompt instruction to the target multimedia conference equipment, wherein the second prompt instruction is used for controlling the target multimedia conference equipment to perform text display and/or voice output on the associated conference instruction so as to prompt the first user to perform voice input.
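The fallback behavior described above can be sketched as follows. The can_respond, execute, and prompt methods on the device object, and the mapping from unsupported to associated instructions, are illustrative assumptions:

```python
# Sketch of the fallback path: when the target device cannot directly respond
# to the target conference instruction, look up an associated instruction it
# can respond to and prompt the first user to re-issue it by voice.
ASSOCIATED_INSTRUCTIONS = {
    "share whiteboard": "share screen",   # unsupported instruction -> supported one
}

def handle_instruction(device, instruction: str) -> None:
    if device.can_respond(instruction):
        device.execute(instruction)        # direct response by the target device
        return
    associated = ASSOCIATED_INSTRUCTIONS.get(instruction)
    if associated is not None:
        # Second prompt instruction: display and/or speak the associated
        # instruction so that the first user can input it by voice.
        device.prompt(text=associated, speak=True)
```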
The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or execute the methods, steps, and logical blocks disclosed in the embodiments of the present specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. The steps of the method disclosed in connection with the embodiments of the present specification may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as a random-access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable read-only memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device may execute the method of the method embodiment shown in fig. 1 and implement the functions of that embodiment, which are not described herein again in this specification.
The present specification also proposes a computer-readable storage medium storing one or more programs, where the one or more programs include instructions, which when executed by an electronic device including a plurality of application programs, enable the electronic device to execute the intelligent conference control method in the embodiment shown in fig. 1, and specifically perform the steps of the method embodiment shown in fig. 1.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, thereby implementing aspects of the present disclosure.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or technical improvements over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (21)

1. An intelligent conference control method is applied to a server side, and comprises the following steps:
receiving a voice instruction input by a first user;
determining a target conference instruction matched with the voice instruction by analyzing the voice instruction;
and controlling the multimedia conference equipment to provide corresponding conference function services based on the target conference instruction.
2. The method of claim 1, wherein determining a target conference instruction matched with the voice instruction by analyzing the voice instruction comprises:
converting the voice instruction into a text instruction by performing voice recognition on the voice instruction;
and determining the target conference instruction matched with the text instruction in a conference instruction database by performing natural language understanding on the text instruction, wherein the conference instruction database comprises a plurality of conference instructions.
3. The method of claim 2, further comprising:
and sending a first prompt instruction to multimedia conference equipment, wherein the first prompt instruction is used for controlling the multimedia conference equipment to perform text display on the various conference instructions so as to prompt a user to perform voice input.
4. The method of claim 1, wherein controlling the multimedia conference device to provide the corresponding conference function service based on the target conference instruction comprises:
determining target multimedia conference equipment according to the voice instruction;
and controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction.
5. The method of claim 4, wherein the voice instruction is input by the first user based on a multimedia conference device;
determining a target multimedia conference device according to the voice instruction, comprising:
receiving an area identifier of a target area where the multimedia conference device is located, wherein the area identifier is sent by the multimedia conference device;
and determining the multimedia conference equipment in the target area as the target multimedia conference equipment according to the area identifier of the target area.
6. The method of claim 4, wherein the voice instruction is input by the first user based on a smart speaker;
determining a target multimedia conference device according to the voice instruction, comprising:
determining a speaker identifier corresponding to the smart speaker, wherein the speaker identifier is used for indicating an area identifier of a target area where the smart speaker is located;
and taking the multimedia conference equipment in the target area as the target multimedia conference equipment.
7. The method of claim 5 or 6, wherein the target conference instruction is a conference creation instruction;
controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction, wherein the conference function services comprise:
and controlling the target multimedia conference equipment to create a target audio/video conference through the conference creation instruction.
8. The method of claim 7, further comprising:
and establishing a conference joining instruction for the target audio/video conference, wherein the conference joining instruction is used for instructing a second user to join the target audio/video conference by inputting the conference joining instruction through voice.
9. The method of claim 7, further comprising:
determining a first Media Access Control (MAC) address of a wireless Access Point (AP) in the target area according to the area identifier;
determining, according to the first MAC address, a second MAC address of a device accessing the AP;
and determining the identity information of a third user accessing the target audio/video conference according to the second MAC address.
10. The method of claim 5 or 6, wherein the target conference instruction is a phone call instruction, and the phone call instruction comprises a target user identifier;
controlling the target multimedia conference equipment to provide corresponding conference function services through the target conference instruction, wherein the conference function services comprise:
determining a fourth user corresponding to the target user identification, and determining a telephone number of the fourth user;
and controlling the target multimedia conference equipment to call the fourth user through the telephone call instruction according to the telephone number of the fourth user.
11. The method of claim 10, wherein determining the fourth user corresponding to the target user identifier comprises:
when determining that a plurality of users corresponding to the target user identification exist, determining the correlation between the plurality of users and the first user according to the identity information of the first user;
determining a user whose relevance to the first user exceeds a threshold as the fourth user.
12. The method according to claim 1 or 11, characterized in that the method further comprises:
extracting voiceprint features from the voice instruction;
and determining the identity information of the first user according to the voiceprint characteristics.
13. The method of claim 4, wherein controlling the target multimedia conference device to provide the corresponding conference function service through the target conference instruction comprises:
and when it is determined that the target conference instruction can be directly responded to by the target multimedia conference equipment, controlling the target multimedia conference equipment to provide the corresponding conference function service through the target conference instruction.
14. The method of claim 13, further comprising:
when the target conference instruction cannot be directly responded to by the target multimedia conference equipment, determining an associated conference instruction corresponding to the target conference instruction in a conference instruction database, wherein the associated conference instruction can be directly responded to by the target multimedia conference equipment;
and sending a second prompt instruction to the target multimedia conference device, wherein the second prompt instruction is used for controlling the target multimedia conference device to perform text display and/or voice output on the associated conference instruction so as to prompt the first user to perform voice input.
15. An intelligent conference control method is applied to multimedia conference equipment, and comprises the following steps:
receiving a voice instruction input by a first user;
determining a target conference instruction matched with the voice instruction by analyzing the voice instruction;
and sending the target conference instruction to a server so that the server provides corresponding conference function service based on the target conference instruction.
16. An intelligent conference control device, which is applied to a server, the device comprising:
the receiving module is used for receiving a voice instruction input by a first user;
the analysis module is used for determining a target conference instruction matched with the voice instruction by analyzing the voice instruction;
and the control module is used for controlling the multimedia conference equipment to provide corresponding conference function services based on the target conference instruction.
17. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the intelligent conference control method of any of claims 1-14.
18. A non-transitory computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the intelligent conference control method of any of claims 1-14.
19. An intelligent conference control device, wherein the device is applied to a multimedia conference device, the device comprises:
the receiving module is used for receiving a voice instruction input by a first user;
the analysis module is used for determining a target conference instruction matched with the voice instruction by analyzing the voice instruction;
and the sending module is used for sending the target conference instruction to a server so that the server provides corresponding conference function service based on the target conference instruction.
20. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the intelligent conference control method of claim 15.
21. A non-transitory computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the intelligent conference control method of claim 15.
CN201910181807.2A 2019-03-11 2019-03-11 Intelligent conference control method and device Pending CN111681650A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910181807.2A CN111681650A (en) 2019-03-11 2019-03-11 Intelligent conference control method and device

Publications (1)

Publication Number Publication Date
CN111681650A true CN111681650A (en) 2020-09-18

Family

ID=72451222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910181807.2A Pending CN111681650A (en) 2019-03-11 2019-03-11 Intelligent conference control method and device

Country Status (1)

Country Link
CN (1) CN111681650A (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101022581A (en) * 2007-03-20 2007-08-22 中国移动通信集团江苏有限公司 Self-helping type mobile meeting telephone system and method thereof
CN101159788A (en) * 2007-11-23 2008-04-09 华为技术有限公司 Method and apparatus for calling subscriber to add into session
CN101287044A (en) * 2008-05-14 2008-10-15 华为技术有限公司 Sound processing method, device and system
CN101355612A (en) * 2008-09-22 2009-01-28 深圳华为通信技术有限公司 Terminal equipment, network equipment and method for rapidly establishing telephone conference
CN103064379A (en) * 2012-12-20 2013-04-24 黑龙江省电力有限公司信息通信分公司 Conference room intelligent control method based on voice recognition
CN104468138A (en) * 2013-09-17 2015-03-25 杭州华为企业通信技术有限公司 Implementation method of multimedia conference
CN104219388A (en) * 2014-08-28 2014-12-17 小米科技有限责任公司 Voice control method and device
US9131112B1 (en) * 2014-09-29 2015-09-08 Edifire LLC Dynamic signaling and resource allocation in secure media-based conferencing
CN105489231A (en) * 2014-10-13 2016-04-13 中兴通讯股份有限公司 Conference recording method, apparatus and system
US9652113B1 (en) * 2016-10-06 2017-05-16 International Business Machines Corporation Managing multiple overlapped or missed meetings
CN206865475U (en) * 2017-06-29 2018-01-09 安徽听见科技有限公司 A kind of intelligent meeting system
CN109274847A (en) * 2018-11-06 2019-01-25 芋头科技(杭州)有限公司 Call method and device, intelligent sound equipment and controller and medium
CN109348164A (en) * 2018-11-19 2019-02-15 国网山东省电力公司信息通信公司 A kind of self-service guarantee control system of teleconference

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
马彦彬; 王南洋: "Design and Implementation of an IP-Based Centralized Conference System", no. 12 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113472797A (en) * 2021-07-07 2021-10-01 深圳市万桥技术有限公司 Contact center system multimedia channel access method and device
WO2023045742A1 (en) * 2021-09-24 2023-03-30 International Business Machines Corporation Management of devices in smart environment
CN115002394A (en) * 2022-05-27 2022-09-02 中移互联网有限公司 Multi-party conference method and device based on 5G message and electronic equipment

Similar Documents

Publication Publication Date Title
EP3050051B1 (en) In-call virtual assistants
CN106782551B (en) Voice recognition system and method
US9292488B2 (en) Method for embedding voice mail in a spoken utterance using a natural language processing computer system
US8515025B1 (en) Conference call voice-to-name matching
US9538348B2 (en) Method and message server for routing a speech message
CN107205097B (en) Mobile terminal searching method and device and computer readable storage medium
US9236048B2 (en) Method and device for voice controlling
CN111681650A (en) Intelligent conference control method and device
CN109873907B (en) Call processing method, device, computer equipment and storage medium
WO2015149359A1 (en) Method for automatically adjusting volume, volume adjustment apparatus and electronic device
CN110619878B (en) Voice interaction method and device for office system
CN114514577A (en) Method and system for generating and transmitting a text recording of a verbal communication
CN103701994A (en) Automatic responding method and automatic responding device
CN111783481A (en) Earphone control method, translation method, earphone and cloud server
US20210312143A1 (en) Real-time call translation system and method
US11783836B2 (en) Personal electronic captioning based on a participant user's difficulty in understanding a speaker
US9812131B2 (en) Identifying and displaying call participants using voice sample
CN110865789A (en) Method and system for intelligently starting microphone based on voice recognition
CN111385185A (en) Information processing method, computer device, and computer-readable storage medium
CN111554280A (en) Real-time interpretation service system for mixing interpretation contents using artificial intelligence and interpretation contents of interpretation experts
CN111355838A (en) Voice call recognition method and device and storage medium
CN112969000A (en) Control method and device of network conference, electronic equipment and storage medium
EP3035207A1 (en) Speech translation device
US10178227B2 (en) Personalizing the audio visual experience during telecommunications
CN106373016A (en) Method and apparatus for establishing online social relations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200918