CN111415128A

CN111415128A - Method, system, apparatus, device and medium for controlling conference

Info

Publication number: CN111415128A
Application number: CN201910013104.9A
Authority: CN
Inventors: 孙辉; 王思杰; 李胜; 张泽旋
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2019-01-07
Filing date: 2019-01-07
Publication date: 2020-07-14

Abstract

The invention discloses a method, system, apparatus, device and medium for controlling a conference. The method comprises: parsing a received conference invitation to obtain a conference flow, wherein the conference flow comprises a conference flow keyword for each conference stage; converting received audio information into text information and identifying audio keywords in the text information; when an audio keyword is successfully matched with a conference flow keyword, playing the conference topic corresponding to the matched conference flow keyword; and generating a conference summary based on the text information. The method provided by the embodiment of the invention can improve the working efficiency of the conference.

Description

Method, system, apparatus, device and medium for controlling conference

Technical Field

The present invention relates to the field of computers, and in particular, to a method, system, apparatus, device, and medium for controlling a conference.

Background

A meeting is an organized, led, and purposeful gathering that is carried out according to a set procedure at a specified time and place. Currently, conferences can take various forms, such as teleconferences and web conferences. Web conferences, in turn, include voice conferences and video conferences.

These conferences often involve a large number of participants, and the notification, the announcement of topics, and the writing of the summary are all carried out manually. As a result, there is a technical problem of low conference work efficiency.

Disclosure of Invention

The embodiment of the invention provides a method, a system, a device, equipment and a medium for controlling a conference, which can improve the working efficiency of the conference.

In a first aspect, an embodiment of the present invention provides a method for controlling a conference, including:

analyzing the received conference invitation to obtain a conference flow, wherein the conference flow comprises a conference flow keyword of each conference stage; converting the received audio information into text information, and identifying audio keywords in the text information; when the audio keywords are successfully matched with the conference process keywords, the conference issues corresponding to the successfully matched conference process keywords are played; a conference summary is generated based on the textual information.

In a second aspect, an embodiment of the present invention provides a speech processing system, including a sound sensor and a voice processing device, the sound sensor being coupled with the voice processing device;

a sound sensor for receiving audio information;

and the voice processing equipment is used for analyzing the received conference invitation to obtain a conference flow, wherein the conference flow comprises a conference flow keyword of each conference stage, converting the received audio information into text information, identifying the audio keyword in the text information, playing the conference topic corresponding to the successfully matched conference flow keyword when the audio keyword is successfully matched with the conference flow keyword, and generating a conference summary based on the text information.

In a third aspect, an embodiment of the present invention provides an apparatus for controlling a conference, including:

the analysis module is used for analyzing the meeting invitation to obtain a meeting flow, and the meeting flow comprises a meeting flow keyword of each meeting stage; the identification module is used for converting the received audio information into text information and identifying audio keywords in the text information; the control module is used for playing the conference topic corresponding to the successfully matched conference flow keyword when the audio keyword is successfully matched with the conference flow keyword; and the generating module is used for generating the conference summary based on the text information.

In a fourth aspect, an embodiment of the present invention provides an apparatus for controlling a conference, including a memory and a processor, wherein the memory is used for storing a program, and the processor is used for executing the program stored in the memory to perform the method of controlling a conference described above in connection with the first aspect.

In a fifth aspect, an embodiment of the present invention provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to execute the method for controlling a conference described above in connection with the first aspect.

In a sixth aspect, an embodiment of the present invention provides a method for controlling a conference, including:

analyzing the meeting invitation to obtain a meeting flow, wherein the meeting flow comprises meeting flow keywords of each meeting stage; converting the received audio information into text information, and identifying audio keywords in the text information; and when the audio keywords are successfully matched with the conference process keywords, playing the conference subjects corresponding to the successfully matched conference process keywords.

In a seventh aspect, an embodiment of the present invention provides a speech processing system, including:

the sound sensor is coupled with the voice processing device; a sound sensor for receiving audio information; and the voice processing equipment is used for analyzing the conference invitation to obtain a conference flow, the conference flow comprises conference flow keywords of each conference stage, the received audio information is converted into text information, the audio keywords in the text information are identified, and when the audio keywords are successfully matched with the conference flow keywords, the conference topic corresponding to the successfully matched conference flow keywords is played.

In an eighth aspect, an embodiment of the present invention provides an apparatus for controlling a conference, including:

the analysis module is used for analyzing the meeting invitation to obtain a meeting flow, and the meeting flow comprises a meeting flow keyword of each meeting stage; the identification module is used for converting the received audio information into text information and identifying audio keywords in the text information; and the control module is used for playing the conference topic corresponding to the successfully matched conference flow keyword when the audio keyword and the conference flow keyword are successfully matched.

In a ninth aspect, an embodiment of the present invention provides an apparatus for controlling a conference, including:

a memory for storing a program; a processor for executing a program stored in the memory to perform the method of controlling a conference described above in connection with the sixth aspect.

In a tenth aspect, an embodiment of the present invention provides a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to execute the method for controlling a conference described above in connection with the sixth aspect.

According to the above technical solution, the received conference invitation is first parsed to obtain a conference flow, and the audio keywords are then identified. When an audio keyword is successfully matched with a conference flow keyword, the conference topic corresponding to the matched conference flow keyword can be played. Automatically controlling the conference process in this way can improve the working efficiency of the conference.

Drawings

The present invention will be better understood from the following description of specific embodiments thereof taken in conjunction with the accompanying drawings, in which like or similar reference characters designate like or similar features.

FIG. 1 is a diagram illustrating email in an exemplary embodiment according to the present invention;

FIG. 2 is a flow diagram illustrating a method of controlling a conference in accordance with an embodiment of the present invention;

FIG. 3 is a diagram illustrating a birthday party template according to an exemplary embodiment of the present invention;

FIG. 4 is a diagram illustrating a template for a work meeting in accordance with an exemplary embodiment of the present invention;

FIG. 5 is a schematic diagram illustrating a speech processing system according to an embodiment of the present invention;

FIG. 6 is a schematic view showing a structure of an apparatus for controlling a conference according to an embodiment of the present invention;

FIG. 7 is a flowchart illustrating a method of controlling a conference according to another embodiment of the present invention;

FIG. 8 is a schematic configuration diagram showing an apparatus for controlling a conference according to another embodiment of the present invention;

FIG. 9 is a schematic diagram showing the structure of a speech processing system according to another embodiment of the present invention;

FIG. 10 is a block diagram of an exemplary hardware architecture of a computing device of the method and apparatus for controlling a conference of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and specific embodiments.

Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Conferences take a variety of forms, such as teleconferences and web conferences. Typically, multiple people participate in the same conference. After the time and location of the meeting are determined, the participants need to be notified, which can be done in a number of ways. As one example, the participants may be notified by email.

Referring to fig. 1, fig. 1 is a schematic diagram of an email in an embodiment of the invention. It should be noted that the email in fig. 1 may be understood as an implementation manner of the meeting invitation in the embodiment of the present invention.

The email notifies the recipients of the meeting time, meeting location, meeting flow and participants. The conference proceeds according to the meeting flow, which means that a host is required to announce each conference topic. The meeting flow includes four conference topics: the leader's speech, the attendance system, canteen suggestions and the summary. In addition, after the meeting is over, the meeting summary has to be written manually. Therefore, the conference work efficiency is low.

Referring to fig. 2, fig. 2 is a flowchart illustrating a method for controlling a conference according to an embodiment of the present invention.

In FIG. 2, the meeting is initiated by a meeting invitation. Specifically, the meeting invitation can be issued by staff, or it can be generated automatically when a meeting trigger condition is met. The meeting trigger condition is a condition for initiating a meeting. As one example, the meeting trigger condition may be a trigger time point or a trigger event.

As shown in fig. 2, a method 200 for controlling a conference specifically includes the following steps:

Step S201, analyzing the received meeting invitation to obtain a meeting flow, wherein the meeting flow comprises a meeting flow keyword of each meeting stage.

A meeting offer is a request to invite a particular person to a meeting at a specified time and at a specified location. The meeting offer can be an email meeting offer or a meeting offer in office software.

As one example, in the event that a meeting needs to be initiated, an email meeting offer may be sent to the participant. As another example, a meeting request may be initiated in office software, with a participant using the office software receiving a corresponding meeting offer.

In one embodiment of the invention, meeting offer templates may be preset to account for the different scenarios involved in the meeting. On the basis of the conference invitation template, part of the conference flow is filled, so that the time for initiating the conference can be saved, and the efficiency for controlling the conference is improved.

As one example, the meeting offer template may include a birthday meeting template and a working meeting template.

Referring to fig. 3, fig. 3 is a schematic diagram of a birthday party template according to an embodiment of the present invention. The flow of each birthday meeting is the same, with the differences being the meeting time, meeting location and participants. On the basis of the birthday meeting template, only meeting time, meeting place and participants need to be filled in.

Referring to fig. 4, fig. 4 is a schematic diagram of a working meeting template in an embodiment of the present invention. Work meetings follow the same overall flow but differ in their topics. On the basis of the working meeting template, only the topics to be discussed during the meeting need to be filled in.

The conference invitation can be quickly constructed by using the conference invitation template, and the working efficiency of controlling the conference is further improved.
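As a non-limiting illustration, a template of this kind can be sketched as a partially filled record. In the Python sketch below, the class name `MeetingInvitation`, its field names, the preset `BIRTHDAY_FLOW` items and the two helper functions are all assumptions made for illustration; the actual content of FIG. 3 and FIG. 4 is not reproduced.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MeetingInvitation:
    """Hypothetical container for the fields named in the description."""
    time: str
    location: str
    participants: List[str]
    flow: List[str] = field(default_factory=list)  # one topic per meeting stage


# Assumed preset flow; the real content of FIG. 3 is not reproduced here.
BIRTHDAY_FLOW = ["opening remarks", "birthday song", "cake cutting"]


def from_birthday_template(time: str, location: str,
                           participants: List[str]) -> MeetingInvitation:
    """Fill in time, location and participants; the flow comes from the template."""
    return MeetingInvitation(time, location, list(participants), list(BIRTHDAY_FLOW))


def from_work_template(time: str, location: str, participants: List[str],
                       topics: List[str]) -> MeetingInvitation:
    """Work meetings differ in their topics, so the caller supplies them."""
    return MeetingInvitation(time, location, list(participants), list(topics))
```

Copying the preset flow into each invitation keeps later changes to one invitation from altering the template itself.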

After receiving the meeting offer, the received meeting offer may be parsed. With continued reference to FIG. 1, the meeting offer in FIG. 1 includes a meeting flow. That is, the received meeting offer is parsed to obtain the meeting flow directly.

In one embodiment of the present invention, the conference flow includes a plurality of conference topics, from which the main content of the conference discussion can be learned. The conference flow keywords can then be extracted from the conference topics. The process of extracting the conference flow keywords is similar to that of identifying audio keywords in the text. First, each conference topic is segmented into one or more word segments, and the part of speech of each word segment is tagged. Then, the conference flow keyword of the conference topic is identified based on the word segments tagged with parts of speech.

It should be noted that the conference flow keywords may be extracted while the meeting invitation is being received, that is, in real time; alternatively, they may be extracted after the meeting invitation has been received and stored.

In the above step S201, the received meeting invitation is an invitation created from a meeting invitation template, which is a template set in advance for a meeting scenario.

In an embodiment, the step of analyzing the received meeting offer to obtain the meeting flow in step S201 may specifically include:

parsing the received meeting invitation to obtain the meeting topic of each meeting stage, and extracting the meeting flow keywords from the meeting topics.
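As a non-limiting illustration, the parsing and keyword-extraction step can be sketched in Python as follows. The plain-text layout of the invitation (numbered lines, one per stage), the function names, and the rule of taking the first word of a topic as its keyword are assumptions made for this sketch. The description itself extracts keywords by word segmentation and part-of-speech tagging, which the first-word rule merely stands in for.

```python
import re
from typing import Dict, List

# Assumed plain-text layout: numbered lines such as "1. leader speech".
_STAGE_LINE = re.compile(r"^\s*(\d+)[.)]\s*(.+?)\s*$")


def parse_flow(invitation_body: str) -> List[str]:
    """Return the meeting topic of each stage, in order of appearance."""
    topics = []
    for line in invitation_body.splitlines():
        match = _STAGE_LINE.match(line)
        if match:
            topics.append(match.group(2))
    return topics


def extract_flow_keywords(topics: List[str]) -> Dict[str, str]:
    """Map each flow keyword to its topic.

    Stand-in rule (an assumption): the first word of the topic is the keyword.
    The description instead segments the topic and tags parts of speech.
    """
    return {topic.split()[0].lower(): topic for topic in topics}
```

The resulting dictionary gives a direct lookup from a recognized keyword to the topic that should be announced.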

In one embodiment, the received meeting offer comprises an email meeting offer.

Step S202, converting the received audio information into text information, and identifying audio keywords in the text information.

For the audio information of the user, that is, the received audio information, the voice signal can be converted into a digital signal, since digital signals have great advantages in storage, transmission and processing. The received audio information is buffered for further processing. To ensure the usability of the digital signal, the digital signal may be filtered. The speech signal of the audio information may then be divided into a plurality of speech frames. Acoustic features are extracted from each speech frame, that is, the waveform of each speech frame is transformed into a multi-dimensional vector. Finally, the multi-dimensional vectors are converted into text information by using an acoustic model.
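As a non-limiting illustration, the framing and feature-extraction steps can be sketched in Python as follows. The function names, the frame length and hop parameters, and the choice of log energy and zero-crossing rate as the feature vector are assumptions made for this sketch; the description does not specify which acoustic features are used, and the acoustic model itself is not shown.

```python
import math
from typing import List


def split_into_frames(samples: List[float], frame_len: int,
                      hop: int) -> List[List[float]]:
    """Cut a digitized signal into overlapping frames; a trailing partial frame is dropped."""
    return [list(samples[start:start + frame_len])
            for start in range(0, len(samples) - frame_len + 1, hop)]


def frame_features(frame: List[float]) -> List[float]:
    """Turn one frame into a small feature vector: [log energy, zero-crossing rate]."""
    energy = sum(x * x for x in frame)
    # Count sign changes between neighboring samples.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [math.log(energy + 1e-10), crossings / max(len(frame) - 1, 1)]
```

Each frame yields one vector, so a signal becomes a sequence of vectors that a downstream acoustic model could consume.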

After converting the audio information into text information, it is difficult to determine the audio keywords in the text because the text includes a plurality of words. In the embodiment of the invention, the audio keywords can be understood as words capable of reflecting main semantics of corresponding texts.

In one embodiment of the invention, the text may first be segmented into word segments. In particular, the segmentation may be based on a word bank (lexicon) or on statistics.

Word-bank-based segmentation matches candidate substrings of the text against the words in an established word bank according to a certain strategy; if a substring is found in the word bank, the matching is successful and the substring is identified as a word segment. The strategies can be classified as follows: by scanning direction, into forward matching and reverse matching; by which length is matched preferentially, into longest matching and shortest matching.
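As a non-limiting illustration, forward longest matching against a word bank can be sketched in Python as follows. The function name `forward_max_match`, the `max_len` parameter and the fallback of emitting a single character when nothing matches are assumptions made for this sketch. Unspaced English text is used here only to keep the example readable.

```python
from typing import List, Set


def forward_max_match(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Scan left to right, preferring the longest lexicon entry at each position."""
    segments = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in lexicon:
                segments.append(candidate)
                i += size
                break
    return segments
```

With the lexicon `{"meet", "meeting", "summary"}` and `max_len=7`, the text `"meetingsummary"` is split into `"meeting"` and `"summary"`, because the longer entry is tried before `"meet"`. Reverse matching would scan from the end of the text instead.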

Statistics-based segmentation may also be used to divide the text into word segments. Its principle is that, given a large amount of already segmented text, a statistical machine learning model learns the word segmentation rules and then applies them to segment new text. The main statistical models are: the N-gram model, the Hidden Markov Model (HMM), the Maximum Entropy model (ME), the Conditional Random Field model (CRF), etc.

In addition, word-bank-based segmentation and statistics-based segmentation can be combined, which exploits both the high speed and efficiency of word-bank matching and the ability of statistical methods to recognize new words and resolve ambiguity automatically from context.

After segmenting the text into segments, audio keywords need to be identified in one or more segments.

In one embodiment of the invention, the part of speech of each participle may be tagged first. Part of speech is a fundamental syntactic property of a vocabulary. Tagging part of speech refers to tagging each participle with a correct part of speech, i.e. the process of determining whether each word is a noun, verb, adjective or other part of speech.

After the part of speech of each word segment is determined, named entity recognition is performed. Named entity recognition identifies domain-specific entities, such as brands, products and models in the e-commerce field, as well as general-domain entities such as names of people, places and organizations, and times and dates. As an example, named entity recognition can use one of three methods: rule-based methods, statistics-based methods, and hybrid methods that combine rules and statistics.

Generally, the word segments corresponding to named entities are not audio keywords. Therefore, after the named entities are identified, the audio keywords can be extracted from the word segments corresponding to non-named entities.

Any one of the following methods may be employed to extract audio keywords from the word segments corresponding to non-named entities: term frequency-inverse document frequency (TF-IDF), a weighting technique commonly used in information retrieval and data mining; topic model methods; and Rapid Automatic Keyword Extraction (RAKE).
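As a non-limiting illustration, TF-IDF keyword ranking can be sketched in Python as follows. The function name `tfidf_keywords`, the `top_k` parameter, and the particular smoothed IDF formula are assumptions made for this sketch; the description names TF-IDF but does not fix a formula. The inputs are assumed to be lists of word segments from which named entities have already been removed.

```python
import math
from collections import Counter
from typing import List


def tfidf_keywords(doc: List[str], corpus: List[List[str]], top_k: int = 3) -> List[str]:
    """Rank the word segments of `doc` by TF-IDF against `corpus` (which includes `doc`)."""
    tf = Counter(doc)
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for other in corpus if term in other)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF (one common variant)
        scores[term] = (count / len(doc)) * idf
    # Sort by score, then alphabetically so ties are deterministic.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]
```

A word segment that is frequent in the current utterance but rare in other utterances receives the highest score, which is why it is a plausible keyword candidate.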

In an embodiment, the step of converting the received audio information into text information in step S202 may specifically include:

when the received audio information includes a tag, converting the audio information including the tag into text information including the tag.

In this embodiment, the step of generating the conference summary based on the text information may specifically include:

generating a conference summary corresponding to the tag according to the text information including the tag.

In one embodiment, the tag comprises a speaker tag, wherein the speaker tag is used for identifying a speaker corresponding to the audio information; alternatively, the tag includes a conference topic tag that identifies a conference topic to which the audio information pertains.

In one embodiment, the tags include a conference topic tag for identifying a conference issue to which the audio information pertains and a speaker tag for identifying a speaker in the conference issue to which the audio information pertains.

In an embodiment, the step of identifying the audio keyword in the text information in step S202 may specifically include:

Step S2021, segmenting the text into one or more word segments, and tagging the part of speech of each word segment.

Step S2022, based on the word segments labeled with parts of speech, identifying the audio keywords in the text information.

Step S203, when the audio keywords are successfully matched with the conference process keywords, the conference issues corresponding to the successfully matched conference process keywords are played.

In embodiments of the invention, the sound sensor may receive speech from different users, and the speech is received at different times. Therefore, there is more than one audio keyword; that is, a plurality of audio keywords arise during the conference.

In one embodiment, one conference stage corresponds to one conference flow keyword, and a plurality of conference stages respectively correspond to a plurality of conference flow keywords. When an audio keyword is the same as any conference flow keyword, it is determined that the audio keyword is successfully matched with that conference flow keyword.

This is schematically illustrated in connection with fig. 1. With continued reference to fig. 1, four conference stages are included in fig. 1.

The conference flow keyword of the first conference stage is: leader; the conference flow keyword of the second conference stage is: attendance; the conference flow keyword of the third conference stage is: canteen; the conference flow keyword of the fourth conference stage is: summary.

Suppose the audio keyword is: leader. The audio keyword is then successfully matched with a conference flow keyword, and the conference topic corresponding to the successfully matched conference flow keyword can be played by the voice playing device, namely: the leader's speech.

Correspondingly, if any other audio keyword is successfully matched with any conference flow keyword, the conference topic corresponding to the successfully matched conference flow keyword can be played by the voice playing device.

In an embodiment, the determining that the matching between the audio keyword and the conference process keyword is successful in step S203 may specifically include:

and if the audio keywords are the same as any conference process keywords, determining that the audio keywords are successfully matched with the conference process keywords.
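As a non-limiting illustration, the exact-match rule can be sketched in Python as follows. The function name `match_topic`, the dictionary layout and the English renderings of the FIG. 1 keywords and topics are assumptions made for this sketch; playback through a voice playing device is not shown.

```python
from typing import Dict, Iterable, Optional


def match_topic(audio_keywords: Iterable[str],
                flow_keywords: Dict[str, str]) -> Optional[str]:
    """Return the topic of the first audio keyword equal to a flow keyword, else None."""
    for keyword in audio_keywords:
        if keyword in flow_keywords:
            return flow_keywords[keyword]
    return None


# Assumed English renderings of the four stages in the FIG. 1 example.
FLOW = {
    "leader": "leader's speech",
    "attendance": "attendance system",
    "canteen": "canteen suggestions",
    "summary": "summary",
}
```

For example, `match_topic(["today", "leader"], FLOW)` returns `"leader's speech"`, the topic that would then be announced, while keywords that equal no flow keyword yield `None` and nothing is played.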

And step S204, generating a conference summary based on the text information.

The text information is obtained by converting the user's audio information received by the sound sensor. Because the content of the user's audio information includes specific content related to the conference topics, the conference summary can be generated based on the text information converted from the audio information.

Specifically, the conference summary corresponding to the text information may be generated based on a text summarization algorithm. As one example, the text summarization algorithm includes at least one of: word frequency algorithms, cue word algorithms, location algorithms, heading algorithms, vocabulary chain algorithms, and associative network algorithms.
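As a non-limiting illustration, a word frequency summarization algorithm can be sketched in Python as follows. The function name `summarize`, the sentence-splitting rule, the English-only tokenization and the use of average word frequency as the sentence score are assumptions made for this sketch; stop-word filtering, which a practical system would likely add, is omitted.

```python
import re
from collections import Counter
from typing import List


def _words(sentence: str) -> List[str]:
    return re.findall(r"[a-z']+", sentence.lower())


def summarize(text: str, max_sentences: int = 2) -> List[str]:
    """Keep the sentences whose words occur most often in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in _words(s))

    def score(sentence: str) -> float:
        tokens = _words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    chosen = set(sorted(sentences, key=score, reverse=True)[:max_sentences])
    # Return the chosen sentences in their original order.
    return [s for s in sentences if s in chosen]
```

Sentences that repeat the vocabulary of the discussion score higher than isolated remarks, so they are the ones retained in the summary.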

In one embodiment of the present invention, the method of controlling a conference may further include: a meeting summary is sent to the participants in the meeting offer. In this embodiment, the meeting offer may include the contact information of the participants, and then the meeting summary generated according to the text information may be automatically sent according to the contact information of the participants in the meeting offer.

As one example, the meeting invitation is an email meeting invitation, in which the email address of each participant is recorded. The conference summary generated from the text information is automatically sent to the email address of each participant. Each participant can thus receive the automatically generated conference summary.
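As a non-limiting illustration, composing the summary email can be sketched in Python with the standard library as follows. The function name `build_summary_email`, the subject line and the sender address are assumptions made for this sketch; the description does not specify how the message is composed or delivered.

```python
from email.message import EmailMessage
from typing import List


def build_summary_email(sender: str, participants: List[str],
                        summary: str) -> EmailMessage:
    """Compose one message addressed to every participant listed in the invitation."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(participants)
    message["Subject"] = "Meeting summary"  # assumed subject line
    message.set_content(summary)
    return message
```

Delivery could then be handled with `smtplib.SMTP.send_message`; the mail server address and credentials are deployment-specific and are left out here.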

In the embodiment of the invention, a conference flow is obtained by parsing the conference invitation; the user's audio information is converted into text, the audio keywords in the text are identified, and, when an audio keyword is successfully matched with a conference flow keyword, the conference topic corresponding to that conference flow keyword is played, which realizes automatic conference hosting and conference progress control; and a conference summary is generated based on the text. Because automatic hosting, conference progress control and automatic generation of the conference summary are realized, the working efficiency of the conference can be improved.

In one embodiment of the invention, the speech of the same speaker needs to be collected during the conference. A tag that distinguishes different speakers may then be inserted into the speaker's audio information. Thus, the received audio information may be speech including a tag. The speech including the same speaker tag can then be converted into text including the tag, and a text record can be generated from the text information including the same speaker tag. Different speakers may be identified based on their voiceprints.

In one embodiment of the invention, the utterances on the same conference topic need to be collected during the conference. Collection of the utterances on a conference topic may start when that conference topic is broadcast, that is, a tag is inserted into the audio information of each utterance from that point on. Collection stops when the next conference topic is broadcast or when the conference ends, that is, the tag is no longer inserted into the audio information of each utterance.

Thus, the received audio information may be speech including a tag. The speech including the same conference topic tag can then be converted into text including the tag, and finally a text record is generated from the text information including the same conference topic tag.

In one embodiment of the invention, a text record may also be generated based on both the speaker tag and the conference topic tag. Collection of the utterances on a conference topic may start when that conference topic is broadcast, that is, a tag is inserted into the audio information of each speaker. Collection stops when the next conference topic is broadcast or when the conference ends, that is, the tag is no longer inserted into the audio information of each speaker. The text record may then include both the speaker tag and the conference topic tag.
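As a non-limiting illustration, grouping recognized text by its tags can be sketched in Python as follows. The class name `Utterance`, its field names and the function `group_by_tags` are assumptions made for this sketch; how the tags are inserted into the audio information, and how speakers are told apart by voiceprint, is not shown.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Utterance:
    """Hypothetical record: recognized text plus the two tags from the description."""
    speaker: str
    topic: str
    text: str


def group_by_tags(utterances: List[Utterance]) -> Dict[Tuple[str, str], List[str]]:
    """Collect the text of utterances that share the same (topic, speaker) tags."""
    records: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for utterance in utterances:
        records[(utterance.topic, utterance.speaker)].append(utterance.text)
    return dict(records)
```

Each key of the result identifies one speaker within one conference topic, and its value is that speaker's text record for the topic in the order spoken.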

Referring to fig. 5, fig. 5 is a schematic structural diagram of a speech processing system according to an embodiment of the present invention. In an embodiment of the present invention, the speech processing system 500 may include a sound sensor 510 and a speech processing device 520, the sound sensor 510 being coupled to the speech processing device 520.

The sound sensor and the voice processing device in the embodiment of the present invention may be configured separately, that is, as separate devices. As an example, the sound sensor is deployed locally and the voice processing device is deployed in the cloud. The sound sensor and the voice processing device may also be provided in the same device; as an example, both are provided in a conference device.

A sound sensor is a sensor that can sense an acoustic quantity and convert it into an output signal. Sound sensors include sound pressure sensors, noise sensors, ultrasonic sensors, and microphones.

The sound sensor may collect a user's voice. The voice processing apparatus receives audio information of a user from the sound sensor. As an example, the voice processing device may play a conference issue corresponding to the successfully matched conference flow keyword, so that the user may speak according to the played conference issue to provide audio information to the sound sensor.

It should be noted that the speech processing system in fig. 5 can execute the method for controlling the conference in the embodiment of the present invention described above with reference to fig. 1 to 4. For convenience and brevity of description, detailed description of known methods is omitted here, and specific method steps for controlling a conference may refer to corresponding processes in the foregoing method embodiments, and are not described herein again.

Referring to fig. 6, fig. 6 is a schematic structural diagram of a device for controlling a conference according to an embodiment of the present invention, where the device for controlling a conference corresponds to a method for controlling a conference, and the device 600 for controlling a conference specifically includes:

a parsing module 601, configured to parse the conference invitation to obtain a conference flow, where the conference flow includes a conference flow keyword for each conference stage;

a recognition module 602, configured to convert the received audio information into text information and identify audio keywords in the text information;

a control module 603, configured to play the conference topic corresponding to the successfully matched conference flow keyword when an audio keyword is successfully matched with a conference flow keyword; and

a generating module 604, configured to generate a conference summary based on the text information.
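As an illustration only, the cooperation of the four modules can be sketched as follows. The class name `ConferenceController` and the four injected callables are hypothetical and are not part of the embodiment; each callable merely stands in for one of the modules 601 to 604.

```python
class ConferenceController:
    """Wires four callables together in the order of modules 601 to 604.

    All names here are hypothetical stand-ins for the modules described above.
    """

    def __init__(self, parse, recognize, control, generate):
        self.parse = parse          # invitation -> conference flow (module 601)
        self.recognize = recognize  # audio -> (text, audio keywords) (module 602)
        self.control = control      # (audio keywords, flow) -> played topic or None (module 603)
        self.generate = generate    # full text -> conference summary (module 604)
        self.flow = None
        self.texts = []

    def start(self, invitation):
        # Parse the invitation once, before any audio is processed.
        self.flow = self.parse(invitation)
        self.texts = []

    def on_audio(self, audio):
        # Convert audio to text, keep the text, and let the control step react.
        text, keywords = self.recognize(audio)
        self.texts.append(text)
        return self.control(keywords, self.flow)

    def finish(self):
        # Generate the summary from all text collected during the conference.
        return self.generate(" ".join(self.texts))
```

Passing the modules in as callables keeps the sketch independent of any particular speech recognizer, playback mechanism, or summarization algorithm.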

In one embodiment, the received meeting invitation is an invitation set based on a meeting invitation template, and the meeting invitation template is a template preset based on a meeting scenario.

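In one embodiment, the parsing module 601 is specifically configured to:

parse the received meeting invitation to obtain the conference topic of each conference stage, and extract the conference flow keywords from the conference topic.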

In one embodiment, the received meeting invitation comprises an email meeting invitation.
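As an illustration only, parsing an email meeting invitation that follows a template can be sketched as below. The template format ("Stage <n>: <topic>" on each agenda line), the stop-word list, and the names `parse_invitation` and `SAMPLE_INVITATION` are assumptions made for this sketch; the embodiment does not specify them. The sketch uses the Python standard-library `email` package and assumes a plain-text, non-multipart message.

```python
import re
from email import message_from_string

# Hypothetical template: each agenda line reads "Stage <n>: <topic>".
STAGE_LINE = re.compile(r"^Stage\s+(\d+):\s*(.+)$", re.MULTILINE)
# Tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "for", "to", "on", "in"}


def parse_invitation(raw_email):
    """Return one entry per conference stage: stage number, topic, keywords."""
    message = message_from_string(raw_email)
    body = message.get_payload()  # a str for a non-multipart message
    flow = []
    for number, topic in STAGE_LINE.findall(body):
        words = re.findall(r"[A-Za-z0-9]+", topic.lower())
        # Assumption: keywords are the topic's words minus stop words.
        keywords = [w for w in words if w not in STOP_WORDS]
        flow.append({"stage": int(number), "topic": topic.strip(), "keywords": keywords})
    return flow


SAMPLE_INVITATION = (
    "From: organizer@example.com\n"
    "To: alice@example.com, bob@example.com\n"
    "Subject: Weekly sync\n"
    "\n"
    "Stage 1: Review of the budget\n"
    "Stage 2: Hiring plan\n"
)
```

Calling `parse_invitation(SAMPLE_INVITATION)` yields two stages, the first with the topic "Review of the budget" and the keywords "review" and "budget".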

In one embodiment, the recognition module 602, when identifying the audio keywords in the text information, is specifically configured to:

segment the text information into one or more word segments, and label the part of speech of each word segment; and

identify the audio keywords in the text information based on the word segments labeled with parts of speech.
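As an illustration only, segmentation followed by part-of-speech labeling and keyword identification can be sketched as below. The toy lexicon, the rule that only nouns are kept as keywords, and the function names are assumptions made for this sketch; a practical system would use a trained word segmenter and tagger instead of a lookup table.

```python
import re

# Hypothetical toy lexicon mapping a word to its part of speech.
LEXICON = {
    "budget": "noun", "plan": "noun", "hiring": "noun",
    "discuss": "verb", "review": "verb", "let": "verb",
    "the": "det", "we": "pron", "us": "pron", "now": "adv",
}
# Assumption: only nouns are treated as candidate audio keywords.
KEYWORD_TAGS = {"noun"}


def segment_and_tag(text):
    """Split text into lowercase tokens and label each with a part of speech."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [(token, LEXICON.get(token, "unknown")) for token in tokens]


def identify_audio_keywords(text):
    """Keep only the tokens whose part of speech is in KEYWORD_TAGS."""
    return [word for word, tag in segment_and_tag(text) if tag in KEYWORD_TAGS]
```

For the transcribed sentence "Let us now discuss the budget", only "budget" is labeled as a noun in this lexicon, so it is the only audio keyword returned.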

In an embodiment, the control module 603 may be further specifically configured to:

if an audio keyword is identical to any conference flow keyword, determine that the audio keyword is successfully matched with that conference flow keyword, and play the conference topic corresponding to the successfully matched conference flow keyword.
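As an illustration only, the identical-keyword matching rule can be sketched as below. The mapping `flow_topics` (conference flow keyword to conference topic) and the `play` callback, which stands in for whatever playback mechanism is used, are hypothetical names introduced for this sketch.

```python
def match_and_play(audio_keywords, flow_topics, play):
    """Play the topic of the first audio keyword identical to a flow keyword.

    Returns the played topic, or None when no audio keyword matches.
    """
    for keyword in audio_keywords:
        topic = flow_topics.get(keyword)  # exact (identical) match only
        if topic is not None:
            play(topic)
            return topic
    return None
```

Because the rule requires identity, a near miss such as "budgets" does not match the flow keyword "budget" in this sketch.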

In an embodiment, the generating module 604 may be specifically configured to:

generating a conference summary corresponding to the text information based on a text summarization algorithm, wherein the text summarization algorithm comprises at least one of the following: word frequency algorithms, cue word algorithms, location algorithms, heading algorithms, vocabulary chain algorithms, and associative network algorithms.
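As an illustration only, a word frequency algorithm, one of the listed options, can be sketched as an extractive summarizer: each sentence is scored by the total frequency of its content words, and the highest-scoring sentences are kept in their original order. The stop-word list, the sentence-splitting rule, and the function names are assumptions made for this sketch.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "was", "we", "for", "and", "of", "to"}


def _content_words(text):
    """Lowercase tokens of the text with stop words removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def summarize_by_word_frequency(text, max_sentences=1):
    """Return the max_sentences highest-scoring sentences, in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    frequency = Counter(_content_words(text))
    # Score each sentence by the summed document-level frequency of its words.
    scored = [(sum(frequency[w] for w in _content_words(s)), i)
              for i, s in enumerate(sentences)]
    # Highest score first; earlier sentences win ties.
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:max_sentences]
    keep = sorted(i for _, i in best)
    return " ".join(sentences[i] for i in keep)
```

This scoring favors long sentences that repeat frequent words; normalizing by sentence length is a common refinement that is omitted here for brevity.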

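In one embodiment, the recognition module 602, when converting the received audio information into text information, is specifically configured to:

acquire the received audio information, where the audio information includes a tag; and convert the audio information into text information including the tag.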

In this embodiment, the generating module 604 may be further configured to generate a conference summary corresponding to the tag according to the text information including the tag.

In one embodiment, the apparatus 600 for controlling a conference may further include:

a sending module (not shown in FIG. 6) for sending the meeting summary to the participants in the meeting invitation.

It is to be understood that the invention is not limited to the particular arrangements and instrumentality described in the above embodiments and shown in the drawings. For convenience and brevity of description, detailed descriptions of known methods are omitted here, and specific working processes of the systems, modules and units described above may refer to corresponding processes in the foregoing described method embodiments, which are not described herein again.

Fig. 7 shows a flow diagram of a method of controlling a conference according to another embodiment of the present invention. As shown in fig. 7, in one embodiment, a method 700 of controlling a conference may include:

step S710, parsing the meeting invitation to obtain a conference flow, where the conference flow includes a conference flow keyword for each conference stage;

step S720, converting the received audio information into text information, and identifying audio keywords in the text information;

step S730, when an audio keyword is successfully matched with a conference flow keyword, playing the conference topic corresponding to the successfully matched conference flow keyword.

In one embodiment, the meeting invitation is an invitation set based on a meeting invitation template, and the meeting invitation template is a template preset according to the meeting scenario.

In one embodiment, the method 700 of controlling a conference may further include:

step S740, generating a conference summary based on the text information.

step S750, sending the meeting summary to the participants in the meeting invitation.

In an embodiment, the step of converting the received audio information into text information in step S720 may specifically include:

step S721-01, acquiring the received audio information, wherein the audio information comprises a speaker tag, and the speaker tag is used for identifying a speaker corresponding to the audio information;

step S722-01, the audio information is converted into text information including a speaker tag.

In this embodiment, the method 700 for controlling a conference may further include:

and step S723-01, generating a conference summary corresponding to the speaker tag according to the text information comprising the speaker tag.

step S724-01, the conference summary corresponding to the speaker tag is sent to the participants in the conference invitation.

In another embodiment, a conference topic tag is used in place of the speaker tag:

step S721-02, acquiring the received audio information, where the audio information includes a conference topic tag, and the conference topic tag is used for identifying the conference topic to which the audio information belongs;

step S722-02, converting the audio information into text information including the conference topic tag;

step S723-02, generating a conference summary corresponding to the conference topic tag according to the text information including the conference topic tag;

step S724-02, sending the conference summary corresponding to the conference topic tag to the participants in the conference invitation.

In yet another embodiment, both a conference topic tag and a speaker tag are used:

step S721-03, acquiring the received audio information, where the audio information includes a conference topic tag and a speaker tag, the conference topic tag is used for identifying the conference topic to which the audio information belongs, and the speaker tag is used for identifying the speaker within that conference topic;

step S722-03, converting the audio information into text information including the conference topic tag and the speaker tag;

step S723-03, generating a conference summary corresponding to the conference topic tag and the speaker tag according to the text information including the conference topic tag and the speaker tag;

step S724-03, sending the conference summary corresponding to the conference topic tag and the speaker tag to the participants in the conference invitation.
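As an illustration only, the three tagging variants above (speaker tag, conference topic tag, or both) can be sketched with a single grouping step that collects the tagged text per tag combination before a summary is generated for each group. The `Utterance` record and the function `group_text_by_tags` are hypothetical names introduced for this sketch, and the joined text merely stands in for the input to whichever summarization algorithm is used.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Utterance:
    """One piece of transcribed text together with its tags (hypothetical record)."""
    speaker: str
    topic: str
    text: str


def group_text_by_tags(utterances, tag_names):
    """Group utterance text by the chosen tags.

    tag_names is ["speaker"], ["topic"], or ["topic", "speaker"]; the result
    maps each tag combination (a tuple) to the text carrying those tags.
    """
    groups = defaultdict(list)
    for utterance in utterances:
        key = tuple(getattr(utterance, name) for name in tag_names)
        groups[key].append(utterance.text)
    return {key: " ".join(texts) for key, texts in groups.items()}
```

Grouping by `["speaker"]` corresponds to steps S721-01 to S723-01, by `["topic"]` to the topic-tag variant, and by `["topic", "speaker"]` to the combined variant; each group's text would then be summarized and sent to the participants.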

According to the method for controlling the conference, the received conference invitation is analyzed to obtain the conference flow, the audio keywords are identified, and the conference topic corresponding to the successfully matched conference flow keyword can be played under the condition that the audio keywords are successfully matched with the conference flow keyword. By the automatic control of the process of the conference, the working efficiency of the conference can be improved.

Fig. 8 is a schematic structural diagram illustrating an apparatus for controlling a conference according to another embodiment of the present invention. As shown in fig. 8, in one embodiment, an apparatus 800 for controlling a conference may comprise:

a parsing module 810, configured to parse the conference invitation to obtain a conference flow, where the conference flow includes a conference flow keyword for each conference stage;

a recognition module 820, configured to convert the received audio information into text information and identify audio keywords in the text information; and

a control module 830, configured to play the conference topic corresponding to the successfully matched conference flow keyword when an audio keyword is successfully matched with a conference flow keyword.

In one embodiment, the apparatus 800 for controlling a conference may further include:

a generating module, configured to generate a conference summary based on the text information.

In this embodiment, the apparatus 800 for controlling a conference may further include:

a sending module for sending the meeting summary to the participants in the meeting invitation.

In one embodiment, the recognition module 820, when specifically configured to convert the received audio information into text information, may specifically be configured to:

acquire the received audio information, where the audio information includes a speaker tag, and the speaker tag is used for identifying the speaker corresponding to the audio information; and convert the audio information into text information including the speaker tag.

In this case, the generating module is configured to generate a conference summary corresponding to the speaker tag according to the text information including the speaker tag;

and the sending module is configured to send the conference summary corresponding to the speaker tag to the participants in the conference invitation.

Alternatively, the recognition module 820 may be configured to: acquire the received audio information, where the audio information includes a conference topic tag, and the conference topic tag is used for identifying the conference topic to which the audio information belongs; and convert the audio information into text information including the conference topic tag.

In this case, the generating module is configured to generate a conference summary corresponding to the conference topic tag according to the text information including the conference topic tag;

and the sending module is configured to send the conference summary corresponding to the conference topic tag to the participants in the conference invitation.

Alternatively, the recognition module 820 may be configured to: acquire the received audio information, where the audio information includes a conference topic tag and a speaker tag, the conference topic tag is used for identifying the conference topic to which the audio information belongs, and the speaker tag is used for identifying the speaker within that conference topic; and convert the audio information into text information including the conference topic tag and the speaker tag.

In this case, the generating module is configured to generate a conference summary corresponding to the conference topic tag and the speaker tag according to the text information including the conference topic tag and the speaker tag;

and the sending module is configured to send the conference summary corresponding to the conference topic tag and the speaker tag to the participants in the conference invitation.

It is to be understood that the invention is not limited to the particular arrangements and instrumentality described in the above embodiments and shown in the drawings. For convenience and brevity of description, detailed descriptions of known methods are omitted here, and for the specific working processes of the systems, modules and units described above, reference may be made to the corresponding processes in the method embodiment described above with reference to fig. 7, which are not described herein again.

FIG. 9 is a schematic diagram of a speech processing system according to another embodiment of the present invention. As shown in fig. 9, the speech processing system 900 may include a sound sensor 910 and a speech processing device 920. The sound sensor is coupled to the speech processing device.

In one embodiment, the sound sensor 910 is configured to receive audio information;

and the voice processing device 920 is configured to: parse the conference invitation to obtain a conference flow, where the conference flow includes a conference flow keyword for each conference stage; convert the received audio information into text information and identify audio keywords in the text information; and, when an audio keyword is successfully matched with a conference flow keyword, play the conference topic corresponding to the successfully matched conference flow keyword.

It should be noted that the speech processing system in fig. 9 can execute the method for controlling the conference in the embodiment of the present invention described above with reference to fig. 7. For convenience and brevity of description, detailed description of known methods is omitted here, and specific method steps for controlling a conference may refer to corresponding processes in the foregoing method embodiments, and are not described herein again.

FIG. 10 is a block diagram illustrating an exemplary hardware architecture of a computing device capable of implementing the method and apparatus for controlling a conference in accordance with embodiments of the present invention.

As shown in fig. 10, computing device 1000 includes an input device 1001, an input interface 1002, a central processor 1003, a memory 1004, an output interface 1005, and an output device 1006. The input interface 1002, the central processing unit 1003, the memory 1004, and the output interface 1005 are connected to each other through a bus 710, and the input device 1001 and the output device 1006 are connected to the bus 710 through the input interface 1002 and the output interface 1005, respectively, and further connected to other components of the computing device 1000.

Specifically, the input device 1001 receives input information from the outside, and transmits the input information to the central processor 1003 via the input interface 1002; the central processor 1003 processes input information based on computer-executable instructions stored in the memory 1004 to generate output information, stores the output information temporarily or permanently in the memory 1004, and then transmits the output information to the output device 1006 through the output interface 1005; output device 1006 outputs the output information external to computing device 1000 for use by a user.

That is, in one embodiment, the computing device shown in FIG. 10 may also be implemented to include: a memory storing computer-executable instructions; and a processor which, when executing computer executable instructions, may implement the method of controlling a conference described in connection with fig. 1-6.

In one embodiment, the computing device shown in FIG. 10 may also be implemented to include: a memory storing computer-executable instructions; and a processor which, when executing the computer executable instructions, may implement the method of controlling a conference described in connection with fig. 7.

According to an embodiment of the invention, the process described above with reference to the flow chart may be implemented as a computer software program. For example, embodiments of the invention include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network, and/or installed from a removable storage medium.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method of controlling a conference, comprising:

analyzing the received conference invitation to obtain a conference flow, wherein the conference flow comprises a conference flow keyword of each conference stage;

converting the received audio information into text information, and identifying audio keywords in the text information;

when the audio keywords are successfully matched with the conference process keywords, the conference subjects corresponding to the successfully matched conference process keywords are played;

generating a conference summary based on the textual information.

2. The method of controlling a conference according to claim 1,

the received conference invitation is an invitation set based on a conference invitation template, and the conference invitation template is a template preset based on a conference scene.

3. The method of controlling a conference according to claim 1, wherein said analyzing the received conference invitation to obtain a conference flow comprises:

analyzing the received conference invitation to obtain the conference topic of each conference stage, and extracting the conference flow keywords from the conference topic.

4. The method of controlling a conference according to claim 1, wherein the received conference invitation comprises an email conference invitation.

5. The method of controlling a conference according to claim 1, wherein said identifying audio keywords in said text information comprises:

segmenting the text information into one or more word segments, and labeling the part of speech of each word segment;

6. The method for controlling a conference according to claim 1, wherein when the audio keyword is successfully matched with the conference process keyword, playing the conference topic corresponding to the successfully matched conference process keyword, comprises:

and when the audio keywords are the same as any conference process keywords, determining that the audio keywords are successfully matched with the conference process keywords, and playing the conference topic corresponding to the successfully matched conference process keywords.

7. The method of controlling a conference as claimed in claim 1, wherein said generating a conference summary based on said textual information comprises:

8. The method of controlling a conference according to claim 1, wherein said converting the received audio information into text information comprises:

9. The method of controlling a conference as claimed in claim 8, wherein said generating a conference summary based on said textual information comprises:

10. The method of controlling a conference according to claim 8,

the tags comprise speaker tags, and the speaker tags are used for identifying speakers corresponding to the audio information; or,

the tags include a conference topic tag, and the conference topic tag is used for identifying a conference topic to which the audio information belongs.

11. The method of controlling a conference according to claim 8,

the tags comprise conference subject tags and speaker tags, the conference subject tags are used for identifying conference issues to which the audio information belongs, and the speaker tags are used for identifying speakers in the conference issues to which the audio information belongs.

12. The method of controlling a conference according to claim 1, further comprising:

sending the conference summary to the participants in the conference invitation.

13. A method of controlling a conference, comprising:

analyzing the conference invitation to obtain a conference flow, wherein the conference flow comprises a conference flow keyword of each conference stage;

and when the audio keywords are successfully matched with the conference process keywords, playing the conference subjects corresponding to the successfully matched conference process keywords.

14. The method of controlling a conference according to claim 13,

the meeting invitation is an invitation set based on a meeting invitation template, and the meeting invitation template is a template preset according to a meeting scene.

15. The method of controlling a conference according to claim 13, wherein said converting the received audio information into text information comprises:

acquiring the received audio information, wherein the audio information comprises a speaker tag, and the speaker tag is used for identifying a speaker corresponding to the audio information;

converting the audio information into text information including the speaker tag.

16. The method of controlling a conference according to claim 15, further comprising:

and generating a conference summary corresponding to the speaker tag according to the text information comprising the speaker tag.

17. The method of controlling a conference according to claim 16, further comprising:

and sending the meeting summary corresponding to the speaker tag to the participants in the meeting invitation.

18. The method of controlling a conference according to claim 13, wherein said converting the received audio information into text information comprises:

acquiring the received audio information, wherein the audio information comprises a conference topic tag, and the conference topic tag is used for identifying a conference topic to which the audio information belongs;

converting the audio information into text information including the conference topic tag.

19. The method of controlling a conference according to claim 18, further comprising:

generating a conference summary corresponding to the conference topic tag according to the text information comprising the conference topic tag.

20. The method of controlling a conference according to claim 19, further comprising:

sending the conference summary corresponding to the conference topic tag to the participants in the conference invitation.

21. The method of controlling a conference according to claim 13, wherein said converting the received audio information into text information comprises:

acquiring the received audio information, wherein the audio information comprises a conference topic tag and a speaker tag, the conference topic tag is used for identifying a conference topic to which the audio information belongs, and the speaker tag is used for identifying a speaker in the conference topic to which the audio information belongs;

converting the audio information into text information including the conference topic tag and the speaker tag.

22. The method of controlling a conference according to claim 21, further comprising:

generating a conference summary corresponding to the conference topic tag and the speaker tag according to the text information comprising the conference topic tag and the speaker tag.

23. The method of controlling a conference according to claim 22, further comprising:

sending the conference summary corresponding to the conference topic tag and the speaker tag to the participants in the conference invitation.

24. A speech processing system comprising: a sound sensor and a speech processing device, the sound sensor coupled with the speech processing device;

the sound sensor is used for receiving audio information;

the speech processing device is used for:

analyzing the received conference invitation to obtain a conference flow, wherein the conference flow comprises a conference flow keyword of each conference stage,

converting the received audio information into text information, identifying audio keywords in the text information,

when the audio keywords are successfully matched with the conference process keywords, the conference subjects corresponding to the successfully matched conference process keywords are played,

generating a conference summary based on the textual information.

25. A speech processing system comprising: a sound sensor and a speech processing device, the sound sensor coupled with the speech processing device;

the sound sensor is used for receiving audio information;

the speech processing device is used for:

analyzing the conference invitation to obtain a conference flow, wherein the conference flow comprises a conference flow keyword of each conference stage,

26. An apparatus for controlling a conference, comprising:

the analysis module is used for analyzing the meeting invitation to obtain a meeting flow, and the meeting flow comprises a meeting flow keyword of each meeting stage;

the identification module is used for converting the received audio information into text information and identifying audio keywords in the text information;

the control module is used for playing the conference topic corresponding to the successfully matched conference flow keyword when the audio keyword is successfully matched with the conference flow keyword;

27. An apparatus for controlling a conference, comprising:

and the control module is used for playing the conference topic corresponding to the successfully matched conference flow keyword when the audio keyword is successfully matched with the conference flow keyword.

28. An apparatus for controlling a conference, comprising a memory and a processor;

the memory for storing executable program code;

the processor configured to read executable program code stored in the memory to perform the method of controlling a conference of any one of claims 1-12.

29. An apparatus for controlling a conference, comprising a memory and a processor;

the memory for storing executable program code;

the processor configured to read executable program code stored in the memory to perform the method of controlling a conference of any one of claims 13-23.

30. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises instructions which, when run on a computer, cause the computer to perform the method of controlling a conference according to any one of claims 1-12.

31. A computer-readable storage medium, comprising instructions which, when executed on a computer, cause the computer to perform the method of controlling a conference of any one of claims 13-23.