CN115623132A

CN115623132A - Intelligent conference system

Info

Publication number: CN115623132A
Application number: CN202211442190.3A
Authority: CN
Inventors: 刘丹; 汤跃忠; 杨静波; 陈龙
Original assignee: Third Research Institute Of China Electronics Technology Group Corp; Beijing Zhongdian Huisheng Technology Co ltd
Current assignee: Third Research Institute Of China Electronics Technology Group Corp; Beijing Zhongdian Huisheng Technology Co ltd
Priority date: 2022-11-18
Filing date: 2022-11-18
Publication date: 2023-01-17
Anticipated expiration: 2042-11-18
Also published as: CN115623132B

Abstract

The invention discloses an intelligent conference system. The intelligent conference system includes: a text display box for displaying a conference record, the conference record carrying a plurality of marks that distinguish the utterances of different speakers; and a plurality of name setting modules, each mark corresponding to one name setting module, the name setting module being configured to pop up an option box when a user performs a first trigger action. The option box includes a plurality of name options that can be triggered by a second trigger action of the user; when a name option is triggered, it replaces the mark corresponding to the option box. With the invention, speaker names can be set quickly, the preliminary voiceprint enrollment step is omitted, and working efficiency is greatly improved. During subsequent speech recognition the text can be matched to the speaker's name promptly, and whenever a machine recognition error is found before, during, or after the conference, the speaker's name can be corrected so that the voiceprint and the name are matched and amended.

Description

Intelligent conference system

Technical Field

The invention relates to the technical field of conference recording, in particular to an intelligent conference system.

Background

A conference system that applies speech recognition technology can convert speech into text to form a conference record. Such a system can usually distinguish different speakers by means of a microphone array or voiceprint recognition: when a speaker starts to talk, the intelligent conference system adds a mark such as 'speaker 1' or 'speaker 2' in front of the spoken content. However, the mark does not correspond directly to a person's name. This increases the user's memory load when compiling the conference summary, makes it hard to locate a given speaker quickly, and makes organizing the conference record laborious and inefficient.

In the related art, to match utterances with names, each user must read a preset text aloud in advance and enter his or her name to complete voiceprint enrollment; that is, voiceprints must be collected from all participants beforehand.

Disclosure of Invention

Embodiments of the invention provide an intelligent conference system to solve the prior-art problem that matching names with conference records has low feasibility.

The intelligent conference system according to the embodiment of the invention comprises:

a text display box for displaying a conference record, the conference record carrying a plurality of marks for distinguishing the utterances of different speakers;

a plurality of name setting modules, each mark corresponding to one name setting module, the name setting modules being configured to pop up an option box when a user performs a first trigger action;

the option box comprising a plurality of name options that can be triggered by a second trigger action of the user, wherein when a name option is triggered, that name option replaces the mark corresponding to the option box.

According to some embodiments of the invention, the intelligent conference system further comprises:

a participant function module for entering and saving the names of participants and synchronizing those names to the option boxes of all the name setting modules so as to form the name options in the option boxes.

According to some embodiments of the invention, any utterance in the conference record can be selected by a third trigger action of the user;

after selecting an utterance, the user can trigger any participant name listed in the participant function module through a fourth trigger action, so that the participant name is displayed before the utterance.

According to some embodiments of the invention, when any participant name listed in the participant function module is displayed before the utterance, a name setting module is also provided for that utterance.

According to some embodiments of the invention, any utterance in the conference record, when selected, is displayed differently from unselected utterances.

According to some embodiments of the invention, the name options in the option box are generated by the user entering and saving them in the option box.

According to some embodiments of the invention, the marks comprise a moderator mark and a speaker mark;

the name setting modules comprise a moderator name setting module and a speaker name setting module;

all the moderator marks correspond to one moderator name setting module, and each speaker mark corresponds to one speaker name setting module.

According to some embodiments of the present invention, the option box of the speaker name setting module further comprises a current option and a full-text option, both of which can be triggered by a fifth trigger action of the user;

when the user triggers the current option and a name option, the mark corresponding to the option box is replaced with that name option;

when the user triggers the full-text option and a name option, every mark in the conference record that is identical to the mark corresponding to the option box is replaced with that name option.

With the embodiments of the invention, speaker names can be set quickly through a suitable human-machine interaction before the conference starts, while it is in progress, and after it ends, so the preliminary voiceprint enrollment step is omitted and working efficiency is greatly improved. During subsequent speech recognition the text can be matched to the speaker's name promptly, and whenever a machine recognition error is found before, during, or after the conference, the speaker's name can be corrected so that the voiceprint and the name are matched and amended.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. In the drawings:

FIG. 1 is a schematic diagram of a name input interface of a participant function module in an embodiment of the invention;

FIG. 2 is a schematic diagram of a name display interface of a function module of a participant according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a local interface of an intelligent conference system in an embodiment of the invention;

FIG. 4 is a schematic diagram of a local interface of an intelligent conference system in an embodiment of the invention;

FIG. 5 is a schematic diagram of a local interface of an intelligent conference system in an embodiment of the invention;

FIG. 6 is a schematic diagram of a local interface of an intelligent conference system in an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Additionally, in some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

The text display box is used for displaying a conference record, and the conference record carries a plurality of marks for distinguishing the utterances of different speakers.

A conference system that applies speech recognition technology can convert speech into text to form a conference record. At present, a conference system can usually distinguish different speakers by means of a microphone array or voiceprint recognition, and each time a speaker starts to speak, the system adds a mark such as "speaker 1" or "speaker 2" before the spoken content to indicate which speaker the utterance belongs to. Each mark may correspond to one or more utterances. Two adjacent marks are never the same, but two non-adjacent marks may be; for example, the marks may read "speaker 1", "speaker 2", and "speaker 1" in sequence.
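The marking rule described above can be sketched in Python. This is a minimal illustration, not the patented implementation: it assumes the recognizer already yields (speaker index, text) pairs in spoken order, and `Block` and `build_record` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One mark plus the utterances displayed under it."""
    mark: str
    utterances: list[str] = field(default_factory=list)


def build_record(segments: list[tuple[int, str]]) -> list[Block]:
    """Group recognized utterances under marks.

    `segments` is an assumed input format: (speaker index, recognized text)
    pairs in spoken order. A new mark is added only when the speaker changes,
    so adjacent marks always differ while non-adjacent marks may repeat.
    """
    record: list[Block] = []
    for speaker_id, text in segments:
        mark = f"speaker {speaker_id}"
        if record and record[-1].mark == mark:
            # Same speaker keeps talking: append under the existing mark.
            record[-1].utterances.append(text)
        else:
            # Speaker changed: start a new block with its own mark.
            record.append(Block(mark, [text]))
    return record


record = build_record(
    [(1, "Hello."), (1, "Shall we begin?"), (2, "Yes."), (1, "Good.")]
)
marks = [block.mark for block in record]
```

With this input, the first block holds two utterances under a single mark, and the first speaker's mark reappears after the second speaker's block.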

Each mark corresponds to one name setting module, and the name setting module is configured to pop up an option box when a user performs a first trigger action.

The first trigger action may be a preset action such as a single click, a double click, or a right click, and is not specifically limited here. The second trigger action may likewise be a preset action such as a single click, a double click, or a right click; it is not specifically limited either and can be set according to the user's habits.

It will be appreciated that the name setting module pops up an option box whenever the user triggers it. In other words, performing the first trigger action is equivalent to issuing a control instruction to the name setting module, which pops up the option box in response. The option box contains a plurality of name options set in advance. Performing the second trigger action is equivalent to sending a further instruction to the name setting module, which then changes the corresponding mark to the selected name.
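The two-step interaction can be modelled as a small state machine. The sketch below is an assumption-laden illustration: the class name `NameSettingModule`, its method names, and the placeholder names "Participant A" and "Participant B" are all hypothetical, and the binding of physical clicks to the two trigger actions is left to the surrounding user interface.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NameSettingModule:
    """Hypothetical model of one name setting module attached to one mark."""
    mark: str
    name_options: list[str] = field(default_factory=list)
    option_box_open: bool = False

    def on_first_trigger(self) -> list[str]:
        """First trigger action (e.g. a click): pop up the option box."""
        self.option_box_open = True
        return list(self.name_options)

    def on_second_trigger(self, name: str) -> Optional[str]:
        """Second trigger action: pick a name option, which replaces the mark."""
        if not self.option_box_open or name not in self.name_options:
            return None  # ignore selections when no matching option is showing
        self.mark = name
        self.option_box_open = False
        return self.mark


module = NameSettingModule("speaker 2", ["Participant A", "Participant B"])
shown = module.on_first_trigger()
result = module.on_second_trigger("Participant B")
```

Closing the option box after a selection is a design choice of this sketch; the source does not state what happens to the box once a name has been chosen.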

The whole operation is simple: the user neither needs to enroll voiceprints in advance nor needs to set a speaker name manually for every utterance afterwards. This greatly improves the intelligence and feasibility of the conference system, meets the recording needs of current conferences, and ensures the accuracy of the conference record.

The invention addresses the problem of how to quickly match and store utterances and names in an intelligent conference system, so that the system can display the correct speaker name for the incoming speech. Voiceprints can be stored with name labels simply by entering the participants' names, so the preliminary voiceprint enrollment step is omitted and working efficiency is greatly improved. During subsequent speech recognition, the text can also be mapped to the speaker's name promptly. Whenever a machine recognition problem is found before, during, or after the conference, the speaker's name can be corrected and the voiceprint-name match amended. The moderator name can also be selected from a pull-down menu. After the conference ends, a text file containing the participants, the moderator, the speakers, the speech content, and similar information can be exported for reviewing the conference and compiling the conference summary.
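The export step mentioned above might look like the following Python sketch. The source only states which information the exported text file contains; the layout chosen here (two header lines, a blank line, then one "speaker: content" line per utterance), the function name `export_record`, and the placeholder names are assumptions made for illustration.

```python
def export_record(moderator: str, participants: list[str],
                  record: list[tuple[str, str]]) -> str:
    """Render the conference record as plain text.

    `record` is an assumed format: (speaker name, speech content) pairs
    in spoken order, with marks already replaced by names where known.
    """
    lines = [
        f"Moderator: {moderator}",
        f"Participants: {', '.join(participants)}",
        "",  # blank line between the header and the transcript
    ]
    lines += [f"{speaker}: {content}" for speaker, content in record]
    return "\n".join(lines)


text = export_record(
    "Participant A",
    ["Participant A", "Participant B"],
    [("Participant A", "Welcome."), ("Participant B", "Thank you.")],
)
```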

On the basis of the above-described embodiment, various modified embodiments are further proposed, and it is to be noted herein that, in order to make the description brief, only the differences from the above-described embodiment are described in the various modified embodiments.

The participant function module is used for entering and saving the names of participants and synchronizing those names to the option boxes of all the name setting modules so as to form the name options in the option boxes. It will be appreciated that the participant function module has a name input interface through which a user can enter the names of participants; the module saves the names and synchronizes them to all name setting modules to form the name options in their option boxes. This spares the user from configuring the option box of each name setting module one by one, which simplifies the workflow and improves efficiency. For example, FIG. 1 is a schematic diagram of the name input interface through which a user can enter participant names. After the user enters the names and triggers the save button on the name input interface, the participant function module automatically switches from the name input interface to the name display interface. FIG. 2 is a schematic diagram of the name display interface, in which each name is presented as a unit module that can be triggered.
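A minimal Python sketch of this save-and-synchronize behaviour follows. `ParticipantModule` and its methods are hypothetical names. The one-name-per-line input format follows the worked example later in the description; skipping blank lines and duplicate names is an assumption of this sketch.

```python
class ParticipantModule:
    """Hypothetical participant function module."""

    def __init__(self) -> None:
        self.names: list[str] = []
        # One list of name options per registered name setting module.
        self.option_boxes: list[list[str]] = []

    def register_option_box(self, options: list[str]) -> None:
        """Attach an option box so that later saves reach it as well."""
        self.option_boxes.append(options)
        options[:] = self.names

    def save(self, raw_text: str) -> list[str]:
        """Parse one participant per line, store the names, and synchronize."""
        names: list[str] = []
        for line in raw_text.splitlines():
            name = line.strip()
            if name and name not in names:  # assumption: skip blanks and repeats
                names.append(name)
        self.names = names
        for options in self.option_boxes:
            options[:] = names  # update each option box in place
        return names


moderator_box: list[str] = []
speaker_box: list[str] = []
participants = ParticipantModule()
participants.register_option_box(moderator_box)
participants.register_option_box(speaker_box)
saved = participants.save("Participant A\n\nParticipant B\nParticipant A\n")
```

A single call to `save` fills every registered option box, which mirrors the point that the user does not configure each name setting module separately.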

According to some embodiments of the invention, any utterance in the conference record can be selected by a third trigger action of the user. After selecting an utterance, the user can trigger any participant name listed in the participant function module through a fourth trigger action, so that the participant name is displayed before the utterance. The fourth trigger action may be a preset action such as a single click, a double click, or a right click, and is not specifically limited here.

Thus, when the conference system makes a recognition error, for example when two speakers actually spoke but the system attributed the speech to a single speaker, the user can correct it through this mechanism, which ensures the accuracy of the conference record and compensates for flaws in the recognition mechanism.

According to some embodiments of the invention, when any participant name listed in the participant function module is displayed before the utterance, a name setting module is also provided for that utterance. Thus, after a name has been changed for an individual utterance, the user can change it again through the name setting module if the setting later turns out to be wrong. For example, as shown in FIG. 5 and FIG. 6, the utterance "AAAAAAAA" was recognized as belonging to "speaker 2" but was actually spoken by "korean XX", so the user can correct it using the mechanism of the invention.
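The per-utterance correction can be sketched as splitting a block of the record. In the Python sketch below, `Block` and `assign_name` are hypothetical names and "Participant B" is a placeholder. Keeping the utterances after the selected one under the original mark is an assumption; the source only says that the selected text moves to a new line with the chosen name displayed before it.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A mark (or name) and the utterances shown under it.

    Each block is assumed to carry its own name setting module, so a
    block created by a correction can itself be renamed later.
    """
    mark: str
    utterances: list[str]


def assign_name(record: list[Block], block_idx: int, utt_idx: int,
                name: str) -> list[Block]:
    """Give one selected utterance its own block labelled with `name`."""
    block = record[block_idx]
    before = block.utterances[:utt_idx]
    after = block.utterances[utt_idx + 1:]
    pieces: list[Block] = []
    if before:
        pieces.append(Block(block.mark, before))
    pieces.append(Block(name, [block.utterances[utt_idx]]))
    if after:
        # Assumption: utterances after the selection keep the original mark.
        pieces.append(Block(block.mark, after))
    return record[:block_idx] + pieces + record[block_idx + 1:]


record = [Block("speaker 2", ["First sentence.", "AAAAAAAA", "Last sentence."])]
corrected = assign_name(record, 0, 1, "Participant B")
labels = [block.mark for block in corrected]
```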

According to some embodiments of the invention, any utterance in the conference record, when selected, is displayed differently from unselected utterances. For example, a selected utterance may be displayed in a font color different from that of unselected utterances; as shown in FIG. 5, the selected utterance is displayed in gray while unselected utterances are uniformly black. Other forms of highlighting, such as bold, italics, or font shading, may of course be used; they are not all listed here, provided that the selected utterance can be distinguished from the unselected ones.

According to some embodiments of the invention, the name options in the option box are generated by the user entering and saving them in the option box. That is, the name options in the option box of a name setting module may be configured in advance by the user within that name setting module. For example, the option box of each name setting module may provide a configuration window in which the user can add a name, and the name setting module then adds the entered name as a name option in the option box.

According to some embodiments of the invention, the marks comprise a moderator mark and a speaker mark;

the name setting module comprises a moderator name setting module and a speaker name setting module;

For example, as shown in FIG. 3, the inverted triangle next to the word "moderator" is the moderator name setting module. As shown in FIG. 4, the inverted triangle next to the words "speaker 2" is a speaker name setting module.

When the user triggers the full-text option and a name option, every mark in the conference record that is identical to the mark corresponding to the option box is replaced with that name option.

For example, as shown in FIG. 4, the user selects either the current option or the full-text option through a fifth trigger action, and then triggers a name option to complete the name replacement.

The user can therefore replace all identical marks at once through the full-text option, which greatly improves working efficiency. The user can also replace a single mark through the current option, so that an individual recognition error can be corrected on its own, which improves the accuracy of the conference record.
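The difference between the two scopes can be shown in a few lines of Python. The function name `replace_mark`, the scope strings "current" and "full_text", and the placeholder name are assumptions of this sketch.

```python
def replace_mark(marks: list[str], index: int, name: str, scope: str) -> list[str]:
    """Return a copy of `marks` with the chosen name applied.

    scope == "current":   replace only the mark at `index`.
    scope == "full_text": replace every mark identical to the one at `index`.
    """
    if scope not in ("current", "full_text"):
        raise ValueError(f"unknown scope: {scope!r}")
    target = marks[index]
    if scope == "current":
        return [name if i == index else mark for i, mark in enumerate(marks)]
    return [name if mark == target else mark for mark in marks]


marks = ["speaker 1", "speaker 2", "speaker 1"]
only_current = replace_mark(marks, 0, "Participant A", "current")
whole_text = replace_mark(marks, 0, "Participant A", "full_text")
```

Returning a new list rather than mutating the input keeps the original marks available, for example if the interface needs to undo a replacement.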

The intelligent conference system according to the present invention will be described in detail in one specific embodiment with reference to the accompanying drawings. It is to be understood that the following description is only exemplary in nature and should not be taken as a specific limitation on the invention.

The intelligent conference system is not only a speech recognition and recording tool but also a tool for assisting in compiling conference summaries, and it is designed around the task flow as a whole. Based on current speech recognition, voiceprint recognition, and microphone array technology, the invention provides a function that stores voiceprint information merely by entering a speaker's name during the conference. This saves the time needed for voiceprint enrollment before the conference, stores voiceprint information sooner, and lets the recognized text be matched to the speaker's name quickly. In addition, when the conference summary and the transcribed conference content are exported, the moderator, participant, and speaker information is exported directly, which significantly improves the efficiency of compiling the conference summary.

In detail, the intelligent conference system according to the embodiment of the present invention includes: the system comprises a participant function module, a text display box and a plurality of name setting modules.

As shown in FIG. 1 and FIG. 2, the participant function module is used for entering and saving the names of participants. The names can be presented on the display interface of the participant function module as unit modules that the user can operate.

As shown in FIG. 4, the text display box is used to display the conference record. The conference system can distinguish different speakers by means of a microphone array or voiceprint recognition; when a speaker starts speaking, the system converts the speech into a text utterance and places a mark such as 'speaker 1' or 'speaker 2' before it to indicate which speaker the utterance belongs to.

As shown in FIG. 3 to FIG. 6, a name setting module is located next to each mark. The marks comprise a moderator mark and a speaker mark; the name setting modules comprise a moderator name setting module and a speaker name setting module; all moderator marks correspond to one moderator name setting module, and each speaker mark corresponds to one speaker name setting module. Examples are the inverted triangle next to "speaker 2" and the inverted triangle next to "moderator".

As shown in fig. 4, the name setting module is arranged to pop up an option box when the user performs the first trigger action; the option box includes a plurality of name options adapted to be triggered by a second triggering action of the user. The participant function module may synchronize the names of the participants to the option boxes of all name setting modules to form name options in the option boxes. When a name option is triggered, the name option replaces the mark corresponding to the option box.

As shown in FIG. 4, the option box of the speaker name setting module further includes a current option and a full-text option, both of which can be triggered by a fifth trigger action of the user. When the user triggers the current option and a name option, the mark corresponding to the option box is replaced with that name option; when the user triggers the full-text option and a name option, every mark in the conference record that is identical to the mark corresponding to the option box is replaced with that name option.

As shown in FIG. 5 and FIG. 6, any utterance in the conference record can be selected by a third trigger action of the user. After selecting an utterance, the user can trigger any participant name listed in the participant function module through a fourth trigger action, so that the participant name is displayed before the utterance; a name setting module is provided for the utterance at the same time. A selected utterance is displayed differently from unselected utterances.

When using the intelligent conference system of the embodiment of the invention, the user can enter the names of participants in the text box of the participant function module, one participant per line, as shown in FIG. 1. When input is finished, clicking the 'save' button stores the participant names, and the participant function module switches to button form; as shown in FIG. 2, each participant name is a button. These names can be synchronized to all name setting modules at the same time. Clicking the inverted triangle button next to the moderator pops up a pull-down menu, and selecting one of the participant names just saved changes the moderator name, as shown in FIG. 3. Clicking the inverted triangle button next to a speaker in the conference record likewise pops up a pull-down menu, in which the user selects a participant name and chooses whether to modify the current speaker or the full-text speaker, as shown in FIG. 4. Choosing the current speaker does not affect the stored correspondence between voiceprint and name; choosing the full-text speaker stores the voiceprint, and the name associated with that voiceprint is retrieved the next time voiceprint matching succeeds. When the user clicks a piece of recognized text, the sentence turns gray; clicking a participant name button on the left, as shown in FIG. 5, then moves the selected text onto a new line with the participant's name displayed above it, as shown in FIG. 6. After transcription is finished, the conference content can be exported, together with information such as the participant names and the moderator name.
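The effect of the two choices on the stored voiceprint-name correspondence can be sketched as follows. This Python example assumes that each voiceprint is represented by an opaque identifier supplied by the recognizer; `VoiceprintRegistry`, its methods, the identifier "vp-1", and the placeholder name are all hypothetical.

```python
class VoiceprintRegistry:
    """Hypothetical store mapping a voiceprint identifier to a speaker name."""

    def __init__(self) -> None:
        self._names: dict[str, str] = {}

    def rename(self, voiceprint_id: str, name: str, scope: str) -> None:
        """Record a rename chosen in the option box.

        scope == "current":   only the displayed mark changes; nothing is stored.
        scope == "full_text": the voiceprint-name pair is stored for later matches.
        """
        if scope not in ("current", "full_text"):
            raise ValueError(f"unknown scope: {scope!r}")
        if scope == "full_text":
            self._names[voiceprint_id] = name

    def label_for(self, voiceprint_id: str, default_mark: str) -> str:
        """Return the stored name on a successful match, else the generic mark."""
        return self._names.get(voiceprint_id, default_mark)


registry = VoiceprintRegistry()
registry.rename("vp-1", "Participant A", scope="current")
after_current = registry.label_for("vp-1", "speaker 1")
registry.rename("vp-1", "Participant A", scope="full_text")
after_full_text = registry.label_for("vp-1", "speaker 1")
```

After the "current" rename the registry still falls back to the generic mark, while after the "full_text" rename the stored name is returned for the same voiceprint.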

With the embodiment of the invention, once the participants have been entered in a batch and saved, their names appear as buttons that serve as function buttons when inserting a role. Both the speaker name and the moderator name can be changed to any of the saved participant names through the pull-down menu. When changing a speaker name, the user can choose to modify the current mark or the full text; when the full text is chosen, the voiceprint and the name are matched and stored. The whole process requires only simple operations by the user, giving high working efficiency and a high degree of intelligence.

It should be noted that the above-mentioned embodiments are only preferred embodiments of the present invention, and are not intended to limit the present invention, and those skilled in the art can make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

In addition, those matters not described in detail in the present specification are well known to those skilled in the art.

Claims

1. An intelligent conferencing system, comprising:

a plurality of name setting modules, each of the marks corresponding to one of the name setting modules, the name setting modules being configured to pop up an option box when a user performs a first trigger action;

2. The intelligent conferencing system of claim 1, wherein the intelligent conferencing system further comprises:

a participant function module for entering and saving the names of participants and synchronizing those names to the option boxes of all the name setting modules so as to form the name options in the option boxes.

3. The intelligent conference system according to claim 2, wherein any utterance in the conference record is adapted to be selected by a third triggering action of the user;

4. The intelligent conferencing system of claim 3, wherein, when any participant name listed in the participant function module is displayed before the utterance, a name setting module is also provided for that utterance.

5. The intelligent conferencing system of claim 3, wherein any utterance in the conference record, when selected, is displayed differently from unselected utterances.

6. The intelligent conferencing system of claim 1, wherein the name options in the option box are generated by the user entering and saving them in the option box.

7. The intelligent conferencing system of claim 1, wherein the marks include a moderator mark and a speaker mark;

8. The intelligent conferencing system of claim 7, wherein the option box of the speaker name setting module further comprises a current option and a full-text option, both the current option and the full-text option being adapted to be triggered by a fifth triggering action of the user;