WO2019148583A1 - Intelligent Conference Management Method and System - Google Patents

Intelligent Conference Management Method and System

Info

Publication number
WO2019148583A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
reservation
audio
speaker
conference
Prior art date
Application number
PCT/CN2018/078527
Other languages
English (en)
French (fr)
Inventor
刘善果
李明
Original Assignee
深圳市鹰硕技术有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市鹰硕技术有限公司
Publication of WO2019148583A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/10 Office automation; Time management
    • G06Q 10/109 Time management, e.g. calendars, reminders, meetings or time accounting
    • G06Q 10/1093 Calendar-based scheduling for persons or groups
    • G06Q 10/1095 Meeting or appointment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 3/00 Automatic or semi-automatic exchanges
    • H04M 3/42 Systems providing special services or facilities to subscribers
    • H04M 3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/02 Reservations, e.g. for tickets, services or events
    • G PHYSICS
    • G07 CHECKING-DEVICES
    • G07C TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C 9/00 Individual registration on entry or exit
    • G07C 9/30 Individual registration on entry or exit not involving the use of a pass
    • G07C 9/32 Individual registration on entry or exit not involving the use of a pass in combination with an identity check
    • G07C 9/37 Individual registration on entry or exit not involving the use of a pass in combination with an identity check using biometric data, e.g. fingerprints, iris scans or voice recognition
    • G PHYSICS
    • G07 CHECKING-DEVICES
    • G07C TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C 9/00 Individual registration on entry or exit
    • G07C 9/30 Individual registration on entry or exit not involving the use of a pass
    • G07C 9/38 Individual registration on entry or exit not involving the use of a pass with central registration
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B 20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B 20/10 Digital recording or reproducing
    • G11B 20/10527 Audio or video recording; Data buffering arrangements
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C 7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C 7/16 Storage of analogue signals in digital stores using an arrangement comprising analogue/digital [A/D] converters, digital memories and digital/analogue [D/A] converters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/08 Speech classification or search
    • G10L 2015/088 Word spotting
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B 20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B 20/10 Digital recording or reproducing
    • G11B 20/10527 Audio or video recording; Data buffering arrangements
    • G11B 2020/10537 Audio or video recording
    • G11B 2020/10546 Audio or video recording specifically adapted for audio data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M 2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition

Definitions

  • conference rooms are usually reserved either by filling out paper forms or through a conference room management system.
  • filling out a form to reserve a meeting room is usually time-consuming and error-prone.
  • a conference room management system can make reservations, but a successful reservation is not easy to change; when an emergency arises, the reservation result is difficult to modify.
  • the organizer cannot communicate and coordinate with the user who has reserved the conference room.
  • this lack of communication and coordination between personnel cannot meet the flexible and efficient usage requirements of the conference room.
  • video or audio data of the conference site is usually recorded with a capture tool such as a video camera or a voice recorder, and the video or audio data is saved into a multimedia file so that the conference can be viewed or listened to at any time.
  • the conference site usually also arranges for a dedicated recorder or a participant to take notes by hand or in shorthand, recording the content of the conference.
  • however, the video or audio data is usually large and occupies a lot of hardware storage space when saved.
  • the automatic summary extraction technology can process the input text, voice, video and other information, obtain the summary content in the input data, and present the processed summary result to the user for browsing.
  • automatic summary extraction not only saves users time in accessing information, but also improves user productivity. There are a number of ways in the prior art to automatically generate a summary of a meeting.
  • Patent Document 1 (CN107409061A) provides a method and system for speech summarization, which determines which participant is speaking based on comparing the image of the participant with the template image of the speaker and the non-speaker face.
  • the computer determines the voiceprint of the speaking participant by applying a hidden Markov model to a brief recording of the participant's sound waveform, and associates the determined voiceprint with the face of the speaking participant.
  • the computer recognizes and transcribes the content of the speaker's statements, identifies key points, and displays them alongside the participant's face in the video conference.
  • Patent Document 2 provides a method of recording a conference: a configuration file is set that defines the key information of the conference (for example, a question-raising scenario) and the format of the conference summary; at specific time points on the conference timeline, key information of each site is extracted based on the configuration file and combined into key index points, which serve as index points for interacting with or editing the meeting summary; multiple key index points corresponding to multiple time points are combined into a summary of the meeting; the summary is then interacted with or edited based on the key information it contains.
  • although Patent Document 1 can recognize a speaking participant and associate the key content of the speech with that participant, it extracts speech information in the same way for all participants and cannot selectively extract the speech content of different participants according to their different situations. In actual meetings, the importance of different participants usually differs. If every participant's speech is processed in the same way, resources may be wasted by extracting excessive information from unimportant participants' speech content, while too little information may be extracted from important participants' speech content, resulting in omissions. Further, in Patent Document 1, the speaker's statements are recognized and transcribed, and after the key points are determined, the generated text is displayed for the user to view and read, so the advantage of the voice file itself is lost.
  • moreover, Patent Document 1 determines which participant is speaking by comparing the participant's image with template images of speaker and non-speaker faces, and then determines the speaking participant's voiceprint by applying a hidden Markov model to a brief recording of the participant's sound waveform; this process of identifying participants is complicated and inefficient.
  • in view of the above, an object of the present invention is to provide an intelligent conference management method and system that can flexibly reserve conference rooms, identify the key speech content of different speakers in the conference recording, and automatically synthesize a conference summary in voice form.
  • the intelligent management method of the conference includes:
  • the user reserves a conference room; for the reserved conference room, during the reservation time period, only the reserving user can open the conference room's access control system. A user may submit a change request against the reservation result of an already-reserved conference room, and the request is processed differently according to the user's role type;
  • S6. Pre-process and store the recorded audio/video data: the start time and end time of each speaker's speech are recorded, the speaker's name and user name are obtained, and the speaker, speech start time, and speech end time are associated with the collected audio/video data and stored;
  • S9 Perform speech recognition processing on the obtained candidate key speech segment set, and filter and locate the audio/video segment set corresponding to the key speech content;
  • the user reserves a conference room through the network; the system obtains the user's user name, obtains the corresponding role type according to the user name, and displays the conference rooms that can be reserved along with their working hours. It also shows which time periods of each conference room have already been successfully reserved by other users and which are idle: successfully reserved time periods are displayed as reserved together with the role type of the reserving user, while unreserved time periods are displayed as idle. The user selects the desired conference room and time period and submits a reservation request, and the system prompts the user after confirming that the reservation is successful.
  • the step in which the user selects a required conference room and a required time period, submits a reservation request, and the system confirms the reservation and prompts the user further includes:
  • the user selects the required conference room and time period and submits a reservation request; the system determines whether that time period of the conference room has already been reserved by another user. If it has not, the reservation is confirmed as successful and the user is prompted;
  • if it has, the privilege level corresponding to the reserving user's role type and the privilege level corresponding to the requesting user's role type are obtained and compared. If the reserving user's privilege level is lower than that of the user currently submitting the reservation request, the existing reservation is canceled, the current reservation request is confirmed as successful, and the user is prompted. If the reserving user's privilege level is equal to or higher than that of the requesting user, the current user may be prompted to send a reservation change request to the reserving user: the current user inputs the reason for the requested change, and a reservation change request is sent to the reserving user.
  • the change request includes the reason for the change, the conference room whose reservation is requested to change, and the corresponding time period.
  • the reserving user confirms whether to agree to the reservation change request and feeds the decision back to the system. If the reserving user agrees, the system cancels the original reservation.
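The privilege-based conflict resolution described above can be sketched as follows. This is a hypothetical illustration: the role names, privilege levels, and the `Reservation` structure are assumptions, not taken from the patent text.

```python
from dataclasses import dataclass

# Assumed role hierarchy; the patent only requires that each role type
# maps to one privilege level.
ROLE_LEVELS = {"staff": 1, "manager": 2, "director": 3}

@dataclass
class Reservation:
    room: str
    slot: str   # e.g. "2018-03-09 14:00-15:00"
    user: str
    role: str

def handle_request(existing, req_user, req_role, room, slot):
    """Apply the patent's decision rules and return the resulting action."""
    current = next((r for r in existing
                    if r.room == room and r.slot == slot), None)
    if current is None:
        # Slot is free: confirm the reservation directly.
        existing.append(Reservation(room, slot, req_user, req_role))
        return "confirmed"
    if ROLE_LEVELS[current.role] < ROLE_LEVELS[req_role]:
        # Lower-privileged holder is preempted; requester takes the slot.
        existing.remove(current)
        existing.append(Reservation(room, slot, req_user, req_role))
        return "preempted-and-confirmed"
    # Equal or higher privilege: route a change request to the holder.
    return f"change-request-sent-to:{current.user}"
```

A request against a free slot is confirmed; a higher-privileged requester preempts an existing reservation; otherwise the holder is asked to approve the change.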
  • S402. Acquire the current time and determine whether the conference room is reserved at the current time. If it is not reserved, the user is allowed to enter and use the conference room. If it is reserved, the user name of the user who reserved the conference room is obtained and compared with the user name obtained in S401. If they are consistent, the conference room's access control is opened and the user is allowed to enter and use the conference room; if not, the user is prompted that the conference room has been reserved and is unavailable at the current time, and the process ends.
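The access check in S402 reduces to a small lookup. The following sketch is illustrative only; the schedule structure and numeric timestamps are assumptions.

```python
def check_access(schedule, room, now, username):
    """Decide door action per S402.

    schedule maps (room, (start, end)) -> reserving username;
    `username` is the identity obtained from the voiceprint match in S401.
    """
    for (r, (start, end)), reserved_by in schedule.items():
        if r == room and start <= now < end:
            if reserved_by == username:
                return "open-door"       # reserving user: open access control
            return "denied-reserved"     # reserved by someone else: refuse
    return "open-door"                   # unreserved at the current time
```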
  • step S5 further includes:
  • the recording microphone cyclically detects voice information. When it detects that a speaker has started speaking, a recording start command is triggered, the speaker's audio/video data is collected, and the speech start time is recorded. Based on the attributes of the collected audio/video data, it is determined whether the current speaker's speech continues or has stopped.
  • if the speech has stopped, a recording pause or stop command is triggered, the speech end time is recorded, and the recording microphone continues to cyclically detect voice information.
  • when the next speaker is detected, a continue-recording or recording-start command is triggered to record the next speaker's audio/video data.
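The start/stop logic above amounts to voice-activity-driven segmentation. Here is a minimal sketch using per-frame energies; the energy threshold and hangover length are assumptions, as the patent does not specify how speech onset and offset are detected.

```python
def segment_speech(frames, threshold=0.1, hangover=3):
    """Yield (start_index, end_index) pairs for detected speech runs.

    `frames` is a sequence of per-frame energies. Recording starts when a
    frame exceeds `threshold` (the recording-start command) and stops after
    `hangover` consecutive quiet frames (the recording pause/stop command).
    """
    segments, start, quiet = [], None, 0
    for i, energy in enumerate(frames):
        if energy > threshold:
            if start is None:
                start = i            # speech onset: record start time
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= hangover:    # speech stopped: record end time
                segments.append((start, i - hangover + 1))
                start, quiet = None, 0
    if start is not None:            # still speaking at end of input
        segments.append((start, len(frames)))
    return segments
```

Each returned pair corresponds to one speaker turn, i.e. one (speech start time, speech end time) record.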
  • step S6 further includes:
  • a pre-stored meeting agenda table is read; the table stores the meeting agenda and the speaking time period of each speaker in the meeting. The speaker corresponding to the current time is obtained from the agenda table, and the speaker, speech start time, and speech end time are associated with the collected audio/video data and stored in the storage device;
  • alternatively, voiceprint feature data is identified from the currently collected audio/video data and matched against the user voiceprint feature data collected and stored in step S1. After a successful match, the current speaker's name and user name are obtained and associated with the collected audio/video data and stored in the storage device.
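The voiceprint-matching alternative can be sketched as a nearest-neighbor comparison between an extracted feature vector and the enrolled vectors. The cosine-similarity features and the similarity threshold are assumptions; the patent does not specify the feature type or matching rule.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def identify_speaker(enrolled, features, min_sim=0.8):
    """Return the user name whose stored voiceprint best matches `features`.

    `enrolled` maps user name -> stored voiceprint feature vector (from S1).
    Returns None when no enrolled voiceprint is similar enough.
    """
    best_user, best_sim = None, min_sim
    for user, vec in enrolled.items():
        sim = cosine(vec, features)
        if sim > best_sim:
            best_user, best_sim = user, sim
    return best_user
```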
  • step S7 further includes:
  • the permission level corresponding to each speaker is obtained, and different speakers are given different weight coefficients B according to their permission levels.
  • step S8 further includes:
  • the audio/video segment of the specific time period is intercepted as a candidate key segment.
  • step S8 further includes:
  • a keyword library is preset; speech recognition is performed on the speaker's corresponding audio/video segment, and the recognized speech information is matched against the preset keyword library. After a successful match, the audio/video segment covering a preset time period around the recognized keyword is intercepted as a candidate key segment.
  • the keyword libraries and/or the preset interception time periods corresponding to speakers with different weights differ: the higher the weight coefficient, the larger the number of keywords in the corresponding keyword library and/or the longer the intercepted audio/video clip; the lower the weight coefficient, the smaller the number of keywords in the corresponding keyword library and/or the shorter the intercepted audio/video clip.
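The weight-scaled interception strategy can be sketched as follows. The keyword lists, transcript format, and window lengths are illustrative assumptions; only the scaling rule (more keywords and longer clips for higher weights) comes from the text above.

```python
def intercept_candidates(transcript, weight):
    """Return (start, end) windows of candidate key segments.

    `transcript` is a list of (time_sec, word) pairs from speech recognition.
    Higher-weight speakers get a larger keyword library and a longer
    intercepted window after each matched keyword, per the patent's strategy.
    """
    base_keywords = {"conclusion", "decision"}        # assumed base library
    extra_keywords = {"budget", "deadline", "risk"}   # assumed extension
    keywords = base_keywords | (extra_keywords if weight >= 2 else set())
    window = 10.0 * weight    # seconds intercepted after the keyword
    return [(t, t + window) for t, w in transcript if w in keywords]
```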
  • step S9 further includes:
  • the key speech content is determined in combination with the theme of the conference; speech recognition is performed on the candidate key speech segments obtained in step S8 and they are converted into text data. The converted text data has a time axis corresponding to the audio/video data, so the audio/video data of the corresponding time period can be located from content in the text data. The converted text data is filtered using keywords corresponding to the key speech content, and finally the audio/video segment set corresponding to the key speech content is determined.
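Because the transcript keeps a time axis, filtering text directly yields audio locations. A minimal sketch, assuming segments are (start, end, text) triples:

```python
def locate_key_segments(segments, key_terms):
    """Filter time-aligned transcript segments by key-content terms.

    `segments` is a list of (start, end, text) triples; the returned
    (start, end) pairs locate the matching audio/video clips on the timeline.
    """
    hits = []
    for start, end, text in segments:
        if any(term in text for term in key_terms):
            hits.append((start, end))
    return hits
```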
  • step S10 further includes:
  • header information for the summary is pre-generated from the conference topic, meeting agenda, and similar information, and a header-information voice file is generated from it; transition information within the summary is then generated from the meeting agenda and similar information, and transition-information voice files are generated from it. The header-information voice file, the transition-information voice files, and the spliced voice summaries of the different speakers are combined according to their correspondence to form the voice summary of the conference.
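The assembly order in step S10 can be sketched as building a playlist: the header clip first, then, per agenda item, a transition clip followed by that speaker's spliced summary. File names are placeholders; actual audio concatenation would require an audio library.

```python
def assemble_summary(header, agenda):
    """Return the concatenation order of the conference voice summary.

    `agenda` is an ordered list of (transition_clip, speaker_summary_clip)
    pairs, one per agenda item.
    """
    playlist = [header]
    for transition, speaker_summary in agenda:
        playlist.append(transition)       # transition-information voice file
        playlist.append(speaker_summary)  # that speaker's spliced summary
    return playlist
```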
  • the intelligent management system of the conference includes:
  • a user setting module, configured to set user roles and enter user information, wherein the types of user roles are preset, a plurality of users' information is entered and stored for each role type, and each user has a unique user name; it also collects voiceprint feature information of different users, wherein voiceprint recognition technology is used to distinguish different users: voice data of different users is collected, and each user's voiceprint feature data is identified and stored in correspondence with the user name;
  • a permission level setting module configured to set a permission level corresponding to different role types, wherein multiple permission levels are set according to the attribute of the role type itself, and each role type corresponds to one permission level;
  • a conference room reservation module, used for users to reserve conference rooms; for a reserved conference room, during the reservation time period, only the reserving user can open the conference room's access control system. A user may submit a change request against the reservation result of an already-reserved conference room, and the request is processed differently according to the user's role type;
  • a conference room usage permission verification module, used to verify the user's right to use the conference room and confirm whether the user is allowed to use it;
  • a data acquisition module configured to collect conference audio/video data during the conference
  • an audio/video data pre-processing module, configured to pre-process the recorded audio/video data, wherein the start time and end time of each speaker's speech are recorded, the speaker's name and user name are obtained, and the speaker, speech start time, and speech end time are associated with the collected audio/video data and stored in the storage device;
  • a speaker weight determination module, for determining the weight coefficients of different speakers;
  • a candidate key segment set obtaining module, configured to acquire the corresponding candidate key speech segment set for each speaker, wherein the module searches the stored audio/video data by speaker name, finds the speaker's corresponding speech segments, and intercepts candidate key speech segments from them using a preset strategy; the preset interception strategies corresponding to speakers with different weight coefficients differ;
  • an audio/video clip set screening module, configured to perform speech recognition on the acquired candidate key speech segment set and filter out the audio/video segment set corresponding to the key speech content;
  • the voice summary synthesis module is configured to synthesize the audio/video segment set filtered by the audio/video segment collection screening module to form a voice summary.
  • the conference room reservation module further includes:
  • a reservation request module: the user can reserve a conference room through the network; the reservation request module obtains the user's user name, obtains the corresponding role type according to the user name, and displays the conference rooms that can be reserved according to the role type. It also shows which time periods of each conference room have already been successfully reserved by other users and which are idle: successfully reserved time periods are displayed as reserved together with the role type of the reserving user, while unreserved time periods are displayed as idle. The user selects the required conference room and time period through the reservation request module and submits a reservation request, and the system confirms the reservation and prompts the user.
  • the conference room reservation module further includes:
  • a reservation change request module, configured to change the reservation result of a reserved conference room: the user selects the required conference room and time period through the reservation request module and submits a reservation request, and the system determines whether that time period of the conference room has already been reserved by another user. If it has not, the reservation is confirmed as successful and the user is prompted;
  • if it has, the reservation change request module obtains the privilege level corresponding to the reserving user's role type and the privilege level corresponding to the requesting user's role type and compares them. If the reserving user's privilege level is lower than that of the current requesting user, the existing reservation is canceled, the current reservation request is confirmed, and the user is prompted. If the reserving user's privilege level is equal to or higher than that of the requesting user, the current user may be prompted to send a reservation change request to the reserving user: the current user inputs the reason for the requested change, and the reservation change request module sends the request to the reserving user.
  • if the reserving user refuses the change request, the reservation change request module prompts the current user that the request has failed, and the current user may be prompted to select another time period for reservation.
  • a voice data collection module, configured to use a voice collector installed at the conference room access control to collect the user's voice data and obtain the corresponding user name: voiceprint feature data is identified from the currently collected voice data and matched against the user voiceprint feature data collected and stored in advance, and after a successful match the user name corresponding to the current voice data is obtained;
  • a reservation judging module, configured to obtain the current time and determine whether the conference room is reserved at the current time. If it is not reserved, the user is allowed to enter and use the conference room; if it is reserved, the user name of the user who reserved the conference room is obtained and compared with the user name obtained by the voice data collection module. If they are consistent, the conference room is opened and the user is allowed to enter and use it; if not, the user is prompted that the conference room has been reserved and is unavailable at the current time, and the process ends.
  • the recording microphone cyclically detects voice information. When it detects that a speaker has started speaking, a recording start command is triggered, the speaker's audio/video data is collected, and the speech start time is recorded. Based on the attributes of the collected audio/video data, it is determined whether the current speaker's speech continues or has stopped.
  • if the speech has stopped, a recording pause or stop command is triggered, the speech end time is recorded, and the recording microphone continues to cyclically detect voice information.
  • when the next speaker is detected, a continue-recording or recording-start command is triggered to record the next speaker's audio/video data.
  • the audio/video data pre-processing module further includes:
  • a meeting agenda processing module, configured to read a pre-stored meeting agenda table that stores the meeting agenda and the speaking time period of each speaker in the meeting; the speaker corresponding to the current time is obtained from the agenda table, and the speaker, speech start time, and speech end time are associated with the collected audio/video data and stored in the storage device;
  • a voiceprint recognition module, which pre-collects the voice data of participating speakers and identifies and stores each participant's voiceprint feature data in correspondence with the participant's name; the voiceprint recognition module identifies voiceprint feature data from the currently collected audio/video data and matches it against the voiceprint feature data collected and stored by the user setting module. After a successful match, the current speaker's name and user name are obtained, and the speaker, speech start time, and speech end time are associated with the acquired audio/video data and stored in the storage device.
  • the speaker weight determination module is further used to:
  • the candidate key segment collection module further includes:
  • a keyword interception module, which presets a keyword library, performs speech recognition on the speaker's corresponding audio/video segment, and matches the recognized speech information against the preset keyword library. After a successful match, the audio/video clip covering a preset time period around the recognized keyword is intercepted as a candidate key speech segment. The keyword libraries and/or preset time period lengths corresponding to speakers with different weights differ: the higher the weight coefficient, the greater the number of keywords in the corresponding keyword library and/or the longer the intercepted audio/video clip; the lower the weight coefficient, the fewer the keywords and/or the shorter the clip.
  • the audio/video segment set screening module is further configured to determine the key speech content in combination with the conference topic, perform speech recognition on the candidate key speech segments obtained by the candidate key segment collection module, and convert them into text data. The converted text data has a time axis corresponding to the audio/video data, so the audio/video data of the corresponding time period can be located from content in the text data. The converted text data is filtered using keywords corresponding to the key speech content, and finally the audio/video clip set corresponding to the key speech content is determined.
  • a speaker voice summary synthesis module, configured to sort each speaker's audio/video clips selected by the screening module in chronological order and splice the sorted clips into a single audio/video file forming that speaker's voice summary;
  • a conference voice summary synthesis module, configured to generate the voice summary of the entire conference: it generates the summary's header information from the conference topic, meeting agenda, and similar information and produces a header-information voice file; it then generates the transition information in the summary from the meeting agenda and similar information and produces transition-information voice files; finally, it combines the header-information voice file, the transition-information voice files, and the spliced voice summaries of the different speakers according to their correspondence to form the voice summary of the conference.
  • the present invention can provide a plurality of conference room reservation modes according to different user roles and hierarchical permissions, avoiding disorderly use of conference rooms. The reservation modes are flexible, and the reservation result of an already reserved conference room can be changed according to the different permission levels of different users, which facilitates conference management and coordination in emergency situations.
  • a communication and coordination mechanism is established between the reserved user and a subsequent user, allowing the latecomer to communicate and coordinate with the reserved user about urgent needs for the conference room, providing greater flexibility in conference room use.
  • the recording of invalid content is effectively reduced, storage resources are saved, and the recording length is shortened, making it easier for the user to find desired content later.
  • a weight coefficient is determined for each speaker, so that different candidate strategies are used, according to the weight coefficient, to obtain the candidate key segment sets corresponding to different speakers: more content is extracted from important speeches and relatively less from unimportant ones, making the resulting summary content more reasonable and providing users with more effective help.
  • candidate key speech segment sets are intercepted at positions on the speech timeline where important content appears with high probability, or after the key transition words and connective words that important content tends to follow; the intercepted candidate key speech segment sets are then processed into the audio/video clip sets that form the speech summary. This greatly improves the effectiveness of the extracted content, keeps extraction efficiency high, and is unaffected by the environment and other factors, further making the final summary more reasonable.
  • Figure 1 is a flow chart of the method of the present invention.
  • FIG. 2 is a schematic diagram of a reservation state of a conference room.
  • Figure 3 is a schematic diagram of the meeting agenda.
  • FIG. 4 is a schematic structural view of a system of the present invention.
  • FIG. 1 is a schematic flowchart of a conference intelligent management method according to an embodiment of the present application. As shown in Figure 1, the method includes:
  • the type of the user role is set in advance according to the actual situation.
  • the types of roles may include: teachers, students, logistics directors, principals, and the like.
  • the role types can include: chairman, general manager, department manager, staff, etc. Enter and store corresponding user information for each role type.
  • the user information includes a name and contact information (such as a mobile phone number and email address). The name is used as the user's user name, and the role type is stored in correspondence with the user name, for example in a mapping table.
  • the voiceprint feature information of different users is collected, and voiceprint recognition technology is used to identify different users. Specifically, voice data of different users is collected, and each user's voiceprint feature data is identified and stored in correspondence with the user name. In a subsequent step, voiceprint feature data is identified from the currently collected audio data and matched against the user voiceprint feature data collected and stored in advance; after a successful match, the user name of the user corresponding to the current voice data is obtained.
  • Each role type corresponds to one permission level.
  • four permission levels A, B, C, and D can be set, and the order of the rights is A>B>C>D.
  • the principal has the highest authority, the authority level is set to A; the logistics supervisor authority level is set to B; the teacher authority level is C; and the student authority level is D.
  • the chairman has the highest authority, the authority level is set to A; the general manager authority level is set to B; the department manager authority level is C; and the employee authority level is D.
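The role-to-permission mapping and level ordering described above can be sketched as follows. This is a minimal illustration only; the role names follow the enterprise example, and all identifiers are hypothetical.

```python
# Hypothetical sketch of the role-type / permission-level mapping described above.
# Levels are ordered A > B > C > D; a smaller rank number means higher authority.

ROLE_PERMISSION = {
    "chairman": "A",
    "general_manager": "B",
    "department_manager": "C",
    "employee": "D",
}

# Rank used to compare two levels: A outranks B outranks C outranks D.
LEVEL_RANK = {"A": 0, "B": 1, "C": 2, "D": 3}

def outranks(level_a: str, level_b: str) -> bool:
    """True if level_a carries strictly higher authority than level_b."""
    return LEVEL_RANK[level_a] < LEVEL_RANK[level_b]

print(outranks(ROLE_PERMISSION["chairman"], ROLE_PERMISSION["employee"]))  # True
print(outranks(ROLE_PERMISSION["employee"], ROLE_PERMISSION["chairman"]))  # False
```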
  • the user reserves a meeting room. For a reserved conference room, only the reserving user can open the room's access control system during the reserved period.
  • the user can log in to the intelligent conference control system through the network. The system obtains the corresponding role type according to the login user's user name and displays the conference rooms that can be reserved, together with which time periods have already been successfully reserved by other users and which are idle. A successfully reserved time period is displayed as reserved, along with the role type of the reserving user; a time period not yet successfully reserved is displayed as idle.
  • the reservation status display of a conference room is shown in FIG. 2.
  • the user selects the required conference room and time period and submits a reservation request. The system prompts the user after confirming that the reservation is successful, and may also send reservation success information to the user by short message or the like; the reservation success information includes the reserved conference room and the reservation time period. Further, to facilitate conference management and cope with unexpected situations, the reservation result of an already reserved conference room may be changed.
  • the specific method is as follows. The user selects the required conference room and time period and submits a reservation request. The system determines whether that time period of the conference room has already been reserved by another user. If it has not, the reservation is confirmed as successful and the user is prompted; reservation success information, including the reserved conference room and the reservation time period, may be sent by short message or the like. If it has been reserved, the system obtains the permission level corresponding to the reserved user's role type and the permission level corresponding to the role type of the user currently submitting the reservation request, and compares the two. When the reserved user's permission level is lower than that of the currently requesting user, the reserved user's reservation is cancelled, the current request is confirmed as successful, and reservation success information, including the reserved conference room and the reservation time period, is sent to the now-successful user by short message or the like.
  • when the permission level corresponding to the reserved user's role type is equal to or higher than that of the currently requesting user, the current user may be prompted to send a reservation change request to the reserved user, entering the reason for the requested change. The system sends the reservation change request, including the reason, the conference room concerned, and the corresponding time period, to the reserved user by short message or the like, and the reserved user confirms whether to agree and feeds back to the system. If the reserved user agrees, the system cancels that user's reservation, confirms the currently submitted reservation request as successful, and sends the cancelled user a cancellation notice, including the reserved conference room and the cancelled time period, by short message or the like; at the same time, reservation success information, including the reserved conference room and the reservation time period, is sent to the now-successful user. If the reserved user disagrees, the current user is notified that the reservation change request failed and may be prompted to select another time period.
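The reservation decision flow described above can be sketched as follows. This is an assumed minimal logic, not the patent's implementation; the function and level names are hypothetical, with permission letters ordered A > B > C > D.

```python
# Hedged sketch of the reservation decision for a requested time slot.
# A smaller rank number means higher authority (A > B > C > D).

LEVEL_RANK = {"A": 0, "B": 1, "C": 2, "D": 3}

def handle_reservation(existing_level, requester_level):
    """Return the action the system takes for a requested slot.

    existing_level: permission level of the user currently holding the
                    reservation, or None if the slot is free.
    requester_level: permission level of the user submitting the new request.
    """
    if existing_level is None:
        return "confirm"                      # free slot: reservation succeeds
    if LEVEL_RANK[requester_level] < LEVEL_RANK[existing_level]:
        return "cancel_existing_and_confirm"  # higher authority overrides
    return "send_change_request"              # equal/lower: negotiate with holder

print(handle_reservation(None, "C"))          # confirm
print(handle_reservation("C", "B"))           # cancel_existing_and_confirm
print(handle_reservation("C", "D"))           # send_change_request
```

For example, a logistics supervisor (level B) requesting a slot held by a teacher (level C) triggers the override branch, while a student (level D) is routed to the negotiation branch.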
  • for example, conference room A has been successfully booked by the teacher Zhang San from 9:00 to 10:00 on December 1st, and the logistics supervisor finds that there is a problem with the equipment in the conference room and needs to use the room during the period from 9:00 to 10:00 on December 1st.
  • the system displays the conference rooms that can be reserved and their reservation status: conference room A has already been successfully reserved from 9:00 to 10:00 on December 1st by a user whose role type is teacher.
  • the logistics supervisor submits a reservation request for conference room A from 9:00 to 10:00 on the morning of December 1st. The system determines that conference room A has already been reserved by Zhang San and obtains the permission level corresponding to Zhang San. Since the logistics supervisor's permission level B is higher than Zhang San's permission level C, the system cancels Zhang San's reservation and confirms the logistics supervisor's request.
  • as another example, conference room A has been successfully booked by the teacher Zhang San from 9:00 to 10:00 on December 1st, and the student Xiao Ming needs to use conference room A for a defense session from 9:00 to 10:00 on December 1st.
  • the system displays the conference rooms that can be reserved and their reservation status: conference room A has already been successfully reserved from 9:00 to 10:00 on December 1st by a user whose role type is teacher.
  • Xiao Ming submits a reservation request for conference room A from 9:00 to 10:00 on the morning of December 1st. The system determines that conference room A has already been reserved by Zhang San and obtains the permission level corresponding to Zhang San. Since Xiao Ming's permission level D is lower than Zhang San's permission level C, the system prompts Xiao Ming to send a reservation change request to Zhang San.
  • Zhang San confirms whether to agree to the reservation change request and feeds back to the system. If Zhang San agrees, the system cancels Zhang San's reservation, confirms Xiao Ming's reservation request as successful, sends Zhang San the message "The conference room A you reserved (December 1st, 9:00-10:00) has been cancelled; please understand," and sends Xiao Ming the message "Your reservation of conference room A (December 1st, 9:00-10:00) was successful." If Zhang San does not agree, the system notifies Xiao Ming that the reservation change request failed and may prompt Xiao Ming to select another time period.
  • the traditional reservation method is single and fixed; once a reservation succeeds, it is difficult to adjust it as circumstances change. Here, different reservation methods are provided according to user role type and permission level, and a negotiation mechanism is provided between users competing for a successfully reserved conference room. This not only ensures orderly use of conference rooms, but also provides an intelligent and flexible adjustment method for management needs and the handling of unexpected situations, giving users great flexibility in conference room use.
  • S401. Collect the user's voice data with a voice collector installed in the conference room access control, and obtain the corresponding user's user name. Specifically, voiceprint feature data is identified from the currently collected voice data and matched against the user voiceprint feature data collected and stored in advance; after a successful match, the user name of the user corresponding to the current voice data is obtained.
  • S402. Obtain the current time and determine whether the conference room is reserved at that time. If not reserved, the user is allowed to enter and use the meeting room. If reserved, the user name of the reserving user is obtained and compared with the user name obtained in S401: if they match, the conference room access control is opened and the user is allowed to enter and use the room; if not, the user is notified that the conference room is reserved and unavailable at the current time, and the process ends.
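The door-control check in S401/S402 can be sketched as follows: the user name recognized from the collected voice is compared with the reserving user of the room at the current time. All identifiers and the reservation table shape are hypothetical.

```python
# Hedged sketch of the access-control decision described in S401/S402.

def door_decision(reservations, room, hour, recognized_username):
    """reservations maps (room, hour) -> user name of the reserving user.

    Returns "open" if the room is free at this hour or the recognized user
    holds the reservation, otherwise "deny"."""
    holder = reservations.get((room, hour))
    if holder is None:
        return "open"                 # room not reserved now: free to use
    if holder == recognized_username:
        return "open"                 # the reserving user may enter
    return "deny"                     # reserved by someone else

res = {("A", 9): "zhangsan"}
print(door_decision(res, "A", 9, "zhangsan"))   # open
print(door_decision(res, "A", 9, "xiaoming"))   # deny
print(door_decision(res, "A", 11, "xiaoming"))  # open (slot idle)
```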
  • S5. Collect conference audio/video data during the conference.
  • the duration of a conference is not fixed and may be long, and speakers do not talk continuously throughout the conference. Recording the entire conference would therefore waste storage resources and further increase the difficulty of finding desired content later. Specifically, audio/video data collection can be manually initiated, paused, or stopped by the user to record the desired content.
  • alternatively, the recording microphone can cyclically detect voice information. When the voice of a speaker is detected, a recording start command is triggered, the speaker's audio/video data is collected, and the start time of the speech is recorded.
  • when a participant ends a speech, a recording pause or stop command is triggered, and the end time of the speech is recorded. The recording microphone continues to cyclically detect voice information; when the voice of the next speaker is detected, a continue-recording or recording start command is triggered to record the next speaker's audio/video data.
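The voice-triggered recording loop above can be sketched as a simple segmenter: start recording when speech is detected, pause when it ends, and log each speech's start and end times. This is an assumed minimal logic, not the patent's implementation; the frame format is hypothetical.

```python
# Minimal sketch of the voice-activity recording loop described above.

def segment_speech(frames):
    """frames: list of (timestamp, is_voice) samples from the microphone.

    Returns a list of (start_time, end_time) speech segments."""
    segments, start = [], None
    for t, is_voice in frames:
        if is_voice and start is None:
            start = t                    # trigger the recording start command
        elif not is_voice and start is not None:
            segments.append((start, t))  # trigger pause/stop, record end time
            start = None
    if start is not None:                # speech still ongoing at end of input
        segments.append((start, frames[-1][0]))
    return segments

frames = [(0, False), (1, True), (2, True), (3, False), (5, True), (6, False)]
print(segment_speech(frames))  # [(1, 3), (5, 6)]
```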
  • the speaker is a user who has entered user information and set a user role in step S1.
  • the start time and end time of the speaker's speech are recorded, and the name and user name of the speaker are obtained.
  • the pre-stored meeting agenda table is read; the agenda table stores the meeting agenda and the speaking time period of each speaker in the meeting.
  • 9:00 to 9:10 is the opening ceremony.
  • the speech of the speaker Li Ming is from 9:10 to 9:30.
  • the speech of the speaker Wang Wei is from 9:30 to 9:50.
  • the midfield summary speech is from 10:30 to 11:00, and the final summary of the conference is from 16:30 to 17:00.
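The agenda table above can be held in memory along the following lines; this is an illustrative data shape only, with times expressed as minutes since midnight for simplicity.

```python
# A possible in-memory form of the meeting agenda table described above.

AGENDA = [
    {"item": "opening ceremony", "speaker": None,       "start": 9*60,     "end": 9*60+10},
    {"item": "speech",           "speaker": "Li Ming",  "start": 9*60+10,  "end": 9*60+30},
    {"item": "speech",           "speaker": "Wang Wei", "start": 9*60+30,  "end": 9*60+50},
    {"item": "midfield summary", "speaker": None,       "start": 10*60+30, "end": 11*60},
    {"item": "final summary",    "speaker": None,       "start": 16*60+30, "end": 17*60},
]

def speaker_at(agenda, minute):
    """Look up the scheduled speaker at the current time (None if unscheduled)."""
    for row in agenda:
        if row["start"] <= minute < row["end"]:
            return row["speaker"]
    return None

print(speaker_at(AGENDA, 9*60+15))  # Li Ming
```

A lookup like `speaker_at` is what lets the system associate the currently recorded audio/video with the scheduled speaker, as the following step describes.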
  • the speaker corresponding to the current time is obtained from the meeting agenda. The speaker, the speech start time, and the speech end time are associated with the acquired audio/video data and stored in a storage device.
  • alternatively, voiceprint recognition is used to identify the participant currently speaking. Specifically, voiceprint feature data is identified from the currently collected audio/video data and matched against the user voiceprint feature data collected and stored in step S1; after a successful match, the name and user name of the current speaker are obtained, associated with the collected audio/video data, and stored in the storage device.
  • the position of a speech within the meeting usually reflects its status and role in the meeting. For example, the first and last speeches of a conference usually occupy a more important position, and the opening ceremony, the midfield summary speech, and the final summary speech also play important roles. Therefore, the speaker's speaking position is determined from the meeting agenda stored in the agenda table, and different speakers are given different weight coefficients A according to their speaking positions.
  • the permission level of a speaker can also reflect the position the speaker occupies in the meeting. The corresponding permission level is obtained from the speaker's user name, and different speakers are given different weight coefficients B according to their permission levels.
  • the weight coefficient A and the weight coefficient B of a speaker together determine the speaker's final weight coefficient C. It is also possible to use only the speaker's weight coefficient A or only the weight coefficient B as the final weight coefficient. The greater a speaker's weight coefficient, the more important the content of the speech.
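The combination of the position-based weight A and the permission-based weight B can be sketched as below. The patent does not give a combination formula, so a weighted average is used here purely as one plausible assumption; the parameter `alpha` is hypothetical.

```python
# Hedged sketch of combining weight A (speaking position) and weight B
# (permission level) into a final weight C. The averaging formula is an
# assumption; the text only says A and B together determine C.

def final_weight(a, b, alpha=0.5):
    """Combine position weight `a` and permission weight `b`.

    alpha balances the two sources; alpha=0.5 averages them.
    Setting alpha=1.0 (or 0.0) uses only A (or only B), as the text allows."""
    return alpha * a + (1.0 - alpha) * b

print(round(final_weight(0.9, 0.7), 3))  # 0.8  (average of A and B)
print(final_weight(0.9, 0.7, 1.0))       # 0.9  (use A only)
```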
  • this method of determining a weight coefficient for each speaker differs from the traditional single strategy of extracting content identically for all speakers: according to the weight coefficient, different strategies are used to obtain the candidate key segment sets corresponding to different speakers, extracting more content from important speeches and relatively less from unimportant ones, making the final summary content more reasonable and providing more effective help for the user.
  • the stored audio/video is retrieved, the corresponding speech segment is found, and candidate key segments are intercepted from it using a preset strategy. The preset strategies for intercepting candidate key segments differ for speakers with different weight coefficients.
  • for example, suppose the speaker Li Ming's weight coefficient is 0.9 and the speaker Wang Wei's weight coefficient is 0.7, i.e., Li Ming's weight coefficient is greater than Wang Wei's. Then (1) the interception time periods for Li Ming's speech segment are 0% to 5%, 10% to 20%, 50% to 60%, and 80% to 100% of the speech, while the interception time periods for Wang Wei's speech are 0% to 5%, 10% to 20%, and 80% to 100%.
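Mode (1) above amounts to converting weight-dependent percentage windows into absolute time windows over a speech of a given duration. The sketch below mirrors the Li Ming / Wang Wei example; the weight threshold and window tables are illustrative assumptions.

```python
# Sketch of mode (1): weight-dependent percentage windows of a speech.
# The 0.8 threshold separating the two window tables is an assumption.

WINDOWS_BY_MIN_WEIGHT = [
    # (minimum weight, percentage windows of the speech to intercept)
    (0.8, [(0.00, 0.05), (0.10, 0.20), (0.50, 0.60), (0.80, 1.00)]),
    (0.0, [(0.00, 0.05), (0.10, 0.20), (0.80, 1.00)]),
]

def intercept_windows(weight, duration_s):
    """Return absolute (start, end) seconds to intercept for this speaker."""
    for min_w, windows in WINDOWS_BY_MIN_WEIGHT:
        if weight >= min_w:
            return [(p0 * duration_s, p1 * duration_s) for p0, p1 in windows]
    return []

# Li Ming (weight 0.9) over a 20-minute (1200 s) speech gets four windows:
print(intercept_windows(0.9, 1200))
# Wang Wei (weight 0.7) over the same duration gets fewer windows:
print(intercept_windows(0.7, 1200))
```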
  • alternatively, (2) a keyword library can be preset; speech recognition is performed on the speaker's speech audio/video clips, and the recognized speech information is matched against the preset keyword library. The audio/video segment of a preset time period following each recognized keyword is intercepted as a candidate key speech segment. For example, when the preset keyword "the important thing is" is recognized in the speech audio/video corresponding to a speaker, the audio/video segment of 1 minute after the keyword is intercepted as a candidate key segment.
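Mode (2) above can be sketched as follows: after speech recognition, a clip of preset length is intercepted starting at each recognized cue keyword. The transcript format of (phrase, timestamp) pairs and the cue list are assumptions for illustration.

```python
# Sketch of mode (2): keyword-triggered interception of candidate key segments.

CUE_KEYWORDS = {"importantly", "in summary", "the important thing is"}

def keyword_clips(transcript, clip_len_s=60.0):
    """transcript: list of (phrase, start_seconds) pairs from the recognizer.

    Returns (start, end) windows of clip_len_s seconds after each cue keyword."""
    clips = []
    for phrase, t in transcript:
        if phrase in CUE_KEYWORDS:
            clips.append((t, t + clip_len_s))  # e.g. 1-minute clip after the cue
    return clips

transcript = [("good morning", 0.0), ("importantly", 95.0), ("in summary", 900.0)]
print(keyword_clips(transcript))  # [(95.0, 155.0), (900.0, 960.0)]
```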
  • candidate key segments can also be intercepted using the two modes in combination, for example by first applying mode (1) and then applying mode (2) to intercept candidate key speech segments.
  • the manner of intercepting candidate key segments is based on characteristics of the speech content, such as positions on the speech timeline where important content tends to appear, or the key transition words and connective words that important content tends to follow. This greatly improves the effectiveness of the extracted content, keeps extraction efficiency high, and is unaffected by the environment and other factors, further making the final summary content more reasonable.
  • S9. Perform speech recognition processing on the obtained candidate key speech segment set, and filter and locate the audio/video segment set corresponding to the key speech content. The key speech content may be determined in combination with the conference topic and may be a series of keywords related to the topic.
  • the candidate key speech segments obtained in step S8 undergo speech recognition processing and are converted into text data. The converted text data carries a time axis corresponding to the audio/video data, so the audio/video data of the corresponding time period can be located from the content of the text data. The converted text data is filtered with keywords corresponding to the key speech content, finally determining the audio/video segment set corresponding to the key speech content.
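Because each recognized text segment carries a time axis back to the audio/video, filtering the text by topic keywords also selects the corresponding clips, as step S9 describes. The data shapes in this sketch are assumptions for illustration.

```python
# Sketch of step S9: keyword filtering over time-aligned text segments.

def filter_segments(segments, topic_keywords):
    """segments: list of dicts {"text": str, "start": seconds, "end": seconds}.

    Keeps segments whose text contains any keyword of the key speech content,
    returning the (start, end) windows that locate the A/V clips."""
    kept = []
    for seg in segments:
        if any(kw in seg["text"] for kw in topic_keywords):
            kept.append((seg["start"], seg["end"]))  # locate the A/V clip
    return kept

segments = [
    {"text": "sales rose ten percent this quarter", "start": 100, "end": 160},
    {"text": "the weather was nice last week",      "start": 160, "end": 200},
    {"text": "next quarter sales target is set",    "start": 200, "end": 260},
]
print(filter_segments(segments, ["sales", "target"]))  # [(100, 160), (200, 260)]
```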
  • S10. Synthesize the audio/video segment set selected in step S9 to form a voice digest.
  • transition information in the summary is then generated, for example "At the beginning of the meeting, Zhang San summarized the work of the month," "During the meeting, Li Si, Wang Wei, Li Ming, and others made speeches," "The main content of Wang Wei's speech was ...," and "Finally, Wang Wu deployed next month's work; the specific content was ...." The above information is rendered into a transition information voice file.
  • the header information voice file, the transition information voice files, and the spliced speech summaries of the different speakers are combined in chronological order and correspondence to form the voice summary of the conference.
  • for example, a voice summary file corresponding to the following text is generated: On March 30, 2017, the marketing department meeting was held in conference room A. The participants included Li Ming, Wang Wei, ... At the beginning of the meeting, Zhang San summarized the work of the month; the specific content is [Zhang San's speech summary]. During the meeting, Li Si, Wang Wei, Li Ming, and others made speeches; the main content of Wang Wei's speech is [Wang Wei's speech summary]. Finally, Wang Wu deployed next month's work; the specific content is [Wang Wu's speech summary].
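The assembly in step S10 can be sketched as follows: the header, the transition texts, and each speaker's spliced summary are combined in agenda order into one conference summary. A real system would concatenate audio files; plain strings stand in here, and all names are illustrative.

```python
# Sketch of step S10: combining header, transitions, and speaker summaries
# in chronological order to form the conference summary.

def build_summary(header, sections):
    """sections: list of (transition_text, speaker_summary) in agenda order."""
    parts = [header]
    for transition, speaker_summary in sections:
        parts.append(transition)
        parts.append(speaker_summary)
    return " ".join(parts)

summary = build_summary(
    "Marketing meeting, conference room A.",
    [("Zhang San opened with the monthly review:", "[Zhang San's summary]"),
     ("The main content of Wang Wei's speech was:", "[Wang Wei's summary]")],
)
print(summary)
```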
  • a schematic structural diagram of a conference intelligent management system according to an embodiment of the present invention is described below with reference to FIG. 4.
  • System 400 includes the following modules:
  • the user setting module 401 is configured to set a user role and input user information.
  • the user setting module 401 includes:
  • the input/output device 4011 receives the type of the user role input by the system administrator.
  • the role type may include: a teacher, a student, a logistics supervisor, a principal, and the like.
  • the role types can include: chairman, general manager, department manager, staff, etc.
  • a plurality of user information items corresponding to each role type are received. The user information includes a name and contact information (such as a mobile phone number and email address), and the name is used as the user's user name. If names are duplicated, serial numbers may be appended to distinguish the users, for example Xiao Ming 1 and Xiao Ming 2, with the numbered name used as the user name, so that each user has a unique user name.
  • the storage device 4012 stores the user role types and stores, in association with each role type, the plurality of corresponding user information items. The role type is stored in correspondence with the user name, for example in a mapping table. The voiceprint feature information collecting module 4013 collects the voiceprint feature information of different users, and voiceprint recognition technology is used to identify them. Specifically, voice data of different users is collected, and each user's voiceprint feature data is identified and stored in the storage device 4012 in correspondence with the user name.
  • the system 400 further includes a privilege level setting module 402 for setting privilege levels corresponding to different role types.
  • Each role type corresponds to one permission level.
  • four permission levels A, B, C, and D can be set, and the order of the rights is A>B>C>D.
  • the principal has the highest authority, the authority level is set to A; the logistics supervisor authority level is set to B; the teacher authority level is C; and the student authority level is D.
  • the chairman has the highest authority, the privilege level is set to A; the general manager privilege level is set to B; the department manager privilege level is C; and the clerk's privilege level is D.
  • the system 400 also includes a meeting room reservation module 403 for the user to reserve a meeting room.
  • the user is allowed to reserve a meeting room. For a reserved conference room, only the reserving user can open the room's access control system during the reserved period.
  • the conference room reservation module 403 includes a reservation request module 4031. The user can log in to the intelligent conference control system 400 through the network; the reservation request module 4031 obtains the corresponding role type according to the login user's user name and displays the conference rooms that can be reserved, together with which time periods have already been successfully reserved by other users and which are idle. A successfully reserved time period is displayed as reserved, along with the role type of the reserving user; a time period not yet successfully reserved is displayed as idle.
  • the user selects the required conference room and time period through the reservation request module 4031 and submits a reservation request. After confirming that the reservation is successful, the system prompts the user and may also send reservation success information, including the reserved conference room and the reservation time period, by short message or the like.
  • the conference room reservation module 403 includes a reservation change request module 4032 for changing the reservation result of the conference room that has been reserved.
  • specifically, the user selects the required conference room and time period through the reservation request module 4031 and submits a reservation request. The system determines whether that time period of the conference room has already been reserved by another user. If it has not, the reservation is confirmed as successful and the user is prompted; reservation success information, including the reserved conference room and the reservation time period, may be sent by short message or the like. If it has been reserved, the reservation change request module 4032 obtains the permission level corresponding to the reserved user's role type and the permission level corresponding to the role type of the user submitting the reservation request, and compares the two. When the reserved user's permission level is lower than that of the currently requesting user, the reserved user's reservation is cancelled, the currently submitted reservation request is confirmed, and a cancellation notice, including the reserved conference room and the cancelled time period, is sent to the cancelled user by short message or the like; at the same time, reservation success information, including the reserved conference room and the reservation time period, is sent to the now-successful user by short message or the like.
  • when the permission level corresponding to the reserved user's role type is equal to or higher than that of the currently requesting user, the current user may be prompted to send a reservation change request to the reserved user, entering the reason for the requested change.
  • the reservation change request module 4032 sends the reservation change request, including the reason for the requested change, the conference room concerned, and the corresponding time period, to the reserved user by short message or the like, and the reserved user confirms whether to agree and feeds back to the system. If the reserved user agrees, the system cancels that user's reservation, confirms the currently submitted reservation request as successful, and sends the cancelled user a cancellation notice, including the reserved conference room and the cancelled time period, by short message or the like; at the same time, reservation success information, including the reserved conference room and the reservation time period, is sent to the now-successful user. If the reserved user disagrees, the reservation change request module 4032 notifies the current user that the reservation change request failed and may prompt the current user to select another time period.
  • for example, conference room A has been successfully booked by the teacher Zhang San from 9:00 to 10:00 on December 1st, and the logistics supervisor finds that there is a problem with the equipment in the conference room and needs to use the room during the period from 9:00 to 10:00 on December 1st.
  • the system displays the conference rooms that can be reserved and their reservation status: conference room A has already been successfully reserved from 9:00 to 10:00 on December 1st by a user whose role type is teacher.
  • the logistics supervisor submits a reservation request for conference room A from 9:00 to 10:00 on the morning of December 1st. The system determines that conference room A has already been reserved by Zhang San and obtains the permission level corresponding to Zhang San. Since the logistics supervisor's permission level B is higher than Zhang San's permission level C, the system cancels Zhang San's reservation and confirms the logistics supervisor's request.
  • as another example, conference room A has been successfully booked by the teacher Zhang San from 9:00 to 10:00 on December 1st, and the student Xiao Ming needs to use conference room A for a defense session from 9:00 to 10:00 on December 1st.
  • the system displays the conference rooms that can be reserved and their reservation status: conference room A has already been successfully reserved from 9:00 to 10:00 on December 1st by a user whose role type is teacher.
  • Xiao Ming submits a reservation request for conference room A from 9:00 to 10:00 on December 1st. The system determines that conference room A has already been reserved by Zhang San and obtains the permission level corresponding to Zhang San. Since Xiao Ming's permission level D is lower than Zhang San's permission level C, the system prompts Xiao Ming to send a reservation change request to Zhang San.
  • Zhang San confirms whether to agree to the reservation change request and feeds back to the system. If Zhang San agrees, the system cancels Zhang San's reservation, confirms Xiao Ming's reservation request as successful, sends Zhang San the message "The conference room A you reserved (December 1st, 9:00-10:00) has been cancelled; please understand," and sends Xiao Ming the message "Your reservation of conference room A (December 1st, 9:00-10:00) was successful." If Zhang San does not agree, the system notifies Xiao Ming that the reservation change request failed and may prompt Xiao Ming to select another time period.
  • the traditional reservation method is single and fixed; once a reservation succeeds, it is difficult to adjust it as circumstances change. Here, different reservation methods are provided according to user role type and permission level, and a negotiation mechanism is provided between users competing for a successfully reserved conference room. This not only ensures orderly use of conference rooms, but also provides an intelligent and flexible adjustment method for management needs and the handling of unexpected situations, giving users great flexibility in conference room use.
  • the voice data collection module 4041 is configured to collect the user's voice data via a voice collector installed at the conference room's access control and to obtain the corresponding user's user name.
  • the reservation judging module 4042 is configured to acquire the current time and determine whether the conference room has been reserved for that time. If it has not been reserved, the user is allowed to enter and use the meeting room. If it has been reserved, the reserving user's user name is obtained and compared with the user name obtained by the voice data collection module 4041; if they match, the conference room's access control is opened and the user is allowed to enter and use the room; if they do not match, the user is told that the conference room has been reserved and is unavailable at the current time, and the process ends.
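The door-check logic of modules 4041/4042 can be sketched as below, assuming the voiceprint step has already yielded a user name. The data shape (`reservations` as a dict keyed by hour for a single room) is an illustrative assumption.

```python
def check_access(reservations, identified_user, now):
    """Return (allow, message) for a user identified at the door.
    `reservations` maps an hour to the reserving user's name."""
    booked_by = reservations.get(now)
    if booked_by is None:
        return True, "room free, entry allowed"
    if booked_by == identified_user:
        return True, "reservation verified, door opened"
    return False, "room reserved for this time, entry denied"
```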
  • System 400 also includes a data acquisition module 405 for collecting conference audio/video data during the conference.
  • when a participant ends a speech, a recording pause or stop command is triggered and the end time of the speech is recorded.
  • the recording microphone then resumes cyclically detecting voice information, and when the voice of the next speaker is detected, a resume-recording or recording-start command is triggered to record the next speaker's audio/video data.
  • all speakers are users whose user information has been entered and whose user role has been set in the user setting module 401.
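The voice-triggered start/stop behaviour described for the data acquisition module can be sketched as a simple energy-threshold loop. The threshold and the silence-frame limit are assumed example values; the patent only requires that recording stop once speech has halted for some preset time.

```python
def recording_commands(frames, threshold=0.1, silence_limit=3):
    """Given per-frame speech energies, emit (command, frame_index) pairs:
    start when voice appears, stop after `silence_limit` quiet frames."""
    commands, recording, quiet = [], False, 0
    for t, energy in enumerate(frames):
        if energy >= threshold:
            quiet = 0
            if not recording:
                commands.append(("start", t))
                recording = True
        elif recording:
            quiet += 1
            if quiet >= silence_limit:
                # stop is attributed to the first silent frame
                commands.append(("stop", t - silence_limit + 1))
                recording, quiet = False, 0
    return commands
```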
  • System 400 also includes an audio/video data pre-processing module 406 for pre-processing and storing the recorded audio/video data.
  • after the data acquisition module 405 pauses or ends recording, the audio/video data pre-processing module 406 records the start time and end time of the speaker's speech and obtains the speaker's name and user name.
  • the audio/video data pre-processing module 406 includes a meeting agenda processing module 4061, configured to read a pre-stored meeting agenda table, where the meeting agenda table stores a meeting agenda, and a speaking time period of each speaker in the meeting.
  • 9:00–9:10 is the opening ceremony.
  • the speech of the speaker Li Ming is from 9:10 to 9:30.
  • the speech of the speaker Wang Wei is from 9:30 to 9:50.
  • the interim summary speech runs from 10:30 to 11:00, and the closing summary of the conference runs from 16:30 to 17:00.
  • the speaker corresponding to the current time is obtained according to the agenda of the meeting.
  • the speaker, the start time of the speech, and the end time of the speech are associated with the acquired audio/video data and stored in a storage device.
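The agenda-table lookup described above amounts to an interval search over the stored speaking periods. A minimal sketch follows, with the agenda values taken from the example in the text (times as minutes since midnight).

```python
AGENDA = [  # (start, end, item) -- example values from the agenda above
    (9 * 60,      9 * 60 + 10, "opening ceremony"),
    (9 * 60 + 10, 9 * 60 + 30, "Li Ming"),
    (9 * 60 + 30, 9 * 60 + 50, "Wang Wei"),
]

def speaker_at(agenda, minute):
    """Return the agenda item (speaker) active at `minute`, else None."""
    for start, end, item in agenda:
        if start <= minute < end:
            return item
    return None
```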
  • the audio/video data pre-processing module 406 includes a voiceprint recognition module 4062 for identifying the currently speaking participant using voiceprint recognition technology.
  • the voiceprint recognition module 4062 identifies voiceprint feature data from the currently collected audio/video data and matches it against the user voiceprint feature data collected and stored by the voiceprint feature information collection module 4013; after a successful match, the name of the current speaker is obtained, and the speaker, speech start time, and speech end time are associated with the collected audio/video data and stored in the storage device.
  • the system 400 also includes a speaker weight determination module 407 for determining the weight coefficients of different speakers.
  • the position of a speaker's speech in the meeting usually reflects his position and role in it. For example, the first and last speakers of a conference usually occupy more important positions, and the opening-ceremony speech, the midway summary speech, and the closing summary speech also play important roles in the meeting.
  • the speaker weight determination module 407 determines each speaker's speaking position according to the meeting agenda stored in the meeting agenda table and assigns different weight coefficients A to different speakers according to their speaking positions.
  • the speaker's permission level can also reflect the position the speaker occupies in the meeting.
  • the speaker weight determination module 407 can obtain the corresponding permission level from the speaker's user name and assign different weight coefficients B to different speakers according to their permission levels.
  • the speaker weight determination module 407 combines the weight coefficient A and the weight coefficient B corresponding to a speaker to determine the speaker's final weight coefficient C.
  • the speaker weight determination module 407 may also use only the speaker's weight coefficient A or weight coefficient B as the final weight coefficient. The greater a speaker's weight coefficient, the more important the content of that speech.
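One way the coefficients A, B, and C could be combined is sketched below. The concrete numbers and the blending formula are assumptions for illustration; the patent does not fix them.

```python
def weight_a(position, total):
    """Agenda-position weight: first and last speeches weighted highest.
    The numeric values are illustrative assumptions."""
    return 0.9 if position in (0, total - 1) else 0.6

def weight_b(level):
    """Permission-level weight, level A(1) high ... D(4) low (assumed values)."""
    return {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6}[level]

def final_weight(a, b, alpha=0.5):
    """Combine A and B into the final coefficient C via a simple blend."""
    return alpha * a + (1 - alpha) * b
```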
  • this way of determining weight coefficients for speakers differs from the traditional single approach of extracting speech content in the same way for all speakers: the candidate key segment sets corresponding to different speakers can be obtained with different preset strategies according to the weight coefficients.
  • more content is extracted from important speeches and relatively less from unimportant ones, making the final summary more reasonable and providing more effective help to the user.
  • the system 400 also includes a candidate key segment collection module 408 for obtaining a corresponding set of candidate key segments based on the speaker.
  • the candidate key segment collection module 408 retrieves the stored audio/video according to the name of the speaker, finds the corresponding specific segment, and intercepts the candidate key segment in the segment by using the preset strategy.
  • the preset strategies for intercepting the candidate key segments corresponding to the speakers having different weight coefficients are different.
  • the candidate key speech segment collection acquisition module 408 includes a time segment intercepting module 4081, configured to intercept, along the time axis of the speaker's speech audio/video segment, the audio/video clips of specific time periods (for example, 0% to 5%, 10% to 30%, and 80% to 100%) as candidate key segments.
  • the selection of the time periods can be set according to the actual situation, and the number and length of the audio/video clips intercepted for speakers with different weight coefficients differ.
  • for example, suppose speaker Li Ming's weight coefficient is 0.9 and speaker Wang Wei's weight coefficient is 0.7, that is, Li Ming's weight coefficient is greater than Wang Wei's. Then (1) the interception time periods for Li Ming's speech segment are 0% to 5%, 10% to 20%, 50% to 60%, and 80% to 100%, while those for Wang Wei's speech segment are 0% to 5%, 10% to 20%, and 80% to 100%.
  • or (2) the interception time periods for Li Ming's speech segment are 0% to 5%, 10% to 20%, 50% to 60%, and 80% to 100%, while those for Wang Wei's speech segment are 0% to 5%, 10% to 15%, 50% to 60%, and 90% to 100%.
  • or (3) the interception time periods for Li Ming's speech segment are 0% to 5%, 10% to 20%, 50% to 60%, and 80% to 100%, while those for Wang Wei's speech segment are 0% to 5%, 10% to 15%, and 90% to 100%.
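The percentage-window interception for the two weight tiers in example (1) can be sketched as follows. The 0.8 threshold separating the "high" window set from the "low" one is an assumed value, not from the patent.

```python
HIGH_WINDOWS = [(0.00, 0.05), (0.10, 0.20), (0.50, 0.60), (0.80, 1.00)]
LOW_WINDOWS  = [(0.00, 0.05), (0.10, 0.20), (0.80, 1.00)]

def intercept(duration_s, weight, threshold=0.8):
    """Turn the percentage windows into (start, end) offsets in seconds.
    Higher-weight speakers get more (and longer) windows."""
    windows = HIGH_WINDOWS if weight >= threshold else LOW_WINDOWS
    return [(round(a * duration_s), round(b * duration_s)) for a, b in windows]
```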
  • the candidate key segment collection module 408 also includes a keyword intercepting module 4082, which presets a keyword library, performs speech recognition on the speaker's speech audio/video segment, and matches the recognized speech information against the preset keyword library. After a successful match, the audio/video clip of a preset time period following the recognized keyword is intercepted as a candidate key segment. For example, when the preset keyword "what is important is" is recognized in the speech audio/video segment corresponding to a speaker, the keyword intercepting module 4082 intercepts the one-minute audio/video clip following the keyword as a candidate key segment.
  • for example, suppose speaker Li Ming's weight coefficient is 0.9 and speaker Wang Wei's weight coefficient is 0.7, that is, Li Ming's weight coefficient is greater than Wang Wei's, keyword library A includes 20 keywords, and keyword library B includes 10 keywords.
  • then (1) the preset keyword library A is used to match the speech information recognized from Li Ming's speech audio/video segments, and after a successful match the 3-minute audio/video clip following each recognized keyword is intercepted as a candidate key speech segment; the preset keyword library B is used to match the speech information recognized from Wang Wei's speech audio/video segments, and after a successful match the 3-minute audio/video clip following each recognized keyword is likewise intercepted as a candidate key segment.
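The keyword-library interception can be sketched as below, assuming the speech recognizer yields word-level timestamps. The keyword lists stand in for libraries A and B and are illustrative only.

```python
KEYWORDS_A = ["importantly", "first", "finally", "however"]  # stand-in for library A
KEYWORDS_B = ["importantly", "finally"]                      # smaller stand-in for library B

def keyword_clips(transcript, keywords, clip_len):
    """`transcript` is a list of (timestamp_s, word). For each keyword hit,
    return the (start, end) clip of `clip_len` seconds after the hit."""
    return [(t, t + clip_len) for t, w in transcript if w in keywords]
```

A higher-weight speaker would be matched against the larger library and/or given a longer `clip_len`, producing more candidate material, exactly the asymmetry the examples describe.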
  • the time segment intercepting module 4081 and the keyword intercepting module 4082 may be used separately, or the two modules may be combined to intercept the candidate key speech segments, for example intercepting first with the time segment intercepting module 4081 and then intercepting further with the keyword intercepting module 4082 to obtain the set of candidate key segments.
  • intercepting the candidate key segment set based on characteristics of the speech content itself, such as the positions on the speech time axis where important content is most likely to appear, or the key transition words and connectives that important content tends to follow, can greatly improve the effectiveness of the extracted content; the extraction is efficient and is not affected by the environment or other external factors, further making the final summary content more reasonable.
  • the system 400 also includes an audio/video clip set screening module 409, which can determine the key speech content in combination with the conference topic; the key speech content may be a series of keywords related to the conference topic.
  • the audio/video clip set screening module 409 performs speech recognition on the candidate key speech segments acquired by the candidate key speech segment collection acquisition module 408 and converts them into text data.
  • the converted text data has a time axis corresponding to the audio/video data, so that the audio/video data of the corresponding time period can be located according to the content of the text data.
  • the converted text data is filtered using the keywords corresponding to the key speech content, and the audio/video clip set corresponding to the key speech content is finally determined.
  • determining the key speech content in combination with the conference topic as above further enhances the effectiveness of the extracted content.
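The screening step, transcribing candidate clips and filtering by topic keywords to locate the matching time ranges, can be sketched as follows; the segment format `(start, end, text)` is an assumed shape for the recognizer output.

```python
def screen_segments(segments, topic_keywords):
    """`segments` is a list of (start_s, end_s, text) from speech recognition
    of the candidate clips; keep those mentioning any topic keyword and
    return their located (start, end) pairs."""
    hits = []
    for start, end, text in segments:
        if any(k in text for k in topic_keywords):
            hits.append((start, end))
    return hits
```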
  • the system 400 further includes a voice digest synthesizing module 410 for synthesizing the audio/video clip sets filtered by the audio/video clip set screening module 409 to form a voice digest.
  • the voice summary synthesis module 410 includes a speaker voice summary synthesis module 4101, which sorts the audio/video clip sets of the same speaker screened out by the audio/video clip set screening module 409 in chronological order and splices the sorted clips into one piece of audio/video as the voice summary of that speaker's speech. Further, the voice summary synthesis module 410 also includes a conference voice summary synthesis module 4102, configured to generate a voice summary of the entire conference.
  • the conference voice summary synthesis module 4102 can generate header information of the summary according to the conference theme, the conference agenda, and the like, for example, “The 2017 Artificial Intelligence Conference is held in Shanghai for a period of three days, and the participants include: Li Ming, Wang Wei...” And generate the header information voice file from the above information.
  • the conference voice summary synthesis module 4102 then generates the transition information of the summary according to the conference agenda and similar information, for example: "Zhang San addressed the conference at the opening ceremony," "Li Si, Wang Wei, Li Ming and others spoke during the conference," "the main content of Wang Wei's speech was," and "finally, Wang Wu summarized the meeting; the specific content was," and generates the transition information voice file from the above information.
  • the conference voice summary synthesis module 4102 combines the header information voice file, the transition information voice file, and the spliced voice summaries of the different speakers according to their correspondence to form the voice summary of the conference. For example, a voice summary file corresponding to the following text is generated:
  • The 2017 Artificial Intelligence Conference was held in Shanghai for a period of three days.
  • Li Si, Wang Wei, Li Ming and others made speeches.
  • The main content of Wang Wei's speech was "Wang Wei's speech summary".
  • Finally, Wang Wu summarized the meeting; the content was "Wang Wu's summary speech".
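The final assembly — header first, then each transition line followed by that speaker's chronologically ordered clips — can be sketched with text labels standing in for actual audio data; the data shapes here are illustrative assumptions.

```python
def build_digest(header, transitions, speaker_clips):
    """`transitions` is a list of (speaker, intro_line); `speaker_clips`
    maps a speaker to (start_time, clip_label) pairs. Clips are sorted
    chronologically and interleaved after the header."""
    parts = [header]
    for speaker, intro in transitions:
        clips = sorted(speaker_clips.get(speaker, []))
        parts.append(intro)
        parts.extend(label for _, label in clips)
    return parts
```

In a real system each `clip_label` would be an audio buffer and the final list would be concatenated into one voice file.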
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division into units is only a division by logical function.
  • in actual implementation there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
  • the software functional unit described above is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the various embodiments of the present application.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
  • the invention can provide a plurality of conference room reservation manners according to different user roles and hierarchical rights, and avoid disorderly use of the conference room.
  • the reservation mode can be provided flexibly, and the reservation result of a reserved conference room can be changed in different ways according to the different permission levels of different users, facilitating conference management and the coordination of emergencies.
  • a communication and coordination mechanism is also established between the reserving user and a later user, allowing the latecomer to negotiate with the reserving user over urgent needs for the conference room, giving users greater flexibility in using conference rooms.
  • the recording of invalid content is effectively reduced, storage resources are saved, and the length of the recording is reduced, making it easier for the user to locate desired content later.
  • the weight coefficient of each speaker is determined, so that different preset strategies are used according to the weight coefficients to obtain the candidate key segment sets corresponding to different speakers: more content is extracted from important speeches and relatively less from unimportant ones, making the resulting summary more reasonable and providing users with more effective help.
  • the candidate key speech segment set is intercepted based on characteristics of the speech content itself, for example the positions on the speech timeline where important content appears with high probability, or the key transition words and connectives that important content follows.
  • the intercepted candidate key speech segment sets are then processed to obtain the audio/video clip set that forms the speech digest, which can greatly improve the effectiveness of the extracted content; the extraction is efficient and is not affected by the environment or other external factors, further making the resulting summary more reasonable.


Abstract

The present invention discloses an intelligent conference management method and system, relating to the field of intelligent management. By distinguishing user roles and user permission levels, the invention changes the reservation result of an already-reserved conference room in different ways, facilitating conference management and the coordination of unexpected situations. By analyzing information such as a speaker's speaking position in the meeting, identity information, and personal profile, the speaker's weight coefficient is determined, so that different preset strategies are used according to the weight coefficient to obtain the candidate key speech segments corresponding to different speakers. The set of candidate key speech segments is further intercepted according to characteristics of the speech content itself, and the intercepted set is then processed to obtain the audio/video clip set that forms a speech summary. More content can be extracted for important speeches and relatively less for unimportant ones, making the final summary more reasonable and providing more effective help to the user.

Description

Intelligent Conference Management Method and System

Technical Field

The present invention relates to an intelligent conference management method and system, and in particular to a method and system for flexibly reserving conference rooms and automatically synthesizing a conference summary in speech form.

Background Art

Nowadays countless meetings of all kinds are held every day. Conference room reservations are usually managed either by filling in forms or through a conference room management system. However, reserving a conference room by filling in forms is usually time-consuming, labor-intensive, and error-prone. When reservations are registered through a conference room management system, a successfully reserved room is difficult to change; when an unexpected situation occurs, the reservation result can hardly be altered, and when an urgent meeting must be convened, the organizer has no way to communicate and coordinate with the people who have already reserved the room, so the need for flexible and efficient use of conference rooms cannot be met. Meanwhile, to record meeting content, capture tools such as video cameras or voice recorders are usually used to record on-site video or audio data, which is saved as multimedia files; by playing back the saved files, the meeting content can be watched or listened to at any time, or manually transcribed into text afterwards to satisfy needs such as memoranda and training. In addition, a dedicated note-taker is usually arranged at the meeting site, or participants themselves take notes on laptops or by hand to record the content of the meeting. However, video and audio data are usually large and occupy a great deal of storage space; when the meeting is long, it is not easy to locate the desired content during playback, users spend a lot of time finding the content they are interested in, and the user experience is poor. Recording meeting content manually helps capture key content and makes searching convenient, but it places high demands on the note-taker; people without special training usually find it hard to keep up with the pace of meeting speech, and omissions easily occur.
Automatic summarization technology can process input text, speech, video, and other information to obtain summary content from the input data and present the processed summary to the user. Automatic summarization not only saves the user's time in accessing information but also improves the user's work efficiency. Various ways of automatically generating meeting summaries exist in the prior art.

Patent Document 1 (CN107409061A) provides a voice summarization method and system in which a computer determines which participant is speaking by comparing participant images against template images of speaker and non-speaker faces. The computer determines the speaking participant's voiceprint by applying a hidden Markov model to a brief recording of the participant's voice waveform and associates the determined voiceprint with the speaking participant's face. The computer recognizes and transcribes the content of the speaker's statements, determines key points, and displays them above the participant's face in the video conference.

Patent Document 2 (CN102572356A) provides a method of recording a meeting: a configuration file is set to define the meeting's key information (for example a hand-raising question scene) and the format of the meeting summary; at specific points on the meeting timeline, key information of each venue is extracted based on the configuration file and combined into key index points, which serve as index points for interacting with or editing the meeting summary; multiple key index points corresponding to multiple time points are combined into the meeting summary, and the summary can be interacted with or edited according to the key information in it.

However, although Patent Document 1 can identify the speaking participant and display the key content of the speech in association with the participant, it extracts speech information in the same way for all participants and cannot selectively extract different participants' speech content according to their different situations. In an actual meeting, the importance of different participants usually differs; if every participant's speech is extracted in the same way, too much information may be extracted from unimportant participants, wasting resources, while too little may be extracted from important participants, causing omissions. Moreover, Patent Document 1 recognizes and transcribes the speaker's statements and, after determining the key points, generates text or similar information for the user to read, losing the advantages of the voice file itself. In addition, Patent Document 1 determines which participant is speaking by comparing participant images against template face images, and the computer then determines the speaking participant's voiceprint by applying a hidden Markov model to a brief recording of the voice waveform; this process of identifying participants is relatively complex and inefficient.

As for Patent Document 2, it needs to extract key information of the venue, combine the venue's key information into key index points, and combine multiple key index points at multiple time points into a meeting summary. On the one hand this imposes higher requirements on the collection of venue environment data; on the other hand it is easily affected by external environmental factors: its key index points do not necessarily represent the important information of the meeting, so the resulting meeting summary may be inaccurate.
Summary of the Invention

In view of the above problems, an object of the present invention is to provide an intelligent conference management method and system capable of flexibly reserving conference rooms, identifying the key speech content of different speakers from the meeting recording, and automatically synthesizing a conference summary in speech form.

The intelligent conference management method includes:

S1: setting user roles and entering user information, wherein the types of user roles are preset, a plurality of corresponding user information entries are entered and stored for each role type, and each user has a unique user name; and collecting the voiceprint feature information of different users, wherein voiceprint recognition technology is used to identify different users, the voice data of different users is collected, and each user's voiceprint feature data is identified and stored in correspondence with the user name;

S2: setting the permission levels corresponding to the different role types, wherein a plurality of permission levels are set according to the attributes of the role types themselves, each role type corresponding to one permission level;

S3: a user reserving a conference room, wherein for a reserved conference room only the reserving user can open its access control system during the reserved period; a user can submit a change request against the reservation result of an already-reserved conference room, and the change is handled in different ways according to the user's role type;

S4: verifying the user's permission to use the conference room and confirming whether the user is allowed to use it;

S5: collecting conference audio/video data while the meeting is in progress;

S6: pre-processing and storing the recorded audio/video data, wherein the start time and end time of a speaker's speech are recorded, the speaker's name and user name are obtained, and the speaker, speech start time, and speech end time are associated with the collected audio/video data and stored;

S7: determining the weight coefficients of different speakers;

S8: obtaining the set of candidate key speech segments corresponding to each speaker, wherein the stored audio/video is searched by the speaker's name to find the corresponding speech segment and candidate key speech segments are intercepted from it using a preset strategy, the preset interception strategies differing for speakers with different weight coefficients;

S9: performing speech recognition on the obtained set of candidate key speech segments and screening out the audio/video clip set corresponding to the key speech content;

S10: synthesizing the audio/video clip set screened out in step S9 to form a speech summary.
Optionally, in step S3, the user reserves a conference room over the network; the user's user name is obtained, the corresponding role type is obtained from the user name, and the conference rooms the user can reserve are displayed together with the rooms' permitted working hours. At the same time it is shown which time slots of the room have already been successfully reserved by other users and which are free, reserved slots being displayed as reserved together with the reserving user's role type and unreserved slots as free. The user selects the desired conference room and time slot and submits a reservation request, and the system prompts the user after confirming that the reservation has succeeded.

Optionally, the user selecting the desired conference room and time slot, submitting a reservation request, and the system prompting the user after confirming the reservation further includes:

the user selects the desired conference room and time slot and submits a reservation request; the system judges whether that time slot of the room has already been reserved by another user; if not, the reservation is confirmed as successful and the user is prompted;

if it has been reserved by another user, the permission level corresponding to the reserving user's role type and the permission level corresponding to the requesting user's role type are obtained and compared. When the reserving user's permission level is lower than the requesting user's, the existing reservation is cancelled, the current request is confirmed as successful, and the user is prompted. When the reserving user's permission level is equal to or higher than the requesting user's, the current user is prompted that a reservation change request may be sent to the reserving user; the current user enters the reason for the requested change, and a reservation change request, including the reason, the conference room, and the corresponding time slot, is sent to the reserving user. The reserving user confirms whether to agree and feeds back to the system; if the reserving user agrees, the system cancels the existing reservation, confirms the current request as successful, and prompts the user; if the reserving user does not agree, the current user is told the change request failed and is prompted to choose another time slot.

Optionally, step S4 further includes:

S401: collecting the user's voice data via a voice collector installed at the conference room's access control and obtaining the corresponding user name, wherein voiceprint feature data is identified from the currently collected voice data and matched against the pre-collected and stored user voiceprint feature data, and after a successful match the user name corresponding to the current voice data is obtained;

S402: obtaining the current time and judging whether the conference room has been reserved for that time. If not, the user is allowed to enter and use the room. If it has been reserved, the reserving user's user name is obtained and compared with the user name obtained in S401; if they match, the access control is opened and the user may enter and use the room; if not, the user is told that the room has been reserved and is unavailable at the current time, and the process ends.
Optionally, step S5 further includes:

audio/video data collection being started and stopped manually by the user to record the needed content;

or the recording microphone cyclically detecting voice information: when voice information indicating that a speaker has begun speaking is detected, a recording start command is triggered, the speaker's audio/video data is collected, and the speech start time is recorded; according to attributes of the collected audio/video data it is judged whether the current speaker is still speaking or has stopped; when the collected data satisfies a preset condition, a recording pause or stop command is triggered and the speech end time is recorded; the recording microphone continues cyclically detecting voice information, and when the voice of the next speaker is detected a resume-recording or recording-start command is triggered to record the next speaker's audio/video data.

Optionally, step S6 further includes:

reading a pre-stored meeting agenda table, which stores the meeting agenda and the speaking period of each speaker in the meeting; obtaining the speaker corresponding to the current time from the agenda table; and associating the speaker, speech start time, and speech end time with the collected audio/video data and storing them in a storage device;

or identifying voiceprint feature data from the currently collected audio/video data, matching it against the user voiceprint feature data collected and stored in step S1, obtaining the current speaker's name and user name after a successful match, and associating the speaker, speech start time, and speech end time with the collected audio/video data and storing them in a storage device.

Optionally, step S7 further includes:

determining each speaker's speaking position from the meeting agenda stored in the agenda table and assigning different weight coefficients A to different speakers according to their speaking positions;

and/or

obtaining each speaker's permission level from the user name and assigning different weight coefficients B to different speakers according to their permission levels.

Optionally, step S8 further includes:

intercepting, along the time axis of the speaker's speech audio/video segment, the audio/video clips of specific time periods as candidate key speech segments; the higher the weight coefficient, the more and/or longer the intercepted clips; the lower the weight coefficient, the fewer and/or shorter the intercepted clips.

Optionally, step S8 further includes:

presetting a keyword library, performing speech recognition on the speaker's speech audio/video segment, matching the recognized speech information against the preset keyword library, and after a successful match intercepting the audio/video clip of a preset period following the recognized keyword as a candidate key speech segment; the keyword libraries and/or preset interception periods differ for speakers of different weights: the higher the weight coefficient, the more keywords in the corresponding library and/or the longer the intercepted clips; the lower the weight coefficient, the fewer keywords and/or the shorter the clips.

Optionally, step S9 further includes:

determining the key speech content in combination with the conference topic; performing speech recognition on the candidate key speech segments obtained in step S8 and converting them into text data, the converted text data having a time axis corresponding to the audio/video data so that the audio/video data of the corresponding period can be located from the text content; filtering the converted text data using the keywords corresponding to the key speech content; and finally determining the audio/video clip set corresponding to the key speech content.

Optionally, step S10 further includes:

sorting the audio/video clip sets of the same speaker screened out in step S9 in chronological order and splicing the sorted clips into one piece of audio/video as the speech summary of that speaker's content; generating the summary's header information in advance from the conference topic, conference agenda, and similar information and generating a header information voice file from it; then generating the connecting transition information of the summary from the conference agenda and similar information and generating a transition information voice file; and combining the header information voice file, the transition information voice file, and the spliced speech summaries of the different speakers according to their correspondence to form the speech summary of the conference.
The intelligent conference management system includes:

a user setting module for setting user roles and entering user information, wherein the types of user roles are preset, a plurality of corresponding user information entries are entered and stored for each role type, and each user has a unique user name; and for collecting the voiceprint feature information of different users, wherein voiceprint recognition technology is used to identify different users, the voice data of different users is collected, and each user's voiceprint feature data is identified and stored in correspondence with the user name;

a permission level setting module for setting the permission levels corresponding to the different role types, wherein a plurality of permission levels are set according to the attributes of the role types themselves, each role type corresponding to one permission level;

a conference room reservation module for users to reserve conference rooms, wherein for a reserved room only the reserving user can open the room's access control system during the reserved period; a user can submit a change request against the reservation result of an already-reserved room, and the change is handled in different ways according to the user's role type;

a conference room use permission verification module for verifying a user's permission to use a conference room and confirming whether the user is allowed to use it;

a data acquisition module for collecting conference audio/video data while the meeting is in progress;

an audio/video data pre-processing module for pre-processing and storing the recorded audio/video data, wherein the start time and end time of a speaker's speech are recorded, the speaker's name and user name are obtained, and the speaker, speech start time, and speech end time are associated with the collected audio/video data and stored in a storage device;

a speaker weight determination module for determining the weight coefficients of different speakers;

a candidate key speech segment set acquisition module for obtaining the set of candidate key speech segments corresponding to each speaker, wherein the module searches the stored audio/video by the speaker's name, finds the corresponding speech segment, and intercepts candidate key speech segments from it using a preset strategy, the preset interception strategies differing for speakers with different weight coefficients;

an audio/video clip set screening module for performing speech recognition on the obtained set of candidate key speech segments and screening out the audio/video clip set corresponding to the key speech content;

a speech summary synthesis module for synthesizing the audio/video clip set screened out by the audio/video clip set screening module to form a speech summary.
Optionally, the conference room reservation module further includes:

a reservation request module: a user can reserve a conference room over the network; the reservation request module obtains the user's user name and the corresponding role type, displays the conference rooms the user can reserve according to the role type, and at the same time shows which time slots of a room have already been successfully reserved by other users and which are free, reserved slots being displayed as reserved together with the reserving user's role type and unreserved slots as free; the user selects the desired room and time slot through the reservation request module and submits a reservation request, and the system prompts the user after confirming that the reservation has succeeded.

Optionally, the conference room reservation module further includes:

a reservation change request module for changing the reservation result of an already-reserved conference room: the user selects the desired room and time slot through the reservation request module and submits a reservation request; the system judges whether that time slot of the room has already been reserved by another user; if not, the reservation is confirmed as successful and the user is prompted;

if it has been reserved by another user, the reservation change request module obtains the permission level corresponding to the reserving user's role type and the permission level corresponding to the requesting user's role type and compares them. When the reserving user's permission level is lower than the requesting user's, the existing reservation is cancelled, the current request is confirmed as successful, and the user is prompted. When the reserving user's permission level is equal to or higher than the requesting user's, the current user is prompted that a reservation change request may be sent to the reserving user; the current user enters the reason for the requested change, and the reservation change request module sends the reserving user a reservation change request including the reason, the conference room, and the corresponding time slot. The reserving user confirms whether to agree and feeds back to the system; if the reserving user agrees, the system cancels the existing reservation, confirms the current request as successful, and prompts the user; if the reserving user does not agree, the reservation change request module tells the current user that the change request failed and may also prompt the current user to choose another time slot.

Optionally, the conference room use permission verification module further includes:

a voice data collection module for collecting the user's voice data via a voice collector installed at the conference room's access control and obtaining the corresponding user name: voiceprint feature data is identified from the currently collected voice data and matched against the pre-collected and stored user voiceprint feature data, and after a successful match the user name corresponding to the current voice data is obtained;

a reservation judging module for obtaining the current time and judging whether the conference room has been reserved for that time: if not, the user is allowed to enter and use the room; if it has been reserved, the reserving user's user name is obtained and compared with the user name obtained by the voice data collection module; if they match, the access control is opened and the user may enter and use the room; if not, the user is told that the room has been reserved and is unavailable at the current time, and the process ends.
Optionally, the data acquisition module is further configured:

for audio/video data collection to be started and stopped manually by the user to record the needed content;

or for the recording microphone to cyclically detect voice information: when voice information indicating that a speaker has begun speaking is detected, a recording start command is triggered, the speaker's audio/video data is collected, and the speech start time is recorded; according to attributes of the collected audio/video data it is judged whether the current speaker is still speaking or has stopped; when the collected data satisfies a preset condition, a recording pause or stop command is triggered and the speech end time is recorded; the recording microphone continues cyclically detecting voice information, and when the voice of the next speaker is detected a resume-recording or recording-start command is triggered to record the next speaker's audio/video data.

Optionally, the audio/video data pre-processing module further includes:

a meeting agenda processing module for reading a pre-stored meeting agenda table, which stores the meeting agenda and the speaking period of each speaker in the meeting, obtaining the speaker corresponding to the current time from the agenda table, and associating the speaker, speech start time, and speech end time with the collected audio/video data and storing them in a storage device;

a voiceprint recognition module: the voice data of the meeting speakers is collected in advance, and each speaker's voiceprint feature data is identified and stored in correspondence with the speaker's name; the voiceprint recognition module identifies voiceprint feature data from the currently collected audio/video data, matches it against the user voiceprint feature data collected and stored by the user setting module, obtains the current speaker's name and user name after a successful match, and associates the speaker, speech start time, and speech end time with the collected audio/video data, storing them in a storage device.

Optionally, the speaker weight determination module is further configured to:

determine each speaker's speaking position from the meeting agenda stored in the agenda table and assign different weight coefficients A to different speakers according to their speaking positions;

and/or

search the network for the speaker's identity information and/or personal profile and compute the speaker's weight coefficient B from the obtained identity information based on a preset algorithm.
Optionally, the candidate key speech segment set acquisition module further includes:

a time segment intercepting module for intercepting, along the time axis of the speaker's speech audio/video segment, the audio/video clips of specific time periods as candidate key speech segments; the higher the weight coefficient, the more and/or longer the intercepted clips; the lower the weight coefficient, the fewer and/or shorter the clips.

Optionally, the candidate key speech segment set acquisition module further includes:

a keyword intercepting module, which presets a keyword library, performs speech recognition on the speaker's speech audio/video segment, matches the recognized speech information against the preset keyword library, and after a successful match intercepts the audio/video clip of a preset period following the recognized keyword as a candidate key speech segment; the keyword libraries and/or preset interception periods differ for speakers of different weights: the higher the weight coefficient, the more keywords in the corresponding library and/or the longer the intercepted clips; the lower the weight coefficient, the fewer keywords and/or the shorter the clips.

Optionally, the audio/video clip set screening module is further configured to determine the key speech content in combination with the conference topic, perform speech recognition on the candidate key speech segments obtained by the candidate key speech segment set acquisition module and convert them into text data, the converted text data having a time axis corresponding to the audio/video data so that the audio/video data of the corresponding period can be located from the text content, filter the converted text data using the keywords corresponding to the key speech content, and finally determine the audio/video clip set corresponding to the key speech content.

Optionally, the speech summary synthesis module further includes:

a speaker speech summary synthesis module for sorting the audio/video clip sets of the same speaker screened out by the audio/video clip set screening module in chronological order and splicing the sorted clips into one piece of audio/video as the speech summary of that speaker's content;

a conference speech summary synthesis module for generating the speech summary of the entire conference: it can generate the summary's header information from the conference topic, conference agenda, and similar information and generate a header information voice file from it, then generate the connecting transition information of the summary from the conference agenda and similar information and generate a transition information voice file, and finally combine the header information voice file, the transition information voice file, and the spliced speech summaries of the different speakers according to their correspondence to form the speech summary of the conference.
In the above manner, the present invention can provide multiple conference room reservation modes according to different user roles and hierarchical permissions, avoiding disorderly use of conference rooms. By distinguishing user roles and permission levels, reservation modes can be provided flexibly; the reservation result of an already-reserved room can be changed in different ways according to different users' permission levels, facilitating conference management and the coordination of unexpected situations; and a communication and coordination mechanism is established between the reserving user and a later user, allowing the latecomer to negotiate with the reserving user over urgent needs for the room, giving users greater flexibility in using conference rooms. After the meeting is held, the key speech content of different speakers is identified and a conference summary in speech form is automatically synthesized. Starting and stopping audio/video recording automatically by detecting speakers' voice information effectively reduces the recording of invalid content, saves storage resources, and shortens the recording, making it easier for users to locate desired content later. By analyzing information such as a speaker's speaking position in the meeting, identity information, and personal profile, the speaker's weight coefficient is determined, so that different preset strategies are used according to the weight coefficient to obtain the candidate key speech segment sets of different speakers, extracting more content from important speeches and relatively less from unimportant ones, making the final summary more reasonable and more helpful to the user. Intercepting the candidate key speech segment set according to characteristics of the speech content itself, for example the positions on the speech timeline where important content appears with high probability or the key transition words and connectives that important content follows, and then processing the intercepted set to obtain the audio/video clip set that forms the speech summary, can greatly improve the effectiveness of the extracted content; the extraction is efficient and unaffected by the environment or other external factors, further making the final summary content more reasonable.
Brief Description of the Drawings

Fig. 1 is a flowchart of the method of the present invention.

Fig. 2 is a schematic diagram of the reservation status of a conference room.

Fig. 3 is a schematic diagram of a meeting agenda table.

Fig. 4 is a schematic diagram of the structure of the system of the present invention.

Detailed Description

To make the objects, technical solutions, and advantages of the present application clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a schematic flowchart of the intelligent conference management method provided by an embodiment of the present application. As shown in Fig. 1, the method includes:

S1: setting user roles and entering user information.

First, the types of user roles are preset according to the actual situation. For example, in a school the role types may include teacher, student, logistics supervisor, and principal; in a company they may include chairman, general manager, department manager, and staff. For each role type, multiple corresponding user information entries are entered and stored; the user information includes a name and contact details (for example a mobile phone number and e-mail address), the name serves as the user's user name, and the role type is stored in correspondence with the user name, for example in a mapping table. Where names are duplicated, the duplicate users can be distinguished by assigning serial numbers, for example Xiao Ming 1 and Xiao Ming 2, the name with its serial number serving as that user's user name; every user has a unique user name. Through this differentiation of user roles and levels and the unique association with each user, a different management mode can be flexibly set for each user, effectively improving the intelligence of conference management.

Further, the voiceprint feature information of different users is collected, and voiceprint recognition technology is used to identify different users. Specifically, the voice data of different users is collected, and each user's voiceprint feature data is identified and stored in correspondence with the user name. In subsequent steps, voiceprint feature data is identified from the currently collected audio data and matched against the pre-collected and stored user voiceprint feature data; after a successful match, the user name corresponding to the current voice data is obtained.

S2: setting the permission levels corresponding to the different role types.

A plurality of permission levels are set according to the attributes of the role types themselves, each role type corresponding to one permission level. For example, four permission levels A, B, C, and D can be set, ranked A > B > C > D. Specifically, in a school the principal has the highest permission, set to level A; the logistics supervisor is set to level B; teachers to level C; and students to level D. In a company, the chairman has the highest permission, set to level A; the general manager to level B; department managers to level C; and staff to level D.
S3: a user reserving a conference room.

To facilitate the use of the conference rooms, users are allowed to reserve them. For a reserved conference room, only the reserving user can open the room's access control system during the reserved period. Specifically, a user can log in to the intelligent conference control system over the network; the system obtains the corresponding role type from the logged-in user's user name, displays the conference rooms the user can reserve, and shows which time slots of a room have already been successfully reserved by other users and which are free, reserved slots being displayed as reserved together with the reserving user's role type and unreserved slots as free. Fig. 2 shows the reservation status display of a conference room. The user selects the desired room and time slot and submits a reservation request; the system prompts the user after confirming that the reservation has succeeded, and may also send the user a reservation success message by SMS or similar means, the message including the reserved room and time slot. Further, to facilitate conference management and cope with unexpected situations, the reservation result of an already-reserved room can be changed, as follows. The user selects the desired room and time slot and submits a reservation request; the system judges whether that time slot has already been reserved by another user. If not, the reservation is confirmed as successful, the user is prompted, and a reservation success message including the reserved room and time slot is sent to the user by SMS or similar means. If it has been reserved, the permission levels corresponding to the reserving user's and the requesting user's role types are obtained and compared. When the reserving user's permission level is lower than the requesting user's, the existing reservation is cancelled, the current request is confirmed as successful, and the user is prompted; a cancellation message including the room and the cancelled time slot is sent to the user whose reservation was cancelled, and a reservation success message including the reserved room and time slot is sent to the user whose request succeeded. When the reserving user's permission level is equal to or higher than the requesting user's, the current user is prompted that a reservation change request may be sent to the reserving user; the current user enters the reason for the requested change, and the system sends the reserving user, by SMS or similar means, a reservation change request including the reason, the room, and the corresponding time slot. The reserving user confirms whether to agree and feeds back to the system. If the reserving user agrees, the system cancels the existing reservation, confirms the current request as successful, and prompts the user, sending a cancellation message to the user whose reservation was cancelled and a success message to the user whose request succeeded. If the reserving user does not agree, the current user is told that the change request failed and may also be prompted to choose another time slot.

For example, conference room A has already been successfully reserved by teacher Zhang San for 9:00–10:00 a.m. on December 1st, and the logistics supervisor discovers that the room's equipment is faulty and needs to be repaired during that period. After the logistics supervisor logs in, the system displays the conference rooms he can reserve and their reservation status, showing that conference room A has been reserved for 9:00–10:00 a.m. on December 1st by a user whose role type is teacher. The logistics supervisor can then submit a reservation request for conference room A for that period; the system judges that the slot has been reserved by Zhang San, obtains Zhang San's permission level and the logistics supervisor's permission level, and judges that Zhang San's level is lower. It therefore cancels Zhang San's reservation, confirms the current request as successful, prompts the user, sends Zhang San the message "The conference room A you reserved (9:00–10:00 a.m., December 1st) has been cancelled; we apologize for the inconvenience," and sends the logistics supervisor the message "Your reservation of conference room A (9:00–10:00 a.m., December 1st) has succeeded."

Alternatively, conference room A has already been successfully reserved by teacher Zhang San for 9:00–10:00 a.m. on December 1st, and the project that student Xiao Ming is responsible for requires conference room A during that period for a defense. After Xiao Ming logs in, the system displays the conference rooms he can reserve and their reservation status, showing that conference room A has been reserved for that period by a user whose role type is teacher. Xiao Ming can then submit a reservation request for conference room A for 9:00–10:00 a.m. on December 1st; the system judges that the slot has been reserved by Zhang San, obtains Zhang San's and Xiao Ming's permission levels, and judges that Zhang San's level is higher. It therefore prompts Xiao Ming that he may send a reservation change request to the reserving user. Xiao Ming enters the reason for the requested change, for example: "Dear teacher, student Xiao Ming needs conference room A from 9:00 to 10:00 a.m. on December 1st for a project defense; as the schedules of the invited external reviewers are difficult to change, I respectfully ask to be allowed to use conference room A during that period." The system sends Zhang San the reservation change request by SMS or similar means; Zhang San confirms whether to agree and feeds back to the system. If Zhang San agrees, the system cancels Zhang San's reservation, confirms Xiao Ming's request as successful, prompts the users, sends Zhang San the message "The conference room A you reserved (9:00–10:00 a.m., December 1st) has been cancelled; we apologize for the inconvenience," and sends Xiao Ming the message "Your reservation of conference room A (9:00–10:00 a.m., December 1st) has succeeded." If Zhang San does not agree, Xiao Ming is told that the change request failed and may also be prompted to choose another time slot.

Traditional reservation methods are single and fixed, and once a reservation succeeds it is difficult to adjust it as circumstances change. The reservation approach of the present invention provides different reservation modes according to user role type and permission level and provides a negotiation mechanism between reserving users for rooms that have already been reserved, both ensuring the orderly use of conference rooms and providing an intelligent, flexible adjustment mechanism for various management needs and unexpected situations, giving users great flexibility in using conference rooms.
S4: verifying the user's permission to use the conference room and confirming whether the user is allowed to use it.

S401: collecting the user's voice data via a voice collector installed at the conference room's access control and obtaining the corresponding user's user name.

Voiceprint feature data is identified from the currently collected voice data and matched against the pre-collected and stored user voiceprint feature data; after a successful match, the user name corresponding to the current voice data is obtained.

S402: obtaining the current time and judging whether the conference room has been reserved for that time. If not, the user is allowed to enter and use the room. If it has been reserved, the reserving user's user name is obtained and compared with the user name obtained in S401; if they match, the access control is opened and the user may enter and use the room; if not, the user is told that the room has been reserved and is unavailable at the current time, and the process ends.

S5: collecting conference audio/video data while the meeting is in progress.

Usually the duration of a meeting is uncertain and may be very long, and speakers' voice data does not need to be recorded throughout the whole meeting; recording the entire meeting would waste resources and make the desired content even harder to find. Specifically, the user may manually start, pause, or stop audio/video data collection to record the needed content. In addition, to avoid errors from manual operation, the recording microphone can cyclically detect voice information: when voice information indicating that a speaker has begun speaking is detected, a recording start command is triggered, the speaker's audio/video data is collected, and the speech start time is recorded. According to attributes of the collected audio/video data (for example speech intensity) it is judged whether the current speaker is still speaking or has stopped; when the collected data satisfies a preset condition, for example the speech has stopped for longer than a certain time, the participant's speech is considered finished, a recording pause or stop command is triggered, and the speech end time is recorded. The recording microphone continues cyclically detecting voice information, and when the voice of the next speaker is detected a resume-recording or recording-start command is triggered to record the next speaker's audio/video data. All the speakers are users whose information has been entered and whose roles have been set in step S1.
S6、对录制的音频/视频数据进行预处理后存储。
录制暂停后或者结束后记录下该发言人发言的开始时间和结束时间,并获取该发言人的姓名、用户名。
可选的,读取预先存储的会议议程表,会议议程表存储有会议议程,以 及会议中各发言人的发言时间段。参见图3,9:00~9:10为开幕式,发言人李明对应的发言时间为9:10~9:30,发言人王伟对应的发言时间为9:30~9:50,阶段性总结发言时间为10:30~11:00,大会总结发言时间为16:30~17:00等,根据会议议程表获取当前时间对应的发言人,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中。
或者,利用声纹识别技术识别当前发言的与会者。具体地,根据当前采集的音频/视频数据识别其中的声纹特征数据,将识别出的声纹特征数据与步骤S1中预先采集存储的用户声纹特征数据进行匹配,匹配成功后获取当前发言人的姓名、用户名,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中。
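步骤S6中按会议议程表将发言人与发言时间段关联的处理,可示意如下(议程数据与时间表示方式均为示例假设,时间以当日分钟数表示):

```python
# 会议议程表: ((开始分钟, 结束分钟), 发言人)
AGENDA = [((9 * 60, 9 * 60 + 10), "开幕式"),
          ((9 * 60 + 10, 9 * 60 + 30), "李明"),
          ((9 * 60 + 30, 9 * 60 + 50), "王伟")]

def speaker_at(minute, agenda=AGENDA):
    """按会议议程表查找当前时间对应的发言人。"""
    for (start, end), speaker in agenda:
        if start <= minute < end:
            return speaker
    return None

def annotate(seg_start, seg_end):
    """将发言人与发言开始/结束时间关联, 返回待存储的记录。"""
    return {"发言人": speaker_at(seg_start),
            "发言开始时间": seg_start,
            "发言结束时间": seg_end}
```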
S7、确定不同发言人的权重信息。
发言人在会议中的发言位置通常能够反映其在会议中的地位与作用,例如大会的第一位发言人与最后一位发言人通常会占据较重的地位,或者大会开幕式、中场总结发言与终场总结发言也会在会议中占据重要的地位。因此,根据会议议程表存储的会议议程确定发言人的发言位置,根据发言人的发言位置对不同发言人赋予不同的权重系数A。
另外,发言人的权限等级也能够反映该发言人在会议中占据的地位,根据发言人的用户名获取其对应的权限等级,根据发言人的权限等级对不同发言人赋予不同的权重系数B。
综合发言人对应的权重系数A与权重系数B确定该发言人最终的权重系数C。也可仅利用发言人的权重系数A或者权重系数B作为该发言人的最终权重系数。发言人的权重系数越大表示其发言内容重要程度越大。
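综合权重系数A与权重系数B得到最终权重系数C的一种可能方式为加权平均,可示意如下(alpha 为示例加权比例;如文中所述,也可仅取A或B作为最终权重):

```python
def final_weight(a=None, b=None, alpha=0.5):
    """综合位置权重A与权限权重B得到最终权重C: C = alpha*A + (1-alpha)*B。
    仅提供A或B时, 直接以其作为最终权重系数。"""
    if a is not None and b is not None:
        return alpha * a + (1 - alpha) * b
    return a if a is not None else b
```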
上述为发言人确定权重系数的方式区别于传统单一的针对所有发言人提取发言内容相同的方式,能够根据权重系数来用不同的预设策略获取不同发言人对应的候选关键发言片段集合,针对重要发言提取更多的内容,针对不重要的发言提取相对较少的内容,使最终形成的摘要内容更加合理,为用户提供更有效的帮助。
S8、根据发言人获取其对应的候选关键发言片段集合。
根据发言人的姓名和/或用户名在存储的音频/视频中检索,找到其对应的具体发言片段,利用预设的策略在发言片段中截取候选的关键发言片段。
其中,具有不同权重系数的发言人对应的截取候选关键发言片段的预设策略不同。权重系数越高,截取的音频/视频片段数量越多和/或长度越长;权重系数越低,截取的音频/视频片段数量越少和/或长度越短。
具体的,根据人们的发言习惯,一般一段发言的重要内容出现在0%~5%,10%~30%及80%~100%之间的概率较大,此时,结合该发言人对应发言音频/视频片段的时间轴,截取其特定时间段(例如0%~5%,10%~30%及80%~100%)的音频/视频片段作为候选的关键发言片段。时间段的选择可以根据实际情况进行设置,不同权重系数发言人截取的音频/视频片段数量及长度不同。权重系数越高,截取的特定时间段音频/视频片段数量越多和/或长度越长;权重系数越低,截取的特定时间段音频/视频片段数量越少和/或长度越短。举例说明,发言人李明权重系数为0.9,发言人王伟权重系数为0.7,也即发言人李明权重系数大于王伟时,则(1)针对李明的发言片段截取时间段如下:0%~5%、10%~20%、50%~60%、80%~100%,针对王伟的发言片段截取时间段如下:0%~5%、10%~20%、80%~100%。或者(2)针对李明的发言片段截取时间段如下:0%~5%、10%~20%、50%~60%、80%~100%,针对王伟的发言片段截取时间段如下:0%~5%、10%~15%、50%~60%、90%~100%。或者(3)针对李明的发言片段截取时间段如下:0%~5%、10%~20%、50%~60%、80%~100%,针对王伟的发言片段截取时间段如下:0%~5%、10%~15%、90%~100%。
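按时间轴百分比截取候选关键发言片段、且高权重发言人截取更多区间的策略,可示意如下(区间划分与权重阈值均为示例设置,对应文中举例(1)的划分):

```python
def clip_ranges(duration, weight):
    """duration: 该发言人发言片段总时长(秒); weight: 发言人权重系数。
    按权重选择截取的时间轴百分比区间, 返回以秒计的 (开始, 结束) 列表。"""
    if weight >= 0.8:   # 高权重: 截取的区间数量更多
        ratios = [(0.0, 0.05), (0.10, 0.20), (0.50, 0.60), (0.80, 1.00)]
    else:               # 低权重: 截取的区间数量较少
        ratios = [(0.0, 0.05), (0.10, 0.20), (0.80, 1.00)]
    return [(duration * s, duration * e) for s, e in ratios]
```

例如权重0.9的李明对应4个区间,权重0.7的王伟对应3个区间。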
或者,一些关键转折词、连接词后通常会引出发言的重要内容,例如“首先、其次、然而、最重要的是、最后”,可预先设置关键词库,对发言人对应发言音频/视频片段进行语音识别处理,使用预设的关键词库对识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后预设时间段的音频/视频片段作为候选的关键发言片段。例如在识别出发言人对应的发言音频/视频片段中包括预先设置的关键词“重要的是”时,截取该关键词后1分钟的音频/视频片段作为候选的关键发言片段。通过设置不同的关键词库可以控制匹配 出的关键词数量,通常关键词库中包括的关键词数量越多,使用该关键词库去进行匹配时,识别出的关键词数量相应也会越多。其中预设时间段的长度也可根据实际情况进行调整。因此,不同权重发言人对应的关键词库和/或截取的预设时间段长度不同。权重系数越高,对应的关键词库中关键词数量越多和/或截取的音频/视频片段长度越长;权重系数越低,对应的关键词库中关键词数量越少和/或截取的音频/视频片段长度越短。举例说明,发言人李明权重系数为0.9,发言人王伟权重系数为0.7,也即发言人李明权重系数大于王伟时,关键词库A包括20个关键词,关键词库B包括10个关键词,则(1)使用预设的关键词库A对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后3分钟的音频/视频片段作为候选的关键发言片段;使用预设的关键词库B对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后3分钟的音频/视频片段作为候选的关键发言片段。或者(2)使用预设的关键词库A对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后3分钟的音频/视频片段作为候选的关键发言片段;使用预设的关键词库A对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后1分钟的音频/视频片段作为候选的关键发言片段。或者(3)使用预设的关键词库A对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后3分钟的音频/视频片段作为候选的关键发言片段;使用预设的关键词库B对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后1分钟的音频/视频片段作为候选的关键发言片段。
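基于关键词库匹配、并截取关键词后预设时间段的策略,可示意如下(关键词库与截取长度均为示例假设;实际的语音识别结果在此以带时间戳的文本代替):

```python
KEYWORDS_A = ["首先", "其次", "然而", "最重要的是", "最后"]  # 示例关键词库

def keyword_clips(transcript, keywords=KEYWORDS_A, clip_len=60):
    """transcript: [(时间点秒, 识别文本)] 形式的语音识别结果。
    命中关键词后, 截取其后 clip_len 秒作为候选关键发言片段。"""
    clips = []
    for t, text in transcript:
        if any(kw in text for kw in keywords):
            clips.append((t, t + clip_len))    # 关键词后预设时间段
    return clips
```

不同权重发言人可配置不同规模的 keywords 与不同的 clip_len,以控制截取数量与长度。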
上述结合该发言人对应发言音频/视频片段的时间轴,截取其特定时间段的音频/视频片段作为候选的关键发言片段的方式,以及对发言人对应发言音频/视频片段进行语音识别处理,使用预设的关键词库对识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后预设时间段的音频/视频片段作为候选的关键发言片段的方式,二者可以选择一种作为截取候选的关键发言片段的方式,也可以综合两种方式来截取候选的关键发言片段,例如先采用第一种方式,后采用第二种方式,截取出候选关键发言片段集合。
上述截取候选的关键发言片段的方式能够根据发言内容本身的特点,例如发言的重要内容出现在发言时间轴上的概率较大的位置,或者发言的重要内容所跟的关键转折词、连接词,来截取候选关键发言片段集合,能够大大提升提取出的内容的有效性,并且提取效率高,不会受到环境等其他因素的影响,进一步使最终形成的摘要内容更加合理。
S9、对获取的候选关键发言片段集合进行语音识别处理,筛选定位重点发言内容对应的音频/视频片段集合。
具体地,结合会议主题可确定重点发言内容,所述重点发言内容可以为与会议主题相关的一系列关键词。对步骤S8中获取的候选关键发言片段进行语音识别处理,将其转化为文本数据,转化后的文本数据具有与音频/视频数据相对应的时间轴,可以根据文本数据中的内容定位到相应时间段的音频/视频数据。利用重点发言内容对应的关键词对转化后的文本数据进行筛选,最终确定重点发言内容对应的音频/视频片段集合。
上述结合会议主题可确定重点发言内容进一步提升了提取出的内容的有效性。
S10、对步骤S9中筛选出的音频/视频片段集合进行合成,形成语音摘要。
将步骤S9中筛选出的同一发言人的音频/视频片段集合按时间顺序排序,将排序后的音频/视频片段集合拼接为一段音频/视频,作为该发言人发言内容的语音摘要。进一步地,还可生成整个会议的语音摘要。可根据会议主题、会议议程等信息预先生成摘要的头部信息,例如:“2017年3月30日市场部例会在会议室A举行,与会人员包括:李明、王伟……”,并将上述信息生成头部信息语音文件。再根据会议议程等信息生成摘要中承前启后的过渡信息,例如:“在会议开始张三对当月工作进行总结”,“会议期间,李四、王伟、李明等进行了发言”,“其中王伟发言主要内容为”,“最后,王五对下月工作进行部署,具体内容为”,并将上述信息生成过渡信息语音文件。将头部信息语音文件、过渡信息语音文件、拼接完成的不同发言人的语音摘要按时间序列及对应关系合成到一起,形成会议的语音摘要。例如,生成对应于下列文字的语音摘要文件:2017年3月30日市场部例会在会议室A举行,与会人员包括:李明、王伟……,在会议开始张三对当月工作进行总结,具 体内容为“张三发言语音摘要”;会议期间,李四、王伟、李明等进行了发言,其中王伟发言主要内容为“王伟发言语音摘要”,最后,王五对下月工作进行部署,具体内容为“王五发言语音摘要”。
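步骤S10中将头部信息、过渡信息与各发言人摘要按时间顺序合成的过程,可示意如下(此处以文本拼接代替语音文件合成,仅示意排序与拼接的顺序关系,数据结构为假设):

```python
def synthesize_summary(head, transitions, speaker_clips):
    """head: 头部信息; transitions: 各段过渡信息;
    speaker_clips: [(发言人, [(片段开始时间, 片段内容)])]。
    按对应关系依次拼接, 返回整个会议的摘要。"""
    parts = [head]
    for trans, (_speaker, clips) in zip(transitions, speaker_clips):
        parts.append(trans)
        # 同一发言人的片段先按开始时间排序, 再拼接为一段摘要
        parts.extend(text for _, text in sorted(clips))
    return "".join(parts)
```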
下面结合图4说明根据本发明实施例的会议智能管理系统的结构示意图。
系统400包括如下模块:
用户设置模块401,用于设置用户角色,录入用户信息。
用户设置模块401包括:
输入输出装置4011,接收系统管理员输入的用户角色的类型,例如在学校中,角色类型可包括:教师、学生、后勤主管、校长等。在公司中,角色类型可包括:董事长、总经理、部门经理、职员等。接收每种角色类型对应的多个用户信息,用户信息包括姓名、联系方式(例如手机号码、邮箱),将姓名作为用户的用户名,存在重名的情况下,可为重名用户分配序号进行区分,例如小明1、小明2,将带序号的姓名作为该用户的用户名,每位用户均拥有唯一的用户名。
存储装置4012,存储用户角色类型,并针对每种角色类型对应的多个用户信息关联存储。将角色类型与用户名对应存储,存储方式例如为映射表。通过上述用户角色等级的区分以及与每位用户的唯一关联,可以灵活针对每位用户设置不同的使用方式,有效提高了会议管理的智能性。
声纹特征信息采集模块4013,采集不同用户的声纹特征信息,利用声纹识别技术识别不同用户。具体地,采集不同用户的声音数据,识别用户的声纹特征数据并将其与用户名对应存储于存储装置4012。
系统400还包括权限等级设置模块402,用于设置不同角色类型对应的权限等级。
根据角色类型自身的属性设置多个权限等级,每种角色类型对应一个权限等级。例如可设置四个权限等级A、B、C、D,且权限高低次序为A>B>C>D。具体地,在学校中,校长权限最高,权限等级设置为A;后勤主管权限等级设置为B;教师权限等级为C;学生权限等级为D。在公司中,董事长权限最高,权限等级设置为A;总经理权限等级设置为B;部门经理权限等级为C;职员权限等级为D。通过上述用户权限等级的区分,可以灵活针对每位用户设置不同的预约方式,有效提高了会议预约的灵活性。
系统400还包括会议室预约模块403,用于供用户预约会议室。
为了方便各个会议室的使用,允许用户预约会议室。经预约的会议室,在预约时间段内只有预约用户可以打开该会议室的门禁系统。
具体地,会议室预约模块403包括预约请求模块4031,用户可通过网络登录智能会议控制系统400,预约请求模块4031根据登录用户的用户名获取其对应的角色类型,显示其能够预约的会议室,同时显示该会议室哪些时间段已被其他用户预约成功,哪些时间段空闲,对于已被预约成功的时间段显示为已预约并显示预约用户的角色类型,对于未被预约成功的时间段显示为空闲。用户通过预约请求模块4031选择所需的会议室及所需的时间段,提交预约请求,系统确认预约成功后提示用户,还可在预约成功后通过短信等方式向用户发送预约成功信息,所述预约成功信息包括预约的会议室以及预约时间段。
进一步地,为了方便会议室的管理以及应付一些突发情况,会议室预约模块403包括预约变更请求模块4032,用于对已经预约的会议室的预约结果进行变更。用户通过预约请求模块4031选择所需的会议室及所需时间段,提交预约请求,系统判断该会议室的该时间段是否已被其他用户预约,若判断未被其他用户预约,则确认预约成功并提示用户,通过短信等方式向用户发送预约成功信息,所述预约成功信息包括预约的会议室以及预约时间段;若判断已被其他用户预约,预约变更请求模块4032则获取已预约用户角色类型对应的权限等级,同时获取当前提交预约请求用户的角色类型对应的权限等级,判断二者权限等级高低,当已预约用户角色类型对应的权限等级低于当前提交预约请求用户的角色类型对应的权限等级时,则取消已预约用户的预约,确认当前提交的预约请求成功并提示用户,通过短信等方式向被取消预约的用户发送预约被取消的信息,所述预约被取消的信息包括被预约的会议室以及被取消的预约时间段,同时,通过短信等方式向当前提交的预约请求成功的用户发送预约成功信息,所述预约成功信息包括预约的会议室以及预 约时间段。当已预约用户角色类型对应的权限等级等于或者高于当前提交预约请求用户的角色类型对应的权限等级时,提示当前用户可向已预约的用户发送预约变更请求,当前用户输入请求变更的理由,预约变更请求模块4032通过短信等方式向已预约的用户发送预约变更请求,所述预约变更请求包括所述请求变更的理由、请求变更预约的会议室及对应的时间段,已预约的用户确认是否同意预约变更请求并向系统反馈,如果已预约的用户确认同意所述预约变更请求,系统则取消已预约用户的预约,确认当前提交的预约请求成功并提示用户,通过短信等方式向被取消预约的用户发送预约被取消的信息,所述预约被取消的信息包括被预约的会议室以及被取消的预约时间段,同时,通过短信等方式向当前提交的预约请求成功的用户发送预约成功信息,所述预约成功信息包括预约的会议室以及预约时间段。如果已预约的用户不同意所述预约变更请求,预约变更请求模块4032向当前用户提示预约变更请求失败,还可提示当前用户选择其他时间段预约。
例如会议室A在12月1日上午9:00~10:00已经被教师张三预约成功,后勤主管发现该会议室设备存在问题需要在12月1日上午9:00~10:00时间段进行维修,后勤主管在登录系统后,系统显示其能够预约的会议室以及该会议室的预约情况,其中会议室A在12月1日上午9:00~10:00已经被角色类型为教师的用户预约成功。此时,后勤主管可针对会议室A在12月1日上午9:00~10:00,提交预约请求,系统判断会议室A该时间段已被张三预约,则获取张三对应的权限等级,同时获取后勤主管的权限等级,判断张三的权限等级低于后勤主管的权限等级,则取消张三的预约,确认当前提交的预约请求成功并提示用户,向张三发送消息“您预约的会议室A(12月1日上午9:00~10:00)已被取消,请谅解”,向后勤主管发送消息“您预约的会议室A(12月1日上午9:00~10:00)已预约成功”。
或者会议室A在12月1日上午9:00~10:00已经被教师张三预约成功,学生小明负责的课题需要使用会议室A在12月1日上午9:00~10:00进行答辩,小明在登录系统后,系统显示其能够预约的会议室以及该会议室的预约情况,其中会议室A在12月1日上午9:00~10:00已经被角色类型为教师的用户预约成功。此时,小明可针对会议室A在12月1日上午9:00~10:00,提交预约请求,系统判断会议室A该时间段已被张三预约,则获取张三对应的权限等级,同时获取小明的权限等级,判断张三的权限等级高于小明的权限等级,则提示小明可向已预约的用户发送预约变更请求,小明输入请求变更的理由,例如“老师好!学生小明于12月1日上午9:00~10:00需要用会议室A进行课题答辩,因外请评审专家时间难以变更,冒昧请求老师能够允许小明12月1日上午9:00~10:00使用会议室A”,系统通过短信等方式向张三发送预约变更请求,张三确认是否同意预约变更请求并向系统反馈,如果张三确认同意所述预约变更请求,系统则取消张三的预约,确认小明提交的预约请求成功并提示用户,向张三发送消息“您预约的会议室A(12月1日上午9:00~10:00)已被取消,请谅解”,向小明发送消息“您预约的会议室A(12月1日上午9:00~10:00)已预约成功”。如果张三不同意所述预约变更请求,则向小明提示预约变更请求失败,还可提示小明选择其他时间段预约。
传统的预约方式单一、固定,预约成功后难以根据情况变化做出适当的调整,通过本发明的预约方式,根据用户角色类型、权限等级,提供不同的预约方式,并针对预约成功的会议室在预约人员之间提供协商机制,既保障了会议室的有序使用,又为各种管理需求以及突发情况的处理提供了智能、灵活的调整方式,为用户使用会议室提供了极大的灵活性。
系统400还包括会议室使用权限验证模块404,用于验证用户会议室使用权限,确认是否允许该用户使用该会议室,包括:
语音数据采集模块4041,用于利用会议室门禁处设置的语音采集器,采集用户语音数据,获取对应用户的用户名。
根据当前采集的语音数据识别其中的声纹特征数据,将识别出的声纹特征数据与预先采集存储的用户声纹特征数据进行匹配,匹配成功后获取当前语音数据对应用户的用户名。
预约判断模块4042,用于获取当前时间,判断当前时间该会议室是否已被预约。如果未被预约,则允许用户进入并使用该会议室。如果已被预约,则获取预约该会议室的用户名,与语音数据采集模块4041获取的用户名比较是否一致,一致则开放该会议室门禁,允许用户进入并使用该会议室;不一致则提示用户该会议室已被预约,当前时间不可用,并结束流程。
系统400还包括数据采集模块405,用于在会议进行期间,采集会议音频/视频数据。
通常情况下,会议持续时间并不确定,可能进行很长时间,并非在整个会议过程均有发言者的语音数据需要记录,此时若在会议期间全程录制会造成资源浪费,也会进一步增加对象内容的查找难度。具体地,可由用户手动启动、暂停或停止音频/视频数据采集以录制需要的内容。此外,为了避免用户手动操作产生失误,可令录制用麦克风循环检测语音信息,当检测到发言人开始发言的语音信息时,触发录制开始命令,采集发言人的音频/视频数据,记录下发言开始时间。根据采集到的音频/视频数据的属性(例如语音强度大小)判断当前发言人的发言在继续还是已经停止,当采集到的音频/视频数据满足预设条件时,例如发言停止超过一定时间则认为该与会者发言结束,触发录制暂停或停止命令,记录下发言结束时间。录制用麦克风继续循环检测语音信息,检测到下一发言人开始发言的语音信息时,触发继续录制命令或者录制开始命令,录制下一位发言人的音频/视频数据。所述发言人均为用户设置模块401中已录入用户信息并设置用户角色的用户。
系统400还包括音频/视频数据预处理模块406,用于对录制的音频/视频数据进行预处理后存储。
数据采集模块405录制暂停后或者结束后由音频/视频数据预处理模块406记录下该发言人发言的开始时间和结束时间,并获取该发言人的姓名、用户名。
可选的,音频/视频数据预处理模块406包括会议议程处理模块4061,用于读取预先存储的会议议程表,会议议程表存储有会议议程,以及会议中各发言人的发言时间段。参见图3,9:00~9:10为开幕式,发言人李明对应的发言时间为9:10~9:30,发言人王伟对应的发言时间为9:30~9:50,阶段性总结发言时间为10:30~11:00,大会总结发言时间为16:30~17:00等,根据会议议程表获取当前时间对应的发言人,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中。
可选地,音频/视频数据预处理模块406包括声纹识别模块4062,用于利用声纹识别技术识别当前发言的与会者。声纹识别模块4062根据当前采集的音频/视频数据识别其中的声纹特征数据,将识别出的声纹特征数据与声纹特征信息采集模块4013采集存储的用户声纹特征数据进行匹配,匹配成功后获取当前发言人的姓名,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中。
系统400还包括发言人权重确定模块407,用于确定不同发言人的权重系数。
发言人在会议中的发言位置通常能够反映其在会议中的地位与作用,例如大会的第一位发言人与最后一位发言人通常会占据较重的地位,或者大会开幕式、中场总结发言与终场总结发言也会在会议中占据重要的地位。可选地,发言人权重确定模块407根据会议议程表存储的会议议程确定发言人的发言位置,根据发言人的发言位置对不同发言人赋予不同的权重系数A。
另外,发言人的权限等级也能够反映该发言人在会议中占据的地位。可选地,发言人权重确定模块407可根据发言人的用户名获取其对应的权限等级,根据发言人的权限等级对不同发言人赋予不同的权重系数B。
发言人权重确定模块407综合发言人对应的权重系数A与权重系数B确定该发言人最终的权重系数C。发言人权重确定模块407也可仅利用发言人的权重系数A或者权重系数B作为该发言人的最终权重系数。发言人的权重系数越大表示其发言内容重要程度越大。
上述为发言人确定权重系数的方式区别于传统单一的针对所有发言人提取发言内容相同的方式,能够根据权重系数来用不同的预设策略获取不同发言人对应的候选关键发言片段集合,针对重要发言提取更多的内容,针对不重要的发言提取相对较少的内容,使最终形成的摘要内容更加合理,为用户提供更有效的帮助。
系统400还包括候选关键发言片段集合获取模块408,用于根据发言人获取其对应的候选关键发言片段集合。
候选关键发言片段集合获取模块408根据发言人的姓名在存储的音频/视频中检索,找到其对应的具体发言片段,利用预设的策略在发言片段中截取候选的关键发言片段。
其中,具有不同权重系数的发言人对应的截取候选关键发言片段的预设策略不同。权重系数越高,截取的音频/视频片段数量越多和/或长度越长;权重系数越低,截取的音频/视频片段数量越少和/或长度越短。
根据人们的发言习惯,一般一段发言的重要内容出现在0%~5%,10%~30%及80%~100%之间的概率较大,可选地,候选关键发言片段集合获取模块408包括时间段截取模块4081,用于结合该发言人对应发言音频/视频片段的时间轴,截取其特定时间段(例如0%~5%,10%~30%及80%~100%)的音频/视频片段作为候选的关键发言片段。时间段的选择可以根据实际情况进行设置,不同权重系数发言人截取的音频/视频片段数量及长度不同。权重系数越高,截取的特定时间段音频/视频片段数量越多和/或长度越长;权重系数越低,截取的特定时间段音频/视频片段数量越少和/或长度越短。举例说明,发言人李明权重系数为0.9,发言人王伟权重系数为0.7,也即发言人李明权重系数大于王伟时,则(1)针对李明的发言片段截取时间段如下:0%~5%、10%~20%、50%~60%、80%~100%,针对王伟的发言片段截取时间段如下:0%~5%、10%~20%、80%~100%。或者(2)针对李明的发言片段截取时间段如下:0%~5%、10%~20%、50%~60%、80%~100%,针对王伟的发言片段截取时间段如下:0%~5%、10%~15%、50%~60%、90%~100%。或者(3)针对李明的发言片段截取时间段如下:0%~5%、10%~20%、50%~60%、80%~100%,针对王伟的发言片段截取时间段如下:0%~5%、10%~15%、90%~100%。
或者,一些关键转折词、连接词后通常会引出发言的重要内容,例如“首先、其次、然而、最重要的是、最后”,可选地,候选关键发言片段集合获取模块408包括关键词截取模块4082,预先设置关键词库,对发言人对应发言音频/视频片段进行语音识别处理,使用预设的关键词库对识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后预设时间段的音频/视频片段作为候选的关键发言片段。例如在识别出发言人对应的发言音频/视频片段中包括预先设置的关键词“重要的是”时,关键词截取模块4082截取该关键词后 1分钟的音频/视频片段作为候选的关键发言片段。通过设置不同的关键词库可以控制匹配出的关键词数量,通常关键词库中包括的关键词数量越多,使用该关键词库去进行匹配时,识别出的关键词数量相应也会越多。其中预设时间段的长度也可根据实际情况进行调整。因此,不同权重发言人对应的关键词库和/或截取的预设时间段长度不同。权重系数越高,对应的关键词库中关键词数量越多和/或截取的音频/视频片段长度越长;权重系数越低,对应的关键词库中关键词数量越少和/或截取的音频/视频片段长度越短。举例说明,发言人李明权重系数为0.9,发言人王伟权重系数为0.7,也即发言人李明权重系数大于王伟时,关键词库A包括20个关键词,关键词库B包括10个关键词,则(1)使用预设的关键词库A对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后3分钟的音频/视频片段作为候选的关键发言片段;使用预设的关键词库B对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后3分钟的音频/视频片段作为候选的关键发言片段。或者(2)使用预设的关键词库A对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后3分钟的音频/视频片段作为候选的关键发言片段;使用预设的关键词库A对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后1分钟的音频/视频片段作为候选的关键发言片段。或者(3)使用预设的关键词库A对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后3分钟的音频/视频片段作为候选的关键发言片段;使用预设的关键词库B对李明对应的发言音频/视频片段识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后1分钟的音频/视频片段作为候选的关键发言片段。
上述时间段截取模块4081,关键词截取模块4082,二者可以单独存在,也可以综合两个模块来截取候选的关键发言片段,例如先采用时间段截取模块4081截取,再采用关键词截取模块4082截取,截取出候选关键发言片段集合。
上述截取候选的关键发言片段的方式能够根据发言内容本身的特点,例如发言的重要内容出现在发言时间轴上的概率较大的位置,或者发言的重要内容所跟的关键转折词、连接词,来截取候选关键发言片段集合,能够大大提升提取出的内容的有效性,并且提取效率高,不会受到环境等其他因素的影响,进一步使最终形成的摘要内容更加合理。
系统400还包括音频/视频片段集合筛选模块409,用于对获取的候选关键发言片段集合进行语音识别处理,筛选定位重点发言内容对应的音频/视频片段集合。
音频/视频片段集合筛选模块409可结合会议主题确定重点发言内容,所述重点发言内容可以为与会议主题相关的一系列关键词。音频/视频片段集合筛选模块409对候选关键发言片段集合获取模块408获取的候选关键发言片段进行语音识别处理,将其转化为文本数据,转化后的文本数据具有与音频/视频数据相对应的时间轴,能够根据文本数据中的内容定位到相应时间段的音频/视频数据。利用重点发言内容对应的关键词对转化后的文本数据进行筛选,最终确定重点发言内容对应的音频/视频片段集合。
上述结合会议主题可确定重点发言内容进一步提升了提取出的内容的有效性。
系统400还包括语音摘要合成模块410,用于对音频/视频片段集合筛选模块409筛选出的音频/视频片段集合进行合成,形成语音摘要。
可选地,语音摘要合成模块410包括发言人语音摘要合成模块4101,用于将音频/视频片段集合筛选模块409筛选出的同一发言人的音频/视频片段集合按时间顺序排序,将排序后的音频/视频片段集合拼接为一段音频/视频,作为该发言人发言内容的语音摘要。进一步地,语音摘要合成模块410还包括会议语音摘要合成模块4102,用于生成整个会议的语音摘要。会议语音摘要合成模块4102可根据会议主题、会议议程等信息生成摘要的头部信息,例如:“2017年人工智能大会在上海举行,为期3天,与会人员包括:李明、王伟……”,并将上述信息生成头部信息语音文件。会议语音摘要合成模块4102再根据会议议程等信息生成摘要中承前启后的过渡信息,例如:“在开幕式上张三对会议进行致辞”,“会议期间,李四、王伟、李明等进行了发言”,“其中王伟发言主要内容为”,“最后,王五对会议进行总结,具体内容为”, 并将上述信息生成过渡信息语音文件。会议语音摘要合成模块4102将头部信息语音文件、过渡信息语音文件、拼接完成的不同发言人的语音摘要按对应关系合成到一起,形成会议的语音摘要。例如,生成对应于下列文字的语音摘要文件:2017年人工智能大会在上海举行,为期3天,与会人员包括:李明、王伟……,在开幕式上张三对会议进行致辞,具体内容为“张三致辞语音摘要”;会议期间,李四、王伟、李明等进行了发言,其中王伟发言主要内容为“王伟发言语音摘要”,最后,王五对会议进行总结,具体内容为“王五总结语音摘要”。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
上述以软件功能单元的形式实现的集成的单元,可以存储在一个计算机可读取存储介质中。上述软件功能单元存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或处理器(processor)执行本申请各个实施例所述方法的部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。
工业实用性
本发明能够根据不同用户角色、等级权限提供多种会议室的预约方式,避免会议室的无序使用。通过用户角色与用户权限等级的区分,可以灵活的提供预约方式,通过对已经预约的会议室的预约结果根据不同用户的权限等级的不同进行不同方式的变更,方便会议的管理以及协调突发情况,并且在已预约用户与后来用户之间建立沟通协调机制,允许后来者与已预约用户之间就会议室的紧急需要进行沟通协调,为用户使用会议室提供了更大的灵活性。会议召开后,识别不同发言者的关键发言内容,并自动合成语音形式的会议摘要。通过检测发言人的语音信息自动化的开启和停止音频/视频录制,有效减少了无效内容的录制,节约了存储资源,并且减少了录制时间的长度,方便用户后续查找定位所需内容。通过分析发言人在会议中的发言位置、身份信息、个人资料等信息,确定发言人的权重系数,从而根据权重系数来用不同的预设策略获取不同发言人对应的候选关键发言片段集合,能够针对重要发言提取更多的内容,针对不重要的发言提取相对较少的内容,使最终形成的摘要内容更加合理,为用户提供更有效的帮助。根据发言内容本身的特点,例如发言的重要内容出现在发言时间轴上的概率较大的位置,或者发言的重要内容所跟的关键转折词、连接词,来截取候选关键发言片段集合,再对截取的候选关键发言片段集合进行处理以获取形成语音摘要的音频/视频片段集合,能够大大提升提取出的内容的有效性,并且提取效率高,不会受到环境等其他因素的影响,进一步使最终形成的摘要内容更加合理。

Claims (22)

  1. 一种会议智能管理方法,其特征在于包括如下步骤:
    S1、设置用户角色,录入用户信息,其中,预先设置所述用户角色的类型,针对每种角色类型录入对应的多个所述用户信息并存储,每位用户均拥有唯一的用户名;采集不同用户的声纹特征信息,其中,利用声纹识别技术识别不同用户,采集不同用户的声音数据,识别用户的声纹特征数据并将其与用户名对应存储;
    S2、设置不同所述角色类型对应的权限等级,其中,根据多个所述角色类型自身的属性设置多个权限等级,每种所述角色类型对应一个所述权限等级;
    S3、用户预约会议室,经预约的会议室,在预约时间段内只有预约用户可以打开所述会议室的门禁系统,其中,用户能够对已经预约的会议室的预约结果提出变更请求,根据用户不同角色类型进行不同方式的变更处理;
    S4、验证用户的所述会议室使用权限,确认是否允许所述用户使用所述会议室;
    S5、在会议进行期间,采集会议音频/视频数据;
    S6、对录制的音频/视频数据进行预处理后存储,其中,记录下该发言人发言的开始时间和结束时间,并获取该发言人的姓名、用户名,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理并存储;
    S7、确定不同发言人的权重系数;
    S8、根据发言人获取其对应的候选关键发言片段集合,其中,根据发言人的姓名在存储的音频/视频中检索,找到其对应的具体发言片段,利用预设的策略在发言片段中截取候选的关键发言片段,具有不同权重系数的发言人对应的截取候选关键发言片段的预设策略不同;
    S9、对获取的候选关键发言片段集合进行语音识别处理,筛选定位重点发言内容对应的音频/视频片段集合;
    S10、对步骤S9中筛选出的音频/视频片段集合进行合成,形成语音摘要。
  2. 根据权利要求1所述的会议智能管理方法,其特征在于, 所述步骤S3中,用户通过网络预约会议室,获取用户的用户名,根据所述用户名获取其对应的角色类型,显示其能够预约的会议室,以及所述会议室允许工作时间段,同时显示该会议室哪些时间段已被其他用户预约成功,哪些时间段空闲,对于已被预约成功的时间段显示为已预约并显示预约用户的角色类型,对于未被预约成功的时间段显示为空闲,用户选择所需的会议室及所需的时间段,提交预约请求,系统确认预约成功后提示用户。
  3. 根据权利要求1或2所述的会议智能管理方法,其特征在于,所述用户选择所需的会议室及所需的时间段,提交预约请求,系统确认预约成功后提示用户进一步包括:
    用户选择所需的会议室及所需时间段,提交预约请求,系统判断该会议室的该时间段是否已被其他用户预约,若判断未被其他用户预约,则确认预约成功并提示用户;
    若判断已被其他用户预约,则获取已预约用户角色类型对应的权限等级,同时获取当前提交预约请求用户的角色类型对应的权限等级,判断二者权限等级高低,当已预约用户角色类型对应的权限等级低于当前提交预约请求用户的角色类型对应的权限等级时,则取消已预约用户的预约,确认当前提交的预约请求成功并提示用户;当已预约用户角色类型对应的权限等级等于或者高于当前提交预约请求用户的角色类型对应的权限等级时,提示当前用户可向已预约的用户发送预约变更请求,当前用户输入请求变更的理由,向已预约的用户发送预约变更请求,所述预约变更请求包括所述请求变更的理由、请求变更预约的会议室及对应的时间段,已预约的用户确认是否同意预约变更请求并向系统反馈,如果已预约的用户确认同意所述预约变更请求,系统则取消已预约用户的预约,确认当前提交的预约请求成功并提示用户,如果已预约的用户不同意所述预约变更请求,则向当前用户提示预约变更请求失败,提示当前用户选择其他时间段预约。
  4. 根据权利要求1所述的会议智能管理方法,其特征在于,所述步骤S4还包括:
    S401、利用会议室门禁处设置的语音采集器,采集用户语音数据,获取对应用户的用户名,其中,根据当前采集的语音数据识别其中的声纹特征数据, 将识别出的声纹特征数据与预先采集存储的用户声纹特征数据进行匹配,匹配成功后获取当前语音数据对应用户的用户名;
    S402、获取当前时间,判断当前时间该会议室是否已被预约,如果未被预约,则允许用户进入并使用该会议室。如果已被预约,则获取预约该会议室的用户名,与S401中获取的用户名比较是否一致,一致则开放该会议室门禁,允许用户进入并使用该会议室;不一致则提示用户该会议室已被预约,当前时间不可用,并结束流程。
  5. 根据权利要求3或4所述的会议智能管理方法,其特征在于,所述步骤S5进一步包括:
    由用户手动启动和停止音频/视频数据采集以录制需要的内容;
    或者,令录制用麦克风循环检测语音信息,当检测到发言人开始发言的语音信息时,触发录制开始命令,采集发言人的音频/视频数据,记录下发言开始时间,根据采集到的音频/视频数据的属性判断当前发言人的发言在继续还是已经停止,当采集到的音频/视频数据满足预设条件时,触发录制暂停或停止命令,记录下发言结束时间,录制用麦克风继续循环检测语音信息,检测到下一发言人开始发言的语音信息时,触发继续录制命令或者录制开始命令,录制下一位发言人的音频/视频数据。
  6. 根据权利要求5所述的会议智能管理方法,其特征在于,所述步骤S6进一步包括:
    读取预先存储的会议议程表,会议议程表存储有会议议程,以及会议中各发言人的发言时间段,根据会议议程表获取当前时间对应的发言人,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中;
    或者,根据当前采集的音频/视频数据识别其中的声纹特征数据,将识别出的声纹特征数据与步骤S1中采集存储的用户声纹特征数据进行匹配,匹配成功后获取当前发言人的姓名、用户名,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中。
  7. 根据权利要求6所述的会议智能管理方法,其特征在于,所述步骤S7进 一步包括:
    根据会议议程表存储的会议议程确定发言人的发言位置,根据发言人的发言位置对不同发言人赋予不同的权重系数A;
    和/或
    根据发言人的用户名获取其对应的权限等级,根据发言人的权限等级对不同发言人赋予不同的权重系数B。
  8. 根据权利要求1所述的会议智能管理方法,其特征在于,所述步骤S8进一步包括:
    结合该发言人对应发言音频/视频片段的时间轴,截取其特定时间段的音频/视频片段作为候选的关键发言片段,权重系数越高,截取的特定时间段音频/视频片段数量越多和/或长度越长;权重系数越低,截取的特定时间段音频/视频片段数量越少和/或长度越短。
  9. 根据权利要求1所述的会议智能管理方法,其特征在于,所述步骤S8进一步包括:
    预先设置关键词库,对发言人对应发言音频/视频片段进行语音识别处理,使用预设的关键词库对识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后预设时间段的音频/视频片段作为候选的关键发言片段,不同权重发言人对应的关键词库和/或截取的预设时间段长度不同,权重系数越高,对应的关键词库中关键词数量越多和/或截取的音频/视频片段长度越长,权重系数越低,对应的关键词库中关键词数量越少和/或截取的音频/视频片段长度越短。
  10. 根据权利要求1所述的会议智能管理方法,其特征在于,所述步骤S9进一步包括:
    结合会议主题确定重点发言内容,对步骤S8中获取的候选关键发言片段进行语音识别处理,将其转化为文本数据,转化后的文本数据具有与音频/视频数据相对应的时间轴,根据文本数据中的内容能够定位到相应时间段的音频/视频数据,利用重点发言内容对应的关键词对转化后的文本数据进行筛选,最终确定重点发言内容对应的音频/视频片段集合。
  11. 根据权利要求1所述的会议智能管理方法,其特征在于,所述步骤S10进一步包括:
    将步骤S9中筛选出的同一发言人的音频/视频片段集合按时间顺序排序,将排序后的音频/视频片段集合拼接为一段音频/视频,作为该发言人发言内容的语音摘要;根据会议主题、会议议程等信息预先生成摘要的头部信息,并将上述信息生成头部信息语音文件,再根据会议议程等信息生成摘要中承前启后的过渡信息,并将上述过渡信息生成过渡信息语音文件,将头部信息语音文件、过渡信息语音文件、拼接完成的不同发言人的语音摘要按对应关系合成到一起,形成会议的语音摘要。
  12. 一种会议的智能管理系统,其特征在于,包括:
    用户设置模块,用于设置用户角色,录入用户信息,其中,预先设置所述用户角色的类型,针对每种角色类型录入对应的多个所述用户信息并存储,每位用户均拥有唯一的用户名;采集不同用户的声纹特征信息,其中,利用声纹识别技术识别不同用户,采集不同用户的声音数据,识别用户的声纹特征数据并将其与用户名对应存储;
    权限等级设置模块,用于设置不同角色类型对应的权限等级,其中,根据角色类型自身的属性设置多个权限等级,每种角色类型对应一个权限等级;
    会议室预约模块,用于供用户预约会议室,经预约的会议室,在预约时间段内只有预约用户可以打开所述会议室的门禁系统,其中,用户能够对已经预约的会议室的预约结果提出变更请求,根据用户不同角色类型进行不同方式的变更处理;
    会议室使用权限验证模块,用于验证用户会议室使用权限,确认是否允许该用户使用该会议室;
    数据采集模块,用于在会议进行期间,采集会议音频/视频数据;
    音频/视频数据预处理模块,用于对录制的音频/视频数据进行预处理后存储,其中,记录下该发言人发言的开始时间和结束时间,并获取该发言人的姓名、用户名,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中;
    发言人权重确定模块,用于确定不同发言人的权重系数;
    候选关键发言片段集合获取模块,用于根据发言人获取其对应的候选关键发言片段集合,其中,候选关键发言片段集合获取模块根据发言人的姓名在存储的音频/视频中检索,找到其对应的具体发言片段,利用预设的策略在发言片段中截取候选的关键发言片段,具有不同权重系数的发言人对应的截取候选关键发言片段的预设策略不同;
    音频/视频片段集合筛选模块,用于对获取的候选关键发言片段集合进行语音识别处理,筛选定位重点发言内容对应的音频/视频片段集合;
    语音摘要合成模块,用于对音频/视频片段集合筛选模块筛选出的音频/视频片段集合进行合成,形成语音摘要。
  13. 根据权利要求12所述的会议智能管理系统,其特征在于:
    所述会议室预约模块进一步包括:
    预约请求模块,用户可通过网络预约会议室,预约请求模块获取用户的用户名,根据所述用户名获取其对应的角色类型,根据角色类型显示其能够预约的会议室,同时显示该会议室哪些时间段已被其他用户预约成功,哪些时间段空闲,对于已被预约成功的时间段显示为已预约并显示预约用户的角色类型,对于未被预约成功的时间段显示为空闲,用户通过预约请求模块选择所需的会议室及所需的时间段,提交预约请求,系统确认预约成功后提示用户。
  14. 根据权利要求12或13所述的会议智能管理系统,其特征在于:
    所述会议室预约模块进一步包括:
    预约变更请求模块,用于对已经预约的会议室的预约结果进行变更,用户通过预约请求模块选择所需的会议室及所需的时间段,提交预约请求,系统判断该会议室的该时间段是否已被其他用户预约,若判断未被其他用户预约,则确认预约成功并提示用户;
    若判断已被其他用户预约,预约变更请求模块则获取已预约用户角色类型对应的权限等级,同时获取当前提交预约请求用户的角色类型对应的权限等级,判断二者权限等级高低,当已预约用户角色类型对应的权限等级低于当前提 交预约请求用户的角色类型对应的权限等级时,则取消已预约用户的预约,确认当前提交的预约请求成功并提示用户,当已预约用户角色类型对应的权限等级等于或者高于当前提交预约请求用户的角色类型对应的权限等级时,提示当前用户可向已预约的用户发送预约变更请求,当前用户输入请求变更的理由,预约变更请求模块向已预约的用户发送预约变更请求,所述预约变更请求包括所述请求变更的理由、请求变更预约的会议室及对应的时间段,已预约的用户确认是否同意预约变更请求并向系统反馈,如果已预约的用户确认同意所述预约变更请求,系统则取消已预约用户的预约,确认当前提交的预约请求成功并提示用户,如果已预约的用户不同意所述预约变更请求,预约变更请求模块向当前用户提示预约变更请求失败,还可提示当前用户选择其他时间段预约。
  15. 根据权利要求12所述的会议智能管理系统,其特征在于,所述会议室使用权限验证模块进一步包括:
    语音数据采集模块,用于利用会议室门禁处设置的语音采集器,采集用户语音数据,获取对应用户的用户名,根据当前采集的语音数据识别其中的声纹特征数据,将识别出的声纹特征数据与预先采集存储的用户声纹特征数据进行匹配,匹配成功后获取当前语音数据对应用户的用户名;
    预约判断模块,用于获取当前时间,判断当前时间该会议室是否已被预约,如果未被预约,则允许用户进入并使用该会议室,如果已被预约,则获取预约该会议室的用户名,与语音数据采集模块获取的用户名比较是否一致,一致则开放该会议室门禁,允许用户进入并使用该会议室;不一致则提示用户该会议室已被预约,当前时间不可用,并结束流程。
  16. 根据权利要求14或15所述的会议智能管理系统,其特征在于,所述数据采集模块进一步用于:
    由用户手动启动和停止音频/视频数据采集以录制需要的内容;
    或者,令录制用麦克风循环检测语音信息,当检测到发言人开始发言的语音信息时,触发录制开始命令,采集发言人的音频/视频数据,记录下发言开始时间,根据采集到的音频/视频数据的属性判断当前发言人的发言在继续还是 已经停止,当采集到的音频/视频数据满足预设条件时,触发录制暂停或停止命令,记录下发言结束时间,录制用麦克风继续循环检测语音信息,检测到下一发言人开始发言的语音信息时,触发继续录制命令或者录制开始命令,录制下一位发言人的音频/视频数据。
  17. 根据权利要求16所述的会议智能管理系统,其特征在于,所述音频/视频数据预处理模块进一步包括:
    会议议程处理模块,用于读取预先存储的会议议程表,会议议程表存储有会议议程,以及会议中各发言人的发言时间段。根据会议议程表获取当前时间对应的发言人,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中;
    声纹识别模块,预先采集与会发言者的声音数据,识别与会发言者的声纹特征数据并将其与与会发言者姓名对应存储,声纹识别模块根据当前采集的音频/视频数据识别其中的声纹特征数据,将识别出的声纹特征数据与用户设置模块采集存储的用户声纹特征数据进行匹配,匹配成功后获取当前发言人的姓名、用户名,将发言人、发言开始时间、发言结束时间与采集的音频/视频数据关联处理,并存储于存储装置中。
  18. 根据权利要求17所述的会议智能管理系统,其特征在于,发言人权重确定模块进一步用于:
    根据会议议程表存储的会议议程确定发言人的发言位置,根据发言人的发言位置对不同发言人赋予不同的权重系数A;
    和/或
    通过网络搜索发言人的相关身份信息和/或个人资料,根据获取的身份信息基于预设算法计算该发言人的权重系数B。
  19. 根据权利要求12所述的会议智能管理系统,其特征在于,候选关键发言片段集合获取模块进一步包括:
    时间段截取模块,用于结合该发言人对应发言音频/视频片段的时间轴,截取其特定时间段的音频/视频片段作为候选的关键发言片段,权重系数越高,截取的特定时间段音频/视频片段数量越多和/或长度越长;权重系数越低,截取的特定时间段音频/视频片段数量越少和/或长度越短。
  20. 根据权利要求12所述的会议智能管理系统,其特征在于,候选关键发言片段集合获取模块进一步包括:
    关键词截取模块,预先设置关键词库,对发言人对应发言音频/视频片段进行语音识别处理,使用预设的关键词库对识别出的语音信息进行匹配,匹配成功后截取识别出的关键词后预设时间段的音频/视频片段作为候选的关键发言片段,不同权重发言人对应的关键词库和/或截取的预设时间段长度不同,权重系数越高,对应的关键词库中关键词数量越多和/或截取的音频/视频片段长度越长,权重系数越低,对应的关键词库中关键词数量越少和/或截取的音频/视频片段长度越短。
  21. 根据权利要求12所述的会议智能管理系统,其特征在于,音频/视频片段集合筛选模块进一步用于,结合会议主题确定重点发言内容,对候选关键发言片段集合获取模块获取的候选关键发言片段进行语音识别处理,将其转化为文本数据,转化后的文本数据具有与音频/视频数据相对应的时间轴,能够根据文本数据中的内容定位到相应时间段的音频/视频数据,利用重点发言内容对应的关键词对转化后的文本数据进行筛选,最终确定重点发言内容对应的音频/视频片段集合。
  22. 根据权利要求12所述的会议智能管理系统,其特征在于,语音摘要合成模块进一步包括:
    发言人语音摘要合成模块,用于将音频/视频片段集合筛选模块筛选出的同一发言人的音频/视频片段集合按时间顺序排序,将排序后的音频/视频片段集合拼接为一段音频/视频,作为该发言人发言内容的语音摘要;
    会议语音摘要合成模块,用于生成整个会议的语音摘要,会议语音摘要合成模块可根据会议主题、会议议程等信息生成摘要的头部信息,并将上述信息生成头部信息语音文件,会议语音摘要合成模块再根据会议议程等信息生成摘要中承前启后的过渡信息,并将上述信息生成过渡信息语音文件,会议语音摘要合成模块将头部信息语音文件、过渡信息语音文件、拼接完成的不同发言人的语音摘要按对应关系合成到一起,形成会议的语音摘要。
PCT/CN2018/078527 2018-02-02 2018-03-09 一种会议智能管理方法及系统 WO2019148583A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810105174.2A CN108346034B (zh) 2018-02-02 2018-02-02 一种会议智能管理方法及系统
CN201810105174.2 2018-02-02

Publications (1)

Publication Number Publication Date
WO2019148583A1 true WO2019148583A1 (zh) 2019-08-08

Family

ID=62958513

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/078527 WO2019148583A1 (zh) 2018-02-02 2018-03-09 一种会议智能管理方法及系统

Country Status (2)

Country Link
CN (1) CN108346034B (zh)
WO (1) WO2019148583A1 (zh)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110970034A (zh) * 2019-12-06 2020-04-07 中国银行股份有限公司 一种会议配套设备的控制方法及装置
CN111161726A (zh) * 2019-12-24 2020-05-15 广州索答信息科技有限公司 一种智能语音交互方法、设备、介质及系统
CN111786945A (zh) * 2020-05-15 2020-10-16 北京捷通华声科技股份有限公司 一种会议控制方法和装置
CN112036591A (zh) * 2020-08-11 2020-12-04 深圳市欧瑞博科技股份有限公司 智能预约方法、装置及智能预约控制装置
CN112084756A (zh) * 2020-09-08 2020-12-15 远光软件股份有限公司 会议文件生成方法、装置及电子设备
CN112291139A (zh) * 2020-11-30 2021-01-29 重庆满集网络科技有限公司 基于xmpp协议的即时通讯方法及系统
CN112468762A (zh) * 2020-11-03 2021-03-09 视联动力信息技术股份有限公司 一种发言方的切换方法、装置、终端设备和存储介质
CN112887875A (zh) * 2021-01-22 2021-06-01 平安科技(深圳)有限公司 会议系统语音数据采集方法、装置、电子设备及存储介质
CN113177816A (zh) * 2020-01-08 2021-07-27 阿里巴巴集团控股有限公司 一种信息处理方法及装置
CN113206974A (zh) * 2021-04-21 2021-08-03 随锐科技集团股份有限公司 视频画面切换方法及系统
CN113746822A (zh) * 2021-08-25 2021-12-03 安徽创变信息科技有限公司 一种远程会议管理方法及系统
CN114282621A (zh) * 2021-12-29 2022-04-05 湖北微模式科技发展有限公司 一种多模态融合的话者角色区分方法与系统
CN114553616A (zh) * 2022-01-12 2022-05-27 广州市迪士普音响科技有限公司 一种会议单元的音频传输方法、装置、系统及终端设备
CN116366800A (zh) * 2023-03-03 2023-06-30 四川九鼎乾元科技有限公司 在线会议方法、装置、存储介质及电子设备
CN116703344A (zh) * 2023-06-08 2023-09-05 大庆九富科技有限公司 一种基于大数据的数字化管控系统及方法
CN117078223A (zh) * 2023-09-28 2023-11-17 广州隽智智能科技有限公司 一种基于人工智能的智能会议管理系统
CN117312612A (zh) * 2023-10-07 2023-12-29 广东鼎尧科技有限公司 一种基于多模态的远程会议数据记录方法、系统和介质
US11916687B2 (en) 2021-07-28 2024-02-27 Zoom Video Communications, Inc. Topic relevance detection using automated speech recognition

Families Citing this family (24)

Publication number Priority date Publication date Assignee Title
EP3621002A1 (en) * 2018-09-06 2020-03-11 Koninklijke Philips N.V. Monitoring moveable entities in a predetermined area
CN109781111A (zh) * 2019-01-28 2019-05-21 平安科技(深圳)有限公司 为访客提供路线信息的方法、装置、介质和计算机设备
CN109949818A (zh) * 2019-02-15 2019-06-28 平安科技(深圳)有限公司 一种基于声纹识别的会议管理方法及相关设备
CN110049271B (zh) * 2019-03-19 2021-12-10 视联动力信息技术股份有限公司 一种视联网会议信息展示方法及装置
CN109905764B (zh) * 2019-03-21 2021-08-24 广州国音智能科技有限公司 一种视频中目标人物语音截取方法及装置
US20200349526A1 (en) * 2019-04-30 2020-11-05 Nanning Fugui Precision Industrial Co., Ltd. Method for arranging meeting agenda and computer device employing the same
CN110322869B (zh) * 2019-05-21 2023-06-16 平安科技(深圳)有限公司 会议分角色语音合成方法、装置、计算机设备和存储介质
CN110322033A (zh) * 2019-05-23 2019-10-11 深圳壹账通智能科技有限公司 基于用户地址的会议室管理方法、装置、设备及存储介质
CN110210835A (zh) * 2019-06-04 2019-09-06 成都四通瑞坤科技有限公司 一种智能高效会议实现控制方法及系统
CN110211590B (zh) * 2019-06-24 2021-12-03 新华智云科技有限公司 一种会议热点的处理方法、装置、终端设备及存储介质
CN112312039A (zh) * 2019-07-15 2021-02-02 北京小米移动软件有限公司 音视频信息获取方法、装置、设备及存储介质
CN111597381A (zh) * 2020-04-16 2020-08-28 国家广播电视总局广播电视科学研究院 内容生成方法、装置以及介质
CN111599228A (zh) * 2020-04-29 2020-08-28 滨州学院 一种在线教育培训系统、设备及可读存储介质
CN111385645A (zh) * 2020-05-30 2020-07-07 耿奎 一种基于语音识别的视频文件截取方法
CN111833876A (zh) * 2020-07-14 2020-10-27 科大讯飞股份有限公司 会议发言控制方法、系统、电子设备及存储介质
CN111899742B (zh) * 2020-08-06 2021-03-23 广州科天视畅信息科技有限公司 一种提高会议进行效率的方法及系统
CN114339356B (zh) * 2020-09-29 2024-02-23 北京字跳网络技术有限公司 视频录制方法、装置、设备及存储介质
CN112328984B (zh) * 2020-11-24 2024-02-09 深圳市鹰硕技术有限公司 一种应用于大数据的数据安全管理方法和系统
CN112287246B (zh) * 2020-12-29 2021-11-16 视联动力信息技术股份有限公司 基于协议标识实现访问控制和信息过滤的方法和装置
CN112819184B (zh) * 2020-12-31 2024-05-24 中国人寿保险股份有限公司上海数据中心 一种基于积分算法的空闲会议室检测方法
CN115050393B (zh) * 2022-06-23 2024-07-12 安徽听见科技有限公司 获取回听音频的方法、装置、设备及存储介质
CN116633909B (zh) * 2023-07-17 2023-12-19 福建一缕光智能设备有限公司 基于人工智能的会议管理方法和系统
CN116647635B (zh) * 2023-07-27 2023-11-28 深圳市乗名科技有限公司 一种基于深度学习的远程桌面会议系统及方法
CN116939150B (zh) * 2023-09-14 2023-11-24 北京橙色风暴数字技术有限公司 一种基于机器视觉的多媒体平台监测系统及方法

Citations (11)

Publication number Priority date Publication date Assignee Title
JP2004023661A (ja) * 2002-06-19 2004-01-22 Ricoh Co Ltd 記録情報処理方法、記録媒体及び記録情報処理装置
US6853716B1 (en) * 2001-04-16 2005-02-08 Cisco Technology, Inc. System and method for identifying a participant during a conference call
CN101043556A (zh) * 2006-01-25 2007-09-26 阿瓦雅技术有限公司 在电话呼叫期间显示参与者的分级
CN101529500A (zh) * 2006-10-23 2009-09-09 日本电气株式会社 内容概括系统、内容概括的方法和程序
CN102063461A (zh) * 2009-11-06 2011-05-18 株式会社理光 发言记录装置以及发言记录方法
CN102572356A (zh) * 2012-01-16 2012-07-11 华为技术有限公司 记录会议的方法和会议系统
CN104780282A (zh) * 2014-01-13 2015-07-15 国际商业机器公司 对电话会议中的发言内容进行分类的方法和设备
CN106534088A (zh) * 2016-11-01 2017-03-22 西安易朴通讯技术有限公司 一种会议室管理方法、会议室管理终端及云端服务器
CN106934471A (zh) * 2017-02-17 2017-07-07 北京光年无限科技有限公司 应用于智能机器人的会议室管理方法及系统
CN107220719A (zh) * 2017-05-27 2017-09-29 深圳市创维群欣安防科技股份有限公司 一种会议室管理预约方法、系统及存储装置
CN107409061A (zh) * 2015-03-23 2017-11-28 国际商业机器公司 语音总结程序

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
DE60204827T2 (de) * 2001-08-08 2006-04-27 Nippon Telegraph And Telephone Corp. Anhebungsdetektion zur automatischen Sprachzusammenfassung
JP5045670B2 (ja) * 2006-05-17 2012-10-10 日本電気株式会社 音声データ要約再生装置、音声データ要約再生方法および音声データ要約再生用プログラム
CN102347060A (zh) * 2010-08-04 2012-02-08 鸿富锦精密工业(深圳)有限公司 电子记录装置及方法
CN102096861A (zh) * 2010-12-29 2011-06-15 深圳市五巨科技有限公司 一种实现会议室预定及提醒的方法和系统
CN103559882B (zh) * 2013-10-14 2016-08-10 华南理工大学 一种基于说话人分割的会议主持人语音提取方法
CN105632498A (zh) * 2014-10-31 2016-06-01 株式会社东芝 生成会议记录的方法、装置和系统
US11076052B2 (en) * 2015-02-03 2021-07-27 Dolby Laboratories Licensing Corporation Selective conference digest
CN105933128A (zh) * 2016-04-25 2016-09-07 四川联友电讯技术有限公司 一种基于噪音过滤和身份认证的音频会议纪要推送方法
CN107346568B (zh) * 2016-05-05 2020-04-17 阿里巴巴集团控股有限公司 一种门禁系统的认证方法和装置

Patent Citations (11)

Publication number Priority date Publication date Assignee Title
US6853716B1 (en) * 2001-04-16 2005-02-08 Cisco Technology, Inc. System and method for identifying a participant during a conference call
JP2004023661A (ja) * 2002-06-19 2004-01-22 Ricoh Co Ltd 記録情報処理方法、記録媒体及び記録情報処理装置
CN101043556A (zh) * 2006-01-25 2007-09-26 阿瓦雅技术有限公司 在电话呼叫期间显示参与者的分级
CN101529500A (zh) * 2006-10-23 2009-09-09 日本电气株式会社 内容概括系统、内容概括的方法和程序
CN102063461A (zh) * 2009-11-06 2011-05-18 株式会社理光 发言记录装置以及发言记录方法
CN102572356A (zh) * 2012-01-16 2012-07-11 华为技术有限公司 记录会议的方法和会议系统
CN104780282A (zh) * 2014-01-13 2015-07-15 国际商业机器公司 对电话会议中的发言内容进行分类的方法和设备
CN107409061A (zh) * 2015-03-23 2017-11-28 国际商业机器公司 语音总结程序
CN106534088A (zh) * 2016-11-01 2017-03-22 西安易朴通讯技术有限公司 一种会议室管理方法、会议室管理终端及云端服务器
CN106934471A (zh) * 2017-02-17 2017-07-07 北京光年无限科技有限公司 应用于智能机器人的会议室管理方法及系统
CN107220719A (zh) * 2017-05-27 2017-09-29 深圳市创维群欣安防科技股份有限公司 一种会议室管理预约方法、系统及存储装置

Cited By (28)

Publication number Priority date Publication date Assignee Title
CN110970034A (zh) * 2019-12-06 2020-04-07 中国银行股份有限公司 一种会议配套设备的控制方法及装置
CN111161726A (zh) * 2019-12-24 2020-05-15 广州索答信息科技有限公司 一种智能语音交互方法、设备、介质及系统
CN111161726B (zh) * 2019-12-24 2023-11-03 广州索答信息科技有限公司 一种智能语音交互方法、设备、介质及系统
CN113177816A (zh) * 2020-01-08 2021-07-27 阿里巴巴集团控股有限公司 一种信息处理方法及装置
CN111786945A (zh) * 2020-05-15 2020-10-16 北京捷通华声科技股份有限公司 一种会议控制方法和装置
CN112036591A (zh) * 2020-08-11 2020-12-04 深圳市欧瑞博科技股份有限公司 智能预约方法、装置及智能预约控制装置
CN112084756A (zh) * 2020-09-08 2020-12-15 远光软件股份有限公司 会议文件生成方法、装置及电子设备
CN112084756B (zh) * 2020-09-08 2023-10-10 远光软件股份有限公司 会议文件生成方法、装置及电子设备
CN112468762A (zh) * 2020-11-03 2021-03-09 视联动力信息技术股份有限公司 一种发言方的切换方法、装置、终端设备和存储介质
CN112468762B (zh) * 2020-11-03 2024-04-02 视联动力信息技术股份有限公司 一种发言方的切换方法、装置、终端设备和存储介质
CN112291139B (zh) * 2020-11-30 2022-11-29 重庆满集网络科技有限公司 基于xmpp协议的即时通讯方法及系统
CN112291139A (zh) * 2020-11-30 2021-01-29 重庆满集网络科技有限公司 基于xmpp协议的即时通讯方法及系统
CN112887875A (zh) * 2021-01-22 2021-06-01 平安科技(深圳)有限公司 会议系统语音数据采集方法、装置、电子设备及存储介质
CN112887875B (zh) * 2021-01-22 2022-10-18 平安科技(深圳)有限公司 会议系统语音数据采集方法、装置、电子设备及存储介质
CN113206974A (zh) * 2021-04-21 2021-08-03 随锐科技集团股份有限公司 视频画面切换方法及系统
US11916687B2 (en) 2021-07-28 2024-02-27 Zoom Video Communications, Inc. Topic relevance detection using automated speech recognition
CN113746822A (zh) * 2021-08-25 2021-12-03 安徽创变信息科技有限公司 一种远程会议管理方法及系统
CN113746822B (zh) * 2021-08-25 2023-07-21 广州市昇博电子科技有限公司 一种远程会议管理方法及系统
CN114282621A (zh) * 2021-12-29 2022-04-05 湖北微模式科技发展有限公司 一种多模态融合的话者角色区分方法与系统
CN114553616B (zh) * 2022-01-12 2023-11-24 广州市迪士普音响科技有限公司 一种会议单元的音频传输方法、装置、系统及终端设备
CN114553616A (zh) * 2022-01-12 2022-05-27 广州市迪士普音响科技有限公司 一种会议单元的音频传输方法、装置、系统及终端设备
CN116366800B (zh) * 2023-03-03 2023-12-15 四川九鼎乾元科技有限公司 在线会议方法、装置、存储介质及电子设备
CN116366800A (zh) * 2023-03-03 2023-06-30 四川九鼎乾元科技有限公司 在线会议方法、装置、存储介质及电子设备
CN116703344A (zh) * 2023-06-08 2023-09-05 大庆九富科技有限公司 一种基于大数据的数字化管控系统及方法
CN117078223A (zh) * 2023-09-28 2023-11-17 广州隽智智能科技有限公司 一种基于人工智能的智能会议管理系统
CN117078223B (zh) * 2023-09-28 2024-02-02 广州隽智智能科技有限公司 一种基于人工智能的智能会议管理系统
CN117312612A (zh) * 2023-10-07 2023-12-29 广东鼎尧科技有限公司 一种基于多模态的远程会议数据记录方法、系统和介质
CN117312612B (zh) * 2023-10-07 2024-04-02 广东鼎尧科技有限公司 一种基于多模态的远程会议数据记录方法、系统和介质

Also Published As

Publication number Publication date
CN108346034B (zh) 2021-10-15
CN108346034A (zh) 2018-07-31

Similar Documents

Publication Publication Date Title
WO2019148583A1 (zh) 一种会议智能管理方法及系统
US11412325B2 (en) Recording meeting audio via multiple individual smartphones
JP4466564B2 (ja) 文書作成閲覧装置、文書作成閲覧ロボットおよび文書作成閲覧プログラム
US6687671B2 (en) Method and apparatus for automatic collection and summarization of meeting information
CN108305632A (zh) 一种会议的语音摘要形成方法及系统
TWI616868B (zh) 會議記錄裝置及其自動生成會議記錄的方法
US9037461B2 (en) Methods and systems for dictation and transcription
US20040021765A1 (en) Speech recognition system for managing telemeetings
TWI619115B (zh) 會議記錄裝置及其自動生成會議記錄的方法
JP2008032825A (ja) 発言者表示システム、発言者表示方法および発言者表示プログラム
CN109560941A (zh) 会议记录方法、装置、智能终端及存储介质
CN111401699A (zh) 一种智能会议管理方法、机器人及存储介质
JP2005080110A (ja) 音声会議システム、音声会議端末装置およびプログラム
KR20170126667A (ko) 회의 기록 자동 생성 방법 및 그 장치
JP4469867B2 (ja) コミュニケーションの状況を管理する装置、方法およびプログラム
KR100608591B1 (ko) 멀티미디어 회의록 생성 방법 및 장치
CN113949838A (zh) 一种无纸化会议系统、方法、设备及存储介质
CN111223487B (zh) 一种信息处理方法及电子设备
CN110751950A (zh) 基于大数据的警用谈话语音识别方法及系统
JP6091690B1 (ja) 議会運営支援システム及び議会運営支援方法
JP4735640B2 (ja) 音声会議システム
CN111091342A (zh) 一种智能会议记录系统
KR102291113B1 (ko) 회의록 작성 장치 및 방법
Song et al. SmartMeeting: Automatic meeting transcription and summarization for in-person conversations
CN110428184B (zh) 待办事项分发方法、装置、设备及计算机可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18903881

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18903881

Country of ref document: EP

Kind code of ref document: A1