WO2019095586A1 - Meeting minutes generation method, application server, and computer readable storage medium - Google Patents


Info

Publication number
WO2019095586A1
WO2019095586A1 · PCT/CN2018/077628
Authority
WO
WIPO (PCT)
Prior art keywords
speaker
content
speakers
meeting
meeting minutes
Prior art date
Application number
PCT/CN2018/077628
Other languages
French (fr)
Chinese (zh)
Inventor
王健宗
黄章成
程宁
肖京
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2019095586A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/61: Indexing; Data structures therefor; Storage structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/56: Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities

Definitions

  • the present application relates to the field of voice processing technologies, and in particular, to a conference minutes generation method, an application server, and a computer readable storage medium.
  • the present application provides a method for generating a meeting minutes, an application server, and a computer readable storage medium, which can automatically summarize and generate meeting minutes according to meeting content records, thereby saving human resource costs.
  • the present application provides an application server, where the application server includes a memory, a processor, and a meeting minutes generation system stored in the memory and executable on the processor. When the meeting minutes generation system is executed by the processor, the following steps are performed: acquiring audio record information of a conference, and extracting, from the audio record information, the speech content of each speaker according to the voice features of each speaker; performing keyword extraction on the speech content of each speaker; and generating meeting minutes corresponding to the conference according to the extracted keywords.
  • the present application further provides a method for generating meeting minutes, applied to an application server, the method comprising: acquiring audio record information of a conference, and extracting, from the audio record information, the speech content of each speaker according to the voice features of each speaker; performing keyword extraction on the speech content of each speaker; and generating meeting minutes corresponding to the conference according to the extracted keywords.
  • the present application further provides a computer readable storage medium storing a meeting minutes generating system, the meeting minutes generating system being executable by at least one processor, so that The at least one processor performs the steps of the method of generating a meeting minutes as described above.
  • the conference minutes generating method, the application server, and the computer readable storage medium proposed by the present application first acquire the audio record information of a conference and extract, from the audio record information, the speech content of each speaker according to the voice features of each speaker; secondly, keyword extraction is performed on the speech content of each speaker; and finally, the meeting minutes corresponding to the conference are generated according to the extracted keywords.
  • the participants in the meeting can focus more on the content and process of the meeting.
  • the resulting meeting minutes are streamlined and accurate, and can also serve as a reference for other people in need. Compared with traditional manual recording, this solution is more efficient and accurate, and saves human resource costs.
  • FIG. 1 is a schematic diagram of an optional application environment of each embodiment of the present application.
  • FIG. 2 is a schematic diagram of an optional hardware architecture of an application server of the present application.
  • FIG. 3 is a schematic diagram of a program module of a first embodiment of a meeting minutes generation system of the present application.
  • FIG. 4 is a schematic diagram of a program module of a second embodiment of the meeting minutes generating system of the present application.
  • FIG. 5 is a schematic flowchart of an implementation process of a first embodiment of a method for generating meeting minutes of the present application.
  • FIG. 6 is a schematic diagram of an implementation process of a second embodiment of a method for generating meeting minutes of the present application.
  • Referring to FIG. 1, it shows a schematic diagram of an optional application environment of each embodiment of the present application.
  • the present application is applicable to an application environment including, but not limited to, the terminal device 1, the application server 2, and the network 3.
  • the terminal device 1 may be a mobile device such as a mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a navigation device, or an in-vehicle device, as well as a fixed terminal such as a digital TV, a desktop computer, a broadband phone, or a server.
  • the application server 2 may be a computing device such as a rack server, a blade server, a tower server, or a cabinet server.
  • the application server 2 may be a standalone server or a server cluster composed of multiple servers.
  • the network 3 may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile communication (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.
  • the application server 2 can be respectively connected to one or more of the terminal devices 1 through the network 3 for data transmission and interaction.
  • Referring to FIG. 2, it shows a schematic diagram of an optional hardware architecture of the application server 2 of the present application.
  • the application server 2 may include, but is not limited to, the memory 11, the processor 12, and the network interface 13, which are communicably connected to each other through a system bus. It should be noted that FIG. 2 only shows the application server 2 with components 11-13, but it should be understood that not all illustrated components are required to be implemented, and more or fewer components may be implemented instead.
  • the memory 11 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (eg, SD or DX memory, etc.), a random access memory (RAM), a static Random access memory (SRAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like.
  • the memory 11 may be an internal storage unit of the application server 2, such as a hard disk or memory of the application server 2.
  • the memory 11 may also be an external storage device of the application server 2, such as a plug-in hard disk equipped on the application server 2, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc.
  • the memory 11 can also include both the internal storage unit of the application server 2 and its external storage device.
  • the memory 11 is generally used to store an operating system installed in the application server 2 and various types of application software, such as program code of the meeting minutes generation system 100. Further, the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.
  • the processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments.
  • the processor 12 is typically used to control the overall operation of the application server 2, such as performing control and processing related to data interaction or communication with the terminal device 1.
  • the processor 12 is configured to run program code or process data stored in the memory 11, such as running the conference minutes generating system and the like.
  • the network interface 13 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the application server 2 and other electronic devices.
  • the network interface 13 is mainly used to connect the application server 2 to one or more of the terminal devices 1 through the network 3, and the application server 2 and the one or more terminals. A data transmission channel and a communication connection are established between the devices 1.
  • the present application proposes a meeting minutes generation system 100.
  • Referring to FIG. 3, it shows a program module diagram of the first embodiment of the meeting minutes generation system 100 of the present application.
  • the meeting minutes generating system 100 includes a series of computer program instructions stored in the memory 11, and when the computer program instructions are executed by the processor 12, the meeting minutes generating operations of the embodiments of the present application can be implemented.
  • the meeting minutes generation system 100 can be divided into one or more modules based on the particular operations implemented by the various portions of the computer program instructions. For example, in FIG. 3, the meeting minutes generation system 100 can be divided into a content acquisition module 101, an extraction module 102, and a generation module 103. Among them:
  • the content obtaining module 101 is configured to obtain audio record information of a conference, and extract, from the audio record information, the content of each speaker's speech according to the voice feature of each speaker.
  • the application server 2 collects the conference voice content through each terminal device 1: it receives the voice content sent by each terminal device 1 and saves it, and the voice content can be saved in a specified audio format, such as MP3, WMA, or WAV.
  • the terminal device 1 collects the voice content through a sound collecting device (for example, a microphone).
  • the terminal device 1 can send the collected voice content to the application server 2 in real time or periodically; alternatively, when the participant on the side of the terminal device 1 finishes a speech, the terminal device 1 sends the continuously collected voice content to the application server 2.
  • after receiving the voice content sent by the terminal device 1, the application server 2 saves the voice content.
  • because the full voice content of the conference is saved on the application server 2, the content obtaining module 101 can obtain the audio record information of the conference.
  • the audio recording information is preferably the voice content of the conference.
  • if the conference is a video conference call, the conference record received and saved by the application server 2 is audio and video (voice and video picture) content; at this time, the audio record information acquired by the content acquisition module 101 is still preferably the voice content of the conference.
  • the voice features of each speaker can be acquired prior to the meeting. Specifically, each participant is preset with a unique ID number. Before the meeting, the voice features of each participant are pre-recorded, and then an identity index table is established according to the voice features and ID number of each participant. The identity index table stores the correspondence between the voice features of each participant and the ID of each participant, thereby enabling confirmation of the identity of each participant.
  • the participants may be local or remote speakers.
  • a speaker model may be generated from the speaker's voice features, and the speaker model and the corresponding speaker ID number are stored in the identity index table.
  • to confirm the identity of the speaker of a segment of voice content, the sound features of that segment are first extracted and compared with each speaker model in the identity index table to obtain a matching score. If the matching score reaches a preset score, the speaker model corresponding to the sound features exists in the index table; the speaker ID number is thereby obtained and the speaker identity is confirmed. Otherwise, no speaker model corresponding to the sound features exists in the index table, so a new speaker model and a new ID number are generated according to the sound features and stored in the identity index table to facilitate subsequent matching.
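The lookup-or-enroll logic above can be sketched as follows. This is an illustrative assumption, not the patent's exact implementation: `identify_speaker`, the cosine comparison, and the 0.8 threshold are hypothetical stand-ins for the UBM/i-vector scoring described later.

```python
# Toy sketch of matching a segment's sound features against the identity
# index table; cosine similarity stands in for the real scoring model.
import math

PRESET_SCORE = 0.8  # assumed matching threshold


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def identify_speaker(feature, index_table, next_id):
    """Match `feature` against each stored speaker model; enroll a new
    speaker model under `next_id` when no score reaches the threshold."""
    best_id, best_score = None, -1.0
    for speaker_id, model in index_table.items():
        score = cosine(feature, model)
        if score > best_score:
            best_id, best_score = speaker_id, score
    if best_score >= PRESET_SCORE:
        return best_id, index_table      # existing speaker confirmed
    index_table[next_id] = feature       # new speaker model enrolled
    return next_id, index_table
```

A feature close to a stored model resolves to that speaker's ID; an unmatched feature is added to the table so later segments from the same speaker can be found.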
  • in this embodiment, a universal background model (UBM) together with an i-vector extraction algorithm can be used for matching and scoring.
  • specifically, an i-vector is calculated from each of two pieces of speech content as the sound features of the speakers of those two pieces of speech content, and the pair is scored by the dot-product algorithm or the PLDA algorithm. If the score exceeds a certain threshold, the two pieces of speech content are considered to belong to the same speaker.
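A hedged sketch of the dot-product scoring step: two i-vectors (assumed already extracted by a UBM/i-vector front end, which is not shown) are length-normalized and compared, and a score above the threshold means "same speaker". PLDA scoring would replace this with a trained probabilistic model; the 0.7 threshold is an illustrative value.

```python
# Dot-product (cosine) scoring of two i-vectors; same-speaker decision
# is made by comparing the score against a preset threshold.
import math

THRESHOLD = 0.7  # assumed decision threshold


def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else v


def same_speaker(ivec_a, ivec_b, threshold=THRESHOLD):
    a, b = normalize(ivec_a), normalize(ivec_b)
    score = sum(x * y for x, y in zip(a, b))  # dot product of unit vectors
    return score > threshold, score
```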
  • in this way, the content acquisition module 101 can extract each speaker's speech content from the audio record information according to the voice features of each speaker.
  • the extracting module 102 is configured to perform keyword extraction on the content of the speech of each of the speakers.
  • the voice content of each speaker may be converted into a corresponding text before keyword extraction.
  • the extraction module 102 may first sort the multiple segments of text content in a certain order, for example, along the time axis (e.g., according to the order in which the text content was generated, the sentence number, the serial number, etc.).
  • the extraction module 102 can employ a TF-IDF algorithm to extract keywords for each of the speakers' speech content.
  • the TF-IDF algorithm can be used to assess how important a word is to a piece of spoken text: the importance of a word increases in proportion to the number of times it appears in the text, offset by its frequency across all texts.
  • the TF-IDF value of a word is obtained from its term frequency (TF) and inverse document frequency (IDF); the more important a word is to the spoken text, the higher its TF-IDF value.
  • TF word frequency
  • IDF inverse document frequency
  • the extraction module 102 can take the words whose TF-IDF values rank highest as the keywords of the spoken text. For example, the five words with the highest TF-IDF values are used as the keywords of the spoken text.
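The TF-IDF ranking described above can be sketched as below. This is a minimal illustration over pre-tokenized utterances; a real system for Chinese meeting transcripts would first apply word segmentation (e.g. with a tool such as jieba), which is assumed here.

```python
# Minimal TF-IDF keyword extraction: score each word of one utterance by
# term frequency times inverse document frequency, then keep the top-k.
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=5):
    """docs: list of token lists, one per utterance; returns the top_k
    keywords of docs[doc_index] ranked by descending TF-IDF."""
    n_docs = len(docs)
    df = Counter()                      # document frequency per word
    for doc in docs:
        df.update(set(doc))
    tf = Counter(docs[doc_index])       # term frequency in target utterance
    total = len(docs[doc_index])
    scores = {w: (c / total) * math.log(n_docs / df[w]) for w, c in tf.items()}
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [w for w, _ in ranked[:top_k]]
```

A word that appears often in one speaker's utterance but rarely elsewhere scores highest, matching the intuition stated in the text.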
  • the generating module 103 is configured to generate a meeting minutes corresponding to the meeting according to the extracted keywords.
  • the generating module 103 may generate the meeting minutes based on the extracted keywords in combination with the speech content to which each keyword belongs. In other implementations of the present application, the generating module 103 may additionally take the speaker's intonation into account as a parameter when generating the meeting minutes (generally, the higher the intonation of a piece of voice content, the higher its importance).
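One way the generating module's behavior could look is sketched below; the data layout (`speaker`, `text`, `keywords`, `pitch` fields) and the ranking rule are illustrative assumptions, not the patent's specified method. Utterances with more keywords, and optionally higher intonation (pitch), are placed first.

```python
# Hedged sketch: assemble minutes from keyword-bearing utterances, with an
# optional intonation score used as a secondary importance signal.
def build_minutes(utterances, top_n=3):
    """utterances: list of dicts with 'speaker', 'text', 'keywords', and an
    optional 'pitch' score; returns a simple ordered summary string."""
    ranked = sorted(
        utterances,
        key=lambda u: (len(u["keywords"]), u.get("pitch", 0.0)),
        reverse=True,
    )
    lines = ["Meeting minutes:"]
    for u in ranked[:top_n]:
        kw = ", ".join(u["keywords"])
        lines.append(f"- {u['speaker']}: {u['text']} (keywords: {kw})")
    return "\n".join(lines)
```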
  • the generating module 103 may further process the generated meeting minutes with a natural language processing (NLP) algorithm to produce more fluent and standardized meeting minutes.
  • an NLP analysis engine based on such an algorithm can pre-collect and store a large amount of real corpus data, so that the wording of the meeting minutes can be revised toward natural usage.
  • the meeting minutes generating system 100 includes a series of computer program instructions stored in the memory 11, and when the computer program instructions are executed by the processor 12, the meeting minutes generating operations of the embodiments of the present application can be implemented.
  • the meeting minutes generation system 100 can be divided into one or more modules based on the particular operations implemented by the various portions of the computer program instructions.
  • the meeting minutes generation system 100 can be divided into a content acquisition module 101, an extraction module 102, a generation module 103, a feature creation module 104, and a transmission module 105.
  • the program modules 101-103 are the same as in the first embodiment of the meeting minutes generation system 100 of the present application, with the feature creation module 104 and the transmission module 105 added. Among them:
  • the feature establishing module 104 is configured to acquire a voice sample of each of the speakers, and extract a sound feature of each of the speakers from a voice sample of each of the speakers.
  • in this embodiment, each participant is required to check in to the conference by voice, so that a voice sample is obtained; this realizes pre-recording of each participant's voice and allows sound feature extraction to be performed.
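As a toy illustration of extracting a compact "sound feature" vector from a check-in sample: a real system would use MFCCs or an i-vector/embedding extractor, which the patent does not detail. Here per-frame energy and zero-crossing rate over raw PCM samples stand in as illustrative features.

```python
# Illustrative per-frame feature extraction from a list of PCM samples:
# each frame yields (mean energy, zero-crossing rate).
def extract_features(samples, frame=160):
    feats = []
    for i in range(0, len(samples) - frame + 1, frame):
        chunk = samples[i:i + frame]
        energy = sum(s * s for s in chunk) / frame
        zcr = sum(
            1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0)
        ) / frame
        feats.append((energy, zcr))
    return feats
```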
  • the sending module 105 is configured to send the meeting minutes generated by the generating module 103 to the preset user by mail or fax, or provide a link to the preset user to obtain the meeting minutes.
  • the preset user may be a participant or other pre-designated person.
  • the sending module 105 may also encrypt the meeting minutes to ensure data security before storing or transmitting the meeting minutes.
  • for example, the meeting minutes are compressed and encrypted, and the decompression password is a designated password or a password known to or agreed upon by each participant.
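The compress-then-encrypt flow can be sketched with the standard library as below. This is an illustration only: the SHA-256-derived XOR keystream is a toy cipher used to show the round trip, and production code should use a vetted scheme such as AES-GCM from a cryptography library instead.

```python
# Illustrative compress-then-encrypt round trip for the minutes text.
# NOT production crypto: the XOR keystream only demonstrates the flow.
import hashlib
import zlib


def _keystream(password, length):
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{password}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:length])


def protect(minutes_text, password):
    data = zlib.compress(minutes_text.encode("utf-8"))
    return bytes(a ^ b for a, b in zip(data, _keystream(password, len(data))))


def unprotect(blob, password):
    data = bytes(a ^ b for a, b in zip(blob, _keystream(password, len(blob))))
    return zlib.decompress(data).decode("utf-8")
```

Only holders of the agreed password can reverse the XOR step and decompress the minutes, mirroring the "decompression password" arrangement in the text.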
  • the present application also proposes a method for generating a meeting minutes.
  • Referring to FIG. 5, it shows a schematic flowchart of the implementation of the first embodiment of the method for generating meeting minutes of the present application.
  • the order of execution of the steps in the flowchart shown in FIG. 5 may be changed according to different requirements, and some steps may be omitted.
  • Step S502: Acquire audio record information of a conference, and extract the content of each speaker's speech from the audio record information according to the voice features of each speaker.
  • the application server 2 collects the conference voice content through each terminal device 1: it receives the voice content sent by each terminal device 1 and saves it, and the voice content can be saved in a specified audio format, such as MP3, WMA, or WAV.
  • the terminal device 1 collects the voice content through a sound collection device (for example, a microphone).
  • the terminal device 1 can send the collected voice content to the application server 2 in real time or periodically; alternatively, when the participant on the side of the terminal device 1 finishes a speech, the terminal device 1 sends the continuously collected voice content to the application server 2.
  • after receiving the voice content sent by the terminal device 1, the application server 2 saves the voice content.
  • the audio record information of the conference can be obtained from the application server 2.
  • the audio recording information is preferably the voice content of the conference.
  • if the conference is a video conference call, the conference record received and saved by the application server 2 is audio and video (voice and video picture) content; at this time, the acquired audio record information is still preferably the voice content of the conference.
  • the voice features of each speaker can be acquired prior to the meeting. Specifically, each participant is preset with a unique ID number. Before the meeting, the voice features of each participant are pre-recorded, and then an identity index table is established according to the voice features and ID number of each participant. The identity index table stores the correspondence between the voice features of each participant and the ID of each participant, thereby enabling confirmation of the identity of each participant.
  • the participants may be local or remote speakers.
  • a speaker model may be generated from the speaker's voice features, and the speaker model and the corresponding speaker ID number are stored in the identity index table.
  • to confirm the identity of the speaker of a segment of voice content, the sound features of that segment are first extracted and compared with each speaker model in the identity index table to obtain a matching score. If the matching score reaches a preset score, the speaker model corresponding to the sound features exists in the index table; the speaker ID number is thereby obtained and the speaker identity is confirmed. Otherwise, no speaker model corresponding to the sound features exists in the index table, so a new speaker model and a new ID number are generated according to the sound features and stored in the identity index table to facilitate subsequent matching.
  • in this embodiment, a universal background model (UBM) together with an i-vector extraction algorithm can be used for matching and scoring.
  • specifically, an i-vector is calculated from each of two pieces of speech content as the sound features of the speakers of those two pieces of speech content, and the pair is scored by the dot-product algorithm or the PLDA algorithm. If the score exceeds a certain threshold, the two pieces of speech content are considered to belong to the same speaker.
  • in this way, each speaker's speech content can be extracted from the audio record information according to the voice features of each speaker.
  • Step S504: Perform keyword extraction on the speech content of each of the speakers.
  • the voice content of each speaker may be converted into a corresponding text before keyword extraction.
  • the plurality of pieces of text content may first be sorted in a certain order, for example, along the time axis (e.g., according to the order in which the text content was generated, the sentence number, the serial number, etc.).
  • a TF-IDF algorithm may be employed to extract keywords for each of the speakers' speech content.
  • the TF-IDF algorithm can be used to assess how important a word is to a piece of spoken text: the importance of a word increases in proportion to the number of times it appears in the text, offset by its frequency across all texts.
  • the TF-IDF value of a word is obtained from its term frequency (TF) and inverse document frequency (IDF); the more important a word is to the spoken text, the higher its TF-IDF value. Therefore, the words whose TF-IDF values rank highest can be used as the keywords of the spoken text. For example, the five words with the highest TF-IDF values are used as the keywords of the spoken text.
  • Step S506: Generate meeting minutes corresponding to the meeting according to the extracted keywords.
  • the meeting minutes may be generated based on the extracted keywords in combination with the speaking content to which each keyword belongs.
  • the speaker's intonation may further be taken into account as a parameter when generating the meeting minutes (generally, the higher the intonation of a piece of voice content, the higher its importance).
  • the generated meeting minutes may be further processed with a natural language processing (NLP) algorithm to produce more fluent and standardized meeting minutes.
  • an NLP analysis engine based on such an algorithm can pre-collect and store a large amount of real corpus data, so that the wording of the meeting minutes can be revised toward natural usage.
  • the conference minutes generating method proposed by the present application first acquires the audio record information of the conference and extracts each speaker's speech content from the audio record information according to the voice features of each speaker; secondly, keyword extraction is performed on the speech content of each speaker; further, the meeting minutes corresponding to the conference are generated according to the extracted keywords; and finally, the generated meeting minutes are sent to the preset user by mail or fax, or a link is provided for the preset user to obtain the meeting minutes.
  • the participants in the meeting can focus more on the content and process of the meeting.
  • the resulting meeting minutes are streamlined and accurate, and can also serve as a reference for other people in need. Compared with traditional manual recording, this solution is more efficient and accurate, and saves human resource costs.
  • Referring to FIG. 6, it shows a schematic diagram of the implementation process of the second embodiment of the method for generating meeting minutes of the present application.
  • the order of execution of the steps in the flowchart shown in FIG. 6 may be changed according to different requirements, and some steps may be omitted.
  • Step S500: Acquire a voice sample of each of the speakers, and extract the sound features of each of the speakers from the voice sample of each of the speakers.
  • in this embodiment, each participant is required to check in to the conference by voice, so that a voice sample is obtained; this realizes pre-recording of each participant's voice and allows sound feature extraction to be performed.
  • Step S502: Acquire audio record information of a conference, and extract the content of each speaker's speech from the audio record information according to the voice features of each speaker.
  • the application server 2 collects the conference voice content through each terminal device 1: it receives the voice content sent by each terminal device 1 and saves it, and the voice content can be saved in a specified audio format, such as MP3, WMA, or WAV.
  • the terminal device 1 collects the voice content through a sound collection device (for example, a microphone).
  • the terminal device 1 can send the collected voice content to the application server 2 in real time or periodically; alternatively, when the participant on the side of the terminal device 1 finishes a speech, the terminal device 1 sends the continuously collected voice content to the application server 2.
  • after receiving the voice content sent by the terminal device 1, the application server 2 saves the voice content.
  • the audio record information of the conference can be obtained from the application server 2.
  • the audio recording information is preferably the voice content of the conference.
  • if the conference is a video conference call, the conference record received and saved by the application server 2 is audio and video (voice and video picture) content; at this time, the acquired audio record information is still preferably the voice content of the conference.
  • the voice features of each speaker can be acquired prior to the meeting. Specifically, each participant is preset with a unique ID number. Before the meeting, the voice features of each participant are pre-recorded, and then an identity index table is established according to the voice features and ID number of each participant. The identity index table stores the correspondence between the voice features of each participant and the ID of each participant, thereby enabling confirmation of the identity of each participant.
  • the participants may be local or remote speakers.
  • a speaker model may be generated from the speaker's voice features, and the speaker model and the corresponding speaker ID number are stored in the identity index table.
  • to confirm the identity of the speaker of a segment of voice content, the sound features of that segment are first extracted and compared with each speaker model in the identity index table to obtain a matching score. If the matching score reaches a preset score, the speaker model corresponding to the sound features exists in the index table; the speaker ID number is thereby obtained and the speaker identity is confirmed. Otherwise, no speaker model corresponding to the sound features exists in the index table, so a new speaker model and a new ID number are generated according to the sound features and stored in the identity index table to facilitate subsequent matching.
  • in this embodiment, a universal background model (UBM) together with an i-vector extraction algorithm can be used for matching and scoring.
  • specifically, an i-vector is calculated from each of two pieces of speech content as the sound features of the speakers of those two pieces of speech content, and the pair is scored by the dot-product algorithm or the PLDA algorithm. If the score exceeds a certain threshold, the two pieces of speech content are considered to belong to the same speaker.
  • in this way, each speaker's speech content can be extracted from the audio record information according to the voice features of each speaker.
  • Step S504: Perform keyword extraction on the speech content of each of the speakers.
  • the voice content of each speaker may be converted into a corresponding text before keyword extraction.
  • the plurality of pieces of text content may first be sorted in a certain order, for example, along the time axis (e.g., according to the order in which the text content was generated, the sentence number, the serial number, etc.).
  • the TF-IDF algorithm may be employed to extract keywords from each speaker's speech content.
  • the TF-IDF algorithm can be used to assess how important a word is to a speech text; the importance of a word increases in proportion to the number of times it appears in that text.
  • the TF-IDF value of a word is obtained from its term frequency (TF) and inverse document frequency (IDF): the more important a word is to the speech text, the larger its TF-IDF value. The words with the highest TF-IDF values can therefore be used as the keywords of the speech text, for example, the words whose TF-IDF values rank in the top five.
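A self-contained sketch of the TF-IDF ranking described above, treating each speaker's speech text as one document and all speech texts as the corpus. The smoothing in the IDF term is a common variant, not something the application specifies.

```python
import math
from collections import Counter

def tfidf_keywords(doc_tokens, corpus, top_n=5):
    """Rank the words of one speech text by TF-IDF against a small corpus.

    doc_tokens: the tokenized speech text; corpus: a list of tokenized texts
    (here, all speakers' speech texts) used to estimate document frequency.
    """
    tf = Counter(doc_tokens)
    n_docs = len(corpus)

    def idf(word):
        df = sum(1 for doc in corpus if word in doc)
        return math.log((1 + n_docs) / (1 + df)) + 1  # smoothed IDF

    scores = {w: (count / len(doc_tokens)) * idf(w) for w, count in tf.items()}
    return [w for w, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_n]]
```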
  • Step S506: generate meeting minutes corresponding to the conference according to the extracted keywords.
  • the meeting minutes may be generated from the extracted keywords in combination with the speech content to which each keyword belongs.
  • the speaker's intonation may further be taken into account as a parameter when generating the meeting minutes (generally, the higher the intonation of a segment of voice content, the higher its importance).
  • the generated meeting minutes may be further processed with an NLP (natural language processing) algorithm to produce more fluent and better-standardized minutes.
  • an NLP analysis engine built on such an algorithm can collect and store a large volume of real corpus data in advance, so that flawed or non-standard wording in the meeting minutes can be revised.
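The corpus-driven revision is described only at a high level. As a loose illustration, a pre-collected table of non-standard phrasings mined from a corpus could be applied as simple substitutions; the rule table is invented for the example, and a real NLP engine would be far more involved.

```python
def revise_minutes(text, corpus_rules):
    """Replace flawed or non-standard wording using rules mined from a corpus.

    corpus_rules maps a non-standard phrase to its standardized replacement.
    """
    for bad, good in corpus_rules.items():
        text = text.replace(bad, good)
    return text
```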
  • Step S508: send the meeting minutes to a preset user by mail or fax, or provide the preset user with a link for obtaining the meeting minutes.
  • the preset user may be a participant or another pre-designated person.
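A sketch of the mail branch of step S508 using Python's standard email and smtplib modules; the addresses and SMTP host are placeholders, and fax and link delivery are not shown.

```python
import smtplib
from email.message import EmailMessage

def build_minutes_email(minutes_text, sender, recipients, subject="Meeting minutes"):
    """Package the generated minutes as an email message for the preset users."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(minutes_text)
    return msg

def send_minutes(msg, smtp_host="smtp.example.com"):
    """Deliver the message; the host is an illustrative placeholder."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```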
  • the meeting minutes may also be encrypted before being stored or transmitted, to ensure data security. For example, the minutes can be compressed and encrypted, with the decompression password set to a specified password or to one known or agreed upon by the participants.
  • the meeting minutes generation method proposed by the present application first acquires a voice sample of each speaker and extracts each speaker's voice features from those samples; it then acquires the audio record information of the conference and extracts each speaker's speech content from the audio record information according to those voice features; next, keywords are extracted from each speaker's speech content; the meeting minutes corresponding to the conference are then generated from the extracted keywords; finally, the generated minutes are sent to the preset user by mail or fax, or the preset user is provided with a link for obtaining them.
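Tying the steps together, the overall flow might be wired as a simple pipeline; every function argument below is a placeholder for a component described above, not an API defined by the application.

```python
def generate_minutes(audio_record, voice_features_by_speaker,
                     extract_speech, extract_keywords, summarize, deliver):
    """Pipeline sketch: per-speaker extraction -> keywords -> minutes -> delivery."""
    speech_by_speaker = {
        speaker: extract_speech(audio_record, features)
        for speaker, features in voice_features_by_speaker.items()
    }
    keywords = {s: extract_keywords(text) for s, text in speech_by_speaker.items()}
    minutes = summarize(keywords, speech_by_speaker)
    deliver(minutes)  # e.g., mail, fax, or a download link
    return minutes
```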
  • the methods of the foregoing embodiments can be implemented by software running on a necessary general-purpose hardware platform, and of course also by hardware alone, but in many cases the former is the better implementation.
  • based on this understanding, the part of the technical solution of the present application that is essential, or that contributes over the prior art, may be embodied as a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) that includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods described in the various embodiments of the present application.


Abstract

Disclosed in the present application is a meeting minutes generation method. The method comprises: obtaining audio record information of a meeting, and extracting speech content of each speaker from the audio record information according to sound characteristics of each speaker; extracting a keyword from the speech content of each speaker; and generating meeting minutes corresponding to the meeting according to the extracted keywords. The present application also provides an application server and a computer readable storage medium. By means of the meeting minutes generation method, the application server and the computer readable storage medium provided in the present application, meeting minutes can be automatically summarized and generated according to meeting content records, thereby reducing costs of human resources.

Description

Meeting minutes generation method, application server, and computer readable storage medium
This application claims priority to Chinese patent application No. 201711141751.5, filed with the Chinese Patent Office on November 17, 2017 and entitled "Meeting minutes generation method, application server and computer readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of voice processing technologies, and in particular to a meeting minutes generation method, an application server, and a computer readable storage medium.
Background
In government and corporate work, various meetings may take place on almost every working day, ranging from important decision-making briefings down to in-group discussions of a particular issue or explorations of a feature, all conducted in the form of a "meeting". During a meeting, participants generally focus on following the meeting's content and progress; after the meeting, the minutes usually have to be collected and compiled by dedicated staff based on the proceedings, so producing the minutes requires an investment of labor. For some small in-group meetings, there is often no dedicated staff to compile the minutes because of time and manpower constraints, which is not conducive to promoting team building and growth.
Summary of the Invention
In view of this, the present application provides a meeting minutes generation method, an application server, and a computer readable storage medium, which can automatically summarize recorded meeting content and generate meeting minutes, saving human resource costs.
First, to achieve the above object, the present application provides an application server. The application server includes a memory and a processor, and the memory stores a meeting minutes generation system that can run on the processor. When executed by the processor, the meeting minutes generation system implements the following steps: acquiring audio record information of a conference, and extracting each speaker's speech content from the audio record information according to each speaker's voice features; performing keyword extraction on each speaker's speech content; and generating meeting minutes corresponding to the conference according to the extracted keywords.
In addition, to achieve the above object, the present application further provides a meeting minutes generation method applied to an application server. The method includes: acquiring audio record information of a conference, and extracting each speaker's speech content from the audio record information according to each speaker's voice features; performing keyword extraction on each speaker's speech content; and generating meeting minutes corresponding to the conference according to the extracted keywords.
Further, to achieve the above object, the present application also provides a computer readable storage medium storing a meeting minutes generation system that is executable by at least one processor, so that the at least one processor performs the steps of the meeting minutes generation method described above.
Compared with the prior art, the meeting minutes generation method, application server, and computer readable storage medium proposed by the present application first acquire the audio record information of a conference and extract each speaker's speech content from it according to each speaker's voice features; then perform keyword extraction on each speaker's speech content; and finally generate meeting minutes corresponding to the conference according to the extracted keywords. In this way, meeting minutes can be automatically summarized and generated from the recorded meeting content, making it convenient for participants to review the meeting afterwards, while during the meeting the participants can focus on its content and progress. After the meeting, the concise and accurate minutes are also available for consultation and reference by others who need them. Compared with traditional manual note-taking and compilation, this solution is more efficient and accurate while saving human resource costs.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an optional application environment for the embodiments of the present application;
FIG. 2 is a schematic diagram of an optional hardware architecture of the application server of the present application;
FIG. 3 is a schematic diagram of the program modules of a first embodiment of the meeting minutes generation system of the present application;
FIG. 4 is a schematic diagram of the program modules of a second embodiment of the meeting minutes generation system of the present application;
FIG. 5 is a schematic flowchart of a first embodiment of the meeting minutes generation method of the present application;
FIG. 6 is a schematic flowchart of a second embodiment of the meeting minutes generation method of the present application.
The implementation, functional features, and advantages of the objects of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
It should be noted that descriptions involving "first", "second", and the like in the present application are for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features concerned. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with one another, provided the combination can be realized by a person of ordinary skill in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and falls outside the scope of protection claimed by the present application.
FIG. 1 is a schematic diagram of an optional application environment for the embodiments of the present application.
In this embodiment, the present application can be applied in an application environment including, but not limited to, the terminal devices 1, the application server 2, and the network 3. The terminal device 1 may be a mobile device such as a mobile phone, a smartphone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a navigation device, or an in-vehicle device, or a fixed terminal such as a digital TV, a desktop computer, a notebook, a broadband phone, or a server. The application server 2 may be a computing device such as a rack server, a blade server, a tower server, or a cabinet server; it may be a standalone server or a server cluster composed of multiple servers. The network 3 may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile communication (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or a telephony network.
The application server 2 can be communicatively connected to one or more of the terminal devices 1 through the network 3 for data transmission and interaction.
FIG. 2 is a schematic diagram of an optional hardware architecture of the application server 2 of the present application.
In this embodiment, the application server 2 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13 that can be communicatively connected to one another through a system bus. It should be noted that FIG. 2 shows only an application server 2 with components 11-13, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.
The memory 11 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, and the like. In some embodiments, the memory 11 may be an internal storage unit of the application server 2, such as the hard disk or internal memory of the application server 2. In other embodiments, the memory 11 may instead be an external storage device of the application server 2, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the application server 2. Of course, the memory 11 may also include both the internal storage unit of the application server 2 and an external storage device. In this embodiment, the memory 11 is generally used to store the operating system installed on the application server 2 and various types of application software, such as the program code of the meeting minutes generation system 100. In addition, the memory 11 may be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may, in some embodiments, be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 12 is typically used to control the overall operation of the application server 2, for example, performing control and processing related to data interaction or communication with the terminal devices 1. In this embodiment, the processor 12 is configured to run the program code or process the data stored in the memory 11, for example, to run the meeting minutes generation system.
The network interface 13 may include a wireless network interface or a wired network interface and is typically used to establish communication connections between the application server 2 and other electronic devices. In this embodiment, the network interface 13 is mainly used to connect the application server 2 to one or more of the terminal devices 1 through the network 3, establishing data transmission channels and communication connections between the application server 2 and the one or more terminal devices 1.
The hardware structure and functions of the devices related to the present application have now been described in detail. The various embodiments of the present application are presented below on this basis.
First, the present application proposes a meeting minutes generation system 100.
FIG. 3 is a program module diagram of a first embodiment of the meeting minutes generation system 100 of the present application.
In this embodiment, the meeting minutes generation system 100 includes a series of computer program instructions stored in the memory 11. When these computer program instructions are executed by the processor 12, the meeting minutes generation operations of the embodiments of the present application can be implemented. In some embodiments, the meeting minutes generation system 100 may be divided into one or more modules based on the particular operations implemented by the respective portions of the computer program instructions. For example, in FIG. 3, the meeting minutes generation system 100 may be divided into a content acquisition module 101, an extraction module 102, and a generation module 103. Among them:
The content acquisition module 101 is configured to acquire the audio record information of a conference and extract each speaker's speech content from the audio record information according to each speaker's voice features.
In one embodiment, after a conference call starts, the application server 2 collects the conference voice content through the terminal devices 1, receiving the voice content sent by each terminal device 1 and saving it. The voice content can be saved in a specified audio format, such as MP3, WMA, or WAV.
Specifically, when a participant on the side of a terminal device 1 begins to speak, that terminal device 1 collects the voice content through a sound collection apparatus (for example, a microphone). The terminal device 1 may send the collected voice content to the application server 2 in real time or at intervals; alternatively, the terminal device 1 may send the continuously collected voice content to the application server 2 only after the participant on its side finishes speaking. After receiving the voice content sent by a terminal device 1, the application server 2 saves it.
Since the full voice content of a conference is saved on the application server 2, the content acquisition module 101 can acquire the audio record information of that conference. In this embodiment, the audio record information is preferably the voice content of the conference. In other embodiments of the present application, if the conference call is a video conference call, the conference record received and saved by the application server 2 contains audio and video (voice and video picture) content; in that case, the audio record information acquired by the content acquisition module 101 is likewise preferably the voice content of the conference.
The voice features of each speaker (participant) can be acquired in advance, before the conference. Specifically, each participant is preset with a unique ID number. Before the conference, each participant's voice features are recorded, and an identity index table is then built from each participant's voice features and ID number. The identity index table stores the correspondence between each participant's voice features and that participant's ID, which makes it possible to confirm a participant's identity. A participant may be a speaker at the local end or at the remote end.
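The enrollment described above can be sketched as building a table keyed by participant ID; the feature representation is left abstract, since the application does not fix one.

```python
def build_identity_index(participants):
    """Build the identity index table: participant ID -> enrolled voice features.

    participants: iterable of (participant_id, voice_features); voice_features
    stands in for whatever feature vector the enrollment step produces.
    """
    index_table = {}
    for participant_id, features in participants:
        if participant_id in index_table:
            raise ValueError(f"duplicate participant ID: {participant_id}")
        index_table[participant_id] = features
    return index_table
```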
In one embodiment, a speaker model may be generated from a speaker's voice features, and the speaker model and the corresponding speaker ID number are stored in the identity index table.
After the participants' identity index table has been built, when a segment of voice content in the audio record information needs to be attributed to a speaker, the voice features of that segment are extracted first and compared against each speaker model in the identity index table to obtain a matching score. If the matching score reaches a preset threshold, a speaker model matching the voice features exists in the index table, so the speaker's ID number can be obtained and the speaker's identity confirmed. Otherwise, no matching speaker model exists in the index table; a new speaker model and a new ID number are then generated from the voice features and stored in the identity index table to facilitate subsequent matching.
When matching scores are computed, a UBM (universal background model) and an i-vector extraction algorithm can be used. For example, an i-vector is computed from each of two segments of speech content to serve as the voice feature of that segment's speaker. The two computed i-vectors are then scored with a dot-product algorithm or a PLDA algorithm; if the score exceeds a certain threshold, the two segments of speech content are considered to come from the same speaker.
By establishing a mapping between each segment of voice content in the audio record information and a participant's ID number, the content acquisition module 101 can extract each speaker's speech content from the audio record information according to each speaker's voice features.
The extraction module 102 is configured to perform keyword extraction on each speaker's speech content.
In one embodiment, each speaker's voice content may first be converted into corresponding text before keyword extraction. Optionally, when the conversion yields multiple segments of text content, the extraction module 102 may first sort the segments into a certain order. For example, the segments can be sorted along the time axis (e.g., by generation order, sentence count, or sequence number).
The extraction module 102 may use the TF-IDF algorithm to extract keywords from each speaker's speech content. The TF-IDF algorithm can be used to assess how important a word is to a speech text; the importance of a word increases in proportion to the number of times it appears in that text. In a TF-IDF computation, the TF-IDF value of a word is obtained from its term frequency (TF) and inverse document frequency (IDF): the more important a word is to the speech text, the larger its TF-IDF value. The extraction module 102 can therefore take the words with the highest TF-IDF values as the keywords of the speech text, for example, the words whose TF-IDF values rank in the top five.
The generation module 103 is configured to generate meeting minutes corresponding to the conference according to the extracted keywords.
In one embodiment, the generation module 103 may generate the meeting minutes from the extracted keywords in combination with the speech content to which each keyword belongs. In other embodiments of the present application, the generation module 103 may further take the speaker's intonation into account as a parameter when generating the meeting minutes (generally, the higher the intonation of a segment of voice content, the higher its importance).
In one embodiment, the generation module 103 may further process the generated meeting minutes with an NLP (natural language processing) algorithm to produce semantically more fluent and better-standardized minutes. An NLP analysis engine built on such an algorithm can collect and store a large volume of real corpus data in advance, so that flawed or non-standard wording in the meeting minutes can be revised.
参阅图4所示,是本申请会议纪要生成系统100第二实施例的程序模块图。本实施例中,所述会议纪要生成系统100包括一系列的存储于存储器11上的计算机程序指令,当该计算机程序指令被处理器12执行时,可以实现本申请各实施例的会议纪要生成操作。在一些实施例中,基于该计算机程序指令各部分所实现的特定的操作,会议纪要生成系统100可以被划分为一个或多个模块。例如,在图4中,会议纪要生成系统100可以被分割成内容获取模块101、提取模块102、生成模块103、特征建立模块104及发送模块105。所述各程序模块101-103与本申请会议纪要生成系统100第一实施例相同,并在此基础上增加特征建立模块104及发送模块105。其中:Referring to FIG. 4, it is a program module diagram of a second embodiment of the meeting minutes generating system 100 of the present application. In this embodiment, the meeting minutes generating system 100 includes a series of computer program instructions stored in the memory 11, and when the computer program instructions are executed by the processor 12, the meeting minutes generating operation of the embodiments of the present application can be implemented. . In some embodiments, the meeting minutes generation system 100 can be divided into one or more modules based on the particular operations implemented by the various portions of the computer program instructions. For example, in FIG. 4, the meeting minutes generation system 100 can be divided into a content acquisition module 101, an extraction module 102, a generation module 103, a feature creation module 104, and a transmission module 105. The program modules 101-103 are the same as the first embodiment of the meeting minutes generation system 100 of the present application, and the feature creation module 104 and the transmission module 105 are added thereto. among them:
所述特征建立模块104用于获取每一所述发言人的语音样本,并从每一所述发言人的语音样本中提取出每一所述发言人的声音特征。The feature establishing module 104 is configured to acquire a voice sample of each of the speakers, and extract a sound feature of each of the speakers from a voice sample of each of the speakers.
具体地,可以在进行会议前,要求每一参会人员通过语音方式进行会议签到以获取语音样本,从而来实现预先录取每一参会人员的声音并进行声音特征提取。Specifically, before the conference is performed, each participant is required to perform a conference check-in by voice to obtain a voice sample, thereby realizing pre-admission of the voice of each participant and performing sound feature extraction.
The sending module 105 is configured to send the meeting minutes generated by the generation module 103 to a preset user by e-mail or fax, or to provide the preset user with a link for obtaining the meeting minutes. The preset user may be a participant or another pre-designated person.
In an embodiment, before storing or sending the meeting minutes, the sending module 105 may also encrypt the meeting minutes to ensure data security. For example, the meeting minutes may be compressed and encrypted, the decompression password being a designated password or a password known to or agreed upon by the participants.
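As an illustration of the encrypt-with-an-agreed-password idea above, the following is a minimal sketch in Python: the minutes are XORed with a keystream derived from the password. The function names, the salt, and the keystream construction are all assumptions made for illustration; this toy cipher is not a substitute for a vetted encryption library, which a production system would use instead.

```python
import hashlib

def _keystream(password: bytes, salt: bytes, length: int) -> bytes:
    """Derive a deterministic pseudo-random keystream from the password.

    Illustration only: SHA-256 over (password, salt, counter) blocks.
    """
    stream = b""
    counter = 0
    while len(stream) < length:
        block = password + salt + counter.to_bytes(4, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return stream[:length]

def encrypt_minutes(minutes, password: str, salt: bytes = b"meeting-minutes") -> bytes:
    """XOR the minutes with a password-derived keystream.

    Applying the same function again with the same password decrypts them,
    since XOR is its own inverse.
    """
    data = minutes.encode("utf-8") if isinstance(minutes, str) else minutes
    key = password.encode("utf-8")
    return bytes(b ^ k for b, k in zip(data, _keystream(key, salt, len(data))))
```

A recipient who knows the agreed password recovers the plaintext by calling `encrypt_minutes` on the ciphertext with the same password.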
In addition, the present application further proposes a meeting minutes generation method.
Referring to FIG. 5, which is a schematic flowchart of a first embodiment of the meeting minutes generation method of the present application. In this embodiment, the order of execution of the steps in the flowchart shown in FIG. 5 may be changed according to different requirements, and some steps may be omitted.
Step S502: acquire audio record information of a meeting, and extract each speaker's speech content from the audio record information according to each speaker's voice features.
In an embodiment, after a conference call starts, the application server 2 collects the meeting's voice content through the terminal devices 1, receiving and saving the voice content sent by each terminal device 1. The voice content may be saved in a designated audio format, such as MP3, WMA, or WAV.
Specifically, when a participant on the side of a terminal device 1 starts to speak, that terminal device 1 collects the voice content through a sound collection device (for example, a microphone). The terminal device 1 may send the collected voice content to the application server 2 in real time or periodically; alternatively, the terminal device 1 may send the continuously collected voice content only after the participant on its side has finished speaking. After receiving the voice content sent by the terminal device 1, the application server 2 saves it.
Because the full voice content of a meeting is saved on the application server 2, the audio record information of that meeting can be obtained from the application server 2. In this embodiment, the audio record information is preferably the voice content of the meeting. In other embodiments of the present application, if the conference call is a video conference call, the meeting record received and saved by the application server 2 is audio-video (voice and video picture) content; in that case, the acquired audio record information is likewise preferably the voice content of the meeting.
The voice features of each speaker (participant) may be acquired in advance, before the meeting is held. Specifically, each participant is preset with a unique ID number. Each participant's voice features are recorded before the meeting, and an identity index table is then built from each participant's voice features and ID number. The identity index table stores the correspondence between each participant's voice features and that participant's ID, making it possible to confirm participants' identities. The participants may be speakers at the local end or at a remote end.
In an embodiment, a speaker model may be generated from a speaker's voice features, and the speaker model may be stored in the identity index table together with the corresponding speaker ID number.
After the participants' identity index table has been established, when it is necessary to determine which speaker a given segment of voice content in the audio record information belongs to, the speaker voice features of that segment are first extracted and compared with each speaker model in the identity index table to obtain a matching score. If the matching score reaches a preset score, a speaker model corresponding to those voice features exists in the index table; the speaker's ID number can then be obtained and the speaker's identity confirmed. Otherwise, no speaker model corresponding to the voice features exists in the index table, so a new speaker model and a new ID number are generated from the voice features and stored in the identity index table to facilitate subsequent matching.
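The matching flow just described (score a segment's voice features against every stored speaker model, accept the best match if it reaches the preset score, and otherwise register a new model with a new ID) can be sketched as follows. The dictionary-based index table, the cosine scoring of feature vectors, the ID format, and the 0.8 threshold are illustrative assumptions, not the concrete implementation of this application.

```python
import math

MATCH_THRESHOLD = 0.8  # the "preset score"; an assumed value for illustration

def cosine_score(a, b):
    """Similarity score between two voice-feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def identify_speaker(index_table, features):
    """Return the ID of the best-matching speaker model in the identity
    index table, registering a new model and a new ID number when no
    matching score reaches the preset threshold."""
    best_id, best_score = None, -1.0
    for speaker_id, model in index_table.items():
        score = cosine_score(model, features)
        if score > best_score:
            best_id, best_score = speaker_id, score
    if best_score >= MATCH_THRESHOLD:
        return best_id
    new_id = f"spk{len(index_table) + 1:03d}"  # generate a new ID number
    index_table[new_id] = features             # store the new speaker model
    return new_id
```

For example, with one enrolled model, a close feature vector resolves to the existing ID, while a dissimilar one is enrolled under a fresh ID.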
When matching and scoring, a UBM (universal background model) together with an i-vector extraction algorithm may be used. For example, i-vector values are computed from two segments of voice content as the voice features of the speakers of those segments. The two computed i-vectors are then scored with a dot-product algorithm or a PLDA algorithm; if the score exceeds a certain threshold, the two segments of voice content are considered to be speeches of the same speaker.
By establishing a mapping between each segment of voice content in the audio record information and a participant's ID number, each speaker's speech content can be extracted from the audio record information according to each speaker's voice features.
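Once each voice segment has been mapped to a participant's ID number, extracting each speaker's speech content reduces to grouping the transcribed segments by ID in time order. A minimal sketch, assuming each segment is represented as a (timestamp, speaker_id, text) tuple:

```python
def extract_speech_by_speaker(segments):
    """Group transcribed voice segments by speaker ID, preserving time order.

    `segments` is a list of (timestamp, speaker_id, text) tuples, i.e. the
    result of mapping each segment of voice content to a participant ID.
    """
    grouped = {}
    for _timestamp, speaker_id, text in sorted(segments):
        grouped.setdefault(speaker_id, []).append(text)
    # Each speaker's speech content is the concatenation of their segments.
    return {speaker: " ".join(parts) for speaker, parts in grouped.items()}
```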
Step S504: perform keyword extraction on each speaker's speech content.
In an embodiment, each speaker's voice content may first be converted into corresponding text before keyword extraction. Optionally, when the converted text content has multiple segments, those segments may first be sorted in a certain order. For example, the segments may be sorted along the time axis (for example, by generation order, sentence count, or sequence number).
In an embodiment, a TF-IDF algorithm may be used to extract keywords from each speaker's speech content. The TF-IDF algorithm evaluates how important a word is to a speech text; a word's importance increases in proportion to the number of times it appears in the text. In a TF-IDF computation, a word's TF-IDF value is obtained from its term frequency (TF) and inverse document frequency (IDF): the more important the word is to the speech text, the larger its TF-IDF value. The few words with the highest TF-IDF values can therefore be taken as the keywords of the speech text; for example, the five words with the highest TF-IDF values.
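A minimal sketch of the TF-IDF ranking described above, treating each speaker's speech text as one document and the set of all speech texts as the corpus. The tokenization (pre-split word lists) and the particular IDF smoothing used here are assumptions for illustration:

```python
import math
from collections import Counter

def top_keywords(doc_words, corpus, k=5):
    """Rank the words of one speech text by TF-IDF and return the top k.

    `doc_words` is the tokenised speech text; `corpus` is a list of
    tokenised documents (here, all speech texts) used for the IDF term.
    """
    n_docs = len(corpus)
    tf = Counter(doc_words)          # raw term counts for this document
    total = len(doc_words)
    scores = {}
    for word, count in tf.items():
        df = sum(1 for doc in corpus if word in doc)   # document frequency
        idf = math.log(n_docs / (1 + df)) + 1          # smoothed IDF
        scores[word] = (count / total) * idf           # TF-IDF value
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [word for word, _score in ranked[:k]]
```

Words that are frequent in one speech but rare across the corpus score highest, which matches the intuition stated above; common function words, present in every document, are pushed down by the IDF term.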
Step S506: generate meeting minutes corresponding to the meeting according to the extracted keywords.
In an embodiment, the meeting minutes may be generated from the extracted keywords combined with the speech content to which each keyword belongs. In other embodiments of the present application, the speaker's intonation may additionally be taken as a parameter when generating the meeting minutes (in general, the higher the intonation of a segment of voice content, the higher that content's importance).
In an embodiment, the generated meeting minutes may be further processed with an NLP (natural language processing) algorithm to produce minutes that are semantically smoother and better standardized. An NLP analysis engine built on such an algorithm can collect and store a large amount of real corpus data in advance, so that flawed or nonstandard wording in the meeting minutes can be revised.
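A full NLP analysis engine is beyond a short example, but its revision step can be caricatured with a hand-written replacement table standing in for the patterns a real engine would learn from its corpus. All table entries and names below are invented for illustration:

```python
import re

# Toy stand-in for corpus-derived revision rules: map nonstandard or
# flawed wording to standardized wording. Invented for illustration.
REVISIONS = {
    r"\bum+\b": "",
    r"\bgonna\b": "going to",
    r"\bkinda\b": "somewhat",
}

def revise_minutes(text):
    """Revise flawed or nonstandard wording in draft meeting minutes."""
    for pattern, replacement in REVISIONS.items():
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    # Collapse the double spaces left behind by deletions.
    return re.sub(r"\s{2,}", " ", text).strip()
```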
Through steps S502-S506 above, the meeting minutes generation method proposed by the present application first acquires the audio record information of a meeting and extracts each speaker's speech content from it according to each speaker's voice features; next, it performs keyword extraction on each speaker's speech content; then it generates meeting minutes corresponding to the meeting according to the extracted keywords; finally, it sends the generated minutes to a preset user by e-mail or fax, or provides the preset user with a link for obtaining them. In this way, meeting minutes can be summarized and generated automatically from the recorded meeting content, which makes it convenient for participants to review the meeting and lets them focus on the meeting's content and progress; after the meeting, concise and accurate minutes are also available to others who need to consult or cite them. Compared with traditional manual note-taking and collation, this solution is more efficient and accurate while saving human resource costs.
Referring to FIG. 6, which is a schematic flowchart of a second embodiment of the meeting minutes generation method of the present application. In this embodiment, the order of execution of the steps in the flowchart shown in FIG. 6 may be changed according to different requirements, and some steps may be omitted.
Step S500: acquire a voice sample of each speaker, and extract each speaker's voice features from each speaker's voice sample.
Specifically, before the meeting begins, each participant may be required to check in to the meeting by voice so as to obtain a voice sample, whereby each participant's voice is recorded in advance and voice features are extracted from it.
Step S502: acquire audio record information of a meeting, and extract each speaker's speech content from the audio record information according to each speaker's voice features.
In an embodiment, after a conference call starts, the application server 2 collects the meeting's voice content through the terminal devices 1, receiving and saving the voice content sent by each terminal device 1. The voice content may be saved in a designated audio format, such as MP3, WMA, or WAV.
Specifically, when a participant on the side of a terminal device 1 starts to speak, that terminal device 1 collects the voice content through a sound collection device (for example, a microphone). The terminal device 1 may send the collected voice content to the application server 2 in real time or periodically; alternatively, the terminal device 1 may send the continuously collected voice content only after the participant on its side has finished speaking. After receiving the voice content sent by the terminal device 1, the application server 2 saves it.
Because the full voice content of a meeting is saved on the application server 2, the audio record information of that meeting can be obtained from the application server 2. In this embodiment, the audio record information is preferably the voice content of the meeting. In other embodiments of the present application, if the conference call is a video conference call, the meeting record received and saved by the application server 2 is audio-video (voice and video picture) content; in that case, the acquired audio record information is likewise preferably the voice content of the meeting.
The voice features of each speaker (participant) may be acquired in advance, before the meeting is held. Specifically, each participant is preset with a unique ID number. Each participant's voice features are recorded before the meeting, and an identity index table is then built from each participant's voice features and ID number. The identity index table stores the correspondence between each participant's voice features and that participant's ID, making it possible to confirm participants' identities. The participants may be speakers at the local end or at a remote end.
In an embodiment, a speaker model may be generated from a speaker's voice features, and the speaker model may be stored in the identity index table together with the corresponding speaker ID number.
After the participants' identity index table has been established, when it is necessary to determine which speaker a given segment of voice content in the audio record information belongs to, the speaker voice features of that segment are first extracted and compared with each speaker model in the identity index table to obtain a matching score. If the matching score reaches a preset score, a speaker model corresponding to those voice features exists in the index table; the speaker's ID number can then be obtained and the speaker's identity confirmed. Otherwise, no speaker model corresponding to the voice features exists in the index table, so a new speaker model and a new ID number are generated from the voice features and stored in the identity index table to facilitate subsequent matching.
When matching and scoring, a UBM (universal background model) together with an i-vector extraction algorithm may be used. For example, i-vector values are computed from two segments of voice content as the voice features of the speakers of those segments. The two computed i-vectors are then scored with a dot-product algorithm or a PLDA algorithm; if the score exceeds a certain threshold, the two segments of voice content are considered to be speeches of the same speaker.
By establishing a mapping between each segment of voice content in the audio record information and a participant's ID number, each speaker's speech content can be extracted from the audio record information according to each speaker's voice features.
Step S504: perform keyword extraction on each speaker's speech content.
In an embodiment, each speaker's voice content may first be converted into corresponding text before keyword extraction. Optionally, when the converted text content has multiple segments, those segments may first be sorted in a certain order. For example, the segments may be sorted along the time axis (for example, by generation order, sentence count, or sequence number).
In an embodiment, a TF-IDF algorithm may be used to extract keywords from each speaker's speech content. The TF-IDF algorithm evaluates how important a word is to a speech text; a word's importance increases in proportion to the number of times it appears in the text. In a TF-IDF computation, a word's TF-IDF value is obtained from its term frequency (TF) and inverse document frequency (IDF): the more important the word is to the speech text, the larger its TF-IDF value. The few words with the highest TF-IDF values can therefore be taken as the keywords of the speech text; for example, the five words with the highest TF-IDF values.
Step S506: generate meeting minutes corresponding to the meeting according to the extracted keywords.
In an embodiment, the meeting minutes may be generated from the extracted keywords combined with the speech content to which each keyword belongs. In other embodiments of the present application, the speaker's intonation may additionally be taken as a parameter when generating the meeting minutes (in general, the higher the intonation of a segment of voice content, the higher that content's importance).
In an embodiment, the generated meeting minutes may be further processed with an NLP (natural language processing) algorithm to produce minutes that are semantically smoother and better standardized. An NLP analysis engine built on such an algorithm can collect and store a large amount of real corpus data in advance, so that flawed or nonstandard wording in the meeting minutes can be revised.
Step S508: send the meeting minutes to a preset user by e-mail or fax, or provide the preset user with a link for obtaining the meeting minutes. The preset user may be a participant or another pre-designated person.
In an embodiment, before storing or sending the meeting minutes, the meeting minutes may also be encrypted to ensure data security. For example, the meeting minutes may be compressed and encrypted, the decompression password being a designated password or a password known to or agreed upon by the participants.
Through steps S500-S508 above, the meeting minutes generation method proposed by the present application first acquires a voice sample of each speaker and extracts each speaker's voice features from it; next, it acquires the audio record information of the meeting and extracts each speaker's speech content from it according to each speaker's voice features; then it performs keyword extraction on each speaker's speech content and generates meeting minutes corresponding to the meeting according to the extracted keywords; finally, it sends the generated minutes to a preset user by e-mail or fax, or provides the preset user with a link for obtaining them. In this way, meeting minutes can be summarized and generated automatically from the recorded meeting content, which makes it convenient for participants to review the meeting and lets them focus on the meeting's content and progress; after the meeting, concise and accurate minutes are also available to others who need to consult or cite them. Compared with traditional manual note-taking and collation, this solution is more efficient and accurate while saving human resource costs.
The serial numbers of the above embodiments of the present application are for description only and do not represent the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. On that understanding, the part of the technical solution of the present application that is essential, or that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (20)

  1. A meeting minutes generation method applied to an application server, characterized in that the method comprises:
    acquiring audio record information of a meeting, and extracting each speaker's speech content from the audio record information according to each speaker's voice features;
    performing keyword extraction on each speaker's speech content; and
    generating meeting minutes corresponding to the meeting according to the extracted keywords.
  2. The meeting minutes generation method according to claim 1, characterized in that the method further comprises:
    acquiring a voice sample of each speaker, and extracting each speaker's voice features from each speaker's voice sample.
  3. The meeting minutes generation method according to claim 1, characterized in that the step of extracting each speaker's speech content from the audio record information according to each speaker's voice features comprises:
    setting an ID number for each speaker, and establishing a speaker model according to each speaker's voice features;
    extracting a speaker's voice features from a first segment of speech in the audio record information;
    comparing the extracted voice features with the plurality of speaker models, and obtaining matching scores; and
    determining the ID number of the speaker of the first segment of speech according to the matching scores.
  4. The meeting minutes generation method according to claim 2, characterized in that the step of extracting each speaker's speech content from the audio record information according to each speaker's voice features comprises:
    setting an ID number for each speaker, and establishing a speaker model according to each speaker's voice features;
    extracting a speaker's voice features from a first segment of speech in the audio record information;
    comparing the extracted voice features with the plurality of speaker models, and obtaining matching scores; and
    determining the ID number of the speaker of the first segment of speech according to the matching scores.
  5. The meeting minutes generation method according to claim 1, characterized in that the step of performing keyword extraction on each speaker's speech content comprises:
    converting each speaker's speech content into text content;
    calculating a TF-IDF value for each word in the text content by a TF-IDF algorithm; and
    identifying the words ranked highest by TF-IDF value as keywords of the speech content and extracting them.
  6. The meeting minutes generation method according to claim 2, characterized in that the step of performing keyword extraction on each speaker's speech content comprises:
    converting each speaker's speech content into text content;
    calculating a TF-IDF value for each word in the text content by a TF-IDF algorithm; and
    identifying the words ranked highest by TF-IDF value as keywords of the speech content and extracting them.
  7. The meeting minutes generation method according to claim 1, characterized in that the step of generating meeting minutes corresponding to the meeting according to the extracted keywords comprises:
    generating meeting subject-matter content according to the extracted keywords; and
    processing the meeting subject-matter content with a natural language algorithm to generate the meeting minutes corresponding to the meeting.
  8. The meeting minutes generation method according to claim 1, characterized in that the method further comprises:
    sending the meeting minutes to a preset user by e-mail or fax, or providing the preset user with a link for obtaining the meeting minutes.
  9. An application server, characterized in that the application server comprises a memory and a processor, the memory storing a meeting minutes generation system operable on the processor, the meeting minutes generation system implementing the following steps when executed by the processor:
    acquiring audio record information of a meeting, and extracting each speaker's speech content from the audio record information according to each speaker's voice features;
    performing keyword extraction on each speaker's speech content; and
    generating meeting minutes corresponding to the meeting according to the extracted keywords.
  10. The application server according to claim 9, characterized in that the meeting minutes generation system further implements the following step when executed by the processor:
    acquiring a voice sample of each speaker, and extracting each speaker's voice features from each speaker's voice sample.
  11. The application server according to claim 9, characterized in that the step of extracting each speaker's speech content from the audio record information according to each speaker's voice features specifically comprises:
    setting an ID number for each speaker, and establishing a speaker model according to each speaker's voice features;
    extracting a speaker's voice features from a first segment of speech in the audio record information;
    comparing the extracted voice features with the plurality of speaker models, and obtaining matching scores; and
    determining the ID number of the speaker of the first segment of speech according to the matching scores.
12. The application server according to claim 9, wherein the step of performing keyword extraction on the speech content of each speaker comprises:
    converting the speech content of each speaker into text content;
    calculating a TF-IDF value for each word in the text content by means of the TF-IDF algorithm; and
    identifying the words with the highest TF-IDF values as keywords of the speech content and extracting them.
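The TF-IDF ranking in the step above can be sketched as follows, treating each speaker's transcript as one document. This is a minimal from-scratch computation under simplifying assumptions: whitespace tokenization (real Chinese text would need a word segmenter first) and the plain log inverse-document-frequency formula.

```python
import math
from collections import Counter

def tfidf_keywords(docs, doc_index, top_n=2):
    # Score each word in docs[doc_index] by TF * IDF and return the top_n words.
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()                      # document frequency of each word
    for tokens in tokenized:
        df.update(set(tokens))
    tokens = tokenized[doc_index]
    tf = Counter(tokens)
    scores = {w: (tf[w] / len(tokens)) * math.log(n_docs / df[w]) for w in tf}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

docs = ["the budget needs approval approval",
        "the hiring plan is ready",
        "the schedule for the launch"]
print(tfidf_keywords(docs, 0))
```

Words shared by every transcript (like "the") get an IDF of zero and fall to the bottom, which is exactly why TF-IDF beats raw frequency for keyword extraction.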
13. The application server according to claim 9, wherein the step of generating meeting minutes corresponding to the meeting according to the extracted keywords comprises:
    generating the main content of the meeting according to the extracted keywords; and
    processing the main content of the meeting with a natural language algorithm to generate the meeting minutes corresponding to the meeting.
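The "natural language algorithm" in the step above is not specified by the claim. One common choice is extractive summarization; the sketch below scores candidate sentences by how many of the extracted keywords they contain and keeps the top ones in original order. This is an assumed technique for illustration, not the patent's method.

```python
def extractive_summary(sentences, keywords, max_sentences=2):
    # Rank sentences by keyword hits; keep the best ones in original order.
    scored = [(sum(k in s.lower() for k in keywords), i, s)
              for i, s in enumerate(sentences)]
    top = sorted(scored, reverse=True)[:max_sentences]
    return [s for _, _, s in sorted(top, key=lambda t: t[1])]

sentences = ["The budget was approved.",
             "Lunch will be at noon.",
             "Hiring depends on the budget."]
print(extractive_summary(sentences, ["budget", "hiring"]))
```

Sentences carrying no keywords ("Lunch will be at noon.") are dropped, so the minutes stay focused on the keyword-derived main content.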
14. The application server according to claim 9, wherein the meeting minutes generation system, when executed by the processor, further implements the step of:
    sending the meeting minutes to a preset user by e-mail or fax, or providing the preset user with a link for obtaining the meeting minutes.
15. A computer-readable storage medium storing a meeting minutes generation system, the meeting minutes generation system being executable by at least one processor to cause the at least one processor to perform the following steps:
    acquiring audio recording information of a meeting, and extracting the speech content of each speaker from the audio recording information according to the voice features of each speaker;
    performing keyword extraction on the speech content of each speaker; and
    generating meeting minutes corresponding to the meeting according to the extracted keywords.
16. The computer-readable storage medium according to claim 15, wherein the meeting minutes generation system, when executed by the processor, further implements the step of:
    acquiring a voice sample of each speaker, and extracting the voice features of each speaker from the voice sample of each speaker.
17. The computer-readable storage medium according to claim 15, wherein the step of extracting the speech content of each speaker from the audio recording information according to the voice features of each speaker comprises:
    assigning an ID number to each speaker, and establishing a speaker model according to the voice features of each speaker;
    extracting the voice features of a speaker from a first speech segment in the audio recording information;
    comparing the extracted voice features with the plurality of speaker models to obtain matching scores; and
    determining the ID number of the speaker of the first speech segment according to the matching scores.
18. The computer-readable storage medium according to claim 15, wherein the step of performing keyword extraction on the speech content of each speaker comprises:
    converting the speech content of each speaker into text content;
    calculating a TF-IDF value for each word in the text content by means of the TF-IDF algorithm; and
    identifying the words with the highest TF-IDF values as keywords of the speech content and extracting them.
19. The computer-readable storage medium according to claim 15, wherein the step of generating meeting minutes corresponding to the meeting according to the extracted keywords comprises:
    generating the main content of the meeting according to the extracted keywords; and
    processing the main content of the meeting with a natural language algorithm to generate the meeting minutes corresponding to the meeting.
20. The computer-readable storage medium according to claim 15, wherein the meeting minutes generation system, when executed by the processor, further implements the step of:
    sending the meeting minutes to a preset user by e-mail or fax, or providing the preset user with a link for obtaining the meeting minutes.
PCT/CN2018/077628 (priority date 2017-11-17, filed 2018-02-28): Meeting minutes generation method, application server, and computer readable storage medium, published as WO2019095586A1 (en)

Applications Claiming Priority (2)

- CN201711141751.5A (priority and filing date 2017-11-17): Meeting summary generation method, application server and computer-readable recording medium
- CN201711141751.5 (priority date 2017-11-17)

Publications (1)

- WO2019095586A1 (en), published 2019-05-23

Family ID: 62080675

Family Applications (1)

- PCT/CN2018/077628 (priority 2017-11-17, filed 2018-02-28): Meeting minutes generation method, application server, and computer readable storage medium

Country Status (2)

- CN (1): CN108022583A (en)
- WO (1): WO2019095586A1 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party

- CN109525800A (en) * 2018-11-08 / 2019-03-26, 江西国泰利民信息科技有限公司: A kind of teleconference voice recognition data transmission method
- CN109361825A (en) * 2018-11-12 / 2019-02-19, 平安科技(深圳)有限公司: Meeting summary recording method, terminal and computer storage medium
- CN109473103A (en) * 2018-11-16 / 2019-03-15, 上海玖悦数码科技有限公司: A kind of meeting summary generation method
- CN109543173A (en) * 2018-11-30 / 2019-03-29, 苏州麦迪斯顿医疗科技股份有限公司: Rescue record generation method, device, electronic equipment and storage medium
- CN109803059A (en) * 2018-12-17 / 2019-05-24, 百度在线网络技术(北京)有限公司: Audio-frequency processing method and device
- CN111415128B (en) * 2019-01-07 / 2024-06-07, 阿里巴巴集团控股有限公司: Method, system, device, equipment and medium for controlling conference
- CN109960743A (en) * 2019-01-16 / 2019-07-02, 平安科技(深圳)有限公司: Conference content differentiating method, device, computer equipment and storage medium
- CN110049270B (en) * 2019-03-12 / 2023-05-30, 平安科技(深圳)有限公司: Multi-person conference voice transcription method, device, system, equipment and storage medium
- CN110010130A (en) * 2019-04-03 / 2019-07-12, 安徽阔声科技有限公司: A kind of intelligent method towards participant's simultaneous voice transcription text
- CN110134756A (en) * 2019-04-15 / 2019-08-16, 深圳壹账通智能科技有限公司: Minutes generation method, electronic device and storage medium
- CN110298252A (en) * 2019-05-30 / 2019-10-01, 平安科技(深圳)有限公司: Meeting summary generation method, device, computer equipment and storage medium
- CN110322872A (en) * 2019-06-05 / 2019-10-11, 平安科技(深圳)有限公司: Conference voice data processing method, device, computer equipment and storage medium
- CN114629736A (en) * 2020-01-19 / 2022-06-14, 腾讯云计算(北京)有限责任公司: Conference document generation method and device
- CN111626061A (en) * 2020-05-27 / 2020-09-04, 深圳前海微众银行股份有限公司: Conference record generation method, device, equipment and readable storage medium
- CN111666746B (en) * 2020-06-05 / 2023-09-29, 中国银行股份有限公司: Conference summary generation method and device, electronic equipment and storage medium
- CN113782026A (en) * 2020-06-09 / 2021-12-10, 北京声智科技有限公司: Information processing method, device, medium and equipment
- CN111787172A (en) * 2020-06-12 / 2020-10-16, 深圳市珍爱捷云信息技术有限公司: Method, device, server and storage medium for realizing telephone conference based on mobile terminal
- CN111797226B (en) * 2020-06-30 / 2024-04-05, 北京百度网讯科技有限公司: Conference summary generation method and device, electronic equipment and readable storage medium
- CN111899742B (en) * 2020-08-06 / 2021-03-23, 广州科天视畅信息科技有限公司: Method and system for improving conference efficiency
- CN112687272B (en) * 2020-12-18 / 2023-03-21, 北京金山云网络技术有限公司: Conference summary recording method and device and electronic equipment
- CN113766170A (en) * 2021-09-18 / 2021-12-07, 苏州科天视创信息科技有限公司: Audio and video based on-line conference multi-terminal resource sharing method and system
- CN114757155B (en) * 2022-06-14 / 2022-09-27, 深圳乐播科技有限公司: Conference document generation method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party

- CN102572372A (en) * 2011-12-28 / 2012-07-11, 中兴通讯股份有限公司: Extraction method and device for conference summary
- CN104427292A (en) * 2013-08-22 / 2015-03-18, 中兴通讯股份有限公司: Method and device for extracting a conference summary
- US20150348538A1 * 2013-03-14 / 2015-12-03, Aliphcom: Speech summary and action item generation
- CN106448675A (en) * 2016-10-21 / 2017-02-22, 科大讯飞股份有限公司: Recognition text correction method and system
- CN106802885A (en) * 2016-12-06 / 2017-06-06, 乐视控股(北京)有限公司: A kind of meeting summary automatic record method, device and electronic equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party

- US9560206B2 (en) * 2010-04-30 / 2017-01-31, American Teleconferencing Services, Ltd.: Real-time speech-to-text conversion in an audio conference session
- CN105957531B (en) * 2016-04-25 / 2019-12-31, 上海交通大学: Speech content extraction method and device based on cloud platform

Cited By (2)

* Cited by examiner, † Cited by third party

- CN113014540A (en) * 2020-11-24 / 2021-06-22, 腾讯科技(深圳)有限公司: Data processing method, device, equipment and storage medium
- CN113014540B (en) * 2020-11-24 / 2022-09-27, 腾讯科技(深圳)有限公司: Data processing method, device, equipment and storage medium

Also Published As

- CN108022583A (en), published 2018-05-11

Similar Documents

- WO2019095586A1: Meeting minutes generation method, application server, and computer readable storage medium
- US10958598B2: Method and apparatus for generating candidate reply message
- CN103187053B: Input method and electronic equipment
- CN109388701A: Minutes generation method, device, equipment and computer storage medium
- CN111666746B: Conference summary generation method and device, electronic equipment and storage medium
- CN110866110A: Conference summary generation method, device, equipment and medium based on artificial intelligence
- US20150066935A1: Crowdsourcing and consolidating user notes taken in a virtual meeting
- CN109657181B: Internet information chain storage method, device, computer equipment and storage medium
- US20160321272A1: System and methods for vocal commenting on selected web pages
- CN106713111B: Processing method for adding friends, terminal and server
- WO2019148585A1: Conference abstract generating method and apparatus
- CN111798118B: Enterprise operation risk monitoring method and device
- WO2020103447A1: Link-type storage method and apparatus for video information, computer device and storage medium
- CN109582906B: Method, device, equipment and storage medium for determining data reliability
- CN110705235A: Information input method and device for business handling, storage medium and electronic equipment
- CN110738323A: Method and device for establishing machine learning model based on data sharing
- CN111223487B: Information processing method and electronic equipment
- CN112446622A: Enterprise WeChat session evaluation method, system, electronic device and storage medium
- CN110750619B: Chat record keyword extraction method and device, computer equipment and storage medium
- CN112395391A: Concept graph construction method and device, computer equipment and storage medium
- CN108846098B: Information flow abstract generating and displaying method
- CN113111658B: Method, device, equipment and storage medium for checking information
- KR102030551B1: Instant messenger driving apparatus and operating method thereof
- WO2021103594A1: Tacitness degree detection method and device, server and readable storage medium
- WO2019071907A1: Method for identifying help information based on operation page, and application server

Legal Events

- 121: EP: the EPO has been informed by WIPO that EP was designated in this application (ref document number: 18878912; country of ref document: EP; kind code of ref document: A1)
- NENP: Non-entry into the national phase (ref country code: DE)
- 32PN: EP: public notification in the EP bulletin as address of the addressee cannot be established (free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.10.2020))
- 122: EP: PCT application non-entry in European phase (ref document number: 18878912; country of ref document: EP; kind code of ref document: A1)