CN110263313B - Man-machine collaborative editing method for conference shorthand - Google Patents
- Publication number
- CN110263313B (application CN201910533479.8A)
- Authority
- CN
- China
- Prior art keywords
- audio
- conference
- text
- server
- terminal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
- G06F40/166—Editing, e.g. inserting or deleting
- G06Q10/103—Workflow collaboration or project management
- G10L15/26—Speech to text systems
- G11C7/16—Storage of analogue signals in digital stores using an arrangement comprising analogue/digital [A/D] converters, digital memories and digital/analogue [D/A] converters
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Strategic Management (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Entrepreneurship & Innovation (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Acoustics & Sound (AREA)
- Economics (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Telephonic Communication Services (AREA)
- Document Processing Apparatus (AREA)
Abstract
The invention discloses a man-machine collaborative editing method for conference shorthand, which comprises the following steps: 1. the conference shorthand terminal cuts the audio stream into segments at natural-sentence boundaries and sends each audio segment to a third-party server, and the third-party server converts the audio segment into its corresponding text; 2. while cutting the audio stream, the conference shorthand terminal records the start time, end time, and audio code of each audio segment and, combined with the text returned by the third-party server, generates a log file; 3. the conference shorthand terminal sends the audio segments, texts, and log file to the collaborative editing server; 4. the collaborative editing server places the audio segments and texts in one-to-one correspondence according to the log file; 5. at the manual editing terminal, the conference record is corrected manually against the paired audio segments and texts. The invention allows the dynamically generated conference record to be corrected against the conference audio simply, conveniently, and in real time.
Description
Technical Field
The invention relates to the technical field of speech shorthand, and in particular to a man-machine collaborative editing method for conference shorthand.
Background
During a meeting, a recorder takes down the proceedings and specific content of the meeting, from which the meeting record is produced. In the most traditional form, the recorder takes shorthand on site and, after the meeting ends, collates and checks the meeting record against the audio recording.
With the development of speech recognition (ASR) and natural language processing (NLP) technology, audio generated in a conference can be converted into text directly and in real time at the conference site and a conference record generated, greatly reducing the recorder's workload.
Speech recognition technology converts the lexical content of human speech into computer-readable input, such as keystrokes, binary codes, or character sequences; natural language processing technology studies how to achieve effective communication between humans and computers using natural language. Combining the two makes it possible to convert human speech into the written form of human language, i.e. text. However, this conversion process cannot guarantee one hundred percent accuracy. In particular, for terms, personal names, and the like that have not been entered into the system, the system has no way of determining which characters are intended. For example, for the spoken input "Zhang Ziyi", the system can recognize this film star's name and convert it into the correct characters; but for an unfamiliar name with the same pronunciation, the system can only transliterate syllable by syllable using its default options, so if the system's default character for a syllable differs from the one intended, the converted text will be wrong. Of course, actual errors are not limited to this case.
The accuracy of existing speech-to-text conversion for conference shorthand is roughly 90-95%, so the errors remaining in the text must be corrected. At present, the main correction approach is for the recorder to collate and check the conference record against the audio recording after the meeting ends, so producing the conference record draft involves a certain delay and inconvenience. The ideal approach is to correct the text converted from the audio in real time; the technical obstacle is how to correct the text promptly while the audio is still being recorded and the text is still being generated, that is, how to correct dynamically generated text in a timely and rapid fashion.
Disclosure of Invention
In view of these problems, the invention provides a man-machine collaborative editing method for conference shorthand.
A man-machine collaborative editing method for conference shorthand comprises the following steps: 1. while a conference is in progress, the conference shorthand terminal cuts the audio stream at natural-sentence boundaries to form audio segments and sends them to a third-party server, and the third-party server converts each audio segment into its corresponding text through speech recognition technology and natural language processing technology; 2. while cutting the audio stream, the conference shorthand terminal records the start time, end time, and audio code of each audio segment and, combined with the text corresponding to each segment returned by the third-party server, generates a log file; 3. the conference shorthand terminal sends the audio segments, texts, and log file to the collaborative editing server; 4. the collaborative editing server places the audio segments and texts in one-to-one correspondence according to the log file; 5. at the manual editing terminal, the conference record is corrected manually against the paired audio segments and texts.
Further, the third-party server comprises an ASR server and an NLP server.
Further, the duration of each audio segment is limited to within 60 s, and the time interval between cut audio segments is 0.00001 ms.
Further, the conference shorthand terminal numbers each audio segment and text; if an audio segment has no corresponding text, the conference shorthand terminal marks that segment in the log file.
Further, when the conference shorthand terminal detects that the network is interrupted, it stops sending data to the third-party server and temporarily stores the data in memory; when the network is reconnected, the data is sent from memory to the third-party server in order.
Further, while cutting the audio stream, the conference shorthand terminal copies the audio stream and sends the copy to the collaborative editing server.
Further, the manual editing terminal has search-and-replace functions: it can directly modify a given character or phrase, can correct all occurrences of the same error in the text in one pass via search and replace, and can highlight the content currently being corrected for the recorder to review.
The beneficial effects of the invention are: 1. the conference shorthand terminal transmits audio in the form of segments, and each converted text can be corrected as soon as its short audio segment has been transmitted and converted, realizing real-time correction of the dynamically generated conference record; 2. audio and text are placed in one-to-one correspondence in units of natural sentences, so the recorder can click a given text segment to play the corresponding audio, assisting judgment and correction; 3. the handling mechanism for network interruption cleanly solves the problem of audio transmission after the network reconnects.
Drawings
FIG. 1 is a block diagram of a conference shorthand system;
FIG. 2 is a schematic diagram of audio waveforms.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings and specific embodiments. The embodiments of the present invention are presented for purposes of illustration and description, and are not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and to enable others of ordinary skill in the art to understand the invention in its various embodiments and with the various modifications suited to the particular use contemplated.
Example 1
A man-machine collaborative editing method for conference shorthand, in which the hardware involved comprises a conference shorthand terminal, a third-party server, a collaborative editing server, and a manual editing terminal. In this embodiment, the third-party server includes an ASR server and an NLP server. The connection relationship of the hardware devices is shown in FIG. 1.
The conference shorthand terminal is an independent device placed at the conference site for recording and preprocessing the conference audio; the manual editing terminal is a desktop computer, notebook computer, or the like on which specific software is installed, that is, software implementing the necessary functions.
The manual editing terminal and the conference shorthand terminal may be in different places; for example, the conference takes place in Beijing while the recorder corrects the conference record in Shanghai.
The connections among the conference shorthand terminal, the ASR server, the NLP server, and the manual editing terminal may use, but are not limited to, a wired network, a WiFi network, or a 4G network.
The man-machine collaborative editing method disclosed in this embodiment comprises the following steps.
First, while the conference is in progress, the conference shorthand terminal cuts the audio stream at natural-sentence boundaries to form audio segments and sends them to the third-party server, which converts each audio segment into its corresponding text through speech recognition technology and natural language processing technology.
The third-party server comprises an ASR server and an NLP server. The conference shorthand terminal sends an audio segment to the ASR server; the ASR server converts the content of the audio segment into a primary text and returns it to the conference shorthand terminal; the conference shorthand terminal then sends this primary text to the NLP server, which automatically corrects it according to natural-language usage and returns the corrected secondary text to the conference shorthand terminal.
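The flow through the two servers can be pictured with a minimal sketch. The endpoint URLs and response fields below are assumptions for illustration only, since the patent does not specify the servers' APIs:

```python
import requests

# Hypothetical endpoints -- the patent does not define the servers' interfaces.
ASR_URL = "http://asr-server.example/recognize"
NLP_URL = "http://nlp-server.example/correct"

def transcribe_segment(audio_bytes: bytes) -> str:
    """Two-stage conversion described in the embodiment:
    the ASR server produces a primary text, the NLP server corrects it."""
    # Stage 1: the ASR server converts the audio segment into a primary text.
    primary = requests.post(ASR_URL, files={"audio": audio_bytes}).json()["text"]
    # Stage 2: the NLP server corrects the primary text against
    # natural-language usage and returns the secondary text.
    secondary = requests.post(NLP_URL, json={"text": primary}).json()["text"]
    return secondary
```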
The ASR server converts the content of the audio segment into text mechanically, so the primary text contains a great many wrong characters (mostly homophone errors); the NLP server then automatically corrects the primary text, a process based on the habits of human natural language. The secondary text the NLP server returns to the conference shorthand terminal can reach 90-95% accuracy, but a certain error rate still remains.
A natural sentence in this embodiment means the speech between adjacent pauses, such as the two example sentences shown in FIG. 2. Cutting the audio stream at natural sentences, first, ensures the integrity of the audio information and prevents loss of audio data; second, it reduces the bandwidth occupied when sending the audio, so the audio reaches the speech-to-text conversion server conveniently and quickly, with fewer stalls caused by network congestion along the way: just as, on a congested road, bicycles, electric scooters, and especially pedestrians can still thread through the gaps between cars, small transmissions pass through a congested network more easily.
When no audio fluctuation is detected for a period of time, the audio stream is cut, and processing continues after 0.00001 ms. The interval between audio segments is set to 0.00001 ms in order to minimize audio loss and misalignment. For example, suppose every 5 s of audio contains one segment interval: if the interval is 0.1 ms, then on average 1 h of audio accumulates a 72 ms deviation and 4 h of audio a 288 ms deviation; if the interval is 0.00001 ms, then on average 1 h of audio accumulates only 0.0072 ms and 4 h only 0.0288 ms.
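The deviation figures follow directly from the number of cuts. A quick arithmetic check, as a sketch:

```python
def cumulative_drift_ms(hours: float, gap_ms: float, seconds_per_cut: float = 5.0) -> float:
    """Total timing deviation accumulated when one inter-segment gap of
    gap_ms milliseconds occurs, on average, every seconds_per_cut seconds."""
    cuts = hours * 3600 / seconds_per_cut
    return cuts * gap_ms

# The figures quoted in the text:
assert abs(cumulative_drift_ms(1, 0.1) - 72.0) < 1e-9       # 720 cuts/h * 0.1 ms
assert abs(cumulative_drift_ms(4, 0.1) - 288.0) < 1e-9
assert abs(cumulative_drift_ms(1, 0.00001) - 0.0072) < 1e-9
assert abs(cumulative_drift_ms(4, 0.00001) - 0.0288) < 1e-9
```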
If no sufficiently long pause is detected within 60 s, the audio stream is cut forcibly, preventing an over-long segment from degrading both the transmission speed of the segment and the response speed of the ASR and NLP servers.
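Putting the pause rule and the 60 s cap together, the cutting logic can be sketched as follows. The frame size, silence threshold, and pause length below are illustrative assumptions, not values fixed by the patent:

```python
from typing import Iterable, Iterator, List

FRAME_MS = 10            # assumed analysis frame length
SILENCE_RMS = 0.01       # assumed threshold for "no audio fluctuation"
PAUSE_FRAMES = 30        # assumed pause length (300 ms) marking a sentence end
MAX_SEGMENT_MS = 60_000  # forced cut at 60 s, per the embodiment

def cut_segments(frames: Iterable[List[float]]) -> Iterator[List[float]]:
    """Cut a stream of fixed-length audio frames at natural-sentence pauses,
    forcing a cut if a segment reaches 60 s."""
    segment: List[float] = []
    duration_ms = 0
    silent_run = 0
    for frame in frames:
        segment.extend(frame)
        duration_ms += FRAME_MS
        rms = (sum(s * s for s in frame) / len(frame)) ** 0.5
        silent_run = silent_run + 1 if rms < SILENCE_RMS else 0
        if silent_run >= PAUSE_FRAMES or duration_ms >= MAX_SEGMENT_MS:
            yield segment
            segment, duration_ms, silent_run = [], 0, 0
    if segment:
        yield segment  # flush whatever remains when the stream ends
```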
Once an audio segment has been cut from the audio stream, it is independent of the stream still being generated; that segment of audio has ended and can already be played back for correcting its corresponding text.
Second, while cutting the audio stream, the conference shorthand terminal records the start time, end time, and audio code of each audio segment and, combined with the text for that segment returned by the third-party server, generates a log file. The text corresponding to an audio segment here means the text automatically corrected by the NLP server.
The start and end times of the audio segments are based on Beijing time. The start time, end time, and audio code of a segment are information the conference shorthand terminal can obtain during cutting, whereas the text corresponding to the segment is the secondary text returned by the NLP server.
Ideally, each audio segment corresponds to one segment of text, in order, but an audio segment may have no corresponding text, for example when a song is played on site. This raises the problem of how to place the secondary texts returned by the NLP server in one-to-one correspondence with the audio segments. In this embodiment the problem is solved as follows: if an audio segment has no corresponding text, the conference shorthand terminal marks that segment in the log file; the collaborative editing server pairs audio segments with secondary texts according to the log file, and skips any marked segment, avoiding mispairing of text and audio. How does the conference shorthand terminal know which audio segment has no corresponding text? It judges from the data returned by the ASR server: for example, one or more of the start time, end time, and audio number are fused into feature information tied to the segment and sent to the ASR server; the ASR server returns text carrying the same feature information, from which the conference shorthand terminal can tell whether the segment yielded any text.
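The log file contents can be pictured as one record per segment. The field names below are illustrative assumptions, since the patent only specifies what information each record carries:

```python
import json

# One log record per audio segment; "no_text" marks segments (e.g. music)
# for which the ASR/NLP servers returned no text.
records = [
    {"audio_code": 17, "start": "2019-06-19T09:30:01.200",
     "end": "2019-06-19T09:30:04.700",
     "text": "...secondary text from the NLP server...", "no_text": False},
    {"audio_code": 18, "start": "2019-06-19T09:30:04.700",
     "end": "2019-06-19T09:30:52.100", "text": None, "no_text": True},
]

with open("meeting.log", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```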
Third, the conference shorthand terminal sends the audio segments, texts, and log file to the collaborative editing server, and the collaborative editing server places the audio segments and texts in one-to-one correspondence according to the log file.
During transmission the audio segments are large and the texts are small, so a text often reaches the collaborative editing server earlier than its audio segment; that is, the audio segment and the text do not arrive at the same time, yet the collaborative editing server must still know which text corresponds to which audio segment. In this embodiment the problem is solved by having the conference shorthand terminal number each audio segment and each text.
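Because numbered audio and text arrive out of order, the collaborative editing server can buffer whichever side arrives first and emit a pair once both are present. A minimal sketch of that matching, with hypothetical types:

```python
from typing import Dict, Optional, Tuple

class SegmentMatcher:
    """Pair audio segments and texts that share a number,
    regardless of arrival order."""
    def __init__(self) -> None:
        self.audio: Dict[int, bytes] = {}
        self.text: Dict[int, str] = {}

    def add_audio(self, n: int, data: bytes) -> Optional[Tuple[bytes, str]]:
        self.audio[n] = data
        return self._try_pair(n)

    def add_text(self, n: int, text: str) -> Optional[Tuple[bytes, str]]:
        self.text[n] = text
        return self._try_pair(n)

    def _try_pair(self, n: int) -> Optional[Tuple[bytes, str]]:
        # Emit the pair only once both the audio and the text for n exist.
        if n in self.audio and n in self.text:
            return self.audio.pop(n), self.text.pop(n)
        return None
```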
Fourth, at the manual editing terminal, the conference record is corrected manually against the audio segments and texts placed in one-to-one correspondence.
For convenience of operation, the text may be displayed segment by segment according to the audio segments, that is, the text corresponding to one audio segment is displayed as one paragraph. When the recorder clicks a given text segment, the manual editing terminal frame-selects and plays the audio waveform corresponding to that text, assisting the recorder in judging and correcting it. For example, clicking one of the sentences shown in FIG. 2 frames and plays the audio waveform corresponding to that text.
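Since segments and texts share a number, click-to-play reduces to a lookup. A sketch, assuming the segments are keyed by their number and a generic play() callback is available:

```python
from typing import Callable, Dict

def on_text_clicked(n: int, audio_by_no: Dict[int, bytes],
                    play: Callable[[bytes], None]) -> None:
    """Play the audio segment whose number matches the clicked text,
    so the recorder can check the text against the original speech."""
    play(audio_by_no[n])  # frame-select and play the waveform for text n
```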
Meanwhile, the manual editing terminal has search-and-replace functions: it can directly modify a given character or phrase, can correct all occurrences of the same error in the text in one pass via search and replace, and can highlight the content currently being corrected for the recorder to review.
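A minimal sketch of one-pass correction with highlighting of what changed; the <mark> markup convention is an assumption for illustration:

```python
import re
from typing import Tuple

def replace_all(text: str, wrong: str, right: str) -> Tuple[str, int]:
    """Replace every occurrence of the same error in one pass and wrap the
    corrections in <mark> tags so the recorder can review what changed."""
    corrected, count = re.subn(re.escape(wrong), f"<mark>{right}</mark>", text)
    return corrected, count

text, n = replace_all("the meating began; the meating adjourned",
                      "meating", "meeting")
print(n, text)  # 2 corrections, each one highlighted
```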
Since the conference shorthand terminal, ASR server, NLP server, collaborative editing server, and manual editing terminal are all connected over the network, the network may be interrupted during the conference. When the conference shorthand terminal detects a network interruption, it stops sending data to the ASR/NLP server and temporarily stores the data in memory; when the network is reconnected, the data is sent from memory to the ASR/NLP server in order. This prevents the ASR/NLP server from receiving a concentrated burst of audio data after reconnection, mistaking it for an attack, and closing its connection with the conference shorthand terminal. To guard against a network interruption between the conference shorthand terminal and the collaborative editing server, a backup of the conference audio is stored on the collaborative editing server. The backup conference audio lets the manual editing terminal call up the conference audio to correct the conference record after the conference ends, rather than necessarily during the conference; it also prevents the manual editing terminal from being unable to obtain the audio when there is a transmission obstacle between the conference shorthand terminal and the collaborative editing server.
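The interruption handling amounts to an in-memory queue that is flushed in order on reconnect. A sketch, under the assumption of a generic send() callable:

```python
from collections import deque
from typing import Callable, Deque

class BufferedSender:
    """Queue outgoing data in memory while the network is down and
    flush it in the original order once the network is reconnected."""
    def __init__(self, send: Callable[[bytes], None]) -> None:
        self.send = send
        self.online = True
        self.buffer: Deque[bytes] = deque()

    def submit(self, data: bytes) -> None:
        if self.online:
            try:
                self.send(data)
                return
            except ConnectionError:
                self.online = False  # network interruption detected
        self.buffer.append(data)     # hold in memory until reconnection

    def on_reconnect(self) -> None:
        self.online = True
        while self.buffer:           # drain in order, oldest first
            self.send(self.buffer.popleft())
```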
It is to be understood that the described embodiments are merely some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in this and related arts based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Claims (7)
1. A man-machine collaborative editing method for conference shorthand, characterized by comprising the following steps:
Step 1: while a conference is in progress, the conference shorthand terminal cuts the audio stream at natural-sentence boundaries to form audio segments and sends them to a third-party server, and the third-party server converts each audio segment into its corresponding text through speech recognition technology and natural language processing technology;
Step 2: while cutting the audio stream, the conference shorthand terminal records the start time, end time, and audio code of each audio segment and, combined with the text corresponding to each segment returned by the third-party server, generates a log file;
Step 3: the conference shorthand terminal numbers each audio segment and text, and sends the audio segments, texts, and log file to the collaborative editing server;
Step 4: the collaborative editing server places the audio segments and texts in one-to-one correspondence according to the log file;
Step 5: at the manual editing terminal, the conference record is corrected manually against the audio segments and texts placed in one-to-one correspondence.
2. The man-machine collaborative editing method according to claim 1, characterized in that the third-party server comprises an ASR server and an NLP server.
3. The man-machine collaborative editing method according to claim 1 or 2, characterized in that the duration of each audio segment is limited to within 60 s, and the time interval between cut audio segments is 0.00001 ms.
4. The man-machine collaborative editing method according to claim 3, characterized in that if an audio segment has no corresponding text, the conference shorthand terminal marks that segment in the log file.
5. The man-machine collaborative editing method according to any one of claims 1, 2 and 4, characterized in that when the conference shorthand terminal detects a network interruption, it stops sending data to the third-party server and temporarily stores the data in memory; when the network is reconnected, the data is sent from memory to the third-party server in order.
6. The man-machine collaborative editing method according to claim 5, characterized in that while cutting the audio stream, the conference shorthand terminal copies the audio stream and sends the copy to the collaborative editing server.
7. The man-machine collaborative editing method according to claim 1, characterized in that the manual editing terminal has search-and-replace functions: it can directly modify a given character or phrase, can correct all occurrences of the same error in the text in one pass via search and replace, and can highlight the content currently being corrected for the recorder to review.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533479.8A CN110263313B (en) | 2019-06-19 | 2019-06-19 | Man-machine collaborative editing method for conference shorthand |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533479.8A CN110263313B (en) | 2019-06-19 | 2019-06-19 | Man-machine collaborative editing method for conference shorthand |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110263313A CN110263313A (en) | 2019-09-20 |
CN110263313B (en) | 2021-08-24
Family
ID=67919636
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533479.8A Active CN110263313B (en) | 2019-06-19 | 2019-06-19 | Man-machine collaborative editing method for conference shorthand |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110263313B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113421572B * | 2021-06-23 | 2024-02-02 | Ping An Technology (Shenzhen) Co., Ltd. | Real-time audio dialogue report generation method and device, electronic equipment and storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101101590A (en) * | 2006-07-04 | 2008-01-09 | 王建波 | Sound and character correspondence relation table generation method and positioning method |
CN105159870A (en) * | 2015-06-26 | 2015-12-16 | 徐信 | Processing system for precisely completing continuous natural speech textualization and method for precisely completing continuous natural speech textualization |
CN105827417A (en) * | 2016-05-31 | 2016-08-03 | 安徽声讯信息技术有限公司 | Voice quick recording device capable of performing modification at any time in conference recording |
CN105845129A (en) * | 2016-03-25 | 2016-08-10 | 乐视控股(北京)有限公司 | Method and system for dividing sentences in audio and automatic caption generation method and system for video files |
CN106057193A (en) * | 2016-07-13 | 2016-10-26 | 深圳市沃特沃德股份有限公司 | Conference record generation method based on telephone conference and device |
CN106802885A (en) * | 2016-12-06 | 2017-06-06 | 乐视控股(北京)有限公司 | A kind of meeting summary automatic record method, device and electronic equipment |
CN106941000A (en) * | 2017-03-21 | 2017-07-11 | 百度在线网络技术(北京)有限公司 | Voice interactive method and device based on artificial intelligence |
CN106971723A (en) * | 2017-03-29 | 2017-07-21 | 北京搜狗科技发展有限公司 | Method of speech processing and device, the device for speech processes |
CN107451110A (en) * | 2017-07-10 | 2017-12-08 | 珠海格力电器股份有限公司 | Method, device and server for generating conference summary |
CN108008824A (en) * | 2017-12-26 | 2018-05-08 | 安徽声讯信息技术有限公司 | The method that official document takes down in short-hand the collection of this multilink data |
CN108335697A (en) * | 2018-01-29 | 2018-07-27 | 北京百度网讯科技有限公司 | Minutes method, apparatus, equipment and computer-readable medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101458681A (en) * | 2007-12-10 | 2009-06-17 | 株式会社东芝 | Voice translation method and voice translation apparatus |
TWI616868B (en) * | 2014-12-30 | 2018-03-01 | 鴻海精密工業股份有限公司 | Meeting minutes device and method thereof for automatically creating meeting minutes |
- 2019-06-19: CN CN201910533479.8A patent/CN110263313B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110263313A (en) | 2019-09-20 |
Similar Documents
Publication | Title |
---|---|
US9031839B2 | Conference transcription based on conference data |
CN107622054B | Text data error correction method and device |
US11682381B2 | Acoustic model training using corrected terms |
KR101768509B1 | On-line voice translation method and device |
CN110265026B | Conference shorthand system and conference shorthand method |
JP5787780B2 | Transcription support system and transcription support method |
CN101241514A | Method for creating error-correcting database, automatic error correcting method and system |
US20240220741A1 | Voice-based interface for translating utterances between users |
JP2004355630A | Semantic object synchronous understanding implemented with speech application language tag |
CN112328758A | Session intention identification method, device, equipment and storage medium |
CN104240718A | Transcription support device, method, and computer program product |
JP7107229B2 | Information processing device, information processing method, and program |
US20060195318A1 | System for correction of speech recognition results with confidence level indication |
CN110263313B | Man-machine collaborative editing method for conference shorthand |
CN111814494B | Language translation method, device and computer equipment |
CN110265027B | Audio transmission method for conference shorthand system |
CN110264998B | Audio positioning method for conference shorthand system |
CN109275009B | Method and device for controlling synchronization of audio and text |
Meteer et al. | Modeling conversational speech for speech recognition |
JP7107228B2 | Information processing device, information processing method, and program |
US20210225377A1 | Method for transcribing spoken language with real-time gesture-based formatting |
CN112466286A | Data processing method and device, terminal equipment |
Wray et al. | Best practices for crowdsourcing dialectal Arabic speech transcription |
CN112053679A | Role separation conference shorthand system and method based on mobile terminal |
CN107342080B | Conference site synchronous shorthand system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |