CN113068058A - Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology - Google Patents
- Publication number
- CN113068058A CN113068058A CN202110297837.7A CN202110297837A CN113068058A CN 113068058 A CN113068058 A CN 113068058A CN 202110297837 A CN202110297837 A CN 202110297837A CN 113068058 A CN113068058 A CN 113068058A
- Authority
- CN
- China
- Prior art keywords
- voice
- character
- information
- module
- voice information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/2187—Live feed
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/4302—Content synchronisation processes, e.g. decoder synchronisation
- H04N21/4307—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440236—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Databases & Information Systems (AREA)
- Quality & Reliability (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The invention discloses a real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology, which comprises a voice acquisition module, a voice noise elimination module, a character conversion module, a character voice library, a character verification module, a data receiving module, a data processing module, a master control module and a subtitle playing module. The voice acquisition module comprises two voice acquisition terminals used for acquiring real-time voice information during live broadcasting; the real-time voice information is sent to the voice denoising module, which performs denoising processing on the received real-time voice information to obtain denoised voice information; the denoised voice information is sent to the character conversion module, which forwards it to the character voice library for voice-to-character processing. The invention can accurately convert voice to text and thereby provide more accurate subtitle information.
Description
Technical Field
The invention relates to the field of voice recognition, in particular to a real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technologies.
Background
Speech recognition refers to the process by which a computer, using a limited set of features or rules, operates on the language symbols used in daily life to identify characters or words. Character-voice conversion technology is a voice generation technology based on voice synthesis that can convert text in a computer into continuous natural language, or convert natural language into written characters. A subtitle on-screen live broadcasting system is needed when voice content is converted into characters and displayed as subtitles during live broadcasting.
With the existing subtitle on-screen live broadcasting system, errors easily occur when voice is converted into characters, making the subtitles inaccurate and adversely affecting the use of the system.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in the existing subtitle on-screen live broadcasting system, errors easily occur during voice-to-character conversion, leading to inaccurate subtitles and adversely affecting the use of the system. The invention therefore provides a real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology.
The invention solves the technical problems through the following technical scheme, and the invention comprises a voice acquisition module, a voice noise elimination module, a character conversion module, a character voice library, a character verification module, a data receiving module, a data processing module, a master control module and a subtitle playing module;
the voice acquisition module comprises two voice acquisition terminals, and the voice acquisition terminals are used for acquiring real-time voice information during live broadcasting;
the real-time voice information is sent to a voice denoising module, the voice denoising module performs denoising processing on the received real-time voice information, and the denoised voice information is obtained after the denoising processing;
the voice information subjected to noise elimination is sent to a character conversion module, and the character conversion module sends the obtained voice information subjected to noise elimination to a character voice library for voice-to-character conversion processing to obtain converted character information;
the character information is sent to a character verification module, and the character verification module is used for performing character verification processing on the converted character information to obtain standard character information;
the standard text information is sent to a data receiving module, and the data receiving module converts the standard text information and processes the standard text information to play text contents;
and the master control module controls the subtitle playing module to synchronously play the text content.
Preferably, the specific processing procedure of the voice acquisition module is as follows:
step one: the two voice acquisition terminals synchronously acquire voice information and mark it as M1 and M2 respectively;
step two: synchronously playing the voice information M1 and the voice information M2 at accelerated speed, extracting the segments of M1 and M2 that fall below a preset value, and marking them as Ki, i = 1…n;
step three: combining all Ki, then combining the result with the remaining parts of the voice information M1 and the voice information M2 to obtain the combined voice information M'; the voice information M' is the voice information to be denoised.
Preferably, the voice denoising module performs denoising processing as follows: the voice information requiring denoising is imported into the voice denoising module; a deep residual shrinkage network in the module automatically eliminates information irrelevant to the current task through a soft-thresholding layer with an adaptive threshold, accurately identifies strong-noise data, and eliminates the strong noise; the denoised voice information is obtained after the strong noise is eliminated.
Preferably, the specific process of the text conversion module for performing text conversion is as follows:
step one: importing the voice information subjected to noise reduction processing and marking it as P;
step two: leading the voice information P into a character voice library for matching processing;
step three: when the similarity between the voice information P and the voice characters prestored in the character voice library exceeds a preset value, the character match is successful, and the extracted character is marked as an identification character;
step four: and arranging and combining all the identification characters according to the identification time to obtain the converted character information.
Preferably, the specific processing procedure of the character verification module is as follows: the converted character information is extracted and transmitted back to the character voice library for a character-to-voice process; when the similarity between the voice information produced by the character-to-voice process and the original input voice exceeds a preset value, the character verification passes, and the verified characters are marked as standard character information.
Compared with the prior art, the invention has the following advantages: in the voice acquisition stage, this real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology processes the voice information for greater clarity, which effectively improves the quality of the acquired voice information and thereby the accuracy of the converted characters; meanwhile, in the voice-to-character stage, after a successful conversion the characters are converted back to voice for verification, further ensuring the accuracy of the character conversion. This improves the accuracy of the system and makes it more worthy of wide application.
Drawings
FIG. 1 is a system block diagram of the present invention.
Detailed Description
The following examples are given for the detailed implementation and specific operation of the present invention, but the scope of the present invention is not limited to the following examples.
As shown in fig. 1, the present embodiment provides a technical solution: a real-time subtitle on-screen live broadcast system based on voice recognition and transcription technology comprises a voice acquisition module, a voice noise elimination module, a character conversion module, a character voice library, a character verification module, a data receiving module, a data processing module, a master control module and a subtitle playing module;
the voice acquisition module comprises two voice acquisition terminals, and the voice acquisition terminals are used for acquiring real-time voice information during live broadcasting;
the real-time voice information is sent to a voice denoising module, the voice denoising module performs denoising processing on the received real-time voice information, and the denoised voice information is obtained after the denoising processing;
the voice information subjected to noise elimination is sent to a character conversion module, and the character conversion module sends the obtained voice information subjected to noise elimination to a character voice library for voice-to-character conversion processing to obtain converted character information;
the character information is sent to a character verification module, and the character verification module is used for performing character verification processing on the converted character information to obtain standard character information;
the standard text information is sent to a data receiving module, and the data receiving module converts the standard text information and processes the standard text information to play text contents;
and the master control module controls the subtitle playing module to synchronously play the text content.
The specific processing process of the voice acquisition module is as follows:
step one: the two voice acquisition terminals synchronously acquire voice information and mark it as M1 and M2 respectively;
step two: synchronously playing the voice information M1 and the voice information M2 at accelerated speed, extracting the segments of M1 and M2 that fall below a preset value, and marking them as Ki, i = 1…n;
step three: combining all Ki, then combining the result with the remaining parts of the voice information M1 and the voice information M2 to obtain the combined voice information M'; the voice information M' is the voice information to be denoised.
The voice denoising module performs denoising processing as follows: the voice information requiring denoising is imported into the voice denoising module; a deep residual shrinkage network in the module automatically eliminates information irrelevant to the current task through a soft-thresholding layer with an adaptive threshold, accurately identifies strong-noise data, and eliminates the strong noise; the denoised voice information is obtained after the strong noise is eliminated.
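The soft-thresholding operation at the heart of a deep residual shrinkage network can be shown in isolation. This is a minimal sketch of the operator only, not of the full network; the `adaptive_tau` heuristic is a hypothetical stand-in for the threshold the network would learn per channel.

```python
import numpy as np

def soft_threshold(x: np.ndarray, tau: float) -> np.ndarray:
    """Soft thresholding: shrink each value toward zero by tau and
    zero out anything whose magnitude is below tau. This is the core
    operation of the soft-thresholding layer mentioned in the text."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def adaptive_tau(features: np.ndarray, scale: float = 0.5) -> float:
    """Simplified stand-in for the adaptive threshold: in a deep
    residual shrinkage network the scale is learned by a small
    attention branch; here it is a fixed fraction of the mean
    absolute feature value."""
    return scale * float(np.abs(features).mean())
```

Small-magnitude (noise-dominated) features are driven exactly to zero while strong speech features are only shifted, which is why the layer can discard information irrelevant to the current task.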
The specific process of the character conversion module for performing character conversion is as follows:
step one: importing the voice information subjected to noise reduction processing and marking it as P;
step two: leading the voice information P into a character voice library for matching processing;
step three: when the similarity between the voice information P and the voice characters prestored in the character voice library exceeds a preset value, the character match is successful, and the extracted character is marked as an identification character;
step four: and arranging and combining all the identification characters according to the identification time to obtain the converted character information.
The specific processing procedure of the character verification module is as follows: the converted character information is extracted and transmitted back to the character voice library for a character-to-voice process; when the similarity between the voice information produced by the character-to-voice process and the original input voice exceeds a preset value, the character verification passes, and the verified characters are marked as standard character information.
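The verification round-trip reduces to a few lines. `tts` (the character-to-voice step) and `similarity` are assumed external components supplied by the character voice library; neither is defined by the patent, so their names and signatures are hypothetical.

```python
def verify(converted_text, original_audio, tts, similarity, threshold=0.8):
    """Sketch of the character verification module: synthesise the
    converted text back to voice, compare it with the original input
    voice, and accept the text as standard character information only
    when the similarity exceeds the preset value; otherwise return
    None to signal a failed verification."""
    resynth = tts(converted_text)
    if similarity(resynth, original_audio) > threshold:
        return converted_text  # standard character information
    return None
```

The design choice here is that verification is a gate, not a correction: text that fails the round-trip is rejected rather than patched, which matches the module's role of producing only "standard" character information.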
In summary, when the invention is used: the voice acquisition module comprises two voice acquisition terminals used for acquiring real-time voice information during live broadcasting; the real-time voice information is sent to the voice denoising module, which performs denoising processing on it to obtain denoised voice information; the denoised voice information is sent to the character conversion module, which forwards it to the character voice library for voice-to-character processing to obtain converted character information; the character information is sent to the character verification module, which performs character verification processing to obtain standard character information; the standard character information is sent to the data receiving module, which converts it for playback; and the master control module controls the subtitle playing module to play the text content synchronously.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.
Claims (5)
1. A real-time subtitle on-screen live broadcast system based on voice recognition and transcription technology is characterized by comprising a voice acquisition module, a voice noise elimination module, a character conversion module, a character voice library, a character verification module, a data receiving module, a data processing module, a master control module and a subtitle playing module;
the voice acquisition module comprises two voice acquisition terminals, and the voice acquisition terminals are used for acquiring real-time voice information during live broadcasting;
the real-time voice information is sent to a voice denoising module, the voice denoising module performs denoising processing on the received real-time voice information, and the denoised voice information is obtained after the denoising processing;
the voice information subjected to noise elimination is sent to a character conversion module, and the character conversion module sends the obtained voice information subjected to noise elimination to a character voice library for voice-to-character conversion processing to obtain converted character information;
the character information is sent to a character verification module, and the character verification module is used for performing character verification processing on the converted character information to obtain standard character information;
the standard text information is sent to a data receiving module, and the data receiving module converts the standard text information and processes the standard text information to play text contents;
and the master control module controls the subtitle playing module to synchronously play the text content.
2. The system of claim 1, wherein the specific processing process of the voice acquisition module is as follows:
step one: the two voice acquisition terminals synchronously acquire voice information and mark it as M1 and M2 respectively;
step two: synchronously playing the voice information M1 and the voice information M2 at accelerated speed, extracting the segments of M1 and M2 that fall below a preset value, and marking them as Ki, i = 1…n;
step three: combining all Ki, then combining the result with the remaining parts of the voice information M1 and the voice information M2 to obtain the combined voice information M'; the voice information M' is the voice information to be denoised.
3. The system of claim 1, wherein the voice denoising module performs denoising processing as follows: the voice information requiring denoising is imported into the voice denoising module; a deep residual shrinkage network in the module automatically eliminates information irrelevant to the current task through a soft-thresholding layer with an adaptive threshold, accurately identifies strong-noise data, and eliminates the strong noise; the denoised voice information is obtained after the strong noise is eliminated.
4. The system of claim 1, wherein the specific process of the character conversion module for performing character conversion is as follows:
step one: importing the voice information subjected to noise reduction processing and marking it as P;
step two: leading the voice information P into a character voice library for matching processing;
step three: when the similarity between the voice information P and the voice characters prestored in the character voice library exceeds a preset value, the character match is successful, and the extracted character is marked as an identification character;
step four: and arranging and combining all the identification characters according to the identification time to obtain the converted character information.
5. The system of claim 1, wherein the specific processing procedure of the character verification module is as follows: the converted character information is extracted and transmitted back to the character voice library for a character-to-voice process; when the similarity between the voice information produced by the character-to-voice process and the original input voice exceeds a preset value, the character verification passes, and the verified characters are marked as standard character information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110297837.7A CN113068058A (en) | 2021-03-19 | 2021-03-19 | Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110297837.7A CN113068058A (en) | 2021-03-19 | 2021-03-19 | Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113068058A true CN113068058A (en) | 2021-07-02 |
Family
ID=76562544
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110297837.7A Pending CN113068058A (en) | 2021-03-19 | 2021-03-19 | Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113068058A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105161104A (en) * | 2015-07-31 | 2015-12-16 | 北京云知声信息技术有限公司 | Voice processing method and device |
CN105374356A (en) * | 2014-08-29 | 2016-03-02 | 株式会社理光 | Speech recognition method, speech assessment method, speech recognition system, and speech assessment system |
CN106340294A (en) * | 2016-09-29 | 2017-01-18 | 安徽声讯信息技术有限公司 | Synchronous translation-based news live streaming subtitle on-line production system |
CN106409296A (en) * | 2016-09-14 | 2017-02-15 | 安徽声讯信息技术有限公司 | Voice rapid transcription and correction system based on multi-core processing technology |
US20170098447A1 (en) * | 2014-11-28 | 2017-04-06 | Shenzhen Skyworth-Rgb Electronic Co., Ltd. | Voice recognition method and system |
CN109741749A (en) * | 2018-04-19 | 2019-05-10 | 北京字节跳动网络技术有限公司 | A kind of method and terminal device of speech recognition |
CN110085210A (en) * | 2019-03-15 | 2019-08-02 | 平安科技(深圳)有限公司 | Interactive information test method, device, computer equipment and storage medium |
CN111883110A (en) * | 2020-07-30 | 2020-11-03 | 上海携旅信息技术有限公司 | Acoustic model training method, system, device and medium for speech recognition |
- 2021-03-19: CN CN202110297837.7A patent/CN113068058A/en active Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105374356A (en) * | 2014-08-29 | 2016-03-02 | 株式会社理光 | Speech recognition method, speech assessment method, speech recognition system, and speech assessment system |
US20170098447A1 (en) * | 2014-11-28 | 2017-04-06 | Shenzhen Skyworth-Rgb Electronic Co., Ltd. | Voice recognition method and system |
CN105161104A (en) * | 2015-07-31 | 2015-12-16 | 北京云知声信息技术有限公司 | Voice processing method and device |
CN106409296A (en) * | 2016-09-14 | 2017-02-15 | 安徽声讯信息技术有限公司 | Voice rapid transcription and correction system based on multi-core processing technology |
CN106340294A (en) * | 2016-09-29 | 2017-01-18 | 安徽声讯信息技术有限公司 | Synchronous translation-based news live streaming subtitle on-line production system |
CN109741749A (en) * | 2018-04-19 | 2019-05-10 | 北京字节跳动网络技术有限公司 | A kind of method and terminal device of speech recognition |
CN110085210A (en) * | 2019-03-15 | 2019-08-02 | 平安科技(深圳)有限公司 | Interactive information test method, device, computer equipment and storage medium |
CN111883110A (en) * | 2020-07-30 | 2020-11-03 | 上海携旅信息技术有限公司 | Acoustic model training method, system, device and medium for speech recognition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107945792B (en) | Voice processing method and device | |
CN111968649A (en) | Subtitle correction method, subtitle display method, device, equipment and medium | |
RU2251737C2 (en) | Method for automatic recognition of language of recognized text in case of multilingual recognition | |
US10529340B2 (en) | Voiceprint registration method, server and storage medium | |
CN111681642B (en) | Speech recognition evaluation method, device, storage medium and equipment | |
WO2019218467A1 (en) | Method and apparatus for dialect recognition in voice and video calls, terminal device, and medium | |
CN111986656B (en) | Teaching video automatic caption processing method and system | |
CN109710949B (en) | Translation method and translator | |
CN108305618B (en) | Voice acquisition and search method, intelligent pen, search terminal and storage medium | |
CN107102990A (en) | The method and apparatus translated to voice | |
CN104347071B (en) | Method and system for generating reference answers of spoken language test | |
CN111402892A (en) | Conference recording template generation method based on voice recognition | |
US20240064383A1 (en) | Method and Apparatus for Generating Video Corpus, and Related Device | |
CN113035199A (en) | Audio processing method, device, equipment and readable storage medium | |
CN111613215A (en) | Voice recognition method and device | |
CN113658594A (en) | Lyric recognition method, device, equipment, storage medium and product | |
CN112270917B (en) | Speech synthesis method, device, electronic equipment and readable storage medium | |
CN112466287B (en) | Voice segmentation method, device and computer readable storage medium | |
CN113068058A (en) | Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology | |
CN116403583A (en) | Voice data processing method and device, nonvolatile storage medium and vehicle | |
CN112233679B (en) | Artificial intelligence speech recognition system | |
CN112509567B (en) | Method, apparatus, device, storage medium and program product for processing voice data | |
CN112487804B (en) | Chinese novel speech synthesis system based on semantic context scene | |
CN110428668B (en) | Data extraction method and device, computer system and readable storage medium | |
CN114490929A (en) | Bidding information acquisition method and device, storage medium and terminal equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210702 |