CN113068058A - Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology - Google Patents

Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology

Info

Publication number
CN113068058A
CN113068058A (application number CN202110297837.7A)
Authority
CN
China
Prior art keywords
voice
character
information
module
voice information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110297837.7A
Other languages
Chinese (zh)
Inventor
李广垒
陈祖涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Baoxin Information Technology Co ltd
Original Assignee
Anhui Baoxin Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Baoxin Information Technology Co ltd
Priority to CN202110297837.7A
Publication of CN113068058A
Legal status: Pending

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/21 - Server components or server architectures
    • H04N 21/218 - Source of audio or video content, e.g. local disk arrays
    • H04N 21/2187 - Live feed
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/26 - Speech to text systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/00 - Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L 21/0208 - Noise filtering
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/4302 - Content synchronisation processes, e.g. decoder synchronisation
    • H04N 21/4307 - Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/439 - Processing of audio elementary streams
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/4402 - Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/440236 - Reformatting operations by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 - End-user applications
    • H04N 21/488 - Data services, e.g. news ticker
    • H04N 21/4884 - Data services, e.g. news ticker for displaying subtitles

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology, which comprises a voice acquisition module, a voice noise elimination module, a character conversion module, a character voice library, a character verification module, a data receiving module, a data processing module, a master control module and a subtitle playing module. The voice acquisition module comprises two voice acquisition terminals, which acquire real-time voice information during live broadcasting. The real-time voice information is sent to the voice noise elimination module, which denoises the received real-time voice information to obtain denoised voice information. The denoised voice information is sent to the character conversion module, which forwards it to the character voice library for voice-to-text processing. The invention can accurately convert voice into text and provide more accurate subtitle information.

Description

Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology
Technical Field
The invention relates to the field of voice recognition, in particular to a real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technologies.
Background
Language recognition refers to the process by which a computer uses limited features or rules to operate on the language symbols used in daily life in order to recognize characters or words. Text-voice conversion technology is a speech generation technology based on speech synthesis: it can convert text stored in a computer into continuous natural speech, or convert natural speech into text. A subtitle on-screen live broadcasting system is needed when voice content is converted into text and displayed as subtitles during live broadcasting.
When the existing subtitle on-screen live broadcasting system is used, errors easily occur when converting voice into text, which makes the subtitles inaccurate and adversely affects the use of the system.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: the existing subtitle on-screen live broadcasting system easily makes mistakes during voice-to-text conversion, which leads to inaccurate subtitles and adversely affects its use. To solve this problem, the invention provides a real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology.
The invention solves the above technical problems through the following technical scheme: the invention comprises a voice acquisition module, a voice noise elimination module, a character conversion module, a character voice library, a character verification module, a data receiving module, a data processing module, a master control module and a subtitle playing module;
the voice acquisition module comprises two voice acquisition terminals, and the voice acquisition terminals are used for acquiring real-time voice information during live broadcasting;
the real-time voice information is sent to a voice denoising module, the voice denoising module performs denoising processing on the received real-time voice information, and the denoised voice information is obtained after the denoising processing;
the voice information subjected to noise elimination is sent to a character conversion module, and the character conversion module sends the obtained voice information subjected to noise elimination to a character voice library for voice-to-character conversion processing to obtain converted character information;
the character information is sent to a character verification module, and the character verification module is used for performing character verification processing on the converted character information to obtain standard character information;
the standard text information is sent to a data receiving module, and the data receiving module converts the standard text information and processes the standard text information to play text contents;
and the master control module controls the subtitle playing module to synchronously play the text content.
Preferably, the specific processing procedure of the voice acquisition module is as follows:
the method comprises the following steps: the two voice acquisition terminals synchronously acquire voice information and respectively mark the voice information as M1 and M2;
step two: synchronously accelerating and playing the voice information M1 and the voice information M2, extracting the voice information M1 and the voice information M2 which are less than a preset value, and marking the voice information M1 and the voice information M2 as Ki, i is 1 … … n;
step three: all Ki are combined, and then the voice information M1 and the rest of the voice information M2 are combined to obtain the combined voice information MAndspeech information MAndi.e. the voice information to be denoised.
Preferably, the specific denoising process of the voice denoising module is as follows: the voice information to be denoised is imported into the voice denoising module; a deep residual shrinkage network in the voice denoising module automatically eliminates information irrelevant to the current task through soft thresholding layers with adaptive thresholds, accurately identifies strong-noise data and eliminates the strong noise; the denoised voice information is obtained after the strong noise is eliminated.
Preferably, the specific text conversion process of the character conversion module is as follows:
Step one: the denoised voice information is imported and marked as P;
Step two: the voice information P is imported into the character voice library for matching processing;
Step three: when the similarity between a segment of the voice information P and a voice sample prestored in the character voice library exceeds a preset value, the character matching is successful, and the extracted character is marked as a recognized character;
Step four: all recognized characters are arranged and combined according to their recognition time to obtain the converted character information.
Preferably, the specific processing procedure of the character verification module is as follows: the converted character information is extracted and transmitted back to the character voice library, where a text-to-voice process is performed; when the similarity between the voice information produced by the text-to-voice process and the originally input voice exceeds a preset value, the text passes verification, and the verified text is marked as standard character information.
Compared with the prior art, the invention has the following advantages: in the voice acquisition stage, this real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology processes the voice information so that it is clearer, which effectively improves the quality of the acquired voice information and thereby the accuracy of the text conversion; meanwhile, in the voice-to-text stage, the text is converted back into voice for verification after a successful conversion, further ensuring the accuracy of the text conversion. This improves the accuracy of the system and makes it more worthy of popularization and use.
Drawings
FIG. 1 is a system block diagram of the present invention.
Detailed Description
The following examples are given for the detailed implementation and specific operation of the present invention, but the scope of the present invention is not limited to the following examples.
As shown in fig. 1, the present embodiment provides a technical solution: a real-time subtitle on-screen live broadcast system based on voice recognition and transcription technology comprises a voice acquisition module, a voice noise elimination module, a character conversion module, a character voice library, a character verification module, a data receiving module, a data processing module, a master control module and a subtitle playing module;
the voice acquisition module comprises two voice acquisition terminals, and the voice acquisition terminals are used for acquiring real-time voice information during live broadcasting;
the real-time voice information is sent to a voice denoising module, the voice denoising module performs denoising processing on the received real-time voice information, and the denoised voice information is obtained after the denoising processing;
the voice information subjected to noise elimination is sent to a character conversion module, and the character conversion module sends the obtained voice information subjected to noise elimination to a character voice library for voice-to-character conversion processing to obtain converted character information;
the character information is sent to a character verification module, and the character verification module is used for performing character verification processing on the converted character information to obtain standard character information;
the standard text information is sent to a data receiving module, and the data receiving module converts and processes the standard text information to obtain the text content to be played;
and the master control module controls the subtitle playing module to synchronously play the text content.
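Purely as an illustration of how the above modules interact (and not as a definition of the claimed system), the chain can be sketched in Python as follows; every function name and signature in this sketch is an assumption introduced for the example.

    from typing import Callable, List, Tuple

    def run_pipeline(
        capture: Callable[[], Tuple[List[float], List[float]]],   # two voice acquisition terminals
        merge: Callable[[List[float], List[float]], List[float]], # voice acquisition merge step
        denoise: Callable[[List[float]], List[float]],            # voice noise elimination module
        to_text: Callable[[List[float]], str],                    # character conversion module + library
        verify: Callable[[str, List[float]], bool],               # character verification module
        show: Callable[[str], None],                              # subtitle playing module
    ) -> None:
        m1, m2 = capture()              # real-time voice information from both terminals
        merged = merge(m1, m2)          # voice information to be denoised
        clean = denoise(merged)         # denoised voice information
        text = to_text(clean)           # converted character information
        if verify(text, clean):         # round-trip verification against the original voice
            show(text)                  # master control drives synchronized subtitle playback

Under this reading, the data receiving module, the data processing module and the master control module would sit behind the show callable, which is only a placeholder here.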
The specific processing procedure of the voice acquisition module is as follows:
Step one: the two voice acquisition terminals synchronously acquire voice information, which is marked as M1 and M2 respectively;
Step two: the voice information M1 and the voice information M2 are synchronously played back at an accelerated speed, and the portions of M1 and M2 that are below a preset value are extracted and marked as Ki, i = 1 … n;
Step three: all Ki are combined, and the result is then combined with the remaining portions of the voice information M1 and M2 to obtain the combined voice information; this combined voice information is the voice information to be denoised.
The specific denoising process of the voice denoising module is as follows: the voice information to be denoised is imported into the voice denoising module; a deep residual shrinkage network in the voice denoising module automatically eliminates information irrelevant to the current task through soft thresholding layers with adaptive thresholds, accurately identifies strong-noise data and eliminates the strong noise; the denoised voice information is obtained after the strong noise is eliminated.
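The characteristic operation of a deep residual shrinkage network is soft thresholding with an input-adaptive threshold. The sketch below shows only that shrinkage step on framed audio; in the actual network the per-frame threshold would be produced by a small learned attention sub-network and the shrunk features would feed a residual connection, so the fixed scale factor used here is an assumption for illustration.

    import numpy as np

    def soft_threshold(x: np.ndarray, tau: float) -> np.ndarray:
        # Shrink every value toward zero and zero out anything below tau.
        return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

    def shrinkage_denoise(signal: np.ndarray, frame_len: int = 256, scale: float = 0.5) -> np.ndarray:
        # Process the signal frame by frame with a per-frame adaptive threshold.
        out = np.copy(signal)
        for start in range(0, len(signal) - frame_len + 1, frame_len):
            frame = signal[start:start + frame_len]
            tau = scale * np.mean(np.abs(frame))          # adaptive threshold for this frame
            out[start:start + frame_len] = soft_threshold(frame, tau)
        return out

    if __name__ == "__main__":
        noisy = np.sin(np.linspace(0, 20 * np.pi, 4096)) + 0.3 * np.random.randn(4096)
        print(shrinkage_denoise(noisy)[:5])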
The specific text conversion process of the character conversion module is as follows:
Step one: the denoised voice information is imported and marked as P;
Step two: the voice information P is imported into the character voice library for matching processing;
Step three: when the similarity between a segment of the voice information P and a voice sample prestored in the character voice library exceeds a preset value, the character matching is successful, and the extracted character is marked as a recognized character;
Step four: all recognized characters are arranged and combined according to their recognition time to obtain the converted character information.
The specific processing procedure of the character verification module is as follows: the converted character information is extracted and transmitted back to the character voice library, where a text-to-voice process is performed; when the similarity between the voice information produced by the text-to-voice process and the originally input voice exceeds a preset value, the text passes verification, and the verified text is marked as standard character information.
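The round-trip check described above can be sketched as follows. The synthesize callable stands in for whatever text-to-voice facility the character voice library provides; its interface, the feature comparison and the 0.8 preset are assumptions made only so that the check can be shown end to end.

    import numpy as np
    from typing import Callable

    def verify_text(text: str,
                    original_voice: np.ndarray,
                    synthesize: Callable[[str], np.ndarray],
                    preset: float = 0.8) -> bool:
        # Convert the recognized text back into a voice feature vector and compare
        # it with the originally input voice; the text passes verification when
        # the similarity exceeds the preset value.
        resynthesized = synthesize(text)
        sim = float(np.dot(resynthesized, original_voice) /
                    (np.linalg.norm(resynthesized) * np.linalg.norm(original_voice) + 1e-12))
        return sim > preset

    if __name__ == "__main__":
        fake_tts = lambda s: np.array([len(s), 1.0])              # toy stand-in for text-to-voice
        print(verify_text("ab", np.array([2.0, 1.0]), fake_tts))  # True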
In summary, when the invention is used, the voice acquisition module, which includes two voice acquisition terminals, acquires real-time voice information during live broadcasting. The real-time voice information is sent to the voice denoising module, which denoises the received real-time voice information to obtain denoised voice information. The denoised voice information is sent to the character conversion module, which forwards it to the character voice library for voice-to-text processing and obtains the converted character information. The character information is sent to the character verification module, which performs character verification on the converted character information to obtain standard character information. The standard character information is sent to the data receiving module, which converts and processes it, and the master control module controls the subtitle playing module to play the text content synchronously.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (5)

1. A real-time subtitle on-screen live broadcast system based on voice recognition and transcription technology is characterized by comprising a voice acquisition module, a voice noise elimination module, a character conversion module, a character voice library, a character verification module, a data receiving module, a data processing module, a master control module and a subtitle playing module;
the voice acquisition module comprises two voice acquisition terminals, and the voice acquisition terminals are used for acquiring real-time voice information during live broadcasting;
the real-time voice information is sent to a voice denoising module, the voice denoising module performs denoising processing on the received real-time voice information, and the denoised voice information is obtained after the denoising processing;
the voice information subjected to noise elimination is sent to a character conversion module, and the character conversion module sends the obtained voice information subjected to noise elimination to a character voice library for voice-to-character conversion processing to obtain converted character information;
the character information is sent to a character verification module, and the character verification module is used for performing character verification processing on the converted character information to obtain standard character information;
the standard text information is sent to a data receiving module, and the data receiving module converts and processes the standard text information to obtain the text content to be played;
and the master control module controls the subtitle playing module to synchronously play the text content.
2. The system of claim 1, wherein the specific processing procedure of the voice acquisition module is as follows:
Step one: the two voice acquisition terminals synchronously acquire voice information, which is marked as M1 and M2 respectively;
Step two: the voice information M1 and the voice information M2 are synchronously played back at an accelerated speed, and the portions of M1 and M2 that are below a preset value are extracted and marked as Ki, i = 1 … n;
Step three: all Ki are combined, and the result is then combined with the remaining portions of the voice information M1 and M2 to obtain the combined voice information; this combined voice information is the voice information to be denoised.
3. The system of claim 1, wherein the specific denoising process of the voice denoising module is as follows: the voice information to be denoised is imported into the voice denoising module; a deep residual shrinkage network in the voice denoising module automatically eliminates information irrelevant to the current task through soft thresholding layers with adaptive thresholds, accurately identifies strong-noise data and eliminates the strong noise; the denoised voice information is obtained after the strong noise is eliminated.
4. The system of claim 1, wherein the specific text conversion process of the character conversion module is as follows:
Step one: the denoised voice information is imported and marked as P;
Step two: the voice information P is imported into the character voice library for matching processing;
Step three: when the similarity between a segment of the voice information P and a voice sample prestored in the character voice library exceeds a preset value, the character matching is successful, and the extracted character is marked as a recognized character;
Step four: all recognized characters are arranged and combined according to their recognition time to obtain the converted character information.
5. The system of claim 1, wherein the specific processing procedure of the character verification module is as follows: the converted character information is extracted and transmitted back to the character voice library, where a text-to-voice process is performed; when the similarity between the voice information produced by the text-to-voice process and the originally input voice exceeds a preset value, the text passes verification, and the verified text is marked as standard character information.
CN202110297837.7A 2021-03-19 2021-03-19 Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology Pending CN113068058A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110297837.7A CN113068058A (en) 2021-03-19 2021-03-19 Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology


Publications (1)

Publication Number Publication Date
CN113068058A (en) 2021-07-02

Family

ID=76562544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110297837.7A Pending CN113068058A (en) 2021-03-19 2021-03-19 Real-time subtitle on-screen live broadcasting system based on voice recognition and transcription technology

Country Status (1)

Country Link
CN (1) CN113068058A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105161104A (en) * 2015-07-31 2015-12-16 北京云知声信息技术有限公司 Voice processing method and device
CN105374356A (en) * 2014-08-29 2016-03-02 株式会社理光 Speech recognition method, speech assessment method, speech recognition system, and speech assessment system
CN106340294A (en) * 2016-09-29 2017-01-18 安徽声讯信息技术有限公司 Synchronous translation-based news live streaming subtitle on-line production system
CN106409296A (en) * 2016-09-14 2017-02-15 安徽声讯信息技术有限公司 Voice rapid transcription and correction system based on multi-core processing technology
US20170098447A1 (en) * 2014-11-28 2017-04-06 Shenzhen Skyworth-Rgb Electronic Co., Ltd. Voice recognition method and system
CN109741749A (en) * 2018-04-19 2019-05-10 北京字节跳动网络技术有限公司 A kind of method and terminal device of speech recognition
CN110085210A (en) * 2019-03-15 2019-08-02 平安科技(深圳)有限公司 Interactive information test method, device, computer equipment and storage medium
CN111883110A (en) * 2020-07-30 2020-11-03 上海携旅信息技术有限公司 Acoustic model training method, system, device and medium for speech recognition



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210702