CN113207032A - System and method for increasing subtitles by recording videos in intelligent classroom - Google Patents
- Publication number
- CN113207032A
- Application number
- CN202110477210.XA
- Authority
- CN
- China
- Prior art keywords
- time
- segment
- audio
- text content
- recorded
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/2187—Live feed
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/433—Content storage operation, e.g. storage operation in response to a pause request, caching operations
- H04N21/4334—Recording operations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8456—Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
The invention discloses a system and a method for adding subtitles to videos recorded in an intelligent classroom, which automatically extract the audio from the recorded video, perform voice recognition on it, and align the resulting subtitles with the video. With the system and the method, subtitle content can be added to the recorded video accurately and quickly, a large amount of manual review and translation work is avoided, subtitle generation becomes more efficient, and course quality is safeguarded.
Description
Technical Field
The invention relates to the technical field of intelligent classrooms, and in particular to a system and a method for adding subtitles to videos recorded in an intelligent classroom.
Background
Currently, students in a live-broadcast class follow the lesson by watching the PPT and the explanation presented by the teacher. When a student does not clearly hear what the teacher said, the student usually has to play back the live broadcast and work out the content with the help of the PPT, or ask other classmates who are watching the same broadcast. Both approaches cost the student extra time in class and prevent the student from grasping in real time the information the teacher intends to convey. Recorded lessons can be played back selectively, but when the teacher's explanation in the recording is itself unclear, students' enthusiasm for learning is often dampened, their willingness to attend class decreases, and the effect of live-broadcast teaching suffers.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a system and a method for adding subtitles to videos recorded in an intelligent classroom.
To achieve this purpose, the invention adopts the following technical solution:
A system for adding subtitles to videos recorded in an intelligent classroom comprises:
a video recording module: used for recording the live video of an intelligent classroom to obtain a recorded video file;
an audio extraction module: used for extracting the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
a voice recognition module: used for performing voice recognition on the recorded audio file to recognize the corresponding text content, recording the start time and the end time of each audio segment from which text content can be recognized, and establishing an association between each audio segment and its text content;
a subtitle adding module: used for displaying each piece of text content in the corresponding video segment of the recorded video file according to the start time and the end time of each audio segment, wherein the start display time and the end display time of each piece of text content correspond to the start time and the end time of the corresponding video segment, thereby obtaining a video file with subtitle content added;
an editing module: used for allowing the user to modify the video file with subtitle content added, including changing the start display time of the text content and modifying the text content.
Further, in the system, the voice recognition module performs voice recognition on the recorded audio file in chronological order. When text content is recognized for the first time, the current time is recorded as the start time of an audio segment. When, starting from a certain moment, no text content has been recognized for longer than a preset duration, that moment is recorded as the end time of the audio segment. The moment at which text content is next recognized is recorded as the start time of the next audio segment, and so on, so that every audio segment from which text content can be recognized is identified together with its start time and end time.
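The segmentation rule described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the frame-level input format, the names `AudioSegment`, `segment_by_silence`, and `max_gap`, and the choice to join recognized pieces with spaces are all assumptions made for illustration. The recognizer is assumed to emit, in chronological order, one tuple per audio frame holding the frame's start time, end time, and recognized text (an empty string when nothing is recognized).

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class AudioSegment:
    start: float  # seconds from the beginning of the recording
    end: float    # moment from which no more text was recognized
    text: str     # text content associated with this audio segment


def segment_by_silence(
    frames: Iterable[Tuple[float, float, str]],
    max_gap: float,
) -> List[AudioSegment]:
    """Group recognized frames into segments separated by long silences.

    frames: hypothetical recognizer output, (frame_start, frame_end, text).
    max_gap: the preset duration after which a segment is closed.
    """
    segments: List[AudioSegment] = []
    start = None            # start time of the segment being built
    last_speech_end = 0.0   # end of the most recent frame that had text
    pieces: List[str] = []
    for frame_start, frame_end, text in frames:
        if text:
            if start is None:
                start = frame_start  # text recognized again: new segment
            pieces.append(text)
            last_speech_end = frame_end
        elif start is not None and frame_end - last_speech_end > max_gap:
            # Silence has exceeded the preset duration: the segment ends
            # at the moment recognition stopped, not at the current frame.
            segments.append(AudioSegment(start, last_speech_end, " ".join(pieces)))
            start, pieces = None, []
    if start is not None:  # close a segment still open at end of audio
        segments.append(AudioSegment(start, last_speech_end, " ".join(pieces)))
    return segments
```

A short pause (no longer than `max_gap`) keeps the surrounding speech in one segment, so the preset duration effectively controls how finely the recording is split into subtitle lines.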
The invention also provides a method using the system, which proceeds as follows:
during the live broadcast of an intelligent classroom, the video recording module records the video synchronously; when the live broadcast ends, the recording stops and a recorded video file is obtained;
the audio extraction module extracts the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
the voice recognition module performs voice recognition on the recorded audio file to recognize the corresponding text content, records the start time and the end time of each audio segment from which text content can be recognized, and establishes an association between each audio segment and its text content;
when the user triggers a subtitle adding event, the subtitle adding module displays each piece of text content in the corresponding video segment of the recorded video file according to the start time and the end time of each audio segment, wherein the start display time and the end display time of each piece of text content correspond to the start time and the end time of the corresponding video segment, thereby obtaining a video file with subtitle content added;
when the user finds that the text content does not match the picture of the recorded video file, the start display time of the corresponding text content can be moved earlier or later through the editing module so that it matches the picture of the recorded video file; when the user finds an error in the text content, the text content can be corrected through the editing module.
Further, in the method, the voice recognition module performs voice recognition on the recorded audio file in chronological order. When text content is recognized for the first time, the current time is recorded as the start time of an audio segment. When, starting from a certain moment, no text content has been recognized for longer than a preset duration, that moment is recorded as the end time of the audio segment. The moment at which text content is next recognized is recorded as the start time of the next audio segment, and so on, so that every audio segment from which text content can be recognized is identified together with its start time and end time.
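The subtitle adding step of the method maps each audio segment's start and end time to the display interval of its text. One way to picture this is to render the segments as SubRip (SRT) cues, as in the sketch below. The patent does not name a subtitle format, so the use of SRT here is an assumption, as are the function names `format_timestamp` and `to_srt` and the representation of a segment as a `(start, end, text)` tuple in seconds.

```python
from typing import List, Tuple


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) segments as numbered SRT cues.

    Each cue is displayed from the segment's start time to its end time,
    mirroring the association between audio segments and text content.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(blocks)  # blank line between consecutive cues
```

Keeping the subtitles in a separate cue file rather than burning them into the picture would also make the later editing step simpler, since a cue's time or text can be changed without re-encoding the video. That design choice is likewise an assumption, not something the patent specifies.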
The beneficial effects of the invention are as follows: with the system and the method, subtitle content can be added to the recorded video accurately and quickly, a large amount of manual review and translation work is avoided, subtitle generation becomes more efficient, and course quality is safeguarded.
Detailed Description
The present invention is further described below. It should be noted that the embodiments are based on the above technical solution and provide a detailed implementation and a specific operation process, but the protection scope of the present invention is not limited to these embodiments.
Example 1
This embodiment provides a system for adding subtitles to videos recorded in an intelligent classroom, which comprises:
a video recording module: used for recording the live video of an intelligent classroom to obtain a recorded video file;
an audio extraction module: used for extracting the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
a voice recognition module: used for performing voice recognition on the recorded audio file to recognize the corresponding text content, recording the start time and the end time of each audio segment from which text content can be recognized, and establishing an association between each audio segment and its text content;
a subtitle adding module: used for displaying each piece of text content in the corresponding video segment of the recorded video file according to the start time and the end time of each audio segment, wherein the start display time and the end display time of each piece of text content correspond to the start time and the end time of the corresponding video segment, thereby obtaining a video file with subtitle content added;
an editing module: used for allowing the user to modify the video file with subtitle content added, including changing the start display time of the text content and modifying the text content.
Further, the voice recognition module performs voice recognition on the recorded audio file in chronological order. When text content is recognized for the first time, the current time is recorded as the start time of an audio segment. When, starting from a certain moment, no text content has been recognized for longer than a preset duration, that moment is recorded as the end time of the audio segment. The moment at which text content is next recognized is recorded as the start time of the next audio segment, and so on, so that every audio segment from which text content can be recognized is identified together with its start time and end time.
Example 2
This embodiment provides a method using the system described in Example 1, which proceeds as follows:
during the live broadcast of an intelligent classroom, the video recording module records the video synchronously; when the live broadcast ends, the recording stops and a recorded video file is obtained;
the audio extraction module extracts the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
the voice recognition module performs voice recognition on the recorded audio file to recognize the corresponding text content, records the start time and the end time of each audio segment from which text content can be recognized, and establishes an association between each audio segment and its text content;
when the user triggers a subtitle adding event, the subtitle adding module displays each piece of text content in the corresponding video segment of the recorded video file according to the start time and the end time of each audio segment, wherein the start display time and the end display time of each piece of text content correspond to the start time and the end time of the corresponding video segment, thereby obtaining a video file with subtitle content added;
when the user finds that the text content does not match the picture of the recorded video file, the start display time of the corresponding text content can be moved earlier or later through the editing module so that it matches the picture of the recorded video file; when the user finds an error in the text content, the text content can be corrected through the editing module.
In the method, the voice recognition module performs voice recognition on the recorded audio file in chronological order. When text content is recognized for the first time, the current time is recorded as the start time of an audio segment. When, starting from a certain moment, no text content has been recognized for longer than a preset duration, that moment is recorded as the end time of the audio segment. The moment at which text content is next recognized is recorded as the start time of the next audio segment, and so on, so that every audio segment from which text content can be recognized is identified together with its start time and end time.
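The two editing operations in this embodiment, moving a subtitle's start display time earlier or later and correcting its text, can be sketched as a single function over a list of cues. The `Cue` type, the function name `edit_cue`, and the clamping rules (the start time never drops below zero and never passes the cue's end time) are assumptions added for illustration; the patent only states that the start display time can be advanced or pushed back and that the text can be modified.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Cue:
    start: float  # start display time in seconds
    end: float    # end display time in seconds
    text: str     # subtitle text content


def edit_cue(
    cues: List[Cue],
    index: int,
    shift: float = 0.0,
    new_text: Optional[str] = None,
) -> List[Cue]:
    """Return a new cue list with one cue's start time and/or text changed.

    shift < 0 moves the start display time earlier, shift > 0 later.
    """
    cue = cues[index]
    new_start = max(0.0, cue.start + shift)  # assumed: no negative times
    new_start = min(new_start, cue.end)      # assumed: start stays <= end
    updated = replace(
        cue,
        start=new_start,
        text=cue.text if new_text is None else new_text,
    )
    # Build a new list so the original cues remain available for undo.
    return cues[:index] + [updated] + cues[index + 1:]
```

For example, `edit_cue(cues, 0, shift=-0.25, new_text="hello class")` would show the first subtitle a quarter of a second earlier and replace its text, leaving every other cue untouched.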
Various corresponding changes and modifications can be made by those skilled in the art based on the above technical solutions and concepts, and all such changes and modifications should be included in the protection scope of the present invention.
Claims (4)
1. A system for adding subtitles to videos recorded in an intelligent classroom, characterized by comprising:
a video recording module: used for recording the live video of an intelligent classroom to obtain a recorded video file;
an audio extraction module: used for extracting the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
a voice recognition module: used for performing voice recognition on the recorded audio file to recognize the corresponding text content, recording the start time and the end time of each audio segment from which text content can be recognized, and establishing an association between each audio segment and its text content;
a subtitle adding module: used for displaying each piece of text content in the corresponding video segment of the recorded video file according to the start time and the end time of each audio segment, wherein the start display time and the end display time of each piece of text content correspond to the start time and the end time of the corresponding video segment, thereby obtaining a video file with subtitle content added;
an editing module: used for allowing the user to modify the video file with subtitle content added, including changing the start display time of the text content and modifying the text content.
2. The system according to claim 1, wherein the voice recognition module is configured to perform voice recognition on the recorded audio file in chronological order; to record the current time as the start time of an audio segment when text content is recognized for the first time; to record, when no text content has been recognized for longer than a preset duration starting from a certain moment, that moment as the end time of the audio segment; and to record the moment at which text content is next recognized as the start time of the next audio segment, and so on, thereby identifying every audio segment from which text content can be recognized and obtaining the start time and the end time of each audio segment.
3. A method using the system of any one of claims 1-2, characterized by the following specific process:
during the live broadcast of an intelligent classroom, the video recording module records the video synchronously; when the live broadcast ends, the recording stops and a recorded video file is obtained;
the audio extraction module extracts the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
the voice recognition module performs voice recognition on the recorded audio file to recognize the corresponding text content, records the start time and the end time of each audio segment from which text content can be recognized, and establishes an association between each audio segment and its text content;
when the user triggers a subtitle adding event, the subtitle adding module displays each piece of text content in the corresponding video segment of the recorded video file according to the start time and the end time of each audio segment, wherein the start display time and the end display time of each piece of text content correspond to the start time and the end time of the corresponding video segment, thereby obtaining a video file with subtitle content added;
when the user finds that the text content does not match the picture of the recorded video file, the start display time of the corresponding text content can be moved earlier or later through the editing module so that it matches the picture of the recorded video file; when the user finds an error in the text content, the text content can be corrected through the editing module.
4. The method according to claim 3, wherein the voice recognition module performs voice recognition on the recorded audio file in chronological order; records the current time as the start time of an audio segment when text content is recognized for the first time; records, when no text content has been recognized for longer than a preset duration starting from a certain moment, that moment as the end time of the audio segment; and records the moment at which text content is next recognized as the start time of the next audio segment, and so on, thereby identifying every audio segment from which text content can be recognized and obtaining the start time and the end time of each audio segment.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110477210.XA CN113207032A (en) | 2021-04-29 | 2021-04-29 | System and method for increasing subtitles by recording videos in intelligent classroom |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110477210.XA CN113207032A (en) | 2021-04-29 | 2021-04-29 | System and method for increasing subtitles by recording videos in intelligent classroom |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113207032A (en) | 2021-08-03 |
Family
ID=77029604
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110477210.XA Pending CN113207032A (en) | 2021-04-29 | 2021-04-29 | System and method for increasing subtitles by recording videos in intelligent classroom |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113207032A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113784158A (en) * | 2021-08-31 | 2021-12-10 | 珠海读书郎软件科技有限公司 | System and method for recording key points of pure English live broadcast lessons |
CN117496774A (en) * | 2023-11-08 | 2024-02-02 | 常州工业职业技术学院 | Interactive teaching method based on network informatization in colleges and universities |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105845129A (en) * | 2016-03-25 | 2016-08-10 | 乐视控股(北京)有限公司 | Method and system for dividing sentences in audio and automatic caption generation method and system for video files |
CN106506335A (en) * | 2016-11-10 | 2017-03-15 | 北京小米移动软件有限公司 | The method and device of sharing video frequency file |
CN106851401A (en) * | 2017-03-20 | 2017-06-13 | 惠州Tcl移动通信有限公司 | A kind of method and system of automatic addition captions |
CN108289244A (en) * | 2017-12-28 | 2018-07-17 | 努比亚技术有限公司 | Video caption processing method, mobile terminal and computer readable storage medium |
CN110335612A (en) * | 2019-07-11 | 2019-10-15 | 招商局金融科技有限公司 | Minutes generation method, device and storage medium based on speech recognition |
CN111986656A (en) * | 2020-08-31 | 2020-11-24 | 上海松鼠课堂人工智能科技有限公司 | Teaching video automatic caption processing method and system |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113784158A (en) * | 2021-08-31 | 2021-12-10 | 珠海读书郎软件科技有限公司 | System and method for recording key points of pure English live broadcast lessons |
CN113784158B (en) * | 2021-08-31 | 2022-06-17 | 珠海读书郎软件科技有限公司 | System and method for recording key points of pure English live broadcast lessons |
CN117496774A (en) * | 2023-11-08 | 2024-02-02 | 常州工业职业技术学院 | Interactive teaching method based on network informatization in colleges and universities |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109698920B (en) | Follow teaching system based on internet teaching platform | |
WO2018072390A1 (en) | Classroom teaching recording and requesting method and system | |
WO2019095447A1 (en) | Guided teaching method having remote assessment function | |
CN206039925U (en) | Interactive operating means of multimedia teaching | |
US20200286396A1 (en) | Following teaching system having voice evaluation function | |
US10354540B2 (en) | Method for generating a dedicated format file for a panorama mode teaching system | |
CN113301369B (en) | Interaction system and method for recorded and broadcast videos in intelligent classroom | |
CN113207033B (en) | System and method for processing invalid video clips recorded in intelligent classroom | |
CN113225575B (en) | A system and method for answering questions and interacting in smart classrooms | |
CN113207032A (en) | System and method for increasing subtitles by recording videos in intelligent classroom | |
CN105139706A (en) | Online education curriculum interaction method and system based on intelligent television | |
CN106952513A (en) | A system and method for immersion English learning in free time | |
CN111639233A (en) | Learning video subtitle adding method and device, terminal equipment and storage medium | |
CN105808733A (en) | Display method and apparatus | |
CN109688484A (en) | Teaching video learning method and system | |
CN113784158B (en) | System and method for recording key points of pure English live broadcast lessons | |
CN110807960A (en) | Internet-based auxiliary teaching system | |
CN115278356A (en) | Intelligent course video clip control method | |
CN112529748A (en) | Intelligent education platform based on time node mark feedback learning state | |
CN111050111A (en) | Online interactive learning communication platform and learning device thereof | |
CN202929870U (en) | Machine tool practical training data storage and display system | |
CN113391745A (en) | Method, device, equipment and storage medium for processing key contents of network courses | |
CN103581569A (en) | Method and system for recording electric power system teaching courseware | |
CN117596433A (en) | International Chinese teaching audiovisual courseware editing system based on time axis fine adjustment | |
CN113014949B (en) | Student privacy protection system and method for smart classroom course playback |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20210803 |