CN113207032A - System and method for increasing subtitles by recording videos in intelligent classroom - Google Patents

Info

Publication number
CN113207032A
Authority
CN
China
Prior art keywords
time
segment
audio
text content
recorded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110477210.XA
Other languages
Chinese (zh)
Inventor
秦曙光
陈家峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Readboy Education Technology Co Ltd
Original Assignee
Readboy Education Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-04-29
Filing date
2021-04-29
Publication date
2021-08-03
Application filed by Readboy Education Technology Co Ltd
Priority to CN202110477210.XA
Publication of CN113207032A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433: Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21: Server components or server architectures
    • H04N21/218: Source of audio or video content, e.g. local disk arrays
    • H04N21/2187: Live feed
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433: Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4334: Recording operations
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47: End-user applications
    • H04N21/488: Data services, e.g. news ticker
    • H04N21/4884: Data services, e.g. news ticker for displaying subtitles
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83: Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845: Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456: Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention discloses a system and a method for adding subtitles to a video recorded in an intelligent classroom, which automatically extract the audio from the recorded video, perform voice recognition on it, and align the resulting subtitles with the video. With this system and method, subtitle content can be added to a recorded video accurately and quickly, a large amount of manual review and translation work is avoided, subtitle generation efficiency is improved, and course quality is ensured.

Description

System and method for increasing subtitles by recording videos in intelligent classroom
Technical Field
The invention relates to the technical field of intelligent classrooms, and in particular to a system and a method for adding subtitles to videos recorded in an intelligent classroom.
Background
Currently, students in a live class follow the lesson by watching the slides (PPT) and the teacher's explanation. When they do not clearly hear what the teacher said, they usually replay the live broadcast and work out the content with the help of the slides, or ask classmates who are watching the same broadcast what the teacher has just explained. Both approaches cost students extra class time and do not let them grasp, in real time, the information the teacher intends to convey. Recorded lessons can be replayed selectively, but when the teacher's explanation is unclear in the recording, students' enthusiasm for learning is often dampened, which reduces their desire to attend lessons and weakens the effect of live teaching.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a system and a method for adding subtitles to videos recorded in an intelligent classroom.
To achieve this purpose, the invention adopts the following technical solution:
A system for adding subtitles to videos recorded in an intelligent classroom comprises:
a video recording module: used to record the live video of an intelligent classroom to obtain a recorded video file;
an audio extraction module: used to extract the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
a voice recognition module: used to perform voice recognition on the recorded audio file, recognize the corresponding text content, synchronously record the start time and end time of each audio segment from which text content can be recognized, and establish an association between each audio segment and its text content;
a subtitle adding module: used to display each piece of text content in the corresponding video segment of the recorded video file according to the start time and end time of each audio segment, where the start and end display times of each piece of text content correspond to the start and end times of the corresponding video segment, finally yielding a video file with subtitle content added;
an editing module: used by the user to modify the video file with subtitle content added, including changing the start display time of the text content and modifying the text content.
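As a rough illustration of the association between audio segments and text content, and of how a subtitle adding module might decide what to display at a given playback time, consider the minimal Python sketch below. The names `Cue` and `text_at` are hypothetical and do not come from the application; the sketch also assumes that segments are sorted by start time and do not overlap.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cue:
    """One recognized audio segment and its associated text (hypothetical name)."""
    start: float  # segment start time, in seconds from the start of the recording
    end: float    # segment end time, in seconds
    text: str     # text content recognized from this segment


def text_at(cues: List[Cue], t: float) -> Optional[str]:
    """Return the subtitle to show at playback time t, or None between segments.

    Assumption: cues are sorted by start time and do not overlap.
    """
    starts = [c.start for c in cues]
    i = bisect_right(starts, t) - 1  # index of the last cue starting at or before t
    if i >= 0 and t < cues[i].end:
        return cues[i].text
    return None


cues = [Cue(1.0, 3.5, "Open your books."), Cue(5.0, 8.0, "Look at slide two.")]
print(text_at(cues, 2.0))  # inside the first segment
print(text_at(cues, 4.0))  # in the gap between segments
```

Because each cue carries its own start and end time, the display interval of a subtitle is simply the interval of the audio segment it was recognized from, which is the correspondence described in the module list above.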
Further, in the system, the voice recognition module performs voice recognition on the recorded audio file in chronological order. When text content is first recognized, it records the current time as the start time of an audio segment. When, from some moment onward, no text content can be recognized for longer than a preset duration, it records that moment as the end time of the audio segment. When text content is next recognized, it records that time as the start time of the next audio segment, and so on, thereby identifying every audio segment from which text content can be recognized and obtaining the start time and end time of each audio segment.
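The segmentation rule can be sketched as follows. This is only one possible reading of the rule, not the applicant's implementation: it assumes the recognizer yields timestamped results (`None` when nothing is recognized), and it uses the timestamp of the last recognized result as the segment's end time. The names `segment`, `Frame`, and `max_gap` are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# (time in seconds, recognized text or None when nothing was recognized)
Frame = Tuple[float, Optional[str]]


def segment(frames: Iterable[Frame], max_gap: float) -> List[Tuple[float, float, str]]:
    """Group timestamped recognition results into (start, end, text) segments.

    A segment starts when text is recognized and is closed once no text has
    been recognized for more than max_gap seconds (the preset duration).
    """
    segments: List[Tuple[float, float, str]] = []
    start: Optional[float] = None
    last_heard = 0.0  # time of the most recent recognized text
    words: List[str] = []
    for t, text in frames:
        if not text:
            continue  # nothing recognized at this instant
        if start is not None and t - last_heard > max_gap:
            # The silence exceeded the preset duration: close the current segment.
            segments.append((start, last_heard, " ".join(words)))
            start, words = None, []
        if start is None:
            start = t  # text recognized (again): a new segment begins here
        words.append(text)
        last_heard = t
    if start is not None:
        segments.append((start, last_heard, " ".join(words)))
    return segments


frames = [(0.0, None), (1.0, "good"), (1.5, "morning"), (2.0, None),
          (2.5, None), (4.0, "open"), (4.5, "your"), (5.0, "books")]
print(segment(frames, max_gap=1.0))
```

A smaller `max_gap` splits speech into shorter subtitles, while a larger one merges sentences separated by brief pauses; the application leaves the preset duration unspecified.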
The invention also provides a method using the above system, with the following specific process:
during the live broadcast of an intelligent classroom, the video recording module records the video synchronously; when the live broadcast ends, recording stops and a recorded video file is obtained;
the audio extraction module extracts the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
the voice recognition module performs voice recognition on the recorded audio file, recognizes the corresponding text content, synchronously records the start time and end time of each audio segment from which text content can be recognized, and establishes an association between each audio segment and its text content;
when a user triggers a subtitle adding event, the subtitle adding module displays each piece of text content in the corresponding video segment of the recorded video file according to the start time and end time of each audio segment, where the start and end display times of each piece of text content correspond to the start and end times of the corresponding video segment, finally yielding a video file with subtitle content added;
when the user finds that the text content does not match the picture of the recorded video file, the editing module can be used to move the start display time of the corresponding text content earlier or later so that it matches the picture; when the user finds an error in the text content, the text content can be corrected through the editing module.
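The two editing operations in the last step (moving the start display time and correcting the text) might look like the sketch below. The `Subtitle` type and both function names are hypothetical, and the validity check (the new start must be non-negative and before the end time) is an added assumption that the application does not state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Subtitle:
    start: float  # start display time, in seconds
    end: float    # end display time, in seconds
    text: str


def shift_start(sub: Subtitle, delta: float) -> Subtitle:
    """Move the start display time earlier (delta < 0) or later (delta > 0)."""
    new_start = sub.start + delta
    # Assumed constraint: the subtitle must still start at or after 0 and before it ends.
    if not 0.0 <= new_start < sub.end:
        raise ValueError(f"start time {new_start} is outside [0, {sub.end})")
    return replace(sub, start=new_start)


def edit_text(sub: Subtitle, text: str) -> Subtitle:
    """Return a copy of the subtitle with corrected text content."""
    return replace(sub, text=text)


original = Subtitle(5.0, 8.0, "Look at slid two.")
fixed = edit_text(shift_start(original, -0.5), "Look at slide two.")
print(fixed)
```

Returning modified copies rather than mutating in place keeps the recognizer's original output available, which makes it straightforward to undo a user's edit.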
Further, in the method, the voice recognition module performs voice recognition on the recorded audio file in chronological order. When text content is first recognized, it records the current time as the start time of an audio segment. When, from some moment onward, no text content can be recognized for longer than a preset duration, it records that moment as the end time of the audio segment. When text content is next recognized, it records that time as the start time of the next audio segment, and so on, thereby identifying every audio segment from which text content can be recognized and obtaining the start time and end time of each audio segment.
The beneficial effects of the invention are as follows: with this system and method, subtitle content can be added to a recorded video accurately and quickly, a large amount of manual review and translation work is avoided, subtitle generation efficiency is improved, and course quality is ensured.
Detailed Description
The invention is further described below. Note that the embodiments are based on the above technical solution and give a detailed implementation and a specific operating process, but the scope of protection of the invention is not limited to these embodiments.
Example 1
This embodiment provides a system for adding subtitles to videos recorded in an intelligent classroom, comprising:
a video recording module: used to record the live video of an intelligent classroom to obtain a recorded video file;
an audio extraction module: used to extract the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
a voice recognition module: used to perform voice recognition on the recorded audio file, recognize the corresponding text content, synchronously record the start time and end time of each audio segment from which text content can be recognized, and establish an association between each audio segment and its text content;
a subtitle adding module: used to display each piece of text content in the corresponding video segment of the recorded video file according to the start time and end time of each audio segment, where the start and end display times of each piece of text content correspond to the start and end times of the corresponding video segment, finally yielding a video file with subtitle content added;
an editing module: used by the user to modify the video file with subtitle content added, including changing the start display time of the text content and modifying the text content.
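The application does not name a tool for the audio extraction module. As one possible realization, the sketch below builds an `ffmpeg` command line that drops the video stream and writes mono 16-bit PCM audio. The choice of ffmpeg, the 16 kHz sample rate, the file names, and the function name `audio_extract_command` are all assumptions made for illustration.

```python
from typing import List


def audio_extract_command(video_path: str, audio_path: str,
                          sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that extracts the audio track of a recorded video."""
    return [
        "ffmpeg",
        "-i", video_path,         # input: the recorded video file
        "-vn",                    # drop the video stream, keep audio only
        "-acodec", "pcm_s16le",   # encode as 16-bit little-endian PCM
        "-ar", str(sample_rate),  # audio sample rate in Hz (assumed value)
        "-ac", "1",               # downmix to a single (mono) channel
        audio_path,               # output: the recorded audio file
    ]


cmd = audio_extract_command("lesson.mp4", "lesson.wav")
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed; building the command separately from running it keeps the sketch testable without any media files.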
Further, the voice recognition module performs voice recognition on the recorded audio file in chronological order. When text content is first recognized, it records the current time as the start time of an audio segment. When, from some moment onward, no text content can be recognized for longer than a preset duration, it records that moment as the end time of the audio segment. When text content is next recognized, it records that time as the start time of the next audio segment, and so on, thereby identifying every audio segment from which text content can be recognized and obtaining the start time and end time of each audio segment.
Example 2
This embodiment provides a method using the system described in Example 1, with the following specific process:
during the live broadcast of an intelligent classroom, the video recording module records the video synchronously; when the live broadcast ends, recording stops and a recorded video file is obtained;
the audio extraction module extracts the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
the voice recognition module performs voice recognition on the recorded audio file, recognizes the corresponding text content, synchronously records the start time and end time of each audio segment from which text content can be recognized, and establishes an association between each audio segment and its text content;
when a user triggers a subtitle adding event, the subtitle adding module displays each piece of text content in the corresponding video segment of the recorded video file according to the start time and end time of each audio segment, where the start and end display times of each piece of text content correspond to the start and end times of the corresponding video segment, finally yielding a video file with subtitle content added;
when the user finds that the text content does not match the picture of the recorded video file, the editing module can be used to move the start display time of the corresponding text content earlier or later so that it matches the picture; when the user finds an error in the text content, the text content can be corrected through the editing module.
In the method, the voice recognition module performs voice recognition on the recorded audio file in chronological order. When text content is first recognized, it records the current time as the start time of an audio segment. When, from some moment onward, no text content can be recognized for longer than a preset duration, it records that moment as the end time of the audio segment. When text content is next recognized, it records that time as the start time of the next audio segment, and so on, thereby identifying every audio segment from which text content can be recognized and obtaining the start time and end time of each audio segment.
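The application does not specify how the subtitle content is stored with the video. One common option, assumed here purely for illustration, is to write the timed segments to a SubRip (`.srt`) sidecar file that a player displays alongside the recording. The function names `srt_timestamp` and `to_srt` are hypothetical.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) segments as the contents of an SRT file."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        # Each SRT entry: sequence number, time range, then the subtitle text.
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)  # entries are separated by a blank line


segments = [(1.0, 1.5, "good morning"), (4.0, 5.0, "open your books")]
print(to_srt(segments))
```

Keeping subtitles in a separate text file also suits the editing step, since changing a start display time or correcting a word only requires rewriting the affected entry rather than re-encoding the video.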
Various corresponding changes and modifications can be made by those skilled in the art based on the above technical solutions and concepts, and all such changes and modifications should be included in the protection scope of the present invention.

Claims (4)

1. A system for adding subtitles to videos recorded in an intelligent classroom, characterized by comprising:
a video recording module: used to record the live video of an intelligent classroom to obtain a recorded video file;
an audio extraction module: used to extract the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
a voice recognition module: used to perform voice recognition on the recorded audio file, recognize the corresponding text content, synchronously record the start time and end time of each audio segment from which text content can be recognized, and establish an association between each audio segment and its text content;
a subtitle adding module: used to display each piece of text content in the corresponding video segment of the recorded video file according to the start time and end time of each audio segment, where the start and end display times of each piece of text content correspond to the start and end times of the corresponding video segment, finally yielding a video file with subtitle content added;
an editing module: used by the user to modify the video file with subtitle content added, including changing the start display time of the text content and modifying the text content.
2. The system according to claim 1, characterized in that the voice recognition module is configured to perform voice recognition on the recorded audio file in chronological order; to record the current time as the start time of an audio segment when text content is first recognized; to record, when from some moment onward no text content can be recognized for longer than a preset duration, that moment as the end time of the audio segment; and to record the time at which text content is next recognized as the start time of the next audio segment, and so on, thereby identifying every audio segment from which text content can be recognized and obtaining the start time and end time of each audio segment.
3. A method using the system of any one of claims 1-2, characterized by the following specific process:
during the live broadcast of an intelligent classroom, the video recording module records the video synchronously; when the live broadcast ends, recording stops and a recorded video file is obtained;
the audio extraction module extracts the audio from the recorded video file produced by the video recording module to obtain a recorded audio file;
the voice recognition module performs voice recognition on the recorded audio file, recognizes the corresponding text content, synchronously records the start time and end time of each audio segment from which text content can be recognized, and establishes an association between each audio segment and its text content;
when a user triggers a subtitle adding event, the subtitle adding module displays each piece of text content in the corresponding video segment of the recorded video file according to the start time and end time of each audio segment, where the start and end display times of each piece of text content correspond to the start and end times of the corresponding video segment, finally yielding a video file with subtitle content added;
when the user finds that the text content does not match the picture of the recorded video file, the editing module can be used to move the start display time of the corresponding text content earlier or later so that it matches the picture; when the user finds an error in the text content, the text content can be corrected through the editing module.
4. The method according to claim 3, characterized in that the voice recognition module performs voice recognition on the recorded audio file in chronological order; records the current time as the start time of an audio segment when text content is first recognized; records, when from some moment onward no text content can be recognized for longer than a preset duration, that moment as the end time of the audio segment; and records the time at which text content is next recognized as the start time of the next audio segment, and so on, thereby identifying every audio segment from which text content can be recognized and obtaining the start time and end time of each audio segment.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110477210.XA CN113207032A (en) 2021-04-29 2021-04-29 System and method for increasing subtitles by recording videos in intelligent classroom

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110477210.XA CN113207032A (en) 2021-04-29 2021-04-29 System and method for increasing subtitles by recording videos in intelligent classroom

Publications (1)

Publication Number Publication Date
CN113207032A (en) 2021-08-03

Family

ID=77029604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110477210.XA Pending CN113207032A (en) 2021-04-29 2021-04-29 System and method for increasing subtitles by recording videos in intelligent classroom

Country Status (1)

Country Link
CN (1) CN113207032A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105845129A (en) * 2016-03-25 2016-08-10 乐视控股(北京)有限公司 Method and system for dividing sentences in audio and automatic caption generation method and system for video files
CN106506335A (en) * 2016-11-10 2017-03-15 北京小米移动软件有限公司 The method and device of sharing video frequency file
CN106851401A (en) * 2017-03-20 2017-06-13 惠州Tcl移动通信有限公司 A kind of method and system of automatic addition captions
CN108289244A (en) * 2017-12-28 2018-07-17 努比亚技术有限公司 Video caption processing method, mobile terminal and computer readable storage medium
CN110335612A (en) * 2019-07-11 2019-10-15 招商局金融科技有限公司 Minutes generation method, device and storage medium based on speech recognition
CN111986656A (en) * 2020-08-31 2020-11-24 上海松鼠课堂人工智能科技有限公司 Teaching video automatic caption processing method and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113784158A (en) * 2021-08-31 2021-12-10 珠海读书郎软件科技有限公司 System and method for recording key points of pure English live broadcast lessons
CN113784158B (en) * 2021-08-31 2022-06-17 珠海读书郎软件科技有限公司 System and method for recording key points of pure English live broadcast lessons
CN117496774A (en) * 2023-11-08 2024-02-02 常州工业职业技术学院 Interactive teaching method based on network informatization in colleges and universities

Similar Documents

Publication Publication Date Title
CN109698920B (en) Follow teaching system based on internet teaching platform
WO2018072390A1 (en) Classroom teaching recording and requesting method and system
WO2019095447A1 (en) Guided teaching method having remote assessment function
CN206039925U (en) Interactive operating means of multimedia teaching
US20200286396A1 (en) Following teaching system having voice evaluation function
US10354540B2 (en) Method for generating a dedicated format file for a panorama mode teaching system
CN113301369B (en) Interaction system and method for recorded and broadcast videos in intelligent classroom
CN113207033B (en) System and method for processing invalid video clips recorded in intelligent classroom
CN113225575B (en) A system and method for answering questions and interacting in smart classrooms
CN113207032A (en) System and method for increasing subtitles by recording videos in intelligent classroom
CN105139706A (en) Online education curriculum interaction method and system based on intelligent television
CN106952513A (en) A system and method for immersion English learning in free time
CN111639233A (en) Learning video subtitle adding method and device, terminal equipment and storage medium
CN105808733A (en) Display method and apparatus
CN109688484A (en) Teaching video learning method and system
CN113784158B (en) System and method for recording key points of pure English live broadcast lessons
CN110807960A (en) Internet-based auxiliary teaching system
CN115278356A (en) Intelligent course video clip control method
CN112529748A (en) Intelligent education platform based on time node mark feedback learning state
CN111050111A (en) Online interactive learning communication platform and learning device thereof
CN202929870U (en) Machine tool practical training data storage and display system
CN113391745A (en) Method, device, equipment and storage medium for processing key contents of network courses
CN103581569A (en) Method and system for recording electric power system teaching courseware
CN117596433A (en) International Chinese teaching audiovisual courseware editing system based on time axis fine adjustment
CN113014949B (en) Student privacy protection system and method for smart classroom course playback

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210803