CN113259763B - Teaching video processing method and device and electronic equipment - Google Patents


Info

Publication number
CN113259763B
CN113259763B
Authority
CN
China
Prior art keywords
video
knowledge
teaching
information
knowledge points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110482496.0A
Other languages
Chinese (zh)
Other versions
CN113259763A (en)
Inventor
崔寅生
刘洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zuoyebang Education Technology Beijing Co Ltd
Original Assignee
Zuoyebang Education Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zuoyebang Education Technology Beijing Co Ltd
Priority to CN202110482496.0A
Publication of CN113259763A
Application granted
Publication of CN113259763B
Current legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44016: Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/22: Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21: Server components or server architectures
    • H04N21/218: Source of audio or video content, e.g. local disk arrays
    • H04N21/2187: Live feed
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47: End-user applications
    • H04N21/472: End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217: End-user interface for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks

Abstract

The invention belongs to the technical field of online education and provides a teaching video processing method, a teaching video processing device, and electronic equipment. The method comprises: segmenting a teaching video to be processed into video segments according to knowledge points; judging whether the knowledge points related to each video segment are the same or associated; and splicing video segments with the same or associated knowledge points to form a teaching segment. The teaching video is segmented according to knowledge points identified from it, and associated video segments, such as a knowledge point explanation segment and a question explanation segment based on the same knowledge point, are spliced into a new teaching segment for display to the user. In this way the content of the whole teaching video can be effectively classified and summarized, the user can conveniently find the corresponding video and questions by knowledge point, interest in watching the teaching video is increased, and both teaching efficiency and the user's learning efficiency are improved.

Description

Teaching video processing method and device and electronic equipment
Technical Field
The invention belongs to the technical field of education, is particularly suitable for online education, and more particularly relates to a teaching video processing method and device, electronic equipment and a computer readable medium.
Background
With the development of modern information technology and the demands of the education market, online education, as an emerging mode of education, is being popularized continuously. Online teaching brings great convenience to teachers and students: through online education, a teacher can teach from home, and students can likewise attend classes and take examinations at home. Current online teaching is mainly carried out through apps: the teacher lectures live over the network and assigns homework or examinations, while students attend class or answer questions through an intelligent terminal, such as a smartphone, on which the corresponding teaching app is installed.
In existing network teaching, the teacher generally presents teaching content such as PPT slides to students on the app interface while lecturing. Teaching videos are often long, and each video contains many knowledge points and the corresponding questions explained by the teacher. When a teacher or student later reviews the recorded video and wants to revisit a particular knowledge point, they do not know where that point appears in the video; time is wasted searching, and teaching or learning efficiency drops. How to find the needed content from a teaching video quickly and accurately has therefore become a technical problem to be solved.
Disclosure of Invention
Technical problem to be solved
The invention aims to solve the technical problem of how to improve the teaching or learning efficiency by using the teaching video.
(II) technical scheme
In order to solve the above technical problem, an aspect of the present invention provides a teaching video processing method, including:
aggregating the multi-frame images of the teaching video according to the change of the characteristics of the adjacent frame images in the teaching video to form an image frame set; converting each section of the teaching video corresponding to each image frame set into video text section information; inputting the information of each video text segment into a trained knowledge point identification model, and outputting to obtain the name and the corresponding time period of each knowledge point related to the information of each video text segment;
segmenting a teaching video to be processed into video segments according to knowledge points;
judging whether the knowledge points related to the video segments are the same or have correlation;
splicing video segments with the same or associated knowledge points to form teaching segments;
the knowledge point identification model comprises a video type identification model and knowledge identification models corresponding to the video types, each knowledge identification model being trained for knowledge point identification on the video text segment information of its corresponding video type;
the method for inputting the information of each video text segment into the trained knowledge point identification model and outputting the names and the corresponding time periods of the knowledge points related to the information of each video text segment comprises the following steps:
inputting the video text segment information into the trained video type recognition model, and outputting the type of each video text segment information;
and inputting the video text segment information into the trained knowledge recognition model of the corresponding type according to the type of the video text segment information, and outputting to obtain the name and the corresponding time period of each knowledge point related to each video text segment information.
According to a preferred embodiment of the present invention, the converting each of the teaching videos corresponding to each of the image frame sets into video text segment information includes:
extracting voice information in the teaching video, and converting the voice information into first text information;
converting the image information in each image frame set into second text information;
and combining each piece of second text information with the first text information of the corresponding time period to form the video text segment information.
According to a preferred embodiment of the present invention, the video type includes a knowledge point explanation and a title explanation;
the knowledge identification model comprises a first knowledge identification model and a second knowledge identification model, wherein,
the first knowledge identification model is used for identifying knowledge points related to video text segment information explained by the knowledge points;
the second knowledge identification model is used for identifying knowledge points related to video text segment information of the title explanation.
According to a preferred embodiment of the present invention, the segmenting the teaching video to be processed into video segments according to knowledge points includes: and segmenting the teaching video to be processed into video segments according to the names of the knowledge points related to each piece of video text segment information and the corresponding time periods.
According to a preferred embodiment of the present invention, the determining whether the knowledge points related to the video segments are the same or related comprises: and judging whether the knowledge points related to all the video segments are the same or have correlation according to the names of the knowledge points.
According to a preferred embodiment of the present invention, the teaching segment includes one knowledge point explanation and a topic explanation associated with the knowledge point, and/or the teaching segment includes more than one associated knowledge point explanation and a topic explanation associated with the knowledge points.
According to a preferred embodiment of the present invention, the splicing of video segments with the same knowledge points or associated knowledge points to form an instructional fragment comprises: splicing the knowledge point video segments and the question video segments with the same knowledge points or associated knowledge points to form teaching segments;
optionally, the teaching segments are named according to the related knowledge points.
Optionally, the teaching segments are spliced into a second teaching video.
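As one way to realize the optional naming and splicing above, the sketch below groups identified video segments by knowledge point and names each teaching fragment after its knowledge point. The data shapes are illustrative assumptions; the actual video concatenation step is omitted.

```python
from collections import OrderedDict

def splice_teaching_fragments(segments):
    """Group video segments sharing a knowledge point into teaching fragments.

    segments: list of (knowledge_point, seg_type, start, end) tuples, where
    seg_type is "knowledge_point" (an explanation) or "topic" (a worked
    question). Each fragment is named after its knowledge point, as the
    patent suggests; splicing the underlying video files is not shown.
    """
    fragments = OrderedDict()  # preserves video time-axis order
    for kp, seg_type, start, end in segments:
        fragments.setdefault(kp, []).append((seg_type, start, end))
    return [{"name": kp, "parts": parts} for kp, parts in fragments.items()]
```

A fragment's `parts` list keeps explanation and topic segments together, so a fragment can later be rendered as one stand-alone teaching video.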
A second aspect of the present invention provides a teaching video processing apparatus, including:
the knowledge point identification module is used for carrying out aggregation processing on the multi-frame images of the teaching video according to the change of the characteristics of the adjacent frame images in the teaching video so as to form an image frame set; converting each section of the teaching video corresponding to each image frame set into video text section information; inputting the information of each video text segment into a trained knowledge point identification model, and outputting to obtain the name and the corresponding time period of each knowledge point related to the information of each video text segment;
the video segmentation module is used for segmenting the teaching video to be processed into video segments according to the knowledge points;
the judging module is used for judging whether the knowledge points related to the video segments are the same or have correlation;
the video editing module is used for splicing video segments with the same knowledge points or associated knowledge points to form a teaching segment;
the knowledge point identification models comprise a video type identification model and knowledge identification models corresponding to the video types, each knowledge identification model being trained for knowledge point identification on the video text segment information of its corresponding video type;
the method for inputting the information of each video text segment into the trained knowledge point identification model and outputting the names and the corresponding time periods of the knowledge points related to the information of each video text segment comprises the following steps:
inputting the video text segment information into the trained video type recognition model, and outputting the type of each video text segment information;
and inputting the video text segment information into the trained knowledge recognition model of the corresponding type according to the type of the video text segment information, and outputting to obtain the name and the corresponding time period of each knowledge point related to each video text segment information.
A third aspect of the present invention provides an electronic device comprising a processor and a memory, the memory storing a computer-executable program; when the program is executed by the processor, the processor performs any one of the teaching video processing methods described above.
The fourth aspect of the present invention further provides a computer-readable medium storing a computer-executable program, which when executed, implements the teaching video processing method according to any one of the above.
The fifth aspect of the present invention further provides a computer executable program, which when executed, implements the teaching video processing method according to any one of the above.
(III) advantageous effects
According to the method and the device, the teaching video is segmented according to the knowledge points identified from it, and associated video segments, such as a knowledge point explanation segment and a question explanation segment based on the same knowledge point, are spliced into a new teaching segment for display to the user. The content of the whole teaching video can thus be effectively classified and summarized, the user can conveniently find the corresponding video and questions by knowledge point, interest in watching the teaching video is increased, and both teaching efficiency and the user's learning efficiency are improved.
Drawings
Fig. 1 is a flowchart illustrating a teaching video processing method according to an embodiment of the present invention;
FIG. 2 is a flow chart of another teaching video processing method according to an embodiment of the present invention;
FIG. 3 is a flow chart of a knowledge point identification method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a knowledge point identification model according to an embodiment of the present invention;
FIG. 5 is a schematic view of a teaching video processing interface according to an embodiment of the present invention;
FIG. 6a is a schematic view of a teaching video processing interface according to another embodiment of the present invention;
FIG. 6b is a schematic diagram of a teaching video processing interface according to another embodiment of the present invention;
FIG. 7 is a schematic diagram of a teaching video processing apparatus according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
fig. 9 is a schematic diagram of a computer-readable recording medium provided by an embodiment of the present invention.
Detailed Description
In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, it is not excluded that a person skilled in the art may implement the invention in a specific case without the above-described structures, performances, effects or other features.
The flow chart in the drawings is only an exemplary flow demonstration, and does not represent that all the contents, operations and steps in the flow chart are necessarily included in the scheme of the invention, nor does it represent that the execution is necessarily performed in the order shown in the drawings. For example, some operations/steps in the flowcharts may be divided, some operations/steps may be combined or partially combined, and the like, and the execution order shown in the flowcharts may be changed according to actual situations without departing from the gist of the present invention.
The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different network and/or processing unit devices and/or microcontroller devices.
The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, and thus, a repetitive description thereof may be omitted hereinafter. It will also be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, or sections, these elements, components, or sections should not be limited by these terms; the terms are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or" is intended to include all combinations of any one or more of the listed items.
A teaching video is a video recorded during a live broadcast: using a client, the teacher presents pre-prepared PPT teaching content to users and explains each slide. Teaching videos are often long, and each contains many knowledge points and the corresponding questions explained by the teacher. When a teacher or student later wants to review a particular knowledge point, they do not know where it appears in the video; time is wasted searching, and teaching or learning efficiency drops. How to find the needed content from a teaching video quickly and accurately has become a technical problem to be solved.
In order to solve the above technical problem, the present invention provides a video processing method, including: segmenting a teaching video to be processed into video segments according to knowledge points; judging whether knowledge points related to each video segment are the same or have correlation; and splicing the video segments with the same knowledge points or associated knowledge points to form a teaching segment. Teaching fragments can be named according to related knowledge points, so that subsequent searching is facilitated.
Specifically, the recorded teaching video may be edited afterwards. The system identifies the images and audio in the video, aggregates the image frames by an image aggregation technique according to how the frames change, forming a number of consecutive image frame sets, and converts each image frame set into text information using OCR (Optical Character Recognition). Speech recognition converts the audio into text, and the text derived from each image frame set is combined, along the time axis, with the text derived from the audio of the corresponding period to form a number of consecutive pieces of video text segment information. Each piece of video text segment information is then fed, in video time-axis order, into the trained type recognition model to obtain the type of the image frame set it corresponds to, and then into the knowledge point recognition model for that type to obtain the knowledge points it involves; the teaching video is segmented into video segments accordingly. Finally, the system judges whether the knowledge points of the video segments are associated and splices the associated segments into teaching segments. Each teaching segment may comprise one knowledge point and several corresponding topics, or several associated knowledge points and their topics. Each teaching segment can stand alone as a teaching video, or several can be spliced into a new teaching video, so that users can readily find the videos and topics for a given knowledge point, improving teaching and learning efficiency.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the accompanying drawings in combination with the embodiments.
It should be noted that, although the description takes the processing and playback of online live-lesson videos as its example scenario, the application of the present invention is not limited to this.
Fig. 1 is a schematic flow chart of a teaching video processing method according to an embodiment of the present invention, which can implement automatic editing and recombining of teaching videos.
As shown in fig. 1, the method includes:
s101, segmenting the teaching video to be processed into video segments according to the knowledge points.
In some embodiments, fig. 2 is a schematic flowchart of another teaching video processing method according to an embodiment of the present invention, and as shown in fig. 2, before segmenting the teaching video to be processed into video segments according to knowledge points, the method further includes: and S100, identifying all knowledge points involved in the teaching video.
For ease of understanding, the following description is provided for how to identify knowledge points in a teaching video.
Fig. 3 is a schematic flow chart of a knowledge point identification method according to an embodiment of the present invention, and as shown in fig. 3, the knowledge point identification method includes:
s1001 first cuts the teaching video into frames, and cuts the teaching video into a plurality of images of consecutive frames, which may be, for example, one frame by one frame.
The video may also be segmented by sentence length in the speech, for example with one sentence corresponding to one video segment, or according to whether the frame image changes.
After a device such as a server acquires the recorded teaching video, the voice information and the video information in it can be extracted and processed separately; S1001 and S1002 operate on the pure video stream (containing no voice information).
S1002: aggregate the images according to changes in the features of adjacent frame images, judging which frames are the same or similar. Aggregation yields a number of consecutive image frame sets; the images within each set are highly similar, and each set corresponds to a certain period on the video time axis.
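As an illustration of the aggregation in S1002, the sketch below groups consecutive frames whose feature vectors remain similar. The cosine-similarity measure and the 0.9 threshold are assumptions for illustration; the patent does not specify a particular similarity metric.

```python
import math

def aggregate_frames(features, threshold=0.9):
    """Group consecutive frames into image frame sets.

    features: one feature vector per frame (e.g. a grayscale histogram).
    A new set starts whenever the cosine similarity between adjacent
    frames drops below `threshold`; each set is returned as a
    (start_index, end_index) pair over the frame sequence.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    if not features:
        return []
    sets, start = [], 0
    for i in range(1, len(features)):
        if cosine(features[i - 1], features[i]) < threshold:
            sets.append((start, i - 1))  # similarity break: close the current set
            start = i
    sets.append((start, len(features) - 1))
    return sets
```

Each returned index pair maps back to a time period on the video time axis via the frame rate, matching the per-set time periods the method relies on later.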
In some embodiments, a teaching video mostly presents teaching PPT slides; video frames showing the same slide content can therefore be aggregated into one image frame set.
S1003: identify the images in each image frame set with OCR technology and obtain the text information in the images (i.e., the second text information);
S1004: convert the voice information of the teaching video into text information (i.e., the first text information) using speech recognition technology;
S1005: since the text converted from the voice information also carries a time attribute on the video time axis, combine the text converted from the images of each image frame set with the text converted from the voice of the same period to obtain a number of consecutive pieces of video text segment information.
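The time-axis merging of S1005 can be sketched as follows. The tuple shapes and the midpoint rule for attaching a speech line to an image frame set are illustrative assumptions, not details fixed by the patent.

```python
def build_text_segments(frame_sets, speech_lines):
    """Merge OCR text with speech text on a shared time axis.

    frame_sets:   list of (start, end, ocr_text) for each image frame set
    speech_lines: list of (start, end, text) from speech recognition
    Returns one (start, end, combined_text) tuple per video text segment;
    a speech line is attached to the frame set whose time window contains
    the line's midpoint.
    """
    segments = []
    for fs_start, fs_end, ocr_text in frame_sets:
        spoken = [t for s, e, t in speech_lines
                  if fs_start <= (s + e) / 2 < fs_end]
        combined = ocr_text + " " + " ".join(spoken)
        segments.append((fs_start, fs_end, combined.strip()))
    return segments
```

The combined text per time window is exactly the "video text segment information" fed to the recognition models in S1006.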
Voice information and video image information from the same period are strongly correlated, and only in rare exceptional cases unrelated, so the knowledge points involved in a section of video can be judged from the two together. Moreover, identifying knowledge points from the text converted from both requires little computation and is easy to implement.
S1006: after the video text segment information is obtained, input each piece into the trained knowledge point recognition model in video time-axis order, and output the names and corresponding time periods of the knowledge points each piece involves.
Fig. 4 is an exemplary schematic diagram of a knowledge point identification model provided by an embodiment of the present invention, and as shown in fig. 4, the knowledge point identification model includes a video type identification model and a plurality of knowledge identification models corresponding to video types. Accordingly, step S1006 may include:
s10061, first, inputting the video text segment information into a type identification model, and outputting the type of each video text segment information, in the embodiment of the present invention, the type is preset, for example, the type may include knowledge point explanation, title explanation, and others, and the model determines whether the type to which the video text segment information belongs is a knowledge point or a title according to the information content of the video text segment.
Fig. 5 is a schematic view of a teaching video processing interface according to an embodiment of the present invention. As shown in fig. 5, after the above steps the teaching video is divided into seven video text segments, which are input into the type recognition model to obtain the type of each. Video segments of the same type may contain different content: the first video text segment may be a knowledge point explanation containing one or more knowledge points (when several knowledge points are included, the images in which they appear are highly similar and thus belong to the same image frame set), and a topic explanation segment may likewise contain the explanation of multiple topics. These cases must be distinguished by model identification.
For the type judgment between knowledge points and topics, the type recognition model can be trained in advance using the text of many knowledge points and of topics from a question bank as training samples, comparing the output classification with the actual type of the video to adjust the model parameters. In S10061 the trained type recognition model is then used for recognition, outputting the type and corresponding time period of each piece of video text segment information.
S10062: after the type of each video text segment is obtained, input each piece of video text segment information into the trained knowledge identification model for that type, which identifies the knowledge points the segment involves and outputs them (for example, their names) together with the corresponding time periods.
Because different types of video text segment information correspond to different feature vectors, different knowledge recognition models are used for different types to ensure recognition accuracy. For example, video text segment information of a knowledge point explanation is input into the trained first knowledge recognition model, which outputs the knowledge points it involves; video text segment information of a topic explanation is input into the trained second knowledge recognition model, which outputs the knowledge points it involves. Model training and algorithm optimization are thus more targeted, and the trained models recognize faster and more accurately.
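A minimal sketch of this two-stage dispatch (type recognition, then a type-specific knowledge recognition model) follows. The `TextSegment` shape and the `predict` interfaces are hypothetical stand-ins for the trained models described above, not APIs defined by the patent.

```python
from dataclasses import dataclass

@dataclass
class TextSegment:
    text: str    # combined OCR + speech text for one video section
    start: float # start time in seconds on the video time axis
    end: float   # end time in seconds

def identify_knowledge_points(segments, type_model, knowledge_models):
    """Dispatch each text segment to the knowledge model matching its type.

    type_model.predict(text) -> a type label such as "knowledge_point" or
    "topic"; knowledge_models maps each label to a model whose
    predict(text) returns a list of knowledge point names. Both models are
    assumed to be pre-trained, as in the patent.
    """
    results = []
    for seg in segments:
        seg_type = type_model.predict(seg.text)   # stage 1: video type
        model = knowledge_models.get(seg_type)
        if model is None:                          # "other" types carry no knowledge points
            continue
        for name in model.predict(seg.text):       # stage 2: knowledge points
            results.append((name, seg.start, seg.end))
    return results
```

The (name, start, end) triples returned here are what the segmentation step then uses to cut the teaching video into per-knowledge-point video segments.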
Before video text segment information can be input into a knowledge recognition model, the model must be built and trained (a step generally performed before the method is applied). The training input is video text segment information from many video samples carrying knowledge point labels, the output is the knowledge point label names, and the model parameters are adjusted continuously until the output label names match the actual names of the knowledge points.
The names of the knowledge points are preset. They can be summarized from the content of the knowledge points, or chapter names in unified teaching materials can be used directly as the names. Taking elementary mathematics as an example, the names of the knowledge points can be set to "numbers expressed by letters", "evaluating expressions containing letters", and so on. When a piece of video text segment information matches the characteristics of a knowledge point, the model outputs the name of that knowledge point; if one piece of video text segment information involves a plurality of knowledge points, the model outputs the names of all of them.
Of course, the types may also not be distinguished, and a single knowledge point identification model may be uniformly adopted for processing. In addition, other methods may be adopted to process the plurality of knowledge points involved in each piece of video text segment information; the present invention is not limited in this respect, and the above example is presented only as one embodiment of the method.
Finally, the teaching video to be processed is segmented into video segments according to the identified knowledge points and the corresponding time periods in the teaching video; the types of the video segments include knowledge point video segments and topic video segments, and each video segment relates to one knowledge point.
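The segmentation step can be sketched as follows. The data layout is an illustrative assumption: one segment descriptor is emitted per identified knowledge point, since the description states that each video segment relates to exactly one knowledge point.

```python
def segment_video(recognized):
    """recognized: list of {'points': [...], 'type': str, 'period': (start, end)}.

    Emits one video segment descriptor per knowledge point name, keyed to
    the time period in which that knowledge point was recognized.
    """
    segments = []
    for item in recognized:
        for point in item["points"]:
            segments.append({"knowledge_point": point,
                             "type": item["type"],
                             "start": item["period"][0],
                             "end": item["period"][1]})
    return segments
```

The actual cutting of the video file would use these (start, end) boundaries with whatever media toolchain is in use.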
S102, judging whether the knowledge points related to the video segments are the same or related.
In some embodiments, whether the knowledge points related to the video segments are the same or associated is judged according to the names of the knowledge points, and the label name of each knowledge point can be preset.
In the embodiment of the invention, the knowledge points are managed in a tree structure. A first-level knowledge point is an upper-level knowledge point comprising a plurality of subdivided second-level knowledge points, and each second-level knowledge point can in turn be divided into a plurality of third-level knowledge points. The knowledge points on the same branch under a knowledge point of a certain level can be set as associated knowledge points, or the last-level knowledge points on the same branch of the tree structure can be set as associated knowledge points; the association rules can be adjusted at any time according to the actual situation.
For example, a tree structure comprises three levels of knowledge points. The first-level knowledge point is "number learning and calculation", which comprises a plurality of second-level knowledge points, one of which is "formula and equation"; that second-level knowledge point further comprises two third-level knowledge points, "numbers expressed by letters" and "evaluating expressions containing letters", and these two third-level knowledge points are associated knowledge points. The knowledge point names output by the knowledge point identification model are all last-level knowledge points, which facilitates classification management.
The name of the knowledge point related to each video segment can be automatically matched against the preset tree structure, so that each knowledge point is mapped to the position with the same name in the tree. If the knowledge points related to the video segments fall at the same position or on the same branch of the tree structure, whether they are associated is judged according to the preset association rule.
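A minimal sketch of the tree-structure association check, assuming a three-level tree stored as nested dicts and an association rule that treats last-level knowledge points on the same branch as associated. The tree contents, names, and rule are illustrative, matching the example above rather than any particular deployed knowledge base.

```python
# Hypothetical three-level knowledge point tree.
TREE = {
    "number learning and calculation": {
        "formula and equation": ["numbers expressed by letters",
                                 "evaluating expressions containing letters"],
    },
}

def branch_of(leaf, tree=TREE):
    """Return the (level-1, level-2) branch holding a last-level knowledge point."""
    for level1, branches in tree.items():
        for level2, leaves in branches.items():
            if leaf in leaves:
                return (level1, level2)
    return None  # not found in the tree

def are_associated(point_a, point_b):
    """Identical names, or last-level points on the same branch, are associated."""
    if point_a == point_b:
        return True
    a, b = branch_of(point_a), branch_of(point_b)
    return a is not None and a == b
```

The association rule lives entirely in `are_associated`, so it can be adjusted at any time, as the description requires, without touching the tree data.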
And S103, splicing the video segments with the same knowledge points or associated knowledge points to form a teaching segment.
Specifically, the teaching segment comprises at least one associated knowledge point explanation and a topic associated with that knowledge point; a knowledge point video segment and a topic video segment whose knowledge points are the same or associated are spliced to form the teaching segment. For example, in the teaching video shown in fig. 5, if it is determined that the knowledge points related to the first two knowledge point video segments and the knowledge point related to the last topic video segment are all associated, the corresponding video segments are spliced.
The teaching segments can be named after the related knowledge points, and a plurality of teaching segments can be spliced again to form a new teaching video, which is thus edited and recombined according to the associated knowledge points.
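The grouping-and-splicing step can be sketched as follows; this is a hedged illustration of one possible realization, with the association predicate passed in so any rule can be used. The field names are assumptions carried over from the earlier sketches.

```python
def form_teaching_segments(segments, are_associated):
    """Group video segments whose knowledge points are the same or associated,
    preserving the original time order inside each teaching segment.

    segments: list of {'knowledge_point': str, 'start': num, 'end': num}.
    are_associated: predicate taking two knowledge point names.
    """
    groups = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        for group in groups:
            if are_associated(group[0]["knowledge_point"], seg["knowledge_point"]):
                group.append(seg)
                break
        else:
            groups.append([seg])  # no associated group yet: start a new one
    # name each teaching segment after the knowledge point of its first segment
    return [{"name": g[0]["knowledge_point"], "segments": g} for g in groups]
```

Sorting by start time before grouping keeps each teaching segment's video segments in the order of the original time axis, as described for the splicing in fig. 6b.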
Fig. 6a and fig. 6b are schematic diagrams of teaching video processing interfaces provided by two embodiments of the present invention. As shown in fig. 6a, after the knowledge points related to each video segment of the teaching video are identified, it is detected that a certain knowledge point video segment and a topic 1 video segment both relate to knowledge point 1, so the topic 1 video segment and the knowledge point video segment containing the same knowledge point 1 are spliced to form a new teaching video, and the name of knowledge point 1 is used as the label of the video. When the user plays the video, the knowledge point explained by the teacher is played first, followed by the topic explanation associated with that knowledge point, so that the watching user can more easily understand the previously explained knowledge point. This improves the user's interest in watching the teaching video, as well as the teaching efficiency and the user's learning efficiency.
Preferably, a prominent prompt point is set on the progress bar of a new teaching video formed by recombination according to knowledge points, so that the user can distinguish the knowledge points related to the video and can further be prompted whether a section is a knowledge point explanation or a topic explanation.
For the knowledge point prompt, a prompt point can be set directly at the start of the video segment corresponding to the knowledge point, and the name of the knowledge point, or its abbreviation, can be displayed as the prompt. When the user clicks the knowledge point or the prompt point, the video is played from the starting point of that knowledge point.
A teaching video may comprise a plurality of video segments related to the same knowledge point. As shown in fig. 6b, the spliced teaching video relates to a knowledge point 2a and a knowledge point 2b, and also comprises two topics 2a and 2b associated with knowledge point 2; it is judged that knowledge points 2a and 2b and the knowledge points related to topics 2a and 2b are all associated. When the teaching video is spliced, knowledge point 2a, knowledge point 2b, topic 2a and topic 2b are spliced together in sequence according to the time axis of the original teaching video to form the new teaching video. Meanwhile, a prompt point 1, a prompt point 2 and a prompt point 3 are respectively arranged at the boundaries of knowledge point 2a, knowledge point 2b, topic 2a and topic 2b in the video, so that the user can keep track of the viewing progress and quickly find the content he wants to view.
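Placing prompt points at the segment boundaries of the recombined video amounts to accumulating segment durations along the new timeline. A minimal sketch, under the assumption (not stated in the description) that segments are represented by their original start/end times:

```python
def prompt_points(spliced_segments):
    """Place a prompt point at the start boundary of each segment on the
    new teaching video's timeline (offsets are cumulative durations).

    spliced_segments: ordered list of {'knowledge_point', 'start', 'end'}.
    """
    points, offset = [], 0.0
    for seg in spliced_segments:
        points.append({"time": offset, "label": seg["knowledge_point"]})
        offset += seg["end"] - seg["start"]  # advance by this segment's duration
    return points
```

A player UI would render each returned point on the progress bar and seek to `time` when the user clicks it.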
Preferably, a plurality of new teaching videos formed by splicing within a set period can be classified according to knowledge points: the new teaching videos whose knowledge points belong to the same tree structure are classified into the same class, their common upper-level knowledge point is stored in a database as a tag, and the new teaching videos are sorted from high to low according to user click rate.
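This classify-then-rank step can be sketched as below. The record fields (`tag`, `clicks`) are illustrative assumptions standing in for the database schema.

```python
def classify_and_rank(videos):
    """Group new teaching videos by their upper-level knowledge point tag,
    then sort each group by user click rate, from high to low.

    videos: list of {'tag': str, 'clicks': int, ...} records.
    """
    by_tag = {}
    for video in videos:
        by_tag.setdefault(video["tag"], []).append(video)
    for tag in by_tag:
        by_tag[tag].sort(key=lambda v: v["clicks"], reverse=True)
    return by_tag
```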
When a user searches for videos and the keywords contain the name of a knowledge point, the introductions and links of all new teaching videos of that knowledge point are displayed to the user in the preset order; the user can then select a suitable video to watch, and when the user selects a certain video, the client retrieves the video from the server and plays it.
The server can also identify, according to the teaching materials used for education, the textbook to which each knowledge point belongs; for example, the knowledge point "multiplying by 100" appears in the second-grade primary school mathematics textbook. After the textbook is identified, other knowledge points belonging to the same textbook are screened from the knowledge point library, and the teaching videos corresponding to these knowledge points are classified with the textbook name as a label.
When a user searches for a certain teaching video, the server can, while showing that teaching video, also recommend to the user the teaching videos of other knowledge points belonging to the same textbook, so that the user can conveniently continue to learn, improving learning efficiency and user experience.
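The same-textbook recommendation reduces to a filter over the textbook label assigned above. A minimal sketch, with the `textbook` field again an assumed schema detail:

```python
def recommend_same_textbook(videos, current):
    """Recommend teaching videos whose knowledge points belong to the same
    textbook as the video the user is currently watching."""
    return [v for v in videos
            if v["textbook"] == current["textbook"] and v is not current]
```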
Those skilled in the art will appreciate that all or part of the steps of the above-described embodiments are implemented as computer programs executed by a computer data processing apparatus. When the computer program is executed, the method provided by the invention can be realized. Furthermore, the computer program may be stored in a computer readable storage medium, which may be a readable storage medium such as a magnetic disk, an optical disk, a ROM or a RAM, or a storage array composed of a plurality of storage media, such as a magnetic disk or magnetic tape storage array. The storage medium is not limited to centralized storage and may also be distributed storage, such as cloud storage based on cloud computing.
Embodiments of the apparatus of the present invention are described below; they may be used to perform the method embodiments of the present invention. The details described in the apparatus embodiments should be regarded as complementary to the above-described method embodiments; for details not disclosed in the apparatus embodiments, reference is made to the method embodiments above.
Fig. 7 is an automatic instructional video editing apparatus according to an embodiment of the present invention, as shown in fig. 7, the apparatus 200 includes:
the video segmentation module 201 is used for segmenting the teaching video to be processed into video segments according to the knowledge points;
the judging module 202 is configured to judge whether knowledge points related to video segments are the same or associated;
and the video clipping module 203 is used for splicing the video segments with the same knowledge points or the associated knowledge points to form a teaching segment.
Wherein the apparatus 200 further comprises a knowledge point identification module for identifying knowledge points involved in the teaching video. The knowledge point identification module is also used for carrying out aggregation processing on the multi-frame images of the teaching video according to the change of the characteristics of the adjacent frame images in the teaching video so as to form an image frame set; converting the teaching video corresponding to each image frame set into video text segment information; and inputting the information of each video text segment into the trained recognition model, and outputting to obtain the name and the corresponding time period of each knowledge point related to each piece of video text segment information. The knowledge point identification module is also used for extracting voice information in the teaching video and converting the voice information into first text information; converting the image information in each image frame set into second text information; and combining each second text message with the first text message of the corresponding time section to form video text section information. The knowledge point identification module is also used for inputting the video text segment information into a trained type identification model and outputting the type of each video text segment information; and inputting the video text segment information into the trained knowledge point identification model according to the type, and outputting to obtain the name and the corresponding time period of each knowledge point related to each video text segment information. 
The knowledge point identification model comprises a first type identification model and a second type identification model, wherein the first type identification model is used for identifying knowledge points related to video text segment information explained by the knowledge points; the second type identification model is used for identifying knowledge points related to video text segment information of the title explanation.
The video segmentation module 201 is further configured to segment the teaching video to be processed into video segments according to the names and corresponding time periods of the knowledge points related to the information of each video text segment obtained through the output.
The judging module 202 is further configured to determine, according to the names of the knowledge points, whether the knowledge points related to the video segments are the same or associated.
The video clipping module 203 is further configured to splice the knowledge point video segments and the topic video segments having the same knowledge points or having associations to form a teaching segment; optionally, naming the teaching segments according to the related knowledge points; optionally, the teaching segments are spliced into a second teaching video. The teaching segment includes a knowledge point explanation and topics associated with the knowledge point, and/or the teaching segment includes associated more than one knowledge point explanation and topics associated with the knowledge points.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, where the electronic device includes a processor and a memory, where the memory is used to store a computer-executable program, and when the computer program is executed by the processor, the processor executes a teaching video automatic clipping method.
As shown in fig. 8, the electronic device is in the form of a general purpose computing device. The number of processors can be one or more, and the processors can work together. The invention also does not exclude distributed processing, i.e. the processors may be distributed over different physical devices. The electronic device of the present invention is not limited to a single entity and may also be the sum of a plurality of entity devices.
The memory stores a computer executable program, typically machine readable code. The computer readable program may be executed by the processor to enable an electronic device to perform the method of the invention, or at least some of the steps of the method.
The memory may include volatile memory, such as Random Access Memory (RAM) and/or cache memory units, as well as non-volatile memory, such as read-only memory (ROM).
Optionally, in this embodiment, the electronic device further includes an I/O interface, which is used for data exchange between the electronic device and an external device. The I/O interface may be a local bus representing one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, and/or a memory storage device using any of a variety of bus architectures.
It should be understood that the electronic device shown in fig. 8 is only one example of the present invention, and elements or components not shown in the above examples may also be included in the electronic device of the present invention. For example, some electronic devices further include a display unit such as a display screen, and some electronic devices further include a human-computer interaction element such as a button, a keyboard, and the like. Electronic devices are considered to be covered by the present invention as long as the electronic devices are capable of executing a computer-readable program in a memory to implement the method of the present invention or at least a part of the steps of the method.
Fig. 9 is a schematic diagram of a computer-readable recording medium provided by an embodiment of the present invention. As shown in fig. 9, the computer-readable recording medium has stored therein a computer-executable program which, when executed, implements the above-described teaching video automatic clipping method of the present invention. The computer readable storage medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
The computer readable medium carries one or more programs which, when executed by a device, cause the computer readable medium to perform the functions of: segmenting a teaching video to be processed into video segments according to knowledge points; judging whether knowledge points related to each video segment are the same or have correlation; and splicing video segments with the same knowledge points or associated knowledge points to form teaching segments.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
From the above description of the embodiments, those skilled in the art will readily appreciate that the present invention can be implemented by hardware capable of executing a specific computer program, such as the system of the present invention and the electronic processing units, servers, clients, mobile phones, control units, processors, etc. included in the system. The invention may also be implemented by computer software executing the method of the invention. It should be noted, however, that the computer software executing the method of the present invention is not limited to execution by one specific hardware entity; it may also be implemented in a distributed manner by unspecified hardware entities. For example, some method steps executed by the computer program may be executed by a mobile client, and another part by a smart meter, a smart pen, or the like. For computer software, the software product may be stored in a computer readable storage medium (which may be a CD-ROM, a USB disk, a removable hard disk, etc.) or distributed over a network, as long as it enables the electronic device to perform the method according to the present invention.
In summary, the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functionality of some or all of the components in embodiments consistent with the present invention may be implemented in practice using a general purpose data processing device such as a microprocessor or a Digital Signal Processor (DSP). The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
While the foregoing embodiments have described the objects, aspects and advantages of the present invention in further detail, it should be understood that the present invention is not inherently related to any particular computer, virtual machine or electronic device, and various general-purpose machines may be used to implement the present invention. The invention is not limited to the specific embodiments described above; all modifications, changes and equivalents that come within the spirit and scope of the invention are intended to be embraced.

Claims (7)

1. A method for teaching video processing, comprising:
aggregating the multi-frame images of the teaching video according to the change of the characteristics of the adjacent frame images in the teaching video to form an image frame set; converting each section of the teaching video corresponding to each image frame set into video text section information; inputting the information of each video text segment into a trained knowledge point identification model, and outputting to obtain the name and the corresponding time period of each knowledge point related to the information of each video text segment;
segmenting a teaching video to be processed into video segments according to knowledge points;
judging whether the knowledge points related to the video segments are the same or have correlation;
splicing video segments with the same or associated knowledge points to form teaching segments;
the knowledge point identification model comprises a video type identification model and knowledge identification models corresponding to the video types, and each knowledge identification model carries out knowledge point identification training aiming at video text segment information corresponding to the video type;
the method for inputting the information of each video text segment into the trained knowledge point identification model and outputting the names and the corresponding time periods of the knowledge points related to the information of each video text segment comprises the following steps:
inputting the video text segment information into the trained video type recognition model, and outputting the type of each video text segment information;
and inputting the video text segment information into the trained knowledge recognition model of the corresponding type according to the type of the video text segment information, and outputting to obtain the name and the corresponding time period of each knowledge point related to each video text segment information.
2. The instructional video processing method of claim 1, wherein said converting each of the instructional videos corresponding to each of the image frame sets into video text segment information comprises:
extracting voice information in the teaching video, and converting the voice information into first text information;
converting the image information in each image frame set into second text information;
and combining each second text message with the first text message of the corresponding time section to form the video text section message.
3. The instructional video processing method of claim 1 wherein the video types include knowledge point explanation and topic explanation;
the knowledge identification model comprises a first knowledge identification model and a second knowledge identification model, wherein,
the first knowledge identification model is used for identifying knowledge points related to video text segment information explained by the knowledge points;
the second knowledge identification model is used for identifying knowledge points related to video text segment information of the title explanation.
4. The instructional video processing method of any one of claims 1 to 3, wherein said segmenting the instructional video to be processed into video segments by knowledge points comprises:
and segmenting the teaching video to be processed into video segments according to the names of the knowledge points related to each piece of video text segment information and the corresponding time periods.
5. The instructional video processing method according to any one of claims 1 to 3, wherein the determining whether the knowledge points related to the video segments are the same or related comprises:
and judging whether the knowledge points related to all the video segments are the same or have correlation according to the names of the knowledge points.
6. The instructional video processing method of any one of claims 1 to 3, wherein the instructional fragment includes a knowledge point explanation and a topic explanation associated with the knowledge point, and/or,
the teaching segment comprises more than one associated knowledge point explanation and a topic explanation associated with the knowledge points;
optionally, the splicing the video segments with the same knowledge points or associated knowledge points to form a teaching segment includes: splicing the knowledge point video segments and the question video segments with the same knowledge points or associated knowledge points to form teaching segments;
optionally, naming the teaching segments according to the related knowledge points;
optionally, the teaching segments are spliced into a second teaching video.
7. An instructional video processing apparatus, comprising:
the knowledge point identification module is used for carrying out aggregation processing on the multi-frame images of the teaching video according to the change of the characteristics of the adjacent frame images in the teaching video so as to form an image frame set; converting each section of the teaching video corresponding to each image frame set into video text section information; inputting the information of each video text segment into a trained knowledge point identification model, and outputting to obtain the name and the corresponding time period of each knowledge point related to the information of each video text segment;
the video segmentation module is used for segmenting the teaching video to be processed into video segments according to the knowledge points;
the judging module is used for judging whether the knowledge points related to the video segments are the same or have correlation;
the video editing module is used for splicing video segments with the same knowledge points or associated knowledge points to form a teaching segment;
the knowledge point identification model comprises a video type identification model and knowledge identification models corresponding to the video types, and each knowledge identification model carries out knowledge point identification training aiming at video text segment information corresponding to the video type;
the method for inputting the information of each video text segment into the trained knowledge point identification model and outputting the names and the corresponding time periods of the knowledge points related to the information of each video text segment comprises the following steps:
inputting the video text segment information into the trained video type recognition model, and outputting the type of each video text segment information;
and inputting the video text segment information into the trained knowledge recognition model of the corresponding type according to the type of the video text segment information, and outputting to obtain the name and the corresponding time period of each knowledge point related to each video text segment information.
CN202110482496.0A 2021-04-30 2021-04-30 Teaching video processing method and device and electronic equipment Active CN113259763B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110482496.0A CN113259763B (en) 2021-04-30 2021-04-30 Teaching video processing method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113259763A CN113259763A (en) 2021-08-13
CN113259763B true CN113259763B (en) 2023-04-07

Family

ID=77223492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110482496.0A Active CN113259763B (en) 2021-04-30 2021-04-30 Teaching video processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113259763B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115695932B (en) * 2022-12-30 2023-03-17 湖南希赛网络科技有限公司 Multimedia teaching management system based on online education
CN116541559A (en) * 2023-05-11 2023-08-04 智慧校园(广东)教育科技有限公司 System and method for answering interaction for intelligent class

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109698920A (en) * 2017-10-20 2019-04-30 深圳市鹰硕技术有限公司 It is a kind of that tutoring system is followed based on internet teaching platform
CN110035330A (en) * 2019-04-16 2019-07-19 威比网络科技(上海)有限公司 Video generation method, system, equipment and storage medium based on online education
CN110569364A (en) * 2019-08-21 2019-12-13 北京大米科技有限公司 online teaching method, device, server and storage medium
CN110602546A (en) * 2019-09-06 2019-12-20 Oppo广东移动通信有限公司 Video generation method, terminal and computer-readable storage medium
CN111429768A (en) * 2020-03-17 2020-07-17 安徽爱学堂教育科技有限公司 Knowledge point splitting and integrating method and system based on teaching recording and broadcasting
CN111739358A (en) * 2020-06-19 2020-10-02 联想(北京)有限公司 Teaching file output method and device and electronic equipment

Also Published As

Publication number Publication date
CN113259763A (en) 2021-08-13

Similar Documents

Publication Publication Date Title
CN112507140B (en) Personalized intelligent learning recommendation method, device, equipment and storage medium
CN110134931B (en) Medium title generation method, medium title generation device, electronic equipment and readable medium
US20170193393A1 (en) Automated Knowledge Graph Creation
CN108959531B (en) Information searching method, device, equipment and storage medium
CN113259763B (en) Teaching video processing method and device and electronic equipment
CN110569364A (en) Online teaching method, device, server and storage medium
US10089898B2 (en) Information processing device, control method therefor, and computer program
CN114339285B (en) Knowledge point processing method, video processing method, device and electronic equipment
CN108460122B (en) Video searching method, storage medium, device and system based on deep learning
CN111935529B (en) Education audio and video resource playing method, equipment and storage medium
Chang et al. Yet another adaptive learning management system based on Felder and Silverman’s learning styles and Mashup
CN108121715A (en) Word tagging method and word tagging device
CN112287168A (en) Method and apparatus for generating video
US20240061899A1 (en) Conference information query method and apparatus, storage medium, terminal device, and server
CN111507680A (en) Online interviewing method, system, equipment and storage medium
CN110598095A (en) Method, device and storage medium for identifying article containing designated information
CN113360598A (en) Matching method and device based on artificial intelligence, electronic equipment and storage medium
CN111739358A (en) Teaching file output method and device and electronic equipment
South et al. DebateVis: Visualizing political debates for non-expert users
CN111723235B (en) Music content identification method, device and equipment
CN113779345B (en) Teaching material generation method and device, computer equipment and storage medium
US20150213726A1 (en) System and methods for automatic composition of tutorial video streams
US11854430B2 (en) Learning platform with live broadcast events
CN114173191B (en) Multi-language answering method and system based on artificial intelligence
CN115757720A (en) Project information searching method, device, equipment and medium based on knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant