Disclosure of Invention
In view of the defects of the prior art, the present invention aims to provide an analysis method and system for online course video resource composition, so as to solve the problems of identifying and evaluating the teaching design method embodied in a teaching video and of detecting and retrieving teaching materials.
In order to achieve the above object, in a first aspect, the present invention provides an analysis method for online course video resource composition, comprising the following steps:
respectively determining video stream information and audio stream information from an online course video to be analyzed;
processing the video stream to extract each video shot, and segmenting the audio stream information to obtain each audio clip;
identifying a video content style of each video shot; the video content style comprises at least one of the following: slide lecture, lecturer, and others (neither slide lecture nor lecturer);
identifying whether the online course video contains case analysis by combining the images and text in the video shots with the speech text corresponding to the audio clips; if a case-analysis keyword appears in the image, the text, or the speech text, the online course video is considered to contain case analysis;
and taking the video content style of each video shot, together with the result of whether the online course video contains case analysis, as the composition analysis result of the online course video resource to be analyzed.
In an optional embodiment, the method further comprises the steps of:
if the video shot comprises a slide lecture, the targets to be detected in the video shot further comprise the presentation type of the slide lecture; the presentation type includes at least one of the following: plain text, plain image, image-text, and animation; the presentation type of the slide lecture also belongs to the video content style.
In an alternative embodiment, the case-analysis keywords are: case, instance, say, for example, such as, example, or trial proof.
In an optional embodiment, identifying the video content style of a video shot specifically includes the following steps:
comparing the image change areas of all video shots, detecting whether the change areas contain text, and merging the peripheral contours of all the image change areas; if the merged peripheral contour is rectangular and contains text information, it is determined that a slide lecture exists in the video, and the slide lecture area range is delimited;
analyzing, by means of image analysis, whether a lecturer exists in the video shot;
combining the slide lecture analysis result with the lecturer analysis result, the video shots are divided into pure slide lecture, pure lecturer, slide lecture mixed with lecturer, or others (neither slide lecture nor lecturer).
In an alternative embodiment, the presentation type of the slide lecture is specifically obtained by analyzing the following steps:
determining whether graphics and text exist within the slide lecture area delimited in the video shot; if only graphics exist, the presentation type is plain image; if only text exists, the presentation type is plain text; if both graphics and text exist, the presentation type is image-text;
and determining whether the image frames within the same video shot have an inclusion relationship; if so, the presentation type is determined to include animation.
In a second aspect, the present invention provides an analysis system for online course video resource composition, including:
the video analysis unit is used for respectively determining video stream information and audio stream information from the online course video to be analyzed;
a shot extraction unit, configured to process the video stream to extract each video shot, and segment the audio stream information to obtain each audio segment;
the shot recognition unit is used for recognizing the video content style of each video shot; the video content style comprises at least one of the following: slide lecture, lecturer, and others (neither slide lecture nor lecturer);
the case analysis unit is used for identifying whether the online course video contains case analysis by combining the images and text in the video shots with the speech text corresponding to the audio clips; if a case-analysis keyword appears in the image, the text, or the speech text, the online course video is considered to contain case analysis;
and the element analysis unit is used for taking the video content style of each video shot, together with the result of whether the online course video contains case analysis, as the composition analysis result of the online course video resource to be analyzed.
In an optional embodiment, if the video shot includes a slide lecture, the targets to be detected in the video shot by the shot recognition unit further include the presentation type of the slide lecture; the presentation type includes at least one of the following: plain text, plain image, image-text, and animation; the presentation type of the slide lecture also belongs to the video content style.
In an alternative embodiment, the case-analysis keywords are: case, instance, say, for example, such as, example, or trial proof.
In an optional embodiment, the shot recognition unit compares the image change areas of all video shots, detects whether the change areas contain text, and merges the peripheral contours of all the image change areas; if the merged peripheral contour is rectangular and contains text information, it determines that a slide lecture exists in the video and delimits the slide lecture area range. It analyzes, by means of image analysis, whether a lecturer exists in the video shot; and, combining the slide lecture analysis result with the lecturer analysis result, it divides the video shots into pure slide lecture, pure lecturer, slide lecture mixed with lecturer, or others (neither slide lecture nor lecturer).
In an alternative embodiment, the presentation type of the slide lecture is specifically obtained by analyzing the following steps:
determining whether graphics and text exist within the slide lecture area delimited in the video shot; if only graphics exist, the presentation type is plain image; if only text exists, the presentation type is plain text; if both graphics and text exist, the presentation type is image-text;
and determining whether the image frames within the same video shot have an inclusion relationship; if so, the presentation type is determined to include animation.
Generally, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
the invention provides an analysis method and system for online course video resource composition, which can identify the design elements of a teaching video from multiple dimensions such as images and speech. This is helpful for understanding the teaching design intention of teachers and for decomposing the knowledge elements of teaching resources, and has great reference value and practical significance for the analysis and evaluation of teaching design, the quick retrieval of teaching resources, and the like. An online course video producer can also use the invention to evaluate a produced online course video autonomously, identify its deficiencies, and revise it. An online course platform can automatically complete the design element calibration of video resources through the invention and label them on the platform, so that learners can select the resources that suit them.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention aims to meet the requirements of the teaching analysis and design field by identifying the resource elements of an online course video and analyzing the organization time sequence of each resource element, so as to obtain the composition analysis result of the video resource. The video resource elements include teaching slides, demonstration videos, animation materials, and the like; the organization characteristics of the resource elements include aspects such as whether the teacher appears on camera and whether case explanations are inserted into the teaching.
The method for analyzing course video resource composition identifies the design elements of online course resources so as to analyze and recognize them from the perspective of video-based teaching. Specifically, the images in the online course video can be analyzed to identify whether the slide lecture combines pictures and text, whether animation exists, and whether the teacher appears on camera; combined with analysis of the speech content, whether the video contains case explanations is identified, and the presentation of the video is then evaluated.
FIG. 1 is a flow chart of a method for identifying online course video resource design elements according to an embodiment of the present invention; as shown in fig. 1, the method comprises the following steps:
s110, respectively determining video stream information and audio stream information from an online course video to be analyzed;
s120, processing the video stream to extract each video shot, and segmenting the audio stream information to obtain each audio clip;
s130, identifying the video content style of each video shot; the video content style comprises at least one of the following: slide lecture, lecturer, and others (neither slide lecture nor lecturer);
s140, identifying whether the online course video contains case analysis by combining the images and text in the video shots with the speech text corresponding to the audio clips; if a case-analysis keyword appears in the image, the text, or the speech text, the online course video is considered to contain case analysis;
s150, taking the video content style of the video shot and the result of whether the online course video contains case analysis as the analysis result of the online course video resource to be analyzed.
Specifically, if the video shot includes a slide lecture, the video content style of the video shot further includes the presentation type of the slide lecture; the presentation type includes at least one of the following: plain text, plain image, image-text, and animation.
In a specific embodiment, an implementation process of the online course video resource composition analysis method provided in the embodiment of the present invention includes the following steps:
(1) acquiring a target online course video, and respectively acquiring video stream information and audio stream information from a video file;
(2) processing the video stream information to extract video shots, and segmenting the audio stream information into audio clips;
(3) judging the type of each video shot: pure slide lecture, pure teacher, slide lecture mixed with teacher picture, and others (neither slide lecture nor teacher picture);
(4) further analyzing the presentation type of the slide lecture in video shots that contain one: plain text, plain image, image-text, or animation;
(5) recognizing the image text and speech text in the video shots, and judging whether case analysis exists by combining the image text and the speech text;
(6) summarizing the information from steps (3) to (5) to generate a design element evaluation report of the video resource as the composition analysis result of the online course video resource.
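The steps (1)-(6) above can be sketched as a small pipeline. The following Python illustration is hedged: the function names, input shapes, and shot-type labels are hypothetical stand-ins for the patented implementation, and steps (4)-(5) are abstracted into a precomputed `case_found` flag.

```python
def classify_shot(has_slide, has_teacher):
    """Step (3): map the slide/teacher detection results to a shot type."""
    if has_slide and has_teacher:
        return "slide+teacher"
    if has_slide:
        return "pure slide"
    if has_teacher:
        return "pure teacher"
    return "other"

def build_report(shots, case_found):
    """Step (6): summarize per-shot styles plus the case-analysis flag."""
    return {
        "shot_styles": [classify_shot(s["slide"], s["teacher"]) for s in shots],
        "contains_case_analysis": case_found,
    }

# Two detected shots: slide only, then slide with the teacher on camera.
report = build_report(
    [{"slide": True, "teacher": False}, {"slide": True, "teacher": True}],
    case_found=True,
)
print(report["shot_styles"])  # → ['pure slide', 'slide+teacher']
```

In a real system the booleans would come from the image-analysis steps described below; here they are supplied directly to keep the sketch self-contained.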
It is understood that the design elements comprise the video content style and whether case analysis exists. The type of a video shot, such as pure slide lecture, pure teacher, slide lecture mixed with teacher picture, or others, is the video content style. Further, the presentation type of the slide lecture may also be considered a video content style. That is, a video content style may contain multiple elements and is not limited to a single representation.
The process of processing the video stream information in step (2) is specifically as follows:
extracting image frames at one-second intervals, completing the extraction of video shots by calculating the correlation between adjacent image frames, and recording the start time and end time of each video shot;
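A minimal sketch of this shot extraction step, assuming the frames sampled once per second have already been reduced to grayscale histograms. The cosine-style correlation measure and the 0.9 cut threshold are illustrative assumptions, not values specified by the invention.

```python
import math

def correlation(h1, h2):
    """Normalized dot product of two histograms (1.0 = identical shape)."""
    dot = sum(a * b for a, b in zip(h1, h2))
    n1 = math.sqrt(sum(a * a for a in h1))
    n2 = math.sqrt(sum(b * b for b in h2))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def extract_shots(histograms, threshold=0.9):
    """Split a 1-fps histogram sequence into shots; return (start_s, end_s) pairs."""
    shots, start = [], 0
    for t in range(1, len(histograms)):
        if correlation(histograms[t - 1], histograms[t]) < threshold:
            shots.append((start, t))   # correlation drop = shot boundary
            start = t
    shots.append((start, len(histograms)))
    return shots

# Three identical seconds, a cut, then two seconds of a different picture:
frames = [[10, 0, 5]] * 3 + [[0, 8, 1]] * 2
print(extract_shots(frames))  # → [(0, 3), (3, 5)]
```

The recorded pairs correspond to the per-shot start and end times mentioned above.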
The process of processing the audio stream information in step (2) is specifically as follows:
segmenting the audio according to pauses in the speaker's speech, and recording the start time and end time of each audio segment;
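A hedged illustration of pause-based segmentation: windows whose mean energy falls below a threshold are treated as speaker pauses, and segment start/end times are recorded. The one-second window and the 0.1 silence threshold are assumed values for illustration only.

```python
def segment_audio(energies, silence_thresh=0.1):
    """energies: per-window mean energy (one window per second, for simplicity).
    Returns (start_s, end_s) pairs for runs of speech between pauses."""
    segments, start = [], None
    for t, e in enumerate(energies):
        if e >= silence_thresh and start is None:
            start = t                      # speech begins
        elif e < silence_thresh and start is not None:
            segments.append((start, t))    # a pause ends the segment
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments

# Two seconds of speech, a one-second pause, then two more seconds of speech:
print(segment_audio([0.5, 0.6, 0.02, 0.4, 0.4]))  # → [(0, 2), (3, 5)]
```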
The identification process of the video shot type in step (3) is specifically as follows:
comparing the image change areas of all video shots, detecting whether the change areas contain text, and merging the peripheral contours of all the image change areas (taking the maximum peripheral contour); if the merged peripheral contour is rectangular and contains text information, it is determined that a slide lecture exists in the video, and the slide lecture area is delimited;
meanwhile, analyzing the images to determine whether a person appears in the video shot;
combining the slide lecture result with the person recognition result, the video shots are divided into pure slide lecture, pure teacher, slide lecture mixed with teacher, and the like;
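The contour-merging test above can be sketched with axis-aligned bounding boxes. This is a hedged illustration: representing each change area as an `(x1, y1, x2, y2)` box and calling the merged contour "rectangular" when the change areas fill at least 60% of it are assumptions for this sketch, not the invention's exact geometry.

```python
def merge_boxes(boxes):
    """Maximum peripheral contour enclosing all change areas."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def area(b):
    return max(0, b[2] - b[0]) * max(0, b[3] - b[1])

def looks_like_slide(boxes, has_text, fill_ratio=0.6):
    """Slide lecture detected if the merged contour is well filled by the
    change areas and text was found inside them."""
    merged = merge_boxes(boxes)
    filled = sum(area(b) for b in boxes) / area(merged)  # ignores overlaps
    return has_text and filled >= fill_ratio, merged

# Two stacked change areas that together form a clean rectangle:
boxes = [(10, 10, 60, 40), (10, 40, 60, 70)]
detected, slide_area = looks_like_slide(boxes, has_text=True)
print(detected, slide_area)  # → True (10, 10, 60, 70)
```

The returned `slide_area` plays the role of the delimited slide lecture area used by the later presentation-type analysis.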
The specific process of identifying the slide presentation type in the video shot in step (4) is as follows:
determining whether graphics exist in the delimited area of the video shot; if so, the area is judged to contain graphics, and if text exists at the same time, the presentation type is judged to be image-text;
determining whether the image frames within the same shot have an inclusion relationship; that is, if a certain image frame contains the content (text and graphics) of a preceding image frame, it is determined that animation exists;
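A minimal sketch of step (4), assuming each frame's slide area has already been reduced to a set of detected elements (text tokens and graphic shapes); the set representation is an assumption made for illustration.

```python
def presentation_type(has_graphics, has_text):
    """Classify the delimited slide area by what it contains."""
    if has_graphics and has_text:
        return "image-text"
    if has_graphics:
        return "plain image"
    return "plain text"

def has_animation(frame_elements):
    """Animation is inferred when a frame strictly contains all elements of a
    preceding frame in the same shot (content revealed incrementally)."""
    for i in range(len(frame_elements) - 1):
        if frame_elements[i] < frame_elements[i + 1]:  # strict subset
            return True
    return False

# Bullet points appearing one by one suggest an animated slide:
frames = [{"title"}, {"title", "bullet1"}, {"title", "bullet1", "bullet2"}]
print(presentation_type(has_graphics=False, has_text=True))  # → plain text
print(has_animation(frames))  # → True
```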
The specific process of recognizing case analysis in the video shots in step (5) is as follows:
converting the audio clips into corresponding text by speech recognition technology;
segmenting the words in the speech text and the slide lecture text, extracting keywords, and searching the full-text semantic information by regular-expression matching; if keywords such as 'case', 'instance', 'for example', 'such as', 'example', or 'trial proof' are found, it is determined that case analysis exists;
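The keyword search above can be implemented as a single alternation pattern matched against the combined speech and slide text. This is a hedged sketch: the English keyword list stands in for the original terms, and the helper name is illustrative.

```python
import re

CASE_KEYWORDS = ["case", "instance", "for example", "such as", "example"]
# Escape each keyword and join into one alternation pattern.
CASE_PATTERN = re.compile("|".join(re.escape(k) for k in CASE_KEYWORDS))

def contains_case_analysis(speech_text, slide_text):
    """True if any case-analysis keyword appears in either text source."""
    combined = f"{speech_text} {slide_text}".lower()
    return CASE_PATTERN.search(combined) is not None

print(contains_case_analysis("for example, consider an RC circuit", ""))  # → True
print(contains_case_analysis("today we derive Ohm's law", "slide 1"))     # → False
```

A word-segmentation step would precede this matching for languages without whitespace delimiters, as described above.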
the invention utilizes artificial intelligence technology to identify the design elements of the online course video resources through image and voice analysis.
In a specific embodiment, the design element identification method and intelligent analysis system for online course video resources provided by the embodiment of the present invention include: a video acquisition module, a video preprocessing module, and a video design element recognition module. The video acquisition module is used for reading the video file to be detected and acquiring video stream information and audio stream information; the video preprocessing module is used for processing the video stream information and the audio stream information; and the video design element recognition module is used for recognizing the design elements of the video resource according to the results of the image and speech analysis and finally generating an evaluation report.
The video acquisition module can read a target online course video and load the video file to be detected into memory; video stream information and audio stream information are then obtained from the video file, respectively.
The video preprocessing module comprises an audio processing module and a video processing module. The audio processing module is used for processing the audio stream information and converting it into speech text through speech recognition. The video processing module is used for processing the video stream information, extracting image frames at one-second intervals, completing the extraction of video shots by calculating the correlation between adjacent image frames, and recognizing the text in the images as image text by text recognition (OCR).
the video design element identification module comprises: the system comprises a lens type identification unit, a slide show presentation type identification module, a case analysis identification module and a video design element evaluation report generation module.
The lens type identification unit firstly identifies the change areas of different lenses, merges the peripheral outlines of the change areas, and demarcates the maximum peripheral outline shape, and if the peripheral outline shape is similar to a rectangle and characters are detected, the change areas are judged as a slide lecture; meanwhile, whether a teacher exists in the image frame is judged through face recognition; merging the slide lecture notes and the human body judgment results;
the slide lecture presentation type identification module analyzes a slide lecture area defined in the image frame, detects whether characters exist or not, detects whether lines and graphs exist or not, and combines the character detection result and the graph detection result; identifying the inclusion relationship between characters and figures in the slide lecture area of adjacent image frames of the same lens, and if the inclusion relationship exists between a certain image frame and the content of a preorder image frame, judging that the image frame is an animation;
the case analysis recognition module performs 'example' on the image text and the voice text of the video, if the example is, for example, keyword retrieval such as 'trial proof', and the like, and if the matching is hit, the case analysis recognition module judges that the video contains the explanation in the form of case analysis;
and the video resource design element evaluation report generation module summarizes the identification result to generate an evaluation report.
It can be understood that the invention, based on video image and speech analysis technology, intelligently analyzes and evaluates online course teaching video resources from the perspective of design elements; it has great reference value and practical significance for online course platforms and producers, and also makes it easier for learners to find video resources that suit their habits.
The beneficial effects of the invention can be mainly used in the following situations:
(1) an online course video producer can use the invention to independently evaluate the design elements of a produced online course video, identify its deficiencies, and revise it.
(2) An online course platform can use the invention to identify the design elements of videos uploaded by producers and provide label information, so that learners can quickly find the resources that suit them.
To further explain the design element identification method and the intelligent analysis system for online course video resources provided by the embodiment of the invention, the following is detailed with reference to the accompanying drawings and specific examples:
in a more specific embodiment, this example provides a design element recognition system for an online course video on "circuit theory". The circuit theory online course video is an mp4 file. In this embodiment, a computer is required to use the system.
The specific use steps are as follows:
s1: and clicking to start preprocessing the video, and extracting the video stream and the audio stream. The subsequent steps S2-S3 are video stream preprocessing flow, and the steps S4-S5 are audio stream preprocessing flow.
S2: and carrying out similarity comparison on the extracted video frames and carrying out deletion operation, and reserving the video frames in a picture form.
S3: and identifying and extracting image text information in the picture by adopting a CTPN + CRNN algorithm. The CTPN can effectively detect the transversely distributed characters of a complex scene, and the CRNN model is a popular image-text recognition model at present and can recognize a longer text sequence.
S4: the audio stream is segmented based on an energy analysis of the audio waveform.
S5: calling a CMUSPinx voice recognition algorithm to perform voice recognition, and recording voice text information; the cmnspinx speech recognition algorithm is one of the mainstream open source speech recognition frameworks at present, originates from the university of calkymelong, and has a model in which a plurality of speeches including mandarin, english, french, spanish, and italian can be directly used.
S6: firstly, identifying the variation areas of different lenses, merging the peripheral outlines of the variation areas, defining the maximum peripheral outline shape, judging as a slide lecture if the peripheral outline shape is approximate to a rectangle and characters are detected, and executing the step S8-the step S9, otherwise skipping the step S8-the step S9.
S7: and judging whether a teacher exists in the image frame by adopting an MTCNN face detection algorithm. The MTCNN algorithm is a method for detecting human faces by a multitask cascade convolution neural network, and is one of the best human face detectors with the effect of open source codes so far.
S8: analyzing a slide presentation area defined in the image frame, detecting whether characters exist or not, and detecting whether lines and graphs exist or not.
S9: and identifying the inclusion relation of characters and graphics in the slide show lecture areas of adjacent image frames of the same lens.
S10: the image text and the voice text of the video are analyzed, and keywords like' case, example, if, for example, proof, etc. are searched in the image and voice text information in a regular matching manner, so that the case in the video is identified.
S12: summarizing the results of the steps S6-S11, judging the type of the lens according to the results of the steps S6-S7, judging the presentation type of the slide lecture according to the results of the steps S8-S9, judging whether the case analysis form explanation is contained according to the result of the step S10, and finally generating and displaying a video resource design element evaluation report.
The invention can intelligently analyze and recognize the design elements of an online course teaching video. Because the recognition result is based on the image and speech data of the video, it can effectively verify design elements such as whether the teacher appears on camera, whether the slides combine pictures and text, whether animation is used, and whether case analysis exists. The recognition result is accurate and has great reference value and practical significance for online course platforms and producers.
FIG. 2 is a block diagram of a system for identifying online course video asset design elements provided by an embodiment of the present invention; as shown in fig. 2, includes:
a video analysis unit 210, configured to determine video stream information and audio stream information from the online course video to be parsed, respectively;
a shot extraction unit 220, configured to process the video stream to extract each video shot, and segment the audio stream information to obtain each audio segment;
a shot recognition unit 230, configured to recognize the video content style of each video shot; the video content style comprises at least one of the following: slide lecture, lecturer, and others (neither slide lecture nor lecturer);
a case analysis unit 240, configured to identify whether the online course video contains case analysis by combining the images and text in the video shots with the speech text corresponding to the audio clips; if a case-analysis keyword appears in the image, the text, or the speech text, the online course video is considered to contain case analysis;
and an element analysis unit 250, configured to take the video content style of each video shot, together with the result of whether the online course video contains case analysis, as the composition analysis result of the online course video resource to be analyzed.
It is understood that specific functions of each unit in fig. 2 can refer to detailed descriptions in the foregoing method embodiments, and are not described herein again.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.