CN111641869B - Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium - Google Patents

Info

Publication number
CN111641869B
CN111641869B CN202010500050.1A CN202010500050A CN111641869B CN 111641869 B CN111641869 B CN 111641869B CN 202010500050 A CN202010500050 A CN 202010500050A CN 111641869 B CN111641869 B CN 111641869B
Authority
CN
China
Prior art keywords
video
frames
frame
adjacent
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010500050.1A
Other languages
Chinese (zh)
Other versions
CN111641869A (en)
Inventor
熊军
赵俊博
栾博恒
陈澈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubo Network Technology Shanghai Co ltd
Original Assignee
Hubo Network Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubo Network Technology Shanghai Co ltd
Priority to CN202010500050.1A
Publication of CN111641869A
Application granted
Publication of CN111641869B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a video split mirror (shot segmentation) method, a video split mirror device, electronic equipment and a computer-readable storage medium. After the I frames in a video to be processed are identified, the I frames are aggregated according to the distance between adjacent I frames to obtain target I frames. The video to be processed is segmented based on the target I frames to obtain video segments. A target video frame in each video segment is then determined according to the distance between the video frames in that segment, and the video segment is segmented based on the target video frame. Because the video is segmented using the target I frames obtained by aggregation, the scheme avoids the over-fine segmentation caused by segmenting directly at every I frame as in the prior art. Because each video segment is further segmented at its target video frames, the accuracy of segmentation at scene change points is further improved. The scheme thus improves segmentation accuracy while avoiding the low processing efficiency of prior-art segmentation schemes.

Description

Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a video split mirror (shot segmentation) method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the spread of the mobile internet and communication technology, video has gradually become a mainstream file format, and the need to process key information in videos, whether short or long, is increasingly common. For example, in video post-production, a splicing plan for the assembled video material must be designed according to the text script and the split mirror (storyboard) script. The original video therefore needs to be sliced at a fine granularity.
Video segmentation is mainly done either manually by an editor or by automatic shot splitting. In the commonly used automatic approach, the video is segmented directly at its I frames. Because the I frames in a video do not fully reflect its scene changes, this approach cannot segment the video accurately: the segmentation may be too fine, and scene change points may not be cut accurately. Another approach trains a neural network to obtain a segmentation model and segments the video with that model; because it requires model training and a subsequent model inference step, its processing efficiency is very low.
Disclosure of Invention
An object of the present application includes, for example, providing a video segmentation method, apparatus, electronic device and computer readable storage medium, which can improve video segmentation accuracy and improve processing efficiency.
The embodiment of the application can be realized as follows:
In a first aspect, an embodiment of the present application provides a video split mirror method, where the method includes:
identifying an I frame in a plurality of video frames contained in a video to be processed;
performing aggregation processing on the plurality of I frames according to the distance between every two adjacent I frames to obtain a target I frame in the plurality of I frames;
segmenting the video to be processed based on the obtained target I frame to obtain a plurality of video segments;
for each video clip, determining a target video frame in the video clip according to the distance between every two adjacent video frames in a plurality of video frames contained in the video clip;
and segmenting the video segment based on the target video frame.
In an optional embodiment, the step of performing aggregation processing on the multiple I frames according to the distance between every two adjacent I frames to obtain a target I frame in the multiple I frames includes:
calculating the distance between every two adjacent I frames;
and detecting whether the distance between the two adjacent I frames meets a preset aggregation condition, and if so, taking the later I frame of the two adjacent I frames as a target I frame.
In an optional embodiment, the step of detecting whether the distance between two adjacent I frames satisfies a preset aggregation condition includes:
obtaining an average distance value according to the distance between a plurality of groups of adjacent I frames;
and for every two adjacent I frames, detecting whether the distance between the two adjacent I frames is smaller than the average distance value and smaller than a preset distance value, and if so, determining that the distance between the two adjacent I frames meets a preset aggregation condition.
In an optional embodiment, the step of determining, for each video segment, a target video frame in the video segment according to a distance between every two adjacent video frames in a plurality of video frames included in the video segment includes:
for each video segment, calculating the distance between every two adjacent video frames in a plurality of video frames contained in the video segment;
obtaining a difference value corresponding to the two adjacent video frames according to the distance between the two adjacent video frames;
and detecting whether the difference value meets a preset abnormal condition, and if so, taking either of the two video frames corresponding to the difference value as a target video frame.
In an optional implementation manner, the step of obtaining a difference value corresponding to the two adjacent video frames according to the distance between the two adjacent video frames includes:
obtaining a first difference value corresponding to the two adjacent video frames according to the distance between the two adjacent video frames;
and calculating the absolute difference between every two adjacent first difference values as the second difference value corresponding to those two adjacent first difference values.
In an optional embodiment, the step of detecting whether the difference value satisfies a preset abnormal condition includes:
performing Gaussian distribution processing on the plurality of first difference values and the plurality of second difference values to obtain first abnormal critical values corresponding to the plurality of first difference values and second abnormal critical values corresponding to the plurality of second difference values;
for the plurality of first difference values, finding a first difference value that is larger than the first abnormal critical value and is a local extremum among the plurality of first difference values, and if the second difference value corresponding to the found first difference value is a local extremum among the plurality of second difference values, determining that the found first difference value meets the preset abnormal condition;
and for the plurality of second difference values, finding a second difference value that is larger than the second abnormal critical value and is a local extremum among the plurality of second difference values, and if the first difference value corresponding to the found second difference value is a local extremum among the plurality of first difference values, determining that the found second difference value meets the preset abnormal condition.
In an optional embodiment, before the step of segmenting the video segment based on the target video frame, the method further includes:
for each video clip, obtaining the pixel value of each pixel contained in each video frame in the video clip;
and when the number of pixels in a video frame whose pixel values are lower than a preset value exceeds a preset number, determining that the video frame is a target video frame.
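The dark-pixel check in this optional embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `is_dark_frame`, the grayscale list-of-rows frame layout, and the two threshold values used in the example are all assumptions made for the sketch.

```python
from typing import Sequence


def is_dark_frame(gray: Sequence[Sequence[int]],
                  value_threshold: int,
                  count_threshold: int) -> bool:
    """Return True when more than `count_threshold` pixels are below `value_threshold`.

    `gray` is a frame given as rows of grayscale pixel values (an assumed layout).
    """
    dark_pixels = sum(1 for row in gray for px in row if px < value_threshold)
    return dark_pixels > count_threshold


# Hypothetical 4x4 frames: one fully black, one fully bright.
black = [[0] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
black_is_target = is_dark_frame(black, value_threshold=16, count_threshold=12)
bright_is_target = is_dark_frame(bright, value_threshold=16, count_threshold=12)
```

A frame flagged this way (for example a fade to black) would be treated as a target video frame in addition to those found from inter-frame distances.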
In a second aspect, an embodiment of the present application provides a video split mirror device, where the device includes:
the identification module is used for identifying I frames in a plurality of video frames contained in the video to be processed;
the aggregation module is used for aggregating the plurality of I frames according to the distance between every two adjacent I frames to obtain a target I frame in the plurality of I frames;
the first segmentation module is used for segmenting the video to be processed based on the obtained target I frame to obtain a plurality of video segments;
the determining module is used for determining a target video frame in each video segment according to the distance between every two adjacent video frames in a plurality of video frames contained in the video segment;
and the second segmentation module is used for segmenting the video segment based on the target video frame.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a storage medium and a bus. The storage medium stores machine-readable instructions executable by the processor. When the electronic device runs, the processor communicates with the storage medium through the bus and executes the machine-readable instructions to perform the steps of the method according to any one of the preceding embodiments.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the method according to any one of the foregoing embodiments.
The beneficial effects of the embodiment of the application include, for example:
According to the video split mirror method, the video split mirror device, the electronic equipment and the computer-readable storage medium, after the I frames in the video to be processed are identified, the I frames are aggregated according to the distance between adjacent I frames to obtain target I frames, and the video to be processed is segmented based on the target I frames to obtain video segments. A target video frame in each video segment is then determined according to the distance between the video frames in that segment, and finally the video segment is segmented based on the target video frame. Because the video is segmented using the target I frames obtained by aggregation, the scheme avoids the over-fine segmentation caused by segmenting directly at every I frame as in the prior art. Because each video segment is further segmented at its target video frames, the accuracy of segmentation at scene change points is further improved. The scheme thus improves segmentation accuracy while avoiding the low processing efficiency of prior-art segmentation schemes.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a video split mirror method according to an embodiment of the present application;
fig. 3 is a flowchart of a target I-frame determining method according to an embodiment of the present application;
fig. 4 is a flowchart of a preset aggregation condition detection method according to an embodiment of the present application;
fig. 5 is a flowchart of a target video frame determination method according to an embodiment of the present application;
fig. 6 is a flowchart of a preset abnormal condition detection method according to an embodiment of the present application;
fig. 7 is another flowchart of a target video frame determination method according to an embodiment of the present application;
fig. 8 is a functional block diagram of a video split mirror device according to an embodiment of the present application.
Reference numerals: 110-processor; 120-memory; 130-communication module; 140-video split mirror device; 141-identification module; 142-aggregation module; 143-first segmentation module; 144-determining module; 145-second segmentation module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present application, the terms "first," "second," and the like, if any, are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
It should be noted that the features of the embodiments of the present application may be combined with each other without conflict.
Referring to fig. 1, a block diagram of an electronic device provided in the embodiment of the present application is shown, where the electronic device may include, but is not limited to, a computer, a server, and other devices. The electronic device may include a memory 120, a processor 110, and a communication module 130. The memory 120, the processor 110 and the communication module 130 are electrically connected to each other directly or indirectly to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines.
The memory 120 is used for storing programs or data. The memory 120 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 110 is used for reading/writing data or programs stored in the memory 120 and executing the video split mirror method provided by any embodiment of the present application.
The communication module 130 is used for establishing a communication connection between the electronic device and another communication terminal through a network, and for transceiving data through the network.
It should be understood that the configuration shown in fig. 1 is merely a schematic configuration diagram of an electronic device, which may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
Referring to fig. 2, fig. 2 is a flowchart of a video split mirror method according to an embodiment of the present application; the method can be executed by the electronic device shown in fig. 1. It should be understood that in other embodiments, the order of some steps of the video split mirror method of this embodiment may be interchanged according to actual needs, or some steps may be omitted. The detailed steps of the video split mirror method are described as follows.
Step S210, identifying the I frames among a plurality of video frames included in the video to be processed.
Step S220, performing aggregation processing on the multiple I frames according to the distance between every two adjacent I frames to obtain a target I frame in the multiple I frames.
Step S230, segmenting the video to be processed based on the obtained target I frame to obtain a plurality of video segments.
Step S240, for each video segment, determining a target video frame in the video segment according to a distance between every two adjacent video frames in a plurality of video frames included in the video segment.
Step S250, segmenting the video segment based on the target video frame.
Video relies on the persistence of vision of the human eye: a series of essentially static video frames shown in succession creates a sense of motion, producing a dynamic video effect. Because the amount of video data is generally large, transmitting it puts great pressure on network transmission resources and storage resources. Therefore, to ease transmission and reduce the occupation of network and storage resources, video data is generally segmented before transmission, and the receiving end restores the complete video data from the received information.
Under the above scenario, how to reasonably segment video data is an important problem, and in the current video data segmentation scheme, either the problem of inaccurate segmentation exists or the problem of low processing efficiency exists. Based on this, the present embodiment provides a video split mirror scheme to improve the problems existing in the prior art.
An I frame in a video, also called an intra-coded frame or key frame, is an independent frame that carries all of its own information and can be decoded without reference to other images. In video slicing, I frames may serve as important slicing points. Therefore, in this embodiment, a video compression algorithm may be used to identify the I frames of the video to be processed; the B frames and P frames of the video to be processed may also be identified. A P frame, also called an inter-frame predictive coded frame, is decoded with reference to a preceding I frame or P frame. A B frame, also called a bidirectional predictive coded frame, records the differences between the current frame and both the preceding and following frames.
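As one possible way to obtain the frame types, the sketch below reads the picture type of every frame with the ffprobe command-line tool and keeps the positions of the I frames. The use of ffprobe is an assumption of this sketch (the embodiment only says that a video compression algorithm may be used), and the helper names `probe_pict_types` and `i_frame_indices` are hypothetical.

```python
import subprocess
from typing import Iterable, List


def i_frame_indices(pict_types: Iterable[str]) -> List[int]:
    """Return the 0-based positions of frames whose picture type is 'I'."""
    return [i for i, t in enumerate(pict_types)
            if t.strip().rstrip(",") == "I"]


def probe_pict_types(path: str) -> List[str]:
    """Ask ffprobe for one picture type (I, P or B) per frame.

    Assumption: ffprobe is installed and on PATH; this is not part of the source.
    """
    cmd = ["ffprobe", "-v", "error", "-select_streams", "v:0",
           "-show_entries", "frame=pict_type", "-of", "csv=p=0", path]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [line for line in out.splitlines() if line.strip()]


# Example on a hand-written type sequence (no video file needed).
sample_types = ["I", "P", "B", "B", "I", "P"]
sample_i_frames = i_frame_indices(sample_types)
```

The resulting index list plays the role of the list of I-frame positions that later steps use to form adjacent I-frame pairs.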
In the prior art, videos are often sliced directly at the identified I frames. However, the scene does not necessarily change at every I frame, so slicing directly at the I frames in the video stream may produce segments that are too fine.
Therefore, in this embodiment, after the I frame included in the video to be processed is identified, a plurality of I frames may be selected, and the plurality of I frames may be aggregated according to the distance between every two adjacent I frames to obtain the target I frame in the plurality of I frames.
The distance between the video frames can reflect the similarity between the two frames to a certain extent, so that when the distance between the two frames is small, the change between the two frames is small, and the scene switching may not exist. Conversely, if the distance between two frames is large, it indicates that the change between the two frames is large, and there is a high possibility of scene switching. Optionally, the distance between the video frames may be calculated by using a perceptual hash algorithm, and certainly, may also be calculated by using other methods, which is not limited in this embodiment.
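To make the notion of frame distance concrete, the following sketch computes a 64-bit hash per frame and takes the Hamming distance between two hashes. It uses an average hash as a simple stand-in for a perceptual hash; the embodiment does not fix a particular variant, so the hash choice, the function names and the grayscale list-of-rows input are assumptions of this sketch.

```python
from typing import List, Sequence


def average_hash(gray: Sequence[Sequence[float]], size: int = 8) -> int:
    """Shrink the frame to size x size by block averaging, then set one bit
    per cell that is brighter than the mean of all cells."""
    h, w = len(gray), len(gray[0])
    if h < size or w < size:
        raise ValueError("frame must be at least size x size pixels")
    cells: List[float] = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for c in cells:
        bits = (bits << 1) | (1 if c > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes (0 to size * size)."""
    return bin(a ^ b).count("1")


# Two hypothetical 16x16 frames with opposite bright halves.
left_bright = [[255] * 8 + [0] * 8 for _ in range(16)]
right_bright = [[0] * 8 + [255] * 8 for _ in range(16)]
same = hamming_distance(average_hash(left_bright), average_hash(left_bright))
opposite = hamming_distance(average_hash(left_bright), average_hash(right_bright))
```

With a 64-bit hash the distance ranges from 0 (identical hashes) to 64 (every bit differs), so a small distance suggests similar picture content and a large one suggests a possible scene switch.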
In this embodiment, it may be determined which I frames are finally used as target I frames to slice the video to be processed based on the distance between two adjacent I frames.
In this embodiment, after the I frames in the video to be processed are identified, the relevant information of the identified I frames may be stored in a created list. The stored information may include the position of each I frame in the video to be processed, for example, its frame number. Based on the position information stored in the list, every two adjacent I frames among the plurality of I frames can be determined, and the distance between them can be calculated.
In this embodiment, a video to be processed is first segmented with the determined target I frame, so as to segment the video to be processed into a plurality of video segments.
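The first segmentation pass can be sketched as turning a list of cut positions into frame ranges. The conventions here are assumptions of the sketch, not statements from the source: frames are numbered from 0, ranges are half-open, and each target I frame starts a new segment.

```python
from typing import List, Sequence, Tuple


def split_by_cut_points(num_frames: int,
                        cut_points: Sequence[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges, starting a new segment at
    every cut point that lies strictly inside the video."""
    bounds = sorted({p for p in cut_points if 0 < p < num_frames})
    starts = [0] + bounds
    ends = bounds + [num_frames]
    return list(zip(starts, ends))


# Hypothetical 150-frame video with target I frames at positions 50 and 100.
segments = split_by_cut_points(150, [50, 100])
```

The same helper could be reused for the second pass, applied within each segment to the target video frames found there.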
In view of the fact that there may be a scene switching phenomenon inside each video segment obtained by segmenting the target I frame, in order to avoid missing the video frames corresponding to the scene switching in the content of the video segment, in this embodiment, the operation of segmenting each video segment is further performed.
Optionally, in this embodiment, in each video segment, a target video frame in the video segment is determined according to a distance between every two adjacent video frames. The principle underlying this is the same as that described above for determining the target I-frame. The video segments are segmented based on the determined target video frames, so that points with large picture content difference among the video frames due to scene conversion and the like in the video segments can be segmented.
Based on the video split mirror scheme provided by this embodiment, the identified I frames may be aggregated to obtain the target I frames among them, and the video to be processed is segmented a first time based on the target I frames to obtain video segments. This avoids the over-fine segmentation that results from segmenting directly at every I frame. Furthermore, for each video segment, a target video frame can be determined based on the distance between every two adjacent video frames, and the video segment is segmented at the target video frame, which avoids the difficulty of accurately segmenting the scene switching points inside the video segment.
Referring to fig. 3, in the present embodiment, the determining of the target I frame in step S220 may be performed in the following manner:
Step S221, calculating the distance between every two adjacent I frames.
Step S222, detecting whether the distance between the two adjacent I frames satisfies a preset aggregation condition, and if so, executing the following step S223.
Step S223, taking the later I frame of the two adjacent I frames as a target I frame.
In this embodiment, a perceptual hash algorithm may be used to calculate the distance between two adjacent I frames. It should be noted that two adjacent I frames are not necessarily adjacent video frames; they are neighbors in the sequence of I frames. For example, if the 1st, 50th and 100th frames are I frames and no other frame from the 1st to the 100th is an I frame, then the 1st and 50th frames are adjacent I frames, and the 50th and 100th frames are adjacent I frames.
For every two adjacent I frames, if the distance between them satisfies the preset aggregation condition, the later of the two I frames may be used as a target I frame. The earlier I frame is not used as a target I frame, that is, it is not used as a segmentation point during segmentation.
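The pairing of adjacent I frames from the example above (frames 1, 50 and 100) can be sketched as follows. The mapping from frame position to an integer hash and the hash values themselves are made up for illustration; the distance is taken as the Hamming distance between hashes, which is one assumed way of realizing the perceptual hash distance.

```python
from typing import Dict, List, Tuple


def adjacent_i_frame_distances(hashes: Dict[int, int]) -> List[Tuple[int, int, int]]:
    """For I frames keyed by position, return (earlier, later, distance) for
    every pair of neighbors in the I-frame sequence."""
    positions = sorted(hashes)
    return [(a, b, bin(hashes[a] ^ hashes[b]).count("1"))
            for a, b in zip(positions, positions[1:])]


# Hypothetical 4-bit hashes for the I frames at positions 1, 50 and 100.
pairs = adjacent_i_frame_distances({1: 0b1111, 50: 0b1110, 100: 0b0000})
```

Frames 1 and 100 are never compared directly, since they are not neighbors in the I-frame sequence.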
Alternatively, referring to fig. 4, when determining whether the distance between two adjacent I frames satisfies the preset aggregation condition, the following steps may be performed:
Step S2221, obtaining an average distance value according to the distances between the multiple pairs of adjacent I frames.
Step S2222, for every two adjacent I frames, detecting whether the distance between the two adjacent I frames is smaller than the average distance value and smaller than a preset distance value, and if both conditions hold, executing the following step S2223.
Step S2223, determining that the distance between the two adjacent I frames satisfies the preset aggregation condition.
It should be understood that the identified I frames form multiple pairs of adjacent I frames, so multiple distance values are obtained, and an average distance value can be computed from them. In this embodiment, the purpose of aggregating the I frames is to find I frames whose picture content differs little, so as to avoid excessively fine segmentation. The target I frames may therefore be determined relative to the average distance value computed over the whole set of pairs.
In this embodiment, pairs of I frames whose distance is smaller than the average distance value may be found; alternatively, pairs whose distance is smaller than 1/2 of the average distance value may be found. This embodiment imposes no specific limitation. In addition, prior analysis of large amounts of data shows that when the distance between two video frames is smaller than a certain value, about 30, the content difference between the two frames is usually small. Therefore, it can further be detected whether the distance between two I frames satisfying the above condition is also smaller than a preset distance value, which may be 30, 35, or the like. If both conditions are met, the distance between the two I frames is determined to satisfy the preset aggregation condition, and the two I frames may be aggregated, that is, the later of the two I frames is used as the target I frame.
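The two-part aggregation condition can be sketched as below: a pair qualifies when its distance is below both the average over all pairs and a preset value, and the later frame of each qualifying pair is reported. This follows the wording of steps S2221 to S2223 literally; how runs of several similar I frames or non-qualifying pairs are handled is not spelled out in this passage, so the sketch makes no attempt at it. The function name and the input tuples are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def select_target_i_frames(pairs: Sequence[Tuple[int, int, float]],
                           preset_distance: float = 30.0) -> List[int]:
    """`pairs` holds (earlier, later, distance) for adjacent I frames.

    Returns the later frame of every pair whose distance is smaller than both
    the average distance and `preset_distance` (30 is the example value from
    the text).
    """
    if not pairs:
        return []
    average = mean([d for _, _, d in pairs])
    return [later for _, later, d in pairs
            if d < average and d < preset_distance]


# Hypothetical distances: average is 26, so only the first pair qualifies.
targets = select_target_i_frames([(1, 50, 10), (50, 100, 40), (100, 150, 28)])
```

Replacing `d < average` with `d < average / 2` would give the stricter variant mentioned in the text.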
After the video to be processed has been divided into a plurality of video segments based on the determined target I frames in step S230, the target video frame in each video segment may be determined in the following manner, so that the video segments can be further divided; please refer to fig. 5.
Step S231, for each video segment, calculating the distance between every two adjacent video frames among the plurality of video frames contained in the video segment.
Step S232, obtaining a difference value corresponding to the two adjacent video frames according to the distance between them.
Step S233, detecting whether the difference value satisfies a preset abnormal condition, and if so, executing the following step S234.
Step S234, taking either of the two video frames corresponding to the difference value as a target video frame.
In this embodiment, a perceptual hash algorithm may be used to calculate a distance between every two adjacent frames in the video segment, and a calculated distance value is used as a first difference value corresponding to the two adjacent frames. In this embodiment, the obtained plurality of first difference values may be stored in the created first difference list in order. And calculating the absolute difference value between every two adjacent first differential values in the plurality of first differential values as second differential values corresponding to the two adjacent first differential values. The obtained plurality of second difference values may be stored in the created second difference list in order.
In this embodiment, for two adjacent video frames in the video segment, the corresponding difference value includes a first difference value and the second difference value corresponding to that first difference value. Detecting whether the difference value corresponding to two adjacent video frames satisfies the preset abnormal condition makes it possible to determine whether there is a large difference in picture content between the two video frames. If the preset abnormal condition is satisfied, the difference between the two video frames is determined to be large, and either of the two video frames can be used as a target video frame for segmentation.
Referring to fig. 6, optionally, whether the difference value satisfies the preset abnormal condition may be determined through the following sub-steps. It should be noted that the order of the sub-steps is not limited to the one given here; for example, step S2332 and step S2333 may be performed in parallel, may be performed in a different order, or only one of them may be performed, which is not limited in this embodiment.
Step S2331, performing Gaussian distribution processing on the plurality of first difference values and the plurality of second difference values to obtain a first abnormal critical value corresponding to the plurality of first difference values and a second abnormal critical value corresponding to the plurality of second difference values.
Step S2332, for the plurality of first difference values, finding a first difference value which is larger than the first abnormal critical value and is a local extremum, and if a second difference value corresponding to the found first difference value is a local extremum in the plurality of second difference values, determining that the found first difference value satisfies a preset abnormal condition.
Step S2333, for the plurality of second difference values, finding a second difference value which is larger than the second abnormal critical value and is a local extremum, and if the first difference value corresponding to the found second difference value is a local extremum in the plurality of first difference values, determining that the found second difference value satisfies a preset abnormal condition.
In this embodiment, Gaussian distribution processing is adopted: through numerical conversion, the distribution of the first difference values can be obtained and, similarly, the distribution of the second difference values can be obtained.
From the Gaussian distribution result obtained for the plurality of first difference values, a first abnormal critical value may be obtained, which may be the maximum value in the Gaussian distribution result. The significance of this value is that it characterizes a difference between the two corresponding video frames that may be large. Likewise, from the Gaussian distribution result obtained for the plurality of second difference values, a second abnormal critical value may be obtained, which may be the maximum value in that result and which similarly characterizes a difference between the two corresponding video frames that may be large.
Optionally, among the plurality of first difference values in the first difference list, a first difference value is found that is greater than the first abnormal critical value and is a local extremum, that is, it is greater than or equal to the two first difference values immediately before and after it. To further improve the accuracy of the identification, the second difference values can be taken into account for a combined judgment. As described above, each first difference value has a corresponding second difference value. For a first difference value satisfying the above condition, the corresponding second difference value is obtained, and it is detected whether that second difference value is a local extremum among the plurality of second difference values, that is, whether it is greater than or equal to the two second difference values immediately before and after it. If so, it may be determined that the distance between the two video frames corresponding to the first difference value satisfies the preset abnormal condition, and either of the two video frames may be used as the target video frame.
Similarly, among the plurality of second difference values in the second difference list, a second difference value satisfying the condition can be found in the same manner. That second difference value has a corresponding first difference value, and the first difference value has two corresponding video frames. These two video frames satisfy the condition, and either of them can be used as the target video frame.
In this embodiment, the video frames whose distance satisfies the preset abnormal condition may be determined by starting from the first difference values and checking them against the second difference values, by starting from the second difference values and checking them against the first difference values, or by applying both approaches together, which is not limited in this embodiment.
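The combined check of steps S2331 to S2333 can be sketched as follows. Several points are assumptions made for illustration and are not stated in this embodiment: the abnormal critical value is approximated as mean plus `k` standard deviations, a "local extremum" is read as a value greater than or equal to its immediate neighbours (a missing neighbour at the end of a list is ignored), and a first difference value `first[i]` is treated as corresponding to the second difference values computed from it, `second[i - 1]` and `second[i]`. All function and parameter names are hypothetical.

```python
from statistics import mean, pstdev


def abnormal_critical_value(values, k=2.0):
    # Assumption: stand-in for "Gaussian distribution processing".
    return mean(values) + k * pstdev(values)


def is_local_max(values, i):
    # Greater than or equal to the neighbours immediately before and after.
    left = values[i - 1] if i > 0 else float("-inf")
    right = values[i + 1] if i + 1 < len(values) else float("-inf")
    return values[i] >= left and values[i] >= right


def find_abnormal_pairs(first, k=2.0):
    """Return indices i such that frames i and i + 1 satisfy the abnormal condition."""
    if len(first) < 2:
        return []
    second = [abs(b - a) for a, b in zip(first, first[1:])]
    t1 = abnormal_critical_value(first, k)   # first abnormal critical value
    t2 = abnormal_critical_value(second, k)  # second abnormal critical value
    hits = set()
    # S2332: start from the first difference values.
    for i in range(len(first)):
        if first[i] > t1 and is_local_max(first, i):
            related = [j for j in (i - 1, i) if 0 <= j < len(second)]
            if any(is_local_max(second, j) for j in related):
                hits.add(i)
    # S2333: start from the second difference values.
    for j in range(len(second)):
        if second[j] > t2 and is_local_max(second, j):
            for i in (j, j + 1):  # the two first difference values behind second[j]
                if is_local_max(first, i):
                    hits.add(i)
    return sorted(hits)


if __name__ == "__main__":
    print(find_abnormal_pairs([2, 3, 2, 3, 40, 2, 3, 2]))
```

Either frame of each returned pair could then serve as a target video frame. Running only one of the two loops corresponds to performing only one of steps S2332 and S2333.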
In addition, in practice, when a black frame appears in a video, viewers often perceive a scene change. Therefore, when determining target video frames for splitting the video, black frames in a video segment may also be detected and used as target video frames, in addition to the target video frames determined in the manner described above. Optionally, referring to fig. 7, this may be done in the following manner.
Step S310, for each video segment, obtaining a pixel value of each pixel point included in each video frame in the video segment.
Step S320, when the pixel values of more than a preset number of the pixel points contained in a video frame are lower than a preset value, determining that the video frame is a target video frame.
In this embodiment, each video frame is a picture composed of a plurality of pixel points. The pixel value of a black pixel point is low and can be considered approximately equal to 0. For example, if a video frame includes 256 pixel points and the pixel values of more than 200 of them are lower than the preset value, most of the video frame can be considered black. In this case, the video frame gives the user the impression of a scene cut, and it can therefore be used as a segmentation point.
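The black frame check can be sketched as follows. The sketch assumes that a frame is available as a flat sequence of grayscale pixel values; the function name `is_black_frame`, the parameter names, and the default `dark_value` of 16 are hypothetical, while the 256 and 200 figures follow the example above.

```python
def is_black_frame(pixels, dark_value=16, min_dark_count=200):
    """Return True when more than min_dark_count pixels are darker than dark_value.

    pixels: iterable of grayscale pixel values (assumed layout for this sketch).
    dark_value: the preset value below which a pixel is counted as black.
    min_dark_count: the preset number of dark pixels that must be exceeded.
    """
    dark_pixels = sum(1 for value in pixels if value < dark_value)
    return dark_pixels > min_dark_count


if __name__ == "__main__":
    mostly_black = [0] * 210 + [255] * 46   # 256 pixels, 210 of them black
    print(is_black_frame(mostly_black))
```

For frames of other sizes, the preset number would presumably be scaled to the pixel count, for example as a fixed fraction of all pixel points.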
Therefore, target video frames can be determined both by judging the distance between video frames and by detecting pixel values, and the video segments can be segmented accordingly. The split mirror scheme provided by this embodiment can segment accurately at scene changes, which solves the problem of scene mixing. It also avoids the low efficiency of approaches that rely on training a neural network model and using it for recognition. In experiments based on the split mirror scheme provided by this embodiment, the split of a 1 min 480P video was completed automatically in about 20 s, with high accuracy.
Referring to fig. 8, which is a functional block diagram of a video split mirror device 140 according to another embodiment of the present application, the video split mirror device 140 includes an identification module 141, an aggregation module 142, a first segmentation module 143, a determining module 144, and a second segmentation module 145.
The identification module 141 is configured to identify I frames in a plurality of video frames included in the video to be processed.
It is understood that the identification module 141 can be used to execute step S210; for the detailed implementation of the identification module 141, reference can be made to the above content related to step S210.
The aggregation module 142 is configured to perform aggregation processing on the plurality of I frames according to the distance between every two adjacent I frames to obtain a target I frame among the plurality of I frames.
It is understood that the aggregation module 142 can be used to execute step S220; for the detailed implementation of the aggregation module 142, reference can be made to the above content related to step S220.
The first segmentation module 143 is configured to segment the video to be processed based on the obtained target I frame to obtain a plurality of video segments.
It is understood that the first segmentation module 143 can be used to execute step S230; for the detailed implementation of the first segmentation module 143, reference can be made to the above content related to step S230.
The determining module 144 is configured to determine, for each video segment, a target video frame in the video segment according to the distance between every two adjacent video frames in a plurality of video frames included in the video segment.
It is understood that the determining module 144 can be used to execute step S240; for the detailed implementation of the determining module 144, reference can be made to the above content related to step S240.
The second segmentation module 145 is configured to segment the video segment based on the target video frame.
It is understood that the second segmentation module 145 can be used to execute step S250; for the detailed implementation of the second segmentation module 145, reference can be made to the above content related to step S250.
Further, an embodiment of the present application also provides a computer-readable storage medium storing machine-executable instructions which, when executed, implement the video split mirror method provided by the foregoing embodiment.
The steps executed when the computer program runs are not described in detail here; reference may be made to the explanation of the video split mirror method above.
To sum up, according to the video split mirror method, the video split mirror device, the electronic device, and the computer-readable storage medium provided in the embodiments of the present application, after the I frames in a video to be processed are identified, the plurality of I frames are aggregated according to the distance between adjacent I frames to obtain the target I frames among them, and the video to be processed is segmented based on the obtained target I frames to obtain video segments. A target video frame is then determined in each video segment according to the distance between the video frames in that segment, and finally the video segments are segmented based on the target video frames. In this scheme, the video is segmented using the target I frames obtained by aggregation, which avoids the over-fine segmentation caused in the prior art by segmenting directly at every I frame, and the video segments are further segmented at the target video frames found within them, which further improves the accuracy of segmentation at scene change points. The scheme thus improves segmentation accuracy while avoiding the low processing efficiency of prior-art segmentation schemes.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus and method embodiments described above are illustrative only, as the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion of it that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (7)

1. A video split mirror method, the method comprising:
identifying an I frame in a plurality of video frames contained in a video to be processed;
performing aggregation processing on the plurality of I frames according to the distance between every two adjacent I frames to obtain a target I frame in the plurality of I frames;
segmenting the video to be processed based on the obtained target I frame to obtain a plurality of video segments;
for each video segment, determining a target video frame in the video segment according to the distance between every two adjacent video frames in a plurality of video frames contained in the video segment;
segmenting the video segment based on the target video frame;
the step of determining the target video frame comprises the following steps:
for each video segment, calculating the distance between every two adjacent video frames in a plurality of video frames contained in the video segment; obtaining a difference value corresponding to the two adjacent video frames according to the distance between the two adjacent video frames; detecting whether the difference value meets a preset abnormal condition, and if so, taking any video frame in the video frames corresponding to the difference value as a target video frame;
the step of obtaining the difference value corresponding to two adjacent video frames comprises the following steps:
obtaining a first difference value corresponding to the two adjacent video frames according to the distance between the two adjacent video frames; calculating an absolute difference value between every two adjacent first difference values as a second difference value corresponding to the two adjacent first difference values;
the step of detecting whether the difference value satisfies a preset abnormal condition includes:
performing Gaussian distribution processing on the plurality of first difference values and the plurality of second difference values to obtain a first abnormal critical value corresponding to the plurality of first difference values and a second abnormal critical value corresponding to the plurality of second difference values; for the plurality of first difference values, finding a first difference value which is larger than the first abnormal critical value and is a local extremum among the plurality of first difference values, and if the second difference value corresponding to the found first difference value is a local extremum among the plurality of second difference values, determining that the found first difference value satisfies the preset abnormal condition; and for the plurality of second difference values, finding a second difference value which is larger than the second abnormal critical value and is a local extremum among the plurality of second difference values, and if the first difference value corresponding to the found second difference value is a local extremum among the plurality of first difference values, determining that the found second difference value satisfies the preset abnormal condition.
2. The video mirror splitting method according to claim 1, wherein the step of aggregating the plurality of I frames according to the distance between every two adjacent I frames to obtain a target I frame of the plurality of I frames comprises:
calculating the distance between every two adjacent I frames;
and detecting whether the distance between the two adjacent I frames meets a preset aggregation condition, and if so, taking the next I frame in the two adjacent I frames as a target I frame.
3. The video mirror splitting method according to claim 2, wherein the step of detecting whether the distance between two adjacent I frames satisfies a preset aggregation condition comprises:
obtaining an average distance value according to the distance between a plurality of groups of adjacent I frames;
and for every two adjacent I frames, detecting whether the distance between the two adjacent I frames is smaller than the average distance value and smaller than a preset distance value, and if so, determining that the distance between the two adjacent I frames meets a preset aggregation condition.
4. The video split mirror method according to any one of claims 1 to 3, wherein before the step of segmenting the video segment based on the target video frame, the method further comprises:
for each video segment, obtaining the pixel value of each pixel point contained in each video frame in the video segment;
and when the pixel values of more than a preset number of the pixel points contained in a video frame are lower than a preset value, determining that the video frame is a target video frame.
5. A video split mirror device, characterized in that the device comprises:
the identification module is used for identifying I frames in a plurality of video frames contained in the video to be processed;
the aggregation module is used for aggregating the plurality of I frames according to the distance between every two adjacent I frames to obtain a target I frame in the plurality of I frames;
the first segmentation module is used for segmenting the video to be processed based on the obtained target I frame to obtain a plurality of video segments;
the determining module is used for determining a target video frame in each video segment according to the distance between every two adjacent video frames in a plurality of video frames contained in the video segment;
the second segmentation module is used for segmenting the video segment based on the target video frame;
the determining module is configured to:
for each video segment, calculating the distance between every two adjacent video frames in a plurality of video frames contained in the video segment; obtaining a difference value corresponding to the two adjacent video frames according to the distance between the two adjacent video frames; detecting whether the difference value meets a preset abnormal condition, and if so, taking any video frame in the video frames corresponding to the difference value as a target video frame;
the determining module is used for obtaining the corresponding difference value of two adjacent video frames in the following way:
obtaining a first difference value corresponding to the two adjacent video frames according to the distance between the two adjacent video frames; calculating an absolute difference value between every two adjacent first difference values as a second difference value corresponding to the two adjacent first difference values;
the determining module is configured to detect whether the difference value satisfies a preset abnormal condition in the following manner:
performing Gaussian distribution processing on the plurality of first difference values and the plurality of second difference values to obtain a first abnormal critical value corresponding to the plurality of first difference values and a second abnormal critical value corresponding to the plurality of second difference values; for the plurality of first difference values, finding a first difference value which is larger than the first abnormal critical value and is a local extremum among the plurality of first difference values, and if the second difference value corresponding to the found first difference value is a local extremum among the plurality of second difference values, determining that the found first difference value satisfies the preset abnormal condition; and for the plurality of second difference values, finding a second difference value which is larger than the second abnormal critical value and is a local extremum among the plurality of second difference values, and if the first difference value corresponding to the found second difference value is a local extremum among the plurality of first difference values, determining that the found second difference value satisfies the preset abnormal condition.
6. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the method according to any one of claims 1 to 4.
7. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, is adapted to carry out the steps of the method according to any one of claims 1 to 4.
CN202010500050.1A 2020-06-04 2020-06-04 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium Active CN111641869B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010500050.1A CN111641869B (en) 2020-06-04 2020-06-04 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111641869A CN111641869A (en) 2020-09-08
CN111641869B true CN111641869B (en) 2022-01-04

Family

ID=72331247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010500050.1A Active CN111641869B (en) 2020-06-04 2020-06-04 Video split mirror method, video split mirror device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111641869B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant