CN112149575A - Method for automatically screening automobile part fragments from video - Google Patents

Method for automatically screening automobile part fragments from video

Info

Publication number
CN112149575A
CN112149575A (application CN202011014986.XA)
Authority
CN
China
Prior art keywords
automobile part
segment
video
time
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011014986.XA
Other languages
Chinese (zh)
Other versions
CN112149575B (en)
Inventor
崔恺旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinhua Zhiyun Technology Co ltd
Original Assignee
Xinhua Zhiyun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xinhua Zhiyun Technology Co ltd filed Critical Xinhua Zhiyun Technology Co ltd
Priority to CN202011014986.XA
Publication of CN112149575A
Application granted
Publication of CN112149575B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 Detecting or categorising vehicles
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a method for automatically screening automobile part segments from videos. A video to be processed is converted into video frames, and the probability that each video frame belongs to each automobile part is calculated by an automobile part classification model, realizing a detailed analysis of the correspondence between each frame of the video and the automobile parts. By decomposing the video to be processed into a plurality of time windows, matching each time window to an automobile part, and merging time windows belonging to the same automobile part, segments showing each part of the automobile are automatically screened from the video, eliminating the trouble of manual screening, with low cost and high screening efficiency.

Description

Method for automatically screening automobile part fragments from video
Technical Field
The application relates to the technical field of video processing, in particular to a method for automatically screening automobile part fragments from videos.
Background
Existing schemes generally find segments containing automobile parts in a video by manual screening. Specifically, one or more video reviewers are typically required to watch the entire video first and then label the shots containing the automobile. This method incurs a large labor cost and its screening efficiency is low.
Disclosure of Invention
Based on this, to address the high cost and low screening efficiency of the conventional method of manually screening automobile part segments from videos, it is necessary to provide a method for automatically screening segments containing automobile parts from a video.
The application provides a method for automatically screening automobile part-containing segments from videos, which comprises the following steps:
acquiring a video to be processed, converting the video to be processed into a plurality of video frames arranged in time sequence, and inputting each video frame into an automobile part classification model to obtain the probability that each video frame belongs to each automobile part;
decomposing the plurality of video frames arranged in time sequence into a plurality of time windows according to the duration length of each time window and the difference between the start times of two adjacent time windows;
calculating the probability of each time window belonging to each automobile part, and taking the automobile part corresponding to the maximum probability as the automobile part to which the time window belongs;
merging the time windows with the same automobile part into one time window to obtain a plurality of merged time windows, and taking each merged time window as an automobile part segment;
and, according to each automobile part segment, annotating the video to be processed with the corresponding automobile part segment information.
Further, prior to the initial step, the method further comprises:
acquiring images of a plurality of marked automobile parts;
and constructing an automobile part classification model, and taking the images of the plurality of marked automobile parts as training data to train the automobile part classification model.
Further, the acquiring a video to be processed, converting the video to be processed into a plurality of video frames arranged in time sequence, and inputting each video frame into the automobile part classification model to obtain the probability that each video frame belongs to each automobile part includes:
acquiring a video to be processed, and converting the video to be processed into a plurality of video frames arranged according to a time sequence;
extracting, in time order, one key frame every N frames from the plurality of video frames arranged in time sequence, wherein N is a positive integer greater than 1;
inputting each key frame into the automobile part classification model to obtain the probability that each key frame belongs to each automobile part; and
all the key frames are arranged according to the time sequence and combined to form a key frame set.
Further, the decomposing the plurality of video frames arranged in time sequence into a plurality of time windows includes:
setting the duration length of each time window and the difference between the start times of two adjacent time windows;
decomposing the key frame set into a plurality of time windows according to the duration length of each time window and the difference between the start times of two adjacent time windows; the time windows are connected sequentially in time order, and every two adjacent time windows overlap in time.
Further, the calculating the probability that each time window belongs to each automobile part and taking the automobile part corresponding to the maximum probability as the automobile part to which the time window belongs comprises the following steps:
selecting a time window, and acquiring all key frames in the time window and the probability of each key frame belonging to each automobile part;
averaging the probabilities belonging to each automobile part to obtain the average probability belonging to each automobile part;
sorting the average probabilities belonging to the automobile parts in a descending order, and selecting the maximum average probability;
taking the automobile part corresponding to the maximum average probability as the automobile part to which the time window belongs;
and repeatedly executing the steps until each time window finishes the matching of the automobile parts.
Further, before the taking the automobile part corresponding to the maximum average probability as the automobile part to which the time window belongs, the calculating the probability that each time window belongs to each automobile part further comprises:
judging whether the maximum average probability is greater than a preset probability value or not;
if the maximum average probability is larger than the preset probability value, executing a step of taking the automobile part corresponding to the maximum average probability as the automobile part to which the time window belongs;
and if the maximum average probability is smaller than or equal to the preset probability value, removing the time window and returning to the step of selecting the next time window.
Further, before the information annotation of the automobile part segment is performed on the video to be processed, the method further includes:
and performing optical flow processing on each automobile part segment to remove repeated frames in each automobile part segment.
Further, performing optical flow processing on each automobile part segment to remove the repeated frames in each automobile part segment, including:
selecting an automobile part segment, calculating the optical flow distance between every two adjacent key frames in the automobile part segment in sequence from the tail of the automobile part segment, judging whether the optical flow distance is greater than an optical flow distance threshold value or not until two adjacent key frames with the first optical flow distance greater than the optical flow distance threshold value appear, and taking the key frame with a larger time node as an end frame of a main body segment;
sequentially calculating the optical flow distance between every two adjacent key frames in the automobile part fragment from the beginning of the automobile part fragment, judging whether the optical flow distance is greater than an optical flow distance threshold value or not until two adjacent key frames with the first optical flow distance greater than the optical flow distance threshold value appear, and taking the key frame with a smaller time node as a starting frame of the main body fragment;
taking a segment from a starting frame of the body segment to an ending frame of the body segment as a body segment of the automobile part segment;
removing the rest segment parts except the main segment in the automobile part segment;
and repeatedly executing the steps until all the automobile part segments are subjected to optical flow processing.
Further, before the removing the segment portions other than the main body segment from the automobile part segment, the performing optical flow processing on each automobile part segment to remove the repeated frames in each automobile part segment further comprises:
taking the starting frame of the main body segment as a starting point, selecting the X frames preceding the starting frame of the main body segment as a first complementary segment;
and taking the ending frame of the main body segment as a starting point, selecting the Y frames following the ending frame of the main body segment as a second complementary segment.
Further, removing the other segment parts except the main segment in the automobile part segment comprises:
and removing the portions of the automobile part segment other than the main body segment, the first complementary segment and the second complementary segment.
The application relates to a method for automatically screening automobile part segments from videos. A video to be processed is converted into video frames, and the probability that each video frame belongs to each automobile part is calculated by an automobile part classification model, realizing a detailed analysis of the correspondence between each frame of the video and the automobile parts. By decomposing the video to be processed into a plurality of time windows, matching each time window to an automobile part, and merging time windows belonging to the same automobile part, segments showing each part of the automobile are automatically screened from the video, eliminating the trouble of manual screening, with low cost and high screening efficiency.
Drawings
Fig. 1 is a flowchart illustrating a method for automatically screening a video with car location segments according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The application provides a method for automatically screening automobile part fragments from a video. It should be noted that the method for automatically screening the video with the car part segments provided by the present application is applicable to any format of video.
In addition, the execution subject of the method for automatically screening the automobile part-containing segments from the video provided by the present application is not limited. Optionally, the execution subject may be an automobile part segment screening terminal. In particular, one or more processors in the automobile part segment screening terminal may serve as the execution subject of the method.
As shown in fig. 1, in an embodiment of the present application, the method includes the following steps S100 to S900:
S100, obtaining a video to be processed, and converting the video to be processed into a plurality of video frames arranged in time sequence. Further, each video frame is input into the automobile part classification model, and the probability that each video frame belongs to each automobile part is obtained.
Specifically, the automobile part classification model may be built with an automobile part classification table, as shown in Table 1. Through its internal matching algorithm, the automobile part classification model can calculate the probability that each video frame belongs to each automobile part according to the automobile part classification table.
TABLE 1 automotive parts Classification List
(Table 1 is provided as an image in the original publication and is not reproduced here.)
The result of this step is the probability that each video frame belongs to each automobile part. For example, the 1st video frame has a 5% probability of belonging to the center armrest, a 10% probability of belonging to the center control screen, and so on.
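For illustration, the per-frame scoring of step S100 could be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the OpenCV decoding loop, the classify_frame callable standing in for the trained automobile part classification model, and the PART_LABELS list (Table 1 is not reproduced) are all assumptions.

import cv2  # OpenCV, used here for video decoding

# Hypothetical label set; the real labels come from the automobile part
# classification table (Table 1), which is not reproduced in this text.
PART_LABELS = ["center armrest", "center control screen", "vehicle nameplate",
               "brand logo", "engine"]

def video_to_frame_probs(video_path, classify_frame):
    """Decode a video into time-ordered frames and score each frame.

    classify_frame is assumed to wrap the trained classification model:
    it takes one BGR frame and returns one probability per entry of
    PART_LABELS. Returns a list of (timestamp_s, prob_vector) pairs."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # fall back if FPS is missing
    frame_probs = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of stream
            break
        frame_probs.append((index / fps, classify_frame(frame)))
        index += 1
    cap.release()
    return frame_probs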
And S300, according to the duration length of each time window and the difference value of the starting time of two adjacent time windows, disassembling the plurality of video frames arranged according to the time sequence into a plurality of time windows.
Specifically, for example, there may be 5 video frames per second, and the video to be processed can be converted into a plurality of video frames arranged in time order through step S100. In step S300, the plurality of video frames are divided into a plurality of time windows, so that the automobile part attribution can be further analyzed per time window, realizing a segmented analysis of the video frames.
And S500, calculating the probability of each time window belonging to each automobile part. Furthermore, the automobile part corresponding to the maximum probability is used as the automobile part belonging to the time window.
Specifically, for example, the time window 1 includes the 1 st video frame to the 5 th video frame, and if the probability that the time window 1 belongs to the center control screen is the greatest, the time window 1 is considered to belong to the center control screen.
And S700, merging the time windows with the same automobile part into one time window to obtain a plurality of merged time windows. And taking each merged time window as a car part segment.
Specifically, for example, the time window 1 and the time window 2 are both assigned to the center control screen, and then the time window 1 and the time window 2 can be combined into the same time window, thereby simplifying the analysis of the assignment of the automobile parts.
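In code, the merging of step S700 could be sketched as below; the (start_s, end_s, part_label) tuple shape for a matched window is an assumption made for illustration.

def merge_windows(windows):
    """Merge consecutive time windows assigned to the same automobile
    part into one window (step S700). `windows` is sorted by start time."""
    merged = []
    for start, end, part in windows:
        if merged and merged[-1][2] == part:
            prev_start, prev_end, _ = merged[-1]          # extend the run
            merged[-1] = (prev_start, max(prev_end, end), part)
        else:
            merged.append((start, end, part))             # start a new run
    return merged

# merge_windows([(0, 5, "center control screen"), (1, 6, "center control screen"),
#                (7, 12, "engine")])
# -> [(0, 6, "center control screen"), (7, 12, "engine")]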
And S900, according to each automobile part segment, carrying out information marking on the automobile part segment on the video to be processed.
Specifically, the significance of this step is to trace the automobile part attribution information of each time window back to the original video to be processed. The information labels may include time node labels and automobile part labels. For example, through steps S100 to S700, one time window for the central control screen and one time window for the engine are finally obtained, so that the video to be processed can be labeled with time nodes according to the start and end time nodes of these windows, together with the corresponding automobile part labels. With the method for automatically screening segments with automobile parts from a video, which segments of the video to be processed belong to which automobile parts can finally be known automatically.
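The annotation of step S900 could then be sketched as writing the time node labels and automobile part labels to a sidecar file. The JSON sidecar format is an illustrative assumption; the patent does not specify how the labels are attached to the video.

import json

def annotate_video(video_path, part_segments, out_path):
    """Write time node labels and automobile part labels for one video
    (step S900). `part_segments` holds (start_s, end_s, part_label) tuples."""
    records = [
        {"video": video_path, "start_s": start, "end_s": end, "part": part}
        for start, end, part in part_segments
    ]
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)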
In this embodiment, converting the video to be processed into video frames and calculating, through the automobile part classification model, the probability that each video frame belongs to each automobile part realizes a detailed analysis of the correspondence between each frame of the video and the automobile parts. By decomposing the video to be processed into a plurality of time windows, matching each time window to an automobile part, and merging time windows belonging to the same automobile part, segments showing each part of the automobile are automatically screened from the video, eliminating the trouble of manual screening, with low cost and high screening efficiency.
In an embodiment of the present application, before the step S100, the method further includes the following steps S010 to S020:
and S010, acquiring images of a plurality of marked automobile parts.
And S020, constructing an automobile part classification model, and taking the images of the plurality of marked automobile parts as training data to train the automobile part classification model.
Specifically, a large number of images can be randomly selected in advance, and the images are sequentially labeled through the automobile part classification table shown in table 1 to form a plurality of images of labeled automobile parts. Furthermore, the images of the plurality of marked automobile parts are used as training data to train the automobile part classification model.
In this embodiment, the automobile part classification model is trained by using the images of the plurality of marked automobile parts as training data, so as to provide a basis for the automobile part classification model to realize the probability calculation function of automobile part attribution.
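As one possible realization of steps S010 to S020, the classification model could be a fine-tuned off-the-shelf CNN. The patent names no architecture, so the ResNet-18 backbone, the ImageFolder directory layout (one sub-directory per part label), and the hyperparameters below are all assumptions.

import torch
from torch import nn
from torchvision import datasets, models, transforms

def train_part_classifier(image_dir, num_parts, epochs=5):
    """Fine-tune a CNN on labelled automobile part images (S010-S020)."""
    tf = transforms.Compose([transforms.Resize((224, 224)),
                             transforms.ToTensor()])
    data = datasets.ImageFolder(image_dir, transform=tf)   # one folder per label
    loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, num_parts)  # new classifier head
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for _ in range(epochs):
        for images, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()
            optimizer.step()
    return model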
In an embodiment of the present application, the step S100 includes the following steps S110 to S140:
And S110, acquiring a video to be processed, and converting the video to be processed into a plurality of video frames arranged in time sequence.
And S120, extracting, in time order, one key frame every N frames from the video frames arranged in time sequence. N is a positive integer and N is greater than 1.
S130, inputting each key frame into the automobile part classification model to obtain the probability that each key frame belongs to each automobile part.
And S140, arranging all the key frames in time order, and combining them to form a key frame set.
Specifically, in this embodiment, the purpose of extracting one key frame every N frames is to simplify the subsequent automobile part attribution analysis of the video frames. This is because one second of the video to be processed may contain many video frames, and the image contents of adjacent video frames are often similar or even identical. Without frame extraction, the image contents of all the video frames of two adjacent time windows could be substantially identical, so the probability calculation of automobile part attribution in step S500 would lose meaning and involve many meaningless calculations.
By performing frame extraction on the plurality of video frames arranged in time sequence, the subsequent automobile part attribution analysis can be effectively simplified without losing the information of key frames, avoiding attribution analysis of repeated video frames.
Optionally, N may be 4.
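In code, the frame extraction of steps S110 to S140 reduces to keeping every N-th element of the time-ordered frame list, with N = 4 as suggested above; this one-line sketch reuses the (timestamp_s, prob_vector) list representation from the earlier sketch (in the patent's ordering the classifier is applied to the key frames only).

def extract_keyframes(frames, n=4):
    """Keep one key frame every N frames (steps S110-S140). Slicing keeps
    elements 0, N, 2N, ..., so later stages process about 1/N of the frames."""
    return frames[::n]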
In an embodiment of the present application, the step S300 includes the following steps S310 to S320:
S310, setting the duration length of each time window and the difference between the start times of two adjacent time windows.
S320, decomposing the key frame set into a plurality of time windows according to the duration length of each time window and the difference between the start times of two adjacent time windows. The time windows are connected sequentially in time order, and every two adjacent time windows overlap in time.
Specifically, the duration length of each time window may be 5 seconds. The difference between the start times of two adjacent time windows may be 1 second. Then, for example, time window 1 is from 0 second to 5 seconds, time window 2 is from 1 second to 6 seconds, time window 3 is from 2 seconds to 7 seconds, and so on.
In this embodiment, the key frame set is decomposed into a plurality of time windows, so that the automobile part attribution can be further analyzed per time window, realizing a segmented analysis of the key frames.
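A sketch of steps S310 to S320 with the example values above (5-second windows, 1-second stride); the key frames are the (timestamp_s, prob_vector) pairs of the earlier sketches.

def split_into_windows(keyframes, window_s=5.0, stride_s=1.0):
    """Decompose the key frame set into overlapping time windows
    (steps S310-S320). Returns (start_s, end_s, keyframes_in_window) tuples."""
    if not keyframes:
        return []
    last_t = keyframes[-1][0]
    windows = []
    start = 0.0
    while start <= last_t:
        end = start + window_s
        inside = [kf for kf in keyframes if start <= kf[0] < end]
        if inside:  # skip windows that contain no key frame
            windows.append((start, end, inside))
        start += stride_s
    return windows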
In an embodiment of the present application, the step S500 includes the following steps S510 to S590:
S510, selecting a time window, and acquiring all key frames in the time window and the probability that each key frame belongs to each automobile part.
S530, averaging the probabilities belonging to each automobile part to obtain the average probability belonging to each automobile part.
S550, sorting the average probabilities belonging to the automobile parts in descending order, and selecting the maximum average probability.
S570, taking the automobile part corresponding to the maximum average probability as the automobile part to which the time window belongs.
And S590, repeatedly executing the steps S510 to S570 until every time window has completed the matching of automobile parts.
Specifically, for example, time window 1 contains the 1st, 5th and 9th key frames. The probabilities of each key frame of time window 1 belonging to each automobile part are averaged; suppose the average probability of the vehicle nameplate in time window 1 is 56%, the average probability of the brand logo is 23%, and the average probability of every other automobile part is 0%. Time window 1 then belongs to the vehicle nameplate. This also indirectly indicates that the image content of the key frames appearing in time window 1 probably contains two elements: the vehicle nameplate and the brand logo.
In this embodiment, by acquiring all key frames in a time window and the probability that each key frame belongs to each automobile part, averaging these probabilities, and finally taking the automobile part corresponding to the maximum average probability as the automobile part to which the time window belongs, the automobile part of each time window is determined accurately.
In an embodiment of the present application, before the step S570, the step S500 further includes the following steps S561 to S563:
S561, judging whether the maximum average probability is greater than a preset probability value.
S562, if the maximum average probability is greater than the preset probability value, executing the subsequent step S570.
S563, if the maximum average probability is less than or equal to the preset probability value, removing the time window, and returning to the step S510.
Specifically, in some time windows the maximum average probability over the automobile parts is small, and the images corresponding to the key frames in such a window may be unclear, so the automobile part to which the window belongs cannot be distinguished and the window needs to be removed.
The preset probability value may be 30%.
In this embodiment, by judging whether the maximum average probability is greater than the preset probability value, time windows containing unclear key frames can be filtered out, so that the final screening result is accurate.
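Steps S510 to S570, together with the threshold check of steps S561 to S563, might look as follows; the 30% preset value and the window representation follow the examples above.

def assign_part(window, part_labels, preset=0.30):
    """Match one time window to an automobile part (steps S510-S570).

    Averages each part's probability over the window's key frames and
    takes the arg-max; per S561-S563, returns None (window removed) when
    the maximum average probability does not exceed the preset value."""
    start, end, keyframes = window
    sums = [0.0] * len(part_labels)
    for _, probs in keyframes:
        for i, p in enumerate(probs):
            sums[i] += p
    averages = [s / len(keyframes) for s in sums]
    best = max(range(len(part_labels)), key=lambda i: averages[i])
    if averages[best] <= preset:        # S563: unclear window, discard
        return None
    return (start, end, part_labels[best])  # S570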
In an embodiment of the present application, before step S900, the method further includes the following steps:
and S800, performing optical flow processing on each automobile part segment to remove repeated frames in each automobile part segment.
Specifically, each merged time window is an automobile part segment. However, there may be some repeated frames in an automobile part segment, for example two adjacent key frames whose images are the same. Such highly similar repeated frames can be removed in this step.
In an embodiment of the present application, the step S800 includes the following steps S810 to S890:
S810, selecting an automobile part segment, and sequentially calculating the optical flow distance between every two adjacent key frames in the automobile part segment, starting from the tail of the segment. Further, judging whether each optical flow distance is greater than an optical flow distance threshold, until the first pair of adjacent key frames whose optical flow distance is greater than the threshold appears, and taking the key frame with the larger time node as the end frame of the main body segment.
S830, sequentially calculating the optical flow distance between every two adjacent key frames in the automobile part segment, starting from the beginning of the segment. Further, judging whether each optical flow distance is greater than the optical flow distance threshold, until the first pair of adjacent key frames whose optical flow distance is greater than the threshold appears, and taking the key frame with the smaller time node as the starting frame of the main body segment.
S850, taking the segment from the starting frame of the main body segment to the end frame of the main body segment as the main body segment of the automobile part segment.
S870, removing the segment portions other than the main body segment from the automobile part segment.
S890, repeatedly executing the steps S810 to S870 until all the automobile part segments have been subjected to the optical flow processing.
Specifically, the main purpose of this embodiment is to retain the middle portion of each automobile part segment and remove the repeated frames at its beginning and end, where the beginning and end are in terms of time nodes; the middle portion is retained to ensure the integrity of the automobile part segment. When the optical flow distance between two adjacent key frames is greater than the optical flow distance threshold, the similarity between the two key frames is low. Conversely, when the optical flow distance between two adjacent key frames is less than or equal to the threshold, this indicates that the similarity between the two key frames is high, and they can be removed as repeated frames.
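A sketch of the trimming of steps S810 to S850 follows. The patent fixes neither the optical flow algorithm nor the distance definition, so Farneback dense flow and the mean flow magnitude used here are assumptions.

import cv2
import numpy as np

def flow_distance(frame_a, frame_b):
    """Mean optical flow magnitude between two frames (an assumed metric)."""
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(gray_a, gray_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return float(np.mean(np.linalg.norm(flow, axis=2)))

def trim_segment(frames, threshold):
    """Trim near-duplicate frames off both ends of one automobile part
    segment (steps S810-S850): scan inward from each end until the flow
    distance between adjacent key frames first exceeds the threshold."""
    start = 0
    for i in range(len(frames) - 1):                     # S830: from the head
        if flow_distance(frames[i], frames[i + 1]) > threshold:
            start = i                                    # smaller time node
            break
    end = len(frames) - 1
    for i in range(len(frames) - 1, 0, -1):              # S810: from the tail
        if flow_distance(frames[i - 1], frames[i]) > threshold:
            end = i                                      # larger time node
            break
    return frames[start:end + 1]                         # S850: the main body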
Optionally, before the step S810, the step S800 further includes:
S801, acquiring the time lengths of all the automobile part segments, and executing the optical flow processing of the subsequent steps S810 to S890 only on the automobile part segments whose time length is greater than a preset time length.
The preset time length may be the time length of 13 frames. This is because if the time length of an automobile part segment is too short, there is no need to remove repeated frames.
In an embodiment of the present application, before the step S870, the step S800 further includes the following steps S861 to S862:
S861, taking the starting frame of the main body segment as a starting point, selecting the X frames preceding the starting frame of the main body segment as a first complementary segment.
S862, taking the end frame of the main body segment as a starting point, selecting the Y frames following the end frame of the main body segment as a second complementary segment.
In particular, X may be equal to Y. The first and second complementary segments are arranged so that important information is not lost during repeated frame removal. X and Y may both be 3. For example, if the main body segment spans the 115th to 285th frames, then the first complementary segment may be the 110th, 105th and 100th frames, and the second complementary segment may be the 290th, 295th and 300th frames.
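The complementary segment selection of steps S861 to S862 can be reproduced from the worked example. The stride of 5 frames is inferred from the example numbers (115 to 110, 105, 100) and is otherwise an assumption.

def complement_indices(body_start, body_end, stride=5, count=3):
    """Pick X frames before the main body start and Y frames after its end
    (steps S861-S862), with X = Y = count = 3 as in the text."""
    first = [body_start - stride * k
             for k in range(1, count + 1) if body_start - stride * k >= 0]
    second = [body_end + stride * k for k in range(1, count + 1)]
    return first, second

# complement_indices(115, 285) -> ([110, 105, 100], [290, 295, 300])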
In an embodiment of the present application, the step S870 includes the following steps:
S871, removing the portions of the automobile part segment other than the main body segment, the first complementary segment and the second complementary segment.
Specifically, the present embodiment supports the embodiment of steps S861 to S862.
The technical features of the embodiments described above may be combined arbitrarily, and the order of execution of the method steps is not limited. For brevity, not all possible combinations of the technical features in the embodiments are described; however, as long as there is no contradiction between the combined technical features, the combination should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. A method for automatically screening a video for segments having car parts, the method comprising:
S100, acquiring a video to be processed, converting the video to be processed into a plurality of video frames arranged in time sequence, and inputting each video frame into an automobile part classification model to obtain the probability that each video frame belongs to each automobile part;
S300, decomposing the plurality of video frames arranged in time sequence into a plurality of time windows according to the duration length of each time window and the difference between the start times of two adjacent time windows;
S500, calculating the probability that each time window belongs to each automobile part, and taking the automobile part corresponding to the maximum probability as the automobile part to which the time window belongs;
S700, merging the time windows with the same automobile part into one time window to obtain a plurality of merged time windows, and taking each merged time window as an automobile part segment;
and S900, according to each automobile part segment, annotating the video to be processed with the corresponding automobile part segment information.
2. The method for automatically screening segments with car parts according to claim 1, wherein before the step S100, the method further comprises:
s010, acquiring images of a plurality of marked automobile parts;
and S020, constructing an automobile part classification model, and taking the images of the plurality of marked automobile parts as training data to train the automobile part classification model.
3. The method for automatically screening segments with car parts according to claim 2, wherein the step S100 comprises:
S110, acquiring a video to be processed, and converting the video to be processed into a plurality of video frames arranged in time sequence;
S120, extracting, in time order, one key frame every N frames from the video frames arranged in time sequence, wherein N is a positive integer greater than 1;
S130, inputting each key frame into the automobile part classification model to obtain the probability that each key frame belongs to each automobile part; and
S140, arranging all the key frames in time order, and combining them to form a key frame set.
4. The method for automatically filtering car parts-having segments from video according to claim 3, wherein said step S300 comprises:
s310, setting the duration length of each time window and the difference value of the starting time of two adjacent time windows;
s320, decomposing the key frame set into a plurality of time windows according to the duration length of each time window and the difference value of the starting time of two adjacent time windows; the time windows are sequentially connected in time sequence, and the adjacent two time windows have overlapping time.
5. The method for automatically filtering car parts-having segments from video according to claim 4, wherein said step S500 comprises:
s510, selecting a time window, and acquiring all key frames in the time window and the probability of each key frame belonging to each automobile part;
s530, averaging the probability belonging to each automobile part to obtain the average probability belonging to each automobile part;
s550, sequencing the average probabilities belonging to the automobile parts in a descending order, and selecting the maximum average probability;
s570, taking the automobile part corresponding to the maximum average probability as the automobile part to which the time window belongs;
and S590, repeatedly executing the step S510 to the step S570 until the matching of the automobile parts is completed in each time window.
6. The method for automatically filtering car parts-having segments from video according to claim 5, wherein before step S570, said step S500 further comprises:
S561, judging whether the maximum average probability is greater than a preset probability value;
S562, if the maximum average probability is greater than the preset probability value, executing the subsequent step S570;
S563, if the maximum average probability is less than or equal to the preset probability value, removing the time window, and returning to the step S510.
7. The method for automatically filtering car parts-having segments from video according to claim 6, wherein before step S900, the method further comprises:
and S800, performing optical flow processing on each automobile part segment to remove repeated frames in each automobile part segment.
8. The method of claim 7, wherein the step S800 comprises:
s810, selecting an automobile part segment, calculating the optical flow distance between every two adjacent key frames in the automobile part segment in sequence from the tail of the automobile part segment, judging whether the optical flow distance is greater than an optical flow distance threshold value or not until two adjacent key frames with the first optical flow distance greater than the optical flow distance threshold value appear, and taking the key frame with the larger time node as the end frame of the main body segment;
s830, sequentially calculating optical flow distances between every two adjacent key frames in the automobile part segment from the beginning of the automobile part segment, and judging whether the optical flow distances are larger than an optical flow distance threshold value or not until two adjacent key frames with the first optical flow distances larger than the optical flow distance threshold value appear, and taking the key frame with a smaller time node as a starting frame of the main body segment;
s850, taking a segment from the starting frame of the main segment to the ending frame of the main segment as the main segment of the automobile part segment;
s870, removing the rest segment parts except the main segment in the automobile part segment;
s890, the above steps S810 to S870 are repeatedly executed until all the automobile part segments are subjected to the optical flow processing.
9. The method of claim 8, wherein before step S870, said step S800 further comprises:
S861, taking the starting frame of the main body segment as a starting point, selecting the X frames preceding the starting frame of the main body segment as a first complementary segment;
S862, taking the end frame of the main body segment as a starting point, selecting the Y frames following the end frame of the main body segment as a second complementary segment.
10. The method for automatically filtering segments with car parts from video according to claim 9, wherein said step S870 comprises:
S871, removing the portions of the automobile part segment other than the main body segment, the first complementary segment and the second complementary segment.
CN202011014986.XA (filed 2020-09-24, priority date 2020-09-24): Method for automatically screening fragments with automobile parts from video. Granted as CN112149575B; legal status: Active.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011014986.XA CN112149575B (en) 2020-09-24 2020-09-24 Method for automatically screening fragments with automobile parts from video

Publications (2)

Publication Number Publication Date
CN112149575A 2020-12-29
CN112149575B 2024-05-24

Family

ID=73898023

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011014986.XA Active CN112149575B (en) 2020-09-24 2020-09-24 Method for automatically screening fragments with automobile parts from video

Country Status (1)

Country Link
CN (1) CN112149575B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030058268A1 (en) * 2001-08-09 2003-03-27 Eastman Kodak Company Video structuring by probabilistic merging of video segments
CN103049751A (en) * 2013-01-24 2013-04-17 苏州大学 Improved weighting region matching high-altitude video pedestrian recognizing method
CN108241729A (en) * 2017-09-28 2018-07-03 新华智云科技有限公司 Screen the method and apparatus of video
CN108769731A (en) * 2018-05-25 2018-11-06 北京奇艺世纪科技有限公司 The method, apparatus and electronic equipment of target video segment in a kind of detection video
CN109598231A (en) * 2018-12-03 2019-04-09 广州市百果园信息技术有限公司 A kind of recognition methods of video watermark, device, equipment and storage medium
CN109644255A (en) * 2016-08-26 2019-04-16 华为技术有限公司 Mark includes the method and apparatus of the video flowing of a framing
CN109977262A (en) * 2019-03-25 2019-07-05 北京旷视科技有限公司 The method, apparatus and processing equipment of candidate segment are obtained from video
CN110008875A (en) * 2019-03-26 2019-07-12 武汉大学 A kind of recognition of face video clip screening technique and system based on key frame backtracking
CN110096982A (en) * 2019-04-22 2019-08-06 长沙千视通智能科技有限公司 A kind of video frequency vehicle big data searching method based on deep learning
CN110490156A (en) * 2019-08-23 2019-11-22 哈尔滨理工大学 A kind of fast vehicle detection method based on convolutional neural networks
CN111027377A (en) * 2019-10-30 2020-04-17 杭州电子科技大学 Double-flow neural network time sequence action positioning method
CN111209432A (en) * 2020-01-02 2020-05-29 北京字节跳动网络技术有限公司 Information acquisition method and device, electronic equipment and computer readable medium
CN111464833A (en) * 2020-03-23 2020-07-28 腾讯科技(深圳)有限公司 Target image generation method, target image generation device, medium, and electronic apparatus

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113992979A (en) * 2021-10-27 2022-01-28 北京跳悦智能科技有限公司 Video expansion method and system and computer equipment
CN113992979B (en) * 2021-10-27 2023-09-15 北京跳悦智能科技有限公司 Video expansion method and system and computer equipment
CN114650435A (en) * 2022-02-23 2022-06-21 京东科技信息技术有限公司 Method, device and related equipment for searching repeated segments in video
CN114650435B (en) * 2022-02-23 2023-09-05 京东科技信息技术有限公司 Method and device for searching repeated segments in video and related equipment

Also Published As

Publication number Publication date
CN112149575B (en) 2024-05-24

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant