CN110705462A

CN110705462A - Hadoop-based distributed video key frame extraction method

Info

Publication number: CN110705462A
Application number: CN201910935963.3A
Authority: CN
Inventors: 程飞
Original assignee: Sichuan Loy Technology Co Ltd
Current assignee: Sichuan Loy Technology Co Ltd
Priority date: 2019-09-29
Filing date: 2019-09-29
Publication date: 2020-01-17
Anticipated expiration: 2039-09-29
Also published as: CN110705462B

Abstract

The invention discloses a Hadoop-based distributed video key frame extraction method, comprising the following steps: acquiring an original video to be processed; splitting the original video into independent image frames in a preset order; detecting whether each image frame contains a first target and, if so, marking the image frame as a target frame, otherwise marking it as a common frame; dividing the target frames into a plurality of target frame sets; acquiring a key frame of each target frame set; and generating a key frame set from the key frames of all the target frame sets. The method extracts key frames only from the parts of the video that contain the target, which improves the efficiency of key frame extraction.

Description

Hadoop-based distributed video key frame extraction method

Technical Field

The invention belongs to the technical field of video processing, and particularly relates to a distributed video key frame extraction method based on Hadoop.

Background

The volume of all-media public opinion data collected during public opinion supervision is growing exponentially. Video data in particular is still pre-processed and post-processed by hand, which consumes a great deal of time and effort and makes it easy to miss key target information. Key frame extraction greatly improves this situation: key frames are not affected by issues such as timing or audio-video synchronization, and they support a variety of browsing and navigation modes. In practice, however, the biggest difficulty in applying key frame extraction is that the extraction is too slow.

Disclosure of Invention

The invention aims to overcome the defects of the prior art, provides a distributed video key frame extraction method based on Hadoop, and improves the key frame extraction efficiency.

The purpose of the invention is achieved by the following technical solution: a Hadoop-based distributed video key frame extraction method comprising the following steps:

acquiring an original video to be processed;

splitting the original video into independent image frames according to a preset sequence;

detecting whether each image frame contains a first target and, if so, marking the image frame as a target frame, otherwise marking it as a common frame;

dividing the target frames into a plurality of target frame sets;

acquiring a key frame of each target frame set;

and generating a key frame set from the key frames of all the target frame sets.

Preferably, the Hadoop-based distributed video key frame extraction method further includes:

removing redundant key frames from the key frame set.

Preferably, when the first target includes a plurality of objects, if the image frame includes at least one object, the image frame is marked as a target frame.

Preferably, dividing the target frames into a plurality of target frame sets includes:

detecting, for each common frame, whether it is adjacent to a target frame, and if so, marking the common frame as a node frame;

and dividing the target frames into a plurality of target frame sets according to the node frames.

Preferably, dividing the target frames into a plurality of target frame sets alternatively includes:

defining a target frame whose previous frame is a common frame as a first target frame, and defining a target frame whose previous frame is a target frame as a second target frame;

creating a target frame set for each first target frame, and adding the first target frame to its target frame set;

detecting the similarity between a second target frame and the previous target frame; if the similarity is greater than or equal to a first threshold, adding the second target frame to the target frame set to which the previous target frame belongs; and if the similarity is smaller than the first threshold, creating a new target frame set for the second target frame and adding the second target frame to it.

Preferably, acquiring the key frame of each target frame set includes:

step one, setting a clustering center value N;

step two, processing the target frames in the target frame set by using a maximum distance method to obtain N clustering centers, wherein the maximum distance method is used for calculating whether the similarity between the target frames meets a preset condition;

step three, calculating the similarity between the clustering centers, and if the similarity is greater than or equal to a second threshold, merging the two corresponding clustering centers;

step four, classifying the target frames in the target frame set by using minimum distance classification;

step five, calculating the average similarity of the target frames in each cluster, and taking the target frame whose similarity differs least from the average similarity as the new clustering center;

step six, if the similarity between the new clustering center of each cluster and the original clustering center is greater than or equal to a third threshold, terminating the algorithm, the new clustering center of each cluster being a key frame; otherwise, returning to step three.

The beneficial effect of the invention is that key frames are extracted only from the parts of the video that contain the target, which improves the efficiency of key frame extraction.

Drawings

FIG. 1 is a flow chart of a distributed video key frame extraction method based on Hadoop.

Detailed Description

The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.

Referring to FIG. 1, the Hadoop-based distributed video key frame extraction method includes:

S1, acquiring the original video to be processed.

In this embodiment the original video is stored in the Hadoop Distributed File System (HDFS). HDFS stores large volumes of data on clusters of inexpensive machines, improves reliability by keeping multiple replicas, and provides fault-tolerance and recovery mechanisms. A large file is cut into smaller blocks so that, following a divide-and-conquer approach, several servers jointly manage the same file; each block is redundantly backed up and the copies are scattered across different servers, which gives high availability without data loss.

HDFS includes: the NameNode, the master of the cluster, which manages the directory tree of the file system and handles clients' read and write requests; the SecondaryNameNode, which persists metadata, mainly to relieve the load on the NameNode; and the DataNodes, which store all data blocks of the cluster and perform the actual reading and writing of data.
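The block splitting and replication described above can be illustrated with a small sketch. It is not HDFS code: the function name `plan_blocks` and the round-robin placement are assumptions made for illustration (real HDFS placement is rack-aware), and the 128 MB block size and replication factor of 3 are the usual HDFS defaults rather than values stated in this document.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (common HDFS default)
REPLICATION = 3                 # assumed replication factor (common HDFS default)


def plan_blocks(file_size, datanodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return, for each block of the file, the DataNodes that hold a replica.

    Toy round-robin placement for illustration only.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    n_blocks = math.ceil(file_size / block_size)
    plan = []
    for block in range(n_blocks):
        # Each replica of a block goes to a different DataNode.
        replicas = [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        plan.append(replicas)
    return plan


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3", "dn4"]
    for i, replicas in enumerate(plan_blocks(300 * 1024 * 1024, nodes)):
        print(f"block {i}: {replicas}")
```

Under these assumptions a 300 MB video is stored as three blocks, and losing any single DataNode still leaves two replicas of every block.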

S2, splitting the original video into independent image frames according to a preset sequence.

In this embodiment, OpenCV is used to process the original video: the video is taken as input and decoded frame by frame, so that it is split into independent image frames, which are stored in the order in which they appear in the original video.
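A minimal sketch of frame-by-frame splitting with OpenCV's Python bindings follows. The helper names `open_video`, `iter_frames` and `frame_filename` are hypothetical, and the zero-padded file naming is an assumed way of preserving the time order; the document does not specify how frames are stored.

```python
def open_video(path):
    """Open a video file with OpenCV (imported lazily so the helpers below work without it)."""
    import cv2
    capture = cv2.VideoCapture(path)
    if not capture.isOpened():
        raise IOError(f"cannot open video: {path}")
    return capture


def iter_frames(capture):
    """Yield (index, frame) pairs in playback order until the stream ends."""
    index = 0
    try:
        while True:
            ok, frame = capture.read()  # ok becomes False when no frame is left
            if not ok:
                break
            yield index, frame
            index += 1
    finally:
        capture.release()


def frame_filename(index, ext="png"):
    """Zero-padded name so that lexical order equals time order (assumed convention)."""
    return f"frame_{index:08d}.{ext}"
```

A caller could then write each frame out with `cv2.imwrite(frame_filename(i), frame)` while looping over `iter_frames(open_video("input.mp4"))`, where `input.mp4` is a placeholder path.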

S3, detecting whether each image frame contains a first target and, if so, marking the image frame as a target frame, otherwise marking it as a common frame.

When detecting whether the image frame includes the first target, the first target may be set in advance by a user, and then the image frame may be analyzed using various image analysis methods to determine whether the image frame includes the first target.

In some embodiments, when the first target comprises a plurality of objects, the image frame is marked as a target frame if it contains at least one of the objects. That is, an image frame containing one or more of the target objects is regarded as a target frame, so that target frames are collected more broadly and the resulting target frames are more representative.
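The "at least one object" rule can be sketched as a set intersection. The sketch assumes that a detector has already produced, for each frame, the set of object labels it found; the function name `mark_frames` and the string labels are hypothetical.

```python
def mark_frames(detections, first_target):
    """Mark each frame as "target" or "common".

    detections:   per-frame sets of detected object labels, in time order
                  (assumed to come from an object detector).
    first_target: the objects that make up the first target.
    """
    wanted = set(first_target)
    # A frame is a target frame as soon as it contains at least one wanted object.
    return ["target" if wanted & set(labels) else "common" for labels in detections]


if __name__ == "__main__":
    per_frame = [{"car"}, set(), {"person", "dog"}, {"tree"}]
    print(mark_frames(per_frame, {"person", "car"}))
```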

In some embodiments, a convolutional neural network model is used to detect whether the image frame contains the first target, so the model must first be trained. Many convolutional neural network models are available; because the Fast R-CNN model offers high detection accuracy and high detection speed, this embodiment uses the Fast R-CNN model to detect the first target.

In this embodiment, the target frames containing the first target are first extracted from all the image frames obtained from the original video, and key frame extraction is then performed only on the target frames rather than on all image frames. This speeds up key frame extraction, and the resulting key frames represent the required content better.

S4, dividing the target frames into a plurality of target frame sets.

In some embodiments, dividing the target frames into a plurality of target frame sets includes:

S41A, detecting, for each common frame, whether it is adjacent to a target frame, and if so, marking the common frame as a node frame;

S42A, dividing the target frames into a plurality of target frame sets according to the node frames; specifically, a node frame whose next frame is a target frame is marked as a start frame, a node frame whose previous frame is a target frame is marked as an end frame, and the target frames between an adjacent start frame and end frame form one target frame set.

This way of dividing the target frames into sets is efficient to compute and can effectively increase the speed of key frame extraction.
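Steps S41A and S42A can be sketched as follows, with frames represented by their "target"/"common" marks in time order. The function names are hypothetical. The document does not say what happens when the video begins or ends with a target frame; the sketch assumes the video boundary then acts as the missing start or end frame.

```python
def node_frames(marks):
    """S41A: indices of common frames that are adjacent to a target frame."""
    nodes = []
    for i, mark in enumerate(marks):
        if mark != "common":
            continue
        prev_is_target = i > 0 and marks[i - 1] == "target"
        next_is_target = i + 1 < len(marks) and marks[i + 1] == "target"
        if prev_is_target or next_is_target:
            nodes.append(i)
    return nodes


def split_by_node_frames(marks):
    """S42A: group the target frames lying between a start frame and an end frame."""
    sets, current = [], []
    for i, mark in enumerate(marks):
        if mark == "target":
            current.append(i)
        elif current:
            # This common frame follows a target frame, so it is an end frame.
            sets.append(current)
            current = []
    if current:
        # Assumed: the end of the video closes a set that is still open.
        sets.append(current)
    return sets


if __name__ == "__main__":
    marks = ["common", "target", "target", "common", "common", "target", "common", "target"]
    print(node_frames(marks))
    print(split_by_node_frames(marks))
```

Each resulting set is simply a maximal run of consecutive target frames, which is why this variant needs only a single pass over the marks.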

In other embodiments, dividing the target frames into a plurality of target frame sets includes:

S41B, defining a target frame whose previous frame is a common frame as a first target frame, and defining a target frame whose previous frame is a target frame as a second target frame;

S42B, respectively creating a target frame set for each first target frame, and adding the first target frame to the corresponding target frame set;

S43B, detecting the similarity between a second target frame and a previous target frame, and if the similarity is greater than or equal to a first threshold, adding the second target frame to a target frame set to which the previous target frame belongs; and if the similarity is smaller than a first threshold value, creating a target frame set for the second target frame, and adding the second target frame to the target frame set.

This method groups frames by comparing the similarity of adjacent target frames, which ensures that the content of the target frames in one target frame set is continuous, that is, the image frames belong to the same event. The key frames obtained in the subsequent extraction are therefore more accurate, more reliable, and more representative of the required content.
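Steps S41B to S43B can be sketched as below. The document does not name a similarity measure; the sketch assumes each frame is summarized by a normalized histogram and uses histogram intersection, and the names `histogram_similarity` and `split_by_similarity` are hypothetical. A target frame at index 0 is treated as a first target frame, which is also an assumption.

```python
def histogram_similarity(h1, h2):
    """Histogram intersection of two normalized histograms, in [0, 1] (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def split_by_similarity(marks, features, first_threshold, similarity=histogram_similarity):
    """Group target frames, opening a new set when adjacent target frames differ too much."""
    sets = []
    for i, mark in enumerate(marks):
        if mark != "target":
            continue
        # S41B: a first target frame has no target frame directly before it.
        is_first = i == 0 or marks[i - 1] != "target"
        if is_first or similarity(features[i - 1], features[i]) < first_threshold:
            sets.append([i])    # S42B, or the "smaller than threshold" branch of S43B
        else:
            sets[-1].append(i)  # S43B: join the set of the previous target frame
    return sets


if __name__ == "__main__":
    marks = ["target", "target", "target", "target", "common", "target"]
    feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.5, 0.5], [0.0, 1.0]]
    print(split_by_similarity(marks, feats, first_threshold=0.5))
```

Compared with the node-frame variant, a run of consecutive target frames can here be split into several sets when the scene changes inside the run.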

S5, acquiring the key frames of each target frame set.

Acquiring the key frames of each target frame set includes:

S51, setting a clustering center value N;

S52, processing the target frames in the target frame set by using a maximum distance method to obtain N clustering centers, wherein the maximum distance method is used for calculating whether the similarity between the target frames meets a preset condition;

S53, calculating the similarity between the clustering centers, and if the similarity is greater than or equal to a second threshold, merging the two corresponding clustering centers;

S54, classifying the target frames in the target frame set by using minimum distance classification;

S55, calculating the average similarity of the target frames in each cluster, and taking the target frame whose similarity differs least from the average similarity as the new clustering center;

S56, if the similarity between the new clustering center of each cluster and the original clustering center is greater than or equal to a third threshold, terminating the algorithm, the new clustering center of each cluster being a key frame; otherwise, returning to step S53.

The key frame extracted by the method can effectively represent the main content of the video, and the extraction speed is high.
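One possible reading of steps S51 to S56 is sketched below; it is an interpretation, not the patented implementation. Assumptions: frames are normalized histograms compared by histogram intersection; the first frame seeds the maximum distance method; "merging" two centers keeps the earlier one; in S55 the similarity of each member to the current center is averaged and the member closest to that average becomes the new center; and `max_iter` is a safety cap that the document does not mention. All function names are hypothetical.

```python
def similarity(a, b):
    """Histogram intersection of two normalized histograms (assumed measure)."""
    return sum(min(x, y) for x, y in zip(a, b))


def initial_centers(features, n):
    """S52: each new center is the frame least similar to the centers chosen so far."""
    centers = [0]  # assumed seed: the first frame of the set
    while len(centers) < min(n, len(features)):
        rest = [i for i in range(len(features)) if i not in centers]
        centers.append(min(rest, key=lambda i: max(
            similarity(features[i], features[c]) for c in centers)))
    return centers


def merge_centers(features, centers, second_threshold):
    """S53: drop a center that is too similar to one already kept."""
    kept = []
    for c in centers:
        if all(similarity(features[c], features[k]) < second_threshold for k in kept):
            kept.append(c)
    return kept


def assign(features, centers):
    """S54: each frame joins the cluster of its most similar center."""
    clusters = {c: [] for c in centers}
    for i, f in enumerate(features):
        clusters[max(centers, key=lambda c: similarity(f, features[c]))].append(i)
    return clusters


def recenter(features, center, members):
    """S55: member whose similarity to the center is closest to the cluster average."""
    if not members:
        return center
    sims = [similarity(features[m], features[center]) for m in members]
    mean = sum(sims) / len(sims)
    return min(zip(members, sims), key=lambda pair: abs(pair[1] - mean))[0]


def extract_key_frames(features, n, second_threshold, third_threshold, max_iter=20):
    """Return the indices of the key frames of one target frame set."""
    centers = initial_centers(features, n)               # S51, S52
    for _ in range(max_iter):
        centers = merge_centers(features, centers, second_threshold)
        clusters = assign(features, centers)
        new_centers = [recenter(features, c, clusters[c]) for c in centers]
        stable = all(similarity(features[old], features[new]) >= third_threshold
                     for old, new in zip(centers, new_centers))  # S56
        centers = new_centers
        if stable:
            break
    return sorted(set(centers))


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
    print(extract_key_frames(feats, n=3, second_threshold=0.75, third_threshold=0.85))
```

In the sample data the six frames form two visually distinct groups; the third seed center is merged away in S53, so the sketch returns one key frame per group even though N was set to 3.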

S6, generating a key frame set from the key frames of all the target frame sets.

The Hadoop-based distributed video key frame extraction method further comprises storing the key frame set in the Hadoop Distributed File System.

The Hadoop-based distributed video key frame extraction method further comprises removing redundancy from the key frames.

The redundancy removal may be performed in step S5 or in step S6.

The redundancy removal calculates the similarity between key frames and, of any two key frames whose similarity is greater than a fourth threshold, keeps only one.
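The redundancy removal can be sketched as a greedy pass over the key frames. The document does not say which of two similar key frames is kept; the sketch assumes the earlier one, and again assumes histogram intersection as the similarity measure. The names are hypothetical.

```python
def similarity(a, b):
    """Histogram intersection of two normalized histograms (assumed measure)."""
    return sum(min(x, y) for x, y in zip(a, b))


def remove_redundant(key_frames, features, fourth_threshold):
    """Keep a key frame only if it is not too similar to any key frame kept so far.

    key_frames: frame indices in time order.
    features:   mapping or list from frame index to its histogram.
    """
    kept = []
    for k in key_frames:
        # Of two key frames with similarity above the threshold, the earlier one survives.
        if all(similarity(features[k], features[j]) <= fourth_threshold for j in kept):
            kept.append(k)
    return kept


if __name__ == "__main__":
    feats = {0: [1.0, 0.0], 1: [0.95, 0.05], 2: [0.0, 1.0]}
    print(remove_redundant([0, 1, 2], feats, fourth_threshold=0.9))
```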

The foregoing is illustrative of the preferred embodiments of this invention. It is to be understood that the invention is not limited to the precise form disclosed herein, and that various other combinations, modifications, and environments may be resorted to within the scope of the concept disclosed herein, either as described above or as apparent to those skilled in the relevant art. Modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

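1. A Hadoop-based distributed video key frame extraction method, characterized by comprising the following steps:

acquiring an original video to be processed;

splitting the original video into independent image frames in a preset order;

detecting whether each image frame contains a first target and, if so, marking the image frame as a target frame, otherwise marking it as a common frame;

dividing the target frames into a plurality of target frame sets;

acquiring a key frame of each target frame set;

and generating a key frame set from the key frames of all the target frame sets.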

2. The Hadoop-based distributed video key frame extraction method according to claim 1, further comprising:

removing redundant key frames from the key frame set.

3. The Hadoop-based distributed video keyframe extraction method as recited in claim 1, wherein when the first target comprises a plurality of objects, the image frame is marked as a target frame if the image frame contains at least one of the objects.

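4. The Hadoop-based distributed video key frame extraction method as claimed in claim 1, wherein dividing the target frames into a plurality of target frame sets comprises:

detecting, for each common frame, whether it is adjacent to a target frame, and if so, marking the common frame as a node frame;

and dividing the target frames into a plurality of target frame sets according to the node frames.

5. The Hadoop-based distributed video key frame extraction method as claimed in claim 1, wherein dividing the target frames into a plurality of target frame sets comprises:

defining a target frame whose previous frame is a common frame as a first target frame, and defining a target frame whose previous frame is a target frame as a second target frame;

creating a target frame set for each first target frame, and adding the first target frame to its target frame set;

detecting the similarity between a second target frame and the previous target frame; if the similarity is greater than or equal to a first threshold, adding the second target frame to the target frame set to which the previous target frame belongs; and if the similarity is smaller than the first threshold, creating a new target frame set for the second target frame and adding the second target frame to it.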

6. The Hadoop-based distributed video key frame extraction method as claimed in claim 1, wherein the step of respectively obtaining the key frames of each target frame set comprises:

step one, setting a clustering center value N;