CN115866295A - Video key frame secondary extraction method and system for terminal row of convertor station

Video key frame secondary extraction method and system for terminal row of convertor station

Info

Publication number
CN115866295A
CN115866295A (application CN202211474126.3A)
Authority
CN
China
Prior art keywords
video
frame
value
difference
definition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211474126.3A
Other languages
Chinese (zh)
Inventor
谭林林
王嘉琦
程鑫
陈中
曹卫国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202211474126.3A
Publication of CN115866295A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a method and system for secondary extraction of video key frames for converter-station terminal rows, and relates to the field of video processing. The method comprises the following steps: capturing video of a physical terminal row in the converter station; performing feature extraction and graying on each frame of the video; subtracting the gray values of corresponding pixel points in two adjacent frames and taking the absolute value to obtain a difference value; comparing the difference value with a set difference threshold and, if the difference value exceeds the threshold, extracting the frame as a video key frame and adding it to an initially selected picture set. The invention addresses the problems of terminal-row videos shot manually in current converter stations: the large number of frames, the high similarity between frames, and the uneven sharpness of the pictures.

Description

Video key frame secondary extraction method and system for terminal row of convertor station
Technical Field
The invention relates to the technical field of video processing, and in particular to a method and system for secondary extraction of video key frames for a converter-station terminal row.
Background
For video, the video is composed of many still pictures, which are called frames. Due to the large number of video frames and the high similarity between adjacent frames, the computer needs to spend much time on processing the video. In order to effectively reduce the operation time of a computer, video frames need to be screened and extracted, and after video key frames which can contain effective information are extracted, the key frames are processed, so that the operation time can be greatly reduced.
In the converter-station application scenario, data on terminal rows is collected in video form in order to reduce the workload of on-site data acquisition. Owing to the variable sharpness of manual shooting and the similarity of the photographed objects, conventional key-frame extraction techniques for terminal-row video have severe limitations: the number of frames is large, adjacent frames are highly similar, and picture sharpness is uneven.
Disclosure of Invention
Technical problem to be solved
Aiming at the defects of the prior art, the invention provides a method and system for secondary extraction of video key frames for a converter-station terminal row, which solve the problems of the large number of frames, the high similarity between frames, and the uneven sharpness of pictures in terminal-row videos shot manually in current converter stations.
(II) technical scheme
In order to achieve the above purpose, the invention is realized by the following technical solutions:
In one aspect, a method for secondary extraction of video key frames for a converter-station terminal row is provided, comprising the following steps:
carrying out video acquisition on a real object of a terminal row of the converter station;
performing feature extraction and graying processing on each frame of picture in the video;
subtracting the gray values of the corresponding pixel points of the two adjacent frames of images, and taking the absolute value to obtain a difference value;
comparing the difference value with a set difference threshold, if the difference value is higher than the set difference threshold, extracting the frame to become a key frame of the video, and establishing a primary selection picture set;
performing convolution operation on the gray value of the pixel point corresponding to each image in the primary selection image set through a Laplace mask to obtain Conv;
calculating a standard deviation between Conv and a gray value of a corresponding pixel point of each image, wherein the standard deviation represents the definition of each image;
and setting a definition threshold, defining the pictures with the definition lower than the definition threshold as fuzzy, and defining the pictures with the definition higher than the definition threshold as clear.
Preferably, the video acquisition of the physical converter-station terminal row is performed with a high-definition camera installed at a position in the converter-station room from which the terminal row can be photographed clearly.
Preferably, the performing of feature extraction and graying processing on each frame of picture in the video specifically includes:
let the n-th frame and the (n-1)-th frame in the video sequence be f_n and f_{n-1}, respectively; after feature extraction and graying, the gray values of the corresponding pixel points of the two frames are denoted f_n(x, y) and f_{n-1}(x, y).
Preferably, the gray values of the corresponding pixel points of the two adjacent frames of images are subtracted and the absolute value is taken to obtain the difference value, according to the formula:
D_n(x, y) = |f_n(x, y) - f_{n-1}(x, y)|
where D_n(x, y) denotes the difference value.
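The frame-difference step above can be sketched as follows. This is a minimal illustration assuming greyscale uint8 frames; the function name and toy data are illustrative, not taken from the patent:

```python
import numpy as np

def frame_difference(frame_prev, frame_curr):
    """Per-pixel difference D_n(x, y) = |f_n(x, y) - f_{n-1}(x, y)|.

    frame_prev, frame_curr: 2-D uint8 arrays (already greyscaled).
    Returns the difference map and its mean, which can be compared
    against a threshold to decide whether the frame is a key frame.
    """
    # Promote to a signed type so the uint8 subtraction cannot wrap around.
    diff = np.abs(frame_curr.astype(np.int16) - frame_prev.astype(np.int16))
    return diff, float(diff.mean())

# Toy usage: a flat frame versus one with a single changed pixel.
a = np.zeros((4, 4), dtype=np.uint8)
b = a.copy()
b[0, 0] = 100
diff_map, score = frame_difference(a, b)
print(score)  # 100 / 16 = 6.25
```

Note the cast to a signed integer type before subtraction: subtracting uint8 arrays directly would wrap around modulo 256 and corrupt the difference map.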
Preferably, comparing the difference value with the set difference threshold, extracting the frame as a video key frame if the difference value exceeds the threshold, and establishing the initially selected picture set specifically comprises:
The larger the difference value between two adjacent frames, the greater the change between them. A difference threshold ζ is set between adjacent frames; if the difference value between a frame and the preceding frame satisfies D_n > ζ, the frame is extracted as a video key frame. This completes the first extraction of video key frames and establishes the initially selected picture set Ω1.
Preferably, the performing convolution operation on the gray value of the pixel point corresponding to each image in the initially selected image set through a laplacian mask to obtain Conv specifically includes:
For each image in the initially selected set Ω1, let the gray value of its pixel at (x, y) be f(x, y). Conv is obtained by convolving f(x, y) with a Laplacian mask L:
Conv(x, y) = f(x, y) * L(x, y)
where * denotes two-dimensional convolution. The mask L itself appears in the original only as a formula image (not reproduced here); its scaling constant K is selected so that each array element is an integer and the sum of all array elements is zero.
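As a concrete illustration of the convolution and sharpness steps, the sketch below uses the common 3×3 Laplacian mask, whose integer entries sum to zero as the text requires. The exact mask in the patent's formula image is not reproduced, so the mask and function names here are assumptions:

```python
import numpy as np

# A common 3x3 Laplacian mask: integer entries whose sum is zero,
# matching the constraint stated in the text (the patent's own mask
# is given only as a formula image and may differ).
LAPLACE_MASK = np.array([[0,  1, 0],
                         [1, -4, 1],
                         [0,  1, 0]], dtype=np.float64)

def sharpness(gray):
    """Standard deviation of the Laplacian response of a greyscale
    image, used here as the per-image sharpness score."""
    g = gray.astype(np.float64)
    h, w = g.shape
    conv = np.empty((h - 2, w - 2))
    # Valid-mode 2-D convolution, written out explicitly for clarity
    # (the mask is symmetric, so convolution equals correlation).
    for i in range(h - 2):
        for j in range(w - 2):
            conv[i, j] = np.sum(g[i:i + 3, j:j + 3] * LAPLACE_MASK)
    return float(conv.std())

# A sharp edge yields a much larger response than a flat region.
flat = np.full((8, 8), 128.0)
edge = flat.copy()
edge[:, 4:] = 0.0
print(sharpness(flat))  # 0.0 for a perfectly flat image
print(sharpness(edge) > sharpness(flat))
```

A blurred image has weak edges and hence a small Laplacian response, so thresholding this score separates "clear" from "fuzzy" pictures as described.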
In another aspect, a system for secondary extraction of video key frames for a converter-station terminal row is provided, the system comprising a multi-shot primary video key-frame extraction unit considering frame-difference correlation and a multi-shot secondary key-frame extraction unit considering sharpness complementarity;
the multi-shot primary video key frame extraction unit considering frame difference correlation comprises: performing feature extraction and graying processing on each frame of picture in the video, then performing difference on the gray values of two adjacent frames, comparing the difference value with a set difference threshold value, realizing the extraction of the key frame of the video for the first time, and establishing a primary selected picture set omega 1;
the multi-shot secondary key frame extraction unit considering the definition complementarity comprises a set omega 1 Performing convolution operation on the gray value of the pixel point corresponding to each frame of image through a Laplace mask to obtain Conv, and then calculating the standard deviation of the Conv and the gray value of the pixel point corresponding to each frame of image to obtain a value representing the definition of each frame of image; setting a definition threshold from the set omega 1 And screening out clear pictures to realize secondary extraction of the video key frames facing the terminal row of the convertor station.
In yet another aspect, an apparatus is provided, comprising:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, causing the one or more processors to perform the method for secondary extraction of video key frames for a converter-station terminal row.
In yet another aspect, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the method for secondary extraction of video key frames for a converter-station terminal row described above.
(III) advantageous effects
The invention discloses a method and system for secondary extraction of video key frames for a converter-station terminal row, which solve the problems of the large number of frames, the high similarity between frames, and the uneven sharpness of pictures in terminal-row videos shot manually in current converter stations.
Drawings
Fig. 1 is a schematic flow chart of a secondary extraction method of video key frames for a terminal row of a converter station according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the present invention, and it is to be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
As shown in fig. 1, in an aspect, an embodiment of the present invention provides a method for extracting a video key frame twice for a terminal block of a converter station, including:
carrying out video acquisition on a real object of a terminal row of the converter station;
carrying out feature extraction and graying processing on each frame of picture in the video;
subtracting the gray values of the corresponding pixel points of the two adjacent frames of images, and taking the absolute value of the gray values to obtain a difference value;
comparing the difference value with a set difference threshold, if the difference value is higher than the set difference threshold, extracting the frame to become a key frame of the video, and establishing a primary selection picture set;
performing convolution operation on the gray value of the pixel point corresponding to each image in the primary selection image set through a Laplace mask to obtain Conv;
calculating a standard deviation between Conv and a gray value of a corresponding pixel point of each image, wherein the standard deviation represents the definition of each image;
and setting a definition threshold, defining the pictures with the definition lower than the definition threshold as fuzzy, and defining the pictures with the definition higher than the definition threshold as clear.
Preferably, the video acquisition of the physical converter-station terminal row is performed with a high-definition camera installed at a position in the converter-station room from which the terminal row can be photographed clearly.
Preferably, the performing of the feature extraction and the graying processing on each frame of picture in the video specifically includes:
let the n-th frame and the (n-1)-th frame in the video sequence be f_n and f_{n-1}, respectively; after feature extraction and graying, the gray values of the corresponding pixel points of the two frames are denoted f_n(x, y) and f_{n-1}(x, y).
Preferably, the gray values of the corresponding pixel points of the two adjacent frames of images are subtracted and the absolute value is taken to obtain the difference value, according to the formula:
D_n(x, y) = |f_n(x, y) - f_{n-1}(x, y)|
where D_n(x, y) denotes the difference value.
Preferably, comparing the difference value with the set difference threshold, extracting the frame as a video key frame if the difference value exceeds the threshold, and establishing the initially selected picture set specifically comprises:
The larger the difference value between two adjacent frames, the greater the change between them. A difference threshold ζ is set between adjacent frames; if the difference value between a frame and the preceding frame satisfies D_n > ζ, the frame is extracted as a video key frame. This completes the first extraction of video key frames and establishes the initially selected picture set Ω1.
Preferably, the performing convolution operation on the gray value of the pixel point corresponding to each image in the initially selected image set through a laplacian mask to obtain Conv specifically includes:
For each image in the initially selected set Ω1, let the gray value of its pixel at (x, y) be f(x, y). Conv is obtained by convolving f(x, y) with a Laplacian mask L:
Conv(x, y) = f(x, y) * L(x, y)
where * denotes two-dimensional convolution. The mask L itself appears in the original only as a formula image (not reproduced here); its scaling constant K is selected so that each array element is an integer and the sum of all array elements is zero.
As still another embodiment of the present invention, there is provided a system for secondary extraction of video key frames for a converter-station terminal row, the system including a multi-shot primary video key-frame extraction unit considering frame-difference correlation and a multi-shot secondary key-frame extraction unit considering sharpness complementarity;
the multi-shot primary video key frame extraction unit considering frame difference correlation comprises: performing feature extraction and graying processing on each frame of picture in the video, then performing difference on the gray values of two adjacent frames, comparing the difference value with a set difference threshold value, realizing the extraction of the key frame of the video for the first time, and establishing a primary selected picture set omega 1;
the multi-shot secondary key frame extraction unit considering sharpness complementarity comprises a set omega 1 Performing convolution operation on the gray value of the pixel point corresponding to each frame of image through a Laplace mask to obtain Conv, and then calculating the standard deviation of the Conv and the gray value of the pixel point corresponding to each frame of image to obtain a value representing the definition of each frame of image; setting a definition threshold from the set omega 1 And screening out clear pictures to realize secondary extraction of the video key frames facing the terminal row of the converter station.
As still another embodiment of the present invention, there is provided an apparatus including:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method for secondary extraction of video key frames for a converter-station terminal row as in the above embodiments.
As a further embodiment of the present invention, a computer readable storage medium storing a computer program is provided, which when executed by a processor implements a video key frame secondary extraction method for a terminal row of a converter station in the above embodiments.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a" or "comprising an" does not exclude the presence of additional like elements in the process, method, article, or apparatus that comprises the element.

Claims (9)

1. A secondary extraction method of video key frames facing a convertor station terminal row is characterized by comprising the following steps:
carrying out video acquisition on a real object of a terminal row of the converter station;
carrying out feature extraction and graying processing on each frame of picture in the video;
subtracting the gray values of the corresponding pixel points of the two adjacent frames of images, and taking the absolute value to obtain a difference value;
comparing the difference value with a set difference threshold, if the difference value is higher than the set difference threshold, extracting the frame to become a key frame of the video, and establishing a primary selection picture set;
performing convolution operation on the gray value of the pixel point corresponding to each image in the primary selection image set through a Laplace mask to obtain Conv;
calculating a standard deviation between Conv and a gray value of a corresponding pixel point of each image, wherein the standard deviation represents the definition of each image;
and setting a definition threshold, defining the pictures with the definition lower than the definition threshold as fuzzy, and defining the pictures with the definition higher than the definition threshold as clear.
2. The method for secondary extraction of video keyframes from a terminal row of converter stations as claimed in claim 1, wherein: the video acquisition of the converter station terminal strip real object is realized through a high-definition camera, and the high-definition camera is arranged in a place which can clearly shoot the terminal strip real object picture in the converter station chamber.
3. The method for secondary extraction of video keyframes from a terminal row of converter stations as claimed in claim 1, wherein: the specific steps of carrying out feature extraction and graying processing on each frame of picture in the video comprise:
let the n-th frame and the (n-1)-th frame in the video sequence be f_n and f_{n-1}, respectively; after feature extraction and graying, the gray values of the corresponding pixel points of the two frames are denoted f_n(x, y) and f_{n-1}(x, y).
4. The method for secondary extraction of video keyframes from a terminal row of converter stations as claimed in claim 3, wherein: the gray values of the corresponding pixel points of the two adjacent frames of images are subtracted and the absolute value is taken to obtain the difference value, according to the formula:
D_n(x, y) = |f_n(x, y) - f_{n-1}(x, y)|
where D_n(x, y) denotes the difference value.
5. The method for extracting video key frames secondarily from convertor station terminal blocks as claimed in claim 4, wherein the method comprises the following steps: comparing the difference value with a set difference threshold, if the difference value is higher than the set difference threshold, extracting the frame to be a key frame of the video, and establishing a primary selection picture set, which specifically comprises:
comparing the difference value with the set difference threshold: the larger the difference value between two adjacent frames, the greater the change between them; a difference threshold ζ is set, and if the difference value between a frame and the preceding frame satisfies D_n > ζ, the frame is extracted as a video key frame, completing the first extraction of video key frames and establishing the initially selected picture set Ω1.
6. The method for secondary extraction of video keyframes from a terminal row of converter stations as claimed in claim 5, wherein: performing convolution operation on the gray value of the pixel point corresponding to each image in the initially selected image set through a laplacian mask to obtain Conv, specifically comprising:
for each image in the initially selected set Ω1, let the gray value of its pixel at (x, y) be f(x, y); Conv is obtained by convolving f(x, y) with a Laplacian mask L:
Conv(x, y) = f(x, y) * L(x, y)
where * denotes two-dimensional convolution, the mask L appears in the original only as a formula image (not reproduced here), and its scaling constant K is selected so that each array element is an integer and the sum of all array elements is zero.
7. A system for secondary extraction of video key frames for a converter-station terminal row, characterized by comprising a multi-shot primary video key-frame extraction unit considering frame-difference correlation and a multi-shot secondary key-frame extraction unit considering sharpness complementarity;
the multi-shot primary video key-frame extraction unit considering frame-difference correlation performs feature extraction and graying on each frame of the video, differences the gray values of adjacent frames, and compares the difference value with the set difference threshold, thereby performing the first extraction of video key frames and establishing the initially selected picture set Ω1;
the multi-shot secondary key-frame extraction unit considering sharpness complementarity convolves the gray values of the pixel points of each image in the set Ω1 with a Laplacian mask to obtain Conv, and then computes the standard deviation between Conv and the gray values of the corresponding pixel points, yielding a value representing the sharpness of each image; a sharpness threshold is set and the clear pictures are screened out of the set Ω1, completing the secondary extraction of video key frames for the converter-station terminal row.
8. An apparatus, comprising:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method for secondary extraction of video key frames for a converter-station terminal row as recited in any one of claims 1-6.
9. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method for secondary extraction of video key frames for a converter-station terminal row as claimed in any one of claims 1 to 6.
CN202211474126.3A 2022-11-22 2022-11-22 Video key frame secondary extraction method and system for terminal row of convertor station Pending CN115866295A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211474126.3A CN115866295A (en) 2022-11-22 2022-11-22 Video key frame secondary extraction method and system for terminal row of convertor station

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211474126.3A CN115866295A (en) 2022-11-22 2022-11-22 Video key frame secondary extraction method and system for terminal row of convertor station

Publications (1)

Publication Number Publication Date
CN115866295A 2023-03-28

Family

ID=85665335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211474126.3A Pending CN115866295A (en) 2022-11-22 2022-11-22 Video key frame secondary extraction method and system for terminal row of convertor station

Country Status (1)

Country Link
CN (1) CN115866295A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116939197A (en) * 2023-09-15 2023-10-24 海看网络科技(山东)股份有限公司 Live program head broadcasting and replay content consistency monitoring method based on audio and video

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275626A (en) * 2018-12-05 2020-06-12 深圳市炜博科技有限公司 Video deblurring method, device and equipment based on ambiguity
CN112149495A (en) * 2020-08-07 2020-12-29 中国矿业大学(北京) Video key frame extraction method based on parallax tracking
US20210142069A1 (en) * 2018-05-18 2021-05-13 Cambricon Technologies Corporation Limited Video retrieval method, and method and apparatus for generating video retrieval mapping relationship
CN112990191A (en) * 2021-01-06 2021-06-18 中国电子科技集团公司信息科学研究院 Shot boundary detection and key frame extraction method based on subtitle video

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210142069A1 (en) * 2018-05-18 2021-05-13 Cambricon Technologies Corporation Limited Video retrieval method, and method and apparatus for generating video retrieval mapping relationship
CN111275626A (en) * 2018-12-05 2020-06-12 深圳市炜博科技有限公司 Video deblurring method, device and equipment based on ambiguity
CN112149495A (en) * 2020-08-07 2020-12-29 中国矿业大学(北京) Video key frame extraction method based on parallax tracking
CN112990191A (en) * 2021-01-06 2021-06-18 中国电子科技集团公司信息科学研究院 Shot boundary detection and key frame extraction method based on subtitle video

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Lan Zhangli; Shuai Dan; Li Yicai: "A key-frame extraction algorithm for road surveillance video based on the correlation coefficient", Journal of Chongqing Jiaotong University (Natural Science Edition), no. 01, 15 February 2016 (2016-02-15) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116939197A (en) * 2023-09-15 2023-10-24 海看网络科技(山东)股份有限公司 Live program head broadcasting and replay content consistency monitoring method based on audio and video

Similar Documents

Publication Publication Date Title
US6912313B2 (en) Image background replacement method
US8582915B2 (en) Image enhancement for challenging lighting conditions
US9912839B2 (en) Method for conversion of a saturated image into a non-saturated image
CN105469375B (en) Method and device for processing high dynamic range panorama
US20080031339A1 (en) Image matching device and method for motion pictures
WO2003036557A1 (en) Method and apparatus for background segmentation based on motion localization
CN107135401B (en) Key frame selection method and system
CN108335272B (en) Method and device for shooting picture
EP1542152A1 (en) Object detection
CN110276769B (en) Live broadcast content positioning method in video picture-in-picture architecture
EP1665806A1 (en) Motion vector field re-timing
CN113242428B (en) Post-processing acceleration method based on ROI (region of interest) in video conference scene
CN115866295A (en) Video key frame secondary extraction method and system for terminal row of convertor station
US11373279B2 (en) Image processing method and device
CN117333398A (en) Multi-scale image denoising method and device based on self-supervision
CN111460964A (en) Moving target detection method under low-illumination condition of radio and television transmission machine room
CN114419102A (en) Multi-target tracking detection method based on frame difference time sequence motion information
RU2669470C1 (en) Device for removing logos and subtitles from video sequences
CN111654747B (en) Bullet screen display method and device
CN113965814A (en) Multi-meeting-place key frame extraction method and system based on video meeting scene
CN108898566B (en) Low-illumination color video enhancement method using space-time illumination map
US8582882B2 (en) Unit for and method of segmentation using average homogeneity
CN107292803B (en) Reversible robust digital image watermarking method according with human eye observation rule
WO2016113407A1 (en) Methods and apparatus for groupwise contrast enhancement
CN112465853B (en) Background conversion method and device for video picture, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination