CN111476314A - Fuzzy video detection method integrating optical flow algorithm and deep learning - Google Patents

Fuzzy video detection method integrating optical flow algorithm and deep learning

Info

Publication number
CN111476314A
CN111476314A
Authority
CN
China
Prior art keywords
video
frame
frames
detection
optical flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010342615.8A
Other languages
Chinese (zh)
Other versions
CN111476314B (en)
Inventor
王儒敬
李登山
谢成军
张洁
李瑞
陈天娇
陈红波
胡海瀛
刘海云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Institutes of Physical Science of CAS
Original Assignee
Hefei Institutes of Physical Science of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Institutes of Physical Science of CAS
Priority to CN202010342615.8A
Publication of CN111476314A
Application granted
Publication of CN111476314B
Legal status: Active (current)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a fuzzy video detection method integrating an optical flow algorithm and deep learning, which comprises the following steps: preprocessing training video samples; obtaining a fuzzy video detection model and constructing a fuzzy video time-sequence training model, with the feature maps of video frames obtained through a deep learning algorithm; aggregating the twenty feature maps of the ten frames before and the ten frames after the frame to be detected into one feature map using the optical flow algorithm, with weights between 0 and 1 that follow a normal distribution; determining those weights according to the normal distribution; detecting the aggregated frame feature map; and locating and marking the specific position of the target in the video frame. The method considers not only the features of individual video frames but also the video time sequence and related factors such as space, geographic position and weather, and uses an optical flow method to fuse each frame with the frames before and after it. This improves fuzzy video detection and recognition under complex application conditions and raises the detection rate of targets in fuzzy video.

Description

Fuzzy video detection method integrating optical flow algorithm and deep learning
Technical Field
The invention relates to the technical field of video identification, in particular to a fuzzy video detection method integrating an optical flow algorithm and deep learning.
Background
How to improve the detection rate in fuzzy (i.e., blurred) video is a difficult problem. Because fuzzy video suffers from defocus, partial occlusion and motion blur, even frames captured from high-definition video are not as clear as photographs taken with a camera; in video surveillance, dim light at night, long shooting distances and similar causes often leave the recorded video blurred.
At present, the inspection of fuzzy video, including surveillance video, is mainly performed by professional personnel. When the scene is complex, however, these personnel are limited by their knowledge level and other factors, and the accuracy of naked-eye inspection of fuzzy video is difficult to guarantee. Meanwhile, surveillance video captured in natural environments is strongly affected by weather conditions such as heavy fog, wind and snow, and heavy rain, as well as by changing illumination and shadows, so traditional deep-learning-based fuzzy video detection methods have low efficiency and unsatisfactory robustness. In addition, most existing detection methods focus on feature extraction from individual video frames and neglect related conditioning factors such as the temporal information of the video, so automatic recognition of fuzzy video has remained at the experimental stage. How to improve the accuracy of fuzzy video detection has therefore become a technical problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to provide a fuzzy video detection method integrating an optical flow algorithm and deep learning, which improves the detection and recognition of fuzzy video under complex application conditions and raises the detection rate of targets in fuzzy video.
In order to achieve this purpose, the invention adopts the following technical scheme: a fuzzy video detection method integrating an optical flow algorithm and deep learning, comprising the following steps in sequence:
(1) preprocessing training video samples: collecting a plurality of fuzzy videos, together with the time and geographic-location information of the corresponding targets, as training data; manually marking the detection targets in the fuzzy videos; and capturing frames from all marked videos to obtain several categories of targets, each category having a plurality of training samples;
(2) training on the frames obtained in step (1) to obtain a fuzzy video detection model based on a deep learning algorithm; constructing a fuzzy video time-sequence training model, introducing the different shooting times and geographic positions of the fuzzy videos as feature data, and training the fuzzy video detection model based on deep learning fused with the optical flow algorithm; obtaining the feature map of each video frame through the deep learning algorithm;
(3) aggregating the twenty feature maps of the ten frames before and the ten frames after the frame to be detected into one feature map using the optical flow algorithm, with weights between 0 and 1 that follow a normal distribution;
(4) determining the weights in step (3) according to the normal distribution;
(5) constructing a frame-feature-map detection model based on a deep-learning image detection algorithm, and using it to detect the aggregated frame feature map produced by the optical flow computation;
(6) combining the marks of the specific positions of the targets in the fuzzy videos, inputting the space, geographic-position and time information of the video to be detected into the trained frame-feature-map detection model, recognizing and detecting the fuzzy video, and having the computer locate and mark the specific position of the target in the video frame (the overall flow is sketched in code below).
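To make the sequence concrete, the following is a minimal Python sketch of the overall pipeline. The callables feature_extractor, aggregate and detector are hypothetical placeholders (illustrated further in the detailed description below), and the neighborhood size K follows step (3); none of these names are prescribed by the patent.

def detect_fuzzy_video(frames_gray, metadata, feature_extractor, aggregate, detector, K=10):
    # frames_gray: list of video frames; metadata: time/geographic-position
    # information of the video; K: neighbors used on each side, per step (3).
    fmaps = [feature_extractor(f) for f in frames_gray]               # step (2)
    results = []
    for j in range(len(frames_gray)):
        lo, hi = max(0, j - K), min(len(frames_gray), j + K + 1)
        agg = aggregate(frames_gray[lo:hi], fmaps[lo:hi], j - lo)     # steps (3)-(4)
        results.append(detector(agg, metadata))                       # steps (5)-(6)
    return results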
In step (1), the pre-processing of the training video sample comprises the following steps:
(1a) collecting a plurality of videos shot blurrily because of rain, snow, fog or night, and classifying them according to the time and geographic-location information of when the corresponding targets occur;
(1b) marking the videos with a video marking tool, which marks the video frame by frame, the marked content being the category of the object in the video;
(1c) using an algorithm to capture frames from the videos classified in step (1a), and storing the captured frames classified by the time and geographic location of when the corresponding targets occur, for training the detection model.
In step (2), training with the obtained frames based on the deep learning algorithm to obtain the fuzzy video detection model and the frame feature maps specifically comprises the following steps:
(2a) adopting ResNet-50, ResNet-101 and GoogLeNet respectively as the networks for obtaining the frame feature maps, so that diverse frame feature maps are formed for detection in the subsequent steps;
(2b) the network structure of ResNet-50 consists of 49 convolutional layers and 1 average-pooling layer; the convolutional layers are divided into 16 blocks, each block having 1 shortcut connection, and finally a softmax layer generates the classification prediction confidence;
(2c) when a frame passes through each of the three networks, its feature map is taken from the output just before the final softmax layer.
The step (3) specifically comprises the following steps:
(3a) the frame feature maps provide different information about the target object instances;
(3b) using the optical flow algorithm, the feature map of a specific frame is fused with the feature maps of the five frames before and after it, according to the formula:

f_j = Σ_i w_i · f_i

where f_i is the feature map of frame i, Σ denotes optical-flow aggregation, w_i is the weight with which each adjacent feature map is aggregated, and f_j is the aggregated feature map;
the weight w_i is determined by the following formula:

w_i = (1 / (√(2π) · σ)) · exp(−(z − μ)² / (2σ²))

where z is the distance of the specific frame from an adjacent frame, defined as z = |i − j|, μ is the mean of the normal distribution, and σ is its standard deviation; μ = 0 and σ = 1 are taken;
the optical flow algorithm adopts an optical flow implementation from a computer vision library, and proceeds as follows:
let I(x, y, t) be a pixel, where x, y denote coordinates and t denotes time; the pixel moves by a distance (dx, dy) to the next frame over time dt; assuming the pixel intensity is unchanged over this small interval:

I(x, y, t) = I(x + dx, y + dy, t + dt)

expanding the right-hand side as a Taylor series gives:

I(x + dx, y + dy, t + dt) = I(x, y, t) + (∂I/∂x)·dx + (∂I/∂y)·dy + (∂I/∂t)·dt + ε

where ε denotes the second-order infinitesimal terms; comparing the two equations above gives:

(∂I/∂x)·dx + (∂I/∂y)·dy + (∂I/∂t)·dt = 0

dividing through by dt gives:

(∂I/∂x)·u + (∂I/∂y)·v + ∂I/∂t = 0, where (u, v) = (dx/dt, dy/dt)

and (u, v) is the optical flow vector sought.
The step (4) comprises the following steps:
(4a) numbering the eleven frames adjacent to the frame to be detected;
(4b) calculating the weight of each of the eleven frames according to the weight formula

w_i = (1 / (√(2π) · σ)) · exp(−(z − μ)² / (2σ²))

the resulting weights taking values between 0 and 1.
In step (5), constructing the frame-feature-map detection model with the deep-learning image detection algorithm and detecting the frame feature map comprises the following steps:
(5a) using an R-FCN network as the network for detecting the frame feature maps, the detection framework comprising an RPN and the R-FCN;
(5b) the RPN uses 9 anchor boxes and generates 300 proposal boxes per image; the position-sensitive score maps in the R-FCN are 7 × 7;
(5c) training the R-FCN with the training sample frames to obtain a frame detection model for the target categories;
(5d) inputting the aggregated feature map into the frame detection model of step (5c) to obtain the detection result.
The step (6) specifically comprises the following steps:
(6a) training the classified training sample frames separately to obtain a detection model for the fuzzy frames of each category captured from the fuzzy videos;
(6b) inputting the video frames of each category into the corresponding detection model, and obtaining detection results for the video frames of each category that contain the spatial, geographic-position and time information;
(6c) marking the detection results in the output;
(6d) counting the detection results of the training sample frames of each category, the results including the time and geographic-location information of when the targets occur.
According to the above technical scheme, the beneficial effects of the invention are as follows: compared with the prior art, the method considers not only the features of the video frames but also the video time sequence and related factors such as space, geographic position and weather, and uses the optical flow method to fuse each frame with the frames before and after it. Separate models are established and applied for different geographic positions, weather conditions and similar factors, which improves the detection and recognition of fuzzy video under complex application conditions and raises the detection rate of targets in fuzzy video.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram illustrating the optical-flow aggregation of video-frame feature maps according to the present invention.
Detailed Description
As shown in FIG. 1, the fuzzy video detection method integrating an optical flow algorithm and deep learning comprises the following steps:
(1) preprocessing training video samples: collecting a plurality of fuzzy videos, together with the time and geographic-location information of the corresponding targets, as training data; manually marking the detection targets in the fuzzy videos; and capturing frames from all marked videos to obtain several categories of targets, each category having a plurality of training samples;
(2) training on the frames obtained in step (1) to obtain a fuzzy video detection model based on a deep learning algorithm; constructing a fuzzy video time-sequence training model, introducing the different shooting times and geographic positions of the fuzzy videos as feature data, and training the fuzzy video detection model based on deep learning fused with the optical flow algorithm; obtaining the feature map of each video frame through the deep learning algorithm;
(3) aggregating the twenty feature maps of the ten frames before and the ten frames after the frame to be detected into one feature map using the optical flow algorithm, with weights between 0 and 1 that follow a normal distribution;
(4) determining the weights in step (3) according to the normal distribution;
(5) constructing a frame-feature-map detection model based on a deep-learning image detection algorithm, and using it to detect the aggregated frame feature map produced by the optical flow computation;
(6) combining the marks of the specific positions of the targets in the fuzzy videos, inputting the space, geographic-position and time information of the video to be detected into the trained frame-feature-map detection model, recognizing and detecting the fuzzy video, and having the computer locate and mark the specific position of the target in the video frame.
In step (1), the pre-processing of the training video sample comprises the following steps:
(1a) collecting a plurality of videos shot blurrily because of rain, snow, fog or night, and classifying them according to the time and geographic-location information of when the corresponding targets occur;
(1b) marking the videos with a video marking tool, which marks the video frame by frame, the marked content being the category of the object in the video;
(1c) using an algorithm to capture frames from the videos classified in step (1a), and storing the captured frames classified by the time and geographic location of when the corresponding targets occur, for training the detection model. The frames from step (1c) should fall into several categories, with several training sample frames in each category.
A plurality of fuzzy videos, together with the time, geographic-location and weather information of the corresponding shots, are collected as training data, and the targets in the fuzzy videos are manually marked, yielding several categories of videos with a plurality of video training samples in each category. Here not only the video samples but also the time, geographic-location and weather information at capture time are obtained; this information further increases the robustness of fuzzy video recognition.
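As an illustration of step (1c), the sketch below captures frames with OpenCV and stores them in directories keyed by the time and geographic-location class of the target. The directory layout, metadata tags and sampling interval are assumptions made for illustration, not requirements of the method.

import os
import cv2

def extract_frames(video_path, out_dir, time_tag, location_tag, step=1):
    # Capture frames from one annotated video and store them in a
    # directory keyed by the time/location class of the target.
    class_dir = os.path.join(out_dir, f"{time_tag}_{location_tag}")
    os.makedirs(class_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:  # keep every `step`-th frame
            cv2.imwrite(os.path.join(class_dir, f"frame_{idx:06d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved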
In step (2), training with the obtained frames based on the deep learning algorithm to obtain the fuzzy video detection model and the frame feature maps specifically comprises the following steps:
(2a) adopting ResNet-50, ResNet-101 and GoogLeNet respectively as the networks for obtaining the frame feature maps, so that diverse frame feature maps are formed for detection in the subsequent steps;
(2b) the network structure of ResNet-50 consists of 49 convolutional layers and 1 average-pooling layer; the convolutional layers are divided into 16 blocks, each block having 1 shortcut connection, and finally a softmax layer generates the classification prediction confidence;
(2c) when a frame passes through each of the three networks, its feature map is taken from the output just before the final softmax layer (see the sketch below).
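A sketch of step (2c), under the assumption that torchvision's ResNet-50 stands in for the backbone: the average-pooling and fully connected/softmax layers are dropped so the network returns a spatial feature map instead of class scores. Weights are left random here purely for illustration; in practice the trained weights from step (2a) would be loaded.

import torch
import torchvision.models as models

backbone = models.resnet50(weights=None)  # random weights, for illustration only
# Drop the average-pooling and fully connected/softmax layers so the
# network outputs a spatial feature map instead of class scores.
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-2])
feature_extractor.eval()

with torch.no_grad():
    frame = torch.randn(1, 3, 224, 224)   # a preprocessed video frame
    fmap = feature_extractor(frame)       # shape: (1, 2048, 7, 7)
print(fmap.shape)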
The step (3) specifically comprises the following steps:
(3a) the frame feature maps provide different information about the target object instances;
(3b) using the optical flow algorithm, the feature map of a specific frame is fused with the feature maps of the five frames before and after it, according to the formula:

f_j = Σ_i w_i · f_i

where f_i is the feature map of frame i, Σ denotes optical-flow aggregation, w_i is the weight with which each adjacent feature map is aggregated, and f_j is the aggregated feature map;
the weight w_i is determined by the following formula:

w_i = (1 / (√(2π) · σ)) · exp(−(z − μ)² / (2σ²))

where z is the distance of the specific frame from an adjacent frame, defined as z = |i − j|, μ is the mean of the normal distribution, and σ is its standard deviation, which should be adjusted to the application; generally μ = 0 and σ = 1 are taken;
the optical flow algorithm adopts an optical flow implementation from a computer vision library, and proceeds as follows:
let I(x, y, t) be a pixel, where x, y denote coordinates and t denotes time; the pixel moves by a distance (dx, dy) to the next frame over time dt; assuming the pixel intensity is unchanged over this small interval:

I(x, y, t) = I(x + dx, y + dy, t + dt)

expanding the right-hand side as a Taylor series gives:

I(x + dx, y + dy, t + dt) = I(x, y, t) + (∂I/∂x)·dx + (∂I/∂y)·dy + (∂I/∂t)·dt + ε

where ε denotes the second-order infinitesimal terms; comparing the two equations above gives:

(∂I/∂x)·dx + (∂I/∂y)·dy + (∂I/∂t)·dt = 0

dividing through by dt gives:

(∂I/∂x)·u + (∂I/∂y)·v + ∂I/∂t = 0, where (u, v) = (dx/dt, dy/dt)

and (u, v) is the optical flow vector sought.
The weight formula above shows that the closer a frame is to the specific frame, the larger its weight, and the farther away, the smaller its weight.
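Combining the weight formula with library-computed optical flow, steps (3) and (4) can be sketched as follows. Dense Farneback flow from OpenCV stands in for the computer-vision-library optical flow, the feature maps are assumed to have been resized to the frame resolution so they can be warped per pixel, and the weights are renormalized to sum to one for numerical stability; all three are assumptions of this sketch rather than statements of the patent's exact implementation.

import cv2
import numpy as np

def gaussian_weight(z, mu=0.0, sigma=1.0):
    # w_i = 1 / (sqrt(2*pi) * sigma) * exp(-((z - mu)^2) / (2 * sigma^2))
    return np.exp(-((z - mu) ** 2) / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)

def warp_to_reference(fmap, flow):
    # Sample feature map i at the positions the flow maps each pixel of the
    # reference frame j to, aligning the map with frame j's pixel grid.
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    # remap channel by channel, since cv2.remap handles few channels at once
    return np.stack([cv2.remap(fmap[..., c], map_x, map_y, cv2.INTER_LINEAR)
                     for c in range(fmap.shape[-1])], axis=-1)

def aggregate(frames_gray, fmaps, j):
    # frames_gray: list of grayscale frames; fmaps: float32 feature maps of
    # shape (H, W, C), resized to frame resolution; j: index of the frame to
    # be detected within this window.
    acc = np.zeros_like(fmaps[j])
    total = 0.0
    for i in range(len(fmaps)):
        w = gaussian_weight(abs(i - j))        # z = |i - j|, mu = 0, sigma = 1
        if i == j:
            warped = fmaps[i]
        else:
            flow = cv2.calcOpticalFlowFarneback(
                frames_gray[j], frames_gray[i], None,
                0.5, 3, 15, 3, 5, 1.2, 0)      # dense flow from frame j to frame i
            warped = warp_to_reference(fmaps[i], flow)
        acc += w * warped
        total += w
    return acc / total  # weights renormalized here for numerical stability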
The step (4) comprises the following steps:
(4a) numbering the eleven frames adjacent to the frame to be detected;
(4b) calculating the weight of each of the eleven frames according to the weight formula

w_i = (1 / (√(2π) · σ)) · exp(−(z − μ)² / (2σ²))

the resulting weights taking values between 0 and 1 (a numerical illustration follows below).
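For concreteness, the weights of step (4b) can be evaluated with μ = 0 and σ = 1 at the distances z = |i − j| that occur among the eleven frames (a small illustrative computation, not part of the patent text):

import math

def weight(z, mu=0.0, sigma=1.0):
    return math.exp(-((z - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

# z = 0 is the frame to be detected; z = 1..5 are its neighbors on either side.
for z in range(6):
    print(z, round(weight(z), 6))
# 0 -> 0.398942, 1 -> 0.241971, 2 -> 0.053991,
# 3 -> 0.004432, 4 -> 0.000134, 5 -> 0.000001

All values lie between 0 and 1 and fall off rapidly with distance, consistent with the remark that closer frames receive larger weights.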
In step (5), constructing the frame-feature-map detection model with the deep-learning image detection algorithm and detecting the frame feature map comprises the following steps:
(5a) using an R-FCN network as the network for detecting the frame feature maps, the detection framework comprising an RPN and the R-FCN;
(5b) the RPN uses 9 anchor boxes and generates 300 proposal boxes per image; the position-sensitive score maps in the R-FCN are 7 × 7;
(5c) training the R-FCN with the training sample frames to obtain a frame detection model for the target categories;
(5d) inputting the aggregated feature map into the frame detection model of step (5c) to obtain the detection result (see the sketch below).
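The position-sensitive pooling at the core of R-FCN, with the 7 × 7 maps of step (5b), can be illustrated with torchvision's PSRoIPool operator. The class count, feature-map size and spatial scale below are illustrative assumptions, and the full RPN proposal generation and training of steps (5c)-(5d) are not reproduced.

import torch
from torchvision.ops import PSRoIPool

num_classes, k = 21, 7                      # e.g. 20 target categories + background
# Position-sensitive score maps: k*k channels per class.
score_maps = torch.randn(1, k * k * num_classes, 38, 38)
# Proposals from the RPN as (batch_index, x1, y1, x2, y2) in image
# coordinates; the patent uses 300 per image, two are shown here.
rois = torch.tensor([[0., 10., 10., 200., 200.],
                     [0., 50., 30., 300., 280.]])

ps_pool = PSRoIPool(output_size=7, spatial_scale=1.0 / 16)   # stride-16 features
pooled = ps_pool(score_maps, rois)          # shape: (2, num_classes, 7, 7)
class_scores = pooled.mean(dim=(2, 3))      # vote by averaging over the 7 x 7 bins
print(class_scores.shape)                   # torch.Size([2, 21])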
The step (6) specifically comprises the following steps:
(6a) training the classified training sample frames separately to obtain a detection model for the fuzzy frames of each category captured from the fuzzy videos;
(6b) inputting the video frames of each category into the corresponding detection model, and obtaining detection results for the video frames of each category that contain the spatial, geographic-position and time information;
(6c) marking the detection results in the output, including the position of the target in the frame, the target category, and the target confidence;
(6d) counting the detection results of the training sample frames of each category, the results including the time and geographic-location information of when the targets occur (see the sketch below).
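A minimal sketch of how step (6) might route frames to the per-category detection models and tally results together with their time and location metadata; the record fields and the model interface are assumptions of this sketch.

from collections import defaultdict

def detect_by_category(frames, models):
    # frames: iterable of (feature_map, category, time, location) tuples;
    # models: dict mapping each category to its trained detection model,
    # each returning a list of {"box", "class", "confidence"} dicts.
    stats = defaultdict(list)
    for fmap, category, when, where in frames:
        for det in models[category](fmap):
            stats[det["class"]].append({
                "box": det["box"],
                "confidence": det["confidence"],
                "time": when,
                "location": where,
            })
    return stats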
In summary, compared with the prior art, the method considers not only the features of the video frames but also the video time sequence and related factors such as space, geographic position and weather, and uses the optical flow method to fuse each frame with the frames before and after it. Separate models are established and applied for different geographic positions, weather conditions and similar factors, which improves the detection and recognition of fuzzy video under complex application conditions and raises the detection rate of targets in fuzzy video.

Claims (7)

1. A fuzzy video detection method integrating an optical flow algorithm and deep learning, characterized in that the method comprises the following steps in sequence:
(1) preprocessing training video samples: collecting a plurality of fuzzy videos, together with the time and geographic-location information of the corresponding targets, as training data; manually marking the detection targets in the fuzzy videos; and capturing frames from all marked videos to obtain several categories of targets, each category having a plurality of training samples;
(2) training on the frames obtained in step (1) to obtain a fuzzy video detection model based on a deep learning algorithm; constructing a fuzzy video time-sequence training model, introducing the different shooting times and geographic positions of the fuzzy videos as feature data, and training the fuzzy video detection model based on deep learning fused with the optical flow algorithm; obtaining the feature map of each video frame through the deep learning algorithm;
(3) aggregating the twenty feature maps of the ten frames before and the ten frames after the frame to be detected into one feature map using the optical flow algorithm, with weights between 0 and 1 that follow a normal distribution;
(4) determining the weights in step (3) according to the normal distribution;
(5) constructing a frame-feature-map detection model based on a deep-learning image detection algorithm, and using it to detect the aggregated frame feature map produced by the optical flow computation;
(6) combining the marks of the specific positions of the targets in the fuzzy videos, inputting the space, geographic-position and time information of the video to be detected into the trained frame-feature-map detection model, recognizing and detecting the fuzzy video, and having the computer locate and mark the specific position of the target in the video frame.
2. The fuzzy video detection method integrating an optical flow algorithm and deep learning according to claim 1, wherein in step (1) the preprocessing of the training video samples comprises the following steps:
(1a) collecting a plurality of videos shot blurrily because of rain, snow, fog or night, and classifying them according to the time and geographic-location information of when the corresponding targets occur;
(1b) marking the videos with a video marking tool, which marks the video frame by frame, the marked content being the category of the object in the video;
(1c) using an algorithm to capture frames from the videos classified in step (1a), and storing the captured frames classified by the time and geographic location of when the corresponding targets occur, for training the detection model.
3. The fuzzy video detection method integrating an optical flow algorithm and deep learning according to claim 1, wherein in step (2) training with the obtained frames based on the deep learning algorithm to obtain the fuzzy video detection model and the frame feature maps specifically comprises the following steps:
(2a) adopting ResNet-50, ResNet-101 and GoogLeNet respectively as the networks for obtaining the frame feature maps, so that diverse frame feature maps are formed for detection in the subsequent steps;
(2b) the network structure of ResNet-50 consists of 49 convolutional layers and 1 average-pooling layer; the convolutional layers are divided into 16 blocks, each block having 1 shortcut connection, and finally a softmax layer generates the classification prediction confidence;
(2c) when a frame passes through each of the three networks, its feature map is taken from the output just before the final softmax layer.
4. The fuzzy video detection method integrating an optical flow algorithm and deep learning according to claim 1, wherein step (3) specifically comprises the following steps:
(3a) the frame feature maps provide different information about the target object instances;
(3b) using the optical flow algorithm, the feature map of a specific frame is fused with the feature maps of the five frames before and after it, according to the formula:

f_j = Σ_i w_i · f_i

where f_i is the feature map of frame i, Σ denotes optical-flow aggregation, w_i is the weight with which each adjacent feature map is aggregated, and f_j is the aggregated feature map;
the weight w_i is determined by the following formula:

w_i = (1 / (√(2π) · σ)) · exp(−(z − μ)² / (2σ²))

where z is the distance of the specific frame from an adjacent frame, defined as z = |i − j|, μ is the mean of the normal distribution, and σ is its standard deviation; μ = 0 and σ = 1 are taken;
the optical flow algorithm adopts an optical flow implementation from a computer vision library, and proceeds as follows:
let I(x, y, t) be a pixel, where x, y denote coordinates and t denotes time; the pixel moves by a distance (dx, dy) to the next frame over time dt; assuming the pixel intensity is unchanged over this small interval:

I(x, y, t) = I(x + dx, y + dy, t + dt)

expanding the right-hand side as a Taylor series gives:

I(x + dx, y + dy, t + dt) = I(x, y, t) + (∂I/∂x)·dx + (∂I/∂y)·dy + (∂I/∂t)·dt + ε

where ε denotes the second-order infinitesimal terms; comparing the two equations above gives:

(∂I/∂x)·dx + (∂I/∂y)·dy + (∂I/∂t)·dt = 0

dividing through by dt gives:

(∂I/∂x)·u + (∂I/∂y)·v + ∂I/∂t = 0, where (u, v) = (dx/dt, dy/dt)

and (u, v) is the optical flow vector sought.
5. The fuzzy video detection method integrating an optical flow algorithm and deep learning according to claim 1, wherein step (4) comprises the following steps:
(4a) numbering the eleven frames adjacent to the frame to be detected;
(4b) calculating the weight of each of the eleven frames according to the weight formula

w_i = (1 / (√(2π) · σ)) · exp(−(z − μ)² / (2σ²))

the resulting weights taking values between 0 and 1.
6. The fuzzy video detection method integrating an optical flow algorithm and deep learning according to claim 1, wherein in step (5) constructing the frame-feature-map detection model with the deep-learning image detection algorithm and detecting the frame feature map comprises the following steps:
(5a) using an R-FCN network as the network for detecting the frame feature maps, the detection framework comprising an RPN and the R-FCN;
(5b) the RPN uses 9 anchor boxes and generates 300 proposal boxes per image; the position-sensitive score maps in the R-FCN are 7 × 7;
(5c) training the R-FCN with the training sample frames to obtain a frame detection model for the target categories;
(5d) inputting the aggregated feature map into the frame detection model of step (5c) to obtain the detection result.
7. The fuzzy video detection method integrating an optical flow algorithm and deep learning according to claim 1, wherein step (6) specifically comprises the following steps:
(6a) training the classified training sample frames separately to obtain a detection model for the fuzzy frames of each category captured from the fuzzy videos;
(6b) inputting the video frames of each category into the corresponding detection model, and obtaining detection results for the video frames of each category that contain the spatial, geographic-position and time information;
(6c) marking the detection results in the output;
(6d) counting the detection results of the training sample frames of each category, the results including the time and geographic-location information of when the targets occur.
CN202010342615.8A 2020-04-27 2020-04-27 Fuzzy video detection method integrating optical flow algorithm and deep learning Active CN111476314B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010342615.8A CN111476314B (en) 2020-04-27 2020-04-27 Fuzzy video detection method integrating optical flow algorithm and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010342615.8A CN111476314B (en) 2020-04-27 2020-04-27 Fuzzy video detection method integrating optical flow algorithm and deep learning

Publications (2)

Publication Number Publication Date
CN111476314A true CN111476314A (en) 2020-07-31
CN111476314B CN111476314B (en) 2023-03-07

Family

ID=71762850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010342615.8A Active CN111476314B (en) 2020-04-27 2020-04-27 Fuzzy video detection method integrating optical flow algorithm and deep learning

Country Status (1)

Country Link
CN (1) CN111476314B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112053327A (en) * 2020-08-18 2020-12-08 南京理工大学 Video target detection method and system, storage medium and server
CN115631478A (en) * 2022-12-02 2023-01-20 广汽埃安新能源汽车股份有限公司 Road image detection method, device, equipment and computer readable medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10318842B1 (en) * 2018-09-05 2019-06-11 StradVision, Inc. Learning method, learning device for optimizing parameters of CNN by using multiple video frames and testing method, testing device using the same
CN109993095A (en) * 2019-03-26 2019-07-09 东北大学 A kind of other characteristic aggregation method of frame level towards video object detection
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device
CN110427839A (en) * 2018-12-26 2019-11-08 西安电子科技大学 Video object detection method based on multilayer feature fusion
CN110458756A (en) * 2019-06-25 2019-11-15 中南大学 Fuzzy video super-resolution method and system based on deep learning

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device
US10318842B1 (en) * 2018-09-05 2019-06-11 StradVision, Inc. Learning method, learning device for optimizing parameters of CNN by using multiple video frames and testing method, testing device using the same
CN110427839A (en) * 2018-12-26 2019-11-08 西安电子科技大学 Video object detection method based on multilayer feature fusion
CN109993095A (en) * 2019-03-26 2019-07-09 东北大学 A kind of other characteristic aggregation method of frame level towards video object detection
CN110458756A (en) * 2019-06-25 2019-11-15 中南大学 Fuzzy video super-resolution method and system based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李森 et al.: "Video frame prediction model based on spatio-temporal modeling" (基于时空建模的视频帧预测模型), 《物联网技术》 (Internet of Things Technology) *
邓志新 et al.: "Research and improvement of a video object segmentation algorithm based on a spatio-temporal two-stream fully convolutional network" (基于时空双流全卷积网络的视频目标分割算法研究及改进), 《工业控制计算机》 (Industrial Control Computer) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112053327A (en) * 2020-08-18 2020-12-08 南京理工大学 Video target detection method and system, storage medium and server
CN112053327B (en) * 2020-08-18 2022-08-23 南京理工大学 Video target detection method and system, storage medium and server
CN115631478A (en) * 2022-12-02 2023-01-20 广汽埃安新能源汽车股份有限公司 Road image detection method, device, equipment and computer readable medium

Also Published As

Publication number Publication date
CN111476314B (en) 2023-03-07

Similar Documents

Publication Publication Date Title
CN111310862B (en) Image enhancement-based deep neural network license plate positioning method in complex environment
CN105745687B (en) Context aware Moving target detection
CN109145708B (en) Pedestrian flow statistical method based on RGB and D information fusion
CN104700099A (en) Method and device for recognizing traffic signs
CN108197604A (en) Fast face positioning and tracing method based on embedded device
CN108804992B (en) Crowd counting method based on deep learning
CN107622502A (en) The path extraction of robot vision leading system and recognition methods under the conditions of complex illumination
CN109801297B (en) Image panorama segmentation prediction optimization method based on convolution
CN109685045A (en) A kind of Moving Targets Based on Video Streams tracking and system
CN111476314B (en) Fuzzy video detection method integrating optical flow algorithm and deep learning
CN113435407B (en) Small target identification method and device for power transmission system
CN111476160A (en) Loss function optimization method, model training method, target detection method, and medium
CN113436229A (en) Multi-target cross-camera pedestrian trajectory path generation method
CN112560623A (en) Unmanned aerial vehicle-based rapid mangrove plant species identification method
CN113435452A (en) Electrical equipment nameplate text detection method based on improved CTPN algorithm
CN115841633A (en) Power tower and power line associated correction power tower and power line detection method
CN115100497A (en) Robot-based method, device, equipment and medium for routing inspection of abnormal objects in channel
CN111260687A (en) Aerial video target tracking method based on semantic perception network and related filtering
CN111160100A (en) Lightweight depth model aerial photography vehicle detection method based on sample generation
CN113158954B (en) Automatic detection method for zebra crossing region based on AI technology in traffic offsite
CN116740652A (en) Method and system for monitoring rust area expansion based on neural network model
CN110738229B (en) Fine-grained image classification method and device and electronic equipment
CN113628251B (en) Smart hotel terminal monitoring method
CN110163081A (en) Regional invasion real-time detection method, system and storage medium based on SSD
CN115690770A (en) License plate recognition method based on space attention characteristics in non-limited scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant