CN114511644B - Adaptive digital camouflage method based on deep learning


Info

Publication number
CN114511644B
Authority
CN
China
Prior art keywords
convolution
camouflage
gating
deep learning
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210069098.0A
Other languages
Chinese (zh)
Other versions
CN114511644A (en)
Inventor
张静
霍一博
杜晓辉
刘娟秀
刘霖
刘永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202210069098.0A
Publication of CN114511644A
Application granted
Publication of CN114511644B
Legal status: Active
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 — Image analysis
    • G06T 7/20 — Analysis of motion
    • G06T 7/90 — Determination of colour characteristics
    • G06T 2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 — Image acquisition modality
    • G06T 2207/10016 — Video; image sequence
    • G06T 2207/20 — Special algorithmic details
    • G06T 2207/20081 — Training; learning
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/045 — Combinations of networks
    • G06N 3/048 — Activation functions
    • G06N 3/08 — Learning methods


Abstract

The invention discloses an adaptive digital camouflage method based on deep learning, belonging to the field of image processing. A camera or similar device continuously scans the surrounding background, and the method uses spatial information together with the relationship between preceding and following video frames to rapidly estimate the state of the background at the camouflage target's position, so that real-time pixel-level camouflage is applied to the target and visual stealth is achieved. The technique camouflages both stationary and moving objects well, which is of great significance for nature exploration, covert reconnaissance, national security, and scientific research.

Description

Adaptive digital camouflage method based on deep learning
Technical Field
The invention belongs to the field of image processing, and in particular relates to a technique that computes the optimal camouflage color for a camouflage target from the color information of the surrounding environment, giving the target strong concealment.
Background
Conventional digital camouflage generation is cumbersome: digital camouflage patches are typically produced by extracting the dominant background colors and the patch layout, and the generated patches are then color-filled to complete the process. Although digital camouflage generated this way is widely used, the persistence-of-vision effect of the human eye means it cannot conceal an object well while the object is moving. Adaptive digital camouflage compensates for this defect, so that a camouflaged object remains well concealed during motion. The invention proposes a deep-learning-based adaptive digital camouflage algorithm that realizes end-to-end generation of adaptive digital camouflage. The proposed method uses a deep neural network to extract temporal and spatial information from reference frames, achieving good camouflage while the camouflaged object is in motion.
Disclosure of Invention
The invention aims to solve the problem that digital camouflage cannot keep an object well camouflaged while the object moves. To this end, environmental information around the target is acquired with a camera or similar device, and the camouflage color most similar to the surroundings is computed for the moving target at every moment, achieving high speed and efficiency and laying a solid foundation for further research on adaptive digital camouflage.
In order to achieve the above purpose, the invention adopts the following technical scheme: an adaptive digital camouflage method based on deep learning, the method comprising:
step 1: shooting an environmental information video around the current target to be camouflaged with a camera, and obtaining the video images closest to the current time;
step 2: making mask information for the six sampled video frames (see the further refinement below), so that the camouflage target is black in the mask images and the environment information is white;
step 3: sending the color information obtained in step 1 and the mask information obtained in step 2 as inputs to a deep-learning-based adaptive camouflage network;
step 4: computing on the input obtained in step 3 with the weights of the pre-trained deep learning neural network to obtain the result of the deep-learning-based adaptive camouflage algorithm;
step 5: extracting camouflage color information of the current frame from the result, and performing color camouflage using the color-changing metamaterial and flexible OLED of the target to be camouflaged;
step 6: after camouflage at the current moment is finished, repeating steps 1-5 until the camouflage task is complete;
the deep-learning-based adaptive camouflage network comprises an encoder and a decoder, 17 layers in total; the encoder has 12 layers, namely 6 serial space-time gated convolution modules with different convolution kernels, 4 serial dilated (hole) convolution modules, and 2 serial space-time gated convolution modules; the decoder has 5 layers, in order a transposed convolution module, a gated convolution module, a transposed convolution module, a gated convolution module, and a gated convolution module;
the space-time gated convolution module obtains its result by combining gated convolution and space-time convolution on the input consecutive frame images; in the gated convolution, one branch applies a convolution operation to the original input, the other branch applies the same convolution followed by a sigmoid activation function for feature selection, and the results of the two branches are multiplied to give the output of the space-time gated convolution; the space-time convolution shifts the first quarter of the channels in the consecutive frames back by one frame, zero-filling the vacated first frame, shifts the next quarter forward by one frame, zero-filling the vacated last frame, and leaves the remaining half unchanged; two-dimensional convolution then extracts features from the processed frames, after which the feature information is merged;
the dilated convolution module uses its convolution kernel to sample the input consecutive frame images at equal intervals;
the transposed convolution module zero-pads the four sides of the input image, rotates the convolution kernel by 180 degrees, and then performs convolution.
Further, in step 1, color information of the surrounding environment is obtained for the sixty frames closest to the current time; these sixty frames are sampled at equal intervals of ten frames, finally yielding six video frames, which are used for subsequent processing.
In order to meet the real-time requirement, the invention replaces 3D convolution with space-time gated convolution, which effectively reduces the number of operation parameters. In the second half of the neural network, 2D transposed convolution and gated convolution modules decode only the current frame, which minimizes the parameter count of the network and reduces computation time. In experiments on a computer with an AMD Ryzen 5 3600 CPU @ 3.60 GHz, an NVIDIA GeForce GTX 1660 Ti GPU, and 16 GB RAM, the method runs at more than 12 frames per second.
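As a rough illustration of the savings (the channel count and kernel sizes below are assumptions, not taken from the patent), the temporal shift itself is parameter-free, so one space-time gated convolution layer costs two 2D convolutions against one 3D convolution:

```python
import torch.nn as nn

# Hypothetical comparison: one 3x3x3 3D convolution versus the two 2D
# convolutions (feature + gating branches) of a space-time gated layer.
c = 64
conv3d = nn.Conv3d(c, c, kernel_size=3)
gated2d = nn.ModuleList([nn.Conv2d(c, c, 3), nn.Conv2d(c, c, 3)])

p3d = sum(p.numel() for p in conv3d.parameters())
p2d = sum(p.numel() for m in gated2d for p in m.parameters())
print(p3d, p2d)  # 110656 vs 73856: roughly a third fewer parameters per layer
```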
Drawings
Fig. 1 is a flow chart of an adaptive digital camouflage technique.
Fig. 2 is a deep learning network structure of the present invention.
Fig. 3 is an image of camouflage results from one experiment of the invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples, but embodiments of the invention are not limited thereto.
The algorithm flow chart of the invention is shown in Fig. 1. The deep-learning-based adaptive digital camouflage method specifically comprises the following steps:
step 1: shooting an environmental information video around the current target to be camouflaged with a camera, acquiring the color information of the surrounding environment for the sixty frames closest to the current time, and sampling those sixty frames at equal intervals of ten frames, finally obtaining six video frames;
step 2: making mask information for the six video frames, so that the camouflage target is black (pixel value 0) and the environment information is white (pixel value 255) in the mask images;
step 3: sending the color information obtained in step 1 and the mask information obtained in step 2 as inputs to the deep-learning-based adaptive camouflage algorithm;
step 4: computing on the input obtained in step 3 with the weights of the pre-trained deep learning neural network to obtain the result of the deep-learning-based adaptive camouflage algorithm;
step 5: extracting camouflage color information of the current frame from the result, and performing color camouflage using the color-changing metamaterial and flexible OLED of the target to be camouflaged;
step 6: after camouflage at the current moment is finished, repeating steps 1-5 until the camouflage task is complete. A minimal sketch of this loop follows.
The deep-learning-based adaptive camouflage algorithm used in step 3 of the specific embodiment is described in detail below:
a. Introduction to the overall structure of the algorithm. The algorithm is an end-to-end convolutional neural network composed mainly of dilated convolution, space-time gated convolution, and transposed convolution. It adopts an encoder-decoder structure with 17 layers in total: the encoder has 12 layers, namely 6 serial space-time gated convolution modules with different convolution kernels, 4 serial dilated convolution modules, and 2 serial space-time gated convolution modules; the decoder has 5 layers, namely a transposed convolution module, a gated convolution module, a transposed convolution module, a gated convolution module, and a gated convolution module.
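A layout-only sketch in PyTorch, with plain convolutions standing in for the real modules (channel width, kernel sizes, and dilation rates are assumptions; the actual modules are described in sections b-d below):

```python
import torch.nn as nn

# 12 encoder layers (6 + 4 + 2) plus 5 decoder layers = 17 layers in total.
def build_layout(c=64):
    encoder = nn.Sequential(
        # 6 space-time gated convolutions with differing kernels (stand-ins)
        *[nn.Conv2d(c, c, k, padding=k // 2) for k in (7, 5, 5, 3, 3, 3)],
        # 4 dilated convolutions (stand-ins)
        *[nn.Conv2d(c, c, 3, dilation=d, padding=d) for d in (2, 4, 8, 16)],
        # 2 space-time gated convolutions (stand-ins)
        *[nn.Conv2d(c, c, 3, padding=1) for _ in range(2)],
    )
    decoder = nn.Sequential(
        nn.ConvTranspose2d(c, c, 4, stride=2, padding=1),  # transposed conv
        nn.Conv2d(c, c, 3, padding=1),                     # gated conv (stand-in)
        nn.ConvTranspose2d(c, c, 4, stride=2, padding=1),  # transposed conv
        nn.Conv2d(c, c, 3, padding=1),                     # gated conv (stand-in)
        nn.Conv2d(c, 3, 3, padding=1),                     # gated conv -> RGB output
    )
    return nn.Sequential(encoder, decoder)
```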
b. Introduction to dilated (hole) convolution
Dilated convolution, also called hole convolution, is a convolution in which the kernel samples the input matrix at equal intervals. It has two main benefits: first, it enlarges the receptive field without reducing spatial resolution; second, it captures multi-scale context information.
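A minimal PyTorch illustration (sizes are arbitrary): with dilation d and padding d, a 3x3 kernel covers a (2d+1) x (2d+1) window while the spatial resolution is preserved:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 32, 32)
conv = nn.Conv2d(64, 64, kernel_size=3, dilation=4, padding=4)  # samples every 4th pixel
y = conv(x)
print(y.shape)  # torch.Size([1, 64, 32, 32]): same resolution, 9x9 effective window
```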
c. Introduction to space-time gated convolution
Space-time gated convolution consists of a space-time (temporal-shift) convolution part and a gating part. The space-time convolution shifts the first quarter of the channel information back by one frame, zero-filling the vacated first frame; it shifts the next quarter forward by one frame, zero-filling the vacated last frame; and it leaves the remaining half unchanged. Two-dimensional convolution then extracts features, and the feature information is merged. The gated convolution splits the input into two branches: one branch applies the space-time convolution described above, the other applies the same convolution followed by a sigmoid activation function for feature selection, and the results of the two branches are multiplied to give the output of the space-time gated convolution:
$$\mathrm{Gating}_{y,x} = \sum\sum W_g * I$$
$$\mathrm{TsmFeature}_{y,x} = \sum\sum W_f * I$$
$$\mathrm{Output}_{y,x} = \phi(\mathrm{TsmFeature}_{y,x}) \odot \sigma(\mathrm{Gating}_{y,x})$$
where $\sigma$ is the sigmoid function, mapping gating values into the range (0, 1), with 0 marking an invalid position and 1 a valid position; $\phi$ is an activation function such as ReLU or ELU; and $W_f$ and $W_g$ are two different spatio-temporal convolution kernels.
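A sketch of the whole module under the shift scheme and gating product above, assuming PyTorch and a (batch, time, channel, height, width) layout; the class name, the exact channel split, and the ELU choice for phi are assumptions:

```python
import torch
import torch.nn as nn

class SpatioTemporalGatedConv(nn.Module):
    """Temporal shift of the channels followed by a gated 2D convolution."""

    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.feature = nn.Conv2d(c_in, c_out, k, padding=k // 2)  # W_f branch
        self.gating = nn.Conv2d(c_in, c_out, k, padding=k // 2)   # W_g branch
        self.act = nn.ELU()                                       # phi

    @staticmethod
    def temporal_shift(x):
        # x: (B, T, C, H, W). First quarter of channels shifted back one frame,
        # next quarter shifted forward one frame, vacated frames zero-filled,
        # remaining half unchanged.
        q = x.shape[2] // 4
        out = x.clone()
        out[:, 1:, :q] = x[:, :-1, :q]
        out[:, 0, :q] = 0
        out[:, :-1, q:2 * q] = x[:, 1:, q:2 * q]
        out[:, -1, q:2 * q] = 0
        return out

    def forward(self, x):
        b, t, c, h, w = x.shape
        x = self.temporal_shift(x).reshape(b * t, c, h, w)
        out = self.act(self.feature(x)) * torch.sigmoid(self.gating(x))
        return out.reshape(b, t, -1, h, w)

x = torch.randn(1, 6, 4, 32, 32)        # six RGB+mask frames, as in the method
y = SpatioTemporalGatedConv(4, 64)(x)
print(y.shape)                           # torch.Size([1, 6, 64, 32, 32])
```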
d. Introduction to transposed convolution
Transposed convolution is a special convolution whose output can be larger than its input; it is often used to upsample images. It zero-pads the four sides of the input, rotates the convolution kernel by 180 degrees, and then performs an ordinary convolution to obtain the final output.
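The stride-1 case can be checked numerically: the pad-and-rotate procedure above reproduces PyTorch's built-in transposed convolution (a sketch assuming a single channel and a 3x3 kernel):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 5, 5)
k = torch.randn(1, 1, 3, 3)

# Zero-pad by kernel_size - 1 on all four sides, rotate the kernel 180 degrees,
# then apply an ordinary convolution.
manual = F.conv2d(F.pad(x, (2, 2, 2, 2)), torch.rot90(k, 2, dims=(2, 3)))
builtin = F.conv_transpose2d(x, k)
print(manual.shape, torch.allclose(manual, builtin, atol=1e-5))  # 7x7 output, True
```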

Claims (2)

1. An adaptive digital camouflage method based on deep learning, the method comprising:
step 1: shooting an environmental information video around the current target to be camouflaged with a camera, and obtaining the video images closest to the current time;
step 2: making mask information for the video images, so that the camouflage target is black in the mask images and the environment information is white;
step 3: sending the video images obtained in step 1 and the mask information obtained in step 2 as inputs to a deep-learning-based adaptive camouflage network;
step 4: computing on the input obtained in step 3 with the weights of the pre-trained deep learning neural network to obtain the result of the deep-learning-based adaptive camouflage algorithm;
step 5: extracting camouflage color information of the current frame from the result, and performing color camouflage using the color-changing metamaterial and flexible OLED of the target to be camouflaged;
step 6: after camouflage at the current moment is finished, repeating steps 1-5 until the camouflage task is complete;
wherein the deep-learning-based adaptive camouflage network comprises an encoder and a decoder, 17 layers in total; the encoder has 12 layers, namely 6 serial space-time gated convolution modules with different convolution kernels, 4 serial dilated convolution modules, and 2 serial space-time gated convolution modules; the decoder has 5 layers, in order a transposed convolution module, a gated convolution module, a transposed convolution module, a gated convolution module, and a gated convolution module;
the space-time gated convolution module obtains its result by combining gated convolution and space-time convolution on the input consecutive frame images; in the gated convolution, one branch applies a convolution operation to the original input, the other branch applies the same convolution followed by a sigmoid activation function for feature selection, and the results of the two branches are multiplied to give the output of the space-time gated convolution; the space-time convolution shifts the first quarter of the channels in the consecutive frames back by one frame, zero-filling the vacated first frame, shifts the next quarter forward by one frame, zero-filling the vacated last frame, and leaves the remaining half unchanged; two-dimensional convolution then extracts features from the processed frames, after which the feature information is merged;
the dilated convolution module uses its convolution kernel to sample the input consecutive frame images at equal intervals;
the transposed convolution module zero-pads the four sides of the input image, rotates the convolution kernel by 180 degrees, and then performs convolution.
2. The adaptive digital camouflage method based on deep learning of claim 1, wherein in step 1 the color information of the surrounding environment is obtained for the sixty frames closest to the current time; these sixty frames are sampled at equal intervals of ten frames, finally yielding six video frames, which are used for subsequent processing.
CN202210069098.0A (filed 2022-01-21; priority 2022-01-21) — Adaptive digital camouflage method based on deep learning — granted as CN114511644B, status Active

Priority Applications (1)

CN202210069098.0A — Priority date: 2022-01-21 — Filing date: 2022-01-21 — Title: Adaptive digital camouflage method based on deep learning

Applications Claiming Priority (1)

CN202210069098.0A — Priority date: 2022-01-21 — Filing date: 2022-01-21 — Title: Adaptive digital camouflage method based on deep learning

Publications (2)

CN114511644A (en) — published 2022-05-17
CN114511644B (en) — published 2023-06-09

Family

ID=81549781

Family Applications (1)

CN202210069098.0A (Active; granted as CN114511644B) — Priority date: 2022-01-21 — Filing date: 2022-01-21 — Title: Adaptive digital camouflage method based on deep learning

Country Status (1)

CN — CN114511644B

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104376236A (en) * 2014-12-02 2015-02-25 上海出版印刷高等专科学校 Scheme self-adaptive digital watermark embedding and extracting method based on camouflage technology
CN110245598A (en) * 2019-06-06 2019-09-17 Adversarial example generation method, apparatus, medium, and computing device
WO2020037965A1 (en) * 2018-08-21 2020-02-27 北京大学深圳研究生院 Method for multi-motion flow deep convolutional network model for video prediction
CN112733744A (en) * 2021-01-14 2021-04-30 北京航空航天大学 Camouflage object detection model based on edge cooperative supervision and multi-level constraint
CN113538505A (en) * 2021-06-17 2021-10-22 杭州电子科技大学 Motion estimation system and method of single picture based on deep learning
WO2021244079A1 (en) * 2020-06-02 2021-12-09 苏州科技大学 Method for detecting image target in smart home environment

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4143655B2 (en) * 2005-11-24 2008-09-03 キヤノン株式会社 Image processing apparatus, image processing method, program, storage medium, and system
CN104835104B (en) * 2014-02-07 2017-11-03 湖南第一师范学院 Digital picture active Camouflage project based on point shape
GB2532458B (en) * 2014-11-19 2018-04-18 Folium Optics Ltd Adaptive camouflage device, system and method
US10282628B1 (en) * 2016-04-08 2019-05-07 Joshua LaRue System for producing camouflage pattern based upon perception machine vision
US10776923B2 (en) * 2018-06-21 2020-09-15 International Business Machines Corporation Segmenting irregular shapes in images using deep region growing
WO2020215236A1 (en) * 2019-04-24 2020-10-29 哈尔滨工业大学(深圳) Image semantic segmentation method and system
CN110348319B (en) * 2019-06-18 2021-05-04 武汉大学 Face anti-counterfeiting method based on face depth information and edge image fusion
CN111292220B (en) * 2020-01-19 2021-09-28 西北工业大学 Target camouflage image generation method for target image recognition of synthetic aperture radar
CN111142081B (en) * 2020-04-07 2021-02-26 湖南赛博诺格电子科技有限公司 Radar self-adaptive camouflage method based on digital coding super surface
CN112288647B (en) * 2020-10-13 2022-06-03 武汉大学 Remote sensing image cloud and shadow restoration method based on gating convolution
CN112446357B (en) * 2020-12-15 2022-05-03 电子科技大学 SAR automatic target recognition method based on capsule network
CN112801895B (en) * 2021-01-15 2022-10-14 南京邮电大学 Two-stage attention mechanism-based GAN network image restoration algorithm
CN112801914A (en) * 2021-02-09 2021-05-14 北京工业大学 Two-stage image restoration method based on texture structure perception
CN113450561B (en) * 2021-05-06 2022-05-24 浙江工业大学 Traffic speed prediction method based on space-time graph convolution-generation countermeasure network
CN113507625B (en) * 2021-06-22 2022-08-19 西安理工大学 Self-adaptive video restoration method
CN113298736B (en) * 2021-06-24 2022-03-04 河北工业大学 Face image restoration method based on face pattern


Also Published As

Publication number Publication date
CN114511644A (en) 2022-05-17

Similar Documents

Publication Publication Date Title
CN106683067B (en) Deep learning super-resolution reconstruction method based on residual sub-images
CN109544448B (en) Group network super-resolution image reconstruction method of Laplacian pyramid structure
Zhao et al. Invertible image decolorization
CN113284051B (en) Face super-resolution method based on frequency decomposition multi-attention machine system
CN113096017A (en) Image super-resolution reconstruction method based on depth coordinate attention network model
CN110717868A (en) Video high dynamic range inverse tone mapping model construction and mapping method and device
CN113658057A Swin Transformer low-light image enhancement method
CN112288630A (en) Super-resolution image reconstruction method and system based on improved wide-depth neural network
Wang et al. Underwater image super-resolution and enhancement via progressive frequency-interleaved network
CN116188274A (en) Image super-resolution reconstruction method
CN113379606B (en) Face super-resolution method based on pre-training generation model
CN114511644B (en) Adaptive digital camouflage method based on deep learning
CN112435165A (en) Two-stage video super-resolution reconstruction method based on generation countermeasure network
CN113313133A (en) Training method for generating countermeasure network and animation image generation method
CN111861877A (en) Method and apparatus for video hyper-resolution
Zhang et al. Research on image super-resolution reconstruction based on deep learning
CN115512039A (en) 3D face construction method
CN116266336A (en) Video super-resolution reconstruction method, device, computing equipment and storage medium
CN115294199A (en) Underwater image enhancement and depth estimation method, device and storage medium
CN115375540A (en) Terahertz image super-resolution method based on deep learning algorithm
CN114582002A (en) Facial expression recognition method combining attention module and second-order pooling mechanism
Xu et al. Joint super-resolution and inverse tone-mapping: A feature decomposition aggregation network and a new benchmark
CN117635478B (en) Low-light image enhancement method based on spatial channel attention
Terai et al. Comparative study of Video/Image denoising algorithms based on Convolutional neural network CNN
CN117834899A (en) Video snapshot compression imaging method and system based on reconstruction difficulty perception

Legal Events

Date Code Title Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant