CN111353449A - Infrared road image water body detection method based on condition generation countermeasure network - Google Patents

Info

Publication number
CN111353449A
Authority
CN
China
Prior art keywords
network
image
discriminator
water body
mask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010149314.3A
Other languages
Chinese (zh)
Inventor
王欢
汪立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN202010149314.3A
Publication of CN111353449A
Current legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10048 Infrared image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a method for detecting water bodies in infrared road images based on a conditional generative adversarial network, comprising the following steps: acquiring a road image with an infrared camera, scaling it to a specified size, and annotating it to obtain a mask representing the water body positions; constructing a conditional generative adversarial network that adopts the Wasserstein GAN structure, uses a fully convolutional neural network as the generator and a convolutional neural network as the discriminator, and uses the preprocessing function of a reflection attention unit to preprocess the input images of the generator and the discriminator; training the network with the infrared road images and their corresponding masks; and scaling the image to be detected to the corresponding size and inputting it into the trained conditional generative adversarial network, where the output of the generator is a binary image representing the water body detection result. The method detects water body regions on the road surface from infrared road images with high accuracy and recall, and is suitable for related tasks in the field of autonomous driving.

Description

Infrared road image water body detection method based on condition generation countermeasure network
Technical Field
The invention relates to the technical field of road-surface water body detection, and in particular to a method for detecting water bodies in infrared road images based on a conditional generative adversarial network.
Background
In the field of autonomous driving, detecting water on the road surface is a key task. A water-covered area may hide hazards that are hard to detect, such as potholes, and failing to detect it correctly can cause serious damage to an unmanned vehicle. Compared with images from a visible-light camera, infrared road images show a larger difference between a water body region and its surroundings, which makes detection easier. In practice, therefore, an infrared camera is often used to capture road images, and the water body regions in the infrared images are detected with image processing and computer vision methods. The reflective property of water on the road surface complicates the detection task: reflections are less pronounced in infrared images than in visible-light images, but they still exist, and water body detection algorithms based on traditional image processing are therefore prone to false detections or missed detections. In addition, the number and area of water bodies in an image are random and their shapes are irregular, so water body detection in infrared road images is implemented with methods from the image segmentation field. With the development of deep learning and artificial intelligence, image segmentation with deep learning has become common. In image segmentation tasks, deep learning methods achieve higher accuracy and recall and better segmentation results than traditional methods, and are therefore an important way to solve the road-surface water body detection problem.
The paper "Wasserstein GAN" proposes changes to the conventional structure and training procedure of generative adversarial networks, yielding the Wasserstein GAN structure. In addition, conditional generative adversarial networks have been widely applied in image segmentation and often achieve good results, so using a conditional generative adversarial network is an important approach to road-surface water body detection. In the ECCV 2018 paper "Single Image Water Hazard Detection using FCN with Reflection Attention Units", the authors propose a network component for the road-surface water detection problem, the Reflection Attention Unit (RAU). It is based on the observation that the line connecting a reflection on the water surface and the real object is usually close to vertical. The feature map produced during detection can therefore be split horizontally and compared along the vertical direction to judge whether a reflection relationship exists. Used appropriately, the reflection attention unit addresses the water surface reflection problem and improves the performance of a deep learning network.
However, the detection results of the method proposed in that paper tend to be overly smooth and insufficiently fine, so much detail is lost, and small water body regions in particular are easily missed. Moreover, the method mainly targets visible-light images, and its effectiveness on infrared images has not been verified.
Disclosure of Invention
The invention aims to provide a method for detecting water bodies in infrared road images based on a conditional generative adversarial network. The method constructs a conditional generative adversarial network that follows the Wasserstein GAN structure, uses a fully convolutional network as the generator and a convolutional neural network as the discriminator, and uses the preprocessing function of the reflection attention unit to preprocess the input images of the generator and the discriminator.
The technical solution that achieves the purpose of the invention is as follows: a method for detecting water bodies in infrared road images based on a conditional generative adversarial network, comprising the following steps:
step 1, acquiring an infrared road image with an infrared camera, cropping and scaling the road image to a specified size, and obtaining, by annotation, a mask containing the positions of road-surface water bodies in the acquired image;
step 2, constructing a conditional generative adversarial network that, as a whole, adopts the basic structure of Wasserstein GAN; the generator is a fully convolutional neural network following the U-Net model structure; the discriminator is a convolutional neural network; the input images of the generator and the discriminator are preprocessed with the preprocessing function of the reflection attention unit;
step 3, training the conditional generative adversarial network with the acquired infrared road images and their annotated masks;
step 4, scaling the single-channel infrared image to be detected to the specified size and inputting it into the trained conditional generative adversarial network to obtain the binary image, output by the generator, that represents the water body detection result.
Compared with the prior art, the invention has the following notable advantages: (1) incorporating the Wasserstein GAN approach makes the training of the conditional generative adversarial network more stable; in addition, combining the conditional generative adversarial network with the preprocessing function of the reflection attention unit makes road-surface water body detection less affected by water surface reflections and by the imbalance between positive and negative samples, so the detection results have lower false detection and missed detection rates and achieve higher accuracy and recall; (2) compared with water body detection methods that use a classical fully convolutional neural network, the method combines the U-Net structure with the conditional generative adversarial network structure, so the detection results are more refined and details are handled better.
Drawings
FIG. 1 is a flow chart of the preprocessing function of the reflection attention unit used in the present invention.
FIG. 2 is a structural diagram of the conditional generative adversarial network used in the present invention.
FIG. 3 shows the results obtained with the present invention.
Detailed Description
In the course of research and practice on existing methods and theories, the applicant has found that if Wasserstein GAN is combined with the structure of a conditional generative adversarial network to optimize that network, and the preprocessing function of the reflection attention unit is used to preprocess the infrared road images input to the generator and the discriminator, the water body regions in infrared road images can be detected effectively.
The invention provides a method for detecting water bodies in infrared road images based on a conditional generative adversarial network, comprising the following steps:
step 1, acquiring an infrared road image with an infrared camera, cropping and scaling the road image to a specified size, and obtaining, by annotation, a mask containing the positions of road-surface water bodies in the acquired image;
step 2, constructing a conditional generative adversarial network that, as a whole, adopts the basic structure of Wasserstein GAN; the generator is a fully convolutional neural network following the U-Net model structure; the discriminator is a convolutional neural network; the input images of the generator and the discriminator are preprocessed with the preprocessing function of the reflection attention unit;
step 3, training the conditional generative adversarial network with the acquired infrared road images and their annotated masks;
step 4, scaling the single-channel infrared image to be detected to the specified size and inputting it into the trained conditional generative adversarial network to obtain the binary image, output by the generator, that represents the water body detection result.
Further, in step 2, the conditional generative adversarial network follows the basic structure of Wasserstein GAN and meets the following requirements:
1) the output layer of the discriminator network has no activation function;
2) the loss function of the discriminator network is the discriminator's prediction for the generated mask minus its prediction for the real mask;
3) the adversarial loss term of the generator's loss function is the negative of the discriminator's prediction for the corresponding generated mask;
4) after each optimization of the discriminator, all trainable parameter values of the discriminator are clipped so that they lie within a specified interval;
5) the optimizer is a stochastic gradient descent optimizer.
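Requirements 2) to 4) can be illustrated with a minimal NumPy sketch. The function names and the toy parameter arrays are hypothetical and serve only as illustration; the clipping bound is passed in as an argument because the interval is left unspecified at this point in the text (the embodiment described later uses [-0.5, 0.5]).

```python
import numpy as np

def discriminator_loss(y_fake, y_real):
    """Requirement 2): score for the generated mask minus score for the real mask."""
    return y_fake - y_real

def generator_adversarial_loss(y_fake):
    """Requirement 3): negative of the discriminator score for the generated mask."""
    return -y_fake

def clip_parameters(params, bound):
    """Requirement 4): clip every trainable discriminator parameter to [-bound, bound]."""
    return [np.clip(p, -bound, bound) for p in params]

# Toy stand-ins for discriminator weights (hypothetical values).
params = [np.array([-1.2, 0.1, 0.9]), np.array([[0.3, -0.7]])]
clipped = clip_parameters(params, bound=0.5)
print(clipped[0], clipped[1])
```

Because the discriminator output has no activation function (requirement 1), the scores are unbounded real numbers, and clipping is what keeps the discriminator's parameters within a fixed range after each update.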
Further, in step 2, the conditional generative adversarial network has the following structure:
the network comprises a generator network and a discriminator network;
the generator network is a fully convolutional neural network following the U-Net model structure. Its input is a fixed-size real image to be detected or used for training; the image is first preprocessed by the preprocessing function of the reflection attention unit and then fed into the first convolutional layer. The output of the generator is a feature map representing the water body detection result for the input infrared road image, that is, the generated mask, in which a larger pixel value means a higher probability that the pixel at the same position in the original image belongs to a road-surface water body region;
the discriminator network is a convolutional neural network. Its input is either a real mask and the corresponding real image, or a generated mask output by the generator and the corresponding real image. The input real image is preprocessed by the preprocessing function of the reflection attention unit, concatenated with the input real or generated mask in the channel dimension, and then fed into the first convolutional layer. The last layer of the discriminator is a fully connected layer whose output is a single value indicating how likely it is that the input mask is the real mask corresponding to the real image.
Further, in step 2, the preprocessing function of the reflection attention unit is specifically as follows:
as shown in FIG. 1, let h, w and c be the height, width and number of channels of the input feature map I of the preprocessing function. The function first uses mean pooling to reduce the height of the input feature map to n and its width to w/2, keeping the number of channels unchanged; the resulting feature map is denoted X. Each row of X is then split out, and each of the n rows is upsampled into a new feature map with height h, width w and c channels. The n new feature maps are concatenated in the channel dimension to give a feature map with n × c channels, denoted X′. The input feature map I is then expanded n times in the channel dimension, which is equivalent to concatenating n copies of I in the channel dimension, giving a feature map with n × c channels, denoted I′. The difference between X′ and I′ is taken by subtracting elements at corresponding positions, giving a new feature map D. Finally, D is concatenated with I in the channel dimension, and the resulting feature map with (n + 1) × c channels is the output feature map of the preprocessing function.
Further, n is 8 or 16.
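The pooling, row-splitting, upsampling and difference steps can be sketched in NumPy as follows. Several details are assumptions made only for this illustration: pooling bins of near-equal size (so that h need not be divisible by n), nearest-neighbour upsampling, and the sign convention D = I′ − X′. The function name `rau_preprocess` is hypothetical. The sketch stops at the difference map D; any further combination with the input is left out.

```python
import numpy as np

def rau_preprocess(image, n):
    """Return the difference map D of shape (h, w, n*c) for an (h, w, c) input."""
    image = np.asarray(image, dtype=float)
    h, w, c = image.shape
    pooled_w = w // 2
    # Mean pooling down to n rows and w/2 columns (assumed: near-equal bins).
    row_bins = np.array_split(np.arange(h), n)
    col_bins = np.array_split(np.arange(w), pooled_w)
    x = np.empty((n, pooled_w, c))
    for i, rows in enumerate(row_bins):
        for j, cols in enumerate(col_bins):
            x[i, j] = image[rows][:, cols].mean(axis=(0, 1))
    # Stretch each pooled row to the full h x w size (assumed: nearest neighbour).
    col_index = np.arange(w) * pooled_w // w
    stretched = [np.broadcast_to(x[i][col_index], (h, w, c)) for i in range(n)]
    x_prime = np.concatenate(stretched, axis=2)   # X', n*c channels
    i_prime = np.tile(image, (1, 1, n))           # I', n copies of I, n*c channels
    return i_prime - x_prime                      # D (sign convention assumed)

demo = np.arange(8 * 4 * 1, dtype=float).reshape(8, 4, 1)
d = rau_preprocess(demo, n=2)
print(d.shape)
```

Each group of c channels in D compares every pixel of the input with the pooled content of one horizontal band of the image, which is how the vertical relationship between an object and its reflection becomes visible to the following convolutional layers.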
Further, in step 3, training the conditional generative adversarial network comprises:
a) setting the network parameters, randomly initializing the parameters to be trained, and inputting the real training images and their corresponding real masks one by one, where each iteration consists of steps b) to e);
b) inputting the real image into the generator to obtain a generated mask;
c) inputting the real image and the real mask into the discriminator to obtain the discriminator output y_t, and inputting the real image and the generated mask into the discriminator to obtain the discriminator output y_f;
d) computing the generator loss from the generated mask, the real mask and the discriminator output y_f according to the generator's loss function, and computing the discriminator loss from the discriminator outputs y_t and y_f according to the discriminator's loss function;
e) optimizing the network parameters according to the generator and discriminator losses and the network structure;
f) ending the training once all the training data has been used, and saving the network parameters.
Further, in step 4, using the trained conditional generative adversarial network to obtain a binary image representing the water body detection result comprises:
a) scaling the image to be detected to the size expected by the generator and inputting it into the trained conditional generative adversarial network;
b) obtaining the generated mask output by the generator and binarizing it with a threshold, where the threshold is the mean of the pixel value that represents a road-surface water body region and the pixel value that represents a non-water region in the input real masks, that is, the sum of the two possible values divided by 2; the binarized mask is the road-surface water body detection result for the input image.
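The binarization in step b) can be sketched as below. The function name and default values are illustrative assumptions; the defaults of 1 and -1 correspond to the mask mapping used in the embodiment, and treating a pixel exactly at the threshold as non-water is a choice made here, since the text does not specify it.

```python
import numpy as np

def binarize_mask(generated_mask, water_value=1.0, background_value=-1.0):
    """Threshold a generated mask at the midpoint of the two real-mask values."""
    threshold = (water_value + background_value) / 2.0
    water = np.asarray(generated_mask) > threshold  # ties go to background (assumed)
    return np.where(water, water_value, background_value)

generated = np.array([[0.3, -0.2], [0.0, 0.9]])
print(binarize_mask(generated))
```

With masks stored as 0 and 255 instead, the same rule gives a threshold of 127.5, for example `binarize_mask(mask, water_value=255, background_value=0)`.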
The technical solution of the present invention will be described in detail below with reference to the embodiments and the accompanying drawings.
Examples
A method for detecting water bodies in infrared road images based on a conditional generative adversarial network comprises the following steps:
Step 1: acquire an infrared road image with an infrared camera, scale it to a specified size, and obtain, by annotation, a mask containing the positions of road-surface water bodies in the acquired image. The specified size is 640 × 360. The pixels representing water body regions are determined by manual annotation, and a binary image representing the water body positions, namely the mask, is generated; the mask size is also 640 × 360. A region with pixel value 0 represents a non-water region of the original image, and a region with pixel value 255 represents a road-surface water body region of the original image. Each acquired image has a corresponding real mask.
Step 2: construct a conditional generative adversarial network following the basic structure of Wasserstein GAN. Its structure is shown in FIG. 2, and it consists of two parts:
a) The generator network, whose structure comprises:
input: the original picture to be detected, 640 pixels wide and 360 pixels high, with 3 channels;
preprocessing layer: processes the input original picture with the preprocessing function of the reflection attention unit;
convolutional layer 1: 64 kernels of size 5 × 5, stride 2 × 2;
convolutional layer 2: 128 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a leaky ReLU with slope 0.2;
convolutional layer 3: 256 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized;
convolutional layer 4: 512 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a leaky ReLU with slope 0.2;
convolutional layer 5: 512 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a leaky ReLU with slope 0.2;
convolutional layer 6: 512 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a leaky ReLU with slope 0.2;
convolutional layer 7: 512 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a leaky ReLU with slope 0.2;
convolutional layer 8: 512 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a rectified linear unit (ReLU);
deconvolution layer 1: 512 kernels of size 5 × 5, stride 2 × 2; the height and width of the output feature map match the output of convolutional layer 7; the deconvolution output is batch normalized, passed through dropout with probability 0.5, concatenated in the channel dimension with the pre-activation output of convolutional layer 7, and activated by a ReLU;
deconvolution layer 2: 512 kernels of size 5 × 5, stride 2 × 2; the height and width of the output feature map match the output of convolutional layer 6; the deconvolution output is batch normalized, passed through dropout with probability 0.5, concatenated in the channel dimension with the pre-activation output of convolutional layer 6, and activated by a ReLU;
deconvolution layer 3: 512 kernels of size 5 × 5, stride 2 × 2; the height and width of the output feature map match the output of convolutional layer 5; the deconvolution output is batch normalized, passed through dropout with probability 0.5, concatenated in the channel dimension with the pre-activation output of convolutional layer 5, and activated by a ReLU;
deconvolution layer 4: 512 kernels of size 5 × 5, stride 2 × 2; the height and width of the output feature map match the output of convolutional layer 4; the deconvolution output is batch normalized, concatenated in the channel dimension with the pre-activation output of convolutional layer 4, and activated by a ReLU;
deconvolution layer 5: 256 kernels of size 5 × 5, stride 2 × 2; the height and width of the output feature map match the output of reflection attention unit 2; the deconvolution output is batch normalized, concatenated in the channel dimension with the pre-activation output of reflection attention unit 2, and activated by a ReLU;
deconvolution layer 6: 128 kernels of size 5 × 5, stride 2 × 2; the height and width of the output feature map match the output of convolutional layer 2; the deconvolution output is batch normalized, concatenated in the channel dimension with the pre-activation output of convolutional layer 2, and activated by a ReLU;
deconvolution layer 7: 64 kernels of size 5 × 5, stride 2 × 2; the height and width of the output feature map match the output of reflection attention unit 1; the deconvolution output is batch normalized, concatenated in the channel dimension with the pre-activation output of reflection attention unit 1, and activated by a ReLU;
deconvolution layer 8: 1 kernel of size 5 × 5, stride 2 × 2; the height and width of the output feature map match the input image of the generator; the deconvolution result is activated by a hyperbolic tangent function (tanh) and is the output of the generator.
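The spatial sizes through the eight stride-2 convolutional layers can be worked out with a short calculation. The padding scheme is not stated in the text; the sketch below assumes "same"-style padding, under which each stride-2 layer produces a map of size ceil(size / 2). The helper name `encoder_sizes` is hypothetical.

```python
import math

def encoder_sizes(height, width, num_layers, stride=2):
    """List (height, width) before the first layer and after each strided layer."""
    sizes = [(height, width)]
    for _ in range(num_layers):
        # Assumed "same"-style padding: output size is ceil(input size / stride).
        height = math.ceil(height / stride)
        width = math.ceil(width / stride)
        sizes.append((height, width))
    return sizes

sizes = encoder_sizes(360, 640, num_layers=8)
for layer, (h, w) in enumerate(sizes):
    print(f"after conv layer {layer}: {h} x {w}")
```

Under this assumption the height sequence is 360, 180, 90, 45, 23, 12, 6, 3, 2. Because 45 is odd, doubling 23 gives 46 rather than 45, which is one reason each deconvolution layer is specified to match the size of the corresponding encoder output instead of simply doubling its input.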
b) The discriminator network, whose structure comprises:
input: an original picture and a mask picture, 640 pixels wide and 360 pixels high; the original picture has 3 channels and the mask picture has 1 channel;
preprocessing layer: processes the input original picture with the preprocessing function of the reflection attention unit and concatenates the processed picture with the input mask picture in the channel dimension;
convolutional layer 1: 64 kernels of size 5 × 5, stride 2 × 2; the output is activated by a leaky ReLU with slope 0.2;
convolutional layer 2: 128 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a leaky ReLU with slope 0.2;
convolutional layer 3: 256 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a leaky ReLU with slope 0.2;
convolutional layer 4: 512 kernels of size 5 × 5, stride 2 × 2; the output is batch normalized and then activated by a leaky ReLU with slope 0.2;
fully connected layer: the output is a single value; the higher the value, the more likely the input mask is the real mask representing the road-surface water body regions in the input image; this layer has no activation function.
The input of the preprocessing function in the attention reflecting unit is an infrared road image to be detected, which has been adjusted in size and stored as a three-channel image, and is marked as I, and the processing flow of the preprocessing function is shown in fig. 1, and the preprocessing function specifically operates as follows:
Let h, w, and c be the height, width, and number of channels of the input feature map of the reflection attention unit; in this embodiment h = 360, w = 640, and c = 3. The preprocessing function first uses mean pooling to reduce the height of the input feature map to 16 and its width to w/2, i.e. 320, leaving the number of channels unchanged; the resulting feature map is denoted X. Each row of X is then split out, and each of the 16 rows is upsampled to a new feature map of height h (360), width w (640), and c (3) channels. These 16 new feature maps are concatenated along the channel dimension to obtain a new feature map with 16 × c = 48 channels, denoted X'. The input feature map I is then repeated 16 times along the channel dimension, which is equivalent to concatenating 16 copies of I along the channel dimension, to obtain a new feature map with 16 × c = 48 channels, denoted I'.
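The steps above can be sketched in NumPy as follows. Several details are assumptions, not taken from the text: because 360 is not divisible by 16, the height pooling uses adaptive (slightly overlapping) windows; the upsampling is nearest-neighbour; and the function stops at the element-wise difference D between X' and I' that claim 4 describes. The function names are hypothetical.

```python
import numpy as np


def adaptive_mean_pool_rows(img, n):
    """Average-pool the height axis of an (h, w, c) array down to n rows."""
    h = img.shape[0]
    rows = []
    for i in range(n):
        start = (i * h) // n
        end = -(-((i + 1) * h) // n)  # ceiling division on integers
        rows.append(img[start:end].mean(axis=0))
    return np.stack(rows, axis=0)


def reflection_difference(img, n=16):
    """Return D = X' - I' for an (h, w, c) image; w is assumed to be even."""
    h, w, c = img.shape
    assert w % 2 == 0, "this sketch assumes an even width"
    # Halve the width by averaging adjacent column pairs.
    half = img.reshape(h, w // 2, 2, c).mean(axis=2)
    x = adaptive_mean_pool_rows(half, n)          # X: (n, w/2, c)
    maps = []
    for i in range(n):
        row = np.repeat(x[i], 2, axis=0)          # widen row back to (w, c)
        maps.append(np.broadcast_to(row, (h, w, c)))  # copy row to every height
    x_prime = np.concatenate(maps, axis=2)        # X': (h, w, n*c)
    i_prime = np.tile(img, (1, 1, n))             # I': (h, w, n*c)
    return x_prime - i_prime


d = reflection_difference(np.full((360, 640, 3), 5.0), n=16)
```

For a 360 × 640 × 3 input and n = 16, D has 48 channels; each group of 3 channels compares the whole image with one pooled row.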
Step 3: Train the conditional generative adversarial network using the acquired infrared road images and their annotated masks. First, the pixel values of the binary mask are remapped: the value 0 is mapped to -1 and the value 255 is mapped to 1. The trainable parameters of the network are randomly initialized. During training, one training picture and its corresponding real mask are input at a time: the picture is fed to the generator to produce a generated mask; the real picture and the real mask are fed to the discriminator to produce the discrimination result y_t; and the real picture and the generated mask are fed to the discriminator to produce the discrimination result y_f. The loss function of the discriminator can be expressed as y_f - y_t, and the loss function of the generator as -200 × y_f + L_data, where L_data is the data loss term of the generator. L_data is computed by subtracting the real mask from the generated mask, taking the absolute value of the result, and dividing the sum of these absolute values by the total number of pixels of the generated mask, which gives the mean per-pixel distance. In each training iteration the discriminator is optimized once, after which its updated trainable parameters are clipped to the interval [-0.5, 0.5], i.e. parameters greater than 0.5 are set to 0.5 and parameters less than -0.5 are set to -0.5; the generator is then optimized twice. The optimizer is a stochastic gradient descent optimizer, the optimization objective is to minimize the corresponding loss value, and the learning rate is set to 0.0002. After the training pictures and real masks have been cycled through 300 times, the training process ends and the model parameters are saved.
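The quantities used in step 3 can be written out in plain Python as a minimal sketch. Masks are represented here as nested lists and discriminator parameters as a flat list, which is a simplification for illustration; all function names are hypothetical and the sketch contains no actual network or optimizer.

```python
def remap_mask(mask):
    """Map binary mask values 0 -> -1.0 and 255 -> 1.0."""
    return [[1.0 if v == 255 else -1.0 for v in row] for row in mask]


def data_loss(generated, real):
    """Mean absolute per-pixel difference between two equally sized masks."""
    total = 0.0
    count = 0
    for g_row, r_row in zip(generated, real):
        for g, r in zip(g_row, r_row):
            total += abs(g - r)
            count += 1
    return total / count


def discriminator_loss(y_f, y_t):
    """Discriminator loss y_f - y_t."""
    return y_f - y_t


def generator_loss(y_f, generated, real, weight=200.0):
    """Generator loss -200 * y_f + L_data."""
    return -weight * y_f + data_loss(generated, real)


def clip_parameters(params, bound=0.5):
    """Clip every parameter to the interval [-bound, bound]."""
    return [max(-bound, min(bound, p)) for p in params]


real = remap_mask([[0, 255], [255, 0]])
generated = [[-1.0, 1.0], [0.0, -1.0]]
loss_g = generator_loss(0.5, generated, real)
```

In this toy example one of four pixels differs by 1, so L_data is 0.25 and the generator loss for y_f = 0.5 is -100 + 0.25.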
Step 4: Scale the infrared road image to be detected to the specified size, i.e. 640 × 360, and input it to the generator of the trained conditional generative adversarial network. No training takes place at this stage and the network parameters are fixed. The generator outputs an image representing the water body detection result, which is segmented with 0 as the threshold: pixels with values greater than 0 are set to 255 and pixels with values less than or equal to 0 are set to 0, giving a binary image that represents the output result. Pixels with value 255 indicate that the corresponding area of the original image is road-surface water, and pixels with value 0 indicate that the corresponding area is not road-surface water. The original image, its corresponding real mask, and the prediction result are shown in Fig. 3.
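The thresholding in step 4 amounts to a single comparison per pixel. A minimal sketch, assuming the generator output is available as a nested list of floats (the name `binarize` is illustrative):

```python
def binarize(generated_mask, threshold=0.0):
    """Set pixels above the threshold to 255 and all others to 0."""
    return [[255 if v > threshold else 0 for v in row] for row in generated_mask]


result = binarize([[0.3, -0.2], [0.0, 1.0]])
```

A pixel exactly equal to the threshold is treated as non-water, matching the "less than or equal to 0" rule in the text.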

Claims (7)

1. An infrared road image water body detection method based on a conditional generative adversarial network, characterized by comprising the following steps:
step 1, acquiring an infrared road image with an infrared camera, cropping and scaling the road image to a specified size, and obtaining, by annotation, a mask containing the position of the road-surface water in the acquired image;
step 2, constructing a conditional generative adversarial network, wherein the network as a whole adopts the basic structure of the Wasserstein GAN; the generator is a fully convolutional neural network following the U-Net model structure; the discriminator is a convolutional neural network; and the input images of the generator and the discriminator are preprocessed with the preprocessing function of the reflection attention unit;
step 3, training the conditional generative adversarial network using the acquired infrared road images and their annotated masks;
step 4, scaling the single-channel infrared image to be detected to the specified size, inputting it into the trained conditional generative adversarial network, and obtaining the binary image output by the generator of the conditional generative adversarial network, which represents the water body detection result.
2. The infrared road image water body detection method based on a conditional generative adversarial network according to claim 1, wherein in step 2 the conditional generative adversarial network follows the basic structure of the Wasserstein GAN and meets the following requirements:
1) the output layer of the discriminator network has no activation function;
2) the loss function of the discriminator network is the difference between the discriminator's prediction for the generated mask and its prediction for the real mask;
3) the adversarial loss term of the generator network's loss function is the negative of the discriminator's prediction for the corresponding generated mask;
4) after each optimization of the discriminator, all of its trainable parameter values are clipped so that they lie within a specified interval;
5) the optimizer is a stochastic gradient descent optimizer.
3. The infrared road image water body detection method based on a conditional generative adversarial network according to claim 1, wherein in step 2 the conditional generative adversarial network has the following structure:
the network comprises a generator network and a discriminator network;
the generator network is a fully convolutional neural network following the U-Net model structure; its input is a real image of fixed size, to be detected or used for training, which is first preprocessed by the preprocessing function of the reflection attention unit and then fed into the first convolution layer; the output of the generator is a feature map representing the water body detection result for the input infrared road image, i.e. the generated mask, in which a larger pixel value means a higher probability that the pixel at the same position in the original image belongs to a road-surface water area;
the discriminator network is a convolutional neural network; its input is either a real mask together with the corresponding real image, or a generated mask output by the generator together with the corresponding real image; the input real image is preprocessed by the preprocessing function of the reflection attention unit, then concatenated with the input real mask or generated mask along the channel dimension, and then fed into the first convolution layer; the last layer of the discriminator is a fully connected layer whose output is a single value representing the probability that the input mask is the real mask corresponding to the real image.
4. The infrared road image water body detection method based on a conditional generative adversarial network according to claim 1, wherein in step 2 the preprocessing function of the reflection attention unit operates as follows:
let h, w, and c be the height, width, and number of channels of the input feature map I of the preprocessing function of the reflection attention unit; first, mean pooling reduces the height of the input feature map to n and its width to w/2 while keeping the number of channels unchanged, and the resulting feature map is denoted X; each row of X is then split out, and each of the n rows is upsampled to a new feature map of height h, width w, and c channels; the n new feature maps are concatenated along the channel dimension to obtain a new feature map with n × c channels, denoted X'; the input feature map I is then repeated n times along the channel dimension, which is equivalent to concatenating n copies of I along the channel dimension, to obtain a new feature map with n × c channels, denoted I'; the difference between X' and I', i.e. the element-wise subtraction at corresponding positions, is denoted D; and D is concatenated with the input feature map I along the channel dimension to obtain the feature map that is the output of the preprocessing function.
5. The infrared road image water body detection method based on a conditional generative adversarial network according to claim 4, wherein n is 8 or 16.
6. The infrared road image water body detection method based on a conditional generative adversarial network according to claim 1, wherein in step 3 the conditional generative adversarial network is trained as follows:
a) setting the network parameters, randomly initializing the parameters to be trained, and inputting the real training images and their corresponding real masks one at a time, each iteration consisting of steps b) to e);
b) inputting the real image into the generator to obtain a generated mask;
c) inputting the real image and the real mask into the discriminator to obtain the discriminator output y_t, and inputting the real image and the generated mask into the discriminator to obtain the discriminator output y_f;
d) computing the generator loss from the generated mask, the real mask, and the discriminator output y_f according to the generator's loss function, and computing the discriminator loss from the discriminator outputs y_t and y_f according to the discriminator's loss function;
e) optimizing the network parameters according to the generator and discriminator losses and the network structure;
f) after all the training data has been used, ending the training and saving the network parameters.
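The control flow of steps a) to f) can be sketched as a loop over hypothetical update hooks. The claim itself does not fix how often each network is updated; the defaults below (one discriminator update followed by parameter clipping, then two generator updates, repeated for 300 passes over the data) follow the schedule given in step 3 of the description. The callables `d_step`, `clip_step`, and `g_step` are placeholders assumed for illustration and are expected to compute the losses and apply the optimizer themselves.

```python
def train(samples, d_step, clip_step, g_step, epochs=300, g_steps_per_iter=2):
    """Run the training schedule over (image, real_mask) pairs.

    d_step, clip_step and g_step are hypothetical hooks supplied by the
    caller; this function only fixes the order in which they are called.
    """
    for _epoch in range(epochs):
        for image, real_mask in samples:
            d_step(image, real_mask)       # one discriminator update
            clip_step()                    # clip discriminator parameters
            for _k in range(g_steps_per_iter):
                g_step(image, real_mask)   # generator update


# Record the call order for a single sample and two passes over the data.
log = []
train(
    [("image", "mask")],
    d_step=lambda image, mask: log.append("d"),
    clip_step=lambda: log.append("clip"),
    g_step=lambda image, mask: log.append("g"),
    epochs=2,
)
```

Keeping the schedule separate from the update hooks makes it easy to vary the number of generator updates per iteration without touching the loss code.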
7. The infrared road image water body detection method based on a conditional generative adversarial network according to claim 1, wherein in step 4 the binary image representing the water body detection result is obtained with the trained conditional generative adversarial network as follows:
a) scaling the image to be detected to the input size accepted by the generator, and inputting the scaled image into the trained conditional generative adversarial network;
b) obtaining the generated mask produced by the generator and binarizing it with a threshold, wherein the threshold is the average of the pixel value that represents a road-surface water area and the pixel value that represents a non-road-surface-water area in the input real mask, i.e. the sum of the two possible values divided by 2; the binarized mask is the road-surface water detection result for the input image.
CN202010149314.3A 2020-03-03 2020-03-03 Infrared road image water body detection method based on condition generation countermeasure network Withdrawn CN111353449A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010149314.3A CN111353449A (en) 2020-03-03 2020-03-03 Infrared road image water body detection method based on condition generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010149314.3A CN111353449A (en) 2020-03-03 2020-03-03 Infrared road image water body detection method based on condition generation countermeasure network

Publications (1)

Publication Number Publication Date
CN111353449A true CN111353449A (en) 2020-06-30

Family

ID=71194299

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010149314.3A Withdrawn CN111353449A (en) 2020-03-03 2020-03-03 Infrared road image water body detection method based on condition generation countermeasure network

Country Status (1)

Country Link
CN (1) CN111353449A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507849A (en) * 2020-12-04 2021-03-16 东南大学 Dynamic-to-static scene conversion method for generating countermeasure network based on conditions
CN115588039A (en) * 2022-12-12 2023-01-10 易斯德(福建)智能科技有限公司 Luminosity stereogram generation method and device based on light ray adaptive counterstudy

Similar Documents

Publication Publication Date Title
CN111582201B (en) Lane line detection system based on geometric attention perception
CN110119728B (en) Remote sensing image cloud detection method based on multi-scale fusion semantic segmentation network
CN110348376B (en) Pedestrian real-time detection method based on neural network
CN109859190B (en) Target area detection method based on deep learning
CN107045629B (en) Multi-lane line detection method
CN111666921B (en) Vehicle control method, apparatus, computer device, and computer-readable storage medium
CN110866455B (en) Pavement water body detection method
CN110889449A (en) Edge-enhanced multi-scale remote sensing image building semantic feature extraction method
CN108830171B (en) Intelligent logistics warehouse guide line visual detection method based on deep learning
CN110309808B (en) Self-adaptive smoke root node detection method in large-scale space
CN109840483B (en) Landslide crack detection and identification method and device
CN109117788A (en) A kind of public transport compartment crowding detection method merging ResNet and LSTM
CN113780132B (en) Lane line detection method based on convolutional neural network
CN111915628B (en) Single-stage instance segmentation method based on prediction target dense boundary points
CN110717886A (en) Pavement pool detection method based on machine vision in complex environment
CN111160293A (en) Small target ship detection method and system based on characteristic pyramid network
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
CN110210493A (en) Profile testing method and system based on non-classical receptive field modulation neural network
CN111353449A (en) Infrared road image water body detection method based on condition generation countermeasure network
CN113011338A (en) Lane line detection method and system
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
CN112836573A (en) Lane line image enhancement and completion method based on confrontation generation network
CN114332921A (en) Pedestrian detection method based on improved clustering algorithm for Faster R-CNN network
CN115527133A (en) High-resolution image background optimization method based on target density information
CN115661860A (en) Method, device and system for dog behavior and action recognition technology and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200630
