CN112258537B - Method for supervised scotopic vision image edge detection based on convolutional neural network - Google Patents

Method for supervised scotopic vision image edge detection based on convolutional neural network

Info

Publication number
CN112258537B
CN112258537B (application CN202011161185.6A; published as CN112258537A)
Authority
CN
China
Prior art keywords
image
convolution
edge detection
size
edge
Prior art date
Legal status
Active
Application number
CN202011161185.6A
Other languages
Chinese (zh)
Other versions
CN112258537A
Inventor
赵志强
张琴
陶于祥
陈阔
钱鹰
黄颖
何帆
王少志
徐晓文
Current Assignee
Chongqing University of Posts and Telecommunications
Original Assignee
Chongqing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Posts and Telecommunications
Priority to CN202011161185.6A
Publication of CN112258537A
Application granted
Publication of CN112258537B
Legal status: Active

Classifications

    • G06T 7/13 — Edge detection
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 3/045 — Combinations of networks
    • G06T 3/4038 — Image mosaicing, e.g. composing plane images from plane sub-images
    • G06T 7/90 — Determination of colour characteristics
    • G06T 2200/32 — Indexing scheme involving image mosaicing
    • G06T 2207/20081 — Training; Learning
    • G06T 2207/20084 — Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the field of image processing and particularly relates to a method for supervised scotopic vision image edge detection based on a convolutional neural network, comprising: acquiring a scotopic vision image and inputting it into a trained supervised image edge detection model to perform scotopic vision image edge detection, obtaining an edge detection result. The supervised image edge detection model is an optimized model composed of six edge detection modules and a splicing module. By deepening the convolutional layers of the model and introducing residual structure units, the invention better retains the features and edge-detail information learned in the earlier stage, so that the trained optimized model further improves the edge detection effect, strengthens the continuity of image edges, and outputs edge detection images that better match human visual perception.

Description

Method for supervised scotopic vision image edge detection based on convolutional neural network
Technical Field
The invention belongs to the fields of image processing, computer vision and deep learning, and particularly relates to a method for supervised scotopic vision image edge detection based on a convolutional neural network.
Background
Edge detection is a classic problem in the image and computer vision field and a preliminary step for many tasks, such as image segmentation and image recognition; it is widely applied in many fields, such as medical imaging and image localization. Although edge detection methods and algorithms have achieved some success, the problem is still worth further exploration.
The original purpose of image edge detection is to extract the edge or contour of an object to facilitate subsequent applications. However, extracting edges and retaining detail information are often contradictory goals, and the balance between them varies by task, so different edge detection schemes need to be designed for different target tasks. In deep learning, a relatively uniform distribution of samples in the training set is usually required to obtain a good training result. In most cases, however, edge pixels occupy only a small portion of an image, which leads to an extreme imbalance between positive and negative samples in the training set and makes training a deep learning model difficult. In particular, images captured in a scotopic vision environment have low contrast, which poses an even more serious challenge to edge extraction.
Disclosure of Invention
In order to solve the problems of the prior art, the invention provides a method for detecting the edge of a supervised scotopic vision image based on a convolutional neural network, which comprises the following steps: acquiring a scotopic vision image, and inputting the scotopic vision image into a trained optimized supervised image edge detection model to perform scotopic vision image edge detection to obtain an edge detection result; the optimized supervised image edge detection model consists of six edge detection modules and a splicing module;
the process of training the optimization supervision image edge detection model comprises the following steps:
s1: acquiring a scotopic vision original image data set and an edge labeling image data set of the same scene under a normal illumination condition, dividing the data sets into a training set and a testing set, and simultaneously performing data enhancement processing on a scotopic vision original image in the training set and an edge labeling image of the same scene under the normal illumination condition to obtain an amplified training sample set;
s2: inputting the images of the augmented training set into the optimized supervised image edge detection model to obtain the intermediate effect maps (side outputs) of the six edge detection modules;
s3: splicing the six side-output effect maps, converting the spliced result into a three-dimensional image, and convolving the three-dimensional image to obtain the edge detection image;
s4: calculating a loss function of the optimized supervised image edge detection model, and calculating the error between the obtained edge detection image in the training process and the edge marking image of the same scene under the normal illumination condition according to the loss function;
s5: continuously adjusting the weight of the loss function, and saving the training weight parameter of the model when the value of the loss function is minimum;
s6: inputting the data in the test set into an optimized supervised image edge detection model for testing;
s7: and outputting an edge detection result, and finishing the model training.
Preferably, the scotopic vision image data set is constructed as follows: the R, G and B channels of each image under normal illumination are extracted to obtain images R1, G1 and B1; the gray levels of R1, G1 and B1 are linearly mapped to the range 0-47 to obtain images R2, G2 and B2; R2, G2 and B2 are recombined to obtain an image in a dark-vision environment; and the recombined images are collected to form the scotopic vision image data set.
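The channel-scaling procedure above can be sketched in a few lines (a minimal numpy sketch; the 0-47 range comes from the text, while the function name and the rounding choice are our assumptions):

```python
import numpy as np

def simulate_scotopic(image: np.ndarray) -> np.ndarray:
    """Linearly compress an 8-bit RGB image's gray levels from 0-255
    into 0-47 per channel, simulating a dark-vision (scotopic) scene."""
    # Split into R, G, B channels (R1, G1, B1 in the text).
    r1, g1, b1 = image[..., 0], image[..., 1], image[..., 2]
    # Linear change of gray levels to the range 0-47 (R2, G2, B2).
    scale = 47.0 / 255.0
    r2 = np.round(r1 * scale).astype(np.uint8)
    g2 = np.round(g1 * scale).astype(np.uint8)
    b2 = np.round(b1 * scale).astype(np.uint8)
    # Recombine the channels into the dark-vision image.
    return np.stack([r2, g2, b2], axis=-1)

bright = np.full((4, 4, 3), 255, dtype=np.uint8)
dark = simulate_scotopic(bright)  # every pixel maps to gray level 47
```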
Preferably, the structure of the six edge detection modules includes:
the first module comprises two convolution layers, the number of convolution kernels of the first layer is 32, the size of a convolution window is 3 x 3, the step size is 2 x 2, and an activation function is a relu function; the number of the convolution kernels of the second layer is 64, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function;
the second module comprises two convolution layers, the number of convolution kernels of the first layer is 128, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function; the number of the convolution kernels in the second layer is 128, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the third module comprises a convolution layer and two same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 1 x 1; the convolution structure is: the number of convolution kernels is 256, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function;
the fourth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 2 x 2; the convolution structure is: the number of convolution kernels of the first convolution layer is 256, the size of a convolution window is 1 x 1, the step size is 1 x 1, and the activation function is a relu function; the number of convolution kernels of the second convolution layer is 512, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the fifth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 512, the size of a convolution window is 1 x 1, and the step size is 2 x 2; the convolution structure is: the number of convolution kernels of the first convolution layer is 256, the size of a convolution window is 1 x 1, the step size is 1 x 1, and the activation function is a relu function; the number of convolution kernels of the second convolution layer is 512, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the sixth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 1 x 1; the convolution structure is: a relu activation function, followed by a first convolution layer with 128 convolution kernels, a convolution window of 1 x 1, a step size of 1 x 1 and a relu activation function, and a second convolution layer with 256 convolution kernels, a convolution window of 3 x 3 and a step size of 1 x 1.
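As a sanity check on the module list above, only the stride-2 layers change spatial resolution under SAME padding. Tracing them (the input size and main-path layout are our assumptions based on the detailed description) shows each module's side-output resolution matching the upsampling factors 2, 2, 4, 8, 16, 16 used later:

```python
import math

def same_conv(size: int, stride: int) -> int:
    """Spatial size after a SAME-padded convolution."""
    return math.ceil(size / stride)

H = 512  # assumed input height; the patent does not fix an input size

# Side-output resolution of each module (stride-1 layers keep the size):
s1 = same_conv(H, 2)   # module 1: its first conv has stride 2
s2 = s1                # module 2: stride-1 convs on conv1_1
s3 = same_conv(s2, 2)  # module 3: fed through the stride-2 conv conv2_2
s4 = same_conv(s3, 2)  # module 4: entry conv has stride 2
s5 = same_conv(s4, 2)  # module 5: entry conv has stride 2
s6 = s5                # module 6: stride-1 layers only

# Each side output is upsampled back to the input resolution by the
# sampling sizes given in the detailed description:
factors = [2, 2, 4, 8, 16, 16]
sizes = [s1, s2, s3, s4, s5, s6]
assert all(s * f == H for s, f in zip(sizes, factors))
```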
Preferably, the splicing module is used for splicing the edge detection effect maps obtained by the six edge detection modules; the spliced image is converted into a three-dimensional image using a concatenation (concat) function, and the three-dimensional image is convolved with 1 convolution kernel, a convolution window of 1 x 1 and a step size of 1 x 1.
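A 1 x 1 convolution over the spliced maps is, per pixel, just a learned weighted sum of the six channels. A minimal numpy sketch (random weights stand in for the trained kernel, and the bias value is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 8, 8
# Six side-output maps from the edge detection modules, all upsampled
# to full resolution.
outputs = [rng.random((H, W)) for _ in range(6)]

# Splice (concatenate) along a new channel axis: shape (H, W, 6).
spliced = np.stack(outputs, axis=-1)

# A 1x1 convolution with a single kernel collapses the 6 channels
# into one fused edge map: a per-pixel weighted sum plus a bias.
w = rng.random(6)  # stand-in for the trained 1x1 kernel
b = 0.0
edge_map = spliced @ w + b
```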
Preferably, the images in the scotopic vision image dataset are enhanced by random cropping and flip/rotation processing.
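The enhancement step can be sketched as follows (a numpy sketch; the crop size and flip probability are our assumptions, and the same transform is applied to the image and its edge annotation so the pair stays aligned):

```python
import numpy as np

def augment(img: np.ndarray, label: np.ndarray, rng, crop: int = 16):
    """Apply the same random crop / flip / rotation to a dark-vision
    image and its edge-annotation label (crop size is an assumption)."""
    h, w = img.shape[:2]
    y = int(rng.integers(0, h - crop + 1))
    x = int(rng.integers(0, w - crop + 1))
    img, label = img[y:y+crop, x:x+crop], label[y:y+crop, x:x+crop]
    if rng.random() < 0.5:                   # random horizontal flip
        img, label = img[:, ::-1], label[:, ::-1]
    k = int(rng.integers(0, 4))              # rotate by k * 90 degrees
    return np.rot90(img, k), np.rot90(label, k)

rng = np.random.default_rng(1)
img = np.zeros((32, 32, 3), dtype=np.uint8)
lbl = np.zeros((32, 32), dtype=np.uint8)
a_img, a_lbl = augment(img, lbl, rng)
```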
Preferably, the loss function of the optimized supervised image edge detection model is:
[loss-function formula shown only as an image in the original]
further, the coefficient of the loss function is optimized, and the optimization formula of the coefficient of the loss function is as follows:
[coefficient-optimization formula shown only as an image in the original]
furthermore, a weight lambda for the positive and negative samples is set; its initial value is set in the range 0.6-1.2 and is continuously updated during the training of the model.
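The patent's loss formula appears only as an image in the source, so it cannot be reproduced exactly; a common class-balanced cross-entropy of the kind used for edge detection (e.g. in HED), with the positive/negative weight lambda described above, would look like this (a hedged sketch under that assumption, not the patent's exact formula):

```python
import numpy as np

def balanced_bce(pred: np.ndarray, target: np.ndarray, lam: float = 0.9) -> float:
    """Class-balanced binary cross-entropy: lam weights the scarce edge
    (positive) pixels, (1 - lam) weights the abundant background pixels."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)
    pos = target == 1
    neg = ~pos
    loss_pos = -lam * np.log(pred[pos]).sum()
    loss_neg = -(1.0 - lam) * np.log(1.0 - pred[neg]).sum()
    return float((loss_pos + loss_neg) / target.size)

target = np.array([[1, 0], [0, 0]], dtype=np.uint8)
pred = np.array([[0.9, 0.1], [0.2, 0.1]])
loss = balanced_bce(pred, target)  # lam starts in 0.6-1.2 per the text
```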
Preferably, the error between the edge detection image and the edge-annotation image of the same scene under normal illumination is calculated as follows: the set of edge points of the edge-annotation image of the same scene under normal illumination is acquired; the set of edge points of the model's edge detection image is acquired; and the error between the edge detection image and the edge-annotation image is calculated according to an error formula;
the error formula is:
[error formula shown only as an image in the original]
the invention has the beneficial effects that the invention provides a monitoring edge detection method based on the convolutional neural network, which is more suitable for the edge detection of the scotopic vision image. According to the invention, by adding the convolution layer of the model and introducing the residual error structure unit, the characteristics and edge detail information of one-stage learning on the image can be better reserved, so that the trained model can further improve the image edge detection effect, the PSNR and MSE evaluation indexes are improved, the continuity of the image edge is enhanced, and the edge detection image output by the model is more in line with the human eye observation effect.
Drawings
FIG. 1 is a schematic flow chart of the supervised image edge detection method based on convolutional neural network of the present invention;
FIG. 2 is an overall model network structure of the convolutional neural network-based supervised image edge detection method of the present invention;
FIG. 3 is a graph of the validation effect on the test set of the public edge-annotation dataset;
fig. 4 is a diagram of the edge detection effect of the image captured in the actual dark vision environment.
Detailed Description
The technical solutions and advantages of the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings, and the present invention will be further described in detail. It should be understood that the described embodiments are merely exemplary and not restrictive of the full scope of the invention. All other embodiments obtained by a person skilled in the art without creative effort based on the embodiments of the present invention fall within the protection scope of the present invention.
A method for detecting edges of a supervised scotopic vision image based on a convolutional neural network, as shown in fig. 1, the method comprising: acquiring a scotopic vision image, inputting the scotopic vision image into a trained optimized supervised image edge detection model for performing scotopic vision image edge detection to obtain a detection result; the optimized supervised image edge detection model consists of six edge detection modules and a splicing module;
the process of training the optimization supervision image edge detection model comprises the following steps:
s1: acquiring a scotopic vision original image data set and an edge label image data set of the same scene under a normal illumination condition, dividing the data sets into a training set and a testing set, and simultaneously performing data enhancement processing on a scotopic vision original image in the training set and an edge label image of the same scene under the normal illumination condition to obtain an amplified training sample set;
s2: inputting the images in the augmented training set into an optimized supervised image edge detection model;
s3: obtaining effect process diagrams of six edge detection modules, splicing the process diagrams of six edge detection effects, converting spliced images into three-dimensional images, and performing convolution on the three-dimensional images to obtain edge detection images;
s4: calculating a loss function of the optimized supervised image edge detection model, and calculating the error between the obtained edge detection image in the training process and the edge marking image of the same scene under the normal illumination condition according to the loss function;
s5: continuously adjusting the weight of the loss function, and saving the training weight parameter of the model when the value of the loss function is minimum;
s6: inputting the data in the test set into an optimized supervised image edge detection model for testing;
s7: and outputting an edge detection result, and finishing the model training.
Preferably, a scotopic vision original image dataset and an edge-annotation image dataset of the same scene under normal illumination conditions are obtained and divided into a training set and a testing set; meanwhile, some images shot in an actual scotopic vision environment are prepared as a verification set.
Specifically, the public BIPED edge-annotation dataset is linearly transformed to obtain a BIPED dataset in a dark-vision environment, called the scotopic_BIPED dataset. The training set comprises 200 original images and the corresponding edge-annotation images, the test set comprises 50 original images and the corresponding edge-annotation images, and 50 images shot in an actual dark-vision environment are collected as a verification set.
Preferably, the process of acquiring the scotopic vision image dataset further comprises: extracting the R, G and B channels of the images under normal illumination to obtain images R1, G1 and B1; linearly mapping the gray levels of R1, G1 and B1 to the range 0-47 to obtain images R2, G2 and B2; recombining R2, G2 and B2 to obtain an image in a dark-vision environment; and collecting the recombined images to obtain the scotopic vision image dataset.
The structure of six edge detection modules of the optimized supervised image edge detection model comprises:
the first module comprises two convolution layers, the number of convolution kernels of the first layer is 32, the size of a convolution window is 3 x 3, the step size is 2 x 2, and an activation function is a relu function; the number of convolution kernels in the second layer is 64, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function;
the second module comprises two convolution layers, the number of convolution kernels of the first layer is 128, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function; the number of the convolution kernels in the second layer is 128, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the third module comprises a convolution layer and two same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 1 x 1; the convolution structure is: the number of convolution kernels is 256, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function;
the fourth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 2 x 2; the convolution structure is: the number of convolution kernels of the first convolution layer is 256, the size of a convolution window is 1 x 1, the step size is 1 x 1, and the activation function is a relu function; the number of convolution kernels of the second convolution layer is 512, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the fifth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 512, the size of a convolution window is 1 x 1, and the step size is 2 x 2; the convolution structure is: the number of convolution kernels of the first convolution layer is 256, the size of a convolution window is 1 x 1, the step size is 1 x 1, and the activation function is a relu function; the number of convolution kernels of the second convolution layer is 512, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the sixth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 1 x 1; the convolution structure is: a relu activation function, followed by a first convolution layer with 128 convolution kernels, a convolution window of 1 x 1, a step size of 1 x 1 and a relu activation function, and a second convolution layer with 256 convolution kernels, a convolution window of 3 x 3 and a step size of 1 x 1.
The images in the scotopic vision image dataset are enhanced by random cropping, flipping and rotation. In this way, the training sample set in the dataset can be expanded.
As shown in fig. 2, the process of inputting the image data in the data set into the optimized supervised image edge detection model, wherein each structure of the model processes the image comprises the following steps:
1) a first edge detection module: the input image is a scotopic vision image subjected to data enhancement processing, the scotopic vision image passes through two convolution layers, the number of convolution kernels of a first layer is 32, the size of a convolution window is 3 x 3, the step size is 2 x 2, the filling mode is SAME, then normalization is carried out, the activation function is a relu function, the number of convolution kernels of a second layer is 64, the size of the convolution window is 3 x 3, the step size is 1 x 1, the filling mode is SAME, then normalization is carried out, and the activation function is a relu function, so that an image conv1_1 is obtained. Then, the image conv1_1 is up-sampled, the number of convolution kernels is 1, the size of a convolution window is 1 × 1, the sampling size is 2, and the step size is 1 × 1, so that a result output image output1 of the first module is obtained.
Meanwhile, a convolution operation is carried out on the image conv1_1, with 128 convolution kernels, a convolution window of 1 x 1, a step size of 2 x 2 and SAME padding, followed by normalization, to obtain an image rconv1_1.
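The 1 x 1 shortcut convolution that produces rconv1_1 mixes channels per pixel and, with stride 2, keeps every other spatial position. A numpy sketch (random values stand in for the learned weights, and normalization is omitted):

```python
import numpy as np

rng = np.random.default_rng(2)
conv1_1 = rng.random((16, 16, 64))  # stand-in for the first module's output

def conv1x1(x: np.ndarray, weights: np.ndarray, stride: int = 1) -> np.ndarray:
    """1x1 convolution: mixes channels per pixel via (H, W, C_in) @
    (C_in, C_out), then subsamples spatially by the stride."""
    y = x @ weights
    return y[::stride, ::stride]

w = rng.random((64, 128))  # stand-in for the 128 trained 1x1 kernels
rconv1_1 = conv1x1(conv1_1, w, stride=2)
```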
2) A second edge detection module: the input image is the conv1_1 image of the step 1), the number of convolution kernels of the first layer is 128, the size of a convolution window is 3 × 3, the step size is 1 × 1, the filling mode is SAME, then normalization is carried out, the activation function is a relu function, the number of convolution kernels of the second layer is 128, the size of the convolution window is 3 × 3, the step size is 1 × 1, the filling mode is SAME, and then normalization is carried out, so that the image conv2_1 is obtained. Then, the image conv2_1 is up-sampled, the number of convolution kernels is 1, the size of the convolution window is 1 × 1, the sampling size is 2, and the step size is 1 × 1, so that the result output map output2 of the second module is obtained.
Meanwhile, the image conv2_1 is convolved again, with 128 convolution kernels, a convolution window of 3 x 3, a step size of 2 x 2 and SAME padding, to obtain an image conv2_2; the image add2_1 is then obtained by adding conv2_2 and the image rconv1_1 from step 1). Convolution is performed on add2_1 with 256 convolution kernels, a convolution window of 1 x 1, a step size of 2 x 2 and SAME padding, followed by normalization, to obtain an image rconv2_1.
3) A third edge detection module: the input image is the conv2_2 image from step 2); with 256 convolution kernels, a convolution window of 1 x 1, a step size of 1 x 1 and SAME padding, followed by normalization, the image conv3_1 is obtained. The image add2_1 from step 2) is then processed through 2 cycles, where each cycle is structured as follows: the image add2_1 first passes through a relu activation function and then through two convolution layers; the first layer has 256 convolution kernels, a convolution window of 3 x 3, a step size of 1 x 1, SAME padding, normalization and a relu activation function; the second layer has 256 convolution kernels, a convolution window of 3 x 3, a step size of 1 x 1, SAME padding and normalization. The two convolution layers yield an image conv3_2, and an averaging operation on conv3_2 and conv3_1 gives the image conv3_mean. For convenience, image names inside the cycle body are kept unchanged across cycles; the similar cycle structures below follow the same convention and are not described again. The image conv3_mean is then up-sampled with 1 convolution kernel, a convolution window of 1 x 1, a sampling size of 4 and a step size of 1 x 1, giving the result output map output3 of the third module.
Meanwhile, the image conv3_mean is convolved with 256 convolution kernels, a convolution window of 3 x 3, a step size of 2 x 2 and SAME padding, to obtain an image conv3_3; adding conv3_3 to the image rconv2_1 from step 2) gives the image add3_1. Convolution is performed on add3_1 with 512 convolution kernels, a convolution window of 1 x 1, a step size of 2 x 2 and SAME padding, followed by normalization, to obtain an image rconv3_1.
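Our reading of the cycle structure in the third module (a pre-activation relu, two convolution layers, then an averaging operation with the running side branch) can be sketched as follows; the convolution layers are abstracted as a callable, so this shows only the data flow, not the trained weights:

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

def loop_body(add_in: np.ndarray, side: np.ndarray, convs, cycles: int = 2):
    """Data flow of the repeated cycle: pre-activation relu, the two
    convolution layers (abstracted as `convs`), then averaging with the
    running side branch (conv3_1 / conv3_mean in the text)."""
    conv_mean = side
    for _ in range(cycles):
        conv3_2 = convs(relu(add_in))            # two conv layers + norm
        conv_mean = (conv3_2 + conv_mean) / 2.0  # averaging operation
    return conv_mean

# Deterministic toy run: identity "convolutions" on constant inputs.
add_in = np.ones((2, 2))
side = np.zeros((2, 2))
mean = loop_body(add_in, side, convs=lambda x: x, cycles=2)
# Cycle 1: (1 + 0)/2 = 0.5; cycle 2: (1 + 0.5)/2 = 0.75
```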
4) A fourth edge detection module: the input image is the conv2_2 image from step 2); after one convolution with 256 convolution kernels, a convolution window of 1 x 1, a step size of 2 x 2 and SAME padding, the image conv4_1 is obtained. Adding conv4_1 to the image conv3_3 from step 3) gives the image add4_1. The image add4_1 is then convolved with 512 convolution kernels, a convolution window of 1 x 1, a step size of 1 x 1 and SAME padding, and normalized, to obtain an image conv4_2. The image add3_1 from step 3) is further processed with one convolution of 512 kernels, a convolution window of 1 x 1, a step size of 1 x 1 and SAME padding, and normalized, to obtain an image conv4_3. The image conv4_3 then goes through 3 cycles, where each cycle is structured as follows: conv4_3 passes through a relu activation function and then through two convolution layers; the first layer has 256 convolution kernels, a convolution window of 1 x 1, a step size of 1 x 1, SAME padding, normalization and a relu activation function; the second layer has 512 convolution kernels, a convolution window of 3 x 3, a step size of 1 x 1, SAME padding and normalization, yielding an image conv4_4. Adding conv4_3 and conv4_4 gives the image add4_2, and an averaging operation on add4_2 and conv4_3 gives the image conv4_mean.
The image conv4_mean is up-sampled with 1 convolution kernel, a convolution window of 1 x 1, a sampling size of 8 and a step size of 1 x 1, giving the result output map output4 of the fourth module.
Meanwhile, the image conv4_mean is convolved with 512 convolution kernels, a convolution window of 3 x 3, a step size of 2 x 2 and SAME padding, to obtain an image conv4_5; adding conv4_5 to the image rconv3_1 from step 3) gives the image add4_3. Convolution is performed on add4_3 with 512 convolution kernels, a convolution window of 1 x 1, a step size of 1 x 1 and SAME padding, followed by normalization, to obtain an image rconv4_1.
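The upsampling steps restore each side output to the input resolution (sampling size 8 for the fourth module). The model uses a learned single-kernel upsampling layer; a nearest-neighbour stand-in shows the shape effect:

```python
import numpy as np

def upsample_nn(x: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling by an integer factor (a simple
    stand-in for the learned upsampling layer in the model)."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

conv4_mean = np.random.default_rng(3).random((8, 8))
output4 = upsample_nn(conv4_mean, 8)  # back to 64 x 64
```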
5) A fifth edge detection module: the input image is the conv4_1 image from step 4); after one convolution with 512 convolution kernels, a convolution window of 1 x 1, a step size of 2 x 2 and SAME padding, the image conv5_1 is obtained. Adding conv5_1 to the image conv4_5 from step 4) gives the image add5_1. The image add5_1 is then convolved with 512 convolution kernels, a convolution window of 1 x 1, a step size of 1 x 1 and SAME padding, and normalized, to obtain an image conv5_2. The image add4_3 from step 4) is processed through 3 cycles, where each cycle is structured as follows: add4_3 passes through a relu activation function and then through two convolution layers; the first layer has 256 convolution kernels, a convolution window of 1 x 1, a step size of 1 x 1, SAME padding, normalization and a relu activation function; the second layer has 512 convolution kernels, a convolution window of 3 x 3, a step size of 1 x 1, SAME padding and normalization, yielding an image conv5_3. Adding conv5_3 and add4_3 gives the image add5_2, and an averaging operation on add5_2 and conv5_2 gives the image conv5_mean. The image conv5_mean is up-sampled with 1 convolution kernel, a convolution window of 1 x 1, a sampling size of 16 and a step size of 1 x 1, giving the result output map output5 of the fifth module.
Meanwhile, the image conv5_mean is added to the image rconv4_1 from step 4) to obtain an image add5_3.
6) A sixth edge detection module: the input image is the add5_3 image from step 5). It is first convolved with 256 convolution kernels, a 1 × 1 convolution window, a 1 × 1 step size, and SAME padding, and normalized to obtain an image conv6_1. The image conv5_mean from step 5) is then convolved with 256 convolution kernels, a 1 × 1 convolution window, a 1 × 1 step size, and SAME padding, and normalized to obtain an image conv6_2. The image conv6_1 is then processed through 3 loop iterations with the following structure: the image conv6_1 passes through a relu activation function and then through two convolution layers; the first layer has 128 convolution kernels, a 1 × 1 convolution window, a 1 × 1 step size, and SAME padding, followed by normalization with a relu activation function; the second layer has 256 convolution kernels, a 3 × 3 convolution window, a 1 × 1 step size, and SAME padding, followed by normalization, yielding an image conv6_3. The image conv6_3 is added to the image add5_2 from step 5) to obtain an image add6_1, and the images add6_1 and conv6_2 are then averaged to obtain an image conv6_mean. The image conv6_mean is upsampled with 1 convolution kernel, a 1 × 1 convolution window, a sample size of 16, and a 1 × 1 step size, yielding the result output map output6 of the sixth module.
Splicing the images output1-output6 obtained in steps 1) to 6) yields a one-dimensional array outputs_1. The array outputs_1 is converted into a three-dimensional array outputs_3 using a concat function, and outputs_3 is then convolved once with 1 convolution kernel, a 1 × 1 convolution window, and a 1 × 1 step size to obtain an image conv7_1, which is the output edge detection effect map.
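Numerically, the fusion step can be sketched as follows: a 1 × 1 convolution with a single kernel over the six stacked side outputs is simply a per-pixel weighted sum across the channel axis. The NumPy sketch below illustrates that reading; the uniform weights are placeholder values, not ones from the patent.

```python
import numpy as np

def fuse_side_outputs(side_outputs, weights, bias=0.0):
    """Fuse side-output maps with a single 1x1 convolution kernel.

    side_outputs: list of H x W arrays (output1..output6).
    weights: one scalar per side output (the 1x1 kernel weights).
    Stacking along a channel axis and taking the dot product across
    that axis is exactly what a 1x1 conv with one kernel computes.
    """
    stack = np.stack(side_outputs, axis=-1)            # H x W x 6
    return stack @ np.asarray(weights, dtype=float) + bias

# six constant 4x4 side outputs with values 1..6
outs = [np.full((4, 4), float(v)) for v in range(1, 7)]
fused = fuse_side_outputs(outs, weights=[1 / 6.0] * 6)
```

In training these kernel weights would be learned along with the rest of the network rather than fixed to a uniform average.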
Preferably, the global learning rate of the supervised scotopic vision image edge detection model of the convolutional neural network is 0.0001, the number of iterations is 15000, and the input image is an RGB image.
Preferably, the weights from the last 5 training iterations of the convolutional neural network edge detection model are saved, and the weight from the final iteration, numbered 14999, is used by default. Verification is performed using the test set images in the scotopic vision image dataset and images captured in a scotopic environment, and the image edge detection results are saved as pictures.
Preferably, the up-sampling method used by the supervised scotopic vision image edge detection model of the convolutional neural network is transposed convolution.
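Transposed convolution, the upsampling method named above, can be written out directly: each input pixel stamps a stride-spaced, scaled copy of the kernel onto a larger output canvas. The single-channel NumPy sketch below is a minimal illustration (no output cropping or learned kernel), not the model's actual upsampling layer.

```python
import numpy as np

def transposed_conv2d(x, kernel, stride):
    """Naive transposed convolution (a.k.a. deconvolution).

    Each input pixel x[i, j] adds x[i, j] * kernel to the output at
    offset (i*stride, j*stride), so the output grows by the stride
    factor -- the inverse of how a strided conv shrinks its input.
    """
    kh, kw = kernel.shape
    H, W = x.shape
    out = np.zeros((stride * (H - 1) + kh, stride * (W - 1) + kw))
    for i in range(H):
        for j in range(W):
            out[i*stride:i*stride+kh, j*stride:j*stride+kw] += x[i, j] * kernel
    return out

# a 2x2 input, 2x2 kernel of ones, stride 2: four non-overlapping stamps
up = transposed_conv2d(np.ones((2, 2)), np.ones((2, 2)), stride=2)
```

With overlapping stamps (stride smaller than the kernel), contributions sum, which is the source of the well-known checkerboard artifacts of transposed convolution.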
The loss function of the supervised scotopic vision image edge detection model of the convolutional neural network is as follows:
L(W, w) = Σ_{n=1}^{6} ℓ^{(n)}(W, w^{(n)})

ℓ^{(n)}(W, w^{(n)}) = −β Σ_{j∈Y⁺} log σ(y_j = 1 | X; W, w^{(n)}) − (1 − β) Σ_{j∈Y⁻} log σ(y_j = 0 | X; W, w^{(n)})

β = |Y⁻| / (|Y⁺| + |Y⁻|)

1 − β = |Y⁺| / (|Y⁺| + |Y⁻|)
wherein A_n(·) denotes the output of each module of the supervised image edge detection model, n denotes the index of each module's output image, W denotes the set of all parameters in the model, and w denotes the corresponding parameter set;

σ(y_j = 1 | X; W, w)

denotes the network's prediction for pixel j of the input image X; β denotes the coefficient of each term in the loss function, Y⁺ denotes the non-edge data in the edge-labeled image dataset, σ(·) denotes the scale level of each weight, y_j indicates whether the pixel is marked as an edge, X denotes the input image, and Y⁻ denotes the edge data in the edge-labeled image dataset.
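Under a class-balanced cross-entropy reading of this loss, where the β weighting comes from the pixel counts as in the 1 − β = |Y⁺|/(|Y⁺| + |Y⁻|) relation above, a NumPy sketch might look as follows. This is an assumption-laden illustration: the exact per-module weighting and the sign conventions of the patent's original (image-only) formula are not fully recoverable.

```python
import numpy as np

def balanced_bce(prob, label, eps=1e-8):
    """Class-balanced cross-entropy for edge maps.

    prob: predicted edge probabilities in (0, 1).
    label: 1 for edge pixels, 0 otherwise.
    beta is the fraction of non-edge pixels, so the scarce edge
    pixels receive the larger weight in the positive term.
    """
    pos = label == 1
    neg = label == 0
    beta = neg.sum() / label.size
    return -(beta * np.log(prob[pos] + eps).sum()
             + (1 - beta) * np.log(1 - prob[neg] + eps).sum())

# one confident edge pixel and one confident non-edge pixel
loss = balanced_bce(np.array([0.9, 0.1]), np.array([1, 0]))
```

Without the β/(1 − β) balancing, the overwhelming majority of non-edge pixels would dominate the gradient and push the network toward predicting no edges at all.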
The process of calculating the error between the edge detection image and the edge-labeled image of the same scene under normal illumination is as follows: obtain the set of edge points of the edge-labeled image of the same scene under normal illumination; obtain the set of edge points of the image output by the edge detection model; and calculate the error between the two according to the error formula;
the error formula is:
error = (1 / |M_new|) Σ_{m ∈ M_new} min_{m_o ∈ M_o} ||m − m_o||_Euclid

wherein error represents the error between the edge detection image and the edge-labeled image of the same scene under normal illumination, M_new represents the set of edge points of the image output by the edge detection model, M_o represents the set of edge points of the edge-labeled image of the same scene under normal illumination, m represents an edge point of the image output by the edge detection model, and || · ||_Euclid represents the Euclidean distance between the m-th labeled edge point and the detected edge point.
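One plausible implementation of this error, reading it as the mean Euclidean distance from each detected edge point to its nearest labeled edge point (the exact pairing rule is not fully recoverable from the translation), is the short NumPy sketch below.

```python
import numpy as np

def edge_error(detected, labeled):
    """Mean nearest-neighbor distance from detected to labeled points.

    detected, labeled: iterables of (row, col) edge coordinates.
    Broadcasting builds the full M x N distance matrix, then each
    detected point keeps only its closest labeled point.
    """
    d = np.asarray(detected, dtype=float)[:, None, :]   # M x 1 x 2
    l = np.asarray(labeled, dtype=float)[None, :, :]    # 1 x N x 2
    dists = np.sqrt(((d - l) ** 2).sum(axis=-1))        # M x N
    return dists.min(axis=1).mean()

# one perfect match (distance 0) and one point a 3-4-5 triangle away
err = edge_error([[0, 0], [3, 4]], [[0, 0]])
```

For large point sets a k-d tree would replace the quadratic distance matrix, but the definition of the metric is unchanged.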
As shown in fig. 3, the test set images in the scotopic vision image dataset are passed through the trained supervised image edge detection model to obtain the parameter values of ODS (fixed contour threshold), OIS (optimal threshold for each image), and AP (average precision).
As shown in fig. 4, the scotopic vision images captured by a camera are passed through the trained supervised image edge detection model to obtain the parameter values of PSNR (peak signal-to-noise ratio) and MSE (mean square error).
The above embodiments further illustrate the objects, technical solutions, and advantages of the present invention. It should be understood that they are only preferred embodiments of the present invention and are not intended to limit it; any modifications, equivalent substitutions, or improvements made within the spirit and principle of the present invention shall be included in its protection scope.

Claims (8)

1. A supervised scotopic vision image edge detection method based on a convolutional neural network, characterized by comprising the following steps: acquiring a scotopic vision image, and inputting the scotopic vision image into a trained optimized supervised image edge detection model for scotopic vision image edge detection to obtain an edge detection result; the optimized supervised image edge detection model consists of six edge detection modules and a splicing module;
the process of training the optimization supervision image edge detection model comprises the following steps:
s1: acquiring a scotopic vision original image data set and an edge label image data set of the same scene under a normal illumination condition, dividing the data sets into a training set and a testing set, and simultaneously performing data enhancement processing on a scotopic vision original image in the training set and an edge label image of the same scene under the normal illumination condition to obtain an amplified training sample set;
s2: inputting the images in the amplified training set into an optimized supervised image edge detection model to obtain an effect process diagram of six edge detection modules; the structure of the six edge detection modules is as follows:
the first module comprises two convolution layers, the number of convolution kernels of the first layer is 32, the size of a convolution window is 3 x 3, the step size is 2 x 2, and an activation function is a relu function; the number of convolution kernels in the second layer is 64, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function;
the second module comprises two convolution layers, the number of convolution kernels of the first layer is 128, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function; the number of the convolution kernels in the second layer is 128, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the third module comprises a convolution layer and two same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 1 x 1; the convolution structure is: the number of convolution kernels is 256, the size of a convolution window is 3 x 3, the step size is 1 x 1, and the activation function is a relu function;
the fourth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 2 x 2; the convolution structure is: the number of convolution kernels of the first convolution layer is 256, the size of a convolution window is 1 x 1, the step size is 1 x 1, and the activation function is a relu function; the number of convolution kernels of the second convolution layer is 512, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the fifth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 512, the size of a convolution window is 1 x 1, and the step size is 2 x 2; the convolution structure is: the number of convolution kernels of the first convolution layer is 256, the size of a convolution window is 1 x 1, the step size is 1 x 1, and the activation function is a relu function; the number of convolution kernels of the second convolution layer is 512, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
the sixth module comprises a convolution layer and three same convolution structures; the number of convolution kernels of the convolution layer is 256, the size of a convolution window is 1 x 1, and the step size is 1 x 1; the convolution structure is: a relu activation function, the number of convolution kernels of the first convolution layer is 128, the size of a convolution window is 1 x 1, the step size is 1 x 1, and the activation function is a relu function; the number of convolution kernels of the second convolution layer is 256, the size of a convolution window is 3 x 3, and the step size is 1 x 1;
s3: splicing the process graphs with the six edge detection effects, converting the spliced images into three-dimensional images, and performing convolution on the three-dimensional images to obtain edge detection images;
s4: calculating a loss function of the optimized supervised image edge detection model, and calculating the error between the obtained edge detection image in the training process and the edge marking image of the same scene under the normal illumination condition according to the loss function;
s5: continuously adjusting the weight of the loss function, and saving the training weight parameter of the model when the value of the loss function is minimum;
s6: inputting the data in the test set into an optimized supervised image edge detection model for testing;
s7: and outputting an edge detection result, and finishing the model training.
2. The method according to claim 1, wherein the process of acquiring the scotopic vision image data set comprises: extracting the R, G, and B channels of images under normal illumination to obtain R1, G1, and B1 images; linearly changing the gray levels of the R1, G1, and B1 images to the range 0-47 to obtain R2, G2, and B2 images; recombining the R2, G2, and B2 images to obtain images in a dark-vision environment; and collecting the recombined images to obtain the scotopic vision image data set.
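The channel-compression procedure of claim 2 can be sketched in NumPy: each channel's gray levels in [0, 255] are linearly mapped to [0, 47] and the channels are recombined. The rounding mode is an assumption on my part; the claim specifies only a linear change.

```python
import numpy as np

def to_scotopic(rgb):
    """Simulate a dark-vision image from a normally lit RGB image.

    Operates on all three channels via vectorization (equivalent to
    splitting R/G/B, rescaling each, and recombining): gray levels
    are linearly mapped from [0, 255] to [0, 47].
    """
    return np.rint(rgb.astype(float) * 47.0 / 255.0).astype(np.uint8)

# a single pixel exercising the extremes and the midpoint
dark = to_scotopic(np.array([[[255, 0, 128]]], dtype=np.uint8))
```

Capping the dynamic range at 48 gray levels is what gives the synthesized training images their scotopic character while keeping them pixel-aligned with the normally lit edge labels.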
3. The supervised scotopic vision image edge detection method based on the convolutional neural network as claimed in claim 1, wherein the stitching module is configured to stitch the edge detection effect process graphs obtained by the six edge detection modules, transform the stitched image into a three-dimensional image using a concat function, and convolve the three-dimensional image; the convolution uses 1 convolution kernel, a 1 × 1 convolution window, and a 1 × 1 step size.
4. The method according to claim 1, wherein the images in the scotopic vision image data set are subjected to enhancement processing, including random cropping, image flipping, and rotation.
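The augmentations of claim 4 must be applied identically to the scotopic image and its edge label so that supervision stays pixel-aligned. The NumPy sketch below illustrates this; the crop size, flip probability, and restriction to 90-degree rotations are illustrative assumptions, not values from the patent.

```python
import numpy as np

def augment(img, label, rng, crop=32):
    """Apply the same random crop / flip / rotation to image and label."""
    H, W = img.shape[:2]
    # random crop: identical window for both arrays
    y = int(rng.integers(0, H - crop + 1))
    x = int(rng.integers(0, W - crop + 1))
    img, label = img[y:y+crop, x:x+crop], label[y:y+crop, x:x+crop]
    # random horizontal flip
    if rng.random() < 0.5:
        img, label = img[:, ::-1], label[:, ::-1]
    # random rotation by a multiple of 90 degrees
    k = int(rng.integers(0, 4))
    return np.rot90(img, k), np.rot90(label, k)

rng = np.random.default_rng(0)
img = np.arange(64 * 64).reshape(64, 64)
lab = (img % 2).astype(np.uint8)
a, b = augment(img, lab, rng)
```

Arbitrary-angle rotations would additionally require interpolation and edge-label re-thresholding, which is why many edge-detection pipelines stick to axis-aligned transforms.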
5. The supervised scotopic vision image edge detection method based on the convolutional neural network as claimed in claim 1, wherein the loss function of the optimized supervised image edge detection model is as follows:
ℓ(W, w) = −β Σ_{j∈Y⁺} log σ(y_j = 1 | X; W, w) − (1 − β) Σ_{j∈Y⁻} log σ(y_j = 0 | X; W, w), with β = |Y⁻| / (|Y⁺| + |Y⁻|)
wherein A_n(·) denotes the output of each module of the supervised image edge detection model, n denotes the index of each module's output image, W denotes the set of all parameters in the model, and w denotes the corresponding parameter set;

σ(y_j = 1 | X; W, w)

denotes the network's prediction for pixel j of the input image X; β denotes the coefficient of each term in the loss function, Y⁺ denotes the non-edge data in the edge-labeled image dataset, σ(·) denotes the scale level of each weight, y_j indicates whether a pixel of the edge-detected image is marked as an edge, X denotes the input image, and Y⁻ denotes the edge data in the edge-labeled image dataset.
6. The supervised scotopic vision image edge detection method based on the convolutional neural network as claimed in claim 5, wherein the coefficients of the loss function are optimized, and the optimization formula of the coefficients of the loss function is as follows:
β = λ · |Y⁻| / (|Y⁺| + |Y⁻|)
where λ represents the weight controlling the positive and negative samples.
7. The supervised scotopic vision image edge detection method based on the convolutional neural network as claimed in claim 5, wherein the weight λ controlling the positive sample and the negative sample is set, the initial value of λ is set to be between 0.6 and 1.2, and the initial value of λ is continuously updated through model training.
8. The method for supervised scotopic vision image edge detection based on the convolutional neural network as claimed in claim 1, wherein the process of calculating the error between the edge detection image and the edge-labeled image of the same scene under normal illumination is as follows: obtain the set of edge points of the edge-labeled image of the same scene under normal illumination; obtain the set of edge points of the image output by the edge detection model; and calculate the error between the two according to the error formula;
the error formula is:
error = (1 / |M_new|) Σ_{m ∈ M_new} min_{m_o ∈ M_o} ||m − m_o||_Euclid

wherein error represents the error between the edge detection image and the edge-labeled image of the same scene under normal illumination, M_new represents the set of edge points of the image output by the edge detection model, M_o represents the set of edge points of the edge-labeled image of the same scene under normal illumination, m represents an edge point of the image output by the edge detection model, and || · ||_Euclid represents the Euclidean distance between the m-th labeled edge point and the detected edge point.
CN202011161185.6A 2020-10-27 2020-10-27 Method for monitoring dark vision image edge detection based on convolutional neural network Active CN112258537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011161185.6A CN112258537B (en) 2020-10-27 2020-10-27 Method for monitoring dark vision image edge detection based on convolutional neural network


Publications (2)

Publication Number Publication Date
CN112258537A CN112258537A (en) 2021-01-22
CN112258537B true CN112258537B (en) 2022-08-26

Family

ID=74262029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011161185.6A Active CN112258537B (en) 2020-10-27 2020-10-27 Method for monitoring dark vision image edge detection based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN112258537B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239199B (en) * 2021-05-18 2022-09-23 重庆邮电大学 Credit classification method based on multi-party data set
CN113361693B (en) * 2021-06-30 2022-10-25 北京百度网讯科技有限公司 Method and device for generating convolutional neural network, and image recognition method and device
CN114693712A (en) * 2022-04-08 2022-07-01 重庆邮电大学 Dark vision and low-illumination image edge detection method based on deep learning
CN116908178B (en) * 2023-09-13 2024-03-08 吉林农业大学 Hypha phenotype acquisition device and method

Citations (6)

Publication number Priority date Publication date Assignee Title
CN108734717A (en) * 2018-04-17 2018-11-02 西北工业大学 The dark weak signal target extracting method of single frames star chart background based on deep learning
CN109492580A (en) * 2018-11-08 2019-03-19 北方工业大学 Multi-size aerial image positioning method based on full convolution network field saliency reference
WO2019071990A1 (en) * 2017-10-11 2019-04-18 中兴通讯股份有限公司 Image processing method and apparatus
CN110992342A (en) * 2019-12-05 2020-04-10 电子科技大学 SPCP infrared small target detection method based on 3DATV constraint
CN111797841A (en) * 2020-05-10 2020-10-20 浙江工业大学 Visual saliency detection method based on depth residual error network
CN111815528A (en) * 2020-06-30 2020-10-23 上海电力大学 Bad weather image classification enhancement method based on convolution model and feature fusion

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US10282834B1 (en) * 2018-06-22 2019-05-07 Caterpillar Inc. Measurement platform that automatically determines wear of machine components based on images


Non-Patent Citations (2)

Title
"Age and gender classification using convolutional neural networks"; Gil Levi; IEEE Conference on Computer Vision & Pattern Recognition Workshops, 2015; 2015-10-26; pp. 34-42 *
"空间变化模糊的图像复原算法" (Image restoration algorithms for spatially varying blur); Wan Yu; China Masters' Theses Full-text Database, Information Science and Technology; 2019-07-15 (No. 07); pp. I138-1027 *


Similar Documents

Publication Publication Date Title
CN112258537B (en) Method for monitoring dark vision image edge detection based on convolutional neural network
CN110232394B (en) Multi-scale image semantic segmentation method
CN112949565B (en) Single-sample partially-shielded face recognition method and system based on attention mechanism
CN110766632A (en) Image denoising method based on channel attention mechanism and characteristic pyramid
CN110781756A (en) Urban road extraction method and device based on remote sensing image
CN112364931B (en) Few-sample target detection method and network system based on meta-feature and weight adjustment
CN111160229B (en) SSD network-based video target detection method and device
CN112465759A (en) Convolutional neural network-based aeroengine blade defect detection method
CN109034184B (en) Grading ring detection and identification method based on deep learning
CN113902625A (en) Infrared image enhancement method based on deep learning
CN111242026B (en) Remote sensing image target detection method based on spatial hierarchy perception module and metric learning
CN112862774A (en) Accurate segmentation method for remote sensing image building
CN112598657B (en) Defect detection method and device, model construction method and computer equipment
CN110930409A (en) Salt body semantic segmentation method based on deep learning and semantic segmentation model
CN113888461A (en) Method, system and equipment for detecting defects of hardware parts based on deep learning
CN115147418B (en) Compression training method and device for defect detection model
CN115775236A (en) Surface tiny defect visual detection method and system based on multi-scale feature fusion
CN111539456B (en) Target identification method and device
KR20220167824A (en) Defect detection system and method through image completion based on artificial intelligence-based denoising
CN115908995A (en) Digital instrument reading identification method and device, electronic equipment and storage medium
CN116485885A (en) Method for removing dynamic feature points at front end of visual SLAM based on deep learning
CN111539931A (en) Appearance abnormity detection method based on convolutional neural network and boundary limit optimization
CN110969630A (en) Ore bulk rate detection method based on RDU-net network model
CN116843725B (en) River surface flow velocity measurement method and system based on deep learning optical flow method
CN117975267A (en) Remote sensing image change detection method based on twin multi-scale cross attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant