CN114529737A - Optical red footprint image contour extraction method based on GAN network - Google Patents

Optical red footprint image contour extraction method based on GAN network

Info

Publication number
CN114529737A
Authority
CN
China
Prior art keywords
network
generator
image
sample
formula
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210158517.8A
Other languages
Chinese (zh)
Inventor
唐俊
蒋文龙
朱明
王年
张艳
鲍文霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202210158517.8A priority Critical patent/CN114529737A/en
Publication of CN114529737A publication Critical patent/CN114529737A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an optical red footprint image contour extraction method based on a GAN network, which comprises the following steps: collecting an original optical red footprint image with an optical footprint collector; making a training set and a test set; constructing a generator from a residual network; constructing a discriminator from a PatchGAN Markov discriminator; combining the generator and the discriminator into a CycleGAN cycle-consistent generative adversarial network that serves as the training network; feeding the training set into the training network for training; and using the trained generator as the contour extraction test network, inputting the source-domain data of the test set, and obtaining the optical red footprint image contour. The method applies one unified pipeline directly to the lightly preprocessed original optical red footprint images to extract contour images, which simplifies contour extraction; it can ignore the differences between original images and process a whole batch with a network of identical parameters, which reduces the computational cost of contour extraction.

Description

Optical red footprint image contour extraction method based on GAN network
Technical Field
The invention relates to the technical field of image processing and contour extraction, in particular to an optical red footprint image contour extraction method based on a GAN network.
Background
Research on footprints is not limited to criminal investigation; it touches many aspects of daily life. For example, studying how human footprints differ across sports improves shoe-making technology and yields technologically advanced athletic shoes suited to different sports, better protecting the body during exercise. In the medical field, by studying how diseases affect the footprints patients produce while walking, medical experts can explore the relation between footprints and certain diseases, predict a patient's disease type and stage, and speed up diagnosis.
The morphological characteristics of the optical red footprint contour are among the most important features in red footprint research. Current work usually extracts the contour from the red footprint image by median filtering followed by binarization, generally called the traditional method. Median filtering replaces the gray value of each pixel with the median of the gray values in a neighborhood window around it, eliminating isolated noise points and erasing texture information in the optical red footprint. Binarization sets each pixel's gray value to 0 or 255, giving the whole image a sharp black-and-white appearance that reduces the data volume and highlights the contour of the red footprint. However, the traditional method places high quality demands on the original optical red footprint image: when the original image contains large areas of block noise or the arch region of the footprint is broken, it cannot produce a reliable contour image, and footprint images acquired at crime scenes are often of just such low quality.
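A minimal sketch of the traditional pipeline described above (median filtering followed by binarization); the window size and threshold are illustrative assumptions, not values from the patent:

```python
import numpy as np
from scipy.ndimage import median_filter

def traditional_contour_preprocess(img, window=3, threshold=128):
    """Median-filter a grayscale image, then binarize it to 0/255.
    `window` and `threshold` are hypothetical parameter choices."""
    # Each pixel becomes the median of its neighborhood window,
    # which removes isolated noise points and texture detail.
    smoothed = median_filter(img, size=window)
    # Hard black/white split: values at or above the threshold -> 255.
    binary = np.where(smoothed >= threshold, 255, 0)
    return binary.astype(np.uint8)
```

The output contains only the values 0 and 255, so the footprint contour stands out against the background with minimal data volume.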
Disclosure of Invention
The invention aims to provide a GAN-network-based optical red footprint image contour extraction method that can handle block noise in the original image, fill in missing footprint regions, and generate contour images of higher reliability more flexibly.
In order to achieve the purpose, the invention adopts the following technical scheme: an optical red footprint image contour extraction method based on a GAN network comprises the following steps in sequence:
(1) collecting an original optical red footprint image through an optical footprint collector;
(2) making a training set and a testing set;
(3) constructing a generator through a residual error network;
(4) constructing a discriminator through a PatchGAN Markov discriminator;
(5) the generator and the discriminator form a CycleGAN cycle-consistent generative adversarial network, which is used as the training network;
(6) sending the training set into a training network for training;
(7) using the trained generator as the contour extraction test network, inputting the source-domain data of the test set, and obtaining the optical red footprint image contour.
The step (2) specifically comprises the following steps:
(2a) unifying all the original optical red footprint images into a right foot image, and erasing a scale in the image;
(2b) randomly extracting images from the original optical red footprint images as the source-domain data of the training set and the test set, and then randomly extracting further images which, after processing, serve as the target-domain data of the training set.
The step (3) specifically comprises the following steps:
(3a) the generator is a residual network structure with 9 residual blocks. Let c7s1_k denote a convolution layer with k filters, kernel size 7 × 7 and stride 1; let dk denote a down-sampling layer with k filters, kernel size 3 × 3 and stride 2; let rk denote a residual block containing two convolution layers with k filters and kernel size 3 × 3; and let uk denote an up-sampling layer with k filters, kernel size 3 × 3 and stride 1/2. The structure of the generator is then: c7s1_64, d128, d256, r256, r256, r256, r256, r256, r256, r256, r256, r256, u128, u64, c7s1_3;
a self-attention module is added after the two convolution layers of the last residual block of the generator, and the obtained features are recalibrated according to the formulas:

\beta_{i,j} = \frac{\exp(s_{i,j})}{\sum_{i=1}^{N} \exp(s_{i,j})}, \qquad s_{i,j} = e(x_i)\, g(x_j)^{\top}   (1)

y_j = \sum_{i=1}^{N} \beta_{i,j}\, h(x_i)   (2)

where e(x_i) is the i-th row of the feature map e(x) in space e, g(x_j) is the j-th row of the feature map g(x) in space g, s_{i,j} is the element in row i and column j of the two-dimensional matrix s, β_{i,j} is the element in row i and column j of the two-dimensional matrix β, i is the row index, N is the maximum value of i, and h(x_i) is the i-th row of the feature map h(x) in space h; formula (1) is the attention-mask calculation and formula (2) the feature-recalibration calculation;
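The attention-mask and recalibration computations of formulas (1) and (2) can be sketched in numpy; the flattened (N positions × C channels) layout, the projection matrices standing in for the 1 × 1 convolutions, and the softmax axis are assumptions based on the symbol definitions above:

```python
import numpy as np

def self_attention(x, We, Wg, Wh):
    """Sketch of formulas (1) and (2).
    x: (N, C) flattened feature map, N = H*W positions.
    We, Wg, Wh: (C, C') projections standing in for the 1x1 convolutions
    that produce the feature spaces e, g and h (hypothetical shapes)."""
    e, g, h = x @ We, x @ Wg, x @ Wh
    s = e @ g.T                            # s[i, j] = e(x_i) . g(x_j)
    s = s - s.max(axis=0, keepdims=True)   # numerical stability only
    # Formula (1): softmax over i gives the attention mask beta.
    beta = np.exp(s) / np.exp(s).sum(axis=0, keepdims=True)
    # Formula (2): y_j = sum_i beta[i, j] * h(x_i).
    y = beta.T @ h
    return y, beta
```

Each column of the mask sums to one, so every output row is a convex combination of the rows of h, i.e. a recalibration of the features by their relevance to every position.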
(3b) a spectral normalization operation is performed after each convolution layer in the generator:

f_w(x) = Wx = U \Sigma V^{\top} x   (3)

where U is an m × m orthogonal matrix, Σ is an m × n diagonal matrix whose diagonal entries are the singular values of the weight matrix W, V is an n × n orthogonal matrix, and x is the input variable.
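Under the SVD view of formula (3), spectral normalization amounts to dividing the weight matrix by its largest singular value; a minimal numpy sketch, not the patent's implementation:

```python
import numpy as np

def spectral_normalize(W):
    """Divide W by its spectral coefficient (largest singular value),
    so the normalized linear map x -> (W / sigma_max) x has Lipschitz
    constant at most 1."""
    sigma_max = np.linalg.svd(W, compute_uv=False)[0]  # singular values sorted descending
    return W / sigma_max
```

In practice frameworks approximate the largest singular value with power iteration rather than a full SVD, but the effect on the weights is the same.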
The step (5) specifically comprises the following steps:
Let a_i and b_i denote one sample from the source domain and the target domain respectively; the CycleGAN cycle-consistent generative adversarial network is then represented as:

a_i → G_A(a_i) → G_B(G_A(a_i))

b_i → G_B(b_i) → G_A(G_B(b_i))

A sample a_i in the source domain A is mapped into the target domain by generator G_A, written G_A(a_i); G_A(a_i) is then mapped back to the source domain by generator G_B, written G_B(G_A(a_i)).
A sample b_i in the target domain B is mapped into the source domain by generator G_B, written G_B(b_i); G_B(b_i) is then mapped back to the target domain by generator G_A, written G_A(G_B(b_i)).
G_A(a_i) and G_B(b_i) are generated samples; G_B(G_A(a_i)) and G_A(G_B(b_i)) are cyclically generated samples;
(5a) the Wasserstein distance replaces the traditional cross-entropy-based adversarial loss; it is computed as:

W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x,y) \sim \gamma}\big[ \|x - y\| \big]   (4)

where Π(P_r, P_g) is the set of all joint distributions combining the true distribution P_r and the generated distribution P_g, γ is any one joint distribution in that set, the variable x is a true sample and the variable y a generated sample under γ, and E_{(x,y)∼γ}[‖x − y‖] is the expectation of the distance between x and y under γ; the infimum (greatest lower bound) of this expectation over the set of joint distributions is taken.
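For one-dimensional empirical distributions the infimum in formula (4) has a closed form, which scipy computes directly; this only illustrates the metric itself, not the network training:

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Empirical samples from a "true" and a "generated" distribution; the
# generated samples are the true ones shifted by 1, so the optimal
# transport plan moves every point a distance of exactly 1.
real = np.array([0.0, 1.0, 2.0])
fake = np.array([1.0, 2.0, 3.0])
print(wasserstein_distance(real, fake))  # -> 1.0
```

Unlike cross entropy, this distance stays finite and informative even when the two distributions have disjoint support, which is why it stabilizes GAN training.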
To add the Wasserstein distance to the training process of the CycleGAN network, formula (4) is transformed into formula (5):

W(P_r, P_g) = \frac{1}{K} \sup_{\|f\|_L \le K} \Big( \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_g}[f(x)] \Big)   (5)

where ‖f‖_L ≤ K means the function f is K-Lipschitz continuous, i.e. the absolute value of the derivative of f is less than K with K ≥ 0, and sup denotes the supremum (least upper bound) over all functions f satisfying this condition;
in the training process of the CycleGAN network, the function f(x) is defined as an objective function f_w(x) containing a weight w to be solved for, converting formula (5) into formula (6):

K \cdot W(P_r, P_g) \approx \max_{w:\, \|f_w\|_L \le K} \Big( \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{x \sim P_g}[f_w(x)] \Big)   (6)

Optimizing the objective function f_w(x), i.e. finding the optimal weight value w of f_w(x), means finding the optimal solution for w; since f_w(x) = Wx, W is the weight matrix and w the weights. ‖f_w‖_L ≤ K means f_w is K-Lipschitz continuous, i.e. the absolute value of the derivative of f_w is less than K. E denotes expectation; x ∼ P_r means the sample x comes from the true distribution P_r, and x ∼ P_g that it comes from the generated distribution P_g; E_{x∼P_r}[f_w(x)] is the expectation of f_w(x) when x comes from the true distribution, and E_{x∼P_g}[f_w(x)] the expectation when x comes from the generated distribution;
according to formula (6), the objective function of the discriminator in the reconstructed CycleGAN network is formula (7); provided the function f_w(x) is Lipschitz continuous, optimizing the discriminator is then equivalent to optimizing the Wasserstein distance between the true and generated distributions:

D = \max_{w} \Big( \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{x \sim P_g}[f_w(x)] \Big)   (7)

D is the objective function of the discriminator and equals, up to the constant K, the Wasserstein distance between the true and generated distributions;
after the Wasserstein distance is introduced, the adversarial loss of the generator is given by formula (8) and that of the discriminator by formula (9):

L_G = -\frac{1}{N} \sum_{i=1}^{N} D\big(G(a_i)\big)   (8)

L_D = \frac{1}{N} \sum_{i=1}^{N} D\big(G(a_i)\big) - \frac{1}{N} \sum_{i=1}^{N} D(b_i)   (9)

where G denotes the generator, D the discriminator, N the maximum value of i, a_i one source-domain sample among the N sample pairs, and b_i a target-domain sample;
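A short numpy sketch of the adversarial losses of formulas (8) and (9); the averaging over N discriminator scores is an assumption about the normalization:

```python
import numpy as np

def wgan_losses(d_fake, d_real):
    """Wasserstein adversarial losses.
    d_fake: array of D(G(a_i)) scores over N generated samples.
    d_real: array of D(b_i) scores over N real target-domain samples."""
    loss_g = -np.mean(d_fake)                    # formula (8): push D(fake) up
    loss_d = np.mean(d_fake) - np.mean(d_real)   # formula (9): push D(real) above D(fake)
    return loss_g, loss_d
```

Minimizing loss_d widens the gap between the discriminator's scores on real and generated samples, which is exactly the dual estimate of the Wasserstein distance in formula (7).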
(5b) a cycle-consistency loss is introduced to enforce a one-to-one correspondence between the original image and the contour image:

L_{cyc} = \frac{1}{N} \sum_{i=1}^{N} \Big( \big\|G_B(G_A(a_i)) - a_i\big\|_1 + \big\|G_A(G_B(b_i)) - b_i\big\|_1 \Big)   (10)

where ‖G_B(G_A(a_i)) − a_i‖_1 is the Manhattan distance between G_B(G_A(a_i)) and a_i; a_i is a source-domain sample which, after passing through generator G_A and then generator G_B, yields the new source-domain image G_B(G_A(a_i)); b_i is a target-domain sample, and G_A(G_B(b_i)) is the new image obtained by passing b_i through the two generators.
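The cycle-consistency loss of formula (10) can be sketched as follows; averaging the L1 distance over pixels and batch is an assumption, since the patent's exact normalization is not visible:

```python
import numpy as np

def cycle_consistency_loss(a, a_rec, b, b_rec):
    """Formula (10): a_rec = G_B(G_A(a)) and b_rec = G_A(G_B(b)) are the
    cyclically generated samples; the Manhattan (L1) distance penalizes any
    information the a -> b -> a round trip fails to preserve."""
    return np.mean(np.abs(a_rec - a)) + np.mean(np.abs(b_rec - b))
```

Driving this term to zero forces each generator to be (approximately) the inverse of the other, which pins the contour image to its original footprint image.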
The step (6) specifically comprises the following steps:
(6a) setting the network hyper-parameters, taking one sample each from the source domain and the target domain of the training set, and producing the corresponding generated samples and cyclically generated samples with the CycleGAN network;
(6b) calculating the adversarial losses lossD_A and lossD_B of the discriminators;
(6c) calculating the adversarial losses lossG_A and lossG_B and the cycle-consistency loss lossC of the generators; the total generator loss is then:
lossG_total = lossG_A + lossG_B + lossC
(6d) selecting the RMSProp optimizer, setting its parameters, training, and saving the CycleGAN cycle-consistent generative adversarial network.
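Step (6d) selects the RMSProp optimizer; one parameter update can be sketched as follows, with placeholder hyper-parameter values (the patent's settings are not given):

```python
import numpy as np

def rmsprop_step(w, grad, cache, lr=1e-4, decay=0.9, eps=1e-8):
    """One RMSProp update on parameters w given gradient grad.
    cache holds the running mean of squared gradients; lr, decay and
    eps are hypothetical hyper-parameter values."""
    cache = decay * cache + (1 - decay) * grad ** 2   # running mean of squared grads
    w = w - lr * grad / (np.sqrt(cache) + eps)        # per-parameter scaled step
    return w, cache
```

RMSProp is a common choice for WGAN-style training because, unlike momentum-heavy optimizers, its per-parameter scaling behaves well with the clipped or normalized critic used here.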
According to the above technical scheme, the beneficial effects of the invention are as follows. First, the method applies one unified pipeline directly to the lightly preprocessed original optical red footprint images to extract contour images, simplifying contour extraction. Second, it can ignore the differences between original images and process a whole batch with a network of identical parameters, reducing the computational cost of contour extraction. Third, the invention uses a more flexible GAN algorithm that weakens the pixel-wise correspondence between same-position pixel values in the original image and the contour image, ignores flaws in the original image, and handles cases some traditional methods cannot. Fourth, the invention adds a self-attention mechanism to the CycleGAN structure so that the whole network learns complete structural information from the relations between structures in the image, guiding the learning direction and improving the contour extraction effect. Fifth, the spectral normalization operation is added to the CycleGAN structure and the Wasserstein distance replaces the traditional cross-entropy-based adversarial loss, which effectively avoids vanishing gradients during GAN training and makes the training more stable.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic diagram of a fourth generation optical footprint harvester;
FIG. 3 is an original image;
FIG. 4 is a pre-processed image;
FIG. 5 is a profile image;
FIG. 6 is a schematic diagram of a self-attention mechanism;
FIG. 7 is a schematic representation of the structure of cycleGAN;
FIG. 8 is a schematic of the cyclic uniform loss.
Detailed Description
As shown in fig. 1, a GAN network-based optical red footprint image contour extraction method includes the following sequential steps:
(1) collecting an original optical red footprint image through an optical footprint collector;
(2) making a training set and a testing set;
(3) constructing a generator through a residual error network;
(4) constructing a discriminator through a PatchGAN Markov discriminator;
(5) the generator and the discriminator form a CycleGAN cycle-consistent generative adversarial network, which is used as the training network;
(6) sending the training set into a training network for training;
(7) using the trained generator as the contour extraction test network, inputting the source-domain data of the test set, and obtaining the optical red footprint image contour.
Example one
Step 1: collecting an optical red footprint image through an optical footprint collector;
the fourth generation optical footprint acquisition instrument (FMC500IV) developed by hangzhou innovation and electronics technology development limited company is used for acquiring optical bare images, as shown in fig. 2, the effective acquisition area of the device is the black part of the front of the instrument, power interfaces and USB interfaces distributed on the left side and the right side of the instrument are connected according to the specification, and corresponding drivers and computer software are installed on a computer for acquisition. When the footprint is in contact with the acquisition surface, the triple prism below the acquisition surface of the equipment can be totally reflected, and the captured image is transmitted to a computer software interface and stored locally in a computer. The device can collect a left foot or right foot optical red footprint image at a time, and stores the collected image data according to a uniform naming specification, wherein the resolution of the stored image data is 325 x 670.
Step 2: preprocessing the acquired optical red footprint images and making the data set. In step 1, optical red footprint images of 300 different IDs were collected, each ID comprising 10 left-foot and 10 right-foot images, 6000 images in total; an original image is shown in fig. 3.
Because of the design of the acquisition instrument, the original optical red footprint image contains a ruler for marking length, located on the big-toe side of the footprint, as shown in fig. 3. All left-foot images are converted into right-foot images by detecting the direction of the scale in the image; the pixel values in the 35-pixel-wide strip on the left side of the image are then all set to 255 to erase the scale; finally the images are converted to black and white. The processed image data are shown in fig. 4.
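The preprocessing in this step (mirroring left-foot images into right-foot orientation and blanking the 35-pixel scale strip) can be sketched as follows; treating the image as a 2-D grayscale array is an assumption:

```python
import numpy as np

def preprocess(img, is_left_foot, scale_width=35):
    """Mirror a left-foot image into right-foot orientation and erase the
    ruler/scale strip on the left edge by setting it to white (255)."""
    if is_left_foot:
        img = img[:, ::-1]           # horizontal flip -> right-foot orientation
    img = img.copy()                 # avoid writing into the caller's array
    img[:, :scale_width] = 255       # blank the 35-pixel-wide scale region
    return img
```

In the patent the flip is driven by detecting the scale's direction; here the `is_left_foot` flag stands in for that detection step.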
450 different images were randomly selected from this set and divided into 3 groups of 150 images. The first group is defined as the source-domain data of the training set; the second group as the source-domain data of the test set; the third group was manually processed with Photoshop software into target-domain data (i.e., optical red footprint contour images, as shown in fig. 5) and is defined as the target-domain data of the training set.
Step 3: defining the generator. The basic structure of the generator used by the invention is a ResNet with 9 residual blocks; by adding residual units through a shortcut mechanism, the ResNet structure largely avoids the degradation problem of deep networks. The basic structure of the generator of the present invention can be expressed as: c7s1_64, d128, d256, r256, r256, r256, r256, r256, r256, r256, r256, r256, u128, u64, c7s1_3.
To make the network attend to the correlations between red footprint structures and improve the quality of the generated contours, a self-attention module is added after the two convolution layers of the last residual block of the generator, and the obtained features are recalibrated.
The self-attention module focuses on the relevance between points in the feature map, i.e., it computes the influence of all other points on the current point, thereby capturing the structural features of the image and improving the network's ability to describe image details. A detailed block diagram of the self-attention module is shown in fig. 6. To reduce computation, the feature map x (of dimension H × W × C) is first passed through three 1 × 1 convolutions to obtain three feature spaces e, g and h, whose feature maps are then flattened to HW × C: each row of a flattened map holds all channel information at one spatial position, and each column holds all position information of one channel. The feature map of space e is multiplied by the transposed feature map of space g and normalized by a softmax layer to obtain the attention mask, as in formula (1), where β_{i,j} represents the degree to which the j-th position influences the generation of the i-th position in the map. Finally, the feature map of space h is multiplied by the attention mask, as in formula (2), increasing the response of key areas and suppressing the response of other areas; the result is reshaped back to obtain the output feature map y (of dimension H × W × C). The self-attention module lets the whole network learn complete structural information from the relations between structures in the image, progressing from local to global learning, which avoids blind learning, guides the learning direction and improves network performance.
To address the instability and vanishing gradients that arise when training the CycleGAN network, the Wasserstein distance is subsequently introduced to replace the traditional cross-entropy-based adversarial loss. The invention therefore performs a spectral normalization operation after each convolution layer in the generator (before the self-attention module, where one is present) to provide the condition under which the Wasserstein distance holds, namely that the function satisfies 1-Lipschitz continuity, meaning the maximum gradient of the function at any point is 1.
The spectral normalization operation limits f_w(x) to a certain range, thereby guaranteeing that f_w(x) satisfies 1-Lipschitz continuity. As shown in equation (3), a singular value decomposition (SVD) is performed on the weight matrix W (of size m×n) of the function f_w(x). After decomposition, U is an m×m orthogonal matrix, Σ is an m×n diagonal matrix whose diagonal entries are the singular values of W, and V is an n×n orthogonal matrix. Through SVD, a complex matrix is factored into the product of three simple matrices, representing three matrix transformations of W: U and V are rotations, and Σ is a stretching. Only the stretching changes the length of a vector, so the 1-Lipschitz continuity of f_w(x) depends only on Σ. Defining the largest singular value obtained from the SVD as the spectral coefficient of the matrix, the weight matrix W is divided by its spectral coefficient after each convolution operation during training; this keeps every entry of the stretching matrix Σ at most 1, i.e. it guarantees that f_w(x) satisfies 1-Lipschitz continuity. This process is the spectral normalization operation on the weight matrix W.
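The divide-by-spectral-coefficient step described above can be sketched in a few lines of NumPy (an illustration of the idea via a full SVD; in practice the spectral norm is usually approximated with power iteration rather than a full decomposition):

```python
import numpy as np

def spectral_normalize(W):
    """Divide a weight matrix by its largest singular value (its spectral coefficient),
    so the linear map x -> Wx has Lipschitz constant 1."""
    sigma_max = np.linalg.svd(W, compute_uv=False)[0]  # singular values come back in descending order
    return W / sigma_max

rng = np.random.default_rng(1)
W = rng.standard_normal((6, 4)) * 10.0       # an arbitrary "weight matrix"
W_sn = spectral_normalize(W)
# The largest singular value of the normalized matrix is 1, so no vector is stretched.
print(np.linalg.svd(W_sn, compute_uv=False)[0])
```

Dividing by the spectral coefficient rescales all singular values by the same factor, so the largest one becomes exactly 1 while the ratios between them (the rotations U and V) are untouched.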
Step 4: the discriminator is defined. The basic structure of the discriminator used by the invention is PatchGAN, which judges whether each patch of a certain size in the image comes from the original image. Compared with a global discriminator, such a patch-level discriminator has fewer parameters and, being fully convolutional, can process images of any size. Let cxsi_k denote a Convolution-LeakyReLU layer with k filters, convolution kernel size x×x and stride i; let cksi denote a Convolution-InstanceNorm-LeakyReLU layer with k filters, convolution kernel size 4×4 and stride i, where the slope of the LeakyReLU is 0.2. The basic discriminator structure of the invention can then be expressed as: c4s2_64, c128s2, c256s2, c512s1, c4s1_1, where the last layer omits the LeakyReLU.
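Given the five 4×4-kernel layers listed above (strides 2, 2, 2, 1, 1), the size of the input patch that each output score depends on follows from the standard receptive-field recursion; a small sketch (the layer list is read off the notation above, the function name is illustrative):

```python
def receptive_field(layers):
    """layers: list of (kernel, stride) pairs, first layer applied first.
    Returns the receptive field of one output unit on the input image."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump   # each layer widens the field by (k-1) input-pixel jumps
        jump *= s              # spacing in input pixels between adjacent outputs
    return rf

# Five 4x4-kernel layers with strides 2, 2, 2, 1, 1 as in the discriminator structure
print(receptive_field([(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]))  # 70
```

This yields the familiar 70×70 patch of the PatchGAN discriminator: each output score judges one 70×70 region of the input, regardless of the overall image size.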
For the same reasons as in steps 3.1 and 3.2, a spectral normalization operation is performed after each convolution layer in the discriminator, and a self-attention module is added after the c512s1 convolution layer (following its spectral normalization operation).
Step 5: the training network is built. The training network model of the invention follows the CycleGAN structure, as shown in FIG. 7; this structure can perform image translation well even in the absence of paired data sets. Here A denotes the training-set source domain, namely the optical red footprint image data set, and B denotes the training-set target domain, namely the contour data set. G_A and G_B are the generator structures defined in step 3: G_A generates target-domain images from source-domain images, and G_B generates source-domain images from target-domain images. D_A and D_B are the discriminator structures defined in step 4: D_A judges whether an image is a true target-domain image, and D_B judges whether an image belongs to the source domain. G_A and D_A form the forward GAN network, and G_B and D_B form the reverse GAN network. The forward GAN network with its adversarial loss ensures that sufficiently realistic target-domain images are generated; adding the reverse GAN network with a cycle-consistency loss constrains the translation so that each source-domain image and the target-domain image obtained from it remain in strict one-to-one correspondence.
Because the traditional cross-entropy-based adversarial loss of the conventional GAN network has certain defects, namely that gradient vanishing and mode collapse can occur during training, the Wasserstein distance is used to replace it, which ensures stable training and simplifies the structure of the loss function.
Compared with the cross-entropy-based adversarial loss, the Wasserstein distance avoids the mode-collapse phenomenon even when the two distributions have no overlap. When the distributions do not intersect, the cross-entropy-based adversarial loss is abrupt and cannot provide gradient information, whereas the Wasserstein distance varies smoothly and still provides gradients. To add the Wasserstein distance to the GAN training process, equation (4) is first transformed into equation (5), where ‖f‖_L ≤ K means that the function f is K-Lipschitz continuous, i.e. the absolute value of the derivative of f does not exceed K (K ≥ 0), and sup denotes taking the supremum over all functions f satisfying this condition.
In the training process of the CycleGAN cycle-consistent generative adversarial network, the function f(x) is taken to be an objective function containing the weights w to be solved for, i.e. equation (5) is converted into equation (6). For equation (6) to hold, the function f_w(x) must be Lipschitz continuous, so the spectral normalization operation must be used during training to limit w, and hence the derivative of f_w(x) with respect to x, to a certain range.
According to equation (6), the objective function of the discriminator in the CycleGAN cycle-consistent generative adversarial network can be reconstructed as equation (7). Under the premise that f_w(x) is Lipschitz continuous, optimizing the discriminator is then equivalent to estimating the Wasserstein distance between the true distribution and the generated distribution, and the gradient-vanishing problem does not occur.
After the Wasserstein distance is introduced, the adversarial loss of the generator is given by formula (8) and the adversarial loss of the discriminator by formula (9), where G denotes a generator, D denotes a discriminator, N denotes the number of samples in one round of training, a_i denotes one of the N source-domain samples, and b_i denotes a target-domain sample.
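Under the Wasserstein formulation, formulas (8) and (9) reduce to simple means over the discriminator's raw (un-sigmoided) scores; a NumPy sketch (illustrative only, with dummy score vectors standing in for the outputs of D):

```python
import numpy as np

def wgan_generator_loss(d_fake):
    """Formula (8): the generator tries to raise D's score on generated samples,
    so its loss is the negated mean score."""
    return -np.mean(d_fake)

def wgan_discriminator_loss(d_real, d_fake):
    """Formula (9): the discriminator (critic) pushes generated scores down
    and real scores up; the gap estimates the Wasserstein distance."""
    return np.mean(d_fake) - np.mean(d_real)

d_real = np.array([0.9, 1.1, 1.0])   # critic scores on real target-domain samples
d_fake = np.array([0.1, -0.1, 0.0])  # critic scores on generated samples
print(wgan_generator_loss(d_fake))
print(wgan_discriminator_loss(d_real, d_fake))
```

Note there is no log or sigmoid anywhere: the scores are used directly, which is why the loss stays smooth even when the real and generated distributions do not overlap.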
The CycleGAN structure requires the learned mappings to be cycle-consistent: as shown in fig. 8, every image a in data domain A should return to its starting point after a round-trip translation, which guarantees a one-to-one relationship between the image translation result and the original image, and vice versa:
a_i → G_A(a_i) → G_B(G_A(a_i)) ≈ a_i
b_i → G_B(b_i) → G_A(G_B(b_i)) ≈ b_i
To this end, a cycle-consistency loss is introduced, as shown in equation (10).
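The cycle-consistency loss of equation (10) is simply the mean L1 (Manhattan) distance between each sample and its round-trip reconstruction; a sketch with small NumPy arrays standing in for images (the function name is illustrative, not from the patent):

```python
import numpy as np

def cycle_consistency_loss(reals_a, recs_a, reals_b, recs_b):
    """Mean L1 distance between originals and their cyclically reconstructed versions,
    summed over both translation directions and averaged over N sample pairs."""
    loss = 0.0
    for real, rec in zip(list(reals_a) + list(reals_b), list(recs_a) + list(recs_b)):
        loss += np.abs(real - rec).sum()    # L1 (Manhattan) distance per sample
    return loss / len(reals_a)              # divide by N, the samples per domain

a = [np.ones((2, 2))]             # one "source-domain image"
rec_a = [np.ones((2, 2)) * 0.75]  # its round trip a -> G_A -> G_B
b = [np.zeros((2, 2))]
rec_b = [np.zeros((2, 2))]        # perfect reconstruction on the target side
print(cycle_consistency_loss(a, rec_a, b, rec_b))  # 1.0: four pixels each off by 0.25
```

A loss of zero means both round trips return every image exactly to its starting point, which is the one-to-one correspondence the training network enforces.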
Step 6: the data set made in step 2.2 is fed into the constructed training network for training.
The number of training epochs is set to 200 and the batch size to 1. Denoting the source domain and a source-domain sample by A and a, and the target domain and a target-domain sample by B and b, the two translations of one sample proceed as:
real_a→fake_b→rec_a
real_b→fake_a→rec_b
A sample real_a and a sample real_b are taken from the source domain and the target domain respectively; real_a passes through generator G_A to produce the generated sample fake_b in the target domain, and fake_b passes through generator G_B to produce the cyclically generated sample rec_a in the source domain. The generation process real_b → fake_a → rec_b proceeds analogously.
According to formula (9), the loss of discriminator D_B is computed from real_a and fake_a as lossD_B, and the loss of discriminator D_A is computed from real_b and fake_b as lossD_A.
According to formula (8), the loss of generator G_A is computed from fake_b as lossG_A, and the loss of generator G_B from fake_a as lossG_B. According to formula (10), the cycle-consistency loss lossC is computed from real_a, rec_a, real_b and rec_b. The total loss of the generators is then:
lossG_total = lossG_A + lossG_B + lossC
The initial learning rate is set to 0.002; a fixed learning rate is used for the first 100 epochs, and the learning rate is then linearly decayed to 0 over the last 100 epochs. Using the RMSProp optimizer, the parameters of discriminator D_A are optimized according to lossD_A, those of discriminator D_B according to lossD_B, and those of generators G_A and G_B according to lossG_total; the trained network is then saved.
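The learning-rate schedule just described (fixed for the first 100 epochs, then linear decay to 0 over the last 100) can be expressed as a simple per-epoch multiplier; a sketch (the function name and 0-indexed epoch numbering are assumptions, not the patent's code):

```python
def lr_at(epoch, base_lr=0.002, fixed_epochs=100, decay_epochs=100):
    """Learning rate for a given epoch (0-indexed): constant for the first
    fixed_epochs, then linearly decayed to 0 over the next decay_epochs."""
    factor = 1.0 - max(0, epoch - fixed_epochs) / decay_epochs
    return base_lr * factor

print(lr_at(0))     # 0.002  (fixed phase)
print(lr_at(99))    # 0.002
print(lr_at(150))   # 0.001  (halfway through the decay)
print(lr_at(200))   # 0.0
```

In a framework this multiplier would typically be handed to a lambda-style scheduler that rescales the optimizer's learning rate once per epoch.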
Step 7: a contour extraction test network is built from the trained generator G_A; the source-domain data of the test data set are input, the resulting target-domain data are output, and the network is tested.
In conclusion, the method can apply a single unified pipeline directly to simply preprocessed original optical red footprint images to extract contour images, thereby simplifying the contour-extraction workflow. It can ignore the differences between individual original images and process a whole batch of data with a network sharing the same parameters, reducing the computational cost of contour extraction. By using the more flexible GAN network algorithm, it can weaken the pixel-wise correspondence between values at the same position in the original image and the contour image, overlook flaws in the original image, and handle cases that some traditional methods cannot.

Claims (5)

1. An optical red footprint image contour extraction method based on a GAN network is characterized in that: the method comprises the following steps in sequence:
(1) collecting an original optical red footprint image through an optical footprint collector;
(2) making a training set and a testing set;
(3) constructing a generator through a residual error network;
(4) constructing a discriminator through a PatchGAN Markov discriminator;
(5) the generator and the discriminator form a CycleGAN cycle generation countermeasure network, and the CycleGAN cycle generation countermeasure network is used as a training network;
(6) sending the training set into a training network for training;
(7) and (3) using the trained generator as a contour extraction test network, inputting source domain data of the test set, and obtaining the optical red footprint image contour.
2. The GAN network-based optical red footprint image contour extraction method of claim 1, wherein: the step (2) specifically comprises the following steps:
(2a) unifying all the original optical red footprint images into a right foot image, and erasing a scale in the image;
(2b) randomly extracting images from the original optical red footprint image as source domain data of a training set and a testing set, and then randomly extracting the images to be used as target domain data of the training set after processing.
3. The GAN network-based optical red footprint image contour extraction method of claim 1, wherein: the step (3) specifically comprises the following steps:
(3a) the generator is a residual network structure with 9 residual blocks; c7s1_k denotes a down-sampling layer with k filters, convolution kernel size 7×7 and stride 1; dk denotes a down-sampling layer with k filters, convolution kernel size 3×3 and stride 2; rk denotes a residual block comprising two convolutional layers with k filters and convolution kernel size 3×3; uk denotes an up-sampling layer with k filters, convolution kernel size 3×3 and stride 1/2; the structure of the generator is represented as: c7s1_64, d128, d256, r256, r256, r256, r256, r256, r256, r256, r256, r256, u128, u64, c7s1_3;
adding a self-attention module after the two convolution layers of the last residual block of the generator, and recalibrating the obtained features, with the formulas as follows:
$$\beta_{i,j}=\frac{\exp(s_{i,j})}{\sum_{j=1}^{N}\exp(s_{i,j})},\qquad s_{i,j}=e(x_i)\,g(x_j)^{\mathsf T} \tag{1}$$

$$y_i=\sum_{j=1}^{N}\beta_{i,j}\,h(x_j) \tag{2}$$
wherein: e(x_i) is the ith row of the feature map e(x) in space e, g(x_j) is the jth row of the feature map g(x) in space g, s_{i,j} is the element in row i and column j of the two-dimensional matrix s, β_{i,j} is the element in row i and column j of the two-dimensional matrix β, i denotes the row number, N denotes the maximum value of the index, and h(x_i) is the ith row of the feature map h(x) in space h; formula (1) is the attention-mask calculation formula, and formula (2) is the feature-recalibration calculation formula;
(3b) a spectral normalization operation is performed after each convolutional layer in the generator:
$$f_w(x)=Wx=U\Sigma V^{\mathsf T}x \tag{3}$$
wherein U is an m×m orthogonal matrix, Σ is an m×n diagonal matrix whose diagonal entries are the singular values of the weight matrix W, V is an n×n orthogonal matrix, and x is the variable.
4. The GAN network-based optical red footprint image contour extraction method of claim 1, wherein: the step (5) specifically comprises the following steps:
denoting one sample in the source domain by a_i and one sample in the target domain by b_i, the CycleGAN cycle-consistent generative adversarial network is represented as follows:
a_i → G_A(a_i) → G_B(G_A(a_i))
b_i → G_B(b_i) → G_A(G_B(b_i))
one sample a in the source domain aiTransition into the target domain via generator G _ A, labeled G _ A (a)i) Then G _ A (a)i) And transits to the source domain through the generator G _ B, labeled G _ B (G _ A (a))i));
one sample b_i in target domain B is translated into the source domain by generator G_B, denoted G_B(b_i); G_B(b_i) is then translated back into the target domain by generator G_A, denoted G_A(G_B(b_i));
G_A(a_i) and G_B(b_i) are both generated samples, and G_B(G_A(a_i)) and G_A(G_B(b_i)) are both cyclically generated samples;
(5a) the Wasserstein distance is used to replace the traditional countermeasure loss based on the cross entropy, and the calculation formula of the Wasserstein distance is as follows:
$$W(P_r,P_g)=\inf_{\gamma\in\Pi(P_r,P_g)}\mathbb{E}_{(x,y)\sim\gamma}\big[\|x-y\|\big] \tag{4}$$
in the formula, Π(P_r, P_g) denotes the set of all joint distributions whose marginals are the true distribution P_r and the generated distribution P_g, γ denotes any one joint distribution in the set, the variable x denotes a true sample under γ and the variable y a generated sample under γ, and E_{(x,y)~γ}[‖x−y‖] denotes the expectation under γ of the distance between x and y; the infimum of this expectation over the set of joint distributions is taken;
to add the Wasserstein distance to the training process of the CycleGAN cycle-consistent generative adversarial network, equation (4) is transformed into equation (5):
$$W(P_r,P_g)=\frac{1}{K}\sup_{\|f\|_L\le K}\Big(\mathbb{E}_{x\sim P_r}\big[f(x)\big]-\mathbb{E}_{x\sim P_g}\big[f(x)\big]\Big) \tag{5}$$
wherein ‖f‖_L ≤ K denotes that the function f is K-Lipschitz continuous, i.e. the absolute value of the derivative of f is at most K, with K ≥ 0, and sup denotes taking the supremum over all functions f satisfying this condition;
in the training process of the CycleGAN cycle-consistent generative adversarial network, the function f(x) is defined as an objective function containing the weights w to be solved for, i.e. equation (5) is converted into equation (6):
$$K\cdot W(P_r,P_g)=\sup_{\|f_w\|_L\le K}\Big(\mathbb{E}_{x\sim P_r}\big[f_w(x)\big]-\mathbb{E}_{x\sim P_g}\big[f_w(x)\big]\Big) \tag{6}$$
optimizing the objective function f_w(x), i.e. finding the optimal weights w of f_w(x), means finding the optimal solution for w; the function f(x) is written as an objective function f_w(x) containing the weights w because f_w(x) = Wx, where W is the weight matrix and w the weights; ‖f_w‖_L ≤ K denotes that f_w is K-Lipschitz continuous, i.e. the absolute value of its derivative is at most K; E denotes expectation, P_r the true distribution, x ~ P_r that sample x is drawn from the true distribution, P_g the generated distribution, and x ~ P_g that sample x is drawn from the generated distribution; $\mathbb{E}_{x\sim P_r}[f_w(x)]$ denotes the expectation of the value f_w(x) when x is drawn from the true distribution P_r, and $\mathbb{E}_{x\sim P_g}[f_w(x)]$ the expectation of f_w(x) when x is drawn from the generated distribution P_g;
according to equation (6), the objective function of the discriminator in the CycleGAN cycle-consistent generative adversarial network is reconstructed as equation (7); under the premise that f_w(x) is Lipschitz continuous, optimizing the discriminator is equivalent to estimating the Wasserstein distance between the true distribution and the generated distribution:
$$L(D)=\mathbb{E}_{x\sim P_r}\big[f_w(x)\big]-\mathbb{E}_{x\sim P_g}\big[f_w(x)\big] \tag{7}$$
d (x) is the objective function of the arbiter, which is equal to the true distribution and the Wasserstein distance between the resulting distributions;
after the Wasserstein distance is introduced, the adversarial loss of the generator is given by formula (8), and the adversarial loss of the discriminator by formula (9):
$$L_G=-\frac{1}{N}\sum_{i=1}^{N}D\big(G(a_i)\big) \tag{8}$$
$$L_D=\frac{1}{N}\sum_{i=1}^{N}D\big(G(a_i)\big)-\frac{1}{N}\sum_{i=1}^{N}D\big(b_i\big) \tag{9}$$
wherein G denotes a generator, D a discriminator, N the number of samples, a_i one source-domain sample among the N samples, and b_i a target-domain sample;
(5b) a cycle-consistency loss is introduced to constrain the one-to-one correspondence between the original image and the contour image, the cycle-consistency loss being given by:
$$L_{cyc}=\frac{1}{N}\sum_{i=1}^{N}\Big(\big\|G_B(G_A(a_i))-a_i\big\|_1+\big\|G_A(G_B(b_i))-b_i\big\|_1\Big) \tag{10}$$
wherein ‖G_B(G_A(a_i)) − a_i‖_1 is the Manhattan (L1) distance between G_B(G_A(a_i)) and a_i; a_i denotes a source-domain sample which, after passing through generator G_A and then generator G_B, yields a new image G_B(G_A(a_i)) in the source domain; b_i denotes a target-domain sample, and G_A(G_B(b_i)) denotes the new image obtained from b_i after passing through the two generators.
5. The GAN network-based optical red footprint image contour extraction method of claim 1, wherein: the step (6) specifically comprises the following steps:
(6a) setting the network hyper-parameters, taking one sample each from the source domain and the target domain of the training set, and generating the corresponding generated samples and cyclically generated samples through the CycleGAN cycle-consistent generative adversarial network;
(6b) calculating the countermeasure loss lossD _ a, lossD _ B of the discriminator;
(6c) calculating the adversarial losses lossG_A and lossG_B and the cycle-consistency loss lossC of the generators; the total generator loss is then:
lossG_total = lossG_A + lossG_B + lossC
(6d) selecting the RMSProp optimizer, setting its parameters, training, and saving the CycleGAN cycle-consistent generative adversarial network.
CN202210158517.8A 2022-02-21 2022-02-21 Optical red footprint image contour extraction method based on GAN network Pending CN114529737A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210158517.8A CN114529737A (en) 2022-02-21 2022-02-21 Optical red footprint image contour extraction method based on GAN network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210158517.8A CN114529737A (en) 2022-02-21 2022-02-21 Optical red footprint image contour extraction method based on GAN network

Publications (1)

Publication Number Publication Date
CN114529737A true CN114529737A (en) 2022-05-24

Family

ID=81624298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210158517.8A Pending CN114529737A (en) 2022-02-21 2022-02-21 Optical red footprint image contour extraction method based on GAN network

Country Status (1)

Country Link
CN (1) CN114529737A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115205738A (en) * 2022-07-05 2022-10-18 广州和达水务科技股份有限公司 Emergency drainage method and system applied to urban inland inundation
CN115205738B (en) * 2022-07-05 2023-08-01 广州和达水务科技股份有限公司 Emergency drainage method and system applied to urban inland inundation

Similar Documents

Publication Publication Date Title
CN111627019B (en) Liver tumor segmentation method and system based on convolutional neural network
CN112184577B (en) Single image defogging method based on multiscale self-attention generation countermeasure network
CN107730451A (en) A kind of compressed sensing method for reconstructing and system based on depth residual error network
CN112733950A (en) Power equipment fault diagnosis method based on combination of image fusion and target detection
CN111369565A (en) Digital pathological image segmentation and classification method based on graph convolution network
CN111080531B (en) Super-resolution reconstruction method, system and device for underwater fish image
CN110705340B (en) Crowd counting method based on attention neural network field
CN114170286B (en) Monocular depth estimation method based on unsupervised deep learning
CN116682120A (en) Multilingual mosaic image text recognition method based on deep learning
CN112669248A (en) Hyperspectral and panchromatic image fusion method based on CNN and Laplacian pyramid
Li et al. Single image dehazing with an independent detail-recovery network
CN111489305B (en) Image enhancement method based on reinforcement learning
CN116486074A (en) Medical image segmentation method based on local and global context information coding
CN113870327B (en) Medical image registration method based on prediction multi-level deformation field
CN114529737A (en) Optical red footprint image contour extraction method based on GAN network
CN114155171A (en) Image restoration method and system based on intensive multi-scale fusion
CN116229083A (en) Image denoising method based on lightweight U-shaped structure network
CN116071229A (en) Image super-resolution reconstruction method for wearable helmet
CN115346259A (en) Multi-granularity academic emotion recognition method combined with context information
CN114862696A (en) Facial image restoration method based on contour and semantic guidance
CN115035170A (en) Image restoration method based on global texture and structure
CN115018864A (en) Three-stage liver tumor image segmentation method based on adaptive preprocessing
CN114529519A (en) Image compressed sensing reconstruction method and system based on multi-scale depth cavity residual error network
CN114022362A (en) Image super-resolution method based on pyramid attention mechanism and symmetric network
CN113269702A (en) Low-exposure vein image enhancement method based on cross-scale feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination