CN114565806A - Feature domain optimization small sample image conversion method based on characterization enhancement - Google Patents
- Publication number: CN114565806A (application CN202210170641.6A, China)
- Legal status: Pending (assumption, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention belongs to the technical field of image processing, and specifically relates to a feature domain optimization small sample image conversion method based on characterization enhancement. The method introduces prior knowledge from the histogram equalization algorithm and the Canny algorithm to enhance image contrast and edge information; on this basis, a channel attention mechanism based on sub-pixel convolution improves the network's feature extraction capability and enhances the image characterization, addressing the problem that fine image features are not prominent. The invention further provides a conversion mechanism based on a feature domain optimization algorithm: by introducing an adversarial idea, the feature domain and the content domain are divided adaptively; the rich variety of images in the source domain is exploited to reduce the parameter space; and an added noise strategy keeps the network from being limited to generating a single sample, alleviating the mode collapse problem. A reconstruction strategy is constructed from the characteristics of the source domain and the target domain, and the small sample image conversion task is completed using weakened cyclic semantic consistency, yielding converted images with better visual effect.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a feature domain optimization small sample image conversion method based on characterization enhancement.
Background
An image is an important carrier of environmental perception and knowledge acquisition, and is significant for scientific exploration, research and information mining. The image conversion task learns, from a given data set and using a deep learning algorithm, a good network mapping that generates images combining the characteristics of the target data domain with the content of the source data domain. Studying efficient and reliable image conversion algorithms has important theoretical value and practical significance; scholars at home and abroad have studied image conversion in depth and obtained important results. The best-known and most effective image conversion methods in the existing literature mainly include: 1. Supervised image conversion based on generative adversarial networks: Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, et al., "Image-to-image translation with conditional adversarial networks", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, Hawaii, USA, 2017: 1125-1134, proposed solving the large-scale supervised image conversion task by introducing a conditional generative adversarial network. 2. Unsupervised image conversion based on generative adversarial networks: Jun-Yan Zhu, Taesung Park, Phillip Isola, et al., "Unpaired image-to-image translation using cycle-consistent adversarial networks", Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2223-2232. 3. Unsupervised image conversion based on a shared domain: Ming-Yu Liu, Thomas Breuel, Jan Kautz, "Unsupervised image-to-image translation networks", Advances in Neural Information Processing Systems, California, USA, 2017: 700-708, proposed a conversion framework built on the assumption that the two image domains share the same content domain and differ only in the expression form between domains, obtaining a better conversion effect.
When image conversion involves small samples and unsupervised settings, supervised methods cannot be trained at all, and cycle-consistency-style solutions require large-scale image data sets, so they cannot be applied in small sample settings. Scholars at home and abroad have studied image conversion algorithms for the small sample setting in depth; the methods mainly include: 1. Small sample image generation based on data enhancement: in 2018, Hang Gao, Zheng Shou, Alireza Zareian, et al., "Low-shot learning via covariance-preserving adversarial augmentation networks", arXiv preprint arXiv:1810.11730, 2018: 1-13, proposed using a generative adversarial network to synthesize similar data from large data sets, alleviating the data shortage of small sample target domains and obtaining better results. 2. Small sample image classification based on regularized meta-learning: in 2018, Yabin Zhang, Hui Tang, Kui Jia, "Fine-grained visual categorization using meta-learning optimization with sample selection of auxiliary data", Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 2018: 233-. 3. Unsupervised small sample image conversion: in 2019, Ming-Yu Liu, Xun Huang, Arun Mallya, et al., "Few-shot unsupervised image-to-image translation", Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 2019: 10551-.
Disclosure of Invention
The invention aims to provide a feature domain optimization small sample image conversion method based on characterization enhancement, which retains more image detail information and generates richer converted images.
A feature domain optimization small sample image conversion method based on characterization enhancement comprises the following steps:
step 1: acquiring a small sample image dataset;
step 2: training an image conversion model;
step 2.1: initializing parameters and setting the maximum iteration times;
step 2.2: enhancing the image contrast information and the edge information;
step 2.3: performing characteristic processing by utilizing a compression operation and an excitation operation of a channel attention mechanism;
step 2.4: performing up-sampling operation on the image feature set after feature processing by combining sub-pixel convolution;
step 2.5: completing small sample image conversion with rich details by a conversion mechanism based on a feature domain optimization algorithm;
step 2.5.1: introducing a countermeasure thought, adaptively dividing a characteristic domain and a content domain, and reducing a parameter space by utilizing various rich images in a source domain;
the feature information is accessed into a new discriminator to construct a new discriminating process, and the concept of unchanged contents of two times of conversion is combined in the constructing discriminating process, namely the image feature information of the converted image is similar to the image feature information extracted by a given sample after the image feature information is re-extracted; in order to avoid overlarge loss caused by simple application of image difference and further unbalance a loss function, the process is completed by utilizing a countermeasure idea, image feature information extracted twice is fitted to the vicinity of the same feature distribution, parameters of a generator in a network are forced to be further optimized and a better feature extraction process is completed through the loss function, so that the division of a feature domain is more accurate, and the purpose of optimizing the division of the feature domain and a content domain is achieved;
By adding a new discriminator of class features, with the features of the reconstructed image as the pseudo label and the features of the real image as the real label, the discrimination is completed; the added loss function takes the standard adversarial form

L_f = E_{x,l}[log D(l, x)] + E_{x,l}[log(1 − D(l, G(x, l)))]

where x denotes the input image; l denotes the feature class image; D denotes the discriminator; G denotes the generator; and E denotes the expectation;
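As an illustration only, the class-feature discrimination described above can be sketched with a toy logistic discriminator over feature vectors. Everything here is a hypothetical stand-in (the discriminator `d`, its weight `w`, and the feature sizes are not the patent's architecture); it only shows the standard adversarial objective with real-image features as real labels and reconstructed-image features as pseudo labels.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def d(feats, w):
    # Toy logistic discriminator over feature vectors (hypothetical stand-in).
    return sigmoid(feats @ w)

def feature_adv_loss(real_feats, fake_feats, w):
    """Discriminator objective in standard adversarial form: real-image
    features act as real labels, reconstructed-image features as pseudo
    labels. Clipping keeps the logs finite."""
    p_real = np.clip(d(real_feats, w), 1e-12, 1 - 1e-12)
    p_fake = np.clip(d(fake_feats, w), 1e-12, 1 - 1e-12)
    return -(np.log(p_real).mean() + np.log(1.0 - p_fake).mean())

# Hypothetical feature batches: real-image vs reconstructed-image features.
real = rng.normal(loc=+1.0, size=(32, 8))
fake = rng.normal(loc=-1.0, size=(32, 8))
loss_sep = feature_adv_loss(real, fake, np.ones(8))    # a separating direction
loss_zero = feature_adv_loss(real, fake, np.zeros(8))  # uninformative discriminator
```

A weight that separates the two feature distributions yields a lower discriminator loss than an uninformative one, which is the pressure that forces the generator's feature extraction to improve.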
step 2.5.2: adding a noise strategy;
Random information is introduced and added as noise redundancy to the decoder module in the generator network, specifically:

new_vec = Cov(Cov(Cov(concat(co_vec, z), z), z), z)

where x_img denotes the input image; co_vec denotes the content vector obtained by the content extraction process applied to x_img; concat(·,·) denotes the noise concatenation operation; Cov(·,·) denotes a convolution process; z denotes the noise information; and new_vec denotes the new content vector;
step 2.5.3: constructing a reconstruction strategy according to the characteristics of the source domain and the target domain, and completing the image conversion task using the weakened cyclic semantic consistency;
The cycle consistency reconstruction loss is specifically:

L_cyc = L_fw + L_bw

where L_cyc denotes the weakened cyclic semantic consistency; L_fw denotes the forward loss consistency; and L_bw denotes the backward loss consistency;
The forward loss consistency L_fw weighs the forward reconstruction error by the hyper-parameter λ1, where Conv(·) denotes a feature extraction process; λ1 denotes the hyper-parameter of the forward conversion; A_img, A′_img, B_img and C_img respectively denote images; Dec(·,·) denotes image decoding according to given image class information and image content information; and φ(·) denotes the image class extraction process, with a companion operator denoting the image content feature information extraction process;
The backward loss consistency L_bw weighs the backward reconstruction error by the hyper-parameter λ2, where B′_img and C′_img respectively denote images;
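The forward-plus-backward cycle reconstruction of step 2.5.3 can be illustrated with toy closed-form stand-ins for the learned networks. The class extractor `phi`, content extractor `content` and decoder `dec` below are hypothetical (the real ones are neural networks); only the loss structure, a λ1-weighted forward term plus a λ2-weighted backward term, follows the source.

```python
import numpy as np

rng = np.random.default_rng(2)

def phi(img):                 # toy image class feature extraction
    return img.mean()

def content(img):             # toy image content feature extraction
    return img - img.mean()

def dec(cls_feat, cont_feat): # toy decoding from class + content information
    return cont_feat + cls_feat

def l1(a, b):
    return np.abs(a - b).mean()

def cycle_loss(A, B, lam1=1.0, lam2=1.0):
    """Weakened cycle consistency L_cyc = L_fw + L_bw: convert, convert
    back, and penalize the reconstruction error in each direction."""
    A2B = dec(phi(B), content(A))      # A rendered with B's class features
    A_rec = dec(phi(A), content(A2B))  # converted back toward A
    B2A = dec(phi(A), content(B))
    B_rec = dec(phi(B), content(B2A))
    return lam1 * l1(A_rec, A) + lam2 * l1(B_rec, B)

A = rng.normal(size=(8, 8))
B = rng.normal(size=(8, 8))
loss = cycle_loss(A, B)
```

Because this toy extractor/decoder pair is exactly invertible, the loss is (numerically) zero, which is the fixed point the training objective pushes the learned networks toward.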
step 2.6: judging whether the maximum iteration times is reached, if not, returning to the step 2.2; otherwise, outputting the trained image conversion model;
step 3: inputting the image to be converted into the trained image conversion model to obtain a converted image with better visual effect.
Further, the method for performing enhancement processing on the image contrast information and the edge information in step 2.2 specifically includes:
step 2.2.1: enhancing the contrast information of the image with a histogram equalization algorithm, such that the processed image satisfies a uniform probability density distribution;
step 2.2.2: taking the gray level image enhanced by the histogram equalization algorithm as the input of the Canny algorithm and performing Gaussian smoothing; solving the gradient of the smoothed image, completing the acquisition of the edge information features by non-maximum suppression and double-threshold detection, and adding the edge information features to the image feature set;
The smoothed image is:

g(x, y) = G(x, y) * f(x, y), G(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))

where f(x, y) denotes the input gray image; * denotes the convolution process; G(x, y) denotes the Gaussian function; and σ denotes the standard deviation;
The gradient of the smoothed image is solved, specifically:

M(x, y) = √(g_x² + g_y²), α(x, y) = arctan(g_y / g_x)

where g_x denotes the gradient in the x direction; g_y denotes the gradient in the y direction; M(x, y) denotes the image gradient magnitude; and α(x, y) denotes the image direction.
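The preprocessing of step 2.2 can be sketched in plain NumPy: histogram equalization, Gaussian smoothing, then gradient magnitude and direction. One assumption is labeled explicitly: the source does not name the gradient operator, so Sobel kernels are used here as a common choice.

```python
import numpy as np

def equalize_hist(gray):
    """Histogram equalization (step 2.2.1): map intensities so the output
    approximates a uniform probability density distribution."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum().astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    return (cdf[gray] * 255).astype(np.uint8)

def gaussian_kernel(size=5, sigma=1.0):
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def conv2d(img, kernel):
    """Small same-size 2D convolution with edge padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img.astype(float), ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def smooth_and_gradients(gray, sigma=1.0):
    """Step 2.2.2 up to the gradient: g = G * f, then M(x,y) and alpha(x,y).
    Sobel operators are an assumption; the source does not name the operator."""
    g = conv2d(gray, gaussian_kernel(5, sigma))           # G(x, y) * f(x, y)
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # Sobel x
    gx, gy = conv2d(g, sx), conv2d(g, sx.T)
    M = np.hypot(gx, gy)            # M(x, y) = sqrt(gx^2 + gy^2)
    alpha = np.arctan2(gy, gx)      # alpha(x, y)
    return M, alpha

img = (np.arange(64).reshape(8, 8) * 3 % 256).astype(np.uint8)
eq = equalize_hist(img)
M, alpha = smooth_and_gradients(eq.astype(float))
```

The edge map produced from M and alpha by non-maximum suppression and double thresholding is then appended to the image feature set as an extra channel.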
Further, the method for performing feature processing by using the compression operation and the excitation operation of the channel attention mechanism in step 2.3 specifically includes:
The compression operation is specifically:

z_c = F_sq(u_c) = (1 / (H × W)) Σ_{i=1..H} Σ_{j=1..W} u_c(i, j)

where z_c denotes the statistic of the c-th channel; H and W are the spatial dimensions of the feature map; u_c denotes the feature map of the c-th channel; and F_sq denotes the compression operation;
The excitation operation applies a gating module and a Sigmoid activation function to learn nonlinear description information between channels, formally expressed as:

s = F_ex(z, W) = σ(W₂δ(W₁z))

where σ denotes the Sigmoid function σ(x) = 1 / (1 + e^{−x}), with e the natural constant; W₁ ∈ ℝ^{(C/r)×C} and W₂ ∈ ℝ^{C×(C/r)} are the weights of the gating module; C denotes the number of channels; r denotes the dimensionality reduction ratio; δ denotes the ReLU function; and F_ex denotes the excitation operation.
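The compression and excitation pair can be sketched as a forward pass in NumPy. The sizes (8 channels, 4x4 maps, reduction ratio r = 2) and random weights are hypothetical; the structure is the standard squeeze-and-excitation gating described above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_excite(u, w1, w2):
    """Channel attention forward pass: F_sq (per-channel global average
    pooling) followed by F_ex (ReLU then Sigmoid gating), then rescaling."""
    z = u.mean(axis=(1, 2))                    # z_c = (1/(H*W)) sum_ij u_c(i, j)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))  # s = sigma(W2 delta(W1 z))
    return u * s[:, None, None]                # reweight each channel by s_c

# Hypothetical sizes: C = 8 channels, 4x4 maps, reduction ratio r = 2.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
u = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C))   # W1 in R^{(C/r) x C}
w2 = rng.normal(size=(C, C // r))   # W2 in R^{C x (C/r)}
out = squeeze_excite(u, w1, w2)
```

Since each gate s_c lies in (0, 1), the operation can only attenuate channels, letting the network emphasize informative channels by suppressing the rest.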
Further, the method for performing the up-sampling operation on the feature-processed image feature set in step 2.4 by combining sub-pixel convolution specifically includes:

I_out = f^L(I_input) = PS(W_L * f^{L−1}(I_input) + b_L)

where I_out denotes the output value; I_input denotes the input value; f denotes the network; L denotes the L-th layer; b_L denotes the bias of the L-th layer; * denotes the convolution operation; W_L denotes the parameters of the L-th layer; and PS(·) denotes the pixel rearrangement function, specifically:

PS(T)_{x,y,c} = T_{⌊x/r⌋, ⌊y/r⌋, C·r·mod(y,r) + C·mod(x,r) + c}

where T denotes the input feature set; ⌊·⌋ denotes the rounding-down operation; x denotes the expanded feature length; r denotes the up-sampling multiple; y denotes the expanded feature width; C denotes the number of feature channels before expansion; mod(·,·) denotes the modulo operation; and c denotes the feature channel index after expansion.
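The PS pixel rearrangement can be written compactly with reshape and transpose. One assumption: this uses the channel-to-space ordering of standard sub-pixel convolution (as in ESPCN and `torch.nn.PixelShuffle`), which is one concrete realization of the mod-indexing above.

```python
import numpy as np

def pixel_shuffle(T, r):
    """Rearrange a (C*r^2, H, W) feature set into (C, H*r, W*r): each group
    of r^2 channels supplies the r x r sub-pixels of one output channel."""
    Cr2, H, W = T.shape
    C = Cr2 // (r * r)
    t = T.reshape(C, r, r, H, W)        # split channels into (C, r, r)
    t = t.transpose(0, 3, 1, 4, 2)      # -> (C, H, r, W, r)
    return t.reshape(C, H * r, W * r)   # interleave the r x r sub-pixels

T = np.arange(16).reshape(4, 2, 2)      # C = 1, r = 2, 2x2 spatial input
out = pixel_shuffle(T, 2)               # -> shape (1, 4, 4)
```

Because PS is a pure permutation of values, it upsamples with no interpolation artifacts; the preceding convolution learns where each sub-pixel's value comes from.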
The invention has the beneficial effects that:
By introducing prior knowledge from the histogram equalization algorithm and the Canny algorithm, the method enhances the contrast information and edge information of the image. On this basis, the method proposes a channel attention mechanism based on sub-pixel convolution to improve the network feature extraction capability and enhance the image characterization, solving the problem that fine image features are not prominent and benefiting the subsequent conversion task. The invention provides a conversion mechanism based on a feature domain optimization algorithm: it adaptively divides the feature domain and the content domain by introducing an adversarial idea; reduces the parameter space using the rich variety of images in the source domain; adds a noise strategy so that the network is not limited to generating a single sample, alleviating the mode collapse problem; constructs a reconstruction strategy from the characteristics of the source and target domains; and completes the small sample image conversion task using weakened cyclic semantic consistency, obtaining converted images with better visual effect. The invention achieves a better conversion effect on small sample image conversion, retains more image detail information, generates richer converted images, and has demonstrated effectiveness.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2(a) -2 (b) are graphs of comparative experiment results of 10000 rounds of underwater sonar small sample data set training, fig. 2(a) is a graph of original FUNIT network experiment results, and fig. 2(b) is a graph of original FUNIT network experiment results after contrast information and edge information enhancement processing.
Fig. 3(a) -3 (b) are graphs of results of 50000 rounds of comparative experiments in the Oxford optical small sample image data set training, fig. 3(a) is a graph of results of experiments of an original FUNIT network, and fig. 3(b) is a graph of results of experiments of the original FUNIT network after contrast information and edge information enhancement processing.
FIG. 4 is a schematic diagram of a network structure of a channel attention mechanism based on sub-pixel convolution according to the present invention.
Fig. 5(a)-5(b) are result graphs of a 10000-round comparative experiment on the underwater sonar small sample data set (underwater targets with complex shapes); Fig. 5(a) shows the original FUNIT network result (boxed: underwater targets with complex shapes), and Fig. 5(b) shows the original FUNIT network result after characterization enhancement.
Fig. 6(a)-6(b) are result graphs of a 57500-round comparative experiment on the Oxford optical small sample image data set (targets with complex shapes); Fig. 6(a) shows the original FUNIT network result (boxed: a target with complex shape), and Fig. 6(b) shows the result of the characterization-enhanced original FUNIT network.
Fig. 7 is a schematic diagram of a network structure for adaptively dividing a feature domain and a content domain by introducing a countermeasure idea.
Fig. 8 is a schematic diagram of a network structure for noise policy addition.
Fig. 9 is a schematic network structure of an image reconstruction strategy.
Fig. 10(a)-10(b) are result graphs of a 10000-round experiment introducing the adversarial idea on the underwater sonar small sample data set; Fig. 10(a) shows the original FUNIT network result (boxed: underwater targets with a single shape), and Fig. 10(b) shows the original FUNIT network result after introducing the adversarial idea.
Fig. 11(a) -11 (b) are graphs of comparative experiment results of 10000 times of training for constructing a reconstruction strategy from an underwater sonar small sample data set, fig. 11(a) is a graph of experiment results of an original FUNIT network (an underwater target with large deformation is selected), and fig. 11(b) is a graph of experiment results of the original FUNIT network after the reconstruction strategy is constructed.
Fig. 12(a)-12(b) are result graphs of a 50000-round comparative experiment introducing adversarial training on the Oxford flower optical small sample image data set; Fig. 12(a) shows the original FUNIT network result (boxed: targets with rich detail information), and Fig. 12(b) shows the original FUNIT network result after introducing the adversarial idea.
Fig. 13(a)-13(b) are result graphs of a 50000-round comparative experiment constructing the reconstruction strategy on the Oxford optical small sample image data set; Fig. 13(a) shows the original FUNIT network result (boxed: an optical target converted from a single plant to multiple plants, i.e., a target with large deformation), and Fig. 13(b) shows the original FUNIT network result after constructing the reconstruction strategy.
Fig. 14 is a schematic network structure diagram of a feature domain optimization small sample image transformation method based on characterization enhancement according to the present invention.
Fig. 15(a)-15(b) are result graphs of a 10000-round comparative experiment of the proposed feature domain optimization small sample image conversion method based on characterization enhancement (underwater sonar small sample data set); Fig. 15(a) shows the original FUNIT network result (boxed: underwater targets with few texture features), and Fig. 15(b) shows the result of the proposed method.
Fig. 16(a)-16(b) are result graphs of a 50000-round comparative experiment of the proposed feature domain optimization small sample image conversion method based on characterization enhancement (Oxford flower optical small sample image data set); Fig. 16(a) shows the original FUNIT network result (boxed: a target with few texture features), and Fig. 16(b) shows the result of the proposed method.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention provides a feature domain optimization small sample image conversion method based on characterization enhancement. The method comprises: (1) enhancing the image contrast information and edge information according to the characteristics of the small sample images in the data set; (2) proposing a channel attention mechanism based on sub-pixel convolution to improve the network feature extraction capability for characterization enhancement; (3) proposing a conversion mechanism based on a feature domain optimization algorithm to complete small sample image conversion with rich details. The aim is a better small sample image conversion effect. The method addresses the blurred contrast and edge information of small sample images caused by the acquisition means by enhancing the image contrast information and edge information. On this basis, a channel attention mechanism based on sub-pixel convolution improves the network feature extraction capability and enhances the image characterization, solving the problem that fine image features are not prominent. To address the insufficient optimization of the parameter space and the generation of single samples caused by the limited number of samples, the conversion mechanism based on a feature domain optimization algorithm adaptively divides the feature domain and the content domain by introducing an adversarial idea, reduces the parameter space using the rich variety of images in the source domain, reduces single-sample generation by adding a noise strategy, alleviates the mode collapse problem, constructs a reconstruction strategy from the characteristics of the source and target domains, and completes the small sample image conversion task using weakened cyclic semantic consistency, obtaining converted images with better visual effect. The method achieves a good conversion effect on small sample image conversion, retains more image detail information, generates richer converted images, and has demonstrated effectiveness.
A feature domain optimization small sample image conversion method based on characterization enhancement comprises the following steps:
step 1: acquiring a small sample image dataset;
step 2: training an image conversion model;
step 2.1: initializing parameters and setting the maximum iteration times;
step 2.2: enhancing the image contrast information and the edge information;
step 2.2.1: enhancing the contrast information of the image with a histogram equalization algorithm, such that the processed image satisfies a uniform probability density distribution;
step 2.2.2: taking the gray level image enhanced by the histogram equalization algorithm as the input of the Canny algorithm and performing Gaussian smoothing; solving the gradient of the smoothed image, completing the acquisition of the edge information features by non-maximum suppression and double-threshold detection, and adding the edge information features to the image feature set;
The smoothed image is:

g(x, y) = G(x, y) * f(x, y), G(x, y) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))

where f(x, y) denotes the input gray image; * denotes the convolution process; G(x, y) denotes the Gaussian function; and σ denotes the standard deviation;
The gradient of the smoothed image is solved, specifically:

M(x, y) = √(g_x² + g_y²), α(x, y) = arctan(g_y / g_x)

where g_x denotes the gradient in the x direction; g_y denotes the gradient in the y direction; M(x, y) denotes the image gradient magnitude; and α(x, y) denotes the image direction;
step 2.3: performing characteristic processing by utilizing a compression operation and an excitation operation of a channel attention mechanism;
the compression operation is specifically:
wherein z iscA statistic representing a channel length; h and W are characteristic diagramsThe spatial dimension of (a); u. ofcRepresenting a feature map; fsqRepresenting a compression operation;
the excitation operation applies a gating module and a Sigmoid activation function to learn nonlinear description information between channels, and the nonlinear description information is formally expressed as:
s=Fex(z,W)=σ(W2δ(W1z))
wherein σ denotes the Sigmoid function, σ(x) = 1 / (1 + e^{−x}), x representing a variable and e the natural constant; C represents the number of channels; r represents the dimensionality-reduction ratio; δ denotes the ReLU function, δ(x) = max(0, x); F_ex represents the excitation operation;
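A minimal NumPy sketch of the compression and excitation pair, assuming a (C, H, W) feature layout and externally supplied gating weights W1 (shape C/r × C) and W2 (shape C × C/r), as in the standard squeeze-and-excitation design; the names are illustrative:

```python
import numpy as np

def squeeze(u):
    """Compression F_sq: global average pooling over the H x W spatial
    dimensions, giving one statistic z_c per channel. u has shape (C, H, W)."""
    return u.mean(axis=(1, 2))

def excite(z, w1, w2):
    """Excitation F_ex: gating module s = sigmoid(W2 . relu(W1 . z))."""
    relu = lambda x: np.maximum(x, 0.0)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    return sigmoid(w2 @ relu(w1 @ z))

def channel_attention(u, w1, w2):
    """Rescale each channel of u by its learned attention weight s_c."""
    s = excite(squeeze(u), w1, w2)
    return u * s[:, None, None]
```

The Sigmoid keeps each channel weight in (0, 1), so the mechanism re-weights rather than replaces the channel responses.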
step 2.4: performing up-sampling operation on the image feature set after feature processing by combining sub-pixel convolution;
the up-sampling operation is formally expressed as:

I_out = PS(W_L * f_{L−1}(I_input) + b_L)

wherein I_out represents the output value; I_input represents the input value; f represents the network; L represents the L-th layer; b_L represents the bias of the L-th layer; * represents the convolution operation; W_L represents the parameters of the L-th layer; PS(·) represents the pixel adjustment function, specifically:
wherein T represents the input feature set; ⌊·⌋ represents a rounding-down operation; x represents the expanded feature length; r represents the up-sampling multiple; y represents the expanded feature width; c represents the number of feature channels before expansion; mod(·, ·) represents a modulo operation; C represents the number of feature channels after expansion;
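The PS(·) rearrangement can be sketched in NumPy as follows, assuming a (C·r², H, W) input laid out per the sub-pixel convolution (ESPCN-style) formulation; a production network would use an optimized built-in rather than explicit loops:

```python
import numpy as np

def pixel_shuffle(t, r):
    """PS(.): rearrange a (C*r*r, H, W) feature set into (C, H*r, W*r),
    trading channel depth for an r-times larger spatial grid."""
    c_in, h, w = t.shape
    assert c_in % (r * r) == 0, "channel count must be divisible by r^2"
    c_out = c_in // (r * r)
    out = np.zeros((c_out, h * r, w * r), dtype=t.dtype)
    for x in range(h * r):
        for y in range(w * r):
            for c in range(c_out):
                # index into T with floor(x/r), floor(y/r) and a modulo
                # arrangement of the sub-pixel channels
                ch = c_out * r * (y % r) + c_out * (x % r) + c
                out[c, x, y] = t[ch, x // r, y // r]
    return out
```

The operation is a pure permutation: every input value appears exactly once in the output, which is why it pairs well with a preceding convolution that produces the r² sub-pixel channels.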
step 2.5: completing small sample image conversion with rich details by a conversion mechanism based on a feature domain optimization algorithm;
step 2.5.1: introducing a countermeasure thought, adaptively dividing a characteristic domain and a content domain, and reducing a parameter space by utilizing various rich images in a source domain;
the feature information is accessed into a new discriminator to construct a new discriminating process, and the concept of unchanged contents of two times of conversion is combined in the constructing discriminating process, namely the image feature information of the converted image is similar to the image feature information extracted by a given sample after the image feature information is re-extracted; in order to avoid the phenomenon that the loss is too large due to simple application of image difference and further the loss function is unbalanced, the process is completed by utilizing the countermeasure idea, the image characteristic information extracted twice is fitted to the vicinity of the same characteristic distribution, and the parameters of a generator in the network are forced to be further optimized through the loss function to complete a better characteristic extraction process, so that the division of the characteristic domain is more accurate, and the purpose of optimizing the division of the characteristic domain and the content domain is achieved;
by adding a new discriminator of category features, adopting the features of the reconstructed image as a pseudo label and the features of the real image as a real label, the correlation judgment is completed, and the added loss function is as follows:
wherein x denotes the input image; l denotes the feature class image; D denotes the discriminator; G denotes the generator; E denotes the expectation;
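The exact added loss is given as a formula in the original; as a hedged sketch only, a standard binary cross-entropy adversarial objective with real-image features labeled real and the re-extracted reconstructed-image features used as the pseudo (fake) label could look like this. The BCE form and all names are assumptions, not the invention's verbatim loss:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy over discriminator scores in (0, 1)."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def feature_discriminator_loss(d_real_feat, d_fake_feat):
    """Score real-image features against the real label (1) and re-extracted
    reconstructed-image features against the pseudo label (0)."""
    return bce(d_real_feat, np.ones_like(d_real_feat)) + \
           bce(d_fake_feat, np.zeros_like(d_fake_feat))
```

Fitting the twice-extracted features to the same distribution via this adversarial term avoids the loss imbalance that a raw image-difference penalty would cause.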
step 2.5.2: adding a noise strategy;
random information is introduced and retained by adding noise-information redundancy to the decoder module in the generator network, specifically:
new_vec = Cov(Cov(Cov(concat(co_vec, z), z), z), z)
wherein x_img represents the input image; co_vec represents the content vector; concat(·, ·) represents the noise addition (concatenation) operation; Cov(·, ·) represents a convolution process; the content vector is obtained from x_img by a content-vector extraction process; z represents the noise information; new_vec represents the new content vector;
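A toy NumPy reading of new_vec = Cov(Cov(Cov(concat(co_vec, z), z), z), z), interpreting each Cov(·, z) as "concatenate the noise map again, then apply a 1×1 convolution"; this interpretation and all names are assumptions for illustration:

```python
import numpy as np

def conv1x1(x, w):
    """Toy 1x1 convolution on a (C, H, W) tensor; w has shape (C_out, C_in)."""
    return np.tensordot(w, x, axes=([1], [0]))

def cov_with_noise(x, z, w):
    """Cov(x, z): re-inject the noise map by concatenation, then convolve."""
    return conv1x1(np.concatenate([x, z], axis=0), w)

def add_noise_strategy(co_vec, weights, rng):
    """new_vec = Cov(Cov(Cov(concat(co_vec, z), z), z), z): the repeated
    re-injection makes the noise redundant, so the decoder cannot simply
    learn to discard it under the loss-gradient guidance."""
    c, h, w = co_vec.shape
    z = rng.standard_normal((1, h, w))          # noise information z
    x = np.concatenate([co_vec, z], axis=0)     # concat(co_vec, z)
    for wk in weights:                          # three Cov(., z) stages
        x = cov_with_noise(x, z, wk)
    return x
```

Different noise draws yield different content vectors for the same input, which is the mechanism that keeps the network from collapsing to a single sample.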
step 2.5.3: according to the characteristics of the source domain and the target domain, a reconstruction strategy is constructed, and an image conversion task is completed by utilizing the weakened cyclic semantic consistency;
the cycle consistency reconstruction loss is specifically:
wherein the content of the first and second substances,representing a weakened cyclic semantic consistency;indicating forward loss consistency;indicating backward loss consistency;
the forward loss consistency is specifically:
wherein Conv (·) represents a feature extraction process; lambda [ alpha ]1Hyper-parameters representing forward conversion AndAimg,A′img,Bimgand CimgRespectively representing images; dec (·,) denotes image decoding in accordance with given image category information and image content information; phi (-) represents the image class extraction process;representing an image content feature information extraction process;
the back loss consistency is:
wherein λ is2Hyper-parameters representing a backward transformationAndB′imgand C'imgRespectively representing images;
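The idea of weakening pixel-level error into feature-level error can be sketched as follows, assuming Conv(·) is a fixed convolution and the comparison is an L1 distance in feature space; the kernel choice and the L1 form are illustrative assumptions:

```python
import numpy as np

def conv_features(img, kernel):
    """Conv(.): a fixed valid-mode convolution mapping pixels to a feature
    map, so comparisons happen at the feature level, not the pixel level."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def weakened_cycle_loss(original, reconstructed, kernel, lam=1.0):
    """Compare images in feature space: small geometric changes are
    tolerated while the overall semantics are still constrained."""
    fa = conv_features(original, kernel)
    fb = conv_features(reconstructed, kernel)
    return lam * np.mean(np.abs(fa - fb))
```

Because the convolution pools over a neighborhood, a slight local deformation perturbs the features far less than it perturbs raw pixels, which is exactly the relaxation the reconstruction strategy needs.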
step 2.6: judging whether the maximum iteration times is reached, if not, returning to the step 2.2; otherwise, outputting the trained image conversion model;
step 3: inputting the image to be converted into the trained image conversion model to obtain a converted image with a better visual effect.
Example 1:
the invention aims to provide a feature domain optimization small sample image conversion method which can retain more image detail information and generate richer images based on characterization enhancement.
The invention comprises the following steps in the realization:
(1) enhancing the image contrast information and the edge information: firstly, processing image contrast information by adopting a histogram equalization algorithm; acquiring image edge information by adopting a Canny algorithm;
(2) the method provides a channel attention mechanism based on sub-pixel convolution to improve the network feature extraction capability for characterization enhancement: carrying out feature processing by utilizing compression operation and excitation operation of a channel attention mechanism; combining sub-pixel convolution to carry out up-sampling operation with the multiplying power of 2 on the processed image feature set;
(3) and (3) providing a conversion mechanism based on a feature domain optimization algorithm to complete the conversion of the small sample image with rich details: firstly, a countermeasure thought is introduced to adaptively divide a characteristic domain and a content domain, and a parameter space is reduced by utilizing various rich images in a source domain; secondly, due to the addition of a noise strategy, the network is not limited to the generation of a single sample, and the problem of mode collapse is alleviated; and thirdly, according to the characteristics of the source domain and the target domain, a reconstruction strategy is constructed, an image conversion task is completed by utilizing weakened circulation semantic consistency, and a small sample conversion image with better visual effect is obtained.
The present invention may further comprise:
1. The probability density function in step (1) is p_s(s) = p_r(r) · |dr/ds|, wherein r represents the gray-scale image to be converted, s represents the output gray-scale image, and p_r(r) represents the probability density function of the random variable r. The conversion function is s = T(r) = (L − 1) ∫_0^r p_r(w) dw, wherein w is a dummy integration variable, p_r(w) represents the probability density function of the random variable w, and L represents the maximum gray level.
2. In the step (1), the gray-scale image enhanced by the histogram equalization algorithm is taken as the input of the Canny algorithm; Gaussian smoothing, gradient solving, non-maximum suppression and double-threshold detection are performed to complete the acquisition of the edge information features, which are then added to the image feature set.
3. In the step (2), statistical information in the channel direction is generated through a global pooling operation; after the global spatial information is compressed, the network is provided with a compression operation with an overall view angle, alleviating the limitation of the local view angle. On this basis, a gating module and an activation function are introduced to perform the excitation operation, capturing more sufficient channel-related dependency information and learning the nonlinear description information among channels.
4. In the step (2), sub-pixel convolution is introduced to perform an up-sampling operation with a magnification of 2 on the image feature set, specifically: I_out = PS(W_L * f_{L−1}(I_input) + b_L), wherein I_out represents the output value, I_input represents the input value, f represents the network, L represents the L-th layer, b_L represents the bias of the L-th layer, * represents the convolution operation, W_L represents the parameters of the L-th layer, and PS(·) represents the pixel adjustment function.
5. The step (2) introduces a pixel adjustment function, wherein T represents the input feature set, ⌊·⌋ represents a rounding-down operation, x represents the expanded feature length, r represents the up-sampling multiple, y represents the expanded feature width, c represents the number of feature channels before expansion, mod(·, ·) represents a modulo operation, and C represents the number of feature channels after expansion.
6. In the step (3), a new discriminator of category features is added; the features of the reconstructed image are adopted as the pseudo label and the features of the real image as the real label, the correlation judgment is completed, and a loss function is added, wherein x denotes the input image, l denotes the feature class image, D denotes the discriminator, G denotes the generator, and E denotes the expectation.
7. In the step (3), random information is retained by introducing noise-information redundancy into the decoder module of the generator network, specifically: new_vec = Cov(Cov(Cov(concat(co_vec, z), z), z), z), wherein x_img represents the input image, co_vec represents the content vector, concat(·, ·) represents the noise addition (concatenation) operation, Cov(·, ·) represents the convolution process, the content vector is extracted from x_img, z represents the noise information, and new_vec represents the new content vector.
8. In the step (3), the pixel-level error of the image is weakened into a feature-level error by adding convolution layers: the weakened cyclic semantic consistency is composed of a forward loss-consistency term and a backward loss-consistency term. A conversion mechanism based on the feature domain optimization algorithm is realized through the weakened cyclic semantic consistency, and small sample image conversion with rich details is obtained.
9. In the forward loss consistency of the step (3), Conv(·) denotes the feature extraction process, λ_1 represents the hyper-parameter of the forward conversion, A_img, A′_img, B_img and C_img respectively represent images, Dec(·, ·) represents image decoding according to given image category information and image content information, Φ(·) represents the image category extraction process, and a content extraction operator represents the image content feature information extraction process.
10. In the backward loss consistency of the step (3), λ_2 represents the hyper-parameter of the backward conversion, and B′_img and C′_img respectively represent images.
Compared with the prior art, the invention has the following advantages: a. traditional image conversion methods use large amounts of source domain and target domain data to extract features and complete the image conversion task, but when the number of samples is limited, i.e. on a small sample data set, the image feature extraction process cannot be fully trained, which harms the effectiveness of feature extraction; to ensure that small sample image conversion achieves a better conversion effect and retains more image detail information, the invention provides a feature domain optimization small sample image conversion method based on characterization enhancement; b. the invention enhances the contrast information and the edge information of the small sample image, solving the problem that they are blurred by the acquisition means; c. on this basis, the invention provides a channel attention mechanism based on sub-pixel convolution to improve the network feature extraction capability, enhance the image characterization and solve the problem that fine features of the image are not prominent, which facilitates the subsequent conversion task; d. the invention provides a conversion mechanism based on a feature domain optimization algorithm: the feature domain and the content domain are divided adaptively by introducing a countermeasure idea; the parameter space is reduced by utilizing the various rich images in the source domain; a noise strategy is added so that the network is not limited to the generation of a single sample, alleviating the mode collapse problem; and a reconstruction strategy is constructed according to the characteristics of the source domain and the target domain, so that the small sample image conversion task is completed with weakened cyclic semantic consistency and a converted image with a better visual effect is obtained.
The feature domain optimization small sample image conversion method based on the characterization enhancement has a good conversion effect on small sample image conversion, can retain more image detail information, generates richer conversion images, and has certain effectiveness.
Example 2:
with reference to fig. 1, the specific steps of the present invention are as follows:
(1) image contrast information and edge information are enhanced
The image contrast information enhancement processing adopts a histogram equalization algorithm, whose probability density function is:

p_s(s) = p_r(r) · |dr/ds|

where r represents the gray-scale image to be converted, s represents the output gray-scale image, and p_r(r) represents the probability density function of the random variable r.
The transfer function is:

s = T(r) = (L − 1) ∫_0^r p_r(w) dw

wherein w is a dummy integration variable, p_r(w) represents the probability density function of the random variable w, and L represents the maximum gray level.
Combining the two formulas gives:

p_s(s) = 1 / (L − 1),  0 ≤ s ≤ L − 1

i.e. the processed image follows a uniform probability density distribution.
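A discrete NumPy sketch of the equalization mapping s = T(r) = (L − 1) · Σ p_r(w): the cumulative sum of the empirical gray-level probabilities plays the role of the integral (the rounding choice is an implementation detail, not specified by the invention):

```python
import numpy as np

def histogram_equalize(img, L=256):
    """Discrete histogram equalization: s = T(r) = (L-1) * cumsum(p_r),
    driving the output gray levels toward a uniform distribution."""
    hist = np.bincount(img.ravel(), minlength=L)
    p_r = hist / img.size                    # empirical p_r(r)
    cdf = np.cumsum(p_r)                     # discrete integral of p_r(w) dw
    s = np.round((L - 1) * cdf).astype(img.dtype)
    return s[img]                            # map each pixel r -> s = T(r)
```

Because T is the cumulative distribution, the mapping is monotone, so the gray-level ordering of the image is preserved while the contrast is stretched.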
The image edge information enhancement processing adopts the Canny algorithm. The gray-scale image enhanced by the histogram equalization algorithm is taken as the input of the Canny algorithm and Gaussian smoothing is performed; the smoothed image is:

f_s(x, y) = G(x, y, σ) * f(x, y),  G(x, y, σ) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))

wherein f(x, y) represents the input gray image, * represents the convolution process, G(x, y, σ) is the Gaussian function, and σ represents the standard deviation.
Gradient solving is performed on the smoothed image, specifically:

M(x, y) = √(g_x² + g_y²),  α(x, y) = arctan(g_y / g_x)

wherein g_x denotes the gradient in the x direction, g_y denotes the gradient in the y direction, M(x, y) denotes the image gradient magnitude, and α(x, y) denotes the gradient direction.
And (4) finishing the acquisition of the edge information features by using non-maximum inhibition and double-threshold detection, and adding the edge information features into the image feature set.
Because evaluation indexes for image conversion are still imperfect, the method follows common practice in the research field and compares algorithms by image visual effect. In the field of small sample image conversion, the FUNIT network currently achieves relatively the best conversion effect, so the invention takes the FUNIT network as the original network for comparative experimental verification.
In order to verify the effectiveness of enhancement processing of contrast information and edge information of a small sample image dataset according to the present invention, fig. 2 is a graph of comparative experiment results of 10000 training rounds of underwater sonar small sample dataset, where fig. 2(a) is a graph of original FUNIT network experiment results, and fig. 2(b) is a graph of original FUNIT network experiment results after enhancement processing of contrast information and edge information. By comparing the generation results of the representative images outlined in fig. 2(a) and fig. 2(b), the images after the contrast information and the edge information enhancement processing are more complete and clear due to the further guidance of the contrast information and the edge information on the network gradient.
Fig. 3 is a 50000 round comparison experiment result diagram of Oxford optical small sample image data set training, wherein fig. 3(a) is an experiment result diagram of an original FUNIT network, and fig. 3(b) is an experiment result diagram of an original FUNIT network after contrast information and edge information enhancement processing. Although the generated image has randomness, by comparing the generated results of the representative images selected in the boxes in fig. 3(a) and fig. 3(b), the detail information such as the vein of the petals in the original FUNIT network after the contrast information and the edge information enhancement processing is clearer than the vein of the petals in the original FUNIT network.
(2) Channel attention mechanism based on sub-pixel convolution is provided to improve network feature extraction capability for characterization enhancement
Compression and excitation of channel attention mechanism
The compression operation provides an overall view angle for the network after compressing the global spatial information: statistical information in the channel direction is generated through a global pooling operation, which alleviates the limitation that the feature map can only use a local view angle and cannot exploit network information beyond it, making the feature extraction of the network more sufficient. The compression operation is specifically:

z_c = F_sq(u_c) = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} u_c(i, j)

wherein z_c represents the statistic of channel c, H and W are the spatial dimensions of the feature map, u_c represents the feature map of channel c, and F_sq represents the compression operation.
On this basis, in order to utilize the information extracted by the compression operation more fully, the network performs an excitation operation, which captures channel-related dependency information more sufficiently. The excitation operation applies a gating module and a Sigmoid activation function to learn the nonlinear description information between channels, formally expressed as:

s=Fex(z,W)=σ(W2δ(W1z))

wherein σ denotes the Sigmoid function, C denotes the number of channels, r denotes the dimensionality-reduction ratio, δ denotes the ReLU function, and F_ex denotes the excitation operation.
The Sigmoid function is formalized as:

σ(x) = 1 / (1 + e^{−x})

wherein x represents a variable and e the natural constant.
The ReLU function is formalized as:

δ(x) = max(0, x)
Introducing sub-pixel convolution to carry out up-sampling operation on image feature set
The invention introduces sub-pixel convolution to perform an up-sampling operation with a magnification of 2 on the image feature set, specifically:

I_out = PS(W_L * f_{L−1}(I_input) + b_L)

wherein I_out represents the output value, I_input represents the input value, f represents the network, L represents the L-th layer, b_L represents the bias of the L-th layer, * represents the convolution operation, W_L represents the parameters of the L-th layer, and PS(·) represents the pixel adjustment function.
A pixel adjustment function is introduced, wherein T represents the input feature set, ⌊·⌋ represents a rounding-down operation, x represents the expanded feature length, r represents the up-sampling multiple, y represents the expanded feature width, c represents the number of feature channels before expansion, mod(·, ·) represents a modulo operation, and C represents the number of feature channels after expansion.
Fig. 4 is a schematic diagram of the network structure of the channel attention mechanism based on sub-pixel convolution. Because the channel attention mechanism and the sub-pixel convolution cooperate, the scale of the image changes continuously, which brings a certain sparse-representation advantage to the network; the sparse representation adds a regularizing effect and prevents the network from memorizing the small sample images and thus overfitting. Therefore, by utilizing the attention to image feature information brought by the channel attention mechanism, together with the image feature synthesis and partial sparse representation brought by sub-pixel convolution, the network feature extraction capability can be improved, the image characterization can be enhanced, and the problem that fine features of the image are not prominent can be solved, which facilitates the subsequent conversion task.
On the basis of fig. 2 and fig. 3, to further verify the effectiveness of the sub-pixel-convolution-based channel attention mechanism for improving network feature extraction capability for characterization enhancement, fig. 5 shows the results of 10000 training rounds on the underwater sonar small sample data set (underwater targets with complex shapes), where fig. 5(a) is the result of the original FUNIT network (underwater targets with complex shapes boxed) and fig. 5(b) is the result of the original FUNIT network after characterization enhancement. Comparing the boxed representative underwater targets with complex shapes in fig. 5(a) and fig. 5(b), the outline of the image after characterization enhancement is more complete and the target information is richer; note, however, that characterization enhancement based on the sub-pixel-convolution channel attention mechanism requires sufficient training, otherwise the quality of the generated image is adversely affected. Fig. 6 shows the results of 57500 training rounds on the Oxford optical small sample image data set (targets with complex shapes), where fig. 6(a) is the result of the original FUNIT network (targets with complex shapes boxed) and fig. 6(b) is the result of the original FUNIT network after characterization enhancement. Although the generated images have randomness, comparing the boxed representative targets with complex shapes in fig. 6(a) and fig. 6(b) shows that the shadow information on the petals in the characterization-enhanced images is clearer, the images are more complete and rich, and the subtle features are more obvious; the need for sufficient training must again be noted, otherwise the image quality still suffers.
(3) The method provides a conversion mechanism based on a feature domain optimization algorithm to complete small sample image conversion with rich details
The invention provides a conversion mechanism based on a feature domain optimization algorithm: the feature domain and the content domain are divided adaptively by introducing a countermeasure idea; the parameter space is reduced by utilizing the various rich images in the source domain; a noise strategy is added so that the network is not limited to the generation of a single sample, alleviating the mode collapse problem; and a reconstruction strategy is constructed according to the characteristics of the source domain and the target domain, so that the small sample image conversion task is completed with weakened cyclic semantic consistency.
Self-adaptive division of feature domain and content domain by introducing countermeasure thought
The feature information is fed into a new discriminator to construct a new discrimination process. The discrimination combines the idea that the content of the two conversions is unchanged, namely that the image feature information re-extracted from the converted image should be similar to that extracted from the given sample; to avoid simple image differences producing excessively large losses that unbalance the loss function, a countermeasure idea is used to complete this process. Fig. 7 is a schematic diagram of the network structure for adaptively dividing the feature domain and the content domain by introducing the countermeasure idea. By dividing the feature domain and the content domain, the network can use sufficient source domain images to alleviate the limited number of target domain samples, i.e. the parameter space is reduced by utilizing the various rich images in the source domain. A new discriminator of category features is added; the features of the reconstructed image are adopted as the pseudo label and the features of the real image as the real label, completing the correlation judgment, and a loss function is added, wherein x denotes the input image, l denotes the feature class image, D denotes the discriminator, G denotes the generator, and E denotes the expectation.
Addition of noise strategy
The noise strategy enriches the content information of the image with the random information contained in the noise, so that the image conversion results are more diversified. However, if the noise information is simply added to the network input layer as extra information, then under the guidance of the loss-function gradient the network tends to ignore it after training iterations, and the converted image cannot benefit from the random information. The invention instead retains the random information by introducing noise-information redundancy into the decoder module of the generator network, so that the network is not limited to the generation of a single sample and the mode collapse problem is alleviated; fig. 8 is a schematic diagram of the network structure with the noise strategy added.
The formalized description of the noise addition strategy is:
new_vec = Cov(Cov(Cov(concat(co_vec, z), z), z), z)

wherein x_img represents the input image, co_vec represents the content vector, concat(·, ·) represents the noise addition (concatenation) operation, Cov(·, ·) represents the convolution process, the content vector is extracted from x_img, z represents the noise information, and new_vec represents the new content vector.
Thirdly, constructing a reconstruction strategy to weaken cyclic semantic consistency operation
The traditional cycle consistency loss function requires that the geometric semantics of the original image remain unchanged during conversion; in fact, requiring the visual effect of the original image to stay unchanged necessarily limits the applicable range of image conversion, so conversion cannot be completed effectively between image domains whose geometric semantics differ greatly. Therefore, the invention weakens the pixel-level error of the image into a feature-level error by adding convolution layers, balancing the relaxation of geometric constraints with the adaptation of inter-domain content, and at the same time constructs an image reconstruction strategy using the reconstruction loss common in codecs. Fig. 9 is a network structure diagram of the image reconstruction strategy. The new cyclic consistency reconstruction loss is the sum of a forward loss-consistency term and a backward loss-consistency term, which together form the weakened cyclic semantic consistency.
And (3) according to the characteristics of the source domain and the target domain, constructing a reconstruction strategy, and completing an image conversion task by using the weakened cyclic semantic consistency to obtain a small sample conversion image with better visual effect.
The forward loss consistency involves the following quantities: Conv(·) represents the feature extraction process, λ_1 represents the hyper-parameter of the forward conversion, A_img, A′_img, B_img and C_img respectively represent images, Dec(·, ·) represents image decoding according to given image category information and image content information, Φ(·) represents the image category extraction process, and a content extraction operator represents the image content feature information extraction process.
In the backward loss consistency, λ_2 represents the hyper-parameter of the backward conversion, and B′_img and C′_img respectively represent images.
In order to verify the effectiveness of the conversion mechanism based on the feature domain optimization algorithm, the adaptive division of the feature domain and the content domain by introducing the countermeasure idea and the weakening of cyclic semantic consistency by constructing the reconstruction strategy are verified experimentally. Fig. 10 shows the results of 10000 training rounds with the countermeasure idea introduced on the underwater sonar small sample data set, where fig. 10(a) is the result of the original FUNIT network (an underwater target with a single shape boxed) and fig. 10(b) is the result of the original FUNIT network after the countermeasure idea is introduced. Comparing the boxed representative underwater targets with single shapes in fig. 10(a) and fig. 10(b), it can be seen that by introducing the countermeasure idea, i.e. adding the new category discriminator, the category information is easier to extract and is further expressed in the converted image, so the color distribution of the converted image is closer to the target domain image while part of the content domain information is well retained. Fig. 11 shows the results of 10000 training rounds with the reconstruction strategy constructed on the underwater sonar small sample data set, where fig. 11(a) is the result of the original FUNIT network (an underwater target with large deformation boxed) and fig. 11(b) is the result of the original FUNIT network after the reconstruction strategy is constructed. Comparing the boxed representative underwater targets with large deformation in fig. 11(a) and fig. 11(b), it can be seen that the reconstruction strategy helps with images with large deformation, whereas the original network tends merely to copy the target domain images when facing target objects with large differences.
Fig. 12 is a graph of the results of 50000 rounds of comparative experiments of Oxford optical small sample image dataset introduction countermeasure training, in which fig. 12(a) is a graph of the results of experiments of an original FUNIT network (for selecting targets with rich detailed information), and fig. 12(b) is a graph of the results of experiments of the original FUNIT network after introduction of countermeasure. As can be seen from comparison of the target image generation results with rich representative detailed information, which are boxed in fig. 12(a) and fig. 12(b), the content domain is also completely preserved due to the effect of the pseudo tag, and the content of the converted image is more inclined to the target domain image. Fig. 13 is a graph of results of 50000 rounds of comparative experiments in Oxford flower optical small sample image dataset reconstruction strategy construction, wherein fig. 13(a) is a graph of results of experiments in an original FUNIT network (the object with larger deformation, namely the optical object of single-plant to multi-plant conversion, is boxed), and fig. 13(b) is a graph of results of experiments in an original FUNIT network after a reconstruction strategy is constructed. Comparing the target image generation results of the single plant to the multiple plants, which are representative targets with large deformation, selected in the frames of fig. 13(a) and fig. 13(b), it can be seen that the original FUNIT network after the reconstruction strategy is constructed makes the conversion from single flowers to multiple flowers more natural.
Fig. 14 is a schematic diagram of the network structure of the feature domain optimization small sample image conversion method based on characterization enhancement provided by the present invention. Fig. 15 shows the comparative results of 10000 rounds of training on the underwater sonar small sample data set, where fig. 15(a) shows the results of the original FUNIT network (an underwater target with few texture features is boxed) and fig. 15(b) shows the results of the method provided by the present invention. Comparing the generation results for the boxed representative underwater target with few texture features in fig. 15(a) and fig. 15(b) shows that the method provided by the present invention achieves a better conversion effect on small sample image conversion, retains more image detail information, and generates richer converted images.
Fig. 16 shows the comparative results of 50000 rounds of training on the Oxford flower optical small sample image dataset, where fig. 16(a) shows the results of the original FUNIT network (a target with few texture features is boxed) and fig. 16(b) shows the results of the feature domain optimization small sample image conversion method based on characterization enhancement provided by the present invention. Comparing the generation results for the boxed representative target with few texture features in fig. 16(a) and fig. 16(b), the method provided by the present invention automatically adds corresponding veins to the petals, increasing the image richness and generating richer converted images.
The feature domain optimization small sample image conversion method based on characterization enhancement provided by the present invention was analyzed experimentally on the underwater sonar small sample image data set and the Oxford flower optical small sample image dataset, verifying that the proposed method achieves a good conversion effect on small sample image conversion, retains more image detail information, generates richer converted images, and is therefore effective.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.
Claims (4)
1. A feature domain optimization small sample image conversion method based on characterization enhancement, characterized by comprising the following steps:
step 1: acquiring a small sample image dataset;
step 2: training an image conversion model;
step 2.1: initializing parameters and setting the maximum number of iterations;
step 2.2: enhancing the image contrast information and the edge information;
step 2.3: performing characteristic processing by utilizing a compression operation and an excitation operation of a channel attention mechanism;
step 2.4: performing up-sampling operation on the image feature set after feature processing by combining sub-pixel convolution;
step 2.5: completing small sample image conversion with rich details by a conversion mechanism based on a feature domain optimization algorithm;
step 2.5.1: introducing the adversarial idea to adaptively divide the feature domain and the content domain, and reducing the parameter space by utilizing the rich variety of images in the source domain;
the feature information is fed into a new discriminator to construct a new discrimination process, which incorporates the idea that the content is unchanged across the two conversions, that is, the image feature information re-extracted from the converted image should be similar to the image feature information extracted from the given sample; to avoid an excessively large loss, and hence an unbalanced loss function, that would result from naively taking the image difference, this process is completed with the adversarial idea: the twice-extracted image feature information is fitted to the vicinity of the same feature distribution, and the loss function forces the generator parameters in the network to be further optimized toward a better feature extraction process, so that the division of the feature domain is more accurate, achieving the purpose of optimizing the division of the feature domain and the content domain;
by adding a new discriminator of category features, with the features of the reconstructed image adopted as the pseudo label and the features of the real image as the real label, the correlation judgment is completed; the added loss function is:
L_adv = E_x[log D(x, l)] + E_x[log(1 - D(G(x), l))]
where x denotes the input image, l denotes the feature class image, D denotes the discriminator, G denotes the generator, and E denotes the expectation;
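The adversarial fitting of the twice-extracted feature information described above can be sketched roughly as follows (an illustrative numpy sketch in which the discriminator is a plain logistic model on feature vectors; all shapes and names are assumptions for illustration, not the network of the present invention):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_loss(d_w, real_feat, fake_feat):
    """Binary adversarial loss: real-image features serve as the real label,
    features re-extracted from reconstructed images act as the pseudo label."""
    p_real = sigmoid(real_feat @ d_w)   # D should push these toward 1
    p_fake = sigmoid(fake_feat @ d_w)   # ...and these toward 0
    return -np.mean(np.log(p_real + 1e-8) + np.log(1.0 - p_fake + 1e-8))

def generator_loss(d_w, fake_feat):
    """Non-saturating generator term: fit the re-extracted features to the
    real feature distribution by fooling the discriminator."""
    p_fake = sigmoid(fake_feat @ d_w)
    return -np.mean(np.log(p_fake + 1e-8))

rng = np.random.default_rng(1)
d_w = rng.standard_normal(16) * 0.1
real = rng.standard_normal((32, 16)) + 1.0   # features of real images
fake = rng.standard_normal((32, 16)) - 1.0   # features of reconstructed images
ld = discriminator_loss(d_w, real, fake)
lg = generator_loss(d_w, fake)
```

Minimizing lg with respect to the generator pulls the two feature distributions together, which is the fitting effect the claim describes.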
step 2.5.2: adding a noise strategy;
random information is introduced to add noise information redundancy to the decoder module in the generator network, specifically:
new_vec = Cov(Cov(Cov(concat(co_vec, z), z), z), z)
where x_img denotes the input image; co_vec denotes the content vector extracted from x_img; concat(·,·) denotes the noise addition operation; Cov(·,·) denotes a convolution process; z denotes the noise information; new_vec denotes the new content vector;
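The noise addition above can be sketched as follows (a minimal numpy illustration that loosely interprets each Cov(·, z) as splicing fresh noise channels onto the current vector and mixing them with a 1×1 convolution; all shapes and weights are illustrative assumptions):

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution over channels: x is (C_in, H, W), w is (C_out, C_in)."""
    c_in, h, wdt = x.shape
    return (w @ x.reshape(c_in, -1)).reshape(w.shape[0], h, wdt)

def add_noise(content_vec, z_channels, w_list):
    """new_vec = Cov(Cov(Cov(concat(co_vec, z), z), z), z): repeatedly
    concatenate fresh noise channels and mix them in with a convolution."""
    rng = np.random.default_rng(42)
    x = content_vec
    for w in w_list:
        z = rng.standard_normal((z_channels,) + x.shape[1:])  # noise z
        x = conv1x1(np.concatenate([x, z], axis=0), w)        # concat, then Cov
    return x

c, zc, h, wd = 8, 2, 4, 4
rng = np.random.default_rng(0)
# three mixing convolutions, each consuming (current + noise) channels
ws = [rng.standard_normal((c, c + zc)) * 0.1 for _ in range(3)]
co_vec = rng.standard_normal((c, h, wd))
new_vec = add_noise(co_vec, zc, ws)
```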
step 2.5.3: constructing a reconstruction strategy according to the characteristics of the source domain and the target domain, and completing the image conversion task with the weakened cyclic semantic consistency;
the cyclic consistency reconstruction loss, i.e. the weakened cyclic semantic consistency, is composed of a forward loss consistency term and a backward loss consistency term;
the forward loss consistency is specifically:
where Conv(·) denotes a feature extraction process; λ1 denotes a hyper-parameter of the forward conversion; A_img, A′_img, B_img and C_img respectively denote images; Dec(·,·) denotes image decoding according to given image category information and image content information; Φ(·) denotes the image class extraction process; and a further operator denotes the image content feature information extraction process;
the backward loss consistency is specifically:
where λ2 denotes a hyper-parameter of the backward conversion, and B′_img and C′_img respectively denote images;
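The weakening of cyclic semantic consistency, i.e. comparing extracted features of the round trip rather than enforcing a strict pixel-wise identity, can be sketched as follows (a minimal numpy illustration with a random projection standing in for the feature extractor Conv(·); the loss form and names are illustrative assumptions, not the exact loss of the present invention):

```python
import numpy as np

def feature_extract(img, w):
    """Stand-in for Conv(.): a fixed random projection followed by ReLU."""
    return np.maximum(w @ img.ravel(), 0.0)

def weakened_cycle_loss(a_img, a_cycle, w, lam=0.1):
    """Penalize the feature-space gap between an image and its round-trip
    reconstruction, instead of a strict pixel-wise identity."""
    fa = feature_extract(a_img, w)
    fc = feature_extract(a_cycle, w)
    return lam * np.mean((fa - fc) ** 2)

rng = np.random.default_rng(3)
w = rng.standard_normal((32, 64)) * 0.1
a = rng.standard_normal(64)
a_shifted = a + 0.05 * rng.standard_normal(64)   # an imperfect round trip
loss_same = weakened_cycle_loss(a, a, w)
loss_shift = weakened_cycle_loss(a, a_shifted, w)
```

Because the comparison happens after feature extraction, small pixel-level deformations that preserve semantics incur only a small penalty, which is what allows large-deformation conversions such as single-flower to multi-flower.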
step 2.6: judging whether the maximum number of iterations has been reached; if not, returning to step 2.2; otherwise, outputting the trained image conversion model;
step 3: inputting the image to be converted into the trained image conversion model to obtain a converted image with a better visual effect.
2. The method of claim 1, wherein the enhancing of the image contrast information and the edge information in step 2.2 specifically comprises:
step 2.2.1: enhancing the contrast information of the image by adopting a histogram equalization algorithm, so that the processed image satisfies a uniform probability density distribution;
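Step 2.2.1 can be sketched as follows (a minimal numpy illustration of histogram equalization; function names are illustrative, not the patented implementation):

```python
import numpy as np

def histogram_equalize(gray):
    """Equalize an 8-bit grayscale image so its histogram is roughly uniform."""
    hist = np.bincount(gray.ravel(), minlength=256)       # per-level counts
    cdf = hist.cumsum().astype(np.float64)                # cumulative distribution
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())     # normalize to [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)            # gray-level mapping table
    return lut[gray]

# A low-contrast ramp: gray levels 100..139 only
img = np.tile(np.arange(100, 140, dtype=np.uint8), (8, 1))
eq = histogram_equalize(img)
```

After the mapping, the narrow 100-139 range is stretched across the full 0-255 scale, which is the contrast enhancement the claim describes.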
step 2.2.2: taking the gray image enhanced by the histogram equalization algorithm as the input of the Canny algorithm and performing Gaussian smoothing; performing gradient solving on the smoothed image, completing the acquisition of edge information features by non-maximum suppression and double-threshold detection, and adding the edge information features to the image feature set;
the smoothed image is:
g(x, y) = G(x, y, σ) * f(x, y), G(x, y, σ) = (1/(2πσ²)) exp(-(x² + y²)/(2σ²))
where f(x, y) denotes the input gray image; * denotes the convolution process; G(x, y, σ) denotes the Gaussian function; σ denotes the standard deviation;
gradient solving is performed on the smoothed image, specifically:
M(x, y) = sqrt(gx² + gy²), α(x, y) = arctan(gy/gx)
where gx denotes the gradient in the x direction; gy denotes the gradient in the y direction; M(x, y) denotes the image gradient magnitude; α(x, y) denotes the image gradient direction.
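The Gaussian smoothing and gradient solving of step 2.2.2 can be sketched as follows (a minimal numpy illustration using Sobel operators, a common choice for the Canny gradient stage; all function names and kernel sizes are illustrative assumptions):

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """2-D Gaussian kernel G(x, y, sigma), normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def convolve2d(img, kernel):
    """Naive 'same' 2-D convolution with zero padding."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img, dtype=np.float64)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i+kh, j:j+kw] * kernel[::-1, ::-1])
    return out

def gradient_magnitude_direction(gray, sigma=1.0):
    """Smooth with a Gaussian, then return gradient magnitude M and direction alpha."""
    smoothed = convolve2d(gray.astype(np.float64), gaussian_kernel(5, sigma))
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    gx = convolve2d(smoothed, sobel_x)       # gradient in x direction
    gy = convolve2d(smoothed, sobel_x.T)     # gradient in y direction
    m = np.hypot(gx, gy)                     # M(x, y) = sqrt(gx^2 + gy^2)
    alpha = np.arctan2(gy, gx)               # alpha(x, y) = arctan(gy / gx)
    return m, alpha

# Vertical step edge: the gradient magnitude should peak along the edge
img = np.zeros((16, 16)); img[:, 8:] = 255.0
m, alpha = gradient_magnitude_direction(img)
```

Non-maximum suppression and double-threshold detection would then thin and binarize M(x, y) into the final edge map.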
3. The method of claim 1, wherein the feature processing using the compression operation and the excitation operation of the channel attention mechanism in step 2.3 specifically comprises:
the compression operation is specifically:
z_c = F_sq(u_c) = (1/(H × W)) Σ_{i=1..H} Σ_{j=1..W} u_c(i, j)
where z_c denotes the statistic of channel c; H and W denote the spatial dimensions of the feature map; u_c denotes the feature map of channel c; F_sq denotes the compression operation;
the excitation operation applies a gating module and a Sigmoid activation function to learn the nonlinear description information between channels, formally expressed as:
s = F_ex(z, W) = σ(W2 δ(W1 z))
where σ denotes the Sigmoid activation function, δ denotes the ReLU activation function, and W1 and W2 denote the weights of the gating module.
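The compression and excitation operations of step 2.3 can be sketched together as follows (a minimal numpy illustration of a squeeze-and-excitation channel attention block; the weight shapes and reduction ratio are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def squeeze_excite(u, w1, w2):
    """Channel attention: squeeze by global average pooling, then gated excitation.

    u  : feature map of shape (C, H, W)
    w1 : weights (C // r, C) of the dimensionality-reduction layer
    w2 : weights (C, C // r) of the dimensionality-restoration layer
    """
    c, h, w = u.shape
    z = u.reshape(c, -1).mean(axis=1)           # squeeze: z_c = (1/(H*W)) sum u_c(i, j)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))   # excitation: s = sigmoid(W2 ReLU(W1 z))
    return u * s[:, None, None]                 # rescale each channel by its weight s_c

rng = np.random.default_rng(0)
u = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8)) * 0.1          # reduction ratio r = 4
w2 = rng.standard_normal((8, 2)) * 0.1
out = squeeze_excite(u, w1, w2)
```

Each channel is attenuated by a learned weight in (0, 1), so informative channels can be emphasized relative to the rest.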
4. The method of claim 1, wherein the upsampling operation performed on the feature-processed image feature set in combination with sub-pixel convolution in step 2.4 specifically comprises:
I_out = f^L(I_input) = PS(W_L * f^(L-1)(I_input) + b_L)
where I_out denotes the output value; I_input denotes the input value; f denotes the network; L denotes the L-th layer; b_L denotes the bias of the L-th layer; * denotes the convolution operation; W_L denotes the parameters of the L-th layer; PS(·) denotes the pixel adjustment function, specifically:
PS(T)_(x, y, c) = T_(⌊x/r⌋, ⌊y/r⌋, C·r·mod(y, r) + C·mod(x, r) + c)
where T denotes the input feature set; ⌊·⌋ denotes the rounding-down operation; x denotes the expanded feature length; y denotes the expanded feature width; r denotes the upsampling multiple; mod(·,·) denotes the modulo operation; C denotes the number of feature channels after expansion; c denotes the channel index.
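The pixel adjustment function PS(·) of step 2.4 can be sketched as follows (a minimal numpy illustration of sub-pixel rearrangement; the channel-ordering convention is an assumption following the common sub-pixel convolution layout):

```python
import numpy as np

def pixel_shuffle(t, r):
    """Sub-pixel rearrangement PS: (C*r^2, H, W) -> (C, H*r, W*r).

    Each output pixel PS(T)[c, y, x] is taken from the channel block of c at the
    sub-pixel offset (y mod r, x mod r) of input position (y // r, x // r).
    """
    cr2, h, w = t.shape
    assert cr2 % (r * r) == 0
    c = cr2 // (r * r)
    out = np.zeros((c, h * r, w * r), dtype=t.dtype)
    for ch in range(c):
        for y in range(h * r):
            for x in range(w * r):
                out[ch, y, x] = t[ch * r * r + (y % r) * r + (x % r), y // r, x // r]
    return out

t = np.arange(2 * 4 * 3 * 3, dtype=np.float64).reshape(2 * 4, 3, 3)  # C=2, r=2
up = pixel_shuffle(t, 2)
```

Upsampling is thus a pure rearrangement of values: the low-resolution channels are folded into a higher-resolution spatial grid without interpolation.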
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210170641.6A CN114565806A (en) | 2022-02-24 | 2022-02-24 | Feature domain optimization small sample image conversion method based on characterization enhancement |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114565806A true CN114565806A (en) | 2022-05-31 |
Family
ID=81714453
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210170641.6A Pending CN114565806A (en) | 2022-02-24 | 2022-02-24 | Feature domain optimization small sample image conversion method based on characterization enhancement |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116579918A (en) * | 2023-05-19 | 2023-08-11 | Harbin Engineering University | Attention mechanism multi-scale image conversion method based on style independent discriminator
CN116579918B (en) * | 2023-05-19 | 2023-12-26 | Harbin Engineering University | Attention mechanism multi-scale image conversion method based on style independent discriminator
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||