CN114565806A - Feature domain optimization small sample image conversion method based on characterization enhancement

Info

Publication number
CN114565806A
Authority
CN
China
Prior art keywords
image, representing, feature, information, domain
Prior art date
Legal status
Pending
Application number
CN202210170641.6A
Other languages
Chinese (zh)
Inventor
王兴梅 (Wang Xingmei)
王坤华 (Wang Kunhua)
陈伟京 (Chen Weijing)
李孟昊 (Li Menghao)
Current Assignee
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN202210170641.6A
Publication of CN114565806A
Status: Pending


Classifications

    • G06F18/214 Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N3/045 Neural networks: combinations of networks
    • G06N3/08 Neural networks: learning methods
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T5/70 Image enhancement or restoration: denoising; smoothing
    • G06T5/90 Image enhancement or restoration: dynamic range modification of images or parts thereof
    • G06T7/13 Image analysis: edge detection
    • G06T2207/20081 Special algorithmic details: training; learning
    • G06T2207/20084 Special algorithmic details: artificial neural networks [ANN]


Abstract

The invention belongs to the technical field of image processing, and specifically relates to a feature domain optimization small sample image conversion method based on characterization enhancement. The method introduces prior knowledge from the histogram equalization algorithm and the Canny algorithm to enhance image contrast information and edge information; on this basis, a channel attention mechanism based on sub-pixel convolution improves the network feature extraction capability and enhances image characterization, solving the problem that fine image features are not prominent. The invention further provides a conversion mechanism based on a feature domain optimization algorithm: a feature domain and a content domain are adaptively divided by introducing an adversarial idea, and the parameter space is reduced by exploiting the many rich image classes in the source domain; an added noise strategy keeps the network from being limited to generating a single sample, alleviating the mode collapse problem; and a reconstruction strategy built on the characteristics of the source and target domains completes the small sample image conversion task with weakened cyclic semantic consistency, yielding converted images with better visual effect.

Description

Feature domain optimization small sample image conversion method based on characterization enhancement
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a feature domain optimization small sample image conversion method based on characterization enhancement.
Background
Images serve as an important carrier for environmental perception and knowledge acquisition, and are of great significance for scientific exploration, research, and information mining. The image conversion task uses a deep learning algorithm to learn, from a given dataset, a good network mapping that generates images having both the characteristics of the target data domain and the content of the source data domain. Research on efficient and reliable image conversion algorithms therefore has important theoretical value and practical significance; scholars at home and abroad have studied image conversion in depth and achieved important results. The best-known and most effective image conversion methods in the existing literature mainly include: 1. Supervised image conversion based on generative adversarial networks: in 2016, Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, et al. (Image-to-image translation with conditional adversarial networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, Hawaii, The United States of America, 2017: 1125-1134) proposed solving the large-scale supervised image conversion task by introducing a conditional generative adversarial network. 2. Unsupervised image conversion based on generative adversarial networks: in 2017, Jun-Yan Zhu, Taesung Park, Phillip Isola, et al. (Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy, 2017: 2223-2232) proposed cycle-consistent adversarial networks for image conversion without paired training data. 3. Unsupervised image conversion based on a shared domain: in 2017, Ming-Yu Liu, Thomas Breuel, Jan Kautz (Unsupervised image-to-image translation networks. Advances in Neural Information Processing Systems. California, The United States of America, 2017: 700-708) proposed a conversion built on the assumption that two image domains share the same content domain and differ only in the form of expression between domains, obtaining a better conversion effect.
When image conversion involves the small sample and unsupervised fields, supervised methods cannot be used for training at all, and solutions similar to cycle consistency require large-scale image datasets, so they cannot be applied to the small sample field. At present, scholars at home and abroad have studied image conversion algorithms for the small sample field in depth, mainly including: 1. Small sample image generation based on data enhancement: in 2018, Hang Gao, Zheng Shou, Alireza Zareian, et al. (Low-shot learning via covariance-preserving adversarial augmentation networks. arXiv preprint arXiv:1810.11730, 2018: 1-13) proposed using a generative adversarial network to synthesize similar data from large datasets, alleviating the data shortage of small sample target domains and obtaining better results. 2. Small sample image classification based on regularized meta-learning: in 2018, Yabin Zhang, Hui Tang, Kui Jia (Fine-grained visual categorization using meta-learning optimization with sample selection of auxiliary data. Proceedings of the European Conference on Computer Vision (ECCV). Munich, Germany, 2018: 233-248) proposed meta-learning optimization with sample selection of auxiliary data for fine-grained classification with few samples. 3. Unsupervised image conversion based on small samples: in 2019, Ming-Yu Liu, Xun Huang, Arun Mallya, et al. (Few-shot unsupervised image-to-image translation. Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, Korea, 2019: 10551-10560) proposed the FUNIT network, which completes unsupervised image conversion with only a few target domain samples.
Disclosure of Invention
The invention aims to provide a feature domain optimization small sample image conversion method based on characterization enhancement, which can retain more image detail information and generate richer converted images.
A feature domain optimization small sample image conversion method based on characterization enhancement comprises the following steps:
step 1: acquiring a small sample image dataset;
step 2: training an image conversion model;
step 2.1: initializing parameters and setting the maximum iteration times;
step 2.2: enhancing the image contrast information and the edge information;
step 2.3: performing characteristic processing by utilizing a compression operation and an excitation operation of a channel attention mechanism;
step 2.4: performing up-sampling operation on the image feature set after feature processing by combining sub-pixel convolution;
step 2.5: completing small sample image conversion with rich details by a conversion mechanism based on a feature domain optimization algorithm;
step 2.5.1: introducing an adversarial idea, adaptively dividing a feature domain and a content domain, and reducing the parameter space by using the many rich image classes in the source domain;
the feature information is fed into a new discriminator to construct a new discrimination process, which embodies the idea that the content of the two conversions is unchanged: after the image feature information of the converted image is re-extracted, it should be similar to the image feature information extracted from the given sample; to avoid an excessively large loss caused by naively applying the image difference, which would unbalance the loss function, this process is completed with the adversarial idea, fitting the twice-extracted image feature information to the vicinity of the same feature distribution, and the loss function forces the parameters of the generator in the network to be further optimized and a better feature extraction process to be completed, so that the division of the feature domain is more accurate, achieving the purpose of optimizing the division of the feature domain and the content domain;
by adding a new discriminator of category characteristics, adopting the characteristics of a reconstructed image as a pseudo label and the characteristics of a real image as a real label, the related judgment is completed, and the added loss function is as follows:
$$\mathcal{L}_{F}(D,G)=\mathbb{E}_{x}\left[\log D(x,l)\right]+\mathbb{E}_{x,l}\left[\log\left(1-D(G(x,l),l)\right)\right]$$

where x denotes the input image, l denotes the feature class image, D denotes the discriminator, G denotes the generator, and $\mathbb{E}$ denotes the expectation;
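For illustration, a minimal PyTorch sketch of this added category-feature discrimination follows; the BCE-with-logits formulation, the function name, and the tensor shapes are assumptions for the sketch, not details fixed by the patent.

```python
# Hedged sketch: features re-extracted from the reconstructed image act as
# the pseudo (fake) label, features of the real image as the real label.
import torch
import torch.nn.functional as F

def feature_discriminator_loss(scores_real: torch.Tensor,
                               scores_fake: torch.Tensor) -> torch.Tensor:
    # scores_real: discriminator logits on real-image features
    # scores_fake: discriminator logits on features re-extracted from G(x, l)
    loss_real = F.binary_cross_entropy_with_logits(
        scores_real, torch.ones_like(scores_real))
    loss_fake = F.binary_cross_entropy_with_logits(
        scores_fake, torch.zeros_like(scores_fake))
    return loss_real + loss_fake

# usage with dummy logits
d_loss = feature_discriminator_loss(torch.randn(4, 1), torch.randn(4, 1))
```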
step 2.5.2: adding a noise strategy;
random information is introduced by redundantly adding noise information to the decoder module in the generator network, specifically:

$$co_{vec}=E_c(x_{img})$$

$$new_{vec}=Cov(Cov(Cov(concat(co_{vec},z),z),z),z)$$

where $x_{img}$ denotes the input image; $co_{vec}$ denotes the content vector; $concat(\cdot,\cdot)$ denotes the noise addition operation; $Cov(\cdot,\cdot)$ denotes the convolution process; $E_c(\cdot)$ denotes the extraction process of the content vector; z denotes the noise information; and $new_{vec}$ denotes the new content vector;
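A minimal PyTorch sketch of this noise injection follows, matching the nesting of the $new_{vec}$ formula; the channel counts, the use of a spatial map for $co_{vec}$, and the class name are assumptions.

```python
# Hedged sketch: noise z is concatenated with the content representation
# before each of four convolutions, so the random information remains
# available to the decoder instead of being ignored after training.
import torch
import torch.nn as nn

class NoisyContentHead(nn.Module):
    def __init__(self, content_ch: int = 256, noise_ch: int = 16):
        super().__init__()
        self.noise_ch = noise_ch
        self.convs = nn.ModuleList(
            [nn.Conv2d(content_ch + noise_ch, content_ch, 3, padding=1)
             for _ in range(4)])

    def forward(self, co_vec: torch.Tensor) -> torch.Tensor:
        h = co_vec
        for conv in self.convs:
            z = torch.randn(h.size(0), self.noise_ch, h.size(2), h.size(3),
                            device=h.device)
            h = conv(torch.cat([h, z], dim=1))   # Cov(concat(., z))
        return h                                 # new_vec

new_vec = NoisyContentHead()(torch.randn(1, 256, 16, 16))
```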
step 2.5.3: according to the characteristics of the source domain and the target domain, constructing a reconstruction strategy and completing the image conversion task by using the weakened cyclic semantic consistency;
the cycle consistency reconstruction loss is specifically:

$$\mathcal{L}_{cyc}=\mathcal{L}_{fwd}+\mathcal{L}_{bwd}$$

where $\mathcal{L}_{cyc}$ denotes the weakened cyclic semantic consistency; $\mathcal{L}_{fwd}$ denotes the forward loss consistency; and $\mathcal{L}_{bwd}$ denotes the backward loss consistency;
the forward loss consistency is specifically:

$$\mathcal{L}_{fwd}=\lambda_1\left\|Conv(A_{img})-Conv(A'_{img})\right\|_1$$

with

$$C_{img}=Dec\left(\phi(B_{img}),E_c(A_{img})\right),\qquad A'_{img}=Dec\left(\phi(A_{img}),E_c(C_{img})\right)$$

where $Conv(\cdot)$ denotes the feature extraction process; $\lambda_1$ denotes the hyper-parameter of the forward conversion; $A_{img}$, $A'_{img}$, $B_{img}$ and $C_{img}$ respectively denote images; $Dec(\cdot,\cdot)$ denotes image decoding according to the given image category information and image content information; $\phi(\cdot)$ denotes the image category extraction process; and $E_c(\cdot)$ denotes the image content feature information extraction process;
the backward loss consistency is:

$$\mathcal{L}_{bwd}=\lambda_2\left\|Conv(B_{img})-Conv(B'_{img})\right\|_1$$

with

$$C'_{img}=Dec\left(\phi(A_{img}),E_c(B_{img})\right),\qquad B'_{img}=Dec\left(\phi(B_{img}),E_c(C'_{img})\right)$$

where $\lambda_2$ denotes the hyper-parameter of the backward conversion, and $B'_{img}$ and $C'_{img}$ respectively denote images;
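A minimal PyTorch sketch of the weakened cyclic semantic consistency follows: the pixel-level cycle error is relaxed to a feature-level error by comparing $Conv(\cdot)$ feature maps; the extractor depth, the L1 distance, and all names are assumptions consistent with the formulas above.

```python
# Hedged sketch: feature-level (not pixel-level) cycle reconstruction loss.
import torch
import torch.nn as nn

conv_extractor = nn.Sequential(      # plays the role of Conv(.) in the loss
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1))

def weakened_cycle_loss(a_img, a_rec, b_img, b_rec, lam1=1.0, lam2=1.0):
    fwd = lam1 * (conv_extractor(a_img) - conv_extractor(a_rec)).abs().mean()
    bwd = lam2 * (conv_extractor(b_img) - conv_extractor(b_rec)).abs().mean()
    return fwd + bwd                 # L_cyc = L_fwd + L_bwd

x = torch.randn(1, 3, 64, 64)
loss = weakened_cycle_loss(x, 0.9 * x, x, 1.1 * x)
```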
step 2.6: judging whether the maximum iteration number is reached; if not, returning to step 2.2; otherwise, outputting the trained image conversion model;
step 3: inputting the image to be converted into the trained image conversion model to obtain a converted image with better visual effect.
Further, the method for performing enhancement processing on the image contrast information and the edge information in step 2.2 specifically includes:
step 2.2.1: enhancing the contrast information of the image by adopting a histogram equalization algorithm, so that the processed image satisfies a uniform probability density distribution;
step 2.2.2: taking the gray level image enhanced by the histogram equalization algorithm as the input of the Canny algorithm and performing Gaussian smoothing; solving the gradient of the smoothed image, completing the acquisition of edge information features with non-maximum suppression and double-threshold detection, and adding the edge information features to the image feature set;
the smoothed image is:

$$f_s(x,y)=f(x,y)*G(x,y),\qquad G(x,y)=\frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)$$

where $f(x,y)$ denotes the input gray image; $*$ denotes the convolution process; $G(x,y)$ denotes the Gaussian function; and $\sigma$ denotes the standard deviation;
the gradient of the smoothed image is solved, specifically:

$$g_x=\frac{\partial f_s(x,y)}{\partial x},\qquad g_y=\frac{\partial f_s(x,y)}{\partial y}$$

$$M(x,y)=\sqrt{g_x^2+g_y^2},\qquad \alpha(x,y)=\arctan\left(\frac{g_y}{g_x}\right)$$

where $g_x$ denotes the gradient in the x-direction; $g_y$ denotes the gradient in the y-direction; $M(x,y)$ denotes the image gradient magnitude; and $\alpha(x,y)$ denotes the image direction.
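A minimal OpenCV sketch of this step 2.2 preprocessing follows; the Gaussian kernel size, the standard deviation, and the Canny double thresholds are illustrative assumptions, since the patent does not fix their values here.

```python
# Hedged sketch of step 2.2: histogram equalization, Gaussian smoothing,
# and Canny edge extraction, with the edge map appended as an extra feature.
import cv2
import numpy as np

def enhance_contrast_and_edges(gray: np.ndarray) -> np.ndarray:
    # step 2.2.1: equalize so the output approaches a uniform density
    equalized = cv2.equalizeHist(gray)
    # step 2.2.2: smooth, then Canny (gradient computation, non-maximum
    # suppression and double-threshold detection happen inside cv2.Canny)
    smoothed = cv2.GaussianBlur(equalized, (5, 5), 1.4)
    edges = cv2.Canny(smoothed, 100, 200)
    # add the edge-information features to the image feature set
    return np.stack([equalized, edges], axis=0)

features = enhance_contrast_and_edges(
    np.random.randint(0, 256, (128, 128), dtype=np.uint8))
```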
Further, the method for performing feature processing by using the compression operation and the excitation operation of the channel attention mechanism in step 2.3 specifically includes:
the compression operation is specifically:

$$z_c=F_{sq}(u_c)=\frac{1}{H\times W}\sum_{i=1}^{H}\sum_{j=1}^{W}u_c(i,j)$$

where $z_c$ denotes the statistic of the c-th channel; H and W denote the spatial dimensions of the feature map; $u_c$ denotes the feature map; and $F_{sq}$ denotes the compression operation;
the excitation operation applies a gating module and a Sigmoid activation function to learn the nonlinear description information between channels, formally expressed as:

$$s=F_{ex}(z,W)=\sigma(W_2\,\delta(W_1 z))$$

where $\sigma$ denotes the Sigmoid function $\sigma(\eta)=\frac{1}{1+e^{-\eta}}$, with $\eta$ a variable and e the natural constant; $W_1\in\mathbb{R}^{\frac{C}{r}\times C}$ and $W_2\in\mathbb{R}^{C\times\frac{C}{r}}$, with C the number of channels and r the dimensionality-reduction ratio; $\delta$ denotes the ReLU function $\delta(\eta)=\max(0,\eta)$; and $F_{ex}$ denotes the excitation operation.
Further, the method of performing the up-sampling operation on the feature-processed image feature set in step 2.4 by combining sub-pixel convolution specifically includes:

$$I_{out}=f^L(I_{input})=PS\left(W_L * f^{L-1}(I_{input})+b_L\right)$$

where $I_{out}$ denotes the output value; $I_{input}$ denotes the input value; f denotes the network; L denotes the L-th layer; $b_L$ denotes the bias of the L-th layer; $*$ denotes the convolution operation; $W_L$ denotes the parameters of the L-th layer; and $PS(\cdot)$ denotes the pixel adjustment function, specifically:

$$PS(T)_{x,y,c}=T_{\lfloor x/r\rfloor,\ \lfloor y/r\rfloor,\ C\cdot r\cdot \mathrm{mod}(y,r)+C\cdot \mathrm{mod}(x,r)+c}$$

where T denotes the input feature set; $\lfloor\cdot\rfloor$ denotes the rounding-down operation; x denotes the expanded feature length; r denotes the up-sampling multiple; y denotes the expanded feature width; C denotes the number of feature channels before expansion; $\mathrm{mod}(\cdot,\cdot)$ denotes the modulo operation; and c denotes the feature channel index after expansion.
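A minimal PyTorch sketch of the 2x sub-pixel convolution upsampling follows; nn.PixelShuffle realizes the pixel adjustment function $PS(\cdot)$, and the channel counts are assumptions.

```python
# Hedged sketch of step 2.4: a convolution emits C * r^2 channels, which
# PixelShuffle rearranges from (C*r^2, H, W) into (C, H*r, W*r).
import torch
import torch.nn as nn

r = 2                                      # upsampling multiple
upsample = nn.Sequential(
    nn.Conv2d(64, 64 * r * r, kernel_size=3, padding=1),  # W_L, b_L
    nn.PixelShuffle(r),                    # the PS(.) rearrangement
)

y = upsample(torch.randn(1, 64, 32, 32))   # y.shape == (1, 64, 64, 64)
```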
The invention has the beneficial effects that:
according to the method, the contrast information and the edge information of the image are enhanced by introducing the priori knowledge of a histogram equalization algorithm and a Canny algorithm, on the basis, the method provides a channel attention mechanism based on sub-pixel convolution to improve the network feature extraction capability, performs image characterization enhancement, solves the problem that fine features of the image are not prominent, and is beneficial to performing subsequent conversion tasks. The invention provides a conversion mechanism based on a characteristic domain optimization algorithm, adaptively divides a characteristic domain and a content domain by introducing a countermeasure idea, reduces the parameter space by utilizing various rich images in a source domain, adds a noise strategy to enable a network not to be limited by the generation of a single sample, slows down the problem of mode collapse, constructs a reconstruction strategy according to the characteristics of the source domain and a target domain, completes the conversion task of small-sample images by utilizing the weakened cyclic semantic consistency, and obtains a converted image with better visual effect. The invention has better conversion effect on small sample image conversion, can retain more image detail information, generates richer conversion images and has certain effectiveness.
Drawings
FIG. 1 is a flow chart of the present invention.
Fig. 2(a) -2 (b) are graphs of comparative experiment results of 10000 rounds of underwater sonar small sample data set training, fig. 2(a) is a graph of original FUNIT network experiment results, and fig. 2(b) is a graph of original FUNIT network experiment results after contrast information and edge information enhancement processing.
Fig. 3(a) -3 (b) are graphs of results of 50000 rounds of comparative experiments in the Oxford optical small sample image data set training, fig. 3(a) is a graph of results of experiments of an original FUNIT network, and fig. 3(b) is a graph of results of experiments of the original FUNIT network after contrast information and edge information enhancement processing.
FIG. 4 is a schematic diagram of a network structure of a channel attention mechanism based on sub-pixel convolution according to the present invention.
Fig. 5(a)-5(b) are result graphs of 10000 rounds of comparative experiments on the underwater sonar small sample dataset (underwater targets with complex shapes); fig. 5(a) is an experiment result graph of the original FUNIT network (box-selecting underwater targets with complex shapes), and fig. 5(b) is an experiment result graph of the original FUNIT network after characterization enhancement.
Fig. 6(a)-6(b) are result graphs of 57500 rounds of comparative experiments on the Oxford optical small sample image dataset (targets with complex shapes); fig. 6(a) is an experiment result graph of the original FUNIT network (box-selecting targets with complex shapes), and fig. 6(b) is an experiment result graph of the original FUNIT network after characterization enhancement.
Fig. 7 is a schematic diagram of the network structure for adaptively dividing a feature domain and a content domain by introducing an adversarial idea.
Fig. 8 is a schematic diagram of a network structure for noise policy addition.
Fig. 9 is a schematic network structure of an image reconstruction strategy.
Fig. 10(a)-10(b) are graphs of comparative experiment results of 10000 rounds of underwater sonar small sample dataset training with the adversarial idea introduced; fig. 10(a) is an experiment result graph of the original FUNIT network (box-selecting underwater targets with a single shape), and fig. 10(b) is an experiment result graph of the original FUNIT network after the adversarial idea is introduced.
Fig. 11(a)-11(b) are graphs of comparative experiment results of 10000 rounds of underwater sonar small sample dataset training with the reconstruction strategy constructed; fig. 11(a) is an experiment result graph of the original FUNIT network (box-selecting an underwater target with large deformation), and fig. 11(b) is an experiment result graph of the original FUNIT network after the reconstruction strategy is constructed.
Fig. 12(a)-12(b) are graphs of comparative experiment results of 50000 rounds of Oxford flower optical small sample image dataset training with the adversarial idea introduced; fig. 12(a) is an experiment result graph of the original FUNIT network (box-selecting targets with rich detail information), and fig. 12(b) is an experiment result graph of the original FUNIT network after the adversarial idea is introduced.
Fig. 13(a)-13(b) are graphs of comparative experiment results of 50000 rounds of Oxford optical small sample image dataset training with the reconstruction strategy constructed; fig. 13(a) is an experiment result graph of the original FUNIT network (box-selecting an optical target converted from a single plant to multiple plants, i.e., a target with large deformation), and fig. 13(b) is an experiment result graph of the original FUNIT network after the reconstruction strategy is constructed.
Fig. 14 is a schematic network structure diagram of a feature domain optimization small sample image transformation method based on characterization enhancement according to the present invention.
Fig. 15(a)-15(b) are comparative experiment result graphs of 10000 training rounds of the feature domain optimization small sample image conversion method based on characterization enhancement provided by the present invention (underwater sonar small sample dataset); fig. 15(a) is an experiment result graph of the original FUNIT network (box-selecting underwater targets with fewer texture features), and fig. 15(b) is an experiment result graph of the method provided by the present invention.
Fig. 16(a)-16(b) are comparative experiment result graphs of 50000 training rounds of the feature domain optimization small sample image conversion method based on characterization enhancement provided by the present invention (Oxford flower optical small sample image dataset); fig. 16(a) is an experiment result graph of the original FUNIT network (box-selecting targets with fewer texture features), and fig. 16(b) is an experiment result graph of the method provided by the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention provides a feature domain optimization small sample image conversion method based on characterization enhancement, aiming to obtain a better small sample image conversion effect. The method comprises: (1) enhancing the image contrast information and edge information according to the characteristics of the small sample images in the dataset; (2) providing a channel attention mechanism based on sub-pixel convolution to improve the network feature extraction capability for characterization enhancement; (3) providing a conversion mechanism based on a feature domain optimization algorithm to complete small sample image conversion with rich details. The method addresses the blurring of contrast and edge information caused by the acquisition means of small sample images by enhancing the image contrast information and edge information; on this basis, the channel attention mechanism based on sub-pixel convolution improves the network feature extraction capability and enhances image characterization, solving the problem that fine image features are not prominent; and to address the insufficient optimization of the parameter space and single-sample generation caused by the limited number of samples, the conversion mechanism based on the feature domain optimization algorithm adaptively divides a feature domain and a content domain by introducing an adversarial idea, reduces the parameter space by exploiting the many rich image classes in the source domain, reduces single-sample generation by adding a noise strategy, alleviating the mode collapse problem, constructs a reconstruction strategy according to the characteristics of the source and target domains, and completes the small sample image conversion task with weakened cyclic semantic consistency, obtaining converted images with better visual effect. The method has a good conversion effect on small sample image conversion, can retain more image detail information, generates richer converted images, and has certain effectiveness.
A feature domain optimization small sample image conversion method based on characterization enhancement comprises the following steps:
step 1: acquiring a small sample image dataset;
step 2: training an image conversion model;
step 2.1: initializing parameters and setting the maximum iteration times;
step 2.2: enhancing the image contrast information and the edge information;
step 2.2.1: enhancing the contrast information of the image by adopting a histogram equalization algorithm, so that the processed image satisfies a uniform probability density distribution;
step 2.2.2: taking the gray level image enhanced by the histogram equalization algorithm as the input of the Canny algorithm and performing Gaussian smoothing; solving the gradient of the smoothed image, completing the acquisition of edge information features with non-maximum suppression and double-threshold detection, and adding the edge information features to the image feature set;
the smoothed image is:

$$f_s(x,y)=f(x,y)*G(x,y),\qquad G(x,y)=\frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)$$

where $f(x,y)$ denotes the input gray image; $*$ denotes the convolution process; $G(x,y)$ denotes the Gaussian function; and $\sigma$ denotes the standard deviation;
the gradient of the smoothed image is solved, specifically:

$$g_x=\frac{\partial f_s(x,y)}{\partial x},\qquad g_y=\frac{\partial f_s(x,y)}{\partial y}$$

$$M(x,y)=\sqrt{g_x^2+g_y^2},\qquad \alpha(x,y)=\arctan\left(\frac{g_y}{g_x}\right)$$

where $g_x$ denotes the gradient in the x-direction; $g_y$ denotes the gradient in the y-direction; $M(x,y)$ denotes the image gradient magnitude; and $\alpha(x,y)$ denotes the image direction;
step 2.3: performing characteristic processing by utilizing a compression operation and an excitation operation of a channel attention mechanism;
the compression operation is specifically:

$$z_c=F_{sq}(u_c)=\frac{1}{H\times W}\sum_{i=1}^{H}\sum_{j=1}^{W}u_c(i,j)$$

where $z_c$ denotes the statistic of the c-th channel; H and W denote the spatial dimensions of the feature map; $u_c$ denotes the feature map; and $F_{sq}$ denotes the compression operation;
the excitation operation applies a gating module and a Sigmoid activation function to learn the nonlinear description information between channels, formally expressed as:

$$s=F_{ex}(z,W)=\sigma(W_2\,\delta(W_1 z))$$

where $\sigma$ denotes the Sigmoid function $\sigma(\eta)=\frac{1}{1+e^{-\eta}}$, with $\eta$ a variable and e the natural constant; $W_1\in\mathbb{R}^{\frac{C}{r}\times C}$ and $W_2\in\mathbb{R}^{C\times\frac{C}{r}}$, with C the number of channels and r the dimensionality-reduction ratio; $\delta$ denotes the ReLU function $\delta(\eta)=\max(0,\eta)$; and $F_{ex}$ denotes the excitation operation;
step 2.4: performing up-sampling operation on the image feature set after feature processing by combining sub-pixel convolution;
$$I_{out}=f^L(I_{input})=PS\left(W_L * f^{L-1}(I_{input})+b_L\right)$$

where $I_{out}$ denotes the output value; $I_{input}$ denotes the input value; f denotes the network; L denotes the L-th layer; $b_L$ denotes the bias of the L-th layer; $*$ denotes the convolution operation; $W_L$ denotes the parameters of the L-th layer; and $PS(\cdot)$ denotes the pixel adjustment function, specifically:

$$PS(T)_{x,y,c}=T_{\lfloor x/r\rfloor,\ \lfloor y/r\rfloor,\ C\cdot r\cdot \mathrm{mod}(y,r)+C\cdot \mathrm{mod}(x,r)+c}$$

where T denotes the input feature set; $\lfloor\cdot\rfloor$ denotes the rounding-down operation; x denotes the expanded feature length; r denotes the up-sampling multiple; y denotes the expanded feature width; C denotes the number of feature channels before expansion; $\mathrm{mod}(\cdot,\cdot)$ denotes the modulo operation; and c denotes the feature channel index after expansion;
step 2.5: completing small sample image conversion with rich details by a conversion mechanism based on a feature domain optimization algorithm;
step 2.5.1: introducing an adversarial idea, adaptively dividing a feature domain and a content domain, and reducing the parameter space by using the many rich image classes in the source domain;
the feature information is fed into a new discriminator to construct a new discrimination process, which embodies the idea that the content of the two conversions is unchanged: after the image feature information of the converted image is re-extracted, it should be similar to the image feature information extracted from the given sample; to avoid an excessively large loss caused by naively applying the image difference, which would unbalance the loss function, this process is completed with the adversarial idea, fitting the twice-extracted image feature information to the vicinity of the same feature distribution, and the loss function forces the parameters of the generator in the network to be further optimized and a better feature extraction process to be completed, so that the division of the feature domain is more accurate, achieving the purpose of optimizing the division of the feature domain and the content domain;
by adding a new discriminator of category features, adopting the features of the reconstructed image as the pseudo label and the features of the real image as the real label, the relevant judgment is completed, and the added loss function is:

$$\mathcal{L}_{F}(D,G)=\mathbb{E}_{x}\left[\log D(x,l)\right]+\mathbb{E}_{x,l}\left[\log\left(1-D(G(x,l),l)\right)\right]$$

where x denotes the input image, l denotes the feature class image, D denotes the discriminator, G denotes the generator, and $\mathbb{E}$ denotes the expectation;
step 2.5.2: adding a noise strategy;
random information is introduced by redundantly adding noise information to the decoder module in the generator network, specifically:

$$co_{vec}=E_c(x_{img})$$

$$new_{vec}=Cov(Cov(Cov(concat(co_{vec},z),z),z),z)$$

where $x_{img}$ denotes the input image; $co_{vec}$ denotes the content vector; $concat(\cdot,\cdot)$ denotes the noise addition operation; $Cov(\cdot,\cdot)$ denotes the convolution process; $E_c(\cdot)$ denotes the extraction process of the content vector; z denotes the noise information; and $new_{vec}$ denotes the new content vector;
step 2.5.3: according to the characteristics of the source domain and the target domain, constructing a reconstruction strategy and completing the image conversion task by using the weakened cyclic semantic consistency;
the cycle consistency reconstruction loss is specifically:

$$\mathcal{L}_{cyc}=\mathcal{L}_{fwd}+\mathcal{L}_{bwd}$$

where $\mathcal{L}_{cyc}$ denotes the weakened cyclic semantic consistency; $\mathcal{L}_{fwd}$ denotes the forward loss consistency; and $\mathcal{L}_{bwd}$ denotes the backward loss consistency;
the forward loss consistency is specifically:

$$\mathcal{L}_{fwd}=\lambda_1\left\|Conv(A_{img})-Conv(A'_{img})\right\|_1$$

with

$$C_{img}=Dec\left(\phi(B_{img}),E_c(A_{img})\right),\qquad A'_{img}=Dec\left(\phi(A_{img}),E_c(C_{img})\right)$$

where $Conv(\cdot)$ denotes the feature extraction process; $\lambda_1$ denotes the hyper-parameter of the forward conversion; $A_{img}$, $A'_{img}$, $B_{img}$ and $C_{img}$ respectively denote images; $Dec(\cdot,\cdot)$ denotes image decoding according to the given image category information and image content information; $\phi(\cdot)$ denotes the image category extraction process; and $E_c(\cdot)$ denotes the image content feature information extraction process;
the backward loss consistency is:

$$\mathcal{L}_{bwd}=\lambda_2\left\|Conv(B_{img})-Conv(B'_{img})\right\|_1$$

with

$$C'_{img}=Dec\left(\phi(A_{img}),E_c(B_{img})\right),\qquad B'_{img}=Dec\left(\phi(B_{img}),E_c(C'_{img})\right)$$

where $\lambda_2$ denotes the hyper-parameter of the backward conversion, and $B'_{img}$ and $C'_{img}$ respectively denote images;
step 2.6: judging whether the maximum iteration number is reached; if not, returning to step 2.2; otherwise, outputting the trained image conversion model;
step 3: inputting the image to be converted into the trained image conversion model to obtain a converted image with better visual effect.
Example 1:
the invention aims to provide a feature domain optimization small sample image conversion method which can retain more image detail information and generate richer images based on characterization enhancement.
The invention comprises the following steps in the realization:
(1) enhancing the image contrast information and the edge information: firstly, processing image contrast information by adopting a histogram equalization algorithm; acquiring image edge information by adopting a Canny algorithm;
(2) the method provides a channel attention mechanism based on sub-pixel convolution to improve the network feature extraction capability for characterization enhancement: carrying out feature processing by utilizing compression operation and excitation operation of a channel attention mechanism; combining sub-pixel convolution to carry out up-sampling operation with the multiplying power of 2 on the processed image feature set;
(3) providing a conversion mechanism based on a feature domain optimization algorithm to complete small sample image conversion with rich details: first, an adversarial idea is introduced to adaptively divide a feature domain and a content domain, and the parameter space is reduced by using the many rich image classes in the source domain; second, the added noise strategy keeps the network from being limited to generating a single sample, alleviating the mode collapse problem; third, a reconstruction strategy is constructed according to the characteristics of the source domain and the target domain, and the image conversion task is completed using the weakened cyclic semantic consistency, obtaining small sample converted images with better visual effect.
The present invention may further comprise:
1. the probability density function of the small sample image in step (1) is
Figure BDA0003517993380000111
r represents the converted grayscale image, s represents the output grayscale image, pr(r) represents the probability density function of the random variable r.The conversion function is
Figure BDA0003517993380000112
w is a pseudo integral variable, pr(w) represents the probability density function of the random variable w, and L represents the maximum gray value.
2. In step (1), the gray level image enhanced by the histogram equalization algorithm is taken as the input of the Canny algorithm; Gaussian smoothing, gradient solving, non-maximum suppression and double-threshold detection are performed to complete the acquisition of edge information features, and the edge information features are added to the image feature set.
3. In step (2), statistical information in the channel direction is generated through a global pooling operation; after compressing the global spatial information, this provides the network with a compression operation from an overall viewing angle and alleviates the limitation of local viewing angles. On this basis, a gating module and an activation function are introduced to perform the excitation operation, capturing more sufficient channel-related dependency information and learning the nonlinear description information between channels.
4. In step (2), sub-pixel convolution is introduced to perform an up-sampling operation with magnification 2 on the image feature set, specifically:

$$I_{out}=f^L(I_{input})=PS\left(W_L * f^{L-1}(I_{input})+b_L\right)$$

where $I_{out}$ represents the output value, $I_{input}$ represents the input value, f represents the network, L represents the L-th layer, $b_L$ represents the bias of the L-th layer, $*$ represents the convolution operation, $W_L$ represents the parameters of the L-th layer, and $PS(\cdot)$ represents the adjustment function of the pixel.
5. Step (2) introduces the pixel adjustment function, specifically:

$$PS(T)_{x,y,c}=T_{\lfloor x/r\rfloor,\ \lfloor y/r\rfloor,\ C\cdot r\cdot \mathrm{mod}(y,r)+C\cdot \mathrm{mod}(x,r)+c}$$

where T represents the input feature set, $\lfloor\cdot\rfloor$ represents the rounding-down operation, x represents the expanded feature length, r represents the up-sampling multiple, y represents the expanded feature width, C represents the number of feature channels before expansion, $\mathrm{mod}(\cdot,\cdot)$ represents the modulo operation, and c represents the feature channel index after expansion.
6. In step (3), by adding a new discriminator of category features, adopting the features of the reconstructed image as the pseudo label and the features of the real image as the real label, the relevant judgment is completed, with the added loss function

$$\mathcal{L}_{F}(D,G)=\mathbb{E}_{x}\left[\log D(x,l)\right]+\mathbb{E}_{x,l}\left[\log\left(1-D(G(x,l),l)\right)\right]$$

where x denotes the input image, l denotes the feature class image, D denotes the discriminator, G denotes the generator, and $\mathbb{E}$ denotes the expectation.
7. In step (3), random information is added by redundantly introducing noise information into the decoder module of the generator network, specifically:

$$co_{vec}=E_c(x_{img}),\qquad new_{vec}=Cov(Cov(Cov(concat(co_{vec},z),z),z),z)$$

where $x_{img}$ represents the input image, $co_{vec}$ represents the content vector, $concat(\cdot,\cdot)$ represents the noise addition operation, $Cov(\cdot,\cdot)$ represents the convolution process, $E_c(\cdot)$ represents the extraction process of the content vector, z represents the noise information, and $new_{vec}$ represents the new content vector.
8. In step (3), the pixel-level error of the image is weakened into a feature-level error by adding convolution layers, specifically:

$$\mathcal{L}_{cyc}=\mathcal{L}_{fwd}+\mathcal{L}_{bwd}$$

where $\mathcal{L}_{cyc}$ denotes the weakened cyclic semantic consistency, $\mathcal{L}_{fwd}$ denotes the forward loss consistency, and $\mathcal{L}_{bwd}$ denotes the backward loss consistency. The conversion mechanism based on the feature domain optimization algorithm is realized through the weakened cyclic semantic consistency, obtaining small sample image conversion with rich details.
9. The forward loss consistency in step (3) is

$$\mathcal{L}_{fwd}=\lambda_1\left\|Conv(A_{img})-Conv(A'_{img})\right\|_1$$

with

$$C_{img}=Dec\left(\phi(B_{img}),E_c(A_{img})\right),\qquad A'_{img}=Dec\left(\phi(A_{img}),E_c(C_{img})\right)$$

where $Conv(\cdot)$ denotes the feature extraction process, $\lambda_1$ the hyper-parameter of the forward conversion, $A_{img}$, $A'_{img}$, $B_{img}$ and $C_{img}$ respectively images, $Dec(\cdot,\cdot)$ image decoding according to given image category information and image content information, $\phi(\cdot)$ the image category extraction process, and $E_c(\cdot)$ the image content feature information extraction process.
10. The backward loss consistency in step (3) is

$$\mathcal{L}_{bwd}=\lambda_2\left\|Conv(B_{img})-Conv(B'_{img})\right\|_1$$

with

$$C'_{img}=Dec\left(\phi(A_{img}),E_c(B_{img})\right),\qquad B'_{img}=Dec\left(\phi(B_{img}),E_c(C'_{img})\right)$$

where $\lambda_2$ denotes the hyper-parameter of the backward conversion, and $B'_{img}$ and $C'_{img}$ respectively represent images.
Compared with the prior art, the invention has the following advantages:
a. Traditional image conversion methods use a large amount of source domain and target domain data to extract features and complete the image conversion task; when the number of samples is limited, i.e., for a small sample dataset, they cannot fully train the image feature extraction process, which affects the effectiveness of feature extraction. To ensure that small sample image conversion has a better conversion effect and that more image detail information can be retained, the invention provides a feature domain optimization small sample image conversion method based on characterization enhancement.
b. The invention enhances the contrast information and edge information of the small sample image, addressing the blurring of contrast and edge information caused by the acquisition means of small sample images.
c. On the basis of the enhanced image contrast information and edge information, the invention provides a channel attention mechanism based on sub-pixel convolution to improve the network feature extraction capability and enhance image characterization, solving the problem that fine image features are not prominent and benefiting subsequent conversion tasks.
d. The invention provides a conversion mechanism based on a feature domain optimization algorithm: it adaptively divides a feature domain and a content domain by introducing an adversarial idea, reduces the parameter space by exploiting the many rich image classes in the source domain, adds a noise strategy so that the network is not limited to single-sample generation, alleviating the mode collapse problem, constructs a reconstruction strategy according to the characteristics of the source and target domains, and completes the small sample image conversion task with weakened cyclic semantic consistency, obtaining converted images with better visual effect.
The feature domain optimization small sample image conversion method based on the characterization enhancement has a good conversion effect on small sample image conversion, can retain more image detail information, generates richer conversion images, and has certain effectiveness.
Example 2:
with reference to fig. 1, the specific steps of the present invention are as follows:
(1) image contrast information and edge information are enhanced
The image contrast information enhancement processing adopts a histogram equalization algorithm, whose probability density function is:

$$p_s(s)=p_r(r)\left|\frac{dr}{ds}\right|$$

where r represents the converted grayscale image, s represents the output grayscale image, and $p_r(r)$ represents the probability density function of the random variable r.

The transfer function is:

$$s=T(r)=(L-1)\int_0^r p_r(w)\,dw$$

where w is a dummy integration variable, $p_r(w)$ represents the probability density function of the random variable w, and L represents the maximum gray value.

Combining the two formulas gives:

$$p_s(s)=\frac{1}{L-1},\qquad 0\leq s\leq L-1$$

i.e., the processed image satisfies a uniform probability density distribution.
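A minimal NumPy sketch of the discrete transfer function follows, implementing $s=T(r)=(L-1)\int_0^r p_r(w)\,dw$ with the cumulative histogram; the 8-bit assumption (L = 256) is illustrative.

```python
# Hedged sketch: histogram equalization via the empirical CDF, so that the
# output gray levels approach a uniform probability density distribution.
import numpy as np

def equalize(gray: np.ndarray, L: int = 256) -> np.ndarray:
    hist = np.bincount(gray.ravel(), minlength=L)
    p_r = hist / gray.size                 # empirical density p_r(r)
    cdf = np.cumsum(p_r)                   # integral of p_r(w) dw over [0, r]
    T = np.round((L - 1) * cdf).astype(np.uint8)   # transfer function T(r)
    return T[gray]                         # map every pixel through T

out = equalize(np.random.randint(0, 256, (64, 64), dtype=np.uint8))
```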
The image edge information enhancement processing adopts the Canny algorithm: the gray level image enhanced by the histogram equalization algorithm is taken as the input of the Canny algorithm and Gaussian smoothing is performed; the smoothed image is:

$$f_s(x,y)=f(x,y)*G(x,y),\qquad G(x,y)=\frac{1}{2\pi\sigma^2}\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)$$

where $f(x,y)$ represents the input gray image, $*$ represents the convolution process, $G(x,y)$ is the Gaussian function, and $\sigma$ represents the standard deviation.
The gradient of the smoothed image is solved, specifically:

$$g_x=\frac{\partial f_s(x,y)}{\partial x},\qquad g_y=\frac{\partial f_s(x,y)}{\partial y}$$

$$M(x,y)=\sqrt{g_x^2+g_y^2},\qquad \alpha(x,y)=\arctan\left(\frac{g_y}{g_x}\right)$$

where $g_x$ denotes the gradient in the x direction, $g_y$ denotes the gradient in the y direction, $M(x,y)$ denotes the image gradient magnitude, and $\alpha(x,y)$ denotes the image direction.
And (4) finishing the acquisition of the edge information features by using non-maximum inhibition and double-threshold detection, and adding the edge information features into the image feature set.
Because the evaluation indices for image conversion are not yet perfect, the method follows the common practice in this research field and compares algorithms by image visual effect. In the field of small sample image conversion, the FUNIT network currently has the relatively best conversion effect, so the invention takes the FUNIT network as the original network for comparative experimental verification.
In order to verify the effectiveness of enhancement processing of contrast information and edge information of a small sample image dataset according to the present invention, fig. 2 is a graph of comparative experiment results of 10000 training rounds of underwater sonar small sample dataset, where fig. 2(a) is a graph of original FUNIT network experiment results, and fig. 2(b) is a graph of original FUNIT network experiment results after enhancement processing of contrast information and edge information. By comparing the generation results of the representative images outlined in fig. 2(a) and fig. 2(b), the images after the contrast information and the edge information enhancement processing are more complete and clear due to the further guidance of the contrast information and the edge information on the network gradient.
Fig. 3 is a 50000 round comparison experiment result diagram of Oxford optical small sample image data set training, wherein fig. 3(a) is an experiment result diagram of an original FUNIT network, and fig. 3(b) is an experiment result diagram of an original FUNIT network after contrast information and edge information enhancement processing. Although the generated image has randomness, by comparing the generated results of the representative images selected in the boxes in fig. 3(a) and fig. 3(b), the detail information such as the vein of the petals in the original FUNIT network after the contrast information and the edge information enhancement processing is clearer than the vein of the petals in the original FUNIT network.
(2) Channel attention mechanism based on sub-pixel convolution is provided to improve network feature extraction capability for characterization enhancement
Compression and excitation of channel attention mechanism
The compression operation provides the network with an overall viewing angle after compressing the global spatial information: statistical information in the channel direction is generated through a global pooling operation, which alleviates the limitation that the feature map can only use a local viewing angle and cannot use network information beyond it, making the feature extraction of the network more sufficient. The compression operation is specifically:

$$z_c=F_{sq}(u_c)=\frac{1}{H\times W}\sum_{i=1}^{H}\sum_{j=1}^{W}u_c(i,j)$$

where $z_c$ represents the statistic of the c-th channel, H and W are the spatial dimensions of the feature map, $u_c$ represents the feature map, and $F_{sq}$ represents the compression operation.
On this basis, in order to make fuller use of the information extracted by the compression operation, the network performs an excitation operation, which captures channel-related dependency information more sufficiently. The excitation operation applies a gating module and a Sigmoid activation function to learn the nonlinear description information between channels, formally expressed as:

$$s=F_{ex}(z,W)=\sigma(W_2\,\delta(W_1 z))$$

where $\sigma$ denotes the Sigmoid function, $W_1\in\mathbb{R}^{\frac{C}{r}\times C}$, $W_2\in\mathbb{R}^{C\times\frac{C}{r}}$, C denotes the number of channels, r denotes the dimensionality-reduction ratio, $\delta$ denotes the ReLU function, and $F_{ex}$ denotes the excitation operation.
The Sigmoid function is formalized as:

$$\sigma(\eta)=\frac{1}{1+e^{-\eta}}$$

where $\eta$ represents a variable and e represents a natural constant.
The ReLU function is formalized as:

$$\delta(\eta)=\max(0,\eta)$$

where $\eta$ represents a variable.
Introducing sub-pixel convolution to carry out up-sampling operation on image feature set
The invention introduces sub-pixel convolution to perform an up-sampling operation with magnification 2 on the image feature set, specifically:

$$I_{out}=f^L(I_{input})=PS\left(W_L * f^{L-1}(I_{input})+b_L\right)$$

where $I_{out}$ represents the output value, $I_{input}$ represents the input value, f represents the network, L represents the L-th layer, $b_L$ represents the bias of the L-th layer, $*$ represents the convolution operation, $W_L$ represents the parameters of the L-th layer, and $PS(\cdot)$ represents the adjustment function of the pixel.
The pixel adjustment function is introduced, specifically:

$$PS(T)_{x,y,c}=T_{\lfloor x/r\rfloor,\ \lfloor y/r\rfloor,\ C\cdot r\cdot \mathrm{mod}(y,r)+C\cdot \mathrm{mod}(x,r)+c}$$

where T represents the input feature set, $\lfloor\cdot\rfloor$ represents the rounding-down operation, x represents the expanded feature length, r represents the up-sampling multiple, y represents the expanded feature width, C represents the number of feature channels before expansion, $\mathrm{mod}(\cdot,\cdot)$ represents the modulo operation, and c represents the feature channel index after expansion.
Fig. 4 is a schematic diagram of the network structure of the channel attention mechanism based on sub-pixel convolution. The cooperation of the channel attention mechanism and the sub-pixel convolution continuously changes the scale of the image, bringing a certain sparse representation advantage to the network; the sparse representation adds a regularizing effect to the network and prevents overfitting caused by the network memorizing the small sample images. Therefore, using the attention to image feature information brought by the channel attention mechanism and the advantages of image feature synthesis and partial sparse representation brought by sub-pixel convolution, the network feature extraction capability can be improved, image characterization can be enhanced, the problem that fine image features are not prominent can be solved, and subsequent conversion tasks are facilitated.
On the basis of fig. 2 and fig. 3, to further verify the effectiveness of the sub-pixel-convolution-based channel attention mechanism for improving network feature extraction capability for characterization enhancement, fig. 5 shows the results of 10000 rounds of comparative experiments on the underwater sonar small sample dataset (underwater targets with complex shapes), where fig. 5(a) is the experiment result graph of the original FUNIT network (box-selecting underwater targets with complex shapes) and fig. 5(b) is the experiment result graph of the original FUNIT network after characterization enhancement. Comparing the generation results of the representative underwater target images with complex shapes boxed in fig. 5(a) and fig. 5(b), the outline of the image after characterization enhancement is more complete and the target information is richer; it should be noted, however, that characterization enhancement based on the sub-pixel convolution channel attention mechanism requires sufficient training, otherwise the quality of the generated image is adversely affected. Fig. 6 shows the results of 57500 rounds of comparative experiments on the Oxford optical small sample image dataset (targets with complex shapes), where fig. 6(a) is the experiment result graph of the original FUNIT network (box-selecting targets with complex shapes) and fig. 6(b) is the experiment result graph of the original FUNIT network after characterization enhancement. Although the generated images have randomness, comparing the generation results of the representative target images with complex shapes boxed in fig. 6(a) and fig. 6(b), the shadow information on the petals of the flowers in the characterization-enhanced images is clearer, the images are more complete and rich, and the subtle features are more obvious; here too, sufficient training is required, otherwise the image quality is still adversely affected.
(3) The method provides a conversion mechanism based on a feature domain optimization algorithm to complete small sample image conversion with rich details
The invention provides a conversion mechanism based on a feature domain optimization algorithm: it adaptively divides a feature domain and a content domain by introducing an adversarial idea, reduces the parameter space by using the many rich image classes in the source domain, adds a noise strategy so that the network is not limited to generating a single sample, alleviating the mode collapse problem, constructs a reconstruction strategy according to the characteristics of the source and target domains, and completes the small sample image conversion task with weakened cyclic semantic consistency.
Adaptive division of the feature domain and content domain by introducing an adversarial idea
The method comprises the steps of accessing feature information into a new discriminator, constructing a new discrimination process, combining the thought that the contents of two times of conversion are unchanged in the discrimination process, namely, after the image feature information of the converted image is extracted again, the image feature information is similar to the image feature information extracted by a given sample, and providing a countermeasure thought to finish the process in order to avoid the phenomenon that the loss is too large and further unbalance a loss function caused by simply applying image difference. Fig. 7 is a schematic diagram of a network structure for adaptively dividing a feature domain and a content domain by introducing countermeasures according to the present invention. The network can use sufficient source domain images to alleviate the problem of limited target domain sample number by dividing the feature domain and the content domain, namely, the parameter space is reduced by using multiple types of abundant images in the source domain. By adding a new discriminator of category features, adopting the features of the reconstructed image as a pseudo label and the features of the real image as a real label, the correlation judgment is completed, and the added loss function is as follows:
$$\mathcal{L}_{F} = \mathbb{E}_{x}\big[\log D(x)\big] + \mathbb{E}_{x,l}\big[\log\big(1 - D(G(x,l))\big)\big]$$

where $x$ denotes the input image, $l$ denotes the feature class image, $D$ denotes the discriminator, $G$ denotes the generator, and $\mathbb{E}$ denotes the expectation.
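For illustration only, a minimal PyTorch sketch of such a class-feature discriminator follows; the module and function names (`FeatureDiscriminator`, `feature_adv_losses`, `feat_net`) and the non-saturating log-loss form are editorial assumptions, not the patent's reference implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureDiscriminator(nn.Module):
    """Hypothetical extra discriminator that scores class-feature maps."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_dim, 128, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.net(feat)

def feature_adv_losses(feat_disc, feat_net, real_img, recon_img):
    # Features of the real image act as the real label; features of the
    # reconstructed image act as the pseudo label, pushing both feature
    # extractions toward the same feature distribution.
    real_feat = feat_net(real_img).detach()
    fake_feat = feat_net(recon_img)
    real_score = feat_disc(real_feat)
    fake_score = feat_disc(fake_feat.detach())
    d_loss = (F.binary_cross_entropy_with_logits(real_score, torch.ones_like(real_score))
              + F.binary_cross_entropy_with_logits(fake_score, torch.zeros_like(fake_score)))
    gen_score = feat_disc(fake_feat)
    g_loss = F.binary_cross_entropy_with_logits(gen_score, torch.ones_like(gen_score))
    return d_loss, g_loss
```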
Second, addition of a noise strategy
The noise strategy enriches the content information of the image with the random information contained in the noise, so that the image conversion results are more diversified. However, if the noise is simply added to the network input layer as extra information, then under the guidance of the loss function gradient the network tends to ignore the added noise after training iterations, so the converted image cannot use the random information to enhance its performance. In the present invention, redundant noise information is introduced into the decoder module of the generator network to retain the random information, so that the network is not limited to generating a single sample and the mode collapse problem is alleviated. Fig. 8 is a schematic diagram of the network structure with the added noise strategy.
The formalized description of the noise addition strategy is:
$$co_{vec} = E_c(x_{img})$$
$$new_{vec} = Cov\big(Cov\big(Cov\big(concat(co_{vec}, z), z\big), z\big), z\big)$$

where $x_{img}$ denotes the input image, $co_{vec}$ the content vector, $E_c(\cdot)$ the extraction process of the content vector, $concat(\cdot,\cdot)$ the noise addition operation, $Cov(\cdot,\cdot)$ the convolution process, $z$ the noise information, and $new_{vec}$ the new content vector.
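As an editorial sketch of this noise injection (the channel sizes are assumptions, and the formula is interpreted as concatenating $z$ before each of three convolutions), a PyTorch decoder block could look like the following:

```python
import torch
import torch.nn as nn

class NoisyDecoderBlock(nn.Module):
    """Sketch of the noise addition strategy: noise z is re-concatenated with
    the content features before each of three convolutions, so the random
    information is repeatedly injected instead of being ignored after
    training iterations. Channel sizes are illustrative assumptions."""
    def __init__(self, content_ch: int = 256, noise_ch: int = 16):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv2d(content_ch + noise_ch, content_ch, 3, padding=1)
            for _ in range(3)
        ])
        self.act = nn.ReLU(inplace=True)

    def forward(self, co_vec: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # co_vec: (B, content_ch, H, W); z: (B, noise_ch, H, W), same H and W
        new_vec = co_vec
        for conv in self.convs:
            new_vec = self.act(conv(torch.cat([new_vec, z], dim=1)))
        return new_vec

# Usage sketch:
# dec = NoisyDecoderBlock()
# new_vec = dec(torch.randn(1, 256, 16, 16), torch.randn(1, 16, 16, 16))
```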
Third, constructing a reconstruction strategy to weaken the cyclic semantic consistency operation
The traditional cycle consistency loss function requires that the geometric semantics of the original image not change during conversion; in fact, requiring the visual effect of the original image to remain unchanged necessarily limits the application range of the image conversion process, so conversion cannot be effectively completed between image domains whose geometric semantics change greatly. Therefore, the present invention adds a convolution layer to weaken the pixel-level error of the image into a feature-level error, taking into account both the relaxation of geometric constraints and the adaptation of inter-domain content, and at the same time constructs an image reconstruction strategy using the reconstruction loss common in codecs. Fig. 9 is a network structure diagram of the image reconstruction strategy. The new cycle consistency reconstruction loss is specifically:
$$\mathcal{L}_{cyc} = \mathcal{L}_{fw} + \mathcal{L}_{bw}$$

where $\mathcal{L}_{cyc}$ denotes the weakened cyclic semantic consistency, $\mathcal{L}_{fw}$ denotes the forward loss consistency, and $\mathcal{L}_{bw}$ denotes the backward loss consistency.
A reconstruction strategy is constructed according to the characteristics of the source domain and the target domain, and the image conversion task is completed using the weakened cyclic semantic consistency to obtain small sample converted images with better visual effect.
The forward loss consistency is:
$$\mathcal{L}_{fw} = \lambda_1\big\|Conv(A_{img}) - Conv(A'_{img})\big\|_1,\qquad C_{img} = Dec\big(\phi(B_{img}),\psi(A_{img})\big),\quad A'_{img} = Dec\big(\phi(A_{img}),\psi(C_{img})\big)$$

where $Conv(\cdot)$ denotes the feature extraction process, $\lambda_1$ denotes the hyper-parameter of the forward conversion, $A_{img}$, $A'_{img}$, $B_{img}$ and $C_{img}$ each denote an image, $Dec(\cdot,\cdot)$ denotes image decoding according to the given image class information and image content information, $\phi(\cdot)$ denotes the image class extraction process, and $\psi(\cdot)$ denotes the image content feature information extraction process.
The backward loss consistency is:

$$\mathcal{L}_{bw} = \lambda_2\big\|Conv(B_{img}) - Conv(B'_{img})\big\|_1,\qquad C'_{img} = Dec\big(\phi(A_{img}),\psi(B_{img})\big),\quad B'_{img} = Dec\big(\phi(B_{img}),\psi(C'_{img})\big)$$

where $\lambda_2$ denotes the hyper-parameter of the backward conversion, and $B'_{img}$ and $C'_{img}$ each denote an image.
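A hedged PyTorch sketch of the weakened cyclic semantic consistency is given below; the pairing of forward and backward translations and the L1 feature distance are one plausible reading of the formulas above, with `conv`, `dec`, `phi` and `psi` standing for $Conv(\cdot)$, $Dec(\cdot,\cdot)$, $\phi(\cdot)$ and $\psi(\cdot)$:

```python
import torch
import torch.nn.functional as F

def weakened_cycle_loss(conv, dec, phi, psi, a_img, b_img,
                        lam_fw: float = 1.0, lam_bw: float = 1.0):
    """Sketch of the weakened cyclic semantic consistency: errors are measured
    on conv(.) feature maps rather than raw pixels, relaxing geometric
    constraints between domains. An interpretation, not the exact patent loss."""
    # Forward: translate a to b's class, then cycle back to a's class.
    c_img = dec(phi(b_img), psi(a_img))
    a_rec = dec(phi(a_img), psi(c_img))
    loss_fw = lam_fw * F.l1_loss(conv(a_img), conv(a_rec))
    # Backward: translate b to a's class, then cycle back to b's class.
    c2_img = dec(phi(a_img), psi(b_img))
    b_rec = dec(phi(b_img), psi(c2_img))
    loss_bw = lam_bw * F.l1_loss(conv(b_img), conv(b_rec))
    return loss_fw + loss_bw
```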
To verify the effectiveness of the conversion mechanism based on the feature domain optimization algorithm, the adaptive division of the feature domain and the content domain by introducing the adversarial idea and the weakening of the cyclic semantic consistency by constructing the reconstruction strategy are verified experimentally. Fig. 10 shows the results of a comparative experiment after 10000 rounds of training on the underwater sonar small sample data set with the adversarial idea introduced, where fig. 10(a) shows the results of the original FUNIT network (the underwater targets with single shapes are boxed) and fig. 10(b) shows the results of the original FUNIT network after introducing the adversarial idea. Comparing the generated results of the representative single-shaped underwater target images boxed in fig. 10(a) and fig. 10(b), by introducing the adversarial idea, i.e., adding the new category discriminator, the category information is easier to extract and is further expressed in the converted image, so that the color distribution of the converted image is closer to that of the target domain image, and part of the content domain information is well retained. Fig. 11 shows the results of a comparative experiment after 10000 rounds of training on the underwater sonar small sample data set with the reconstruction strategy constructed, where fig. 11(a) shows the results of the original FUNIT network (the underwater targets with large deformation are boxed) and fig. 11(b) shows the results of the original FUNIT network after constructing the reconstruction strategy. Comparing the generated results of the representative largely deformed underwater target images boxed in fig. 11(a) and fig. 11(b), the reconstruction strategy is helpful for images with large deformation, whereas the original network tends to merely copy the target domain image when facing target objects with large differences.
Fig. 12 shows the results of a comparative experiment after 50000 rounds of training on the Oxford flower optical small sample image data set with the adversarial idea introduced, where fig. 12(a) shows the results of the original FUNIT network (the targets with rich detail information are boxed) and fig. 12(b) shows the results of the original FUNIT network after introducing the adversarial idea. Comparing the generated results of the representative detail-rich target images boxed in fig. 12(a) and fig. 12(b), owing to the effect of the pseudo label the content domain is also completely preserved, and the content of the converted image is more inclined toward the target domain image. Fig. 13 shows the results of a comparative experiment after 50000 rounds of training on the Oxford flower optical small sample image data set with the reconstruction strategy constructed, where fig. 13(a) shows the results of the original FUNIT network (the targets with larger deformation, namely the optical targets of single-plant to multi-plant conversion, are boxed) and fig. 13(b) shows the results of the original FUNIT network after constructing the reconstruction strategy. Comparing the generated results of the representative largely deformed single-plant to multi-plant target images boxed in fig. 13(a) and fig. 13(b), the original FUNIT network with the constructed reconstruction strategy makes the conversion from single flowers to multiple flowers more natural.
Fig. 14 is a schematic network structure diagram of the feature domain optimization small sample image conversion method based on characterization enhancement provided by the present invention. Fig. 15 shows the results of a comparative experiment after 10000 rounds of training of the provided method on the underwater sonar small sample data set, where fig. 15(a) shows the results of the original FUNIT network (the underwater targets with fewer texture features are boxed) and fig. 15(b) shows the results of the provided method. Comparing the generated results of the representative texture-poor underwater target images boxed in fig. 15(a) and fig. 15(b), the provided method achieves a better conversion effect for small sample image conversion, retains more image detail information, and generates richer converted images.
Fig. 16 shows the results of a comparative experiment after 50000 rounds of training of the feature domain optimization small sample image conversion method based on characterization enhancement on the Oxford flower optical small sample image data set, where fig. 16(a) shows the results of the original FUNIT network (the targets with fewer texture features are boxed) and fig. 16(b) shows the results of the provided method. Comparing the generated results of the representative texture-poor target images boxed in fig. 16(a) and fig. 16(b), the provided method automatically adds the corresponding veins to the petals, increasing image richness and generating richer converted images.
The experimental results of the feature domain optimization small sample image conversion method based on characterization enhancement on the underwater sonar small sample image data set and the Oxford flower optical small sample image data set verify that the provided method achieves a good conversion effect for small sample image conversion, retains more image detail information, generates richer converted images, and has certain effectiveness.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (4)

1. A feature domain optimization small sample image conversion method based on characterization enhancement, characterized by comprising the following steps:
step 1: acquiring a small sample image dataset;
step 2: training an image conversion model;
step 2.1: initializing parameters and setting the maximum iteration times;
step 2.2: enhancing the image contrast information and the edge information;
step 2.3: performing characteristic processing by utilizing a compression operation and an excitation operation of a channel attention mechanism;
step 2.4: performing up-sampling operation on the image feature set after feature processing by combining sub-pixel convolution;
step 2.5: completing small sample image conversion with rich details by a conversion mechanism based on a feature domain optimization algorithm;
step 2.5.1: introducing an adversarial idea, adaptively dividing the feature domain and the content domain, and reducing the parameter space using the rich multi-class images in the source domain;
the feature information is accessed into a new discriminator to construct a new discrimination process, and the concept of unchanged contents of two conversions is combined in the construction discrimination process, namely the image feature information of the converted image is re-extracted and is similar to the image feature information extracted by a given sample; in order to avoid the phenomenon that the loss is too large due to simple application of image difference and further the loss function is unbalanced, the process is completed by utilizing the countermeasure idea, the image characteristic information extracted twice is fitted to the vicinity of the same characteristic distribution, and the parameters of a generator in the network are forced to be further optimized through the loss function to complete a better characteristic extraction process, so that the division of the characteristic domain is more accurate, and the purpose of optimizing the division of the characteristic domain and the content domain is achieved;
a new discriminator of category features is added, the features of the reconstructed image are adopted as the pseudo label and the features of the real image as the real label, the correlation judgment is completed, and the added loss function is:
$$\mathcal{L}_{F} = \mathbb{E}_{x}\big[\log D(x)\big] + \mathbb{E}_{x,l}\big[\log\big(1 - D(G(x,l))\big)\big]$$

where $x$ denotes the input image, $l$ denotes the feature class image, $D$ denotes the discriminator, $G$ denotes the generator, and $\mathbb{E}$ denotes the expectation;
step 2.5.2: adding a noise strategy;
random information is introduced by adding redundant noise information into the decoder module of the generator network, specifically:
$$co_{vec} = E_c(x_{img})$$
$$new_{vec} = Cov\big(Cov\big(Cov\big(concat(co_{vec}, z), z\big), z\big), z\big)$$

where $x_{img}$ denotes the input image; $co_{vec}$ denotes the content vector; $E_c(\cdot)$ denotes the extraction process of the content vector; $concat(\cdot,\cdot)$ denotes the noise addition operation; $Cov(\cdot,\cdot)$ denotes the convolution process; $z$ denotes the noise information; $new_{vec}$ denotes the new content vector;
step 2.5.3: according to the characteristics of the source domain and the target domain, a reconstruction strategy is constructed, and an image conversion task is completed by utilizing the weakened cyclic semantic consistency;
the cycle consistency reconstruction loss is specifically:
$$\mathcal{L}_{cyc} = \mathcal{L}_{fw} + \mathcal{L}_{bw}$$

where $\mathcal{L}_{cyc}$ denotes the weakened cyclic semantic consistency; $\mathcal{L}_{fw}$ denotes the forward loss consistency; $\mathcal{L}_{bw}$ denotes the backward loss consistency;
the forward loss consistency is specifically:
$$\mathcal{L}_{fw} = \lambda_1\big\|Conv(A_{img}) - Conv(A'_{img})\big\|_1,\qquad C_{img} = Dec\big(\phi(B_{img}),\psi(A_{img})\big),\quad A'_{img} = Dec\big(\phi(A_{img}),\psi(C_{img})\big)$$

where $Conv(\cdot)$ denotes the feature extraction process; $\lambda_1$ denotes the hyper-parameter of the forward conversion; $A_{img}$, $A'_{img}$, $B_{img}$ and $C_{img}$ each denote an image; $Dec(\cdot,\cdot)$ denotes image decoding according to the given image class information and image content information; $\phi(\cdot)$ denotes the image class extraction process; $\psi(\cdot)$ denotes the image content feature information extraction process;
the backward loss consistency is:

$$\mathcal{L}_{bw} = \lambda_2\big\|Conv(B_{img}) - Conv(B'_{img})\big\|_1,\qquad C'_{img} = Dec\big(\phi(A_{img}),\psi(B_{img})\big),\quad B'_{img} = Dec\big(\phi(B_{img}),\psi(C'_{img})\big)$$

where $\lambda_2$ denotes the hyper-parameter of the backward conversion, and $B'_{img}$ and $C'_{img}$ each denote an image;
step 2.6: judging whether the maximum iteration times is reached, if not, returning to the step 2.2; otherwise, outputting the trained image conversion model;
step 3: inputting the image to be converted into the trained image conversion model to obtain a converted image with better visual effect.
2. The method according to claim 1, characterized in that enhancing the image contrast information and the edge information in step 2.2 specifically comprises:
step 2.2.1: enhancing the contrast information of the image by a histogram equalization algorithm, so that the processed image satisfies a uniform probability density distribution;
step 2.2.2: taking the gray image enhanced by the histogram equalization algorithm as the input of the Canny algorithm and performing Gaussian smoothing; performing gradient solving on the smoothed image, completing the acquisition of the edge information features using non-maximum suppression and double-threshold detection, and adding the edge information features to the image feature set;
the smoothed image is:

$$f_s(x,y) = G(x,y) * f(x,y)$$

where $f(x,y)$ denotes the input gray image; $*$ denotes the convolution process; $G(x,y) = \frac{1}{2\pi\sigma^2}\exp\!\Big(-\frac{x^2+y^2}{2\sigma^2}\Big)$ denotes the Gaussian function; $\sigma$ denotes the standard deviation;
the gradient solving of the smoothed image is specifically:

$$g_x = \frac{\partial f_s(x,y)}{\partial x}$$
$$g_y = \frac{\partial f_s(x,y)}{\partial y}$$
$$M(x,y) = \sqrt{g_x^2 + g_y^2}$$
$$\alpha(x,y) = \arctan\!\big(g_y / g_x\big)$$

where $g_x$ denotes the gradient in the x direction; $g_y$ denotes the gradient in the y direction; $M(x,y)$ denotes the image gradient magnitude; $\alpha(x,y)$ denotes the image gradient direction.
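As a sketch only, the contrast and edge enhancement of this claim can be approximated with OpenCV as follows; the Gaussian kernel size, σ and the double thresholds are illustrative assumptions (cv2.Canny internally performs gradient solving, non-maximum suppression and double-threshold detection):

```python
import cv2
import numpy as np

def enhance_contrast_and_edges(gray: np.ndarray) -> np.ndarray:
    """Sketch of step 2.2: histogram equalization, then Canny edge features.
    Expects an 8-bit single-channel image; thresholds are assumptions."""
    equalized = cv2.equalizeHist(gray)                      # uniform probability density
    smoothed = cv2.GaussianBlur(equalized, (5, 5), sigmaX=1.4)
    edges = cv2.Canny(smoothed, threshold1=50, threshold2=150)
    # Add the edge-information features to the image feature set.
    return np.stack([equalized, edges], axis=0)
```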
3. The method according to claim 1, characterized in that performing the feature processing using the compression operation and the excitation operation of the channel attention mechanism in step 2.3 specifically comprises:
the compression operation is specifically:

$$z_c = F_{sq}(u_c) = \frac{1}{H\times W}\sum_{i=1}^{H}\sum_{j=1}^{W} u_c(i,j)$$

where $z_c$ denotes the statistic of the c-th channel; $H$ and $W$ denote the spatial dimensions of the feature map; $u_c$ denotes the feature map of the c-th channel; $F_{sq}$ denotes the compression operation;
the excitation operation applies a gating module and a Sigmoid activation function to learn nonlinear description information between channels, and the nonlinear description information is formally expressed as:
$$s = F_{ex}(z, W) = \sigma\big(W_2\,\delta(W_1 z)\big)$$

where $\sigma$ denotes the Sigmoid function $\sigma(x) = \frac{1}{1+e^{-x}}$, with $x$ denoting a variable and $e$ the natural constant; $W_1 \in \mathbb{R}^{\frac{C}{r}\times C}$ and $W_2 \in \mathbb{R}^{C\times\frac{C}{r}}$; $C$ denotes the number of channels; $r$ denotes the dimensionality reduction ratio; $\delta$ denotes the ReLU function $\delta(x) = \max(0, x)$; $F_{ex}$ denotes the excitation operation.
4. The method according to claim 1, characterized in that performing the upsampling operation on the feature-processed image feature set in combination with sub-pixel convolution in step 2.4 specifically comprises:
$$I_{out} = f^L(I_{input}) = PS\big(W_L * f^{L-1}(I_{input}) + b_L\big)$$

where $I_{out}$ denotes the output value; $I_{input}$ denotes the input value; $f$ denotes the network; $L$ denotes the L-th layer; $b_L$ denotes the bias of the L-th layer; $*$ denotes the convolution operation; $W_L$ denotes the parameters of the L-th layer; $PS(\cdot)$ denotes the pixel adjustment function, specifically:
$$PS(T)_{x,y,c} = T_{\lfloor x/r\rfloor,\;\lfloor y/r\rfloor,\;C\cdot r\cdot \mathrm{mod}(y,r)\,+\,C\cdot \mathrm{mod}(x,r)\,+\,c}$$

where $T$ denotes the input feature set; $\lfloor\cdot\rfloor$ denotes the rounding-down operation; $x$ denotes the expanded feature length; $r$ denotes the upsampling multiple; $y$ denotes the expanded feature width; $C$ denotes the number of feature channels before expansion; $\mathrm{mod}(\cdot,\cdot)$ denotes the modulo operation; $c$ denotes the index of the feature channels after expansion.
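A short PyTorch sketch of this sub-pixel convolution upsampling follows, where nn.PixelShuffle plays the role of $PS(\cdot)$; the channel sizes and the upsampling multiple r = 2 are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SubPixelUpsample(nn.Module):
    """Sketch of step 2.4: a convolution producing C * r^2 channels followed
    by the pixel-shuffle rearrangement PS(.) of sub-pixel convolution."""
    def __init__(self, in_ch: int = 64, out_ch: int = 64, r: int = 2):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch * r * r, 3, padding=1)  # W_L, b_L
        self.shuffle = nn.PixelShuffle(r)                            # PS(.)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.conv(x))  # (B, out_ch, r*H, r*W)

# Usage sketch: SubPixelUpsample()(torch.randn(1, 64, 16, 16)) -> (1, 64, 32, 32)
```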
CN202210170641.6A 2022-02-24 2022-02-24 Feature domain optimization small sample image conversion method based on characterization enhancement Pending CN114565806A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210170641.6A CN114565806A (en) 2022-02-24 2022-02-24 Feature domain optimization small sample image conversion method based on characterization enhancement

Publications (1)

Publication Number Publication Date
CN114565806A true CN114565806A (en) 2022-05-31

Family

ID=81714453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210170641.6A Pending CN114565806A (en) 2022-02-24 2022-02-24 Feature domain optimization small sample image conversion method based on characterization enhancement

Country Status (1)

Country Link
CN (1) CN114565806A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116579918A (en) * 2023-05-19 2023-08-11 哈尔滨工程大学 Attention mechanism multi-scale image conversion method based on style independent discriminator
CN116579918B (en) * 2023-05-19 2023-12-26 哈尔滨工程大学 Attention mechanism multi-scale image conversion method based on style independent discriminator

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination