CN117746178A - Image generation method based on shape space theory


Info

Publication number
CN117746178A
CN117746178A (application CN202311741845.1A)
Authority
CN
China
Prior art keywords: image; interpolation; constructing; anchor point; features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311741845.1A
Other languages
Chinese (zh)
Inventor
韩越兴
阮礼恒
王冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN202311741845.1A
Publication of CN117746178A
Legal status: Pending

Abstract

The invention relates to an image generation method based on shape space theory, comprising the following steps: inputting an image data set and random noise, and constructing an image generation model, comprising a generator and a discriminator, together with a pre-shape space; randomly selecting a group of images from the image data set, inputting them into the discriminator to obtain training features, projecting the training features to obtain data features, constructing a geodesic surface on the pre-shape space, and obtaining sample features according to randomly generated sampling weights; generating random noise by Gaussian sampling, calculating anchor point features from the random noise and the sample features, constructing anchor point images and interpolated images, and extracting the anchor point features and interpolated-image features; constructing losses from the sample features, anchor point features, and interpolated-image features, and updating the generator and the discriminator; and generating images by inputting random noise into the updated generator and discriminator. Compared with the prior art, the method requires no additional information for training, and the generated images are of high quality.

Description

Image generation method based on shape space theory
Technical Field
The invention relates to the field of machine learning, in particular to an image generation method based on shape space theory.
Background
When data processing is performed with machine learning, particularly deep learning, a large number of training samples is required. However, obtaining high-quality labeled training samples is often limited by equipment, labor, and time costs. Training an image generation model to produce sufficient high-quality samples can increase both the number and the diversity of training samples.
Current small-sample image generation models fall into two main categories. The first is small-sample adaptation: training leverages a pre-trained model semantically related to the training set, and a good generation effect can be achieved with extremely few training samples, but such a semantically related pre-trained model is not easy to obtain. The second trains the model by means of data augmentation, modification of the model structure, addition of regularization terms, and the like, without any additional information, but these means fail to achieve the expected generation effect in the extreme case where fewer than 10 training samples are available.
In order to overcome these limitations of current small-sample image generation methods, it is necessary to propose an image generation method that requires no additional information for training and can generate higher-quality images in extremely-small-sample training scenarios.
Disclosure of Invention
The invention aims to overcome the defect that the image generation in the prior art cannot achieve the expected effect, and provides an image generation method based on the shape space theory.
The aim of the invention can be achieved by the following technical scheme:
an image generation method based on shape space theory comprises the following steps:
s1: inputting an image data set and random noise, and constructing an image generation model and a pre-shape space, wherein the image generation model comprises a generator and a discriminator;
s2: randomly selecting a group of images from the image data set, inputting the images into a discriminator to obtain training features, projecting the training features to obtain data features, constructing a geodesic surface on a pre-shape space according to the data features, and obtaining sample features from the geodesic surface according to randomly generated sampling weights;
s3: generating random noise through Gaussian distribution sampling, calculating anchor point characteristics through the random noise and sample characteristics, constructing anchor point images and interpolation images, and extracting the anchor point characteristics and interpolation image characteristics;
s4: constructing loss through the sample characteristics, anchor point characteristics and interpolation image characteristics, and updating a generator and a discriminator;
s5: and inputting random noise through the updated generator and the updated discriminator to generate an image.
Further, the step S2 specifically includes the following steps:
s201: randomly selecting a group of images from the image data set as a training set, inputting the training set into a specified convolution layer of a discriminator, and extracting training features;
s202: projecting the extracted training features to a pre-shape space to obtain data features;
s203: constructing a geodesic curve function according to the data characteristics;
s204: generating a group of weights according to the Dirichlet distribution, iterating the geodesic curve function according to the weights to obtain intermediate feature vectors, and obtaining the geodesic surface function, namely a sample feature, when the number of iterations equals the number of images in the training set;
s205: repeating step S204, generating a new set of weights each time, to obtain multiple groups of sample features.
Further, the geodesic curve function is expressed as:
χ(s; τ1, τ2) = [sin((1−s)·d(τ1, τ2))·τ1 + sin(s·d(τ1, τ2))·τ2] / sin(d(τ1, τ2)), s ∈ [0, 1]
where χ(·) is the curve function, τ1 and τ2 are the feature vectors at the start and end points of the geodesic curve, the parameter s controls the position of the generated vector on the geodesic curve, and d(τ1, τ2) denotes the geodesic distance between τ1 and τ2.
Further, the step S3 specifically includes the following steps:
s301: for each group of sample characteristics obtained in the step S205, obtaining a group of random noise through Gaussian distribution sampling, and calculating anchor point noise according to the weight and the random noise corresponding to the sample characteristics;
s302: obtaining a group of interpolation noise sets by linearly interpolating random noise sampled by two adjacent Gaussian distributions;
s303: inputting the anchor point noise and the interpolation noise set into the generator to obtain an anchor point image and interpolated images, and inputting the anchor point image into the specified convolution layer of the discriminator to obtain the anchor point image features;
s304: projecting the anchor point image features onto the pre-shape space to obtain the anchor point features.
Further, the step S4 specifically includes the following steps:
s401: calculating cosine similarity of the vector of the sampling feature at the first position and the vector of the neighborhood of the first position;
s402: calculating cosine similarity of the vector of the anchor point characteristic at the first position and the vector of the first position neighborhood;
s403: traversing all corresponding positions of the first position neighborhood in space to obtain autocorrelation matrixes of the sampling features and the anchor point features at the first positions respectively;
s404: constructing an autocorrelation consistency loss according to the autocorrelation matrixes of the sampling characteristics and the anchor point characteristics;
s405: constructing a non-saturation loss from the interpolated images to supervise them;
s406: constructing a first multidimensional vector, wherein the dimension of the vector is equal to the number of elements of the interpolation feature set;
s407: constructing a second multidimensional vector, wherein the dimension of the vector is equal to the number of elements of the interpolation feature set;
s408: constructing a distance constraint loss between interpolation images;
s409: constructing the adversarial loss of the generator, and obtaining the optimization function of the generator from the inter-interpolation-image distance constraint loss and the adversarial loss;
s410: constructing the adversarial loss of the discriminator and constructing the optimization function of the discriminator;
s411: and updating parameters of the generator and the discriminator according to the optimization function of the generator and the optimization function of the discriminator until the loss converges.
Further, the expression for the autocorrelation consistency loss is:
L_acc = E_(ω∼Dir, x∼p_data, z∼p(z)) [ L_sl1( C^s, C^a ) ]
where L_sl1(·) denotes the smooth-L1 loss function, C^s is the autocorrelation matrix of cosine similarities of the sampled features, C^a is that of the anchor point features, ω∼Dir is the Dirichlet distribution weight, x∼p_data is the training-set image data, z∼p(z) is the random noise obtained by Gaussian distribution sampling, and E[·] denotes the expectation.
Further, the non-saturation loss is calculated as:
L_inp = −E_(z′1, z′k ∼ p(z)) [ log D( G( Z_inp(z′1, z′k) ) ) ]
where L_inp is the non-saturation loss, Z_inp(z′1, z′k) is the interpolation noise set, G(·) and D(·) denote the generator and the discriminator, log(·) is the base-10 logarithm, z′1, z′k ∼ p(z) denote the interpolation noise, and E[·] denotes the expectation.
Further, the first multidimensional vector is expressed as:
V_inp[i] = ‖ G(z′_i) − G(z′_(i+1)) ‖, i ∈ [1, k−1], V_inp[k] = ‖ G(z′_1) − G(z′_k) ‖
where V_inp[·] is the first multidimensional vector, G(z′_i) is the image generated from the i-th interpolation noise z′_i, and k is the dimension of the vector.
Further, the expression of the second multidimensional vector is:
V q [1:k-1]=1,V q [k]=k-1
wherein V is q [·]For the second multidimensional vector, k is the dimension of the vector.
Further, the calculation expression of the inter-image distance constraint loss is:
L dl =L kl (V inp ,V q )
wherein L is dl L is the constraint loss of the distance between images kl For KL divergence loss, V inp For the first multidimensional vector, V q Is a second multidimensional vector.
Compared with the prior art, the invention has the following beneficial effects:
1) The invention maps the training features of the original images into the pre-shape space to obtain sampled features, calculates anchor point features and interpolated-image features from the sampled features, and uses them to train the generator and the discriminator, thereby obtaining an image generation model capable of outputting images; no additional information is needed for training, and the generated images are of high quality.
2) Under data-scarce conditions, the invention effectively characterizes the feature distribution space by incorporating shape space theory, thereby realizing efficient training of the image generation model.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples. The present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation manner and a specific operation process are given, but the protection scope of the present invention is not limited to the following examples.
As shown in fig. 1, the present invention is an image generation method based on shape space theory, comprising the steps of:
s1: inputting an image data set and random noise, and constructing an image generation model and a pre-shape space, wherein the image generation model comprises a generator and a discriminator;
s2: randomly selecting a group of images from the image data set, inputting the images into a discriminator to obtain training features, projecting the training features to obtain data features, constructing a geodesic surface on a pre-shape space according to the data features, and obtaining sample features from the geodesic surface according to randomly generated sampling weights;
s3: generating random noise through Gaussian distribution sampling, calculating anchor point characteristics through the random noise and sample characteristics, constructing anchor point images and interpolation images, and extracting the anchor point characteristics and interpolation image characteristics;
s4: constructing loss through the sample characteristics, anchor point characteristics and interpolation image characteristics, and updating a generator and a discriminator;
s5: and inputting random noise through the updated generator and the updated discriminator to generate an image.
Example 1
The invention relates to an image generation method based on a shape space theory, which comprises the following steps:
s1: inputting an image data set and random noise, and constructing an image generation model and a pre-shape space, wherein the image generation model comprises a generator and a discriminator;
s2: randomly selecting a group of images from the image data set, inputting the images into a discriminator to obtain training features, projecting the training features to obtain data features, constructing a geodetic surface on a pre-shape space according to the data features, and obtaining sample features from the geodetic surface according to randomly generated sampling weights; the method specifically comprises the following steps:
s201: randomly selecting a group of n images from the image data set as a training set x, inputting the training set x into the specified convolution layer l of the discriminator D, and extracting the training features F ∈ R^(n×c×h×w), where c denotes the number of channels, h the height, and w the width;
s202: projecting the extracted training features F onto the pre-shape space to obtain the data features τ ∈ R^(n×chw);
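As an illustrative sketch (not part of the original disclosure), the projection of S202 can be realized as the standard Kendall pre-shape map — centering followed by unit-norm scaling; the function name and the eps guard are assumptions:

```python
import numpy as np

def project_to_preshape(feature, eps=1e-8):
    """Map a flattened feature vector onto the pre-shape sphere:
    subtract the mean (centering), then scale to unit Euclidean norm.
    The eps guard against zero-norm inputs is an implementation choice."""
    f = np.asarray(feature, dtype=np.float64).ravel()
    centered = f - f.mean()
    return centered / (np.linalg.norm(centered) + eps)

# A projected feature lies on the unit sphere and is centered (sums to ~0).
tau = project_to_preshape(np.array([3.0, 1.0, 2.0, 6.0]))
```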
s203: constructing a geodesic curve function from the data features; the geodesic curve function is expressed as:
χ(s; τ1, τ2) = [sin((1−s)·d(τ1, τ2))·τ1 + sin(s·d(τ1, τ2))·τ2] / sin(d(τ1, τ2)), s ∈ [0, 1]
where χ(·) is the curve function, τ1 and τ2 are the feature vectors at the start and end points of the geodesic curve, the parameter s controls the position of the generated vector on the geodesic curve, and d(τ1, τ2) = arccos(⟨τ1, τ2⟩) denotes the geodesic distance between τ1 and τ2.
s204: generating a set of weights ω = (ω1, …, ωn) from the Dirichlet distribution and iterating the geodesic curve function with these weights to obtain intermediate feature vectors:
μ1 = τ1, μj = χ(sj; μ(j−1), τj), j = 2, …, n
where μj is the j-th intermediate vector and sj is determined by the normalized weights. When the number of iterations equals the number n of training-set images, the geodesic surface function, i.e. the sample feature, is obtained;
s205: repeating step S204, generating a new set of weights each time, to obtain multiple groups of sample features.
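The geodesic of S203 and the weighted iteration of S204 can be sketched as follows; the spherical-interpolation form of the geodesic follows from the pre-shape sphere, while the incremental weighting scheme s_j = w_j / (w_1 + … + w_j) is an assumption not fixed by the text:

```python
import numpy as np

def geodesic(s, tau1, tau2):
    """Point at parameter s on the unit-sphere geodesic from tau1 to tau2
    (spherical linear interpolation between two unit feature vectors)."""
    d = np.arccos(np.clip(np.dot(tau1, tau2), -1.0, 1.0))  # geodesic distance
    if d < 1e-8:                      # endpoints (almost) coincide
        return tau1.copy()
    return (np.sin((1.0 - s) * d) * tau1 + np.sin(s * d) * tau2) / np.sin(d)

def geodesic_surface_sample(taus, weights):
    """Fold n data features into one sample feature by iterating the geodesic:
    mu_1 = tau_1, mu_j = geodesic(s_j; mu_{j-1}, tau_j), with s_j the
    normalized cumulative Dirichlet weight (assumed scheme)."""
    mu, cum = taus[0], weights[0]
    for j in range(1, len(taus)):
        cum += weights[j]
        mu = geodesic(weights[j] / cum, mu, taus[j])
        mu = mu / np.linalg.norm(mu)  # stay on the unit sphere numerically
    return mu

rng = np.random.default_rng(0)
taus = [v / np.linalg.norm(v) for v in rng.normal(size=(3, 8))]
sample = geodesic_surface_sample(taus, rng.dirichlet(np.ones(3)))
```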
S3: generating random noise through Gaussian distribution sampling, calculating anchor point characteristics through the random noise and sample characteristics, constructing anchor point images and interpolation images, and extracting the anchor point characteristics and interpolation image characteristics; the method specifically comprises the following steps:
s301: for each group of sample features obtained in step S205, obtaining a group of random noise {z_i : i ∈ [1, n]} by Gaussian distribution sampling, and calculating the anchor point noise from the random noise and the weights corresponding to the sample feature;
s302: obtaining a group of interpolation noise sets Z_inp by linearly interpolating between the random noise sampled from two adjacent Gaussian distributions, where z′_1, z′_2, …, z′_k are the interpolation results;
s303: inputting the anchor point noise and the interpolation noise set Z_inp into the generator G to obtain the anchor point image and the interpolated images, and inputting the anchor point image into the specified convolution layer l of the discriminator D to obtain the anchor point image features;
s304: projecting the anchor point image features onto the pre-shape space to obtain the anchor point features, where c denotes the number of channels, h the height, and w the width.
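The noise construction of S301–S302 can be sketched as follows; treating the anchor noise as the Dirichlet-weighted combination of the per-image noises, and the interpolation set as evenly spaced linear interpolations, are assumptions consistent with but not dictated by the text:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, dim = 4, 5, 16                 # images, interpolation steps, latent size

z = rng.standard_normal((n, dim))    # one Gaussian noise per training image
omega = rng.dirichlet(np.ones(n))    # the weights paired with the sample feature

# Anchor noise: weighted combination of the sampled noises (assumed form).
z_anchor = omega @ z

# Interpolation noise set between two adjacent noises z[0] and z[1].
t = np.linspace(0.0, 1.0, k)
Z_inp = (1.0 - t)[:, None] * z[0] + t[:, None] * z[1]
```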
S4: constructing loss through the sample characteristics, anchor point characteristics and interpolation image characteristics, and updating a generator and a discriminator; the method specifically comprises the following steps:
s401: calculating the cosine similarity between the vector τ^s_(u,v) of the sampled feature at the first position (u, v) and the vector τ^s_(a,b) at the neighborhood position (a, b):
c^s_((u,v),(a,b)) = ⟨τ^s_(u,v), τ^s_(a,b)⟩ / (‖τ^s_(u,v)‖ · ‖τ^s_(a,b)‖)
where ⟨·,·⟩ denotes the dot product and ‖·‖ the Euclidean norm;
s402: calculating, in the same manner, the cosine similarity c^a_((u,v),(a,b)) between the vector τ^a_(u,v) of the anchor point feature at the first position and the vector τ^a_(a,b) at the neighborhood position;
s403: traversing all neighborhood positions (a, b) corresponding to each first position (u, v) to obtain the autocorrelation matrices C^s_(u,v) and C^a_(u,v) of the sampled features and the anchor point features at the first position, respectively;
S404: constructing an autocorrelation consistency loss according to an autocorrelation matrix of the sampling characteristic and the anchor point characteristic, wherein the expression is as follows:
wherein L is sl1 (. Cndot.) represents the smoothl 1 loss function,cosine similarity for sampled features, +.>For cosine similarity of anchor point characteristics, omega-Dir is dirichlet distribution weight,/I>For training set image data, z-p (z) is random noise obtained by Gaussian distribution sampling, +.>Representing the computational expectations.
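The neighborhood cosine similarities and autocorrelation matrices of S401–S403 can be sketched as below; the 3×3 neighborhood and zero padding at the borders are assumptions, since the source does not fix the neighborhood size:

```python
import numpy as np

def cosine_autocorrelation(feat, radius=1, eps=1e-8):
    """For a (c, h, w) feature map, compute at each position (u, v) the cosine
    similarity between its channel vector and every neighbour (a, b) in a
    (2*radius+1)**2 window (zero padded). Returns (h, w, (2*radius+1)**2)."""
    c, h, w = feat.shape
    pad = np.pad(feat, ((0, 0), (radius, radius), (radius, radius)))
    norms = np.linalg.norm(feat, axis=0) + eps
    sims = []
    for da in range(-radius, radius + 1):
        for db in range(-radius, radius + 1):
            nb = pad[:, radius + da:radius + da + h, radius + db:radius + db + w]
            nb_norms = np.linalg.norm(nb, axis=0) + eps
            sims.append((feat * nb).sum(axis=0) / (norms * nb_norms))
    return np.stack(sims, axis=-1)

A = cosine_autocorrelation(np.random.default_rng(2).normal(size=(8, 6, 6)))
```

The autocorrelation consistency loss of S404 is then a smooth-L1 distance between two such maps, one from the sampled features and one from the anchor point features.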
S405: from interpolated imagesConstructing unsaturated loss for supervising the interpolation image; the calculated expression of the non-saturation loss is:
wherein L is inp As non-saturation loss, Z inp (z′ 1 ,z′ k ) In order to interpolate the noise set,for interpolation of image features, log (·) is the base 10 logarithm, z' 1 ,z′ k -p (z) represents interpolation noise, +.>Representing the computational expectations.
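A minimal numeric sketch of the non-saturation term, taking the base-10 logarithm as stated and averaging over discriminator outputs for the interpolated images (the reduction by mean and the clipping guard are assumptions):

```python
import numpy as np

def nonsaturating_interp_loss(d_outputs, eps=1e-12):
    """Mean of -log10(D(G(z'))) over the interpolation noise set; clipping to
    (eps, 1] guards the logarithm against zero discriminator outputs."""
    d = np.clip(np.asarray(d_outputs, dtype=np.float64), eps, 1.0)
    return float(np.mean(-np.log10(d)))

loss = nonsaturating_interp_loss([0.1, 0.1, 0.1])
```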
S406: constructing a first multidimensional vector, wherein the expression is as follows:
wherein V is inp [·]As a first multi-dimensional vector of the first set,for anchor pictures, z' i Representing the ith interpolation noise, k being the dimension of the vector, the dimension of the vector being equal to the number of elements of the interpolation feature set;
s407: constructing a second multidimensional vector, and expressing as follows:
V q [1:k-1]=1,V q [k]=k-1
wherein V is q [·]For a second multidimensional vector, k is the dimension of the vector, which is equal to the number of elements of the interpolation feature set; .
S408: constructing a distance constraint loss L between interpolation images dl =L kl (V inp ,V q ) Wherein L is kl Indicating KL divergence loss;
s409: constructing the adversarial loss L_G^adv of the generator, and obtaining the optimization function of the generator from the adversarial loss, the non-saturation loss, and the inter-image distance constraint loss: L_G = L_G^adv + λ1·L_inp + λ2·L_dl, where λ1 and λ2 are fixed hyper-parameters;
s410: constructing the adversarial loss L_D^adv of the discriminator and the optimization function of the discriminator: L_D = L_D^adv + λ3·L_acc, where λ3 is a fixed hyper-parameter;
s411: updating the parameters of the generator G and the discriminator D according to the optimization functions L_G and L_D until the losses converge.
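The assembly of S409–S411 can be sketched as follows; assigning L_inp and L_dl to the generator objective and the autocorrelation consistency loss L_acc to the discriminator objective is one consistent reading of the text, and the λ values are placeholders:

```python
# Toy stand-ins: the actual G and D are conventional conv networks (per the
# description); lambda_1..lambda_3 are fixed hyper-parameters (values assumed).
LAMBDA_1, LAMBDA_2, LAMBDA_3 = 1.0, 0.5, 0.5

def generator_objective(adv_loss, l_inp, l_dl):
    """L_G = adversarial loss + lambda_1 * L_inp + lambda_2 * L_dl."""
    return adv_loss + LAMBDA_1 * l_inp + LAMBDA_2 * l_dl

def discriminator_objective(adv_loss, l_acc):
    """L_D = adversarial loss + lambda_3 * L_acc."""
    return adv_loss + LAMBDA_3 * l_acc

total_g = generator_objective(0.7, 1.0, 0.2)
total_d = discriminator_objective(0.9, 0.4)
```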
S5: inputting random noise into the updated generator G, paired with the updated discriminator D, to generate images.
The method is applicable to data sets of various data types, and the generator and the discriminator are conventional neural networks.
The foregoing describes in detail preferred embodiments of the present invention. It should be understood that numerous modifications and variations can be made in accordance with the concepts of the invention by one of ordinary skill in the art without undue burden. Therefore, all technical solutions which can be obtained by logic analysis, reasoning or limited experiments based on the prior art by the person skilled in the art according to the inventive concept shall be within the scope of protection defined by the claims.

Claims (10)

1. An image generation method based on shape space theory is characterized by comprising the following steps:
s1: inputting an image data set and random noise, and constructing an image generation model and a pre-shape space, wherein the image generation model comprises a generator and a discriminator;
s2: randomly selecting a group of images from the image data set, inputting the images into a discriminator to obtain training features, projecting the training features to obtain data features, constructing a geodesic surface on a pre-shape space according to the data features, and obtaining sample features from the geodesic surface according to randomly generated sampling weights;
s3: generating random noise through Gaussian distribution sampling, calculating anchor point characteristics through the random noise and sample characteristics, constructing anchor point images and interpolation images, and extracting the anchor point characteristics and interpolation image characteristics;
s4: constructing loss through the sample characteristics, anchor point characteristics and interpolation image characteristics, and updating a generator and a discriminator;
s5: and inputting random noise through the updated generator and the updated discriminator to generate an image.
2. The image generation method based on the shape space theory according to claim 1, wherein the step S2 specifically comprises the following steps:
s201: randomly selecting a group of images from the image data set as a training set, inputting the training set into a specified convolution layer of a discriminator, and extracting training features;
s202: projecting the extracted training features to a pre-shape space to obtain data features;
s203: constructing a geodesic curve function according to the data characteristics;
s204: generating a group of weights according to the Dirichlet distribution, iterating the geodesic curve function according to the weights to obtain intermediate feature vectors, and obtaining the geodesic surface function, namely a sample feature, when the number of iterations equals the number of images in the training set;
s205: repeating step S204, generating a new set of weights each time, to obtain multiple groups of sample features.
3. The image generation method based on the shape space theory according to claim 2, wherein the geodesic curve function is expressed as:
χ(s; τ1, τ2) = [sin((1−s)·d(τ1, τ2))·τ1 + sin(s·d(τ1, τ2))·τ2] / sin(d(τ1, τ2)), s ∈ [0, 1]
where χ(·) is the curve function, τ1 and τ2 are the feature vectors at the start and end points of the geodesic curve, the parameter s controls the position of the generated vector on the geodesic curve, and d(τ1, τ2) denotes the geodesic distance between τ1 and τ2.
4. The image generation method based on the shape space theory according to claim 2, wherein the step S3 specifically comprises the following steps:
s301: for each group of sample characteristics obtained in the step S205, obtaining a group of random noise through Gaussian distribution sampling, and calculating anchor point noise according to the weight and the random noise corresponding to the sample characteristics;
s302: obtaining a group of interpolation noise sets by linearly interpolating random noise sampled by two adjacent Gaussian distributions;
s303: inputting the anchor point noise and the interpolation noise set into the generator to obtain an anchor point image and interpolated images, and inputting the anchor point image into the specified convolution layer of the discriminator to obtain the anchor point image features;
s304: projecting the anchor point image features onto the pre-shape space to obtain the anchor point features.
5. The image generation method based on the shape space theory according to claim 4, wherein the step S4 specifically comprises the steps of:
s401: calculating cosine similarity of the vector of the sampling feature at the first position and the vector of the neighborhood of the first position;
s402: calculating cosine similarity of the vector of the anchor point characteristic at the first position and the vector of the first position neighborhood;
s403: traversing all corresponding positions of the first position neighborhood in space to obtain autocorrelation matrixes of the sampling features and the anchor point features at the first positions respectively;
s404: constructing an autocorrelation consistency loss according to the autocorrelation matrixes of the sampling characteristics and the anchor point characteristics;
s405: constructing a non-saturation loss from the interpolated images to supervise them;
s406: constructing a first multidimensional vector, wherein the dimension of the vector is equal to the number of elements of the interpolation feature set;
s407: constructing a second multidimensional vector, wherein the dimension of the vector is equal to the number of elements of the interpolation feature set;
s408: constructing a distance constraint loss between interpolation images;
s409: constructing the adversarial loss of the generator, and obtaining the optimization function of the generator from the inter-interpolation-image distance constraint loss and the adversarial loss;
s410: constructing the adversarial loss of the discriminator and constructing the optimization function of the discriminator;
s411: and updating parameters of the generator and the discriminator according to the optimization function of the generator and the optimization function of the discriminator until the loss converges.
6. The image generation method based on shape space theory according to claim 5, wherein the expression of the autocorrelation consistency loss is:
L_acc = E_(ω∼Dir, x∼p_data, z∼p(z)) [ L_sl1( C^s, C^a ) ]
where L_sl1(·) denotes the smooth-L1 loss function, C^s is the autocorrelation matrix of cosine similarities of the sampled features, C^a is that of the anchor point features, ω∼Dir is the Dirichlet distribution weight, x∼p_data is the training-set image data, z∼p(z) is the random noise obtained by Gaussian distribution sampling, and E[·] denotes the expectation.
7. The image generation method based on the shape space theory according to claim 5, wherein the calculation expression of the non-saturation loss is:
L_inp = −E_(z′1, z′k ∼ p(z)) [ log D( G( Z_inp(z′1, z′k) ) ) ]
where L_inp is the non-saturation loss, Z_inp(z′1, z′k) is the interpolation noise set, G(·) and D(·) denote the generator and the discriminator, log(·) is the base-10 logarithm, z′1, z′k ∼ p(z) denote the interpolation noise, and E[·] denotes the expectation.
8. The method of claim 5, wherein the expression of the first multidimensional vector is:
V_inp[i] = ‖ G(z′_i) − G(z′_(i+1)) ‖, i ∈ [1, k−1], V_inp[k] = ‖ G(z′_1) − G(z′_k) ‖
where V_inp[·] is the first multidimensional vector, G(z′_i) is the image generated from the i-th interpolation noise z′_i, and k is the dimension of the vector.
9. The method of claim 5, wherein the expression of the second multidimensional vector is:
V q [1:k-1]=1,V q [k]=k-1
wherein V is q [·]For the second multidimensional vector, k is the dimension of the vector.
10. The image generation method based on the shape space theory according to claim 5, wherein the calculation expression of the inter-image distance constraint loss is:
L dl =L kl (V inp ,V q )
wherein L is dl L is the constraint loss of the distance between images kl For KL divergence loss, V inp For the first multidimensional vector, V q Is a second multidimensional vector.
CN202311741845.1A (filed 2023-12-18) — Image generation method based on shape space theory — Pending — CN117746178A

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311741845.1A CN117746178A (en) 2023-12-18 2023-12-18 Image generation method based on shape space theory


Publications (1)

Publication Number Publication Date
CN117746178A — 2024-03-22

Family

ID=90260290



Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination