CN110619347A - Image generation method based on machine learning and method thereof

Info

Publication number: CN110619347A
Application number: CN201910703906.2A
Authority: CN
Inventors: 张明森; 熊晓明; 黄宏敏; 胡恩; 刘祥
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2019-07-31
Filing date: 2019-07-31
Publication date: 2019-12-27

Abstract

The invention discloses an image generation method based on machine learning, comprising the following steps: S1: acquiring image data and preprocessing the data; S2: constructing a network model; S3: training the network model; S4: generating an image that matches the feature information of the image input by the user. A discriminative model is added after the generative model (which resembles the decoder of a variational autoencoder) and is used to judge the similarity between a generated sample and the samples in the training data. After the neural network model has been trained, the discriminator can no longer tell whether an image produced by the generator comes from the real data, so the images produced by the generator are the images the user requires.

Description

Image generation method based on machine learning and method thereof

Technical Field

The invention relates to the technical field of computer vision, in particular to an image generation method based on machine learning and a method thereof.

Background

Image generation is an important area of computer vision; in essence, it means generating realistic samples from a given distribution. The method used to generate such samples is called generative modeling. Applied to the image domain, a generative model can supply additional images.

Traditional image generation methods fall into two types: the deep belief network model and the variational autoencoder model.

The deep belief network is formed by stacking several restricted Boltzmann machines. The Boltzmann machine is a neural network rooted in statistical mechanics that can learn the inherent structure of data; it is named after the Boltzmann distribution, which its sample distribution follows. A Boltzmann machine consists of a visible layer and a hidden layer, and in addition to the connections between the two layers, neurons within the same layer are also connected to one another. A restricted Boltzmann machine, by contrast, has no connections within a layer. Training a restricted Boltzmann machine rests on the principle that a system is more stable the lower its energy: a set of parameters is learned that minimizes the system energy. Training proceeds layer by layer from the bottom layer upward, and each restricted Boltzmann machine is trained in turn with the contrastive divergence algorithm.

The variational autoencoder is an improvement on the ordinary autoencoder. An ordinary autoencoder takes some kind of data as input, for example a picture or a high-dimensional vector, and uses a neural network to compress it into as compact a set of features as possible. The process has two parts, an encoder and a decoder network: the encoder compresses the data, and the decoder reconstructs it so that the reconstruction loss can be calculated. The variational autoencoder differs in that the single latent vector of the ordinary autoencoder is replaced by two independent vectors, one representing the mean of a distribution and one representing its standard deviation, and an intermediate vector is needed to connect the two.
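The role of the mean and standard-deviation vectors can be illustrated with a short sketch. It assumes the commonly used sampling rule z = mu + sigma * eps, with eps drawn from a standard normal distribution; this rule and the names `sample_latent`, `mu` and `sigma` are illustrative assumptions, not details given in this description.

```python
import random

def sample_latent(mu, sigma, rng=random):
    """Draw one latent vector z = mu + sigma * eps, with eps ~ N(0, 1) per dimension.

    mu and sigma stand for the two vectors an encoder would output
    (hypothetical values here; no encoder network is modelled).
    """
    if len(mu) != len(sigma):
        raise ValueError("mu and sigma must have the same length")
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

# Example: a 3-dimensional latent code with a fixed seed for repeatability.
rng = random.Random(0)
z = sample_latent([0.0, 1.0, -2.0], [1.0, 0.5, 0.0], rng)
```

A dimension with zero standard deviation, such as the last one above, always returns its mean.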

However, both traditional models have shortcomings: the Boltzmann machine trains slowly, and the variational autoencoder cannot train the network model in a supervised manner.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art by providing a fast image generation method based on machine learning, with which image generation models of different types can be trained according to the designer's needs and then invoked to generate the images a user requires according to the user's input.

To achieve the above purpose, the invention provides the following technical solution:

An image generation method based on machine learning, comprising the steps of:

S1: acquiring image data and preprocessing the data;

S2: constructing a network model;

S3: training the network model;

S4: generating an image that matches the feature information of the image input by the user.

Further, the specific steps of step S1 are as follows:

S1-1: acquiring image training data by using a web crawler;

S1-2: reading an image, removing the mean from the image data, and calculating the covariance matrix;

S1-3: decorrelating the image data by projecting the mean-removed data onto the eigenbasis;

S1-4: whitening, namely normalizing the scale of the decorrelated data in each dimension by the corresponding eigenvalue.
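Steps S1-2 to S1-4 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `pca_whiten`, the constant `eps`, the division by the square root of each eigenvalue (the usual PCA-whitening convention) and the random toy data standing in for flattened images are all hypothetical.

```python
import numpy as np

def pca_whiten(X, eps=1e-5):
    """PCA-whiten data X of shape (n_samples, n_features).

    eps is a small assumed constant that avoids division by zero.
    """
    X = np.asarray(X, dtype=np.float64)
    # S1-2: remove the per-feature mean, then compute the covariance matrix.
    X_centered = X - X.mean(axis=0)
    cov = X_centered.T @ X_centered / X_centered.shape[0]
    # S1-3: decorrelate by projecting onto the eigenbasis of the covariance.
    eigvals, eigvecs = np.linalg.eigh(cov)
    X_rot = X_centered @ eigvecs
    # S1-4: whiten by dividing each dimension by the square root of its
    # eigenvalue (assumed convention), so every dimension has unit variance.
    return X_rot / np.sqrt(eigvals + eps)

rng = np.random.default_rng(0)
# Toy stand-in for flattened images: 500 samples with 3 correlated features.
mix = np.array([[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.5, 0.3, 0.2]])
data = rng.normal(size=(500, 3)) @ mix
white = pca_whiten(data)
cov_white = white.T @ white / white.shape[0]
```

After whitening, `cov_white` should be close to the identity matrix, which is a quick way to check the preprocessing.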

Further, the network model constructed in step S2 includes a generative model and a discriminative model;

the generative model consists of deconvolution (transposed convolution) layers, batch normalization (BN) layers and ReLU activation functions; its input is a latent vector z drawn from a standard normal distribution, and its output is an RGB image; x denotes image data, and G(z) denotes the generator function that maps z to the data space; the goal of G is to estimate the distribution of the data set and then generate fake samples from the estimated distribution;

the discriminative model consists of convolution layers, BN layers and LeakyReLU activation functions; it takes an image as input, passes it through a series of convolution, BN and LeakyReLU layers, and outputs through a Sigmoid activation function a scalar probability that the input image is real; D(x) denotes the discriminator network, which outputs the probability that x comes from the training data: D(x) is high if x comes from the training data and low if x comes from the generator;

D(G(z)) denotes the probability that the output of G is a real image; D and G play a game against each other: D aims to maximize the probability of correctly classifying real and fake samples, log D(x), while G aims to minimize the probability that D predicts its output to be fake, log(1 - D(G(z))).
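To make the shapes of the two models concrete, the sketch below traces how the image side length changes through a stack of deconvolution layers (generator) and convolution layers (discriminator). The kernel, stride and padding values are hypothetical, chosen in the style of common DCGAN-type networks; the description does not state layer sizes, and BN and activation layers are omitted because they do not change the spatial size.

```python
import math

def deconv_out(size, kernel, stride, padding):
    """Output side length of a transposed convolution (no dilation or output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

def conv_out(size, kernel, stride, padding):
    """Output side length of an ordinary convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def sigmoid(t):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

# Hypothetical (kernel, stride, padding) per layer; not taken from the patent.
GENERATOR_LAYERS = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]
DISCRIMINATOR_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

def trace_sizes(start, layers, step):
    """Return the side length before the first layer and after each layer."""
    sizes = [start]
    for kernel, stride, padding in layers:
        sizes.append(step(sizes[-1], kernel, stride, padding))
    return sizes

# Generator: the latent vector z is treated as a 1x1 map and upsampled to an image.
gen_sizes = trace_sizes(1, GENERATOR_LAYERS, deconv_out)
# Discriminator: the image is downsampled to a single score, then passed to Sigmoid.
disc_sizes = trace_sizes(gen_sizes[-1], DISCRIMINATOR_LAYERS, conv_out)
prob_real = sigmoid(0.0)  # a score of 0 corresponds to a probability of 0.5

print(gen_sizes)   # [1, 4, 8, 16, 32, 64]
print(disc_sizes)  # [64, 32, 16, 8, 4, 1]
```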

Further, in the constructed network model, the loss function is set as a binary cross entropy:

where x is the distribution of the real data, y is the distribution of the output data, and the result l(x, y) is the difference between the x and y distributions, i.e., the loss.
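The formula itself is not reproduced in this text. As a hedged illustration, the sketch below uses the standard form of binary cross entropy, -[t * log(p) + (1 - t) * log(1 - p)], averaged over samples, where p is a predicted probability and t is a 0/1 label. The names `pred`, `target` and `eps`, and this choice of notation, are assumptions made for the example.

```python
import math

def binary_cross_entropy(pred, target, eps=1e-12):
    """Mean binary cross entropy between predicted probabilities and 0/1 labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() stays finite
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)

# Discriminator outputs on two real images (label 1) and two generated ones (label 0).
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
loss_d = binary_cross_entropy(d_real + d_fake, [1, 1, 0, 0])
# Assumption: for the generator update, the same generated samples are scored
# against the "real" label, a common practice that the text does not spell out.
loss_g = binary_cross_entropy(d_fake, [1, 1])
```

With label 1 the loss reduces to -log D(x), and with label 0 to -log(1 - D(G(z))), which are the two terms the game between D and G is stated in.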

Further, the network model is trained in step S3 as follows:

1) when training the discriminator, the generator G is first fixed; the generator is then used to produce a sample G(z) from a random input z, which serves as a negative sample, and a positive sample x is drawn from the real data set; the positive and negative samples are fed to the discriminator D, and the error is computed from the discriminator's output, D(x) or D(G(z)), and the sample label; finally, the discriminator D is updated with the error back-propagation algorithm;

2) when training the generator, the discriminator D is first fixed; the current generator is then used to produce a sample from a random input, and the sample is fed to the discriminator D; the error is computed from the discriminator's output D(G(z)) and the sample label; finally, the parameters of the generator G are updated with the error back-propagation algorithm.
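The alternation between the two training steps can be sketched on a deliberately tiny one-dimensional model, with gradients written out by hand instead of using a deep-learning framework. Everything here is a hypothetical stand-in: the class `ToyGAN`, the linear generator G(z) = a*z + b, the single-unit discriminator D(x) = sigmoid(w*x + c), the learning rate and the Gaussian "real" data. The sketch shows only which parameters each step updates and makes no claim about convergence.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

class ToyGAN:
    """1-D GAN: G(z) = a*z + b, D(x) = sigmoid(w*x + c). Purely illustrative."""

    def __init__(self):
        self.a, self.b = 1.0, 0.0   # generator parameters
        self.w, self.c = 0.1, 0.0   # discriminator parameters

    def G(self, z):
        return self.a * z + self.b

    def D(self, x):
        return sigmoid(self.w * x + self.c)

    def train_discriminator(self, real, noise, lr):
        # Step 1: G is fixed; only w and c are updated.
        samples = [(x, 1.0) for x in real] + [(self.G(z), 0.0) for z in noise]
        # For binary cross entropy on a sigmoid output,
        # d(loss)/d(score) = D(x) - label.
        errs = [(self.D(x) - label, x) for x, label in samples]
        self.w -= lr * sum(e * x for e, x in errs) / len(errs)
        self.c -= lr * sum(e for e, _ in errs) / len(errs)

    def train_generator(self, noise, lr):
        # Step 2: D is fixed; only a and b are updated.
        # Assumption: generated samples are labelled 1 ("real") for this update.
        errs = [(self.D(self.G(z)) - 1.0, z) for z in noise]
        self.a -= lr * sum(e * self.w * z for e, z in errs) / len(errs)
        self.b -= lr * sum(e * self.w for e, _ in errs) / len(errs)

rng = random.Random(0)
gan = ToyGAN()
for _ in range(200):
    real = [rng.gauss(3.0, 0.5) for _ in range(32)]   # stand-in for real data
    noise = [rng.gauss(0.0, 1.0) for _ in range(32)]  # random inputs z
    gan.train_discriminator(real, noise, lr=0.05)
    noise = [rng.gauss(0.0, 1.0) for _ in range(32)]
    gan.train_generator(noise, lr=0.05)
```

In a full implementation the hand-written gradients would be replaced by back-propagation through the convolutional networks, but the fix-one, update-the-other structure stays the same.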

Compared with the prior art, the principle and advantages of this scheme are as follows:

In the neural network, a discriminative model is added after the generative model (which resembles the decoder of a variational autoencoder) and is used to judge the similarity between a generated sample and the samples in the training data. After the neural network model has been trained, the discriminator can no longer tell whether an image produced by the generator comes from the real data, so the images produced by the generator are the images the user requires.

Image generation is carried out in stages, so the scheme is relatively simple to implement and the process is clear and free of redundancy. The network model used to obtain the image generation model makes training more automatic and avoids excessive manual intervention, which in turn makes image generation more automatic. A series of training strategies is applied during training so that the generated images are more realistic and closer in style to the original input images. The images a user requires can thus be generated automatically, and the method can be applied to artistic creation.

Drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is a flowchart illustrating a method for generating images based on machine learning according to the present invention;

FIG. 2 is a flowchart illustrating the operation of image data acquisition and data preprocessing in an image generation method based on machine learning according to the present invention;

FIG. 3 is a block diagram of a neural network model used in the image generation method based on machine learning according to the present invention;

FIG. 4 is a block diagram of the generator G.

Detailed Description

The invention will be further illustrated with reference to specific examples:

As shown in FIG. 1, the image generation method based on machine learning according to this embodiment includes the following steps:

S1: acquiring image data and preprocessing the data; the specific process is as follows:

S1-1: acquiring image training data by using a web crawler;

S2: constructing a network model;

the constructed network model comprises a generative model and a discriminative model;

D(G(z)) denotes the probability that the output of G is a real image; D and G play a game against each other: D aims to maximize the probability of correctly classifying real and fake samples, log D(x), while G aims to minimize the probability that D predicts its output to be fake, log(1 - D(G(z)));

in the constructed network model, the loss function is set as binary cross entropy:

S3: training the network model; the specific steps are as follows:

After the network model has been trained, the discriminator can no longer tell whether an image produced by the generator comes from the real data, so the images produced by the generator are the images the user requires.
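The statement that the trained discriminator can no longer tell generated images from real ones has a precise counterpart in the standard formulation of generative adversarial networks. The equations below give that standard result as background rather than as formulas taken from this description; p_data, p_g and p_z, denoting the real-data, generator and latent distributions, are notation introduced here.

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)},
\qquad
p_g = p_{\text{data}} \;\Rightarrow\; D^{*}(x) = \tfrac{1}{2}
```

When the generator distribution matches the data distribution, the best possible discriminator outputs 1/2 for every input, that is, it can do no better than guessing.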

The principle of this embodiment is explained again below by way of a metaphor:

The generative model is like a counterfeiter and the discriminative model is like a police officer. The counterfeiter keeps trying to produce counterfeit banknotes that fool the police officer, while the police officer keeps carefully checking whether each banknote is genuine. The two compete against each other until the counterfeit banknotes are so similar to genuine ones that the police officer can no longer tell them apart (this corresponds to step S4 of the embodiment, generating an image that matches the feature information of the image input by the user).

In this embodiment, a discriminative model is added after the generative model (which resembles the decoder of a variational autoencoder) and is used to judge the similarity between a generated sample and the samples in the training data. After the neural network model has been trained, the discriminator can no longer tell whether an image produced by the generator comes from the real data, so the images produced by the generator are the images the user requires.

The embodiments described above are merely preferred embodiments of the present invention and do not limit its scope; any variation based on the form and principle of the present invention shall fall within its scope of protection.

Claims

1. An image generation method based on machine learning, characterized by comprising the steps of:

S1: acquiring image data and preprocessing the data;

S2: constructing a network model;

S3: training the network model;

2. The method for generating an image based on machine learning according to claim 1, wherein the specific steps of step S1 are as follows:

S1-1: acquiring image training data by using a web crawler;

3. The image generation method based on machine learning of claim 1, wherein the network model constructed in step S2 includes a generation model and a discriminant model;

4. The method according to claim 3, wherein in the constructed network model, the loss function is set as binary cross entropy:

5. The method for generating images based on machine learning according to claim 1, wherein the step S3 is implemented by the following steps: