CN110619347A - Image generation method based on machine learning and method thereof

Publication number
CN110619347A
Authority
CN
China
Prior art keywords
image, data, discriminator, generator, sample
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910703906.2A
Other languages
Chinese (zh)
Inventor
张明森
熊晓明
黄宏敏
胡恩
刘祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-07-31
Filing date
2019-07-31
Publication date
2019-12-27
Application filed by Guangdong University of Technology
Priority to CN201910703906.2A
Publication of CN110619347A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses an image generation method based on machine learning, comprising the following steps: S1: acquiring image data and preprocessing the data; S2: constructing a network model; S3: training the network model; S4: generating an image that matches the feature information of the image input by the user. A discriminative model is added after the generative model (which resembles the decoder in a variational autoencoder) and is used to judge how similar a generated sample is to the samples in the training data. After the neural network model has been trained, the discriminator can no longer tell whether an image produced by the generator comes from real data, so the generator's output is the image the user requires.

Description

Image generation method based on machine learning and method thereof
Technical Field
The invention relates to the technical field of computer vision, in particular to an image generation method based on machine learning and a method thereof.
Background
Image generation is an important area of computer vision; in essence, it produces realistic samples from a given distribution. The approach used to generate such samples is called generative modeling. Applying a generative model to the image domain makes it possible to supply more images.
Traditional image generation relies on two kinds of model: the deep belief network and the variational autoencoder.
A deep belief network is formed by stacking several restricted Boltzmann machines. The Boltzmann machine is a neural network rooted in statistical mechanics that can learn the inherent structure of data; it is named after the Boltzmann distribution, which its sample distribution follows. A Boltzmann machine consists of a visible layer and a hidden layer; in addition to the connections between the two layers, neurons within the same layer are also connected to each other. In a restricted Boltzmann machine, by contrast, there are no connections within a layer. Training a restricted Boltzmann machine rests on the principle that lower energy means a more stable system: a set of parameters is learned that minimizes the system energy. Training proceeds layer by layer from the bottom, and each restricted Boltzmann machine is trained in turn with the contrastive divergence algorithm.
The variational autoencoder is an improvement on the ordinary autoencoder. An ordinary autoencoder takes some kind of data as input, for example a picture or a high-dimensional vector, and uses a neural network to compress it into as small a set of feature values as possible. The model is split into two parts, an encoder network and a decoder network: the encoder compresses the data, and the decoder reconstructs it so that the reconstruction loss can be calculated. A variational autoencoder differs in that the single latent vector of the ordinary autoencoder is replaced by two independent vectors, one representing the mean of the distribution and one representing its standard deviation, with a further vector needed in between to connect them.
However, both traditional models have drawbacks: the Boltzmann machine trains slowly, and the variational autoencoder cannot train the network model in a supervised manner.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a fast image generation method based on machine learning that can train different types of image generation model according to the designer's needs and can call those models to generate the images a user requires from the user's input.
To achieve this purpose, the invention provides the following technical solution:
An image generation method based on machine learning, comprising the following steps:
S1: acquiring image data and preprocessing the data;
S2: constructing a network model;
S3: training the network model;
S4: generating an image that matches the feature information of the image input by the user.
Further, step S1 comprises the following specific steps:
S1-1: acquiring image training data with a web crawler;
S1-2: reading each image, subtracting the mean from it, and calculating the covariance matrix;
S1-3: decorrelating the image data by projecting the mean-subtracted data onto the eigenbasis;
S1-4: whitening, that is, normalizing the scale of the decorrelated data in each dimension using the eigenvalue of that dimension.
Further, the network model constructed in step S2 comprises a generative model and a discriminative model.
The generative model consists of deconvolution layers, BN (batch normalization) layers and ReLU activation functions; its input is a latent vector z drawn from a standard normal distribution, and its output is an RGB image. Here x denotes image data, and G(z) denotes the generator function that maps z to data space; the goal of G is to estimate the distribution of the data set and then generate fake samples from the estimated distribution.
The discriminative model consists of convolution layers, BN layers and LeakyReLU activation functions; it takes an image as input, passes it through a series of convolution, BN and LeakyReLU layers, and outputs through a Sigmoid activation function the scalar probability that the input image is real. D(x) is the discriminator network, which outputs the probability that x comes from the training data: if x comes from the training data, D(x) outputs a high probability, and otherwise a low one.
D(G(z)) denotes the probability that the output of G is a real image. D and G play a game against each other: D aims to maximize the probability of correctly classifying real and fake samples, log D(x), while G aims to minimize the probability that D predicts its output to be fake, log(1 - D(G(z))).
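The game between D and G is usually written as a single minimax objective. The patent text does not state this formula; what follows is the standard formulation from the generative adversarial network literature, offered as an assumption about what is meant. The symbols p_data (the real data distribution) and p_z (the standard normal distribution of z) are introduced here for illustration only.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

D raises V by pushing D(x) toward 1 and D(G(z)) toward 0, while G lowers the second term by pushing D(G(z)) toward 1.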
Further, in the constructed network model, the loss function is set to the binary cross entropy l(x, y),
where x is the distribution of the real data, y is the distribution of the output data, and the result l(x, y) is the difference between the x and y distributions, that is, the loss.
Further, the network model is trained in step S3 as follows:
1) When training the discriminator, the generator G is first fixed; the generator then produces a sample G(z) from random input to serve as a negative sample, and a positive sample x is drawn from the real data set. The positive and negative samples are fed to the discriminator D, and the error is calculated from the discriminator's output, D(x) or D(G(z)), and the sample label. Finally, the discriminator D is updated with the error back-propagation algorithm.
2) When training the generator, the discriminator D is first fixed; the current generator then produces a sample from random input, which is fed to the discriminator D. The error is calculated from the discriminator's output D(G(z)) and the sample label, and finally the parameters of the generator G are updated with the error back-propagation algorithm.
Compared with the prior art, the principle and advantages of this solution are as follows:
In the neural network, a discriminative model is added after the generative model (which resembles the decoder in a variational autoencoder) and is used to judge how similar a generated sample is to the samples in the training data. After the neural network model has been trained, the discriminator can no longer tell whether an image produced by the generator comes from real data, so the generator's output is the image the user requires.
The image is generated in stages, so the solution is relatively simple to implement, and the process is clear and free of redundancy. The network model used to obtain the image generation model makes training more automatic and avoids excessive manual intervention, which in turn makes image generation more automatic. A series of training strategies is applied during training so that the generated images are more realistic and closer in style to the original input images. The images a user requires can thus be generated automatically, and the method can be applied to artistic creation.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of the image generation method based on machine learning according to the present invention;
FIG. 2 is a flowchart of image data acquisition and data preprocessing in the image generation method based on machine learning according to the present invention;
FIG. 3 is a block diagram of the neural network model used in the image generation method based on machine learning according to the present invention;
FIG. 4 is a block diagram of the generator G.
Detailed Description
The invention is further illustrated below with reference to a specific embodiment.
As shown in FIG. 1, the image generation method based on machine learning according to this embodiment comprises the following steps:
S1: acquiring image data and preprocessing the data; the specific process is as follows:
S1-1: acquiring image training data with a web crawler;
S1-2: reading each image, subtracting the mean from it, and calculating the covariance matrix;
S1-3: decorrelating the image data by projecting the mean-subtracted data onto the eigenbasis;
S1-4: whitening, that is, normalizing the scale of the decorrelated data in each dimension using the eigenvalue of that dimension.
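Steps S1-2 to S1-4 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: it assumes each image has already been flattened into one row of a matrix, it divides by the square root of each eigenvalue (the usual PCA whitening convention), and it adds a small constant to avoid division by zero. The function name `pca_whiten` and the parameter `eps` are hypothetical.

```python
import numpy as np


def pca_whiten(X, eps=1e-5):
    """PCA-whiten X, an (n_samples, n_features) array of flattened images.

    `eps` is an assumed smoothing constant, not taken from the patent.
    """
    X = np.asarray(X, dtype=np.float64)
    # S1-2: subtract the per-feature mean and compute the covariance matrix.
    X_centered = X - X.mean(axis=0)
    cov = X_centered.T @ X_centered / X_centered.shape[0]
    # Eigen-decomposition of the symmetric covariance matrix gives the eigenbasis.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # S1-3: decorrelate by projecting the centered data onto the eigenbasis.
    X_rot = X_centered @ eigvecs
    # S1-4: whiten by rescaling every dimension with its eigenvalue.
    return X_rot / np.sqrt(eigvals + eps)
```

After whitening, the covariance matrix of the returned data is close to the identity, which is a convenient check that the three steps were applied in the right order.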
S2: constructing a network model;
The constructed network model comprises a generative model and a discriminative model.
The generative model consists of deconvolution layers, BN (batch normalization) layers and ReLU activation functions; its input is a latent vector z drawn from a standard normal distribution, and its output is an RGB image. Here x denotes image data, and G(z) denotes the generator function that maps z to data space; the goal of G is to estimate the distribution of the data set and then generate fake samples from the estimated distribution.
The discriminative model consists of convolution layers, BN layers and LeakyReLU activation functions; it takes an image as input, passes it through a series of convolution, BN and LeakyReLU layers, and outputs through a Sigmoid activation function the scalar probability that the input image is real. D(x) is the discriminator network, which outputs the probability that x comes from the training data: if x comes from the training data, D(x) outputs a high probability, and otherwise a low one.
D(G(z)) denotes the probability that the output of G is a real image. D and G play a game against each other: D aims to maximize the probability of correctly classifying real and fake samples, log D(x), while G aims to minimize the probability that D predicts its output to be fake, log(1 - D(G(z))).
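To make the roles of the deconvolution and convolution stacks concrete, the sketch below traces how the spatial size changes layer by layer. The patent does not give layer counts, kernel sizes, strides, padding or image resolution; the values used here (five layers, 4x4 kernels, a 64x64 image) are assumptions chosen in the style of common deep convolutional GAN setups, and the helper names are hypothetical. Dilation of 1 and no output padding are also assumed.

```python
def deconv_out(size, kernel, stride, padding):
    """Output size of a transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def conv_out(size, kernel, stride, padding):
    """Output size of an ordinary convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed (kernel, stride, padding) per layer; not specified in the patent.
GEN_LAYERS = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]
DISC_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]


def trace_sizes(size, layers, step):
    """Return the spatial size before the first layer and after every layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = step(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# Generator: a 1x1 latent "pixel" is upsampled to a 64x64 image.
gen_sizes = trace_sizes(1, GEN_LAYERS, deconv_out)
# Discriminator: a 64x64 image is reduced to a single 1x1 score.
disc_sizes = trace_sizes(64, DISC_LAYERS, conv_out)
```

With these assumed settings the two stacks mirror each other, which is why the discriminator can consume exactly what the generator produces.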
In the constructed network model, the loss function is set to the binary cross entropy l(x, y),
where x is the distribution of the real data, y is the distribution of the output data, and the result l(x, y) is the difference between the x and y distributions, that is, the loss.
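The formula itself did not survive in this text. As a hedged illustration, binary cross entropy for a single sample is commonly written l(x, y) = -[x log y + (1 - x) log(1 - y)]. The sketch below assumes that x is the target label (1 for a real image, 0 for a generated one) and y is the discriminator's output probability, and averages over a batch. The function name and the clamping constant `eps` are hypothetical.

```python
import math


def binary_cross_entropy(targets, outputs, eps=1e-12):
    """Mean binary cross entropy between target labels and predicted probabilities.

    targets: labels x in {0.0, 1.0} (assumed: 1 = real, 0 = generated).
    outputs: discriminator probabilities y in [0, 1].
    """
    total = 0.0
    for x, y in zip(targets, outputs):
        # Clamp y away from 0 and 1 so that log() stays finite.
        y = min(max(y, eps), 1.0 - eps)
        total += -(x * math.log(y) + (1.0 - x) * math.log(1.0 - y))
    return total / len(targets)
```

The loss is small when the output agrees with the label and grows without bound as a confident output contradicts it, which is what drives both networks during training.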
S3: training the network model; the specific steps are as follows:
1) When training the discriminator, the generator G is first fixed; the generator then produces a sample G(z) from random input to serve as a negative sample, and a positive sample x is drawn from the real data set. The positive and negative samples are fed to the discriminator D, and the error is calculated from the discriminator's output, D(x) or D(G(z)), and the sample label. Finally, the discriminator D is updated with the error back-propagation algorithm.
2) When training the generator, the discriminator D is first fixed; the current generator then produces a sample from random input, which is fed to the discriminator D. The error is calculated from the discriminator's output D(G(z)) and the sample label, and finally the parameters of the generator G are updated with the error back-propagation algorithm.
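The alternating schedule in steps 1) and 2) can be sketched on a deliberately tiny problem. Everything below is an illustrative assumption rather than the patent's network: the data are one-dimensional, the generator is the linear map G(z) = a*z + b, the discriminator is the logistic unit D(x) = sigmoid(w*x + c), and the gradients are derived by hand instead of by a back-propagation library. Using label 1 for generated samples in the generator update is also an assumption, since the text only refers to "the sample label".

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def discriminator_step(p, real, noise, lr):
    """Step 1): update D's parameters (w, c) while G's (a, b) stay fixed."""
    fake = [p["a"] * z + p["b"] for z in noise]  # negative samples G(z)
    grad_w = grad_c = 0.0
    for x, label in [(x, 1.0) for x in real] + [(x, 0.0) for x in fake]:
        err = sigmoid(p["w"] * x + p["c"]) - label  # d(BCE)/d(logit)
        grad_w += err * x
        grad_c += err
    n = len(real) + len(fake)
    return {**p, "w": p["w"] - lr * grad_w / n, "c": p["c"] - lr * grad_c / n}


def generator_step(p, noise, lr):
    """Step 2): update G's parameters (a, b) while D's (w, c) stay fixed."""
    grad_a = grad_b = 0.0
    for z in noise:
        fake = p["a"] * z + p["b"]
        # Assumed label 1: G is rewarded when D rates its sample as real.
        err = sigmoid(p["w"] * fake + p["c"]) - 1.0
        grad_a += err * p["w"] * z  # chain rule through D into G
        grad_b += err * p["w"]
    n = len(noise)
    return {**p, "a": p["a"] - lr * grad_a / n, "b": p["b"] - lr * grad_b / n}


def train(steps=200, batch=32, lr=0.05, seed=0):
    """Alternate one discriminator update and one generator update per step."""
    rng = random.Random(seed)
    p = {"a": 1.0, "b": 0.0, "w": 0.1, "c": 0.0}
    for _ in range(steps):
        real = [rng.gauss(3.0, 1.0) for _ in range(batch)]  # toy "real" data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        p = discriminator_step(p, real, noise, lr)
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        p = generator_step(p, noise, lr)
    return p
```

Each step function returns a new parameter dictionary in which only one network's parameters differ, which mirrors the "fix one, update the other" rule in the text.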
S4: generating an image that matches the feature information of the image input by the user.
After the network model has been trained, the discriminator can no longer tell whether an image produced by the generator comes from real data, so the image produced by the generator in this step is the image the user requires.
The principle of this embodiment is explained again below by way of a metaphor:
The generative model is like a counterfeiter and the discriminative model is like a police officer. The counterfeiter keeps trying to produce counterfeit banknotes good enough to fool the police officer, while the police officer carefully checks whether each banknote is genuine. The two compete against each other until the counterfeit banknotes are so similar to genuine ones that the police officer can no longer tell them apart (this corresponds to step S4 of the embodiment, generating an image that matches the feature information of the image input by the user).
In this embodiment, a discriminative model is added after the generative model (which resembles the decoder in a variational autoencoder) and is used to judge how similar a generated sample is to the samples in the training data. After the neural network model has been trained, the discriminator can no longer tell whether an image produced by the generator comes from real data, so the generator's output is the image the user requires.
The embodiments described above are merely preferred embodiments of the present invention and do not limit its scope; any variation based on the form and principle of the present invention shall fall within its scope of protection.

Claims (5)

1. An image generation method based on machine learning, characterized by comprising the following steps:
S1: acquiring image data and preprocessing the data;
S2: constructing a network model;
S3: training the network model;
S4: generating an image that matches the feature information of the image input by the user.
2. The image generation method based on machine learning according to claim 1, wherein step S1 comprises the following specific steps:
S1-1: acquiring image training data with a web crawler;
S1-2: reading each image, subtracting the mean from it, and calculating the covariance matrix;
S1-3: decorrelating the image data by projecting the mean-subtracted data onto the eigenbasis;
S1-4: whitening, that is, normalizing the scale of the decorrelated data in each dimension using the eigenvalue of that dimension.
3. The image generation method based on machine learning according to claim 1, wherein the network model constructed in step S2 comprises a generative model and a discriminative model;
the generative model consists of deconvolution layers, BN layers and ReLU activation functions; its input is a latent vector z drawn from a standard normal distribution, and its output is an RGB image; x denotes image data, and G(z) denotes the generator function that maps z to data space; the goal of G is to estimate the distribution of the data set and then generate fake samples from the estimated distribution;
the discriminative model consists of convolution layers, BN layers and LeakyReLU activation functions; it takes an image as input, passes it through a series of convolution, BN and LeakyReLU layers, and outputs through a Sigmoid activation function the scalar probability that the input image is real; D(x) is the discriminator network, which outputs the probability that x comes from the training data; if x comes from the training data, D(x) outputs a high probability, and otherwise a low one;
D(G(z)) denotes the probability that the output of G is a real image; D and G play a game against each other, in which D aims to maximize the probability of correctly classifying real and fake samples, log D(x), and G aims to minimize the probability that D predicts its output to be fake, log(1 - D(G(z))).
4. The image generation method based on machine learning according to claim 3, wherein in the constructed network model, the loss function is set to the binary cross entropy l(x, y),
where x is the distribution of the real data, y is the distribution of the output data, and the result l(x, y) is the difference between the x and y distributions, that is, the loss.
5. The image generation method based on machine learning according to claim 1, wherein the network model is trained in step S3 as follows:
1) when training the discriminator, the generator G is first fixed; the generator then produces a sample G(z) from random input to serve as a negative sample, and a positive sample x is drawn from the real data set; the positive and negative samples are fed to the discriminator D, and the error is calculated from the discriminator's output, D(x) or D(G(z)), and the sample label; finally, the discriminator D is updated with the error back-propagation algorithm;
2) when training the generator, the discriminator D is first fixed; the current generator then produces a sample from random input, which is fed to the discriminator D; the error is calculated from the discriminator's output D(G(z)) and the sample label, and finally the parameters of the generator G are updated with the error back-propagation algorithm.
CN201910703906.2A 2019-07-31 2019-07-31 Image generation method based on machine learning and method thereof Pending CN110619347A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910703906.2A CN110619347A (en) 2019-07-31 2019-07-31 Image generation method based on machine learning and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910703906.2A CN110619347A (en) 2019-07-31 2019-07-31 Image generation method based on machine learning and method thereof

Publications (1)

Publication Number Publication Date
CN110619347A true CN110619347A (en) 2019-12-27

Family

ID=68921497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910703906.2A Pending CN110619347A (en) 2019-07-31 2019-07-31 Image generation method based on machine learning and method thereof

Country Status (1)

Country Link
CN (1) CN110619347A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106056141A (en) * 2016-05-27 2016-10-26 哈尔滨工程大学 Target recognition and angle coarse estimation algorithm using space sparse coding
CN107016406A (en) * 2017-02-24 2017-08-04 中国科学院合肥物质科学研究院 The pest and disease damage image generating method of network is resisted based on production
US20190049540A1 (en) * 2017-08-10 2019-02-14 Siemens Healthcare Gmbh Image standardization using generative adversarial networks
CN108960073A (en) * 2018-06-05 2018-12-07 大连理工大学 Cross-module state image steganalysis method towards Biomedical literature
CN108961272A (en) * 2018-07-02 2018-12-07 浙江工业大学 It is a kind of to fight the generation method for generating the skin disease image of network based on depth convolution
CN109190665A (en) * 2018-07-30 2019-01-11 国网上海市电力公司 A kind of general image classification method and device based on semi-supervised generation confrontation network
CN109584337A (en) * 2018-11-09 2019-04-05 暨南大学 A kind of image generating method generating confrontation network based on condition capsule
CN109741328A (en) * 2019-02-02 2019-05-10 东北大学 A kind of automobile apparent mass detection method based on production confrontation network
CN109977955A (en) * 2019-04-03 2019-07-05 南昌航空大学 A kind of precancerous lesions of uterine cervix knowledge method for distinguishing based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIONG XIAOMING,ET AL: "A Resources-Efficient Configurable Accelerator for Deep Convolutional Neural Networks", 《IEEE ACCESS》 *
马春光等: "马春光等", 《信息网络安全》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111260652A (en) * 2020-01-09 2020-06-09 浙江传媒学院 Image generation system and method based on MIMO-GAN
CN111260652B (en) * 2020-01-09 2023-09-08 浙江传媒学院 MIMO-GAN-based image generation system and method
CN111581189A (en) * 2020-03-27 2020-08-25 浙江大学 Completion method and device for air quality detection data loss
CN111489802A (en) * 2020-03-31 2020-08-04 重庆金域医学检验所有限公司 Report coding model generation method, system, device and storage medium
CN113537379A (en) * 2021-07-27 2021-10-22 沈阳工业大学 Three-dimensional matching method based on CGANs
CN113537379B (en) * 2021-07-27 2024-04-16 沈阳工业大学 Three-dimensional matching method based on CGANs
CN114862699A (en) * 2022-04-14 2022-08-05 中国科学院自动化研究所 Face repairing method, device and storage medium based on generation countermeasure network
CN115346091A (en) * 2022-10-14 2022-11-15 深圳精智达技术股份有限公司 Method and device for generating Mura defect image data set

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191227