CN116580121B - Method and system for generating 2D model by single drawing based on deep learning - Google Patents
- Publication number
- CN116580121B (application number CN202310561085.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- point
- generator
- input
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T11/001 — 2D [Two Dimensional] image generation: texturing; colouring; generation of texture or colour
- G06N3/08 — Neural networks: learning methods
- G06T11/206 — 2D image generation: drawing of charts or graphs
- G06T5/70
- G06V10/26 — Image preprocessing: segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region
- G06V10/44 — Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis
- G06V10/56 — Extraction of image or video features relating to colour
- G06V10/764 — Recognition using pattern recognition or machine learning: classification, e.g. of video objects
- G06V10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/82 — Recognition using neural networks
- Y02P90/30 — Climate change mitigation in production or processing of goods: computing systems specially adapted for manufacturing
Abstract
The invention relates to a method and a system for generating a 2D model from a single drawing based on deep learning, comprising the following steps: acquiring a plurality of paintings, classifying them into a plurality of data sets, and preprocessing the data sets; performing image cutting on the images in the preprocessed data sets to obtain an input image x and a target image y; establishing a 2D generation model and training it with the image-cut data set as training samples; optimizing and testing the trained 2D generation model; and inputting a single drawing as the input image into the tested 2D generation model, then performing post-processing on the 2D model output by the generation model. The invention can generate a specific 2D model from a single painting image and can be applied to many fields such as film and animation special effects, game scenes and virtual fitting.
Description
Technical Field
The invention relates to the technical field of drawing modeling, in particular to a method and a system for generating a 2D model by a single drawing based on deep learning.
Background
Deep learning is a machine learning method that trains multi-layer neural networks to automatically extract features from data and perform tasks such as classification and regression.
At present, painting generation algorithms have become one of the research hotspots in the field of artificial intelligence. Using deep learning, high-quality and diverse painting generation can be achieved by learning from and analysing a large number of artwork samples, with broad application prospects.
Among existing painting generation methods, models based on the variational autoencoder (VAE), the generative adversarial network (GAN) and the like have been studied; however, because a single painting image has a complex structure and rich detail textures, these methods are still limited in the realism and diversity of what they generate.
Disclosure of Invention
In order to overcome the technical defects in the prior art, the invention provides a method and a system for generating a 2D model by using a single drawing based on deep learning, which can effectively solve the problems in the background art.
In order to solve the technical problems, the technical scheme provided by the invention is as follows:
in a first aspect, an embodiment of the present invention discloses a method for generating a 2D model from a single drawing based on deep learning, including the following steps:
acquiring a plurality of pictures, classifying the pictures into a plurality of data sets, and preprocessing the data sets;
image cutting is carried out on the images in the preprocessed data set, and an input image x and a target image y are obtained;
establishing a 2D generation model, and training the 2D generation model by taking the data set after image cutting as a training sample;
optimizing and testing the trained 2D generation model;
and inputting the single drawing as an input image into the tested 2D generation model, and performing post-processing operation on the 2D model output by the 2D generation model.
In any of the foregoing aspects, preferably, the step of acquiring a plurality of drawings, classifying the plurality of drawings into a plurality of data sets, and preprocessing the data sets includes the steps of:
performing picture scaling, image enhancement and data cleaning on the data set;
performing edge detection on the image in the data set, and extracting the outline of the painting work;
and extracting color features of the pictorial representation according to the outline of the pictorial representation.
In any of the above aspects, preferably, the edge detection of the images in the dataset and the extraction of the outline of the pictorial representation include:

smoothing the image by a Gaussian filter, using the two-dimensional Gaussian function $G(x,y)=\frac{1}{2\pi\sigma^{2}}\exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$, wherein the distribution parameter $\sigma$ of the Gaussian function controls the degree of smoothing: the smaller $\sigma$ is, the higher the localization accuracy of the filter and the lower its signal-to-noise ratio, and vice versa;

by the formulas:

$G_{x}(i,j)=\dfrac{\partial I(i,j)}{\partial x};$

$G_{y}(i,j)=\dfrac{\partial I(i,j)}{\partial y};$

$G(i,j)=\sqrt{G_{x}(i,j)^{2}+G_{y}(i,j)^{2}},\qquad \theta(i,j)=\arctan\left(\dfrac{G_{y}(i,j)}{G_{x}(i,j)}\right);$

computing the gradient magnitude $G$ and gradient direction $\theta$ at each point of the pictorial representation I, where $G_{x}(i,j)$ and $G_{y}(i,j)$ are the partial derivatives of point $(i,j)$ in the x and y directions;

taking point $(i,j)$ as the center of its neighborhood, comparing its gradient value $G(i,j)$ with the gradient values of the neighborhood points along the direction $\theta(i,j)$; the point $(i,j)$ at which the gradient attains its local maximum is taken as a candidate edge point, otherwise it is a non-edge point; a candidate edge image K is thereby obtained;

setting a high threshold $T_{h}$ and a low threshold $T_{l}$ and examining each candidate edge point $(i,j)$: if its gradient value satisfies $G(i,j)>T_{h}$, the point is judged to be an edge point; if $G(i,j)<T_{l}$, the point is not an edge point; if $T_{l}<G(i,j)<T_{h}$, it is judged whether an edge point exists in the neighborhood of the point: if so, the point is an edge point, otherwise it is not.
In any of the above aspects, preferably, the extracting of the color features of the pictorial representation according to its outline includes:

by the formulas:

$\mu_{ci}=\dfrac{1}{N}\sum_{j=1}^{N}P_{ci,j};$

$\sigma_{ci}=\left(\dfrac{1}{N}\sum_{j=1}^{N}\left(P_{ci,j}-\mu_{ci}\right)^{2}\right)^{1/2};$

$\xi_{ci}=\left(\dfrac{1}{N}\sum_{j=1}^{N}\left(P_{ci,j}-\mu_{ci}\right)^{3}\right)^{1/3};$

calculating the first moment $\mu_{ci}$, second moment $\sigma_{ci}$ and third moment $\xi_{ci}$ of the colors in the pictorial representation, where N is the number of pixels in the image, $P_{ci,j}$ is the probability of occurrence of the j-th pixel whose color value is C and whose color component is i, and C is the number of colors included in the pictorial representation; each color has three components, and each component has moments up to the third order;

by the formula:

$P_{C}=(\mu_{cr},\sigma_{cr},\xi_{cr},\mu_{cg},\sigma_{cg},\xi_{cg},\mu_{cb},\sigma_{cb},\xi_{cb}),$

generating a set of color features $P_{C}$ from the color features of each pictorial representation.
In any of the above aspects, preferably, performing image cutting on the images in the preprocessed dataset to obtain an input image x and a target image y includes the following steps:

randomly selecting a scaling factor s from the range [0.5, 1] and scaling the image to s × its original resolution;

randomly selecting, in the scaled image, a window whose width and height equal the original resolution as the cutting area;

cutting the image into a plurality of small blocks according to the cutting area, where the input image x is a randomly cut small-block image and the target image y is the small-block image at the same position as x in the corresponding original image.
In any of the above schemes, preferably, establishing a 2D generation model and training it with the image-cut dataset as training samples includes the following steps:

building a generator G and a discriminator D to establish the 2D generation model;

defining a loss function comprising a generator loss and a discriminator loss;

based on the image-cut dataset, alternately training the generator and the discriminator so that the loss function decreases, completing the training of the 2D generation model.
In any of the above schemes, preferably, the generator is defined as:

$w=\mathrm{MLP}(z)\in\mathbb{R}^{d_{latent}},\qquad (w,y_{1})=G_{proj}(w,y_{1})\in\mathbb{R}^{d_{latent}},\qquad x_{1}=G(w,y_{1})\in\mathbb{R}^{C\times H\times W},$

where z is the input low-dimensional noise vector; MLP is a multi-layer perceptron mapping the input noise vector z to a style vector w in the latent space; $d_{latent}$ is the dimension of the latent space; $G_{proj}$ is a learnable projection layer merging the style vector w with the condition information $y_{1}$ into a new vector $(w,y_{1})$; G is the generator model, composed of a plurality of Generator Blocks, which converts the input style vector $(w,y_{1})$ into a high-resolution image $x_{1}$; and C, H and W are the number of channels, the height and the width of the image, respectively.

The discriminator is defined as:

$D(x)=\sum_{i}a_{i}\,D_{i}\left(S_{i}(x)\right),$

where x is the input image, from which feature maps of different resolutions are obtained by convolution and downsampling; $S_{i}(x)$ is the feature map obtained after the i-th layer of convolution and downsampling of the input image x; $D_{i}$ is the i-th layer discriminator model, which classifies and scores the feature map of the i-th layer; and $a_{i}$ is the weight coefficient of the corresponding discriminator $D_{i}$, controlling the contribution of each layer to the final classification result.

In any of the above schemes, preferably, the generator loss is:

$L_{G}=-\mathbb{E}_{(w,y_{1})\sim p_{data}}\left[\log D\left(G(w,y_{1})\right)\right],$

where $p_{data}$ is the distribution of the latent vectors and condition information in the input training set; D is the discriminator model used to evaluate whether an input image is a real image; $\log D(G(w,y_{1}))$ is the logarithm of the probability output by the discriminator D when the image produced by the generator G is taken as input; $\mathbb{E}$ denotes summing and averaging over the latent vectors and condition information of all input training samples; and $L_{G}$ is the loss function of the generator G.

The discriminator loss is:

$L_{D}=\mathbb{E}_{x\sim p_{data}(x)}\left[\max\left(0,\,a-D(x)\right)\right]+\mathbb{E}_{(z,y_{1})\sim p_{G}}\left[\max\left(0,\,D\left(G(z,y_{1})\right)-b\right)\right],$

where $p_{data}(x)$ is the distribution of real images; $p_{G}(z,y_{1})$ is the distribution of images generated by the generator G, with z the input noise vector and $y_{1}$ the condition information; $D(x)$ is the output of the discriminator after a real image x is given as input; $D(G(z,y_{1}))$ is the output of the discriminator after a generated image is given as input; a and b are the thresholds with which the discriminator attempts to distinguish real images from generated images; $\mathbb{E}$ denotes the expectation, i.e. summing and averaging over all input real and generated images; and $L_{D}$ is the loss function of the discriminator D.
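The weighted multi-scale aggregation defined above, $D(x)=\sum_i a_i D_i(S_i(x))$, can be sketched in NumPy; the 2×2 average pooling standing in for the convolution-and-downsampling layers $S_i$ and the toy per-scale scorers are illustrative assumptions, not the patent's actual layers:

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling: a stand-in for one conv + downsampling stage S_i."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 4.0

def multiscale_score(img, scorers, weights):
    """D(x) = sum_i a_i * D_i(S_i(x)): each scorer sees a lower-resolution map."""
    total, feat = 0.0, img
    for d_i, a_i in zip(scorers, weights):
        feat = avg_pool2(feat)       # feature map at the next-lower resolution
        total += a_i * d_i(feat)     # weighted per-scale classification score
    return total
```

With convolutional scorers in place of the toy functions, this reproduces the layer-weighted scoring described for $D_i$ and $a_i$.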
In any of the above schemes, preferably, completing the training of the 2D generation model by alternately training the generator and the discriminator so that the loss function decreases, based on the image-cut dataset, includes the following steps:

initializing the generator and the discriminator: the generator G and the discriminator D are initialized as random functions;

preparing the training data: mini-batch data are randomly sampled from the real dataset, half being real images and the other half being fake images produced by the generator G;

forward propagation and backward propagation: the sampled mini-batch data are fed into the generator G to obtain fake images, the real and fake images are each fed into the discriminator D to compute the discrimination probabilities, and the parameters of the generator and the discriminator are then updated by back-propagation according to the loss function;

calculating the loss function: during forward and backward propagation, the value of the loss function is calculated and recorded;

repeating the steps from initializing the generator and discriminator through calculating the loss function, so that the loss function decreases through the alternate training of the generator and the discriminator.
In a second aspect, an embodiment of the present invention discloses a system for generating a 2D model from a single drawing based on deep learning, the system comprising:
the acquisition module is used for acquiring a plurality of pictures classified into a plurality of data sets and preprocessing the data sets;
the processing module is used for cutting the image in the preprocessed data set to obtain an input image x and a target image y;
the generating module is used for establishing a 2D generating model, and training the 2D generating model by taking the data set after image cutting as a training sample;
the optimization module is used for optimizing and testing the trained 2D generation model;
the input module is used for inputting the single drawing as an input image into the tested 2D generation model and performing post-processing operation on the 2D model output by the 2D generation model.
Compared with the prior art, the invention has the following beneficial effects:

the method for generating a 2D model from a single drawing based on deep learning can generate a specific 2D model from a single painting image, and can be applied to many fields such as film and animation special effects, game scenes and virtual fitting;

by introducing a pattern noise layer and a projection layer, the diversity of the generator and the detail of the images are increased, improving the realism and artistry of the generated images;

regularization techniques and hyper-parameter tuning are also applied during model training, further improving the stability of the generator and the quality of the images.
Drawings
The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification.
FIG. 1 is a flow chart of a method of generating a 2D model from a single drawing based on deep learning in accordance with the present invention;
FIG. 2 is a block diagram of a system for generating a 2D model based on a single drawing for deep learning in accordance with the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It will be understood that when an element is referred to as being "mounted" or "disposed" on another element, it can be directly on the other element or be indirectly on the other element. When an element is referred to as being "connected to" another element, it can be directly connected to the other element or be indirectly connected to the other element.
In the description of the present invention, it should be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate describing the present invention and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and therefore should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In order to better understand the above technical scheme, the following detailed description of the technical scheme of the present invention will be given with reference to the accompanying drawings of the specification and the specific embodiments.
As shown in fig. 1, the invention provides a method for generating a 2D model by a single drawing based on deep learning, which comprises the following steps:
step 1, acquiring a plurality of paintings classified into a plurality of data sets, and preprocessing the data sets;
step 2, image cutting is carried out on the images in the preprocessed data set, and an input image x and a target image y are obtained;
step 3, a 2D generation model is established, and the data set after image cutting is used as a training sample to train the 2D generation model;
step 4, optimizing and testing the trained 2D generation model;
and 5, inputting the single drawing as an input image into the tested 2D generation model, and performing post-processing operation on the 2D model output by the 2D generation model.
Specifically, step 1 obtains a plurality of drawings classified into a plurality of data sets, and preprocesses the data sets, including the steps of:
step 11, performing picture scaling, image enhancement and data cleaning on the data set;
step 12, carrying out edge detection on the image in the data set, and extracting the outline of the painting work;
and 13, extracting color features of the painting according to the outline of the painting.
Specifically, step 12, performing edge detection on the images in the dataset and extracting the outline of the pictorial representation, includes:

step 121, smoothing the image by a Gaussian filter, using the two-dimensional Gaussian function $G(x,y)=\frac{1}{2\pi\sigma^{2}}\exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$, wherein the distribution parameter $\sigma$ of the Gaussian function controls the degree of smoothing: the smaller $\sigma$ is, the higher the localization accuracy of the filter and the lower its signal-to-noise ratio, and vice versa;

step 122, by the formulas:

$G_{x}(i,j)=\dfrac{\partial I(i,j)}{\partial x};$

$G_{y}(i,j)=\dfrac{\partial I(i,j)}{\partial y};$

$G(i,j)=\sqrt{G_{x}(i,j)^{2}+G_{y}(i,j)^{2}},\qquad \theta(i,j)=\arctan\left(\dfrac{G_{y}(i,j)}{G_{x}(i,j)}\right);$

computing the gradient magnitude $G$ and gradient direction $\theta$ at each point of the pictorial representation I, where $G_{x}(i,j)$ and $G_{y}(i,j)$ are the partial derivatives of point $(i,j)$ in the x and y directions;

step 123, taking point $(i,j)$ as the center of its neighborhood, comparing its gradient value $G(i,j)$ with the gradient values of the neighborhood points along the direction $\theta(i,j)$; the point $(i,j)$ at which the gradient attains its local maximum is taken as a candidate edge point, otherwise it is a non-edge point; a candidate edge image K is thereby obtained;

step 124, setting a high threshold $T_{h}$ and a low threshold $T_{l}$ and examining each candidate edge point $(i,j)$: if its gradient value satisfies $G(i,j)>T_{h}$, the point is judged to be an edge point; if $G(i,j)<T_{l}$, the point is not an edge point; if $T_{l}<G(i,j)<T_{h}$, it is judged whether an edge point exists in the neighborhood of the point: if so, the point is an edge point, otherwise it is not.
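Steps 121 to 124 describe the classic Canny pipeline. A minimal NumPy sketch of the smoothing kernel, the gradient computation and the double-threshold hysteresis might look as follows; the kernel size, the central-difference derivative and the single-sweep hysteresis are illustrative simplifications, not details taken from the patent:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """2-D Gaussian G(x,y) = exp(-(x^2+y^2)/(2*sigma^2)) / (2*pi*sigma^2)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return k / k.sum()  # normalise so image brightness is preserved

def gradients(img):
    """Central-difference approximations of the x/y partial derivatives."""
    gx, gy = np.zeros_like(img), np.zeros_like(img)
    gx[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0
    gy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0
    mag = np.hypot(gx, gy)         # gradient magnitude G(i,j)
    theta = np.arctan2(gy, gx)     # gradient direction theta(i,j)
    return mag, theta

def hysteresis(mag, t_low, t_high):
    """Double thresholding: strong points are edges; weak points survive
    only if an 8-neighbour is already an edge point (one sweep here;
    a full implementation iterates until stable)."""
    strong = mag > t_high
    weak = (mag > t_low) & ~strong
    edges = strong.copy()
    for i in range(1, mag.shape[0] - 1):
        for j in range(1, mag.shape[1] - 1):
            if weak[i, j] and edges[i-1:i+2, j-1:j+2].any():
                edges[i, j] = True
    return edges
```

Non-maximum suppression along $\theta(i,j)$ (step 123) would sit between the gradient and hysteresis stages.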
Specifically, step 13, extracting color features of the pictorial representation according to its outline, includes:

step 131, by the formulas:

$\mu_{ci}=\dfrac{1}{N}\sum_{j=1}^{N}P_{ci,j};$

$\sigma_{ci}=\left(\dfrac{1}{N}\sum_{j=1}^{N}\left(P_{ci,j}-\mu_{ci}\right)^{2}\right)^{1/2};$

$\xi_{ci}=\left(\dfrac{1}{N}\sum_{j=1}^{N}\left(P_{ci,j}-\mu_{ci}\right)^{3}\right)^{1/3};$

calculating the first moment $\mu_{ci}$, second moment $\sigma_{ci}$ and third moment $\xi_{ci}$ of the colors in the pictorial representation, where N is the number of pixels in the image, $P_{ci,j}$ is the probability of occurrence of the j-th pixel whose color value is C and whose color component is i, and C is the number of colors included in the pictorial representation; each color has three components, and each component has moments up to the third order;

by the formula:

$P_{C}=(\mu_{cr},\sigma_{cr},\xi_{cr},\mu_{cg},\sigma_{cg},\xi_{cg},\mu_{cb},\sigma_{cb},\xi_{cb}),$

generating a set of color features $P_{C}$ from the color features of each pictorial representation.
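The three colour moments above can be sketched per channel with NumPy. Treating $P_{ci}$ simply as the raw channel values, rather than occurrence probabilities, is a simplifying assumption made here for illustration:

```python
import numpy as np

def color_moments(img):
    """Return the 9-dim feature vector (mu, sigma, xi) per R, G, B channel.
    img: H x W x 3 array of colour values."""
    feats = []
    for c in range(3):
        p = img[:, :, c].astype(float).ravel()
        mu = p.mean()                              # first moment
        sigma = np.sqrt(np.mean((p - mu) ** 2))    # second moment
        third = np.mean((p - mu) ** 3)             # signed third central moment
        xi = np.cbrt(third)                        # cube root keeps the sign
        feats.extend([mu, sigma, xi])
    return np.array(feats)
```

The concatenated vector corresponds to $P_C$ above, ordered R, G, B.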
Specifically, in step 2, performing image cutting on the images in the preprocessed dataset to obtain an input image x and a target image y includes the following steps:

step 21, randomly selecting a scaling factor s from the range [0.5, 1] and scaling the image to s × its original resolution;

step 22, randomly selecting, in the scaled image, a window whose width and height equal the original resolution as the cutting area;

step 23, cutting the image into a plurality of small blocks according to the cutting area, where the input image x is a randomly cut small-block image and the target image y is the small-block image at the same position as x in the corresponding original image.
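Steps 21 to 23 can be sketched as follows; the nearest-neighbour rescaling and the fixed patch size standing in for "the original resolution" of the window are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def nearest_resize(img, s):
    """Nearest-neighbour rescale of an H x W image by factor s."""
    h, w = img.shape[:2]
    nh, nw = max(1, int(h * s)), max(1, int(w * s))
    rows = (np.arange(nh) / s).astype(int).clip(0, h - 1)
    cols = (np.arange(nw) / s).astype(int).clip(0, w - 1)
    return img[rows][:, cols]

def random_cut_pair(original, patch=8):
    """Return (x, y): x is a patch cut from a randomly rescaled copy,
    y is the patch at the same position in the original image."""
    s = rng.uniform(0.5, 1.0)            # scaling factor s in [0.5, 1]
    scaled = nearest_resize(original, s)
    h, w = scaled.shape[:2]
    i = rng.integers(0, h - patch + 1)
    j = rng.integers(0, w - patch + 1)
    x = scaled[i:i + patch, j:j + patch]
    y = original[i:i + patch, j:j + patch]   # same position in the original
    return x, y
```

In practice bilinear interpolation and per-batch random seeds would replace the toy choices here.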
Specifically, in step 3, establishing a 2D generation model and training it with the image-cut dataset as training samples includes the following steps:

step 31, building a generator G and a discriminator D to establish the 2D generation model;

step 32, defining a loss function, comprising the generator loss and the discriminator loss;

step 33, based on the image-cut dataset, alternately training the generator and the discriminator so that the loss function decreases, completing the training of the 2D generation model.
Further, the generator is defined as:

$w=\mathrm{MLP}(z)\in\mathbb{R}^{d_{latent}},\qquad (w,y_{1})=G_{proj}(w,y_{1})\in\mathbb{R}^{d_{latent}},\qquad x_{1}=G(w,y_{1})\in\mathbb{R}^{C\times H\times W},$

where z is the input low-dimensional noise vector; MLP is a multi-layer perceptron mapping the input noise vector z to a style vector w in the latent space; $d_{latent}$ is the dimension of the latent space; $G_{proj}$ is a learnable projection layer merging the style vector w with the condition information $y_{1}$ into a new vector $(w,y_{1})$; G is the generator model, composed of a plurality of Generator Blocks, which converts the input style vector $(w,y_{1})$ into a high-resolution image $x_{1}$; and C, H and W are the number of channels, the height and the width of the image, respectively.

Further, the discriminator is defined as:

$D(x)=\sum_{i}a_{i}\,D_{i}\left(S_{i}(x)\right),$

where x is the input image, from which feature maps of different resolutions are obtained by convolution and downsampling; $S_{i}(x)$ is the feature map obtained after the i-th layer of convolution and downsampling of the input image x; $D_{i}$ is the i-th layer discriminator model, which classifies and scores the feature map of the i-th layer; and $a_{i}$ is the weight coefficient of the corresponding discriminator $D_{i}$, controlling the contribution of each layer to the final classification result.

Further, the generator loss is:

$L_{G}=-\mathbb{E}_{(w,y_{1})\sim p_{data}}\left[\log D\left(G(w,y_{1})\right)\right],$

where $p_{data}$ is the distribution of the latent vectors and condition information in the input training set; D is the discriminator model used to evaluate whether an input image is a real image; $\log D(G(w,y_{1}))$ is the logarithm of the probability output by the discriminator D when the image produced by the generator G is taken as input; $\mathbb{E}$ denotes summing and averaging over the latent vectors and condition information of all input training samples; and $L_{G}$ is the loss function of the generator G.

Further, the discriminator loss is:

$L_{D}=\mathbb{E}_{x\sim p_{data}(x)}\left[\max\left(0,\,a-D(x)\right)\right]+\mathbb{E}_{(z,y_{1})\sim p_{G}}\left[\max\left(0,\,D\left(G(z,y_{1})\right)-b\right)\right],$

where $p_{data}(x)$ is the distribution of real images; $p_{G}(z,y_{1})$ is the distribution of images generated by the generator G, with z the input noise vector and $y_{1}$ the condition information; $D(x)$ is the output of the discriminator after a real image x is given as input; $D(G(z,y_{1}))$ is the output of the discriminator after a generated image is given as input; a and b are the thresholds with which the discriminator attempts to distinguish real images from generated images; $\mathbb{E}$ denotes the expectation, i.e. summing and averaging over all input real and generated images; and $L_{D}$ is the loss function of the discriminator D.
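At the batch level, the two losses can be sketched with NumPy as follows; reading a and b as hinge-style thresholds is one plausible interpretation of the text, and the default values are assumptions for illustration:

```python
import numpy as np

def generator_loss(d_fake_probs):
    """L_G = -E[log D(G(w, y1))]; d_fake_probs are discriminator
    probabilities on generated images, each in (0, 1)."""
    return -np.mean(np.log(d_fake_probs))

def discriminator_loss(d_real, d_fake, a=1.0, b=-1.0):
    """Hinge-style reading of L_D with thresholds a (real) and b (fake):
    real scores are pushed above a, fake scores below b."""
    real_term = np.mean(np.maximum(0.0, a - d_real))
    fake_term = np.mean(np.maximum(0.0, d_fake - b))
    return real_term + fake_term
```

When real scores already exceed a and fake scores already fall below b, the discriminator loss is zero and no gradient flows, which is the usual motivation for hinge thresholds.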
Specifically, step 33, making the loss function decrease by alternately training the generator and the discriminator based on the data set after image cutting, to complete the training of the 2D generation model, comprises the following steps:
step 331, initializing the generator and the discriminator: initializing a generator G and a discriminator D as random functions;
step 332, preparing training data: randomly sampling a mini-batch from the real data set, half of which are real images and the other half false images generated by the generator G;
step 333, forward propagation and backward propagation: inputting the sampled mini-batch into the generator G to obtain false images, inputting the real and false images separately into the discriminator D to calculate discrimination probabilities, and then back-propagating and updating the parameters of the generator and the discriminator according to the loss function;
step 334, calculating the loss function: during forward and backward propagation, the value of the loss function is calculated and recorded;
step 335, repeating the steps from initializing the generator and the discriminator through calculating the loss function, so that the loss function decreases by alternately training the generator and the discriminator.
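The alternating schedule of steps 331 through 335 can be sketched as below. The update callbacks here are deliberately trivial stand-ins that only decay and record a loss value; a real implementation would sample mini-batches and backpropagate through G and D inside them.

```python
def train_gan(update_d, update_g, steps):
    """Alternate one discriminator update and one generator update per step,
    recording both losses so their descent can be monitored (step 334)."""
    history = []
    for _ in range(steps):
        d_loss = update_d()  # steps 332-333 for the discriminator
        g_loss = update_g()  # generator update against the refreshed D
        history.append((d_loss, g_loss))
    return history

# Toy stand-in updates: each call shrinks its loss by 10%.
state = {"d": 1.0, "g": 1.0}

def fake_update_d():
    state["d"] *= 0.9
    return state["d"]

def fake_update_g():
    state["g"] *= 0.9
    return state["g"]
```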
Specifically, step 4 optimizes and tests the trained 2D generation model, including:
step 41, adjusting and optimizing the super parameters;
step 42, using regularization techniques to avoid overfitting;
step 43, introducing a pattern noise layer and a projection layer into the generator to increase generation diversity and image detail;
step 44, testing the model by using the test set, and evaluating the effect of generating a vivid 2D model;
step 45, evaluating the generated image using various quality metrics, such as diversity, realism and sharpness.
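As one concrete example of the sharpness metric mentioned in step 45, the variance of a Laplacian response is a common proxy; this sketch assumes a grayscale image stored as a 2D float array and is illustrative only, not the patent's evaluation procedure.

```python
import numpy as np

def sharpness(img):
    """Variance of a 4-neighbour Laplacian over the image interior:
    flat images score 0, images with strong edges score higher."""
    lap = (-4.0 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]    # vertical neighbours
           + img[1:-1, :-2] + img[1:-1, 2:])   # horizontal neighbours
    return float(lap.var())
```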
Specifically, step 5 inputs the single drawing as an input image into the tested 2D generation model and performs post-processing operations on the 2D model output by the 2D generation model, including:
the vector z of the single drawing is input into the generator G, the generated realistic 2D image is output, and post-processing operations such as denoising and brightness adjustment are performed to improve its realism and artistry.
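A minimal sketch of such post-processing, assuming a box filter for denoising and a linear gain for brightness (both chosen for illustration; the text does not specify the exact operators), on images stored as float arrays in [0, 1]:

```python
import numpy as np

def box_denoise(img, k=3):
    """k x k box-filter smoothing with edge padding, a simple denoiser."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):          # accumulate the k*k shifted copies
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def adjust_brightness(img, gain=1.1, bias=0.0):
    """Linear brightness adjustment, clipped back to the [0, 1] range."""
    return np.clip(gain * img + bias, 0.0, 1.0)
```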
As shown in fig. 2, the present invention further provides a system for generating a 2D model from a single drawing based on deep learning, the system comprising:
the acquisition module is used for acquiring a plurality of pictures classified into a plurality of data sets and preprocessing the data sets;
the processing module is used for cutting the image in the preprocessed data set to obtain an input image x and a target image y;
the generating module is used for establishing a 2D generating model, and training the 2D generating model by taking the data set after image cutting as a training sample;
the optimization module is used for optimizing and testing the trained 2D generation model;
the input module is used for inputting the single drawing as an input image into the tested 2D generation model and performing post-processing operation on the 2D model output by the 2D generation model.
Compared with the prior art, the invention has the beneficial effects that:
the method for generating a 2D model from a single drawing based on deep learning can generate a specific 2D model from a single drawing image, and can be applied to many fields such as film and animation special effects, game scenes and virtual fitting;
by introducing the pattern noise layer and the projection layer, the diversity of the generator and the image detail are increased, and the sense of reality and the artistry of the generated image are improved;
regularization technology and super-parameter adjustment optimization are also applied in model training, so that stability and image quality of the generator can be further improved.
The above is only a preferred embodiment of the present invention, and the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or substitute equivalents for some of the technical features. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
Claims (4)
1. A method for generating a 2D model from a single drawing based on deep learning, characterized by comprising the following steps:
acquiring a plurality of paintings, classifying the paintings into a plurality of data sets, and preprocessing the data sets, wherein the method comprises the following steps of:
performing picture scaling, image enhancement and data cleaning on the data set;
edge detection is carried out on the image in the data set, and the outline of the painting work is extracted, and the method comprises the following steps:
smoothing the image by a Gaussian filter, using the two-dimensional Gaussian function G(x_1, y_1) = (1/(2πσ^2)) exp(-(x_1^2 + y_1^2)/(2σ^2)) to smooth the image, wherein the degree of smoothing is controlled by the distribution parameter σ of the Gaussian function: the smaller σ is, the higher the positioning precision of the filter and the lower the signal-to-noise ratio, and vice versa; x_1 and y_1 are respectively the abscissa and ordinate of a coordinate point on the two-dimensional plane;
by the formulas G(i, j) = sqrt(G_x(i, j)^2 + G_y(i, j)^2) and θ(i, j) = arctan(G_y(i, j)/G_x(i, j)),
calculating a gradient magnitude G and a gradient direction θ for each point in pictorial representation I, where G_x(i, j) and G_y(i, j) are the partial derivatives of point (i, j) in the x and y directions, respectively;
taking the point (i, j) as a neighborhood center point, comparing the gradient value G(i, j) with those of the other points along the direction θ(i, j) within the neighborhood; if G(i, j) is the largest, the point (i, j) is taken as a candidate edge point, otherwise it is a non-edge point, thereby obtaining a candidate edge image K;
setting a high threshold T_h and a low threshold T_l, and detecting any point (i, j) among the obtained candidate edge points: if the gradient value G(i, j) > T_h, the point is judged to be an edge point; if G(i, j) < T_l, the point is not an edge point; if T_l < G(i, j) < T_h, judging whether an edge point exists in the neighborhood of the point, and if so, the point is judged to be an edge point, otherwise it is not an edge point;
extracting color features of the pictorial representation from the outline of the pictorial representation, comprising:
by the formulas:
μ_{c,i1} = (1/N) Σ_{j=1}^{N} P_{c,i1}(j), σ_{c,i1} = ((1/N) Σ_{j=1}^{N} (P_{c,i1}(j) - μ_{c,i1})^2)^{1/2}, ζ_{c,i1} = ((1/N) Σ_{j=1}^{N} (P_{c,i1}(j) - μ_{c,i1})^3)^{1/3},
calculating the first moment μ_{c,i1}, second moment σ_{c,i1} and third moment ζ_{c,i1} of the colors in the pictorial representation; wherein N is the number of pixels in the image, P_{c,i1} is the probability of occurrence of a pixel with color value c and color component i1, and C is the number of colors contained in the pictorial representation, each color having three components and each component having moments up to the third order;
combining the above color features of each pictorial representation to generate a color feature set P_C;
Image cutting is carried out on the images in the preprocessed data set, and an input image sx and a target image sy are obtained, wherein the method comprises the following steps:
randomly selecting a scaling factor s from the range [0.5, 1] and scaling the image to s times the original resolution;
randomly selecting from the scaled image a window whose width and height are those of the original resolution, as the cutting area;
cutting the image into a plurality of small blocks according to the cutting area, wherein the input image sx is a randomly cut small-block image and the target image sy is the small-block image at the same position in the corresponding original image;
establishing a 2D generation model, and training the 2D generation model by taking the data set after image cutting as a training sample, wherein the method comprises the following steps of:
building a generator GG and a discriminator D, and building a 2D generation model;
defining a loss function comprising a generator loss and a discriminator loss;
based on the data set after image cutting, making the loss function decrease by alternately training the generator and the discriminator, and completing the training of the 2D generation model; the generator is defined as:
w = MLP(z) ∈ R^{dlatent}, w' = GG_proj(w, y_1) ∈ R^{dlatent}, x_1 = GG(w', y_1) ∈ R^{CC×H×W},
where z represents the input low-dimensional noise vector; MLP represents the multi-layer perceptron used to map the input noise vector z to a pattern vector w in the latent space; dlatent represents the dimension of the latent space; GG_proj represents a learnable projection layer that merges the pattern vector w and the condition information y_1 into a new vector w' of dimension dlatent; GG represents the generator model, consisting of a plurality of Generator Blocks, which converts the input pattern vector w' into a high-resolution image x_1; and CC, H and W represent the channel number, height and width of the image, respectively;
the discriminator is defined as D(sx) = Σ_{i2} a_{i2} · D_{i2}(S_{i2}(sx));
wherein sx represents the input image, from which feature maps at different resolutions are obtained after convolution and downsampling; S_{i2}(sx) represents the feature map obtained after the i2-th layer of convolution and downsampling of the input image sx; D_{i2} represents the i2-th layer discriminator model, which classifies and scores the feature map of layer i2; and a_{i2} represents the weight coefficient of the corresponding discriminator D_{i2}, controlling the contribution of each layer to the final classification result;
optimizing and testing the trained 2D generation model;
and inputting the single drawing as an input image into the tested 2D generation model, and performing post-processing operation on the 2D model output by the 2D generation model.
2. The method for generating a 2D model from a single drawing based on deep learning according to claim 1, characterized in that:
the generator loss is:
L_G = -E_{(w,y_1)~pdata}[log D(GG(w, y_1))]; wherein pdata represents the distribution of latent vectors and condition information in the input training set; D represents the discriminator model used to evaluate whether an input image is a real image; log D(GG(w, y_1)) represents the logarithm of the probability output by the discriminator D when the image produced by the generator GG is taken as its input; E represents the operation of summing and averaging over the latent vectors and condition information of all input training sets; and L_G represents the loss function of the generator GG;
the discriminator loss is:
L_D = E1_{sx~pdata(sx)}[(D(sx) - b1)^2] + E1_{(z,y_1)~pGG}[(D(GG(z, y_1)) - a)^2]; wherein pdata(sx) represents the distribution of real images; pGG(z, y_1) represents the distribution of images produced by the generator GG, where z represents the input noise vector and y_1 represents the condition information; D(sx) represents the probability output by the discriminator when the image sx is taken as input; D(GG(z, y_1)) represents the probability output by the discriminator when the image produced by the generator GG is taken as input; a and b1 represent the thresholds with which the discriminator attempts to distinguish real images from generated images; E1 represents the expectation, that is, the operation of summing and averaging over all input real images and generated images; and L_D represents the loss function of the discriminator D.
3. The method for generating a 2D model from a single drawing based on deep learning according to claim 2, characterized in that: based on the data set after image cutting, making the loss function decrease by alternately training the generator and the discriminator to complete the training of the 2D generation model comprises the following steps:
initializing the generator and the discriminator: initializing a generator GG and a discriminator D as random functions;
preparing training data: randomly sampling mini-batch data from a real data set, wherein half of the mini-batch data are real images, and the other half of the mini-batch data are false images generated by a generator GG;
forward propagation and backward propagation: inputting sampled mini-batch data into a generator GG to obtain false images, respectively inputting the true and false images into a discriminator D to calculate discrimination probability, and then carrying out back propagation and parameter updating on the generator and the discriminator according to a loss function;
calculating a loss function: during the forward and backward propagation, the value of the loss function needs to be calculated and recorded;
repeating the steps from initializing the generator and the discriminator through calculating the loss function, so that the loss function decreases by alternately training the generator and the discriminator.
4. A system for generating a 2D model from a single drawing based on deep learning, characterized in that the system comprises:
the acquisition module is used for acquiring a plurality of paintings classified into a plurality of data sets and preprocessing the data sets, and comprises the following steps of:
performing picture scaling, image enhancement and data cleaning on the data set;
edge detection is carried out on the image in the data set, and the outline of the painting work is extracted, and the method comprises the following steps:
smoothing the image by a Gaussian filter, using the two-dimensional Gaussian function G(x_1, y_1) = (1/(2πσ^2)) exp(-(x_1^2 + y_1^2)/(2σ^2)) to smooth the image, wherein the degree of smoothing is controlled by the distribution parameter σ of the Gaussian function: the smaller σ is, the higher the positioning precision of the filter and the lower the signal-to-noise ratio, and vice versa; x_1 and y_1 are respectively the abscissa and ordinate of a coordinate point on the two-dimensional plane;
by the formulas G(i, j) = sqrt(G_x(i, j)^2 + G_y(i, j)^2) and θ(i, j) = arctan(G_y(i, j)/G_x(i, j)),
calculating a gradient magnitude G and a gradient direction θ for each point in pictorial representation I, where G_x(i, j) and G_y(i, j) are the partial derivatives of point (i, j) in the x and y directions, respectively;
taking the point (i, j) as a neighborhood center point, comparing the gradient value G(i, j) with those of the other points along the direction θ(i, j) within the neighborhood; if G(i, j) is the largest, the point (i, j) is taken as a candidate edge point, otherwise it is a non-edge point, thereby obtaining a candidate edge image K;
setting a high threshold T_h and a low threshold T_l, and detecting any point (i, j) among the obtained candidate edge points: if the gradient value G(i, j) > T_h, the point is judged to be an edge point; if G(i, j) < T_l, the point is not an edge point; if T_l < G(i, j) < T_h, judging whether an edge point exists in the neighborhood of the point, and if so, the point is judged to be an edge point, otherwise it is not an edge point;
extracting color features of the pictorial representation from the outline of the pictorial representation, comprising:
by the formulas:
μ_{c,i1} = (1/N) Σ_{j=1}^{N} P_{c,i1}(j), σ_{c,i1} = ((1/N) Σ_{j=1}^{N} (P_{c,i1}(j) - μ_{c,i1})^2)^{1/2}, ζ_{c,i1} = ((1/N) Σ_{j=1}^{N} (P_{c,i1}(j) - μ_{c,i1})^3)^{1/3},
calculating the first moment μ_{c,i1}, second moment σ_{c,i1} and third moment ζ_{c,i1} of the colors in the pictorial representation; wherein N is the number of pixels in the image, P_{c,i1} is the probability of occurrence of a pixel with color value c and color component i1, and C is the number of colors contained in the pictorial representation, each color having three components and each component having moments up to the third order;
combining the above color features of each pictorial representation to generate a color feature set P_C;
The processing module is used for cutting the image in the preprocessed data set to obtain an input image sx and a target image sy, and comprises the following steps:
randomly selecting a scaling factor s from the range [0.5, 1] and scaling the image to s times the original resolution;
randomly selecting from the scaled image a window whose width and height are those of the original resolution, as the cutting area;
cutting the image into a plurality of small blocks according to the cutting area, wherein the input image sx is a randomly cut small-block image and the target image sy is the small-block image at the same position in the corresponding original image;
the generating module is used for establishing a 2D generating model, taking the data set after image cutting as a training sample, and training the 2D generating model, and comprises the following steps:
building a generator GG and a discriminator D, and building a 2D generation model;
defining a loss function comprising a generator loss and a discriminator loss;
based on the data set after image cutting, making the loss function decrease by alternately training the generator and the discriminator, and completing the training of the 2D generation model; the generator is defined as:
w = MLP(z) ∈ R^{dlatent}, w' = GG_proj(w, y_1) ∈ R^{dlatent}, x_1 = GG(w', y_1) ∈ R^{CC×H×W},
where z represents the input low-dimensional noise vector; MLP represents the multi-layer perceptron used to map the input noise vector z to a pattern vector w in the latent space; dlatent represents the dimension of the latent space; GG_proj represents a learnable projection layer that merges the pattern vector w and the condition information y_1 into a new vector w' of dimension dlatent; GG represents the generator model, consisting of a plurality of Generator Blocks, which converts the input pattern vector w' into a high-resolution image x_1; and CC, H and W represent the channel number, height and width of the image, respectively;
the discriminator is defined as D(sx) = Σ_{i2} a_{i2} · D_{i2}(S_{i2}(sx));
wherein sx represents the input image, from which feature maps at different resolutions are obtained after convolution and downsampling; S_{i2}(sx) represents the feature map obtained after the i2-th layer of convolution and downsampling of the input image sx; D_{i2} represents the i2-th layer discriminator model, which classifies and scores the feature map of layer i2; and a_{i2} represents the weight coefficient of the corresponding discriminator D_{i2}, controlling the contribution of each layer to the final classification result;
the optimization module is used for optimizing and testing the trained 2D generation model;
the input module is used for inputting the single drawing as an input image into the tested 2D generation model and performing post-processing operation on the 2D model output by the 2D generation model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310561085.XA CN116580121B (en) | 2023-05-18 | 2023-05-18 | Method and system for generating 2D model by single drawing based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310561085.XA CN116580121B (en) | 2023-05-18 | 2023-05-18 | Method and system for generating 2D model by single drawing based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116580121A CN116580121A (en) | 2023-08-11 |
CN116580121B true CN116580121B (en) | 2024-04-09 |
Family
ID=87535488
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310561085.XA Active CN116580121B (en) | 2023-05-18 | 2023-05-18 | Method and system for generating 2D model by single drawing based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116580121B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117132687A (en) * | 2023-08-14 | 2023-11-28 | 北京元跃科技有限公司 | Animation generation method and device and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109472838A (en) * | 2018-10-25 | 2019-03-15 | 广东智媒云图科技股份有限公司 | A kind of sketch generation method and device |
CN110211192A (en) * | 2019-05-13 | 2019-09-06 | 南京邮电大学 | A kind of rendering method based on the threedimensional model of deep learning to two dimensional image |
CN111724299A (en) * | 2020-05-21 | 2020-09-29 | 同济大学 | Super-realistic painting image style migration method based on deep learning |
CN115068951A (en) * | 2022-06-29 | 2022-09-20 | 北京元跃科技有限公司 | Method for generating intelligence-developing game through pictures |
WO2023005186A1 (en) * | 2021-07-29 | 2023-02-02 | 广州柏视医疗科技有限公司 | Modal transformation method based on deep learning |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11354846B2 (en) * | 2020-05-04 | 2022-06-07 | Microsoft Technology Licensing, Llc | Computing photorealistic versions of synthetic images |
KR20220009757A (en) * | 2020-07-16 | 2022-01-25 | 현대자동차주식회사 | Deep learning based anchor-free object detection method and apparatus |
US11604947B2 (en) * | 2020-08-26 | 2023-03-14 | X Development Llc | Generating quasi-realistic synthetic training data for use with machine learning models |
US20220217321A1 (en) * | 2021-01-06 | 2022-07-07 | Tetavi Ltd. | Method of training a neural network configured for converting 2d images into 3d models |
JP2023066260A (en) * | 2021-10-28 | 2023-05-15 | テルモ株式会社 | Learning model generation method, image processing device, program and training data generation method |
2023-05-18: application CN202310561085.XA granted as patent CN116580121B (active)
Also Published As
Publication number | Publication date |
---|---|
CN116580121A (en) | 2023-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108846358B (en) | Target tracking method for feature fusion based on twin network | |
CN107657279B (en) | Remote sensing target detection method based on small amount of samples | |
WO2019104767A1 (en) | Fabric defect detection method based on deep convolutional neural network and visual saliency | |
CN109360232B (en) | Indoor scene layout estimation method and device based on condition generation countermeasure network | |
CN111462206B (en) | Monocular structure light depth imaging method based on convolutional neural network | |
CN107330371A (en) | Acquisition methods, device and the storage device of the countenance of 3D facial models | |
CN112784736B (en) | Character interaction behavior recognition method based on multi-modal feature fusion | |
CN103971112B (en) | Image characteristic extracting method and device | |
WO2019071976A1 (en) | Panoramic image saliency detection method based on regional growth and eye movement model | |
CN116580121B (en) | Method and system for generating 2D model by single drawing based on deep learning | |
Trouvé et al. | Single image local blur identification | |
CN107944437B (en) | A kind of Face detection method based on neural network and integral image | |
CN108564120A (en) | Feature Points Extraction based on deep neural network | |
CN111242864A (en) | Finger vein image restoration method based on Gabor texture constraint | |
CN105321188A (en) | Foreground probability based target tracking method | |
CN111009005A (en) | Scene classification point cloud rough registration method combining geometric information and photometric information | |
Niu et al. | Siamese-network-based learning to rank for no-reference 2D and 3D image quality assessment | |
CN113570658A (en) | Monocular video depth estimation method based on depth convolutional network | |
CN113269682A (en) | Non-uniform motion blur video restoration method combined with interframe information | |
CN114463492A (en) | Adaptive channel attention three-dimensional reconstruction method based on deep learning | |
CN106529441A (en) | Fuzzy boundary fragmentation-based depth motion map human body action recognition method | |
CN113436251B (en) | Pose estimation system and method based on improved YOLO6D algorithm | |
CN105740867B (en) | The selection method of image texture window shape and scale | |
CN105913084A (en) | Intensive track and DHOG-based ultrasonic heartbeat video image classifying method | |
CN112053385B (en) | Remote sensing video shielding target tracking method based on deep reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||