CN110378985A - A GAN-based auxiliary creation method for animation drawing - Google Patents

A GAN-based auxiliary creation method for animation drawing

Info

Publication number
CN110378985A
CN110378985A
Authority
CN
China
Prior art keywords
color
line art
animation
generator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910653652.8A
Other languages
Chinese (zh)
Other versions
CN110378985B (en)
Inventor
任慧
李佳
苏志斌
李�真
蒋玉暕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Communication University of China
Original Assignee
Communication University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Communication University of China filed Critical Communication University of China
Priority to CN201910653652.8A priority Critical patent/CN110378985B/en
Publication of CN110378985A publication Critical patent/CN110378985A/en
Application granted granted Critical
Publication of CN110378985B publication Critical patent/CN110378985B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 - Animation
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a GAN-based auxiliary creation method for animation drawing. Using professional line-art enhancement based on generative adversarial networks and reinforcement learning, line-art colouring based on related-class learning and colour clustering, and animation music album generation based on audio-visual synesthesia and affective computing, an integrated auxiliary creation system for animation drawing is constructed, improving the user's creation experience and the aesthetics of the result.

Description

A GAN-based auxiliary creation method for animation drawing
Technical field
The present invention relates to the field of image generation technology, and more particularly to a GAN-based auxiliary creation method for animation drawing.
Background technique
Image translation and style transfer are now very rich tasks. Whether the rendering is photorealistic or non-photorealistic, they all recover missing information from an image or modify the features of an image to be converted. The deep learning networks used for image generation generally fall into variational autoencoders (VAE), generative adversarial networks (GAN), and their variants. Because the VAE learns without an adversary, the pictures it generates tend to be blurry; GANs and their variants are more widely used for image generation and give better results.
A conditional GAN adds explicit external information to the original GAN, usually an additional user input such as a semantic class, a high-dimensional category, or low-level constraint information. An improved GAN takes the discriminator's criterion as the input condition of the generator, improving the stability of GAN training; the discriminator also judges multiple generated pictures simultaneously so as to make a more reasonable decision. GVM uses a GAN with the colour and shape information of the generated picture and the original picture as constraints, inversely generating an optimal manifold in feature space. Pix2pix uses a U-Net and a PatchGAN to turn segmentation maps into realistic photos, grayscale images into colour images, and simple sketches into pictures rich in texture, shading and gloss; the task of the discriminator D is to judge whether the current pair of pictures is a true translation of each other. Pix2pixHD achieves high-resolution image translation and supports semantic manipulation. GP-GAN proposes a Gaussian-Poisson equation that jointly optimizes gradient and colour information when blending high-resolution images. The latent variable of a GAN obeys some distribution, but the meaning behind this distribution is unknown: although a trained GAN can generate new images, it cannot solve the problem of generating an image with a specified attribute. For MNIST data, for example, it cannot be directed to generate a particular digit, or a digit with thicker strokes or a slanted orientation. InfoGAN adds a mutual-information loss term from an information-theoretic point of view, addressing the interpretability of the GAN latent variable.
Coarse-to-fine guidance allows the generating process of a GAN to be decomposed into multiple steps. LAPGAN, from Facebook, was the first work to apply the idea of hierarchical or iterative generation to GANs. Each generation step can build on the result of the previous step and only needs to "fill in" and "complete" the information required by the new size: the generator G only produces a residual picture each time, which is added to the upsampled picture of the previous step to obtain the picture generated at the current step. StackGAN generates birds from text, and its generation process is hierarchical like LAPGAN's, achieving a 256*256-resolution picture generation process. StackGAN divides picture generation into two stages: stage one roughly captures contour and tone, and stage two adds constraints on details to achieve refinement. The results are very good, even indistinguishable from real images on some datasets. PPGN also advocates not generating a complete picture at once, but continually adjusting and refining it in an iterative process; unlike LAPGAN and StackGAN, PPGN realizes the iteration through a denoising autoencoder (DAE).
GAN deep learning networks are generally divided into three structural types. DCGAN extended the structure of GAN from the multi-layer perceptron (MLP) that succeeded on MNIST to convolutional neural networks: its authors proposed a family of convolutional architectures, trained GANs on real-world large-scale datasets such as celebA and LSUN, and made tricks such as batch normalization standard practice. The advantage of the U-Net structure, however, was demonstrated in pix2pix. U-Net is a fully symmetric encoder-decoder structure to which skip connections are added, letting corresponding layers of the encoder and decoder learn matched features as far as possible, which has a very positive influence on the quality of the generated pictures. The work on GP-GAN proposed a blending GAN module which, although also based on an encoder-decoder structure, differs slightly in that a fully connected layer is inserted in the middle; the benefit is that more global information can be transmitted, making supervised learning more effective.
To receive less punishment, the generator G has to choose the least-punished strategy: generating pictures that look as real as possible without considering whether they are all alike. That is, as soon as it produces one picture that the discriminator barely accepts, it will not risk trying new pictures, because that risk would bring more punishment. Regularized GAN passes a real picture through an encoder E to obtain a latent-space representation z, and then generates the final picture from it with the generator G. The original GAN, which has no encoder, finds it very hard to fit multiple scattered modes, while the regularized GAN with an encoder does so easily. Many similar works appeared in the same period, such as EBGAN and BEGAN, which try to bring the generated distribution Pg and the real distribution Pr as close together as possible so that the likelihood and degree of overlap are higher, alleviating the nearly meaningless gradients caused by the JS divergence being almost constant. Adding noise can also mitigate mode collapse to some degree, for example injecting noise into intermediate layers of G and D, or adding it directly to the input pictures. However, RegGAN, EBGAN and BEGAN as just mentioned only add one encoder and impose a single reconstruction constraint, so they cannot completely avoid mode collapse, although they can "oscillate" between modes. CycleGAN, DiscoGAN and DualGAN, with three different formulations, realize two-encoder, in other words dual, constraint objectives; all are based on dual bidirectional regression.
To solve the vanishing-gradient problem thoroughly, WGAN adopts weight clipping, limiting the weights to the range [-c, c]. The D in WGAN no longer performs a binary classification task but a regression task fitting the Wasserstein distance, which is found experimentally to be negatively correlated with the quality of the generated pictures. But when c is too large, gradient explosion appears; when it is too small, gradients vanish. The improved WGAN shows that under the clipping method there also exists an optimal discriminator D, and when this optimum is reached, all weights of D tend to equal c. The distance between the current weights and this constant is treated as a penalty or regularization term and added to the objective function of WGAN; such a WGAN is called WGAN-GP. WGAN-GP converges faster than the original WGAN, trains more stably, and yields better generation quality. The drawback is that the learned network becomes too simple: its capacity to fit complicated functions, in other words its ability to model the distribution, declines noticeably.
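The clipping and weight-penalty ideas above can be sketched in a few lines. This is an illustrative fragment, not the patent's implementation; the function names are ours. It clamps every discriminator weight to [-c, c] as in WGAN, and computes a penalty on the distance between each weight's magnitude and the constant c that the optimal clipped discriminator's weights tend toward:

```python
def clip_weights(weights, c):
    """WGAN-style weight clipping: constrain every parameter to [-c, c]."""
    return [max(-c, min(c, w)) for w in weights]

def weight_penalty(weights, c):
    """Regularization term: squared distance between each weight's
    magnitude and the constant c, added to the objective instead of clipping."""
    return sum((abs(w) - c) ** 2 for w in weights)

clipped = clip_weights([-2.0, 0.005, 3.0], 0.01)
```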
Animation drawing assistance systems: PaintsChainer was the first animation colouring tool; it can use an unconditional GAN to convert a line-art sketch into a brightly coloured cartoon image.
AutoPainter improves on pix2pix by adding high-dimensional features and a total-variation loss. In its discriminator, the PatchGAN image-block size is 70*70, and a 30*30 probability matrix is output.
DeepColor proposes a setting with two networks in series: a colour-primaries prediction network based only on the contour, and a rendering network based on the contour and the primary-colour scheme. Natural-looking artwork can be obtained from a colour scheme provided by the user or predicted from the contour, but DeepColor's colouring lacks diversity.
ComiColorization was the first to colour an entire comic, using identical colours for the same character across multiple panels. It has two interactive revision functions: a global colour histogram and local colour control points. The histogram is used for global colouring of a panel; the colour points are used for local repairs.
Style2paints is similar to PaintsChainer; by adding VGG feature information it performs transfer-style colouring based on a reference image. The first generation produced natural-looking colouring, but because the VGG features carry the topological layout of the reference image's colours, the colour structure overfits the picture to be coloured, which therefore cannot be coloured adaptively. The third generation, PaintsTransfer, can produce visually good colouring, but the local colour details are dirty and messy. The fourth generation improves markedly and can add effects such as highlights and shadow, making it one of the best colouring systems at present; however, it does not include line-art enhancement or an animation audio-visual synesthesia part, and its algorithm differs greatly from the realization of the present invention.
Traditional animation creation requires professional painting skills and rich colouring experience, and creating high-quality animation images is time-consuming and laborious. Existing animation creation systems still have problems to be optimized, such as professional enhancement of line art, unreasonable colour diffusion in line-art colouring, the low intelligence of automatic colouring, insufficient interactive-colouring effect and user experience, and efficient end-to-end production of animation music albums. In view of these problems, the present invention proposes to build a GAN-based auxiliary creation system for animation drawing using state-of-the-art techniques such as deep learning, machine learning and audio-visual fusion: flexibly realizing GAN- and RL-based professional line-art enhancement and its visualization; realizing efficient, high-quality automatic and interactive line-art colouring and its visualization; and realizing animation music album generation based on audio-visual fusion and affective computing, together with its visualization, thereby providing a more reasonable system architecture for improving the efficiency of animation creation and the user's degree of participation and audio-visual experience. Developing the key techniques of GAN-based auxiliary animation creation is necessary for professionalizing line art, optimizing the effect of line-art colouring, and enriching the user's audio-visual experience.
Summary of the invention
In view of the above problems, the present invention proposes a GAN-based auxiliary creation method for animation drawing that professionalizes line art, optimizes the effect of line-art colouring, and enriches the user's audio-visual experience.
A GAN-based auxiliary creation method for animation drawing according to the present invention comprises:
Step a: performing image enhancement on amateur line art to obtain professional line art;
For the amateur line art, adversarial learning is carried out with a generator G and a multi-scale discriminator to obtain professional line art. A discriminator D performs conditional discrimination on the professional line art, yielding the generation-state distribution in the policy-gradient loss function of the generator G; at the same time, fully connected (FC) layers appended to the discriminator D build an intersection-over-union (IOU) evaluation network, yielding the reward in the policy-gradient loss function of the generator G;
Step b: colouring the professional line art generated in step a;
comprising: generating a grayscale image and a colour-cluster image from a true colour image and the professional line art, and generating the final colour image based on an adversarial network;
and evaluating the generated final colour image;
Step c: generating an animation music album, comprising:
Animation emotion colouring: animation colour schemes are associated with emotions; corresponding colour hints are generated at random according to an input emotion label, and a colour image is generated with an adversarial network (GAN);
Animation song emotion recognition: animation song passages are associated with emotions to obtain a passage-emotion association model, whose generalization ability is then tested;
Album animation generation based on rhythm recognition: the audio signal is first sampled, its power spectrum is estimated with an adversarial network (GAN), the frequency of the stressed beats or their harmonics in the melody is found, and the frequency of image transitions and transformations is controlled accordingly.
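The patent estimates the power spectrum with a GAN; as a simple stand-in for that step, a plain discrete Fourier transform illustrates how the strongest frequency can be picked out of a melody signal to drive the transition rate. This is a minimal sketch under our own assumptions, and every name in it is illustrative:

```python
import math

def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest spectral peak via a naive DFT."""
    n = len(signal)
    best_k, best_power = 1, 0.0
    for k in range(1, n // 2):  # skip DC, stop at the Nyquist bin
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power = re * re + im * im
        if power > best_power:
            best_k, best_power = k, power
    return best_k * sample_rate / n

sr = 64
# one second of a pure 4 Hz tone: the stress frequency we expect to recover
sig = [math.sin(2 * math.pi * 4 * t / sr) for t in range(sr)]
```

The recovered peak frequency (or one of its harmonics) would then set how often the album image transitions.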
Further, the method further comprises step d: realizing, through front-end and back-end development, line-art generation visualization, line-art colouring visualization, animation-music-album generation visualization, system interaction tools, and the saving and sharing of works; the back end is developed with JavaScript or Python, and the front end with a CSS/HTML framework.
Further, generating the grayscale image and colour-cluster image from the true colour image and the professional line art, and generating the final colour image based on the adversarial network, as described in step b, comprises:
a step of extracting the edges of the true colour image to obtain the line art of the true colour image;
a step of taking the professional line art generated in step a, or the line art of the true colour image, as the input of a generator G1, with generator G1 and discriminator D1 carrying out adversarial learning to obtain a fake grayscale image;
a step of converting the true colour image into a true grayscale image;
a step of obtaining an edge-preserving true palette of the colour image by the simple linear iterative clustering (SLIC) algorithm;
a step of taking the professional line art generated in step a, or the line art of the true colour image, as the input of a generator G2, with generator G2 and discriminator D2 carrying out adversarial learning to obtain a fake palette;
a step of hole-filling and tone-adjusting the obtained fake palette to obtain the colour-cluster image;
feeding the colour-cluster image into a VGGNet network to obtain colour-cluster features;
taking the fake grayscale image and the colour-cluster features as the input of a generator G3, with generator G3 and discriminator D3 carrying out adversarial learning to obtain the final colour image.
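The palette step above uses SLIC, which clusters in a joint colour-and-position space to keep edges. As a hedged simplification (colour-only, a plain k-means rather than SLIC proper; all names are ours), the following sketch shows how a small palette emerges from a set of RGB pixels:

```python
def kmeans_palette(pixels, k, iters=20):
    """Cluster RGB pixels into k palette colours (colour-only simplification of SLIC)."""
    # spread the initial centres evenly through the pixel list
    centers = [pixels[i * len(pixels) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k),
                          key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            clusters[nearest].append(p)
        centers = [tuple(sum(ch) / len(c) for ch in zip(*c)) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers

reds = [(250, 10, 10)] * 5
blues = [(10, 10, 250)] * 5
palette = kmeans_palette(reds + blues, 2)
```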
Further, evaluating the generated final colour image as described in step b comprises objective-indicator evaluation and subjective experimental evaluation. The objective-indicator evaluation includes measuring the random diffusion and clutter of colour by a colour-coded local binary pattern algorithm (CCLBP), and measuring the degree of automatic colouring and the degree of overfitting to the colour hints by a light-sensitivity evaluation algorithm (LSS);
the subjective experimental evaluation includes colouring-time evaluation, evaluation of the user's compliance and experience in colouring, regional colouring evaluation, colour-compliance evaluation, and overall visual quality evaluation.
Further, measuring the random diffusion and clutter of colour by the colour-coded local binary pattern algorithm CCLBP specifically comprises: computing the gradient mean of the RGB values between the central pixel and its 8 surrounding pixels, setting the result to 1 if it exceeds a threshold and to 0 otherwise; grouping every three binary values clockwise to obtain in turn the binary codes of R, G and B; dividing the 256 colour values into 8 intervals; and, after decimal conversion, obtaining the RGB value of the current central pixel in the corresponding generated feature map;
in measuring the degree of automatic colouring and the degree of overfitting to the colour hints by the light-sensitivity evaluation algorithm LSS, the light-sensitivity feature map is the difference map between a white canvas and the grayscale image.
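The CCLBP descriptor above builds on the classic local binary pattern. As a hedged single-channel sketch (the colour-coded three-bit grouping of the claim is omitted; the function name is ours), the basic 3x3 LBP code compares the 8 neighbours to the centre clockwise and packs the bits into one byte:

```python
def lbp_code(patch):
    """3x3 local binary pattern: compare the 8 neighbours to the centre pixel,
    clockwise from the top-left corner, and pack the bits into one byte."""
    c = patch[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(order):
        if patch[row][col] >= c:
            code |= 1 << bit
    return code
```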
Further, a discrete cosine transform (DCT) discrimination loss is added on the basis of the multi-scale discriminator: the generated image is separated into high and low frequencies, and the high-frequency and low-frequency images obtained through the inverse DCT are fed into the multi-scale discriminator for multi-scale discrimination.
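The patent applies this separation to 2-D images; a 1-D sketch (our own illustration, with our own function names) shows the mechanics: a forward DCT, zeroing of one band of coefficients, an inverse DCT per band, and the property that the low and high bands sum back to the original signal:

```python
import math

def dct(x):
    """DCT-II of a real sequence (unnormalized)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]

def idct(X):
    """Inverse of the DCT-II above (a scaled DCT-III)."""
    n = len(X)
    return [X[0] / n + (2.0 / n) * sum(X[k] * math.cos(math.pi * k * (i + 0.5) / n)
                                       for k in range(1, n))
            for i in range(n)]

def split_bands(x, cutoff):
    """Keep coefficients below `cutoff` as the low band and the rest as the
    high band, then bring both back to the signal domain with the inverse DCT."""
    X = dct(x)
    low = [c if k < cutoff else 0.0 for k, c in enumerate(X)]
    high = [c if k >= cutoff else 0.0 for k, c in enumerate(X)]
    return idct(low), idct(high)

x = [1.0, 5.0, 2.0, 8.0, 3.0, 7.0, 4.0, 6.0]
low, high = split_bands(x, 2)
```

Each band would then be judged by the multi-scale discriminator separately.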
Further, building the IOU evaluation network from fully connected FC layers appended to the discriminator D specifically comprises adding FC layers at two scales, FC1 and FC2, on the basis of the discriminator D; the outputs of the FC1 and FC2 layers are superposed to give the output IOU prediction.
Further, the generation-state distribution in the policy-gradient loss function is the average log value of the discriminator D's output.
The reward of the policy-gradient loss function is the intersection-over-union (IOU) of the amateur line art and the professional line art; this IOU is a weighted sum of a coordinate-value IOU and a pixel-value IOU.
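The weighted-sum reward above can be made concrete with a small sketch (ours, not the patent's code): a pixel-level IOU over binary masks, a coordinate-level IOU over bounding boxes, and their weighted combination:

```python
def pixel_iou(a, b):
    """IOU of two flat binary masks (lists of 0/1)."""
    inter = sum(x & y for x, y in zip(a, b))
    union = sum(x | y for x, y in zip(a, b))
    return inter / union if union else 1.0

def box_iou(a, b):
    """IOU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 1.0

def reward(mask_a, mask_b, box_a, box_b, w_pix=0.5):
    """Weighted sum of pixel-value IOU and coordinate-value IOU, as in the claim."""
    return w_pix * pixel_iou(mask_a, mask_b) + (1 - w_pix) * box_iou(box_a, box_b)
```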
Further, the edges of the true colour image are extracted with the line-art extraction algorithm sketchKeras to obtain the line art of the true colour image.
Further, the generator G uses a downsampling + DenseNet + upsampling network, and the activation function of the generator G is the scaled exponential linear unit (SELU).
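The SELU activation can be written in a few lines using its standard published constants (the sketch below is illustrative, not the patent's implementation):

```python
import math

ALPHA = 1.6732632423543772
LAMBDA = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit:
    lambda * x for x > 0, lambda * alpha * (e^x - 1) otherwise."""
    return LAMBDA * x if x > 0 else LAMBDA * ALPHA * (math.exp(x) - 1.0)
```

Its self-normalizing property is what makes training more stable than with ReLU-family activations in this setting.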
The GAN-based auxiliary creation method for animation drawing of the present invention uses professional line-art enhancement based on generative adversarial networks and reinforcement learning, line-art colouring based on related-class learning and colour clustering, and animation music album generation based on audio-visual synesthesia and affective computing to construct an integrated auxiliary creation system for animation drawing, improving the user's creation experience and the aesthetics of the result.
Detailed description of the invention
Fig. 1 is a schematic diagram of the GAN-based auxiliary creation method for animation drawing.
Specific embodiment
To enable those skilled in the art to better understand the solution of the present invention, the technical scheme in the embodiments of the present invention is described below clearly and completely.
Some processes described in the specification and claims contain multiple operations appearing in a particular order, but it should be clearly understood that these operations need not be executed in the order in which they appear herein and may be executed in parallel. Serial labels such as step a and step b serve only to distinguish the different operations; the labels themselves do not represent any execution order. In addition, these processes may include more or fewer operations, which may be executed sequentially or in parallel.
The technical schemes in the embodiments of the invention are described below clearly and completely; evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Explanation of terms
GAN: generative adversarial network. A GAN contains two networks, G and D. G is the generator; its role is to generate pictures: after a random code z is input, it outputs a fake picture G(z) automatically generated by the neural network. The other network, D, is the discriminator, used for judging: it receives the image output by G as input and judges whether that image is real or fake, outputting 1 for real and 0 for fake.
The process is as follows. First there is a first-generation generator, which produces some very poor pictures, and a first-generation discriminator, which can accurately classify the generated pictures versus the real ones. In short, this discriminator is a binary classifier that outputs 0 for generated pictures and 1 for real pictures.
Then a second-generation generator is trained; it produces slightly better pictures and can make the first-generation discriminator believe these generated pictures are real. A second-generation discriminator is then trained, which can accurately distinguish real pictures from the pictures produced by the second-generation generator. And so on, through third and fourth generations, up to an nth-generation generator and discriminator; in the end the discriminator cannot tell generated pictures from real pictures, and the network has converged.
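The alternating training just described can be made concrete with a deliberately tiny one-dimensional example. Everything here is our own illustration, not the patent's network: real data are scalars around a mean of 4, the generator merely shifts noise by a learned offset m, and a logistic discriminator and the generator take turns doing one manual gradient step; after training, m has moved toward the real mean.

```python
import math, random

random.seed(0)
REAL_MEAN = 4.0
w, b = 0.0, 0.0        # discriminator D(x) = sigmoid(w*x + b)
m = 0.0                # generator G(z) = z + m
lr = 0.05

def sigmoid(t):
    t = max(-60.0, min(60.0, t))   # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-t))

for _ in range(3000):
    # discriminator step: ascend log D(real) + log(1 - D(fake))
    x_real = random.gauss(REAL_MEAN, 1.0)
    x_fake = random.gauss(0.0, 1.0) + m
    d_real, d_fake = sigmoid(w * x_real + b), sigmoid(w * x_fake + b)
    w += lr * ((1 - d_real) * x_real - d_fake * x_fake)
    b += lr * ((1 - d_real) - d_fake)
    # generator step (non-saturating): ascend log D(fake)
    x_fake = random.gauss(0.0, 1.0) + m
    m += lr * (1 - sigmoid(w * x_fake + b)) * w
```

The generator improves exactly when the discriminator stops being fooled, which is the generation-by-generation dynamic described above.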
Loss function: a measure of the degree of inconsistency between the model's prediction f(x) and the true value Y. It is a non-negative real-valued function, usually denoted L(Y, f(x)); the smaller the loss function, the more robust the model. The loss function is the core of the empirical risk function and an important component of the structural risk function.
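The non-negativity and "smaller is better" behaviour can be seen in the simplest concrete case, the mean absolute error, which is also the L1 term of the composite generator loss used later in this document (the function name is ours):

```python
def l1_loss(pred, target):
    """Mean absolute error L(Y, f(x)) = mean |Y - f(x)|.
    Non-negative, and zero exactly when the prediction is perfect."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
```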
DenseNet improves the propagation of information and gradients through the network: every layer can take gradients directly from the loss function and directly receives the input signal, so deeper networks can be trained, and this network structure also has a regularizing effect. Other networks pursue better performance through depth or width; DenseNet improves performance from the angle of feature reuse.
The Pix2Pix framework is based on GAN. An image, as a communication medium, has many forms of expression, such as grayscale images, colour images, gradient maps, and even various human annotations; the conversion between these images is called image translation and is an image generation task. For many years, each of these tasks required a different model to be generated. After the appearance of GAN, they could suddenly all be solved within the same framework. This algorithm is called Pix2Pix and is realized with adversarial neural networks.
As mentioned, much information can be shared between the input and the output. An ordinary convolutional neural network would force every layer to carry and store all of this information, making the network error-prone; a U-Net is therefore used to lighten the burden.
Activation functions enable artificial neural network models to learn and make sense of highly complex, non-linear functions, in which they play a very important role; they introduce non-linear characteristics into the network.
With reference to Fig. 1, the invention discloses a GAN-based auxiliary creation method for animation drawing, comprising:
Realizing the line-art enhancement module in Fig. 1: the discriminator D (including discriminators D1 and D70) acts as (1) a feature-extraction network, followed by FC layers that build the IOU evaluation network, and (2) a PatchGAN that judges the generation quality, yielding the generation state in the policy-gradient loss; the multi-scale discriminator performs the final comprehensive discrimination over multiple receptive fields. The generator and the multi-scale discriminator carry out adversarial learning, enhancing amateur line art into professional line art. The discriminator D (D1 and D70) at the lower left is used to build the evaluation network.
It specifically includes:
Step a: performing image enhancement on amateur line art to obtain professional line art;
For the amateur line art, adversarial learning is carried out with a generator G and a multi-scale discriminator to obtain professional line art. A discriminator D performs conditional discrimination on the professional line art, yielding the generation-state distribution in the policy-gradient loss function of the generator G; at the same time, fully connected (FC) layers appended to the discriminator D build an intersection-over-union (IOU) evaluation network, yielding the reward in the policy-gradient loss function of the generator G;
Amateur line art is characterized by strokes without variation in speed, pressure or thickness, by jitter and distortion, and by defects such as careless overall and local composition, incomplete details and unattractive appearance. To overcome these weaknesses, the GAN network is trained on paired samples, enhancing amateur line art into professional line art. Professional line-art enhancement (SPE) can be regarded as a kind of image enhancement. The traced samples have a resolution of 1024*1024 and are divided into a "plate" version and an "approximate" version, all hand-drawn by amateur painters using a digitizing tablet or a tablet computer. The plate-version line-art sketch only outlines the global shape of the original line art, while the approximate version requires more details to be copied quickly, though not finely. The line-art sample sizes and proportions in the database are shown in Table 1:
Table 1: line-art attribute statistics (3000 samples in total)
In building the generator G, a downsampling + DenseNet + upsampling network is used to strengthen parameter transfer between the features of each layer. The activation function uses the scaled exponential linear unit (SELU), which makes training more stable and effectively avoids gradient explosion and gradient vanishing. By contrast, the pix2pix algorithm uses a U-Net structure with the leakyReLU activation function, and pix2pixHD uses a ResNet structure with the ReLU activation function. In the generator's loss function, compared with traditional algorithms, a reinforcement-learning policy-gradient loss is added on top of the L1 loss, the classification loss, and the VGG high-frequency loss.
DenseNet (intensive convolutional network) is a kind of with the convolutional neural networks intensively connected, in the network, is appointed There is direct connection between what two layers, that is to say, that each layer of network of input is all the union of all layers of output in front, and The characteristic pattern that this layer is learnt, which can be also directly passed to, is used as input for all layers behind.
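The dense connectivity just described can be sketched in a few lines of NumPy, where each hypothetical "layer" is a simple linear-plus-ReLU transform standing in for a convolution, and each layer's input is the concatenation of all earlier feature maps:

```python
import numpy as np

def dense_block(x, num_layers=3, growth=4, seed=0):
    """Toy DenseNet block: every layer sees the concatenation of
    all previous feature maps along the channel axis."""
    rng = np.random.default_rng(seed)
    features = [x]                                # running list of all outputs
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=-1)   # the dense connection
        w = rng.standard_normal((inp.shape[-1], growth))
        out = np.maximum(inp @ w, 0)              # linear + ReLU stand-in
        features.append(out)
    return np.concatenate(features, axis=-1)

x = np.ones((8, 8, 2))                            # H x W x C toy feature map
y = dense_block(x)
# channels grow by `growth` per layer: 2 + 3*4 = 14
```

Note how the channel count grows linearly with depth: this is exactly why DenseNet transition layers exist in the full architecture, which this sketch omits.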
The discriminator D is constructed as a PatchGAN based on a Markov random field (MRF). On top of the multi-scale (multi-receptive-field) discriminators (D34, D70, D142 in Fig. 1, where the subscript denotes the receptive-field size), a discrete cosine transform (DCT) discrimination loss is added: the generated image is separated into high- and low-frequency components, each of which is then discriminated at its own scales, strengthening the discrimination.
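A minimal sketch of the DCT-based frequency separation (the band cutoff and the diagonal masking rule are assumptions; the patent does not specify them): the image is transformed with an orthonormal 2-D DCT-II, low-index coefficients are kept as the low band, and the residual is the high band.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct_split(img, cutoff):
    """Split a square grayscale image into low- and high-frequency
    parts by keeping only DCT coefficients with index sum < cutoff."""
    n = img.shape[0]
    c = dct_matrix(n)
    coeffs = c @ img @ c.T                     # forward 2-D DCT
    mask = np.add.outer(np.arange(n), np.arange(n)) < cutoff
    low = c.T @ (coeffs * mask) @ c            # inverse DCT of kept band
    high = img - low                           # remainder = high band
    return low, high

img = np.outer(np.linspace(0, 1, 16), np.linspace(0, 1, 16))
low, high = dct_split(img, cutoff=4)
# low + high reconstructs the image exactly
```

Each band would then be fed to its own multi-scale discriminators, as the text describes.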
In the policy-gradient loss function, the reward is the IOU between the amateur line draft and the professional line draft. This IOU is the weighted sum of a coordinate-value IOU and a pixel-value IOU, and the corresponding state distribution is the average log value of the output of discriminator D70.
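As a hedged sketch of the reward computation: the coordinate-value IOU below is the standard mask IOU over stroke support, while the pixel-value term is a simple intensity-agreement proxy, and the 0.5/0.5 weights are assumptions — the patent defines neither the pixel-value IOU nor the weighting exactly.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

def reward(amateur, professional, w_coord=0.5, w_pixel=0.5):
    """Policy-gradient reward: weighted sum of a coordinate IOU
    (where strokes exist) and a pixel-value agreement proxy."""
    coord = iou(amateur > 0, professional > 0)
    pixel = 1.0 - np.abs(amateur - professional).mean()
    return w_coord * coord + w_pixel * pixel

a = np.zeros((4, 4)); a[1:3, 1:3] = 1.0
b = np.zeros((4, 4)); b[1:3, 1:4] = 1.0
r = reward(a, b)
```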
The IOU evaluation network is built from fully connected (FC) layers attached to discriminator D: on top of discriminator D, FC layers at two scales, FC1 and FC2, are added, and the superposed outputs of FC1 and FC2 serve as the predicted IOU.
Regarding the construction of the evaluation network: FC layers at two scales are added on top of discriminator D (discriminators D1 and D70 are used to construct the evaluation network). The FC1 branch for the receptive field of 70 has parameters nn.Linear(126*126, 1); the branch for the receptive field of 1 first passes through 4×4 max pooling and is then fed to FC2 layers with parameters nn.Linear(256*256, 126*126) and nn.Linear(126*126, 1). The superposed outputs of the two FC branches give the predicted IOU; with the true line-draft IOU as the target value, the optimal evaluation network is obtained by training.
Step b: coloring the professional line draft generated in step a;
this includes: generating a grayscale image and a color-cluster image from the true color image and the professional line draft, and generating the final color image with an adversarial network;
and evaluating the generated final color image.
Traditional methods based on energy minimization, such as LazyBrush and mesh deformation based on Laplacian differential coordinates, cannot fit the color distribution well at image details. In contrast, GAN-based coloring methods are more intelligent, produce more vivid images, and generate them more efficiently.
Mainstream GAN-based coloring methods, such as PaintsChainer (V1, V2, V3), PaintsTransfer (V1, V3), DeepColor, and AutoPainter, all lose the topological structure of image pixels: color diffusion, missing colored regions, and dirty or messy color (uneven distribution, color jumps) appear in their results. To solve these problems, a grayscale image can be added as an intermediate in generating the color image, which helps the learning and fitting of the color distribution. Moreover, since the coloring of a line draft into a grayscale image and the coloring of a grayscale image into a color image both belong to what psychology calls related-category learning (RSL), the same discriminator is used to discriminate both the grayscale and the color image; there is no need to first train a discriminator for grayscale generation and then another for color generation. This reduces network complexity while strengthening the integrity of the image's spatial topology.
With reference to Fig. 1, in the step-b generation of the grayscale image and color-cluster image from the true color image and professional line draft, and of the final color image by the adversarial network, the color image used as input is specifically the true color image in the raw database. Color hints serve to keep generated samples from losing diversity and to ease training, and they also act as part of the data augmentation during training. The T palette is the true palette, obtained by SLIC color clustering of the original true (T) color image. The F palette is a false palette generated by G2 to fit the distribution of the true palette. Generator G1 and discriminator D1 are trained to obtain the grayscale image; generator G3 and discriminator D3 obtain the final color image; generator G2 and discriminator D2 obtain the F palette (a 10×10 palette that preserves object contours, each cell being a dominant color for coloring).
Specifically, this includes:
extracting the edges of the true color image (T color image) to obtain the line draft of the true color image;
Line drafts and color images exist in pairs, so line-draft extraction is crucial. Before deep learning, edges could be extracted by the Sobel filter or Canny detector in OpenCV, extended difference-of-Gaussians edge extraction (XDoG), the globalized probability of boundary operator (gPb), structured edge detection (SED) based on random forests, and similar algorithms. After the advent of deep learning, edges can be extracted by holistically-nested edge detection (HED), DeepEdge, DeepContour, and the like, but none of these achieves an ideal true line-draft effect. The best-performing option, sketchKeras (a line-draft extraction algorithm implemented with the Keras deep-learning framework), can be used for edge extraction; it trains a U-Net and combines traditional image-processing techniques (Gaussian blur, median filtering, etc.).
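As a minimal illustration of the classical edge extraction mentioned above (a stand-in for OpenCV's Sobel operator, not for sketchKeras itself), the Sobel gradient magnitude can be computed directly in NumPy:

```python
import numpy as np

def sobel_edges(img):
    """Gradient magnitude of a 2-D grayscale image via 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):                     # valid-mode 2-D correlation
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    return np.hypot(gx, gy)

img = np.zeros((8, 8))
img[:, 4:] = 1.0                               # vertical step edge
edges = sobel_edges(img)
# response concentrates on the columns straddling the step
```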
using the professional line draft generated in step a, or the line draft of the true color image, as the input of generator G1, with generator G1 and discriminator D1 performing adversarial learning to obtain a false grayscale image (F grayscale image);
converting the true color image into a true grayscale image, i.e., converting the T color image into the T grayscale image using RGB2GRAY;
obtaining, by the simple linear iterative clustering (SLIC) algorithm, the edge-preserving true palette corresponding to the color image;
using the professional line draft generated in step a, or the line draft of the true color image, as the input of generator G2, with generator G2 and discriminator D2 performing adversarial learning to obtain a false palette;
filling holes in the obtained false palette and adjusting its tone to obtain the color-cluster image;
feeding the color-cluster image into a VGGNet to obtain color-cluster image features;
using the false grayscale image and the color-cluster image features as the input of generator G3, with generator G3 and discriminator D3 performing adversarial learning to obtain the final color image.
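The RGB-to-grayscale conversion in the steps above can be written out directly; OpenCV's RGB2GRAY uses the ITU-R BT.601 luma weights:

```python
import numpy as np

def rgb2gray(img):
    """BT.601 luma conversion, as used by OpenCV's RGB2GRAY."""
    weights = np.array([0.299, 0.587, 0.114])
    return img @ weights              # (H, W, 3) -> (H, W)

white = np.ones((2, 2, 3))
red = np.zeros((2, 2, 3)); red[..., 0] = 1.0
g_white = rgb2gray(white)             # pure white stays 1.0
g_red = rgb2gray(red)                 # pure red contributes 0.299
```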
The key to interactive coloring is the design of color hints, which generally fall into four classes. The first class is color blocks obtained by randomly blanking and Gaussian-blurring the color image and then randomly sampling it; its drawback is that the hints are blurry and lack spatial information. The second class is a small square palette of dominant colors used as a reference image; its drawback is the lack of edge information. Both of these are used in DeepColor. The third class is small-scale pixel blocks or strokes given by the user, used in Comicolorization and PaintsChainer; global color feature points can also first be generated by a GAN and then used as deformation constraints, with a differential mesh-deformation algorithm and linear regression solving for the coloring reconstruction. Its drawback is that if the scale is too small or the number too few, the hints have little effect on the coloring, while if the scale is too large during training, the model overfits because too much true-color detail is included. It is generally necessary to first obtain global color-hint information from the small-scale hint blocks or strokes and then use it to guide the coloring of the line draft. The fourth class is randomly diffused global color hints, used in PaintsTransfer V3.
Because single-step coloring training is difficult and performs poorly, algorithms generally train the color model in two or more steps. To obtain global color-hint information from a small range of color hints, the simple linear iterative clustering (SLIC) algorithm can be used to obtain the edge-preserving palette corresponding to the color image. The palette has 10 rows and 10 columns. SLIC is a superpixel-based image-segmentation algorithm similar to k-means that uses both location information and Lab color-space information. Other similar algorithms, such as the watershed algorithm, easily over-segment and do not directly yield the dominant hue of each segmented region.
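Proper SLIC is available in scikit-image (`skimage.segmentation.slic`); as a self-contained simplification of the 10×10 palette idea, the sketch below just takes the mean color of each grid cell — unlike SLIC it ignores Lab color distance and does not adapt cluster boundaries to edges:

```python
import numpy as np

def grid_palette(img, rows=10, cols=10):
    """Simplified 10x10 palette: dominant (mean) color per grid cell.
    A stand-in for SLIC superpixel clustering, which additionally
    uses Lab color distance when assigning pixels to clusters."""
    h, w, _ = img.shape
    pal = np.zeros((rows, cols, 3))
    for r in range(rows):
        for c in range(cols):
            block = img[r * h // rows:(r + 1) * h // rows,
                        c * w // cols:(c + 1) * w // cols]
            pal[r, c] = block.reshape(-1, 3).mean(axis=0)
    return pal

img = np.zeros((100, 100, 3))
img[:, 50:] = [1.0, 0.5, 0.0]     # left half black, right half orange
pal = grid_palette(img)
```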
After randomly changing the tone of the true-palette image, the line draft with its small range of color hints serves as input, and the true palette serves as the training target. The soft target palette generated this way occasionally contains color holes at faces or backgrounds, so the holes can be filled by median filtering, morphology, an AR model, or similar algorithms. Finally, the interactive color model is trained on the grayscale image and the generated color-cluster image: the grayscale image is the input, the features of the color-cluster image extracted by a VGG16 network are fused into a U-Net, and the final upsampling yields the color image.
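The median-filter hole filling mentioned above can be sketched per channel as follows (treating exact zeros as holes is an assumption for illustration):

```python
import numpy as np

def fill_holes_median(pal, hole_value=0.0):
    """Fill palette holes with the median of the valid values in the
    3x3 neighborhood. `pal` is one 2-D channel of the palette."""
    out = pal.copy()
    h, w = pal.shape
    for i in range(h):
        for j in range(w):
            if pal[i, j] == hole_value:
                nb = pal[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
                vals = nb[nb != hole_value]   # ignore the hole itself
                if vals.size:
                    out[i, j] = np.median(vals)
    return out

pal = np.full((5, 5), 0.8)
pal[2, 2] = 0.0                   # a single hole in the middle
filled = fill_holes_median(pal)
```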
The evaluation of the generated final color image in step b includes objective-indicator evaluation and subjective-experiment evaluation. The objective-indicator evaluation includes measuring the random diffusion and messiness of color with a color-coded local binary pattern algorithm (CCLBP), and measuring the degree of automatic coloring and the degree of overfitting to the color hints with a light-sensitive scoring algorithm (LSS). The subjective-experiment evaluation includes evaluation of coloring time, user coloring experience, regional coloring obedience, color obedience, and overall visual quality.
Measuring the random diffusion and messiness of color with the color-coded local binary pattern algorithm CCLBP works as follows: the gradient mean of the RGB values between the central pixel and each of its 8 surrounding pixels is computed and set to 1 if it exceeds a threshold, otherwise 0; the 256 color values are divided into 8 bins, every three binary values taken clockwise form one group, giving in turn the binary codes of R, G, and B, and after conversion to decimal these yield the RGB value of the current central pixel of the generated feature map.
The light-sensitive scoring algorithm LSS (light-sensitive score) measures the degree of automatic coloring and of overfitting to the color hints; the light-sensitivity map is the difference image between a white canvas and the grayscale image. The more black regions in the light-sensitivity map, the poorer the algorithm's automatic-coloring ability, i.e., some regions of the line draft are left blank without coloring information being added sensibly.
Evaluation maps of the colored image and the original color image are obtained with these two evaluation algorithms, and the evaluation score is obtained from the L1 distance between them; the higher the score, the better the automatic-coloring effect.
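The CCLBP description is terse, so the following is only a hedged sketch of its core step: for each interior pixel, the mean RGB gradient to each of the 8 clockwise neighbors is thresholded to a bit, and the 8 bits form that pixel's local binary code (the grouping of bits into separate per-channel codes and the 8-bin quantization are omitted):

```python
import numpy as np

# clockwise offsets starting from the top-left neighbor
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def cclbp_codes(img, threshold=0.1):
    """8-bit local binary code per interior pixel: a bit is 1 when the
    mean absolute RGB difference to that neighbor exceeds `threshold`.
    A simplified sketch of the CCLBP measure, not the exact algorithm."""
    h, w, _ = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=int)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            bits = 0
            for di, dj in OFFSETS:
                grad = np.abs(img[i + di, j + dj] - img[i, j]).mean()
                bits = (bits << 1) | int(grad > threshold)
            codes[i - 1, j - 1] = bits
    return codes

flat = np.full((5, 5, 3), 0.5)     # uniform color -> all-zero codes
codes = cclbp_codes(flat)
```

A uniform image yields all-zero codes; messy, randomly diffused color would raise many bits, which is what the metric exploits.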
The subjective evaluation follows style2paintsV3: its content covers coloring time, user coloring experience, regional coloring obedience (measuring whether color diffuses unreasonably), color obedience (measuring deviations in hue, saturation, and luminance), and overall visual quality.
Step c: generating an animation music album, comprising:
animation emotion coloring: associating animation color schemes with emotions, randomly generating corresponding color hints from an input emotion label, and generating the color image with the generative adversarial network (GAN);
animation song emotion recognition: associating animation song passages with emotions to obtain an animation song-passage/emotion association model, and testing the model's generalization ability;
album animation generation based on rhythm recognition: first decimating the audio signal, estimating the power spectrum with the GAN, finding the frequency of the stress or its harmonics in the melody, and accordingly controlling the frequency of image transitions and transformations.
Traditional music albums involve no coloring in their image enhancement or image emotion; they only match alternative albums with music and perform audiovisual synthesis. The generation of the animation music album here mainly uses audiovisual-synesthesia technology, i.e., fused audio-visual information processing. The functions that can be implemented include animation emotion coloring, animation song emotion recognition, and album animation generation based on rhythm recognition. The emotions comprise 4 classes: happy, sad, fearful, and angry.
Animation emotion coloring first requires obtaining, through feature extraction and analysis and subjective evaluation, the association model between animation color schemes and the 4 emotion classes; corresponding color-scheme hints are then generated randomly from the input emotion label and used as references for generating the color image with the GAN.
The extracted feature is the color histogram, from which 2 or 3 dominant colors are chosen. The psychological ordinal-scaling methods available for associating color schemes with emotions include rank ordering and paired comparison. With paired comparison, for example, 11 colors are paired two at a time; to eliminate order and position errors, each pair of stimuli is compared twice, with the order reversed and the left-right positions swapped on the second presentation, for 110 comparisons in total. The emotions conveyed by individual colors and by two-color schemes are shown in Tables 2 and 3. Subjective evaluation further analyzes the correspondence or attribution between the emotion labels in the tables and the four main emotion labels happy, sad, fearful, and angry.
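The comparison count works out as follows: 11 colors give C(11, 2) = 55 unordered pairs, and presenting each pair twice (order reversed) yields the 110 trials stated above:

```python
from itertools import combinations, permutations

colors = ["red", "orange", "yellow", "green", "blue", "purple",
          "pink", "brown", "white", "black", "gray"]

pairs = list(combinations(colors, 2))      # 55 unordered pairs
trials = list(permutations(colors, 2))     # each pair in both orders
# 55 pairs x 2 presentations = 110 comparisons
```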
Then, with the different color-scheme information as input and the emotion information as labels, training with an SVM or a BP neural network yields the image emotion-classification model. Finally, emotion coloring of the same line draft is performed under different color schemes.
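A minimal stand-in for the SVM/BP classifier described above — a nearest-centroid classifier over toy dominant-color features; the feature values and emotion labels below are invented purely for illustration:

```python
import numpy as np

def train_centroids(features, labels):
    """Per-emotion mean feature vector (a toy stand-in for SVM training)."""
    classes = sorted(set(labels))
    return {c: np.mean([f for f, l in zip(features, labels) if l == c], axis=0)
            for c in classes}

def classify(centroids, feature):
    """Assign the emotion whose centroid is nearest in feature space."""
    return min(centroids, key=lambda c: np.linalg.norm(centroids[c] - feature))

# toy dominant-color features (R, G, B of the main color), invented labels
feats = [np.array(f) for f in
         [(0.9, 0.8, 0.1), (0.95, 0.6, 0.2),   # warm -> "happy"
          (0.1, 0.1, 0.4), (0.2, 0.2, 0.5)]]   # dark -> "sad"
labels = ["happy", "happy", "sad", "sad"]
model = train_centroids(feats, labels)
pred = classify(model, np.array([0.85, 0.7, 0.15]))
```

A real implementation would substitute `sklearn.svm.SVC` or a small BP network for the centroid rule without changing the surrounding data flow.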
Table 2: positive and negative information conveyed by individual colors
Table 3: emotion information conveyed by two-color schemes
|        | Red         | Orange      | Yellow      | Green    | Blue    | Purple    | Pink      | Brown   | White       | Black    | Gray        |
| Red    | Passionate  | Cheerful    | Happy       | Vital    | Vital   | Elegant   | Gentle    | Elegant | Passionate  | Powerful | Fashionable |
| Orange | Cheerful    | Warm        | Soft        | Pure     | Vital   | Tender    | Lovely    | Relaxed | Warm        | Magnificent | Warm     |
| Yellow | Happy       | Happy       | Happy       | Carefree | Elegant | Romantic  | Lovely    | Natural | Carefree    | Magnificent | Stable   |
| Green  | Vital       | Carefree    | Happy       | Natural  | Childlike | Soft    | Romantic  | Leisurely | Fresh     | Cold     | Cold        |
| Blue   | Soft        | Vital       | Elegant     | Vital    | Sedate  | Elegant   | Childlike | Simple  | Youthful    | Sedate   | Fashionable |
| Purple | Childlike   | Elegant     | Romantic    | Romantic | Elegant | Romantic  | Romantic  | Elegant | Pure        | Elegant  | Elegant     |
| Pink   | Soft        | Lovely      | Stable      | Pure     | Elegant | Romantic  | Sweet     | Elegant | Soft        | Stable   | Lovely      |
| Brown  | Elegant     | Relaxed     | Natural     | Natural  | Simple  | Elegant   | Elegant   | Elegant | Simple      | Classic  | Elegant     |
| White  | Passionate  | Warm        | Pure        | Peaceful | Tender  | Elegant   | Lovely    | Simple  | Pure        | Sharp    | Pure        |
| Black  | Powerful    | Magnificent | Magnificent | Cold     | Sedate  | Elegant   | Stable    | Classic | Sharp       | Stern    | Concise     |
| Gray   | Fashionable | Pure        | Stable      | Gentle   | Sedate  | Childlike | Lovely    | Plain   | Simple      | Concise  | Steady      |
Animation song emotion recognition obtains, through feature extraction and analysis and subjective evaluation, the association model between song passages and the 4 emotion classes, followed by generalization testing.
The audio features extracted for music-emotion modeling include timbre, rhythm, melody, and mode features. Low-dimensional features include the fundamental frequency, formants, spectral centroid, MFCCs (Mel-scale frequency cepstral coefficients), bandwidth, and so on. Thirty refrain segments covering various types of animation music are selected, and 15 subjects, while listening to the 30 segments, describe each with the 4 emotion words (each emotion divided into weak, medium, and strong, for 12 categories in total). Models usable for music-emotion classification include SVM, BP neural networks, random forests, and deep-learning methods.
Album animation generation based on rhythm recognition, building on the association between animation images and music emotion, mainly controls the speed of image transitions and of translation, rotation, scaling, and other transformations by recognizing the speed of the music's rhythm. A traditional frequency-domain rhythm-extraction algorithm first decimates the audio signal and then estimates the power spectrum with an AR model, from which the frequency of the stress or its harmonics in the melody can be found; this frequency then controls the frequency of image transitions and transformations, generating the animation music album. The power-spectrum estimation of the present invention does not use a traditional autoregressive model but is instead learned with a GAN, giving higher efficiency and fidelity.
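The traditional frequency-domain pipeline above can be sketched with a plain FFT periodogram standing in for the AR (or GAN) spectral estimator: decimate the signal, estimate the power spectrum, and take its peak as the stress frequency. The signal, rates, and decimation factor below are invented for illustration:

```python
import numpy as np

def stress_frequency(signal, rate, decimate=4):
    """Decimate, then locate the peak of the FFT power spectrum.
    A periodogram stands in for the AR/GAN power-spectrum estimate."""
    x = signal[::decimate]                    # crude decimation
    fs = rate / decimate
    spectrum = np.abs(np.fft.rfft(x)) ** 2    # power estimate
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs[spectrum[1:].argmax() + 1]   # skip the DC bin

rate = 800
t = np.arange(0, 2.0, 1.0 / rate)
beat = np.sin(2 * np.pi * 5.0 * t)            # 5 Hz "stress" component
f = stress_frequency(beat, rate)
```

The detected frequency would then pace the image transitions and transformations of the album animation.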
The method further includes step d: implementing, through front-end and back-end development, line-draft generation visualization, line-draft coloring visualization, animation-music-album generation visualization, system interaction tools, and work saving and sharing. The back end can use JavaScript or Python; the front end can use CSS/HTML frameworks such as Bootstrap.
From the GAN-based animation drawing assisted creation method, application software can be obtained, mainly comprising the application-layer software corresponding to the method: the CUCPaints drawing board, an image translator, and an animation-music-album player.
Building the GAN-based animation drawing assisted creation system helps clients create animation line drafts flexibly and quickly according to their own ideas, color images, generate animation music albums, and so on, which benefits the dissemination of and exchange about art. It not only satisfies the design needs of beginners, children, and the elderly, but also helps professional clients (in animation production, advertising design, etc.) accelerate cartoon processing, improving and optimizing the animation-creation workflow and technique and promoting the industry's development. The invention can also provide a guiding scheme and reference techniques for the assisted creation of other artistic pictures such as oil painting and ink-wash painting. The present invention uses line-draft professional enhancement based on a generative adversarial network with reinforcement learning, line-draft coloring based on related-category learning and color clustering, and animation-music-album generation based on audiovisual synesthesia and affective computing to build an integrated animation drawing assisted creation system, improving the user's creative experience and aesthetics.
Compared with the prior art, the foregoing invention has the advantage that the GAN-based animation drawing assisted creation method, using line-draft professional enhancement based on a generative adversarial network and reinforcement learning, line-draft coloring based on related-category learning and color clustering, and animation-music-album generation based on audiovisual synesthesia and affective computing, builds an integrated animation drawing assisted creation system that improves the user's creative experience and aesthetics.
The GAN-based animation drawing assisted creation method provided by the present invention has been described in detail above. For those of ordinary skill in the art, there may be changes in the specific embodiments and application scope according to the idea of the embodiments of the present invention. In conclusion, the contents of this specification are not to be construed as limiting the invention.

Claims (10)

1. A GAN-based animation drawing assisted creation method, characterized by comprising:
step a: performing image enhancement on an amateur line draft to obtain a professional line draft;
wherein, for the amateur line draft, adversarial learning between generator G and a multi-scale discriminator yields the professional line draft; discriminator D performs conditional discrimination on the professional line draft to obtain the generation-state distribution in the policy-gradient loss function of generator G; at the same time, fully connected (FC) layers attached to discriminator D build an IOU evaluation network to obtain the reward in the policy-gradient loss function of generator G;
step b: coloring the professional line draft generated in step a;
including: generating a grayscale image and a color-cluster image from the true color image and the professional line draft, and generating the final color image with an adversarial network;
and evaluating the generated final color image;
step c: generating an animation music album, comprising:
animation emotion coloring: associating animation color schemes with emotions, randomly generating corresponding color hints from an input emotion label, and generating the color image with the generative adversarial network GAN;
animation song emotion recognition: associating animation song passages with emotions to obtain an animation song-passage/emotion association model, and testing the model's generalization ability;
album animation generation based on rhythm recognition: first decimating the audio signal, estimating the power spectrum with the GAN, finding the frequency of the stress or its harmonics in the melody, and accordingly controlling the frequency of image transitions and transformations.
2. The GAN-based animation drawing assisted creation method according to claim 1, characterized by further comprising step d: implementing, through front-end and back-end development, line-draft generation visualization, line-draft coloring visualization, animation-music-album generation visualization, system interaction tools, and work saving and sharing, wherein the back end is developed with JavaScript or Python and the front end with a CSS/HTML framework.
3. The GAN-based animation drawing assisted creation method according to claim 1 or 2, characterized in that the step-b generating of the grayscale image and color-cluster image from the true color image and professional line draft and of the final color image with the adversarial network comprises:
extracting the edges of the true color image to obtain the line draft of the true color image;
using the professional line draft generated in step a, or the line draft of the true color image, as the input of generator G1, with generator G1 and discriminator D1 performing adversarial learning to obtain a false grayscale image;
converting the true color image into a true grayscale image;
obtaining, by the simple linear iterative clustering (SLIC) algorithm, the edge-preserving true palette corresponding to the color image;
using the professional line draft generated in step a, or the line draft of the true color image, as the input of generator G2, with generator G2 and discriminator D2 performing adversarial learning to obtain a false palette;
filling holes in the obtained false palette and adjusting its tone to obtain the color-cluster image;
feeding the color-cluster image into a VGGNet to obtain color-cluster image features;
using the false grayscale image and the color-cluster image features as the input of generator G3, with generator G3 and discriminator D3 performing adversarial learning to obtain the final color image.
4. The GAN-based animation drawing assisted creation method according to claim 3, characterized in that the step-b evaluating of the generated final color image comprises objective-indicator evaluation and subjective-experiment evaluation; the objective-indicator evaluation includes measuring the random diffusion and messiness of color with a color-coded local binary pattern algorithm CCLBP, and measuring the degree of automatic coloring and the degree of overfitting to the color hints with a light-sensitive scoring algorithm LSS;
the subjective-experiment evaluation includes evaluation of coloring time, user coloring experience, regional coloring obedience, color obedience, and overall visual quality.
5. The GAN-based animation drawing assisted creation method according to claim 4, characterized in that measuring the random diffusion and messiness of color with the color-coded local binary pattern algorithm CCLBP specifically comprises: computing the gradient mean of the RGB values between the central pixel and its 8 surrounding pixels, setting a bit to 1 if it exceeds a threshold and otherwise to 0; taking every three binary values clockwise as one group to obtain in turn the binary codes of R, G, and B; and dividing the 256 color values into 8 bins, the decimal conversion giving the RGB value of the current central pixel of the generated feature map;
and in that the degree of automatic coloring and of overfitting to the color hints is measured with the light-sensitive scoring algorithm LSS, the light-sensitivity map being the difference image between a white canvas and the grayscale image.
6. The GAN-based animation drawing assisted creation method according to claim 5, characterized in that a discrete cosine transform discrimination loss is added on top of the multi-scale discriminator: the generated image is separated into high- and low-frequency components, and the resulting high- and low-frequency images, after the inverse discrete cosine transform, are input to the multi-scale discriminator for multi-scale discrimination.
7. The GAN-based animation drawing assisted creation method according to claim 6, characterized in that the FC layers attached to discriminator D build the IOU evaluation network by adding FC layers at two scales, FC1 and FC2, on top of discriminator D, the superposed outputs of FC1 and FC2 serving as the predicted IOU.
8. The GAN-based animation drawing assisted creation method according to claim 1, characterized in that the generation-state distribution in the policy-gradient loss function is the average log value of the output of discriminator D;
the reward of the policy-gradient loss function is the IOU between the amateur line draft and the professional line draft, said IOU being the weighted sum of a coordinate-value IOU and a pixel-value IOU.
9. The GAN-based animation drawing assisted creation method according to claim 3, characterized in that the edges of the true color image are extracted with the line-draft extraction algorithm sketchKeras to obtain the line draft of the true color image.
10. The GAN-based animation drawing assisted creation method according to claim 1, characterized in that generator G uses a downsampling + DenseNet + upsampling network, and the activation function of generator G is the scaled exponential linear unit SELU.
CN201910653652.8A 2019-07-19 2019-07-19 Animation drawing auxiliary creation method based on GAN Active CN110378985B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910653652.8A CN110378985B (en) 2019-07-19 2019-07-19 Animation drawing auxiliary creation method based on GAN


Publications (2)

Publication Number Publication Date
CN110378985A true CN110378985A (en) 2019-10-25
CN110378985B CN110378985B (en) 2023-04-28

Family

ID=68254176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910653652.8A Active CN110378985B (en) 2019-07-19 2019-07-19 Animation drawing auxiliary creation method based on GAN

Country Status (1)

Country Link
CN (1) CN110378985B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108615252A (en) * 2018-05-03 2018-10-02 苏州大学 The training method and device of color model on line original text based on reference picture
CN108711138A (en) * 2018-06-06 2018-10-26 北京印刷学院 A kind of gray scale picture colorization method based on generation confrontation network
CN108830912A (en) * 2018-05-04 2018-11-16 北京航空航天大学 A kind of interactive grayscale image color method of depth characteristic confrontation type study
WO2019084562A1 (en) * 2017-10-27 2019-05-02 Google Llc Semantically-consistent image style transfer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LVMIN ZHANG 等: "Style Transfer for Anime Sketches with Enhanced Residual U-net and Auxiliary Classifier GAN", 《2017 4TH IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION》 *

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111127392B (en) * 2019-11-12 2023-04-25 杭州电子科技大学 No-reference image quality evaluation method based on countermeasure generation network
CN111127392A (en) * 2019-11-12 2020-05-08 杭州电子科技大学 Non-reference image quality evaluation method based on countermeasure generation network
CN111063021A (en) * 2019-11-21 2020-04-24 西北工业大学 Method and device for establishing three-dimensional reconstruction model of space moving target
CN111091151A (en) * 2019-12-17 2020-05-01 大连理工大学 Method for generating countermeasure network for target detection data enhancement
CN111127416A (en) * 2019-12-19 2020-05-08 武汉珈鹰智能科技有限公司 Computer vision-based automatic detection method for surface defects of concrete structure
CN111145088B (en) * 2020-01-07 2023-11-24 中国传媒大学 Projection style rendering method and system suitable for viewing space
CN111145088A (en) * 2020-01-07 2020-05-12 中国传媒大学 Projection style rendering method and system suitable for viewing space
CN111222519A (en) * 2020-01-16 2020-06-02 西北大学 Construction method, method and device of hierarchical colored drawing manuscript line extraction model
CN111222519B (en) * 2020-01-16 2023-03-24 西北大学 Construction method, method and device of hierarchical colored drawing manuscript line extraction model
US11527056B2 (en) 2020-02-28 2022-12-13 Alibaba Group Holding Limited Image and data processing methods and apparatuses
CN111476721A (en) * 2020-03-10 2020-07-31 重庆邮电大学 Wasserstein distance-based image rapid enhancement method
CN111476721B (en) * 2020-03-10 2022-04-29 重庆邮电大学 Wasserstein distance-based image rapid enhancement method
CN111476863A (en) * 2020-04-02 2020-07-31 北京奇艺世纪科技有限公司 Method and device for coloring black and white cartoon, electronic equipment and storage medium
CN111476863B (en) * 2020-04-02 2024-03-12 北京奇艺世纪科技有限公司 Method and device for coloring black-and-white cartoon, electronic equipment and storage medium
CN111932645B (en) * 2020-06-12 2022-08-26 重庆大学 Method for automatically generating ink and wash painting based on generation countermeasure network GAN
CN111932645A (en) * 2020-06-12 2020-11-13 重庆大学 Method for automatically generating ink and wash painting based on generation countermeasure network GAN
CN111814841A (en) * 2020-06-17 2020-10-23 浙江工业大学 Scientific and technical literature image processing system based on deep learning
CN111862270A (en) * 2020-07-16 2020-10-30 大连理工大学 Automatic coloring method for cartoon line draft based on double-color space guidance
CN111598910A (en) * 2020-07-23 2020-08-28 杭州群核信息技术有限公司 Water jet automatic color filling system and method based on generative model
CN111985409A (en) * 2020-08-21 2020-11-24 四川省人工智能研究院(宜宾) Method for real-time street scene segmentation
CN112184692A (en) * 2020-10-13 2021-01-05 辽宁工程技术大学 Multi-target detection method for power transmission line
CN112184692B (en) * 2020-10-13 2024-02-27 辽宁工程技术大学 Multi-target detection method for power transmission line
CN112837396A (en) * 2021-01-29 2021-05-25 深圳市天耀创想网络科技有限公司 Line draft generation method and device based on machine learning
CN112837396B (en) * 2021-01-29 2024-05-07 深圳市天耀创想网络科技有限公司 Line manuscript generation method and device based on machine learning
CN113962885A (en) * 2021-10-14 2022-01-21 东北林业大学 Image highlight processing method based on improved cycleGAN
CN113962885B (en) * 2021-10-14 2024-05-28 东北林业大学 Image highlight processing method based on improvement CycleGAN
CN114494523A (en) * 2022-01-25 2022-05-13 合肥工业大学 Line draft automatic coloring model training method and device under limited color space, electronic equipment and storage medium


Similar Documents

Publication Publication Date Title
CN110378985A (en) A kind of animation drawing auxiliary creative method based on GAN
CN109376582A (en) A kind of interactive human face cartoon method based on generation confrontation network
CN108830913B (en) Semantic level line draft coloring method based on user color guidance
CN109886121A (en) A kind of face key independent positioning method blocking robust
CN106652025A (en) Three-dimensional face modeling method and three-dimensional face modeling printing device based on video streaming and face multi-attribute matching
CN110609979A (en) Synthesizing new font glyphs from partial observations
CN113807265B (en) Diversified human face image synthesis method and system
CN114581356B (en) Image enhancement model generalization method based on style migration data augmentation
Zhou et al. An interactive and generative approach for chinese shanshui painting document
Ma et al. Semantic-related image style transfer with dual-consistency loss.
Zhang et al. Inkthetics: a comprehensive computational model for aesthetic evaluation of Chinese ink paintings
Wu et al. Image style recognition and intelligent design of oiled paper bamboo umbrella based on deep learning
Gao et al. Wallpaper texture generation and style transfer based on multi-label semantics
Xing et al. Diffsketcher: Text guided vector sketch synthesis through latent diffusion models
Ye et al. Multi-style transfer and fusion of image’s regions based on attention mechanism and instance segmentation
Xie et al. Latte3d: Large-scale amortized text-to-enhanced3d synthesis
Qian et al. Cnn-based embroidery style rendering
Zhu et al. Sand painting conversion based on detail preservation
CN115018729B (en) Content-oriented white box image enhancement method
CN114186497A (en) Intelligent analysis method, system, equipment and medium for value of art work
CN114299184A (en) Hidden building colored drawing line manuscript graph coloring method and device based on semantic matching
CN113763498A (en) Portrait simple-stroke region self-adaptive color matching method and system for industrial manufacturing
Baniasadi et al. Exploring non-photorealistic rendering with genetic programming
Song The use of color elements in graphic design based on convolutional neural network model
Cui et al. Colorization method of high resolution anime sketch with Pix2PixHD

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant