CN114387365A - Line draft coloring method and device - Google Patents
- Publication number: CN114387365A (application CN202111665537.6A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T 5/70 — Denoising; Smoothing
- G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N 3/045 — Combinations of networks
- G06N 3/048 — Activation functions
- G06T 11/001 — Texturing; Colouring; Generation of texture or colour
- G06T 7/50 — Depth or shape recovery
Abstract
The invention discloses a line draft coloring method and device. The method comprises the following steps: collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking-and-blurring method, and constructing a training data set; introducing an attention screening mechanism at the skip connections of the generator of the pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model; training the line draft coloring model on the training data set; and coloring the line draft to be colored with the trained line draft coloring model. The method improves the accuracy of the line draft coloring result and the quality of the generated picture, and converges better while avoiding oscillation during training.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a line draft coloring method and device.
Background
Image coloring is a popular topic in computer vision; most coloring tasks colorize a grayscale image, for example restoring a black-and-white photograph to color. The animation industry keeps growing, and line draft coloring is an important part of it: good character color design attracts more viewers. Coloring line drafts is a laborious task; manually turning a line draft into a polished picture can take hours or even days. Although many deep learning methods achieve satisfactory coloring of grayscale images, simply applying grayscale colorization or style transfer to line drafts yields unsatisfactory results: a line draft contains only sparse edge contours and lacks the complete texture, shading, and gradient information present in a grayscale image, which greatly increases the difficulty of coloring.
With the rapid development of deep learning, CNN (Convolutional Neural Network) based algorithms are widely applied in many fields, but convolutional networks alone do not handle generative tasks well. When Goodfellow et al. proposed the GAN (Generative Adversarial Network), generative tasks such as super-resolution, image coloring, and style transfer gained a better solution. Isola et al. proposed Pix2pixGAN, a generative adversarial network for image-to-image translation that learns a mapping from a source image domain to a target image domain. Pix2pixGAN can also color line drafts, but the overall effect is mediocre and the coloring result is random, so the overall color scheme of the image is chaotic. To address this, Sangkloy et al. proposed controllable coloring using sparse color strokes drawn on the line draft, i.e., coloring guided by a few continuous colored lines as hints. Zhang et al. extracted a 4096-dimensional global feature of a reference image with a pre-trained VGG16/19 model, injected it into the middle layers of the generator network, and added two guide decoders to avoid vanishing gradients in the middle layers. Junsoo et al. proposed the SCFT (Spatially Corresponding Feature Transfer) module to fuse color information from a reference image into the line draft, together with a similarity-based triplet loss that pulls the coloring result closer to the reference image.
Wang et al. extracted features from an early stage of a pre-trained network and introduced a local feature network to extract a semantic feature map, using the local features as one of the inputs to both the generator and the discriminator; however, since a line draft carries far less feature information than a grayscale image, the local feature network does not extract semantic features well in this setting. Guo et al. proposed a two-step coloring method: the model trained in the first step guesses color regions and roughly fills multiple colors onto the line draft, and the second step corrects wrong colors and refines the result; its drawback is that training requires manually producing a large number of intermediate-result images for both steps.
In summary, existing line draft coloring techniques are mostly built on GANs and suffer from color bleeding (the filled color diffuses into surrounding regions like watercolor), boundary misrecognition (when lines are dense, other strokes are mistaken for boundaries, so adjacent unrelated regions are filled with the wrong color), color inconsistency (the color of some regions of the result does not match the color hint), and excessive randomness in the coloring result.
Disclosure of Invention
The invention provides a line draft coloring method and device, aiming to solve the technical problems that the information passed through the skip connections of current models is redundant, low-level features are underused, and the randomness of the coloring result is too high.
In order to solve the technical problems, the invention provides the following technical scheme:
In one aspect, the present invention provides a line draft coloring method, comprising:
collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking-and-blurring method, and constructing a training data set;
introducing an attention screening mechanism at the skip connections of the generator of the pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model;
training the line draft coloring model with the training data set;
and coloring the line draft to be colored with the trained line draft coloring model.
Further, the collecting of color images, preprocessing, line draft generation, color hint extraction by random masking and blurring, and training data set construction include:
collecting color images and preprocessing them; the preprocessing comprises rotating and mirroring part of the collected images for data augmentation, and cropping all images to a size of 256 × 256;
generating a line draft from each preprocessed color image;
randomly covering the color image with 120 white blocks of size 20 × 20, then blurring the covered image with a mean filter of kernel size 100 × 100 to obtain the color hint corresponding to the color image;
and storing each color image together with its line draft and color hint to construct the training data set.
Further, the generator of the line draft coloring model comprises down-sampling modules, up-sampling modules, and attention screening modules; the generator produces a color image from the line draft and the corresponding color hint as follows:
firstly, the single-channel 256 × 256 line draft and the corresponding 256 × 256 × 3 color hint are concatenated and fed into the generator; the image is then reduced in dimension by multiple down-sampling modules, which extract features at every scale; deep information is then recovered by the up-sampling modules, each of which is fused with the corresponding feature-extraction stage. Both the down-sampling and up-sampling modules use a ReLU (Rectified Linear Unit) activation layer, a 3 × 3 convolution, and a BN (Batch Normalization) layer as their basic unit, and introduce a residual mechanism to avoid the gradient vanishing and degradation problems of deep networks; the down-sampling and up-sampling operations are scaled with max pooling and max unpooling; after the last up-sampling operation, the features pass through a LeakyReLU activation layer, a 3 × 3 convolutional layer, and a Tanh activation layer, finally producing a 256 × 256 color image;
the attention screening module takes the current-layer features x and the next-layer features g as input, aligns their sizes and channel counts via convolution and adds them, compresses the dimension with ReLU activation and convolution, generates attention coefficients through Sigmoid activation, and finally upsamples the attention coefficients back to the original dimensions of x, multiplies x by the coefficients, and outputs the result; the calculation is as follows:
$$q_{att}^{l} = \psi^{T}\left(\sigma_{1}\left(W_{x}^{T} x_{i}^{l} + W_{g}^{T} g_{i} + b_{g}\right)\right) + b_{\psi}, \qquad \alpha_{i}^{l} = \sigma_{2}\left(q_{att}^{l}\right), \qquad \hat{x}_{i}^{l} = x_{i}^{l} \cdot \alpha_{i}^{l}$$

where $\sigma_1$ is the ReLU activation function and $\sigma_2$ is the Sigmoid activation function; the linear transformations $W_x$, $W_g$, $\psi$ contained in $\Theta_{att}$ are all 1 × 1 convolutions, and $b_g$, $b_{\psi}$ are the bias terms of the corresponding convolutions; vector $x_i^l$ is the i-th feature of the current network layer, and vector $g_i$ is the i-th feature of the next layer.
Further, the discriminator of the line draft coloring model is a fully convolutional network; it takes as input the concatenation of a line draft with either the image produced by the generator or the real image, and after several convolutional blocks outputs a 30 × 30 matrix; each value in the matrix judges a small region of the original image, and the values are averaged to produce the discriminator's final output.
Further, the loss function of the generator model of the line draft coloring model is as follows:
$$L_G = \lambda_{l1} L_{l1} + \lambda_{adv} L_{adv} + \lambda_{tv} L_{tv}$$

where

$$L_{l1} = \mathbb{E}_{I_s, I_h, I_r}\left[\left\| I_r - G(I_s, I_h) \right\|_1\right]$$
$$L_{adv} = \mathbb{E}_{I_r, I_s}\left[\log D(I_s, I_r)\right] + \mathbb{E}_{I_h, I_s}\left[\log\left(1 - D(I_s, G(I_s, I_h))\right)\right]$$
$$L_{tv} = \sum_{i,j}\left(\left(x_{i+1,j} - x_{i,j}\right)^2 + \left(x_{i,j+1} - x_{i,j}\right)^2\right)^{\beta/2}$$

where $I_s$ is the line draft, $I_h$ is the color hint, $I_r$ is the real image, $G$ is the generator, $D$ is the discriminator, and $\lambda_{l1}$, $\lambda_{adv}$, $\lambda_{tv}$ are weights; $x_{i,j}$ denotes the pixel value at coordinates $(i, j)$ in the image, and $\beta$ is a hyperparameter. The expectations $\mathbb{E}_{I_r,I_s}$, $\mathbb{E}_{I_h,I_s}$ and $\mathbb{E}_{I_s,I_h,I_r}$ are taken over the joint distributions of $(I_r, I_s)$, $(I_h, I_s)$ and $(I_s, I_h, I_r)$, respectively.
Further, the loss function of the discriminator model of the line draft coloring model is as follows:
$$L_D = -\lambda_{adv} L_{adv}$$

where $L_{adv}$ is the adversarial loss
$$L_{adv} = \mathbb{E}_{I_r, I_s}\left[\log D(I_s, I_r)\right] + \mathbb{E}_{I_h, I_s}\left[\log\left(1 - D(I_s, G(I_s, I_h))\right)\right].$$
further, the method further comprises:
in the process of training the line draft coloring model, a learning rate decay mechanism is added to help training converge and to prevent oscillation: the learning rate α of the optimizer at the current stage is decayed once every 15 rounds.
On the other hand, the invention also provides a line draft coloring device, which comprises:
the training data set construction module, for collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking-and-blurring method, and constructing a training data set;
the model improvement module, for introducing an attention screening mechanism at the skip connections of the generator of the pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model;
the model training module, for training the line draft coloring model with the training data set;
and the coloring module, for coloring the line draft to be colored with the trained line draft coloring model.
In yet another aspect, the present invention also provides an electronic device comprising a processor and a memory; wherein the memory has stored therein at least one instruction that is loaded and executed by the processor to implement the above-described method.
In yet another aspect, the present invention also provides a computer-readable storage medium having at least one instruction stored therein, the instruction being loaded and executed by a processor to implement the above method.
The technical scheme provided by the invention has the beneficial effects that at least:
Compared with the pix2pixGAN model, the method automatically extracts color hints from color images for training, so that the network learns to color correctly according to a given color hint; the learning rate decay mechanism helps training converge and prevents oscillation; and the added attention screening mechanism lets the model exploit feature information more fully, makes the coloring of line draft details more sensitive, and improves the accuracy of the coloring result and the quality of the generated picture.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is an execution flow diagram of a line draft coloring method according to an embodiment of the present invention;
fig. 2 is a schematic network structure diagram of a line draft coloring model generator according to an embodiment of the present invention;
fig. 3 is a schematic network structure diagram of an attention screening mechanism according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
First embodiment
The embodiment provides a line draft coloring method, which can be implemented by an electronic device, and the electronic device can be a terminal or a server. The execution flow of the method is shown in fig. 1, and comprises the following steps:
S1, collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking-and-blurring method, and constructing a training data set;
S2, introducing an attention screening mechanism at the skip connections of the generator of the pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model;
S3, training the line draft coloring model with the training data set;
and S4, coloring the line draft to be colored with the trained line draft coloring model.
Further, in this embodiment, the implementation process of S1 is specifically as follows:
collecting a large number of color images and preprocessing them; the preprocessing comprises rotating and mirroring part of the collected images for data augmentation, and cropping all images to a size of 256 × 256;
generating a line draft according to the preprocessed color image;
randomly covering the color image with 120 white blocks of size 20 × 20, then blurring the covered image with a mean filter of kernel size 100 × 100 to obtain the color hint corresponding to the color image;
and storing the color image and the corresponding line draft and color prompt thereof to construct a training data set.
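The masking-and-blurring hint extraction above can be sketched in NumPy. The block count, block size, and kernel size follow the embodiment; the function name, edge handling, and the integral-image box blur standing in for the mean filter are illustrative assumptions:

```python
import numpy as np

def make_color_hint(color_img, n_blocks=120, block=20, kernel=100, seed=0):
    """Mask a color image with white square blocks, then box-blur it
    (a stand-in for the mean filter) to produce a smeared color hint."""
    rng = np.random.default_rng(seed)
    hint = color_img.astype(np.float64).copy()
    h, w, _ = hint.shape
    # Randomly cover the image with white (255) square patches.
    for _ in range(n_blocks):
        y = rng.integers(0, h - block + 1)
        x = rng.integers(0, w - block + 1)
        hint[y:y + block, x:x + block] = 255.0
    # Mean filter per channel via an integral image, with edge-clamped padding.
    pad = kernel // 2
    out = np.empty_like(hint)
    for c in range(hint.shape[2]):
        padded = np.pad(hint[:, :, c], pad, mode="edge")
        ii = padded.cumsum(axis=0).cumsum(axis=1)
        ii = np.pad(ii, ((1, 0), (1, 0)))
        s = (ii[kernel:, kernel:] - ii[:-kernel, kernel:]
             - ii[kernel:, :-kernel] + ii[:-kernel, :-kernel])
        out[:, :, c] = s[:h, :w] / (kernel * kernel)
    return out.astype(np.uint8)
```

An OpenCV implementation would replace the integral-image part with a mean blur such as `cv2.blur(hint, (100, 100))`.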
Further, in this embodiment, the generator of the line draft coloring model, whose structure is shown in fig. 2, is designed on a U-Net architecture and is composed of down-sampling modules, up-sampling modules, and attention screening modules; the generator produces a color image from the line draft and the corresponding color hint as follows:
firstly, the single-channel 256 × 256 line draft and the corresponding 256 × 256 × 3 color hint are concatenated and fed into the network; the image is then reduced in dimension by multiple down-sampling modules, which extract features at every scale; deep information is then recovered by the up-sampling modules, each of which is fused with the corresponding feature-extraction stage. Both the down-sampling and up-sampling modules use a ReLU (Rectified Linear Unit) activation layer, a 3 × 3 convolution, and a BN (Batch Normalization) layer as their basic unit, and introduce a residual mechanism to avoid the gradient vanishing and degradation problems of deep networks; the down-sampling and up-sampling operations are scaled with max pooling and max unpooling; after the last up-sampling operation, the features pass through a LeakyReLU activation layer, a 3 × 3 convolutional layer, and a Tanh activation layer, finally producing a 256 × 256 color image.
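The max-pooling / max-unpooling pair used for scaling can be illustrated on a single channel with even dimensions. This is a minimal sketch; practical implementations such as PyTorch's `MaxPool2d(return_indices=True)` / `MaxUnpool2d` operate on batched multi-channel tensors:

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling on an even-sized 2-D array; also returns the
    within-block argmax so that unpooling can restore positions."""
    h, w = x.shape
    blocks = (x.reshape(h // 2, 2, w // 2, 2)
               .transpose(0, 2, 1, 3)
               .reshape(h // 2, w // 2, 4))
    idx = blocks.argmax(axis=2)
    return blocks.max(axis=2), idx

def max_unpool2x2(y, idx):
    """Inverse of max_pool2x2: zeros everywhere except the remembered maxima."""
    h, w = y.shape
    out = np.zeros((h * 2, w * 2), dtype=y.dtype)
    rows, cols = np.indices((h, w))
    dr, dc = idx // 2, idx % 2   # position of the max inside each 2x2 block
    out[rows * 2 + dr, cols * 2 + dc] = y
    return out
```

Unpooling places each value back where its maximum came from, which preserves spatial detail better than plain nearest-neighbor upsampling.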
The attention screening module, shown in fig. 3, takes the current-layer features x and the next-layer features g as input, aligns their sizes and channel counts via convolution and adds them, compresses the dimension with ReLU activation and convolution, generates attention coefficients through Sigmoid activation, and finally upsamples the attention coefficients back to the original dimensions of x, multiplies x by the coefficients, and outputs the result; the specific calculation is as follows:
$$q_{att}^{l} = \psi^{T}\left(\sigma_{1}\left(W_{x}^{T} x_{i}^{l} + W_{g}^{T} g_{i} + b_{g}\right)\right) + b_{\psi}, \qquad \alpha_{i}^{l} = \sigma_{2}\left(q_{att}^{l}\right), \qquad \hat{x}_{i}^{l} = x_{i}^{l} \cdot \alpha_{i}^{l}$$

where $\sigma_1$ is the ReLU activation function and $\sigma_2$ is the Sigmoid activation function; the linear transformations $W_x$, $W_g$, $\psi$ contained in $\Theta_{att}$ are all 1 × 1 convolutions, and $b_g$, $b_{\psi}$ are the bias terms of the corresponding convolutions; vector $x_i^l$ is the i-th feature of the current network layer, and vector $g_i$ is the i-th feature of the next layer.
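A minimal NumPy sketch of the attention screening computation, assuming x and g have already been brought to the same spatial resolution so that the 1 × 1 convolutions reduce to channel-wise matrix products (the weight shapes and names here are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, Wx, Wg, psi, b_g, b_psi):
    """Additive attention gate on a skip connection.
    x: (Cx, H, W) current-layer (skip) features
    g: (Cg, H, W) next-layer gating features, same spatial size as x
    Wx: (Cint, Cx), Wg: (Cint, Cg), psi: (1, Cint) -- the 1x1 convolutions."""
    c, h, w = x.shape
    xf = x.reshape(c, -1)                          # (Cx, H*W)
    gf = g.reshape(g.shape[0], -1)                 # (Cg, H*W)
    q = np.maximum(Wx @ xf + Wg @ gf + b_g, 0.0)   # sigma1 = ReLU
    alpha = sigmoid(psi @ q + b_psi)               # sigma2 = Sigmoid, in (0, 1)
    return (xf * alpha).reshape(c, h, w), alpha.reshape(h, w)
```

In the model itself the gating features come from a coarser layer and the attention map is upsampled back to x's resolution before the multiplication; that resizing step is omitted here for brevity.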
Further, in this embodiment, the discriminator of the line draft coloring model borrows the idea of PatchGAN and is a fully convolutional network. A typical GAN discriminator maps a 256 × 256 image to a single scalar that measures the authenticity of the whole image; such a global judgment cannot reflect local image features well and is impractical for tasks with high precision requirements. To judge local features better, the discriminator of this embodiment takes as input the concatenation of the line draft with either the generated image or the real image, and after several convolutional blocks outputs a 30 × 30 matrix; each value in the matrix judges a small region of the original image, and the values are averaged to produce the discriminator's final output.
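The 30 × 30 output can be checked with the standard convolution output-size formula. The layer configuration below (4 × 4 kernels, padding 1, three stride-2 convolutions followed by two stride-1 ones, as in a common PatchGAN stack) is an assumption, since the text states only the input and output sizes:

```python
def conv_out(size, kernel, stride, pad):
    """Output spatial size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

# 256 -> 128 -> 64 -> 32 -> 31 -> 30 under the assumed stack.
size = 256
for stride in (2, 2, 2, 1, 1):
    size = conv_out(size, kernel=4, stride=stride, pad=1)
```

Averaging the resulting 30 × 30 patch decisions then gives the single scalar the discriminator finally outputs.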
Further, in this embodiment, the loss function of the generator model of the line draft coloring model is:
$$L_G = \lambda_{l1} L_{l1} + \lambda_{adv} L_{adv} + \lambda_{tv} L_{tv}$$
the loss function of the discriminator model of the line draft coloring model is as follows:
$$L_D = -\lambda_{adv} L_{adv}$$
wherein:
$$L_{l1} = \mathbb{E}_{I_s, I_h, I_r}\left[\left\| I_r - G(I_s, I_h) \right\|_1\right]$$
$$L_{adv} = \mathbb{E}_{I_r, I_s}\left[\log D(I_s, I_r)\right] + \mathbb{E}_{I_h, I_s}\left[\log\left(1 - D(I_s, G(I_s, I_h))\right)\right]$$
$$L_{tv} = \sum_{i,j}\left(\left(x_{i+1,j} - x_{i,j}\right)^2 + \left(x_{i,j+1} - x_{i,j}\right)^2\right)^{\beta/2}$$

where $I_s$ is the line draft, $I_h$ is the corresponding color hint, $I_r$ is the real image, $G$ is the generator, $D$ is the discriminator; $\lambda_{l1}$, $\lambda_{adv}$, $\lambda_{tv}$ are weights with $\lambda_{l1} = 100$, $\lambda_{adv} = 1$, $\lambda_{tv} = 1$; $x_{i,j}$ denotes the pixel value at coordinates $(i, j)$ in the image, and $\beta$ is a hyperparameter, set to $\beta = 2$ in this embodiment. The expectations $\mathbb{E}_{I_r,I_s}$, $\mathbb{E}_{I_h,I_s}$ and $\mathbb{E}_{I_s,I_h,I_r}$ are taken over the joint distributions of $(I_r, I_s)$, $(I_h, I_s)$ and $(I_s, I_h, I_r)$, respectively.
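The three loss components can be sketched in NumPy; the exact expressions are assumptions consistent with the standard pix2pix formulation and the symbol definitions given in the text, with expectations replaced by sample means:

```python
import numpy as np

def l1_loss(real, fake):
    """L_l1: mean absolute error between the real image and generator output."""
    return np.mean(np.abs(real - fake))

def adv_loss(d_real, d_fake, eps=1e-8):
    """L_adv: log D on real pairs plus log(1 - D) on generated pairs.
    d_real / d_fake are discriminator outputs in (0, 1), e.g. the averaged
    30x30 patch decisions; eps guards the log."""
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

def tv_loss(img, beta=2.0):
    """L_tv: total-variation term penalizing abrupt neighboring-pixel jumps
    (beta = 2 as in this embodiment)."""
    dh = img[1:, :-1] - img[:-1, :-1]   # vertical neighbor differences
    dw = img[:-1, 1:] - img[:-1, :-1]   # horizontal neighbor differences
    return np.sum((dh ** 2 + dw ** 2) ** (beta / 2.0))
```

The generator total is then formed as `100 * l1 + 1 * adv + 1 * tv` per the weights above, and the discriminator maximizes the adversarial term.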
Further, in this embodiment, the method for coloring a line draft further includes:
in the process of training the line draft coloring model, a learning rate decay mechanism is added to help training converge and to prevent oscillation: the learning rate α of the optimizer at the current stage is decayed once every 15 rounds.
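A sketch of the step-decay schedule. The specification states only the 15-round interval, so the decay factor `gamma` below is an assumption:

```python
def decayed_lr(base_lr, epoch, step=15, gamma=0.5):
    """Step decay: multiply the learning rate by `gamma` once every `step`
    epochs. gamma = 0.5 is an assumed factor; only the 15-round interval
    is stated in the specification."""
    return base_lr * gamma ** (epoch // step)
```

With a base rate of 2e-4, the optimizer would use 2e-4 for epochs 0-14, 1e-4 for epochs 15-29, and so on; in PyTorch the same schedule corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.5)`.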
In summary, this embodiment automatically extracts color hints for training and, at coloring time, adds a color hint as auxiliary information, achieving interactive coloring so that the corresponding part of the line draft is painted with the desired color. The attention screening mechanism introduced at the skip connections of the generator lets the network learn to distinguish the shape and size of the target automatically, and improves the model's sensitivity and accuracy by suppressing feature activations in irrelevant regions, thereby improving the accuracy of the coloring result and the quality of the generated picture. In addition, the learning rate decay mechanism added during training helps convergence and prevents oscillation.
Second embodiment
The embodiment provides a line draft coloring device, which comprises the following modules:
the training data set construction module, for collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking-and-blurring method, and constructing a training data set;
the model improvement module, for introducing an attention screening mechanism at the skip connections of the generator of the pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model;
the model training module, for training the line draft coloring model with the training data set;
and the coloring module, for coloring the line draft to be colored with the trained line draft coloring model.
The line draft coloring device of this embodiment corresponds to the line draft coloring method of the first embodiment; the functions implemented by the functional modules of the device correspond one-to-one with the steps of the method; therefore, details are not repeated here.
Third embodiment
The present embodiment provides an electronic device, which includes a processor and a memory; wherein the memory has stored therein at least one instruction that is loaded and executed by the processor to implement the method of the first embodiment.
The electronic device may vary considerably in configuration and performance, and may include one or more processors (CPUs) and one or more memories, wherein the memory stores at least one instruction that is loaded and executed by the processor to perform the above method.
Fourth embodiment
This embodiment provides a computer-readable storage medium in which at least one instruction is stored; the instruction is loaded and executed by a processor to implement the method of the first embodiment. The computer-readable storage medium may be a ROM, a random access memory, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like. The instructions stored therein may be loaded by a processor in a terminal to perform the above method.
Furthermore, it should be noted that the present invention may be provided as a method, apparatus or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied in the medium.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
Finally, it should be noted that while the foregoing describes preferred embodiments of the invention, those skilled in the art, once apprised of the basic inventive concept, may make various changes and modifications without departing from the principles of the invention, and such changes and modifications shall be deemed to fall within the scope of the invention. Therefore, the appended claims are intended to be interpreted as covering the preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Claims (8)
1. A line draft coloring method is characterized by comprising the following steps:
collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color prompts by a random masking and blurring method, and constructing a training data set;
introducing an attention screening mechanism at the skip connections of the generator model of a pix2pixGAN model, so as to improve the pix2pixGAN model and obtain a line draft coloring model;
training the line draft coloring model by using the training data set;
and coloring the line draft to be colored by utilizing the trained line draft coloring model.
2. The line draft coloring method according to claim 1, wherein the collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color prompts by a random masking and blurring method, and constructing a training data set comprises:
collecting color images and preprocessing them; wherein the preprocessing comprises: rotating and mirroring some of the collected color images to augment the data, and cropping all images; wherein the cropping size is 256 × 256;
generating a line draft according to the preprocessed color image;
randomly covering the color image with 120 white blocks of size 20 × 20, and then blurring the covered image with a mean filter of kernel size 100 × 100 to obtain the color prompt corresponding to the color image;
and storing the color image and the corresponding line draft and color prompt thereof to construct a training data set.
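As a concrete illustration of the masking-and-blurring step in claim 2, the following NumPy sketch covers a color image with 120 random 20 × 20 white blocks and then applies a 100 × 100 mean filter computed via an integral image; the function name and the zero-padded border handling are assumptions, not details taken from the patent.

```python
import numpy as np

def extract_color_prompt(color_img, n_blocks=120, block=20, kernel=100, rng=None):
    """Randomly cover a (H, W, 3) image in [0, 1] with white blocks, then box-blur it."""
    rng = np.random.default_rng(rng)
    h, w, _ = color_img.shape
    covered = color_img.copy()
    for _ in range(n_blocks):
        y = rng.integers(0, h - block + 1)
        x = rng.integers(0, w - block + 1)
        covered[y:y + block, x:x + block] = 1.0  # one white block

    # Mean (box) filter via an integral image, zero-padded at the borders.
    k, pad = kernel, kernel // 2
    padded = np.pad(covered, ((pad, k - 1 - pad), (pad, k - 1 - pad), (0, 0)))
    ii = padded.cumsum(0).cumsum(1)
    ii = np.pad(ii, ((1, 0), (1, 0), (0, 0)))  # prepend a zero row/column
    blurred = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
               - ii[k:k + h, :w] + ii[:h, :w]) / (k * k)
    return blurred
```

The integral-image trick keeps the 100 × 100 averaging cheap; a direct convolution would be far slower in pure NumPy.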
3. The line draft coloring method according to claim 1, wherein the generator model of the line draft coloring model comprises an up-sampling module, a down-sampling module and an attention screening module; the process by which the generator model generates a color image from the line draft and the corresponding color prompt comprises the following steps:
firstly, concatenating a single-channel line draft of size 256 × 256 with the corresponding color prompt of size 256 × 256 × 3 and feeding the result into the generator model of the line draft coloring model; then reducing the dimensions of the image through multiple down-sampling modules and extracting features at each scale; then extracting deep information through the up-sampling modules, each up-sampling stage being fused with the corresponding feature-extraction stage; the down-sampling and up-sampling modules both take a ReLU (Rectified Linear Unit) activation layer, a 3 × 3 convolution and a BN (Batch Normalization) layer as their basic unit, and introduce a residual mechanism to avoid the vanishing-gradient and degradation problems of deep networks; the down-sampling and up-sampling operations scale using max-pooling and max-unpooling; after the up-sampling stages, a LeakyReLU activation layer, a 3 × 3 convolution layer and a Tanh activation layer follow, finally generating a 256 × 256 color image;
the attention screening module takes the current network layer feature x and the next network layer feature g as inputs, aligns their sizes and channel numbers through convolution operations and adds them, compresses the dimensions using ReLU activation and a convolution operation, generates an attention coefficient through Sigmoid activation, and finally up-samples the attention coefficient to the original dimensions of the vector x, multiplies x by it, and outputs the result; the calculation is as follows:

α_i = σ2( ψ( σ1( W_x x_i + W_g g_i + b_g ) ) + b_ψ )

wherein σ1 is the ReLU activation function and σ2 is the Sigmoid activation function; the linear transformations W_x, W_g and ψ contained in Θ_att are all 1 × 1 convolutions, and b_g and b_ψ are the bias terms of the corresponding convolutions; the vector x_i is the i-th feature of the current network layer, and the vector g_i is the i-th feature of the next layer.
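A minimal NumPy sketch of this attention gate, treating the 1 × 1 convolutions as per-pixel matrix multiplications; the stride-2 spatial alignment and nearest-neighbour up-sampling are assumptions about details the claim leaves open.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, w_x, w_g, w_psi, b_g=0.0, b_psi=0.0):
    """x: (Cx, H, W) current-layer features; g: (Cg, H//2, W//2) next-layer features.
    w_x: (Cint, Cx), w_g: (Cint, Cg), w_psi: (1, Cint) -- 1x1 convs as matrices."""
    # align x spatially with g (stride-2 subsampling stands in for a strided 1x1 conv)
    x_down = x[:, ::2, ::2]
    theta_x = np.einsum('ic,chw->ihw', w_x, x_down)
    phi_g = np.einsum('ic,chw->ihw', w_g, g) + b_g
    q = np.maximum(theta_x + phi_g, 0.0)                          # ReLU (sigma_1)
    alpha = sigmoid(np.einsum('oc,chw->ohw', w_psi, q) + b_psi)   # Sigmoid (sigma_2)
    # nearest-neighbour up-sample the coefficients back to x's resolution
    alpha_up = alpha.repeat(2, axis=1).repeat(2, axis=2)
    return x * alpha_up  # gate the skip-connection features
```

The output has the same shape as x, so it can be concatenated into the decoder exactly where the plain skip connection was.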
4. The line draft coloring method according to claim 1, wherein the discriminator model of the line draft coloring model is a fully convolutional network; the discriminator model takes as input the concatenation of a line draft image with either the generated image output by the generator model or the real image, and outputs a 30 × 30 matrix after several convolution blocks; each value in the matrix judges a small region of the original image, and the values are averaged to give the discriminator's final output.
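The 30 × 30 output in claim 4 is consistent with the layer layout of the standard pix2pix 70 × 70 PatchGAN (4 × 4 kernels, padding 1, three stride-2 convolutions followed by two stride-1 convolutions); the strides below are an assumption based on that reference design, since the claim does not list them.

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    # standard convolution output-size formula
    return (size + 2 * pad - kernel) // stride + 1

size = 256
for stride in (2, 2, 2, 1, 1):  # assumed PatchGAN strides
    size = conv_out(size, stride=stride)
print(size)  # 30
```

Each of the 30 × 30 outputs then scores one overlapping patch of the 256 × 256 input, as the claim describes.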
5. The line draft coloring method according to claim 1, wherein a loss function of a generator model of the line draft coloring model is:
L_G = λ_l1·L_l1 + λ_adv·L_adv + λ_tv·L_tv

wherein,

L_l1 = E_{I_s, I_h, I_r}[ ‖I_r − G(I_s, I_h)‖_1 ]

L_adv = E_{I_r, I_s}[ log D(I_s, I_r) ] + E_{I_h, I_s}[ log(1 − D(I_s, G(I_s, I_h))) ]

L_tv = Σ_{i,j} ( (x_{i+1,j} − x_{i,j})² + (x_{i,j+1} − x_{i,j})² )^{β/2}

wherein I_s is the line draft, I_h is the color prompt, I_r is the real image, G is the generator, D is the discriminator, λ_l1, λ_adv and λ_tv are weights, x_{i,j} denotes the pixel value at coordinates (i, j) in the generated image, and β is a hyper-parameter; E_{I_r,I_s}[·], E_{I_h,I_s}[·] and E_{I_s,I_h,I_r}[·] denote the mathematical expectations over the joint distributions of (I_r, I_s), (I_h, I_s) and (I_s, I_h, I_r), respectively.
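The L1 and total-variation terms above can be written directly in NumPy; this is a sketch of those two terms only (the adversarial term requires a trained discriminator and is omitted), and the function names are illustrative.

```python
import numpy as np

def l1_term(real, fake):
    """L_l1: mean absolute difference between the real and generated images."""
    return float(np.mean(np.abs(real - fake)))

def tv_term(img, beta=2.0):
    """L_tv: sum over pixels of (squared forward differences)^(beta/2)."""
    dh = img[1:, :-1] - img[:-1, :-1]   # x_{i+1,j} - x_{i,j}
    dw = img[:-1, 1:] - img[:-1, :-1]   # x_{i,j+1} - x_{i,j}
    return float(np.sum((dh ** 2 + dw ** 2) ** (beta / 2)))
```

With β = 2 the TV term reduces to the plain sum of squared neighbour differences, which penalizes high-frequency color noise in the generated image.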
7. The line draft coloring method according to claim 1, further comprising:
in the process of training the line draft coloring model, adding a learning rate decay mechanism to help training converge and prevent oscillation; wherein the learning rate decay mechanism is expressed as follows:
wherein α is the learning rate of the optimizer at the current stage, and the learning rate decay operation is performed every 15 rounds.
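A step-decay schedule of this kind can be sketched as follows; the decay factor of 0.5 is an assumed value, since the patent's exact decay expression is not reproduced in the text above.

```python
def decayed_lr(initial_lr, epoch, decay_every=15, factor=0.5):
    """Learning rate alpha at a given epoch under step decay applied
    every `decay_every` rounds; `factor` is an assumed hyper-parameter."""
    return initial_lr * factor ** (epoch // decay_every)
```

For example, with an initial rate of 2e-4 the rate halves at epochs 15, 30, 45, and so on, which is the behavior the claim describes.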
8. A line draft coloring device, characterized by comprising:
a training data set construction module, used for collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color prompts by a random masking and blurring method, and constructing a training data set;
a model improvement module, used for introducing an attention screening mechanism at the skip connections of the generator model of a pix2pixGAN model, so as to improve the pix2pixGAN model and obtain a line draft coloring model;
the model training module is used for training the line draft coloring model by utilizing the training data set;
and the coloring module is used for coloring the line draft to be colored by utilizing the trained line draft coloring model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111665537.6A CN114387365B (en) | 2021-12-30 | 2021-12-30 | Method and device for coloring line manuscript |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114387365A true CN114387365A (en) | 2022-04-22 |
CN114387365B CN114387365B (en) | 2024-09-24 |
Family
ID=81199755
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111665537.6A Active CN114387365B (en) | 2021-12-30 | 2021-12-30 | Method and device for coloring line manuscript |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114387365B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109712203A (en) * | 2018-12-29 | 2019-05-03 | 福建帝视信息科技有限公司 | A kind of image rendering methods based on from attention generation confrontation network |
CN112084728A (en) * | 2020-09-07 | 2020-12-15 | 中国人民解放军战略支援部队信息工程大学 | Pix2 pix-based PCB gray image coloring method and system |
CN112288645A (en) * | 2020-09-30 | 2021-01-29 | 西北大学 | Skull face restoration model construction method, restoration method and restoration system |
Non-Patent Citations (2)
Title |
---|
LIM, SO-HYUN, AND JUN-CHUL CHUN.: "Image-to-Image Translation Based on U-Net with R2 and Attention.", JOURNAL OF INTERNET COMPUTING AND SERVICES, vol. 21, no. 4, 31 August 2020 (2020-08-31) * |
姚瑶: "基于生成对抗网络的图像转换方法研究", 中国优秀硕士学位论文全文数据库 (信息科技辑), no. 2020, 15 February 2020 (2020-02-15) * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115953597A (en) * | 2022-04-25 | 2023-04-11 | 北京字跳网络技术有限公司 | Image processing method, apparatus, device and medium |
WO2023207779A1 (en) * | 2022-04-25 | 2023-11-02 | 北京字跳网络技术有限公司 | Image processing method and apparatus, device, and medium |
CN115953597B (en) * | 2022-04-25 | 2024-04-16 | 北京字跳网络技术有限公司 | Image processing method, device, equipment and medium |
CN115170388A (en) * | 2022-07-28 | 2022-10-11 | 西南大学 | Character line draft generation method, device, equipment and medium |
CN116433788A (en) * | 2023-02-24 | 2023-07-14 | 北京科技大学 | Gray image coloring method and device based on self-attention and generation countermeasure network |
CN117557589A (en) * | 2023-11-08 | 2024-02-13 | 深圳市闪剪智能科技有限公司 | Line drawing coloring method, device and storage medium based on neural network |
Also Published As
Publication number | Publication date |
---|---|
CN114387365B (en) | 2024-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114387365A (en) | Line draft coloring method and device | |
Jiang et al. | Image inpainting based on generative adversarial networks | |
CN113158862B (en) | Multitasking-based lightweight real-time face detection method | |
CN111861945B (en) | Text-guided image restoration method and system | |
CN111079532A (en) | Video content description method based on text self-encoder | |
CN115222998B (en) | Image classification method | |
CN116704079B (en) | Image generation method, device, equipment and storage medium | |
CN115641391A (en) | Infrared image colorizing method based on dense residual error and double-flow attention | |
CN113392711A (en) | Smoke semantic segmentation method and system based on high-level semantics and noise suppression | |
CN111127472A (en) | Multi-scale image segmentation method based on weight learning | |
CN111652864A (en) | Casting defect image generation method for generating countermeasure network based on conditional expression | |
CN116958324A (en) | Training method, device, equipment and storage medium of image generation model | |
CN116757986A (en) | Infrared and visible light image fusion method and device | |
CN115546171A (en) | Shadow detection method and device based on attention shadow boundary and feature correction | |
CN116051593A (en) | Clothing image extraction method and device, equipment, medium and product thereof | |
CN111814693A (en) | Marine ship identification method based on deep learning | |
CN110675311A (en) | Sketch generation method and device under sketch order constraint and storage medium | |
CN114694074A (en) | Method, device and storage medium for generating video by using image | |
CN117252892B (en) | Automatic double-branch portrait matting device based on light visual self-attention network | |
Zhao et al. | Guiding intelligent surveillance system by learning-by-synthesis gaze estimation | |
CN117788629A (en) | Image generation method, device and storage medium with style personalization | |
CN111311732A (en) | 3D human body grid obtaining method and device | |
CN116402702A (en) | Old photo restoration method and system based on deep neural network | |
CN115018729A (en) | White box image enhancement method for content | |
CN114519678A (en) | Scanning transmission image recovery method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||