CN114387365A - Line draft coloring method and device - Google Patents

Line draft coloring method and device

Info

Publication number
CN114387365A
CN114387365A (application CN202111665537.6A)
Authority
CN
China
Prior art keywords
model
coloring
line
color
line draft
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111665537.6A
Other languages
Chinese (zh)
Other versions
CN114387365B (en)
Inventor
王粉花
严由齐
郑嘉伟
林超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology Beijing USTB
Priority claimed from CN202111665537.6A
Publication of CN114387365A
Application granted
Publication of CN114387365B
Active legal status
Anticipated expiration legal status


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/001 Texturing; Colouring; Generation of texture or colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery


Abstract

The invention discloses a line draft coloring method and device. The method comprises: collecting color images, preprocessing them, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking and blurring method, and constructing a training data set; introducing an attention screening mechanism at the skip connections of the generator of a pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model; training the line draft coloring model with the training data set; and coloring a line draft to be colored with the trained line draft coloring model. The method improves the accuracy of the line draft coloring result and the quality of the generated picture, and during training it converges better and oscillates less.

Description

Line draft coloring method and device
Technical Field
The invention relates to the technical field of image processing, in particular to a line draft coloring method and device.
Background
Image coloring is a popular topic in computer vision; most coloring tasks color a grayscale image, for example restoring a black-and-white photograph to a color image. As the animation industry continues to develop, line draft coloring has become an important link in production, and good character color design attracts more viewers. Coloring a line draft is a complicated task: manually painting a line draft into a polished picture can take hours or even days. Although many deep learning methods achieve satisfactory coloring results on grayscale images, simply applying grayscale coloring or image translation methods to line drafts yields unsatisfactory results. A line draft contains only sparse edge contours; unlike a grayscale image, it carries no complete texture, shadow, or gradient information, which greatly increases the difficulty of coloring.
With the rapid development of deep learning, CNN (Convolutional Neural Network) based algorithms are widely applied in many fields, but convolutional networks alone handle some generative tasks poorly. After Goodfellow et al. proposed the GAN (Generative Adversarial Network), generative tasks such as super-resolution, image coloring, and style transfer gained a better solution. Isola et al. proposed a generative adversarial network for image-to-image translation (Pix2pixGAN) to accomplish translation mapping from a source image domain to a target image domain. Pix2pixGAN can also be used to color line drafts, but the overall effect is mediocre and the coloring result is random, so the overall color scheme of the image is disordered. To address this problem, Sangkloy et al. proposed controllable coloring with sparse color strokes on the line draft, i.e., coloring guided by some continuous colored lines added as hints. Zhang et al. extracted a 4096-dimensional global feature of a reference image with a pre-trained VGG16/19 model, injected it into the middle layers of the generator network, and added two guide decoders to avoid vanishing gradients in the middle layers. Junsoo et al. proposed an SCFT (Spatially Corresponding Feature Transfer) module to fuse the color information of a reference image into the line draft, together with a similarity-based triplet loss that pulls the coloring result closer to the reference image.
Wang et al. introduced a local feature network that extracts a semantic feature map from an early stage of a pre-trained network and used the local features as one of the inputs to the generator and the discriminator; however, since a line draft carries far less feature information than a grayscale image, the local feature network does not extract semantic features well. Guo et al. proposed a two-step coloring method: the model trained in the first step guesses the color regions and roughly fills multiple colors on the line draft, and the second step corrects wrong colors and refines the result; its disadvantage is that training requires manually producing a large number of images of the intermediate results of both steps.
In summary, most existing line draft coloring techniques improve on GAN, and they still suffer from color overflow (filled color bleeds into surrounding areas like watercolor), boundary line misrecognition (when lines are dense, other lines are mistaken for boundary lines, so adjacent unrelated areas are filled with the wrong color), color inconsistency (the color of some areas of the result does not match the color hint), excessive randomness of the coloring result, and similar defects.
Disclosure of Invention
The invention provides a line draft coloring method and device, aiming to solve the technical problems that the information transmitted through the skip connections of current models is redundant, that low-level features are not fully utilized, and that the coloring result is too random.
In order to solve the technical problems, the invention provides the following technical scheme:
In one aspect, the invention provides a line draft coloring method, comprising:
collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking and blurring method, and constructing a training data set;
introducing an attention screening mechanism at the skip connections of the generator of a pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model;
training the line draft coloring model with the training data set;
and coloring a line draft to be colored with the trained line draft coloring model.
Further, collecting the color images, preprocessing them, generating line drafts from the preprocessed color images, extracting corresponding color hints with the random masking and blurring method, and constructing the training data set comprises:
collecting color images and preprocessing them, wherein the preprocessing comprises rotating and mirroring part of the collected images for data augmentation and cropping all images to a size of 256 × 256;
generating a line draft from each preprocessed color image;
randomly covering each color image with 120 white blocks of size 20 × 20, then blurring the covered image with a mean filter of kernel size 100 × 100 to obtain the color hint corresponding to the color image;
and storing each color image together with its line draft and color hint to construct the training data set.
Further, the generator of the line draft coloring model comprises down-sampling modules, up-sampling modules, and attention screening modules; the generator produces a color image from a line draft and its corresponding color hint as follows:
first, the single-channel 256 × 256 line draft and the corresponding 256 × 256 × 3 color hint are concatenated and fed into the generator of the line draft coloring model; multi-layer down-sampling modules then reduce the spatial dimensions of the image and extract features at every scale; up-sampling modules recover deep information, and each up-sampling step is fused with the corresponding feature-extraction stage. Both module types use a ReLU (Rectified Linear Unit) activation layer, a 3 × 3 convolution, and a BN (Batch Normalization) layer as basic units, and introduce a residual mechanism to avoid the vanishing-gradient and degradation problems of deep networks; down-sampling and up-sampling are performed with max pooling and max unpooling. After the last up-sampling operation, a LeakyReLU activation layer, a 3 × 3 convolution layer, and a Tanh activation layer produce the final 256 × 256 color image;
the Attention screening module takes a current network layer characteristic x and a next network layer characteristic g as input, aligns and adds the size and the channel number through convolution operation, performs dimension compression by using ReLU activation and convolution operation, generates an Attention coefficient through Sigmoid activation, and finally upsamples the Attention coefficient to the original dimension of an original vector x, multiplies the original vector x by the Attention coefficient and outputs the Attention coefficient; the calculation method is as follows:
Figure BDA0003448221240000031
Figure BDA0003448221240000032
wherein σ1For ReLU activation function, σ2Activating a function for Sigmoid; thetaattInvolving linear transformation
Figure BDA0003448221240000033
Are all 1 × 1 convolution,
Figure BDA0003448221240000034
An offset term for the corresponding convolution; vector quantity
Figure BDA0003448221240000035
For the ith feature of the current network layer, vector giIs the ith feature in the next layer.
Further, the discriminator of the line draft coloring model is a fully convolutional network; it takes as input the concatenation of a line draft image with either the generated image output by the generator of the line draft coloring model or the real image, and outputs a 30 × 30 matrix after several convolutional blocks; each value in the matrix judges a small region of the original image, and the values are averaged to give the discriminator's final output.
Further, the loss function of the generator of the line draft coloring model is:

$L_G = \lambda_{l1} L_{l1} + \lambda_{adv} L_{adv} + \lambda_{tv} L_{tv}$

where

$L_{adv} = \mathbb{E}_{I_r, I_s}\!\left[\log D(I_s, I_r)\right] + \mathbb{E}_{I_h, I_s}\!\left[\log\left(1 - D\!\left(I_s, G(I_s, I_h)\right)\right)\right]$

$L_{l1} = \mathbb{E}_{I_s, I_h, I_r}\!\left[\lVert I_r - G(I_s, I_h) \rVert_1\right]$

$L_{tv} = \sum_{i,j}\left(\left(x_{i+1,j} - x_{i,j}\right)^2 + \left(x_{i,j+1} - x_{i,j}\right)^2\right)^{\beta/2}$

where $I_s$ is the line draft, $I_h$ the color hint, $I_r$ the real image, $G$ the generator, $D$ the discriminator, $\lambda_{l1}$, $\lambda_{adv}$, $\lambda_{tv}$ weights, $x_{i,j}$ the pixel value at coordinates $(i, j)$ of the generated image, and $\beta$ a hyperparameter; $\mathbb{E}_{I_r,I_s}$, $\mathbb{E}_{I_h,I_s}$, and $\mathbb{E}_{I_s,I_h,I_r}$ denote mathematical expectations over the corresponding joint distributions.
Further, the loss function of the discriminator of the line draft coloring model is:

$L_D = -\lambda_{adv} L_{adv}$

where $L_{adv}$ is the adversarial loss defined above.
Further, the method further comprises:
adding a learning rate decay mechanism during training of the line draft coloring model to help training converge and to prevent oscillation, wherein α, the learning rate of the optimizer at the current stage, is decayed once every 15 rounds.
In another aspect, the invention also provides a line draft coloring device, comprising:
a training data set construction module for collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking and blurring method, and constructing a training data set;
a model improvement module for introducing an attention screening mechanism at the skip connections of the generator of a pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model;
a model training module for training the line draft coloring model with the training data set;
and a coloring module for coloring a line draft to be colored with the trained line draft coloring model.
In yet another aspect, the present invention also provides an electronic device comprising a processor and a memory; wherein the memory has stored therein at least one instruction that is loaded and executed by the processor to implement the above-described method.
In yet another aspect, the present invention also provides a computer-readable storage medium having at least one instruction stored therein, the instruction being loaded and executed by a processor to implement the above method.
The technical scheme provided by the invention has at least the following beneficial effects:
compared with the pix2pixGAN model, the method automatically extracts color hints from color images for training, so the network can color correctly according to a given color hint; the learning rate decay mechanism helps training converge and prevents oscillation; and the added attention screening mechanism lets the model make fuller use of feature information, makes the coloring of detailed parts of the line draft more sensitive, and improves the accuracy of the coloring result and the quality of the generated picture.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is an execution flow diagram of a line draft coloring method according to an embodiment of the present invention;
fig. 2 is a schematic network structure diagram of a line draft coloring model generator according to an embodiment of the present invention;
fig. 3 is a schematic network structure diagram of an attention screening mechanism according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
First embodiment
This embodiment provides a line draft coloring method that can be implemented by an electronic device, which may be a terminal or a server. The execution flow of the method is shown in fig. 1 and comprises the following steps:
S1, collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking and blurring method, and constructing a training data set;
S2, introducing an attention screening mechanism at the skip connections of the generator of the pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model;
S3, training the line draft coloring model with the training data set;
and S4, coloring the line draft to be colored with the trained line draft coloring model.
Further, in this embodiment, S1 is implemented as follows:
collecting a large number of color images and preprocessing them, wherein the preprocessing comprises rotating and mirroring part of the collected images for data augmentation and cropping all images to a size of 256 × 256;
generating a line draft from each preprocessed color image;
randomly covering each color image with 120 white blocks of size 20 × 20, then blurring the covered image with a mean filter of kernel size 100 × 100 to obtain the color hint corresponding to the color image;
and storing each color image together with its line draft and color hint to construct the training data set.
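The hint-extraction step above (random white patches followed by a mean filter) can be sketched as follows. This is a minimal numpy illustration, not the patent's implementation; the function names, the `rng` parameter, and the integral-image `box_blur` helper are assumptions, and the test below uses small sizes in place of the 120 patches of 20 × 20 and the 100 × 100 kernel.

```python
import numpy as np

def box_blur(img, k):
    """Mean filter of kernel size k x k with edge-clamped 'same' padding,
    computed via an integral image (summed-area table)."""
    H, W, C = img.shape
    p = k // 2
    padded = np.pad(img, ((p, p), (p, p), (0, 0)), mode="edge")
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1, C))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    s = ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]
    return (s / (k * k))[:H, :W]

def make_color_hint(img, n_patches=120, patch=20, kernel=100, rng=None):
    """Cover the colour image with n_patches white blocks of size
    patch x patch at random positions, then blur with a mean filter."""
    if rng is None:
        rng = np.random.default_rng()
    hint = img.astype(np.float64).copy()
    H, W, _ = hint.shape
    for _ in range(n_patches):
        y = int(rng.integers(0, max(H - patch, 1)))
        x = int(rng.integers(0, max(W - patch, 1)))
        hint[y:y + patch, x:x + patch] = 255.0  # white cover block
    return box_blur(hint, kernel)
```

Blurring after masking spreads the surviving colors into the masked regions, so the hint carries coarse color placement rather than exact pixels.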
Further, in this embodiment, the generator of the line draft coloring model has the structure shown in fig. 2; it is designed on a U-Net architecture and consists of down-sampling modules, up-sampling modules, and attention screening modules. The generator produces a color image from a line draft and its corresponding color hint as follows:
first, the single-channel 256 × 256 line draft and the corresponding 256 × 256 × 3 color hint are concatenated and fed into the network; multi-layer down-sampling modules then reduce the spatial dimensions of the image and extract features at every scale; up-sampling modules recover deep information, and each up-sampling step is fused with the corresponding feature-extraction stage. Both module types use a ReLU (Rectified Linear Unit) activation layer, a 3 × 3 convolution, and a BN (Batch Normalization) layer as basic units, and introduce a residual mechanism to avoid the vanishing-gradient and degradation problems of deep networks; down-sampling and up-sampling are performed with max pooling and max unpooling. After the last up-sampling operation, a LeakyReLU activation layer, a 3 × 3 convolution layer, and a Tanh activation layer produce the final 256 × 256 color image.
The attention screening module, shown in fig. 3, takes the current-layer feature x and the next-layer feature g as input, aligns their sizes and channel numbers by convolution and adds them, compresses the dimensions with a ReLU activation and a further convolution, generates an attention coefficient through a Sigmoid activation, and finally up-samples the attention coefficient to the original dimensions of the vector x, multiplies x by it, and outputs the result; the specific computation is as follows:

$q_{att} = \psi^{T}\,\sigma_1\!\left(W_x^{T} x_i + W_g^{T} g_i + b_g\right) + b_{\psi}$

$\alpha_i = \sigma_2\!\left(q_{att}\left(x_i, g_i; \Theta_{att}\right)\right)$

where $\sigma_1$ is the ReLU activation function and $\sigma_2$ the Sigmoid activation function; $\Theta_{att}$ comprises the linear transformations $W_x$, $W_g$, and $\psi$, which are all 1 × 1 convolutions, together with $b_g$ and $b_{\psi}$, the bias terms of the corresponding convolutions; the vector $x_i$ is the i-th feature of the current network layer, and $g_i$ is the i-th feature of the next layer.
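The attention screening computation above can be sketched in numpy by writing each 1 × 1 convolution as a per-pixel matrix multiply over the channel axis. This is one plausible realization, not the patent's code: the 2 × 2 average pooling used to align x with g, the nearest-neighbour up-sampling of the coefficient, and all names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, Wx, Wg, bg, psi, bpsi):
    """x: (Cx, H, W) current-layer (skip) features; g: (Cg, H//2, W//2)
    next-layer gating features. Wx: (F, Cx), Wg: (F, Cg), psi: (1, F);
    bg: (F,), bpsi: (1,). Returns x gated by the attention coefficients."""
    Cx, H, W = x.shape
    # 2x2 average pool so x matches g's spatial size
    xs = x.reshape(Cx, H // 2, 2, W // 2, 2).mean(axis=(2, 4))
    # 1x1 convolutions as channel-axis matrix multiplies, then add
    f = (np.einsum('fc,chw->fhw', Wx, xs)
         + np.einsum('fc,chw->fhw', Wg, g)
         + bg[:, None, None])
    # ReLU, dimension compression by psi, Sigmoid -> attention coefficients
    q = np.einsum('of,fhw->ohw', psi, np.maximum(f, 0.0)) + bpsi[:, None, None]
    alpha = sigmoid(q)                                  # (1, H//2, W//2)
    # nearest-neighbour upsample back to x's resolution and gate x
    alpha_up = alpha.repeat(2, axis=1).repeat(2, axis=2)
    return x * alpha_up
```

With all-zero weights the coefficient is sigmoid(0) = 0.5, so every skip feature is passed at half strength; trained weights instead suppress features in irrelevant regions.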
Further, in this embodiment, the discriminator of the line draft coloring model draws on the idea of PatchGAN and is a fully convolutional network. The discriminator of a typical GAN maps a 256 × 256 image to a single scalar output that measures the authenticity of the whole image; such an output weighs the entire picture, cannot reflect local image characteristics well, and is impractical for tasks with high precision requirements. To better judge local features, the discriminator of this embodiment takes the concatenation of the line draft with either the generated image or the real image as input and outputs a 30 × 30 matrix after several convolutional blocks; each value in the matrix judges a small region of the original image, and the values are averaged to give the discriminator's final output.
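The final averaging of the 30 × 30 patch judgments is simple enough to state directly; the following sketch (function name assumed) shows the reduction from per-patch scores to the discriminator's single output.

```python
def patch_decision(score_map):
    """Average a grid of per-patch realism scores (e.g. the 30 x 30
    matrix produced by the fully convolutional discriminator) into the
    discriminator's single final output."""
    values = [v for row in score_map for v in row]
    return sum(values) / len(values)
```

Because each score judges only a small receptive field, the mean rewards images that look real everywhere, not just globally.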
Further, in this embodiment, the loss function of the generator of the line draft coloring model is:

$L_G = \lambda_{l1} L_{l1} + \lambda_{adv} L_{adv} + \lambda_{tv} L_{tv}$

and the loss function of the discriminator of the line draft coloring model is:

$L_D = -\lambda_{adv} L_{adv}$

where:

$L_{adv} = \mathbb{E}_{I_r, I_s}\!\left[\log D(I_s, I_r)\right] + \mathbb{E}_{I_h, I_s}\!\left[\log\left(1 - D\!\left(I_s, G(I_s, I_h)\right)\right)\right]$

$L_{l1} = \mathbb{E}_{I_s, I_h, I_r}\!\left[\lVert I_r - G(I_s, I_h) \rVert_1\right]$

$L_{tv} = \sum_{i,j}\left(\left(x_{i+1,j} - x_{i,j}\right)^2 + \left(x_{i,j+1} - x_{i,j}\right)^2\right)^{\beta/2}$

where $I_s$ is the line draft, $I_h$ the corresponding color hint, $I_r$ the real image, $G$ the generator, and $D$ the discriminator; the weights are $\lambda_{l1} = 100$, $\lambda_{adv} = 1$, $\lambda_{tv} = 1$; $x_{i,j}$ is the pixel value at coordinates $(i, j)$ of the generated image, and $\beta$ is a hyperparameter, set to $\beta = 2$ in this embodiment; $\mathbb{E}_{I_r,I_s}$, $\mathbb{E}_{I_h,I_s}$, and $\mathbb{E}_{I_s,I_h,I_r}$ denote mathematical expectations over the corresponding joint distributions.
Further, in this embodiment, the line draft coloring method further comprises:
adding a learning rate decay mechanism during training of the line draft coloring model to help training converge and to prevent oscillation, wherein α, the learning rate of the optimizer at the current stage, is decayed once every 15 rounds.
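A step-decay schedule of the kind described above can be sketched as follows. The decay factor 0.9 and the function name are assumptions: the text states only that the learning rate α is decayed every 15 rounds, and the decay expression itself is not reproduced here.

```python
def decayed_lr(base_lr, epoch, every=15, factor=0.9):
    """Step decay: shrink the learning rate by `factor` once every
    `every` epochs (factor=0.9 is an assumed value)."""
    return base_lr * factor ** (epoch // every)
```

For example, with a base learning rate of 2e-4 the schedule stays flat for epochs 0 to 14, then drops by the factor at each 15-epoch boundary.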
In summary, this embodiment trains with automatically extracted color hints and, at coloring time, adds the color hint as auxiliary information to guide the coloring, achieving interactive coloring so that the corresponding part of the line draft is painted with the desired color. The attention screening mechanism introduced at the skip connections of the generator lets the network automatically learn to distinguish the shape and size of the target and improves the model's sensitivity and accuracy by suppressing the activation of features in irrelevant regions, thereby improving the accuracy of the coloring result and the quality of the generated picture. In addition, the learning rate decay mechanism added during training aids convergence and prevents oscillation.
Second embodiment
This embodiment provides a line draft coloring device comprising the following modules:
a training data set construction module for collecting color images, preprocessing the collected color images, generating line drafts from the preprocessed color images, extracting corresponding color hints with a random masking and blurring method, and constructing a training data set;
a model improvement module for introducing an attention screening mechanism at the skip connections of the generator of the pix2pixGAN model, thereby improving the pix2pixGAN model into a line draft coloring model;
a model training module for training the line draft coloring model with the training data set;
and a coloring module for coloring a line draft to be colored with the trained line draft coloring model.
The line draft coloring device of this embodiment corresponds to the line draft coloring method of the first embodiment, and the functions implemented by its modules correspond one-to-one to the flow steps of that method; they are therefore not described again here.
Third embodiment
The present embodiment provides an electronic device, which includes a processor and a memory; wherein the memory has stored therein at least one instruction that is loaded and executed by the processor to implement the method of the first embodiment.
The electronic device may vary considerably in configuration and performance; it may include one or more processors (CPUs) and one or more memories, wherein the memory stores at least one instruction that is loaded by the processor to perform the above method.
Fourth embodiment
This embodiment provides a computer-readable storage medium in which at least one instruction is stored; the instruction is loaded and executed by a processor to implement the method of the first embodiment. The computer-readable storage medium may be a ROM, a random access memory, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like. The instructions stored therein may be loaded by a processor in the terminal to perform the above method.
Furthermore, it should be noted that the present invention may be provided as a method, apparatus or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied in the medium.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
Finally, it should be noted that while the above describes a preferred embodiment of the invention, it will be appreciated by those skilled in the art that, once the basic inventive concepts have been learned, numerous changes and modifications may be made without departing from the principles of the invention, which shall be deemed to be within the scope of the invention. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.

Claims (8)

1. A line draft coloring method is characterized by comprising the following steps:
collecting a color image, preprocessing the collected color image, generating a line draft according to the preprocessed color image, extracting a corresponding color prompt by adopting a random covering fuzzy method, and constructing a training data set;
introducing an attention screening mechanism at a jumping connection part of a generator model of the pix2pixGAN model to improve the pix2pixGAN model to obtain a line draft coloring model;
training the line draft coloring model by using the training data set;
and coloring the line draft to be colored by utilizing the trained line draft coloring model.
2. The line draft coloring method according to claim 1, wherein collecting a color image, preprocessing the collected color image, generating a line draft from the preprocessed color image, extracting a corresponding color prompt by a random covering fuzzy method, and constructing a training data set comprises:
collecting a color image, and preprocessing the collected color image; wherein preprocessing the collected color images comprises: rotating and mirroring a part of the collected color images to augment the data, and cropping all the images, the cropped size being 256 × 256;
generating a line draft according to the preprocessed color image;
randomly covering the color image with 120 white color blocks of size 20 × 20, and then blurring the covered image with a mean filter with a kernel size of 100 × 100 to obtain the color prompt corresponding to the color image;
and storing the color image and the corresponding line draft and color prompt thereof to construct a training data set.
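For illustration only (not part of the claim), the random cover-and-blur hint extraction described above could be sketched as follows in NumPy; the function name, the pure-NumPy integral-image box filter, and the edge padding mode are assumptions, since the claim does not fix an implementation:

```python
import numpy as np

def extract_color_hint(color_img, n_blocks=120, block=20, kernel=100, seed=0):
    """Cover a color image with random white blocks, then mean-filter it,
    approximating the claim's random-cover-and-blur hint extraction."""
    rng = np.random.default_rng(seed)
    covered = color_img.astype(np.float64)
    h, w = covered.shape[:2]
    for _ in range(n_blocks):
        y = rng.integers(0, h - block + 1)  # top-left corner of a white block
        x = rng.integers(0, w - block + 1)
        covered[y:y + block, x:x + block] = 255.0
    # mean (box) filter with a 100x100 kernel via an integral image, per channel
    pad = kernel // 2
    out = np.empty_like(covered)
    for c in range(covered.shape[2]):
        p = np.pad(covered[:, :, c], pad, mode='edge')
        ii = np.pad(p, ((1, 0), (1, 0))).cumsum(0).cumsum(1)  # integral image
        s = (ii[kernel:, kernel:] - ii[:-kernel, kernel:]
             - ii[kernel:, :-kernel] + ii[:-kernel, :-kernel])
        out[:, :, c] = s[:h, :w] / (kernel * kernel)
    return out
```

With kernel 100 the blur is heavy enough that the white blocks dissolve into soft color hints rather than hard edges.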
3. The line draft coloring method according to claim 1, wherein the generator model of the line draft coloring model comprises an up-sampling module, a down-sampling module and an attention screening module; the process of generating the color image according to the line draft and the corresponding color prompt by the generator model of the line draft coloring model comprises the following steps:
firstly, concatenating a single-channel line draft of size 256 × 256 with the corresponding color prompt of size 256 × 256 × 3 and feeding the result into the generator model of the line draft coloring model; then reducing the image dimensions through a multilayer down-sampling module and extracting features at all scales; then extracting deep information through an up-sampling module, each up-sampling step being fused with the corresponding feature-extraction stage; both the down-sampling and up-sampling modules take a ReLU (Rectified Linear Unit) activation layer, a 3 × 3 convolution, and a BN (Batch Normalization) layer as basic units, and introduce a residual mechanism to avoid the gradient-vanishing and degradation problems of deep networks; the down-sampling and up-sampling operations scale the features using max-pooling and max-unpooling; after the up-sampling operations, a LeakyReLU activation layer, a 3 × 3 convolution layer, and a Tanh activation layer follow, finally generating a 256 × 256 color image;
the attention screening module takes the current network layer feature x and the next network layer feature g as input; it aligns their sizes and channel numbers through convolution operations and adds them, compresses the dimensions using a ReLU activation and a convolution operation, generates an attention coefficient through a Sigmoid activation, and finally up-samples the attention coefficient to the original dimensions of the vector x, multiplies it with x, and outputs the result; the calculation is as follows:

q_att = ψ^T( σ1( W_x^T x_i + W_g^T g_i + b_g ) ) + b_ψ

α_i = σ2( q_att(x_i, g_i; Θ_att) )

wherein σ1 is the ReLU activation function and σ2 is the Sigmoid activation function; Θ_att comprises the linear transformations W_x, W_g and ψ, which are all 1 × 1 convolutions, and b_g and b_ψ, the offset terms of the corresponding convolutions; the vector x_i is the ith feature of the current network layer, and the vector g_i is the ith feature of the next layer.
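As a non-claimed sketch, the attention screening module's forward pass can be expressed in plain NumPy, treating each 1 × 1 convolution as a per-pixel matrix multiply; the stride-2 subsampling used to align x with g and the nearest-neighbour up-sampling of the coefficients are assumptions about details the claim leaves open:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, Wx, Wg, bg, psi, bpsi):
    """x: (H, W, Cx) current-layer features; g: (H//2, W//2, Cg) next-layer
    features. Wx (Cx, Ci), Wg (Cg, Ci) and psi (Ci, 1) play the role of the
    1x1 convolutions; bg and bpsi are their offset (bias) terms."""
    xs = x[::2, ::2] @ Wx                     # align x to g's spatial size
    q = relu(xs + g @ Wg + bg) @ psi + bpsi   # q_att: one value per position
    alpha = sigmoid(q)                        # attention coefficients in (0, 1)
    alpha = alpha.repeat(2, axis=0).repeat(2, axis=1)  # back to x's size
    return x * alpha                          # gate the skip-connection features
```

Because the coefficients lie strictly in (0, 1), the gate can only attenuate skip-connection features, never amplify them.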
4. The line draft coloring method according to claim 1, wherein the discriminator model of the line draft coloring model is a fully convolutional network; the discriminator model takes as input the concatenation of a line draft image with either a generated image output by the generator model or a real image, and outputs a 30 × 30 matrix after passing through several convolutional blocks; each value in the matrix judges a small patch of the original image, and the values are averaged to form the final output of the discriminator.
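The 30 × 30 output is consistent with a pix2pix-style PatchGAN stack of 4 × 4 convolutions with padding 1 and strides 2, 2, 2, 1, 1 (an assumption, since the claim does not list the layers); the arithmetic can be checked directly:

```python
def conv_out(n, kernel=4, stride=2, pad=1):
    """Spatial output size of one convolution layer."""
    return (n + 2 * pad - kernel) // stride + 1

size = 256
for stride in (2, 2, 2, 1, 1):  # assumed pix2pix-style discriminator stack
    size = conv_out(size, stride=stride)
print(size)  # 256 -> 128 -> 64 -> 32 -> 31 -> 30
```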
5. The line draft coloring method according to claim 1, wherein the loss function of the generator model of the line draft coloring model is:

L_G = λ_l1 · L_l1 + λ_adv · L_adv + λ_tv · L_tv

wherein

L_l1 = E_{I_s, I_h, I_r}[ ‖ I_r − G(I_s, I_h) ‖_1 ]

L_adv = E_{I_r, I_s}[ log D(I_s, I_r) ] + E_{I_h, I_s}[ log(1 − D(I_s, G(I_s, I_h))) ]

L_tv = Σ_{i,j} ( (x_{i+1,j} − x_{i,j})² + (x_{i,j+1} − x_{i,j})² )^{β/2}

wherein I_s is the line draft, I_h is the color prompt, I_r is the real image, G is the generator, D is the discriminator, λ_l1, λ_adv, λ_tv are weights, x_{i,j} denotes the pixel value at coordinates (i, j) in the image, and β is a hyper-parameter; E_{I_r, I_s}, E_{I_h, I_s}, and E_{I_s, I_h, I_r} are the mathematical expectations over the joint distributions of I_r and I_s, of I_h and I_s, and of I_s, I_h and I_r, respectively.
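A minimal NumPy sketch of the non-adversarial terms of this loss follows; the β/2-power form of the total-variation term and the example weights are assumptions consistent with the symbols the claim defines, not the claimed implementation:

```python
import numpy as np

def l1_term(real, fake):
    """L_l1: mean absolute error between the real and generated image."""
    return np.mean(np.abs(real - fake))

def tv_term(img, beta=2.0):
    """L_tv: penalises differences between neighbouring pixels x_{i,j}."""
    dh = img[1:, :] - img[:-1, :]   # x_{i+1,j} - x_{i,j}
    dw = img[:, 1:] - img[:, :-1]   # x_{i,j+1} - x_{i,j}
    return np.sum((dh[:, :-1] ** 2 + dw[:-1, :] ** 2) ** (beta / 2.0))

def generator_loss(real, fake, adv, l_l1=100.0, l_adv=1.0, l_tv=1e-4):
    """L_G = λ_l1 L_l1 + λ_adv L_adv + λ_tv L_tv (weights are illustrative)."""
    return l_l1 * l1_term(real, fake) + l_adv * adv + l_tv * tv_term(fake)
```

On a constant image both the L1 term (against itself) and the TV term vanish, which is a quick sanity check on any implementation.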
6. The line draft coloring method according to claim 5, wherein the loss function of the discriminator model of the line draft coloring model is:

L_D = −λ_adv · L_adv

wherein

L_adv = E_{I_r, I_s}[ log D(I_s, I_r) ] + E_{I_h, I_s}[ log(1 − D(I_s, G(I_s, I_h))) ]
7. The line draft coloring method according to claim 1, further comprising:

in the process of training the line draft coloring model, adding a learning rate decay mechanism to aid training convergence and prevent oscillation; wherein the learning rate decay is expressed as:

α ← d · α

wherein α is the learning rate of the optimizer at the current stage and d is the decay factor; the learning rate decay operation is performed every 15 rounds.
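Since the claim only fixes the 15-round interval, the following step-decay schedule is one plausible reading; the initial rate and the decay factor of 0.5 are assumptions, not values from the claim:

```python
def learning_rate(epoch, base_lr=2e-4, decay=0.5, step=15):
    """Step decay: the rate is multiplied by `decay` once every `step` epochs."""
    return base_lr * decay ** (epoch // step)
```

For example, with these assumed values the rate halves at epochs 15, 30, 45, and so on, while staying constant within each 15-epoch window.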
8. A line draft coloring device, characterized by comprising:
the training data set building module is used for collecting color images, preprocessing the collected color images, generating line drafts according to the preprocessed color images, extracting corresponding color prompts by adopting a random covering fuzzy method and building a training data set;
the model improvement module is used for introducing an attention screening mechanism into a jumping connection part of a generator model of the pix2pixGAN model so as to improve the pix2pixGAN model and obtain a line draft coloring model;
the model training module is used for training the line draft coloring model by utilizing the training data set;
and the coloring module is used for coloring the line draft to be colored by utilizing the trained line draft coloring model.
CN202111665537.6A 2021-12-30 2021-12-30 Method and device for coloring line manuscript Active CN114387365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111665537.6A CN114387365B (en) 2021-12-30 2021-12-30 Method and device for coloring line manuscript

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111665537.6A CN114387365B (en) 2021-12-30 2021-12-30 Method and device for coloring line manuscript

Publications (2)

Publication Number Publication Date
CN114387365A true CN114387365A (en) 2022-04-22
CN114387365B CN114387365B (en) 2024-09-24

Family

ID=81199755

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111665537.6A Active CN114387365B (en) 2021-12-30 2021-12-30 Method and device for coloring line manuscript

Country Status (1)

Country Link
CN (1) CN114387365B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115170388A (en) * 2022-07-28 2022-10-11 西南大学 Character line draft generation method, device, equipment and medium
CN115953597A (en) * 2022-04-25 2023-04-11 北京字跳网络技术有限公司 Image processing method, apparatus, device and medium
CN116433788A (en) * 2023-02-24 2023-07-14 北京科技大学 Gray image coloring method and device based on self-attention and generation countermeasure network
WO2023207779A1 (en) * 2022-04-25 2023-11-02 北京字跳网络技术有限公司 Image processing method and apparatus, device, and medium
CN117557589A (en) * 2023-11-08 2024-02-13 深圳市闪剪智能科技有限公司 Line drawing coloring method, device and storage medium based on neural network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109712203A (en) * 2018-12-29 2019-05-03 福建帝视信息科技有限公司 A kind of image rendering methods based on from attention generation confrontation network
CN112084728A (en) * 2020-09-07 2020-12-15 中国人民解放军战略支援部队信息工程大学 Pix2 pix-based PCB gray image coloring method and system
CN112288645A (en) * 2020-09-30 2021-01-29 西北大学 Skull face restoration model construction method, restoration method and restoration system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIM, SO-HYUN and JUN-CHUL CHUN: "Image-to-Image Translation Based on U-Net with R2 and Attention", JOURNAL OF INTERNET COMPUTING AND SERVICES, vol. 21, no. 4, 31 August 2020 (2020-08-31) *
姚瑶: "基于生成对抗网络的图像转换方法研究", 中国优秀硕士学位论文全文数据库 (信息科技辑), no. 2020, 15 February 2020 (2020-02-15) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115953597A (en) * 2022-04-25 2023-04-11 北京字跳网络技术有限公司 Image processing method, apparatus, device and medium
WO2023207779A1 (en) * 2022-04-25 2023-11-02 北京字跳网络技术有限公司 Image processing method and apparatus, device, and medium
CN115953597B (en) * 2022-04-25 2024-04-16 北京字跳网络技术有限公司 Image processing method, device, equipment and medium
CN115170388A (en) * 2022-07-28 2022-10-11 西南大学 Character line draft generation method, device, equipment and medium
CN116433788A (en) * 2023-02-24 2023-07-14 北京科技大学 Gray image coloring method and device based on self-attention and generation countermeasure network
CN117557589A (en) * 2023-11-08 2024-02-13 深圳市闪剪智能科技有限公司 Line drawing coloring method, device and storage medium based on neural network

Also Published As

Publication number Publication date
CN114387365B (en) 2024-09-24

Similar Documents

Publication Publication Date Title
CN114387365A (en) Line draft coloring method and device
Jiang et al. Image inpainting based on generative adversarial networks
CN113158862B (en) Multitasking-based lightweight real-time face detection method
CN111861945B (en) Text-guided image restoration method and system
CN111079532A (en) Video content description method based on text self-encoder
CN115222998B (en) Image classification method
CN116704079B (en) Image generation method, device, equipment and storage medium
CN115641391A (en) Infrared image colorizing method based on dense residual error and double-flow attention
CN113392711A (en) Smoke semantic segmentation method and system based on high-level semantics and noise suppression
CN111127472A (en) Multi-scale image segmentation method based on weight learning
CN111652864A (en) Casting defect image generation method for generating countermeasure network based on conditional expression
CN116958324A (en) Training method, device, equipment and storage medium of image generation model
CN116757986A (en) Infrared and visible light image fusion method and device
CN115546171A (en) Shadow detection method and device based on attention shadow boundary and feature correction
CN116051593A (en) Clothing image extraction method and device, equipment, medium and product thereof
CN111814693A (en) Marine ship identification method based on deep learning
CN110675311A (en) Sketch generation method and device under sketch order constraint and storage medium
CN114694074A (en) Method, device and storage medium for generating video by using image
CN117252892B (en) Automatic double-branch portrait matting device based on light visual self-attention network
Zhao et al. Guiding intelligent surveillance system by learning-by-synthesis gaze estimation
CN117788629A (en) Image generation method, device and storage medium with style personalization
CN111311732A (en) 3D human body grid obtaining method and device
CN116402702A (en) Old photo restoration method and system based on deep neural network
CN115018729A (en) White box image enhancement method for content
CN114519678A (en) Scanning transmission image recovery method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant