CN110717953B - Coloring method and system for black-and-white pictures based on a CNN-LSTM combination model - Google Patents
Coloring method and system for black-and-white pictures based on a CNN-LSTM combination model
- Publication number
- CN110717953B (application CN201910914057.5A)
- Authority
- CN
- China
- Prior art keywords
- color
- black
- coloring
- white
- images
- Prior art date
- Legal status
- Active
Classifications
- G06T11/001 — 2D image generation: Texturing; Colouring; Generation of texture or colour
- G06N3/044 — Neural network architecture: Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Neural network architecture: Combinations of networks
- G06N3/084 — Learning methods: Backpropagation, e.g. using gradient descent
- Y02P90/30 — Climate change mitigation in production: Computing systems specially adapted for manufacturing
Abstract
The invention discloses a coloring method and system for black-and-white pictures based on a CNN-LSTM combination model. The method comprises the following steps: collecting a plurality of color initial images, converting them into a plurality of black-and-white images, and forming a training set and a testing set based on the color images and the black-and-white images respectively; inputting the black-and-white images and the color images of the training set into a CNN-LSTM convolutional neural network model, generating black-and-white and color feature maps respectively, and performing a color matching test on the black-and-white feature maps and the generated color feature maps to finally obtain a coloring network model; and coloring the black-and-white image to be colored by using the coloring network model. The invention improves the coloring accuracy of the finally obtained coloring network model, colors black-and-white images automatically, and eliminates the color inconsistency between successive frame pictures, thereby generating smoother color images and an overall pleasing viewing experience.
Description
Technical Field
The application relates to the technical field of image recognition, in particular to a coloring method and a coloring system of black-and-white pictures based on a CNN-LSTM combination model.
Background
Image coloring is a basic means of image enhancement: a computer-aided processing technology for adding colors to images or videos. Complementing black-and-white pictures with color yields better visual effects, and the technique has very wide application in entertainment, education, scientific research, medical treatment and other fields.
Image coloring methods in the prior art fall mainly into two categories: methods based on user prompts, and methods that color black-and-white pictures without any coloring prompt. The first kind requires manual intervention, which increases labor and time costs. The second kind uses deep learning: low-level visual patches and spatial pixel coordinates are fed to a deep neural network, which automatically extracts, from a single image, features suitable for a user-specified stroke; used as a classifier, the network then estimates per-pixel stroke probabilities across the image, each probability representing the likelihood that a pixel belongs to a given stroke. Although this method is automated, its coloring effect is poor, the colors are monotonous, and the accuracy is relatively low.
Disclosure of Invention
The present application aims to overcome or at least partially solve or alleviate the above-mentioned problems.
According to one aspect of the present application, there is provided a coloring method of black-and-white pictures based on a CNN-LSTM combination model, the method comprising the steps of:
collecting a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and forming a training set and a testing set based on the plurality of color images and the plurality of black-and-white images respectively;
inputting a plurality of black-and-white images in a training set into a CNN-LSTM convolutional neural network model, extracting the characteristics of the black-and-white images and generating a black-and-white characteristic map; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting features of the plurality of color images to generate a color feature map, and performing color matching test based on the black-white feature map and the generated color feature map to obtain a coloring training model;
inputting the test samples in the test set into the coloring training model, and performing color matching test on black-and-white images and color images in the test samples to generate color matching test results; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and coloring the black-and-white image to be colored by using the coloring network model.
Optionally, the plurality of color initial images are frame pictures extracted from an original video.
Optionally, the plurality of color initial images are derived from an ImageNet dataset.
Optionally, the CNN structure in the CNN-LSTM convolutional neural network model includes a plurality of groups of conversion layers and a full connection layer, where each group of conversion layers includes a convolutional layer and a pooling layer, and the full connection layer is connected after the pooling layer.
Optionally, the generating the black-and-white feature map includes the substeps of:
inputting a plurality of black-and-white images into a convolution layer to obtain a first characteristic map;
inputting the first characteristic map into a pooling layer and obtaining a second characteristic map by using a back propagation method;
combining the features in the second feature map and the features in the first feature map by using the full-connection layer to form combined features;
inputting the combined features into a Softmax layer to classify objects in a black-and-white image and generate CIE Lab color space data;
inputting the CIE Lab color space data into an LSTM long-short time memory network structure for training to obtain a coloring training model.
Optionally, a ReLU nonlinear layer is added after each convolution layer and used as its activation function.
Optionally, the obtaining the coloring training model includes the sub-steps of:
establishing an objective function by utilizing the CIE Lab color space data, and predicting the Euclidean loss between the predicted color and the ground truth color by utilizing the objective function;
the objective function is:

$$L_2(\hat{Y}, Y) = \frac{1}{2} \sum_{h,w} \left\| Y_{h,w} - \hat{Y}_{h,w} \right\|_2^2$$

wherein $L_2$ is the Euclidean loss between the predicted color and the ground truth color; $Y$ is the objective function and $\hat{Y}$ is its mapping; $h$ and $w$ are the height and width of the channels of the CIE Lab color space, respectively;
given a mapping model $\hat{Z} = \mathcal{G}(X)$, the probability distribution of colors in the mapping model is:

$$\hat{Z} \in [0,1]^{h \times w \times q}$$

wherein $q$ is the number of quantized $a, b$ values; the prediction $\hat{Z}$ is compared against the ground truth model $Z$, defined as $Z = \mathcal{G}_{g,t}^{-1}(Y)$, which converts the objective function $Y$ of the ground truth colors into a vector $Z$, wherein $g$ is the number of nodes and $t$ is a time parameter;
the rarity of color classes is rebalanced with a cross entropy loss function:

$$L_{cl}(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}$$

wherein $L_{cl}$ is the cross entropy loss value; $v$ is a weighting value; $q$ is the number of quantized $a, b$ values; $h$ and $w$ are the height and width of the channels of the CIE Lab color space, respectively;
$Y$ is colored by using a function $H$ of the mapping model $Z$:

$$\hat{Y} = H(\hat{Z})$$
optionally, before the step of combining the features in the second feature map and the features in the first feature map by using the full-connection layer to form a combined feature, the method further includes a normalization step: and processing the second characteristic spectrum by using a batch standardization method to obtain a standardized characteristic spectrum.
According to another aspect of the present application, there is provided a coloring system for black-and-white pictures based on a CNN-LSTM combination model, the system comprising:
the acquisition module is used for acquiring a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and respectively forming a training set and a testing set based on the plurality of color images and the plurality of black-and-white images;
the training module is used for inputting a plurality of black-and-white images in the training set into the CNN-LSTM convolutional neural network model, extracting the characteristics of the black-and-white images and generating a black-and-white characteristic map; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting features of the plurality of color images to generate a color feature map, and performing color matching test on the black-white feature map and the generated color feature map to obtain a coloring training model;
the coloring network model generation module is used for inputting the test samples in the test set into the coloring training model, and performing color matching test on the black-and-white images and the color images in the test samples to generate color matching test results; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and the coloring module is used for coloring the black-and-white image to be colored by utilizing the coloring network model.
Optionally, the plurality of color initial images are frame pictures extracted from an original video.
Optionally, the CNN structure in the CNN-LSTM convolutional neural network model includes multiple groups of conversion layers and full connection layers, each group of conversion layers including a convolutional layer and a pooling layer; the full-connection layer is connected to the pooling layer;
the training module performs the following operations:
inputting a plurality of black-and-white images into a convolution layer to obtain a first characteristic map;
inputting the first characteristic map into a pooling layer and obtaining a second characteristic map by using a back propagation method;
combining the features in the second feature map and the features in the first feature map by using the full-connection layer to form combined features;
inputting the combined features into a Softmax layer, classifying objects in a black-and-white image according to the combined features, and generating CIE Lab color space data according to classification results;
inputting the CIE Lab color space data into an LSTM long-short time memory network structure to perform training and learning, and obtaining a coloring training model.
According to the coloring method of black-and-white pictures based on the CNN-LSTM combination model, the black-and-white images are input into the CNN-LSTM combination model for training, with the color images as the final matching target; the error between the features of the color feature map and of the black-and-white feature map is back-propagated and the weights are updated. This improves the coloring accuracy of the finally obtained coloring network model, allows black-and-white images to be colored automatically, and eliminates the color inconsistency between successive frame pictures, generating smoother color images and an overall pleasing viewing experience.
The above, as well as additional objectives, advantages, and features of the present application will become apparent to those skilled in the art from the following detailed description of a specific embodiment of the present application when read in conjunction with the accompanying drawings.
Drawings
Some specific embodiments of the present application will be described in detail hereinafter by way of example and not by way of limitation with reference to the accompanying drawings. The same reference numbers will be used throughout the drawings to refer to the same or like parts or portions. It will be appreciated by those skilled in the art that the drawings are not necessarily drawn to scale. In the accompanying drawings:
FIG. 1 is a flow chart of a method for coloring black and white pictures based on a CNN-LSTM combination model according to one embodiment of the present application;
FIG. 2 is a schematic diagram of a coloring system for black and white pictures based on a CNN-LSTM combination model according to one embodiment of the present application;
FIG. 3 is a schematic diagram of a computing device according to an embodiment of the present application;
fig. 4 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present application.
Detailed Description
FIG. 1 is a flow chart of a method for coloring black-and-white pictures based on a CNN-LSTM combination model according to one embodiment of the present application; the coloring method comprises the following steps:
step 100, collecting a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and forming a training set and a testing set based on the plurality of color images and the plurality of black-and-white images;
alternatively, the color initial image may be a frame picture extracted from the original video.
Optionally, the color initial image is derived from an ImageNet dataset.
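As a concrete illustration of this collection step, the conversion to black-and-white images and the split into training and testing sets could be sketched as follows in Python. The images are assumed to arrive as RGB arrays with values in [0, 1]; the BT.601 luminance weights and the 80/20 split ratio are illustrative assumptions, not specified by the application.

```python
import numpy as np

def to_black_and_white(rgb):
    """Convert an H x W x 3 RGB image (floats in [0, 1]) to a single-channel
    grayscale image using the ITU-R BT.601 luminance weights."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def build_train_test_sets(color_images, train_fraction=0.8, seed=0):
    """Pair every color initial image with its black-and-white version,
    then split the pairs into a training set and a testing set."""
    pairs = [(to_black_and_white(img), img) for img in color_images]
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(pairs))
    cut = int(len(pairs) * train_fraction)
    train = [pairs[i] for i in order[:cut]]
    test = [pairs[i] for i in order[cut:]]
    return train, test

# Ten random 32x32 "color initial images" stand in for the real dataset.
images = [np.random.default_rng(i).random((32, 32, 3)) for i in range(10)]
train_set, test_set = build_train_test_sets(images)
```

In practice the pairs would be fed to the model of step 200; the random arrays merely demonstrate the shapes involved.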
Step 200: inputting a plurality of black-and-white images in the training set into the CNN-LSTM convolutional neural network model, extracting the characteristics of the black-and-white images and generating a black-and-white feature map; inputting a plurality of color images in the training set into the CNN-LSTM convolutional neural network model, and extracting the characteristics of the plurality of color images to generate a color feature map; obtaining a coloring training model based on the black-and-white feature map and the generated color feature map.
The CNN structure in the CNN-LSTM convolutional neural network model comprises a plurality of groups of conversion layers and a full-connection layer, wherein each group of conversion layers comprises a convolution layer and a pooling layer, and the full-connection layer is connected after the pooling layer; preferably, in this example there are 5 groups of conversion layers.
optionally, generating the black-and-white feature map includes the sub-steps of:
step 210: inputting a plurality of black-and-white images into a convolution layer to obtain a first characteristic map;
preferably, a ReLU nonlinear layer is added after each convolution layer, and the ReLU nonlinear layer can be used as an activation function following each convolution layer to accelerate the convergence of the coloring training model obtained later.
Step 220: inputting the first characteristic map into a pooling layer and obtaining a second characteristic map by using a back propagation method;
step 230: combining the features in the second feature map and the features in the first feature map by using the full-connection layer to form combined features;
step 240: inputting the combined features into the Softmax layer, classifying objects in the black-and-white image according to the combined features, and generating CIE Lab color space data according to classification results;
step 250: inputting the CIE Lab color space data into an LSTM long-short time memory network structure to perform training and learning, and obtaining a coloring training model.
Specifically, step 250 includes the sub-steps of:
step 251: establishing an objective function by utilizing the CIE Lab color space data, and predicting Euclidean loss between the color and the ground true color by utilizing the objective function;
the objective function is:
wherein L is 2 Euclidean loss between the predicted color and the ground true color; y is the function of the object and,is a mapping of the objective function; h and w are the height and width of the channel of the CIE Lab color space, respectively;
step 252: given a mapping modelWherein the probability distribution of colors in the mapping model is:
wherein q is the number of quantized a, b values; comparing predictionsAgainst the ground truth model Z, define the ground truth model as +.>Converting an objective function Y of ground truth colors into a vector Z, wherein g is the number of nodes, and t is a time parameter;
If an object can take a set of different $a, b$ values, the optimal solution of the Euclidean loss is the average of the CIE Lab color space data of the black-and-white images in the training set, and this average biases color predictions toward grayish, desaturated results. In this embodiment, the $a, b$ output space is therefore quantized into bins with a grid size of 10, and the $q = 310$ values that fall within the gamut are kept.
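The grid quantization just described can be sketched as follows, assuming the a, b channels range over roughly [-110, 110) as is conventional for CIE Lab. The 22x22 grid and the bin-index arithmetic are illustrative assumptions; the embodiment only fixes the grid size 10 and the q = 310 in-gamut bins.

```python
import numpy as np

GRID = 10  # grid size of the quantized a, b space

def ab_to_bin(a, b, grid=GRID):
    """Map continuous (a, b) chroma values to the index of their grid cell."""
    ai = int(np.floor((a + 110) / grid))
    bi = int(np.floor((b + 110) / grid))
    return ai * 22 + bi          # 22 cells per axis over [-110, 110)

def bin_center(idx, grid=GRID):
    """Inverse map: grid-cell index back to the center of the cell."""
    ai, bi = divmod(idx, 22)
    return (ai * grid - 110 + grid / 2, bi * grid - 110 + grid / 2)
```

In a full implementation, only the subset of these 22x22 cells that actually falls inside the sRGB gamut (310 of them in this embodiment) would be retained as classes.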
Step 253: the rarity of color classes is rebalanced with a cross entropy loss function:

$$L_{cl}(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}$$

wherein $L_{cl}$ is the cross entropy loss value; $v$ is a weighting value; $q$ is the number of quantized $a, b$ values; $h$ and $w$ are the height and width of the channels of the CIE Lab color space, respectively.
Step 254: $Y$ is colored by using a function $H$ of the mapping model $Z$:

$$\hat{Y} = H(\hat{Z})$$
in another embodiment, the generating the black and white feature map includes the sub-steps of:
step 210: inputting a plurality of black-and-white images into a convolution layer to obtain a first characteristic map;
step 220: inputting the first characteristic map into a pooling layer and obtaining a second characteristic map by using a back propagation method;
step 230: processing the second characteristic spectrum by using a batch standardization method to obtain a standardized characteristic spectrum;
step 240: combining the features in the first standardized feature map and the features in the first feature map by using a full connection layer to form combined features;
step 250: inputting the combined features into the Softmax layer, classifying objects in the black-and-white image according to the combined features, and generating CIE Lab color space data according to classification results;
step 260: inputting the Lab color space data into an LSTM long-short time memory network structure to perform training and learning, and obtaining a coloring training model.
Step 300: inputting the test samples in the test set into the coloring training model, and performing color matching test on black-and-white images and color images in the test samples to generate color matching test results; and iteratively updating the coloring training model according to the color matching test result to generate a coloring network model.
Step 400: and coloring the black-and-white image to be colored by using the coloring network model.
Based on the same inventive concept, as shown in fig. 2, the embodiment of the application further provides a coloring system for black-and-white pictures based on a CNN-LSTM combination model, where the system includes:
the acquisition module is used for acquiring a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and respectively forming a training set and a testing set based on the plurality of color images and the plurality of black-and-white images;
the training module is used for inputting a plurality of black-and-white images in the training set into the CNN-LSTM convolutional neural network model, extracting the characteristics of the black-and-white images and generating a black-and-white characteristic map; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting features of the plurality of color images to generate a color feature map, and performing color matching test on the black-white feature map and the generated color feature map to obtain a coloring training model;
the coloring network model generation module is used for inputting the test samples in the test set into the coloring training model, and performing color matching test on the black-and-white images and the color images in the test samples to generate color matching test results; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and the coloring module is used for coloring the black-and-white image to be colored by utilizing the coloring network model.
Optionally, the plurality of color initial images are frame pictures extracted from an original video.
Optionally, the CNN structure in the CNN-LSTM convolutional neural network model includes a plurality of groups of conversion layers and a full-connection layer, each group of conversion layers includes a convolutional layer and a pooling layer, and the full-connection layer is connected to the pooling layer;
the training module performs the following operations:
inputting a plurality of black-and-white images into a convolution layer to obtain a first characteristic map;
inputting the first characteristic map into a pooling layer and obtaining a second characteristic map by using a back propagation method;
combining the features in the second feature map and the features in the first feature map by using the full-connection layer to form combined features;
inputting the combined features into a Softmax layer, classifying objects in a black-and-white image according to the combined features, and generating CIE Lab color space data according to classification results;
inputting the CIE Lab color space data into an LSTM long-short time memory network structure for training to obtain a coloring training model.
The above coloring system provided in this embodiment may execute the method provided in any of the above embodiments of the black-and-white picture coloring method based on the CNN-LSTM combination model, and detailed processes are described in the method embodiments and are not repeated herein.
The present embodiment also provides a computing device comprising a memory 520, a processor 510 and a computer program stored in the memory 520 and executable by the processor 510. The computer program is stored in a space 530 for program code in the memory 520 and, when executed by the processor 510, performs any of the coloring method steps 531 according to the present invention.
Embodiments of the present application also provide a computer-readable storage medium. Referring to fig. 4, the computer-readable storage medium includes a storage unit for program code provided with a program 531' for executing the steps of the coloring method according to the present invention, the program being executed by a processor.
Embodiments of the present application also provide a computer program product comprising instructions. The computer program product, when run on a computer, causes the computer to perform the method steps according to the invention.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed by a computer, the flows or functions according to embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in, or transmitted from one computer-readable storage medium to another, for example by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), etc.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of function in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Those of ordinary skill in the art will appreciate that all or some of the steps in implementing the methods of the above embodiments may be implemented by a program that instructs a processor to perform the steps, and the program may be stored in a computer readable storage medium, where the storage medium is a non-transitory medium, such as a random access memory, a read-only memory, a flash memory, a hard disk, a solid state disk, a magnetic tape, a floppy disk, an optical disc, or any combination thereof.
The foregoing is merely a preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (9)
1. A coloring method of black-and-white pictures based on a CNN-LSTM combination model, comprising the following steps:
collecting a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and forming a training set and a testing set based on the plurality of color images and the plurality of black-and-white images respectively;
inputting a plurality of black-and-white images in a training set into a CNN-LSTM convolutional neural network model, extracting features of the plurality of black-and-white images to generate a black-and-white feature map, comprising: inputting a plurality of black-and-white images into a convolution layer to obtain a first characteristic map; inputting the first characteristic map into a pooling layer and obtaining a second characteristic map by using a back propagation method; combining the features in the second feature map and the features in the first feature map by using the full-connection layer to form combined features; inputting the combined features into a Softmax layer, classifying objects in a black-and-white image according to the combined features, and generating CIE Lab color space data according to classification results; inputting the CIE Lab color space data into an LSTM long-short time memory network structure for training to obtain a coloring training model; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting features of the plurality of color images to generate a color feature map, and performing color matching test on the black-white feature map and the generated color feature map to obtain a coloring training model;
inputting the test samples in the test set into the coloring training model, and performing color matching test on black-and-white images and color images in the test samples to generate color matching test results; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and coloring the black-and-white image to be colored by using the coloring network model.
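As an illustration of the data-preparation step in claim 1, the following minimal sketch pairs each color image with a black-and-white version and splits the pairs into training and test sets. The ITU-R BT.601 luma weights and the 80/20 split ratio are assumptions for the example; the claim does not specify either.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an RGB image of shape (H, W, 3) to a single-channel
    black-and-white image using ITU-R BT.601 luma weights."""
    return rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114

def build_datasets(color_images, train_ratio=0.8):
    """Pair each color image with its black-and-white version and
    split the pairs into a training set and a test set."""
    pairs = [(to_grayscale(img), img) for img in color_images]
    split = int(len(pairs) * train_ratio)
    return pairs[:split], pairs[split:]

# Ten random 32x32 arrays stand in for the collected color initial images.
rng = np.random.default_rng(0)
images = [rng.random((32, 32, 3)) for _ in range(10)]
train_set, test_set = build_datasets(images)
```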
2. The coloring method according to claim 1, wherein: the plurality of color initial images are frame pictures extracted from an original video.
3. The coloring method according to claim 1, wherein: the plurality of color initial images are derived from an ImageNet dataset.
4. The coloring method according to claim 1, wherein: the CNN structure in the CNN-LSTM convolutional neural network model comprises a plurality of groups of conversion layers and a fully connected layer, each group of conversion layers comprises a convolution layer and a pooling layer, and the fully connected layer is connected after the pooling layers.
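The conversion-layer structure of claim 4 can be traced with a small spatial-size calculation. The 3×3 convolutions with stride 1 and padding 1, and the 2×2 max pooling with stride 2, are assumed hyperparameters for illustration; the claim does not fix them.

```python
def conv_out(size, kernel=3, stride=1, pad=1):
    """Spatial size after a convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Spatial size after a pooling layer."""
    return (size - kernel) // stride + 1

def trace_shapes(input_size, num_groups):
    """Trace the feature-map size through `num_groups` groups of
    (convolution layer, pooling layer), as in the claimed CNN structure."""
    sizes = [input_size]
    s = input_size
    for _ in range(num_groups):
        s = conv_out(s)   # convolution layer: first feature map
        s = pool_out(s)   # pooling layer: second feature map
        sizes.append(s)
    return sizes

# A 224x224 input through 3 conv+pool groups: 224 -> 112 -> 56 -> 28
print(trace_shapes(224, 3))
```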
5. The coloring method according to claim 1, wherein: a ReLU nonlinear layer is added after each convolution layer and is used as the activation function of that convolution layer.
6. The coloring method according to claim 1, wherein: the obtaining of a coloring training model comprises the following sub-steps:
establishing an objective function by utilizing the CIE Lab color space data, and computing the Euclidean loss between the predicted color and the ground-truth color by utilizing the objective function;
the objective function is:
$L_2(\hat{Y}, Y) = \frac{1}{2} \sum_{h,w} \left\| Y_{h,w} - \hat{Y}_{h,w} \right\|_2^2$
wherein $L_2$ is the Euclidean loss between the predicted color and the ground-truth color; $Y$ is the objective function and $\hat{Y}$ is a mapping of the objective function; $h$ and $w$ are respectively the height and width of the channels of the CIE Lab color space;
given a mapping model $\hat{Z} = \mathcal{G}(X)$, wherein the probability distribution of colors in the mapping model is:
$\hat{Z} \in [0, 1]^{h \times w \times q}$
wherein $q$ is the number of quantized $a$, $b$ values; the prediction $\hat{Z}$ is compared against the ground-truth model $Z$, which is defined as:
$Z = g_t^{-1}(Y)$
that is, the objective function $Y$ of the ground-truth colors is converted into a vector $Z$, wherein $g$ is the number of nodes and $t$ is a time parameter;
the rarity of color levels is rebalanced with a cross-entropy loss function, the cross-entropy loss function being:
$L_{cl}(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}$
wherein $L_{cl}$ is the cross-entropy loss value; $v$ is a weighting value; $q$ is the number of quantized $a$, $b$ values; $h$ and $w$ are respectively the height and width of the channels of the CIE Lab color space;
the prediction $\hat{Z}$ of the mapping model is used to color $Y$ by using a function $H(Z)$; the formula using the function $H(Z)$ is:
$\hat{Y} = H(\hat{Z})$
wherein $H$ maps the predicted color distribution to a point estimate of the $a$, $b$ values.
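A minimal NumPy sketch of the two losses in claim 6, assuming a one-hot ground-truth encoding and uniform rebalancing weights; the array shapes and random inputs are illustrative, not taken from the patent.

```python
import numpy as np

def l2_loss(y_hat, y):
    """Euclidean (L2) loss between predicted and ground-truth color
    channels, summed over all spatial positions."""
    return 0.5 * np.sum((y - y_hat) ** 2)

def class_rebalanced_ce(z_hat, z, v):
    """Cross-entropy over the q quantized a,b bins, weighted per pixel
    by v to rebalance rare color levels."""
    eps = 1e-12                                            # guard against log(0)
    per_pixel = -np.sum(z * np.log(z_hat + eps), axis=-1)  # shape (h, w)
    return np.sum(v * per_pixel)

h, w, q = 4, 4, 10
rng = np.random.default_rng(1)
z_hat = rng.random((h, w, q))
z_hat /= z_hat.sum(axis=-1, keepdims=True)   # each pixel's bins sum to 1
z = np.zeros((h, w, q)); z[..., 0] = 1.0     # one-hot ground-truth model Z
v = np.ones((h, w))                          # uniform rebalancing weights
loss = class_rebalanced_ce(z_hat, z, v)
```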
7. The coloring method according to claim 6, wherein: before the step of combining the features in the second feature map and the features in the first feature map by using the fully connected layer to form combined features, the method further comprises a normalization step: processing the second feature map by using a batch standardization method to obtain a standardized feature map.
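A minimal sketch of the batch standardization step in claim 7, assuming per-channel statistics over an (N, H, W, C) batch; the learnable scale and shift parameters of full batch normalization are omitted.

```python
import numpy as np

def batch_normalize(x, eps=1e-5):
    """Standardize a batch of feature maps to zero mean and unit
    variance per channel (learnable scale/shift omitted)."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# An (N, H, W, C) batch of second feature maps with arbitrary statistics.
rng = np.random.default_rng(2)
feats = rng.normal(loc=5.0, scale=3.0, size=(8, 16, 16, 32))
normed = batch_normalize(feats)
```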
8. A coloring system for black and white pictures based on a CNN-LSTM combination model, the system comprising:
the acquisition module is used for collecting a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and respectively forming a training set and a testing set based on the plurality of color initial images and the plurality of black-and-white images;
the training module is used for inputting the plurality of black-and-white images in the training set into a CNN-LSTM convolutional neural network model and extracting features of the plurality of black-and-white images to generate a black-and-white feature map, comprising: inputting the plurality of black-and-white images into a convolution layer to obtain a first feature map; inputting the first feature map into a pooling layer and obtaining a second feature map by using a back-propagation method; combining the features in the second feature map and the features in the first feature map by using a fully connected layer to form combined features; inputting the combined features into a Softmax layer, classifying objects in the black-and-white images according to the combined features, and generating CIE Lab color space data according to the classification results; inputting the CIE Lab color space data into an LSTM (long short-term memory) network structure for training; inputting the plurality of color images in the training set into the CNN-LSTM convolutional neural network model, extracting features of the plurality of color images to generate a color feature map, and performing a color matching test on the black-and-white feature map and the generated color feature map to obtain a coloring training model;
the coloring network model generation module is used for inputting the test samples in the test set into the coloring training model, and performing color matching test on the black-and-white images and the color images in the test samples to generate color matching test results; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and the coloring module is used for coloring the black-and-white image to be colored by utilizing the coloring network model.
9. The coloring system of claim 8, wherein: the plurality of color initial images are frame pictures extracted from an original video.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910914057.5A CN110717953B (en) | 2019-09-25 | 2019-09-25 | Coloring method and system for black-and-white pictures based on CNN-LSTM combination model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110717953A (en) | 2020-01-21 |
CN110717953B (en) | 2024-03-01 |
Family
ID=69210921
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910914057.5A Active CN110717953B (en) | 2019-09-25 | 2019-09-25 | Coloring method and system for black-and-white pictures based on CNN-LSTM combination model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110717953B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111354058B (en) * | 2020-02-04 | 2022-03-04 | 北京邮电大学 | Image coloring method and device, image acquisition equipment and readable storage medium |
US20230419673A1 (en) * | 2020-10-28 | 2023-12-28 | Creative Semiconductor Ltd. | Deep learning-based wireless camera system with color night vision |
CN113411550B (en) * | 2020-10-29 | 2022-07-19 | 腾讯科技(深圳)有限公司 | Video coloring method, device, equipment and storage medium |
CN112884866B (en) * | 2021-01-08 | 2023-06-06 | 北京奇艺世纪科技有限公司 | Coloring method, device, equipment and storage medium for black-and-white video |
CN112991497B (en) * | 2021-05-11 | 2021-10-19 | 北京邮电大学 | Method, device, storage medium and terminal for coloring black-and-white cartoon video |
CN113421312A (en) * | 2021-05-12 | 2021-09-21 | 北京邮电大学 | Method and device for coloring black and white video, storage medium and terminal |
CN113744168A (en) * | 2021-09-03 | 2021-12-03 | 武汉平行世界网络科技有限公司 | Method and device for filling colors |
CN115690288B (en) * | 2022-11-03 | 2023-05-16 | 北京大学 | Automatic coloring algorithm and device guided by color identifiers |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018184204A1 (en) * | 2017-04-07 | 2018-10-11 | Intel Corporation | Methods and systems for budgeted and simplified training of deep neural networks |
CN108921932A (en) * | 2018-06-28 | 2018-11-30 | 福州大学 | Method for generating various plausible colorings of black-and-white portrait pictures in real time based on convolutional neural networks |
CN109712203A (en) * | 2018-12-29 | 2019-05-03 | 福建帝视信息科技有限公司 | Image colorization method based on a self-attention generative adversarial network |
Non-Patent Citations (2)
Title |
---|
Colourful Image Colorization; Zhang Richard et al.; arXiv:1603.08511v5; Section 2.1 *
Video colorization method based on a hybrid long short-term memory and convolutional neural network model; Zhang Zheng et al.; Computer Applications (计算机应用); pp. 2727-2729 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110717953B (en) | Coloring method and system for black-and-white pictures based on CNN-LSTM combination model | |
CN110084281B (en) | Image generation method, neural network compression method, related device and equipment | |
US20200311871A1 (en) | Image reconstruction method and device | |
CN108335306B (en) | Image processing method and device, electronic equipment and storage medium | |
US11158090B2 (en) | Enhanced video shot matching using generative adversarial networks | |
CN107679466B (en) | Information output method and device | |
CN109711426B (en) | Pathological image classification device and method based on GAN and transfer learning | |
CN113066017B (en) | Image enhancement method, model training method and equipment | |
US9697583B2 (en) | Image processing apparatus, image processing method, and computer-readable recording medium | |
CN110263801B (en) | Image processing model generation method and device and electronic equipment | |
CN112906649A (en) | Video segmentation method, device, computer device and medium | |
CN109871845B (en) | Certificate image extraction method and terminal equipment | |
US11449707B2 (en) | Method for processing automobile image data, apparatus, and readable storage medium | |
AU2015201623A1 (en) | Choosing optimal images with preference distributions | |
CN111583274A (en) | Image segmentation method and device, computer-readable storage medium and electronic equipment | |
Pham | Geostatistical simulation of medical images for data augmentation in deep learning | |
CN115588055A (en) | Color standardization method and system for digital pathological image | |
KR102192016B1 (en) | Method and Apparatus for Image Adjustment Based on Semantics-Aware | |
CN111814884A (en) | Target detection network model upgrading method based on deformable convolution | |
Hernandez et al. | Classification of color textures with random field models and neural networks | |
CN115688234A (en) | Building layout generation method, device and medium based on conditional convolution | |
CN115019057A (en) | Image feature extraction model determining method and device and image identification method and device | |
CN113887495A (en) | Video labeling method and device based on transfer learning | |
CN113837236A (en) | Method and device for identifying target object in image, terminal equipment and storage medium | |
Chen et al. | Hyperspectral remote sensing IQA via learning multiple kernels from mid-level features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||