CN110717953A - Black-white picture coloring method and system based on CNN-LSTM combined model - Google Patents
Black-white picture coloring method and system based on CNN-LSTM combined model
- Publication number: CN110717953A
- Application number: CN201910914057.5A
- Authority
- CN
- China
- Prior art keywords: color, black, coloring, white, images
- Prior art date: 2019-09-25
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/001—Texturing; Colouring; Generation of texture or colour
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention discloses a black-and-white picture coloring method and system based on a CNN-LSTM combined model. The method comprises the following steps: collecting a plurality of color initial images, converting the plurality of color initial images into a plurality of black-and-white images, and forming a training set and a test set based on the plurality of color images and the plurality of black-and-white images; inputting the black-and-white images and the color images of the training set into a CNN-LSTM convolutional neural network model to generate black-and-white and color feature maps respectively, and performing a color matching test on the black-and-white feature maps against the generated color feature maps to finally obtain a coloring network model; and coloring the black-and-white image to be colored by utilizing the coloring network model. The invention improves the coloring accuracy of the finally obtained coloring network model, colors black-and-white images automatically, and eliminates the color inconsistency between successive frames, thereby generating smoother color pictures and providing a visually pleasing viewing experience.
Description
Technical Field
The application relates to the technical field of image recognition, in particular to a black and white picture coloring method and system based on a CNN-LSTM combined model.
Background
Image coloring, a basic means of image enhancement, is a computer-aided processing technology that adds color to images or videos. Complementing black-and-white pictures with color yields a better visual effect, and the technology is very widely applied in entertainment, education, scientific research, medical treatment and other fields.
Image coloring methods in the prior art mainly fall into two categories: coloring methods based on user prompts, and methods that color black-and-white pictures without any coloring prompt. The first category requires manual intervention, which increases labor and time costs. The second category automatically extracts features suited to user-specified strokes from a single image by a deep learning method that uses low-level visual patches and spatial pixel coordinates as input to a deep neural network; the network then serves as a classifier that estimates, from the features extracted across the entire image, stroke probabilities representing the likelihood that each pixel belongs to each stroke. Although this second category is automated, its coloring effect is poor, its colors are monotonous, and its accuracy is relatively low.
Disclosure of Invention
It is an object of the present application to overcome the above problems or to at least partially solve or mitigate the above problems.
According to one aspect of the present application, there is provided a black and white picture coloring method based on a CNN-LSTM combination model, the method comprising the steps of:
collecting a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and respectively forming a training set and a testing set based on the plurality of color images and the plurality of black-and-white images;
inputting a plurality of black-and-white images in a training set into a CNN-LSTM convolutional neural network model, and extracting the characteristics of the black-and-white images to generate a black-and-white characteristic map; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting the characteristics of the plurality of color images to generate a color characteristic map, and performing color matching test based on the black-and-white characteristic map and the generated color characteristic map to obtain a coloring training model;
inputting the test sample in the test set into the coloring training model, and performing color matching test on the black-and-white image and the color image in the test sample to generate a color matching test result; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and coloring the black-white image to be colored by utilizing the coloring network model.
Optionally, the plurality of color initial images are frame pictures extracted from an original video.
Optionally, the plurality of color initial images are derived from an ImageNet dataset.
Optionally, the CNN structure in the CNN-LSTM convolutional neural network model includes multiple sets of conversion layers and a full connection layer, each conversion layer includes a convolutional layer and a pooling layer, and the full connection layer is connected behind the pooling layer.
Optionally, the generating the black-and-white feature map comprises the following sub-steps:
inputting a plurality of black-and-white images into the convolutional layer to obtain a first characteristic map;
inputting the first feature map into a pooling layer and obtaining a second feature map by using a back propagation method;
combining the features in the second feature map and the features in the first feature map by using a full connection layer to form combined features;
inputting the joint features into a Softmax layer to classify objects in black and white images and generate CIE Lab color space data;
and inputting the CIE Lab color space data into an LSTM (long short-term memory) network structure for training to obtain a coloring training model.
Optionally, a ReLU non-linear layer is added after each convolutional layer, using the ReLU non-linear layer as an activation function following each convolutional layer.
Optionally, the obtaining a coloring training model comprises the following sub-steps:
establishing an objective function by utilizing the CIE Lab color space data, and predicting the Euclidean loss between the predicted color and the ground-truth color by utilizing the objective function:

$$L_2(\hat{Y}, Y) = \frac{1}{2}\sum_{h,w}\left\|Y_{h,w} - \hat{Y}_{h,w}\right\|_2^2$$

wherein $L_2$ is the Euclidean loss between the predicted color and the ground-truth color; $Y$ is the objective function and $\hat{Y}$ is the mapping of the objective function; $h$ and $w$ are respectively the height and width of the channel of the CIE Lab color space;

given a mapping model $\hat{Z} \in [0,1]^{h \times w \times q}$, wherein $q$ is the number of quantized $a, b$ values; the prediction $\hat{Z}$ is compared against the ground-truth model $Z$, which is defined as $Z = g_t^{-1}(Y)$ and converts the objective function $Y$ of the ground-truth color into a vector $Z$, wherein $g$ is the number of nodes and $t$ is a time parameter;
rebalancing the rarity losses of the color levels by using a cross-entropy loss function:

$$L_{cl}(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}$$

wherein $L_{cl}$ is the cross-entropy loss function value; $v$ is the weighting term; $q$ is the number of quantized $a, b$ values; $h$ and $w$ are respectively the height and width of the channel of the CIE Lab color space;
coloring $Y$ with a usage function $H(\hat{Z})$ of the mapping model $\hat{Z}$; the usage function maps the predicted color distribution back to color values, $\hat{Y} = H(\hat{Z})$.
optionally, before the step of combining the features in the second feature map and the features in the first feature map by using the full connection layer to form combined features, the method further includes a step of normalizing: and processing the second feature map by using a batch standardization method to obtain a standardized feature map.
According to another aspect of the present application, there is provided a black-and-white picture coloring system based on a CNN-LSTM combination model, the system comprising:
the acquisition module is used for acquiring a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and respectively forming a training set and a test set based on the plurality of color images and the plurality of black-and-white images;
the training module is used for inputting a plurality of black-and-white images in the training set into the CNN-LSTM convolutional neural network model, extracting the characteristics of the black-and-white images and generating a black-and-white characteristic map; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting the characteristics of the plurality of color images to generate a color characteristic map, and performing color matching test on the black-white characteristic map and the generated color characteristic map to obtain a coloring training model;
the coloring network model generating module is used for inputting the test sample in the test set into the coloring training model and carrying out color matching test on the black-and-white image and the color image in the test sample to generate a color matching test result; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and the coloring module is used for coloring the black and white image to be colored by utilizing the coloring network model.
Optionally, the plurality of color initial images are frame pictures extracted from an original video.
Optionally, the CNN structure in the CNN-LSTM convolutional neural network model includes multiple sets of conversion layers and fully-connected layers, each set of conversion layers includes a convolutional layer and a pooling layer; the full connecting layer is connected to the rear of the pooling layer;
the training module performs the following operations:
inputting a plurality of black-and-white images into the convolutional layer to obtain a first characteristic map;
inputting the first feature map into a pooling layer and obtaining a second feature map by using a back propagation method;
combining the features in the second feature map and the features in the first feature map by using a full connection layer to form combined features;
inputting the joint features into a Softmax layer, classifying objects in the black and white image according to the joint features, and generating CIE Lab color space data according to a classification result;
and inputting the CIE Lab color space data into an LSTM (long short-term memory) network structure for training and learning to obtain a coloring training model.
According to the black-and-white picture coloring method based on the CNN-LSTM combined model, black-and-white pictures are input into the CNN-LSTM combined model for training with color pictures as the final matching target, and the error between the features of the color and black-and-white feature maps is back-propagated while the weights are modified. This improves the coloring accuracy of the finally obtained coloring network model, colors black-and-white pictures automatically, and eliminates the color inconsistency between successive frames, thereby generating smoother color pictures and providing an overall pleasing viewing experience.
The above and other objects, advantages and features of the present application will become more apparent to those skilled in the art from the following detailed description of specific embodiments thereof, taken in conjunction with the accompanying drawings.
Drawings
Some specific embodiments of the present application will be described in detail hereinafter by way of illustration and not limitation with reference to the accompanying drawings. The same reference numbers in the drawings identify the same or similar elements or components. Those skilled in the art will appreciate that the drawings are not necessarily drawn to scale. In the drawings:
FIG. 1 is a flow chart of a black-and-white picture coloring method based on a CNN-LSTM combined model according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a black-and-white picture coloring system based on a CNN-LSTM combined model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a computing device according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a computer-readable storage medium according to an embodiment of the application.
Detailed Description
FIG. 1 is a flow chart of a black-and-white picture coloring method based on a CNN-LSTM combined model according to an embodiment of the present application; the coloring method comprises the following steps:
step 100, collecting a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and forming a training set and a test set based on the plurality of color images and the plurality of black-and-white images;
alternatively, the color initial image may be a frame picture extracted from the original video.
Optionally, the color initial image is derived from an ImageNet dataset.
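For illustration, the collection and conversion of step 100 might be sketched as follows; the file paths, the 80/20 split, and the use of the CIE Lab L channel as the black-and-white image are assumptions of this sketch rather than requirements of the application.

```python
import glob
import numpy as np
from skimage import io, color

def to_sample(path):
    rgb = io.imread(path)                    # color initial image, H x W x 3
    lab = color.rgb2lab(rgb)                 # convert to the CIE Lab color space
    gray = lab[..., :1].astype(np.float32)   # L channel: the black-and-white image
    ab = lab[..., 1:].astype(np.float32)     # a, b channels: the color target
    return gray, ab

paths = sorted(glob.glob("frames/*.png"))    # hypothetical extracted frame pictures
split = int(0.8 * len(paths))                # assumed 80/20 train/test split
train_set, test_set = paths[:split], paths[split:]
```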
Step 200: inputting a plurality of black-and-white images in the training set into the CNN-LSTM convolutional neural network model, and extracting the characteristics of the black-and-white images to generate a black-and-white characteristic map; inputting a plurality of color images in the training set into the CNN-LSTM convolutional neural network model, and extracting the characteristics of the plurality of color images to generate a color characteristic map; and obtaining a coloring training model based on the black-and-white feature map and the generated color feature map.
The CNN structure in the CNN-LSTM convolutional neural network model comprises a plurality of groups of conversion layers and a fully connected layer, wherein each group of conversion layers comprises a convolutional layer and a pooling layer, and the fully connected layer is connected behind the pooling layer; preferably, this embodiment comprises 5 groups of conversion layers.
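A minimal PyTorch sketch of such a CNN branch follows; the channel widths, the 3x3 kernels, and the global pooling before the fully connected layer are illustrative assumptions, not parameters fixed by this application.

```python
import torch
import torch.nn as nn

class ConversionCNN(nn.Module):
    """5 groups of conversion layers (conv + ReLU + pooling) and a fully connected layer."""
    def __init__(self, in_ch=1, widths=(64, 128, 256, 512, 512), n_bins=310):
        super().__init__()
        layers, prev = [], in_ch
        for w in widths:                              # one group per width
            layers += [nn.Conv2d(prev, w, 3, padding=1),
                       nn.ReLU(inplace=True),         # ReLU after each convolutional layer
                       nn.MaxPool2d(2)]               # pooling layer of the group
            prev = w
        self.features = nn.Sequential(*layers)
        self.fc = nn.Linear(prev, n_bins)             # fully connected layer behind the pooling

    def forward(self, x):                             # x: N x 1 x H x W black-and-white input
        f = self.features(x).mean(dim=(2, 3))         # global pooling of the feature map
        return self.fc(f)                             # logits later fed to the Softmax layer
```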
optionally, the generating the black-and-white feature map includes the following sub-steps:
step 210: inputting a plurality of black-and-white images into the convolutional layer to obtain a first characteristic map;
preferably, a ReLU nonlinear layer is added after each convolutional layer, and the ReLU nonlinear layer can be used as an activation function following each convolutional layer to accelerate the convergence of the coloring training model obtained at the later stage.
Step 220: inputting the first feature map into a pooling layer and obtaining a second feature map by using a back propagation method;
step 230: combining the features in the second feature map and the features in the first feature map by using a full connection layer to form combined features;
step 240: inputting the joint features into the Softmax layer, classifying objects in the black-white image according to the joint features, and generating CIE Lab color space data according to a classification result;
step 250: and inputting the CIE Lab color space data into an LSTM long-time and short-time memory network structure for training, training and learning to obtain a coloring training model.
Specifically, step 250 includes the following substeps:
step 251: establishing an objective function by utilizing the CIE Lab color space data, and predicting Euclidean loss between the color and the ground real color by utilizing the objective function;
$$L_2(\hat{Y}, Y) = \frac{1}{2}\sum_{h,w}\left\|Y_{h,w} - \hat{Y}_{h,w}\right\|_2^2$$

wherein $L_2$ is the Euclidean loss between the predicted color and the ground-truth color; $Y$ is the objective function and $\hat{Y}$ is the mapping of the objective function; $h$ and $w$ are respectively the height and width of the channel of the CIE Lab color space.

Step 252: given a mapping model $\hat{Z}$, wherein the probability distribution of colors in the mapping model is $\hat{Z} \in [0,1]^{h \times w \times q}$;

wherein $q$ is the number of quantized $a, b$ values; the prediction $\hat{Z}$ is compared against the ground-truth model $Z$, which is defined as $Z = g_t^{-1}(Y)$ and converts the objective function $Y$ of the ground-truth color into a vector $Z$, wherein $g$ is the number of nodes and $t$ is a time parameter.
if an object uses a set of different values of a and b, the optimal solution of Euclidean loss will be the mean value of CIE Lab color space data of black and white images in the training set, and the mean value is adopted to be beneficial to the results of gray and desaturation in color prediction; in this embodiment, the a, b output space amounts are converted into a section with a grid size of 10, and Q is maintained as a value in 310 color gamut.
Step 253: rebalancing the color level rarity losses using a cross-entropy loss function that is:
$$L_{cl}(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}$$

wherein $L_{cl}$ is the cross-entropy loss function value; $v$ is the weighting term; $q$ is the number of quantized $a, b$ values; $h$ and $w$ are respectively the height and width of the channel of the CIE Lab color space.

Step 254: coloring $Y$ with a usage function $H(\hat{Z})$ of the mapping model $\hat{Z}$; the usage function maps the predicted color distribution back to color values, $\hat{Y} = H(\hat{Z})$.
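The PyTorch sketch below shows one way the rebalanced cross-entropy $L_{cl}$ of step 253 could be computed; treating the network output as log-probabilities and using inverse color-class frequency as the weighting $v$ are assumptions of this sketch.

```python
import torch

def rebalanced_ce(z_hat_log, z, v_weights):
    """z_hat_log: N x Q x H x W predicted log-probabilities (log-softmax output).
    z:          N x Q x H x W ground-truth color distribution (e.g. one-hot).
    v_weights:  Q rebalancing weights v(q), e.g. inverse class frequency."""
    v = (v_weights.view(1, -1, 1, 1) * z).sum(dim=1)   # v(Z_{h,w}) per pixel
    ce = -(z * z_hat_log).sum(dim=1)                   # -sum_q Z log Z_hat per pixel
    return (v * ce).sum(dim=(1, 2)).mean()             # sum over h, w; mean over batch
```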
in another embodiment, the generating the black-and-white feature map comprises the sub-steps of:
step 210: inputting a plurality of black-and-white images into the convolutional layer to obtain a first characteristic map;
step 220: inputting the first feature map into a pooling layer and obtaining a second feature map by using a back propagation method;
step 230: processing the second characteristic map by using a batch standardization method to obtain a standardized characteristic map;
step 240: combining features in the first normalized feature map and features in the first feature map to form combined features using a full connectivity layer;
step 250: inputting the joint features into the Softmax layer, classifying objects in the black-white image according to the joint features, and generating CIE Lab color space data according to a classification result;
step 260: and inputting the Lab color space data into an LSTM long-short time memory network structure for training, training and learning to obtain a coloring training model.
Step 300: inputting the test sample in the test set into the coloring training model, and performing color matching test on the black-and-white image and the color image in the test sample to generate a color matching test result; and iteratively updating the coloring training model according to the color matching test result to generate a coloring network model.
Step 400: and coloring the black-white image to be colored by utilizing the coloring network model.
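Putting step 400 together, a hedged inference sketch might look as follows; `model` (a network producing per-pixel bin scores) and `bin_to_ab` (a Q x 2 lookup table from bin index back to a, b values) are hypothetical stand-ins for the trained coloring network model.

```python
import numpy as np
import torch
from skimage import color

def colorize(gray_l, model, bin_to_ab):
    """Color a black-and-white image; gray_l is its H x W luminance (L channel)."""
    x = torch.from_numpy(gray_l).float()[None, None]   # 1 x 1 x H x W input tensor
    with torch.no_grad():
        logits = model(x)                              # 1 x Q x H x W bin scores
    bins = logits.argmax(dim=1)[0].numpy()             # most likely color bin per pixel
    ab = bin_to_ab[bins]                               # H x W x 2 via table lookup
    lab = np.concatenate([gray_l[..., None], ab], axis=-1)
    return color.lab2rgb(lab)                          # the colored RGB image
```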
Based on the same inventive concept, as shown in fig. 2, an embodiment of the present application further provides a system for coloring black and white pictures based on a CNN-LSTM combined model, where the system includes:
the acquisition module is used for acquiring a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and respectively forming a training set and a test set based on the plurality of color images and the plurality of black-and-white images;
the training module is used for inputting a plurality of black-and-white images in the training set into the CNN-LSTM convolutional neural network model, extracting the characteristics of the black-and-white images and generating a black-and-white characteristic map; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting the characteristics of the plurality of color images to generate a color characteristic map, and performing color matching test on the black-white characteristic map and the generated color characteristic map to obtain a coloring training model;
the coloring network model generating module is used for inputting the test sample in the test set into the coloring training model and carrying out color matching test on the black-and-white image and the color image in the test sample to generate a color matching test result; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and the coloring module is used for coloring the black and white image to be colored by utilizing the coloring network model.
Optionally, the plurality of color initial images are frame pictures extracted from an original video.
Optionally, the CNN structure in the CNN-LSTM convolutional neural network model includes multiple groups of conversion layers and a full connection layer, each group of conversion layers includes a convolutional layer and a pooling layer, and the full connection layer is connected behind the pooling layer;
the training module performs the following operations:
inputting a plurality of black-and-white images into the convolutional layer to obtain a first characteristic map;
inputting the first feature map into a pooling layer and obtaining a second feature map by using a back propagation method;
combining the features in the second feature map and the features in the first feature map by using a full connection layer to form combined features;
inputting the joint features into a Softmax layer, classifying objects in the black and white image according to the joint features, and generating CIE Lab color space data according to a classification result;
and inputting the CIE Lab color space data into an LSTM (long short-term memory) network structure for training to obtain a coloring training model.
The coloring system provided in this embodiment may execute the method provided in any embodiment of the black-and-white picture coloring method based on the CNN-LSTM combined model; the detailed process is described in the method embodiments and is not repeated herein.
An embodiment of the present application also provides a computing device. Referring to fig. 3, the computing device comprises a memory 520, a processor 510 and a computer program stored in the memory 520 and executable by the processor 510; the computer program is stored in a space 530 for program code in the memory 520 and, when executed by the processor 510, implements steps 531 for performing any of the coloring methods according to the present invention.
The embodiment of the application also provides a computer-readable storage medium. Referring to fig. 4, the computer-readable storage medium comprises a storage unit for program code, which is provided with a program 531' for performing the steps of the coloring method according to the invention, the program being executed by a processor.
The embodiment of the application also provides a computer program product containing instructions which, when run on a computer, cause the computer to carry out the steps of the method according to the invention.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed by a computer, cause the computer to perform, in whole or in part, the procedures or functions described in accordance with the embodiments of the application. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be understood by those skilled in the art that all or part of the steps in the method for implementing the above embodiments may be implemented by a program, and the program may be stored in a computer-readable storage medium, where the storage medium is a non-transitory medium, such as a random access memory, a read only memory, a flash memory, a hard disk, a solid state disk, a magnetic tape (magnetic tape), a floppy disk (floppy disk), an optical disk (optical disk), and any combination thereof.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (11)
1. A black and white picture coloring method based on a CNN-LSTM combined model comprises the following steps:
collecting a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and respectively forming a training set and a testing set based on the plurality of color images and the plurality of black-and-white images;
inputting a plurality of black-and-white images in a training set into a CNN-LSTM convolutional neural network model, and extracting the characteristics of the black-and-white images to generate a black-and-white characteristic map; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting the characteristics of the plurality of color images to generate a color characteristic map, and performing color matching test on the black-white characteristic map and the generated color characteristic map to obtain a coloring training model;
inputting the test sample in the test set into the coloring training model, and performing color matching test on the black-and-white image and the color image in the test sample to generate a color matching test result; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and coloring the black-white image to be colored by utilizing the coloring network model.
2. The coloring method according to claim 1, characterized in that: the plurality of color initial images are frame pictures extracted from an original video.
3. The coloring method according to claim 1, characterized in that: the plurality of color initial images are derived from an ImageNet dataset.
4. The coloring method according to claim 1, characterized in that: the CNN structure in the CNN-LSTM convolutional neural network model comprises a plurality of groups of conversion layers and full connection layers, each group of conversion layers comprises a convolutional layer and a pooling layer, and the full connection layers are connected behind the pooling layers.
5. The coloring method according to claim 4, characterized in that: the generating of the black-and-white feature map comprises the following sub-steps:
inputting a plurality of black-and-white images into the convolutional layer to obtain a first characteristic map;
inputting the first feature map into a pooling layer and obtaining a second feature map by using a back propagation method;
combining the features in the second feature map and the features in the first feature map by using a full connection layer to form combined features;
inputting the joint features into a Softmax layer, classifying objects in the black and white image according to the joint features, and generating CIE Lab color space data according to a classification result;
and inputting the CIE Lab color space data into an LSTM (long short-term memory) network structure for training to obtain a coloring training model.
6. The coloring method according to claim 5, characterized in that: adding a ReLU nonlinear layer after each convolutional layer, using the ReLU nonlinear layer as an activation function following each convolutional layer.
7. The coloring method according to claim 5, characterized in that: the obtaining of the coloring training model comprises the following substeps:
establishing an objective function by utilizing the CIE Lab color space data, and predicting the Euclidean loss between the predicted color and the ground-truth color by utilizing the objective function:

$$L_2(\hat{Y}, Y) = \frac{1}{2}\sum_{h,w}\left\|Y_{h,w} - \hat{Y}_{h,w}\right\|_2^2$$

wherein $L_2$ is the Euclidean loss between the predicted color and the ground-truth color; $Y$ is the objective function and $\hat{Y}$ is the mapping of the objective function; $h$ and $w$ are respectively the height and width of the channel of the CIE Lab color space;

given a mapping model $\hat{Z}$, wherein the probability distribution of colors in the mapping model is $\hat{Z} \in [0,1]^{h \times w \times q}$;

wherein $q$ is the number of quantized $a, b$ values; the prediction $\hat{Z}$ is compared against the ground-truth model $Z$, which is defined as $Z = g_t^{-1}(Y)$ and converts the objective function $Y$ of the ground-truth color into a vector $Z$, wherein $g$ is the number of nodes and $t$ is a time parameter;

rebalancing the rarity losses of the color levels by using a cross-entropy loss function:

$$L_{cl}(\hat{Z}, Z) = -\sum_{h,w} v(Z_{h,w}) \sum_{q} Z_{h,w,q} \log \hat{Z}_{h,w,q}$$

wherein $L_{cl}$ is the cross-entropy loss function value; $v$ is the weighting term; $q$ is the number of quantized $a, b$ values; $h$ and $w$ are respectively the height and width of the channel of the CIE Lab color space.
8. The coloring method according to claim 7, characterized in that: before the step of combining the features in the second feature map and the features in the first feature map by using the full connection layer to form combined features, the method further comprises a step of normalizing: processing the second feature map by using a batch standardization method to obtain a standardized feature map.
9. A system for rendering black and white pictures based on CNN-LSTM combined model, the system comprising:
the acquisition module is used for acquiring a plurality of color initial images, converting each color initial image into a corresponding black-and-white image to obtain a plurality of black-and-white images, and respectively forming a training set and a test set based on the plurality of color images and the plurality of black-and-white images;
the training module is used for inputting a plurality of black-and-white images in the training set into the CNN-LSTM convolutional neural network model, extracting the characteristics of the black-and-white images and generating a black-and-white characteristic map; inputting a plurality of color images in a training set into a CNN-LSTM convolutional neural network model, extracting the characteristics of the plurality of color images to generate a color characteristic map, and performing color matching test on the black-white characteristic map and the generated color characteristic map to obtain a coloring training model;
the coloring network model generating module is used for inputting the test sample in the test set into the coloring training model and carrying out color matching test on the black-and-white image and the color image in the test sample to generate a color matching test result; iteratively updating the coloring training model according to the color matching test result to generate a coloring network model;
and the coloring module is used for coloring the black and white image to be colored by utilizing the coloring network model.
10. The coloring system according to claim 9, wherein: the plurality of color initial images are frame pictures extracted from an original video.
11. A colouring system according to claim 9 or 10, characterized in that: the CNN structure in the CNN-LSTM convolutional neural network model comprises a plurality of groups of conversion layers and full connection layers, each group of conversion layers comprises a convolutional layer and a pooling layer, and the full connection layers are connected behind the pooling layers;
the training module performs the following operations:
inputting a plurality of black-and-white images into the convolutional layer to obtain a first characteristic map;
inputting the first feature map into a pooling layer and obtaining a second feature map by using a back propagation method;
combining the features in the second feature map and the features in the first feature map by using a full connection layer to form combined features;
inputting the joint features into a Softmax layer, classifying objects in the black and white image according to the joint features, and generating CIE Lab color space data according to a classification result;
and inputting the CIE Lab color space data into an LSTM (long short-term memory) network structure for training to obtain a coloring training model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910914057.5A CN110717953B (en) | 2019-09-25 | 2019-09-25 | Coloring method and system for black-and-white pictures based on CNN-LSTM (convolutional neural network-long short-term memory) combination model
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910914057.5A CN110717953B (en) | 2019-09-25 | 2019-09-25 | Coloring method and system for black-and-white pictures based on CNN-LSTM (convolutional neural network-long short-term memory) combination model
Publications (2)
Publication Number | Publication Date |
---|---|
CN110717953A true CN110717953A (en) | 2020-01-21 |
CN110717953B CN110717953B (en) | 2024-03-01 |
Family
ID=69210921
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910914057.5A Active CN110717953B (en) | 2019-09-25 | 2019-09-25 | Coloring method and system for black-and-white pictures based on CNN-LSTM (computer-aided three-dimensional network-link) combination model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110717953B (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018184204A1 (en) * | 2017-04-07 | 2018-10-11 | Intel Corporation | Methods and systems for budgeted and simplified training of deep neural networks |
CN108921932A (en) * | 2018-06-28 | 2018-11-30 | 福州大学 | A method of the black and white personage picture based on convolutional neural networks generates various reasonable coloring in real time |
CN109712203A (en) * | 2018-12-29 | 2019-05-03 | 福建帝视信息科技有限公司 | A kind of image rendering methods based on from attention generation confrontation network |
Non-Patent Citations (2)
Title |
---|
ZHANG, Richard et al.: "Colorful Image Colorization", arXiv:1603.08511v5 *
ZHANG, Zheng et al.: "Video coloring method based on a hybrid neural network model of long short-term memory units and convolutional neural networks", Journal of Computer Applications *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111354058A (en) * | 2020-02-04 | 2020-06-30 | 北京邮电大学 | Image coloring method and device, image acquisition equipment and readable storage medium |
CN111354058B (en) * | 2020-02-04 | 2022-03-04 | 北京邮电大学 | Image coloring method and device, image acquisition equipment and readable storage medium |
WO2022089535A1 (en) * | 2020-10-28 | 2022-05-05 | Creative Semiconductor Ltd. | Deep learning-based wireless camera system with color night vision |
CN113411550A (en) * | 2020-10-29 | 2021-09-17 | 腾讯科技(深圳)有限公司 | Video coloring method, device, equipment and storage medium |
CN112884866A (en) * | 2021-01-08 | 2021-06-01 | 北京奇艺世纪科技有限公司 | Coloring method, device, equipment and storage medium for black and white video |
CN112884866B (en) * | 2021-01-08 | 2023-06-06 | 北京奇艺世纪科技有限公司 | Coloring method, device, equipment and storage medium for black-and-white video |
CN112991497A (en) * | 2021-05-11 | 2021-06-18 | 北京邮电大学 | Method, device, storage medium and terminal for coloring black-and-white cartoon video |
CN112991497B (en) * | 2021-05-11 | 2021-10-19 | 北京邮电大学 | Method, device, storage medium and terminal for coloring black-and-white cartoon video |
CN113421312A (en) * | 2021-05-12 | 2021-09-21 | 北京邮电大学 | Method and device for coloring black and white video, storage medium and terminal |
CN113744168A (en) * | 2021-09-03 | 2021-12-03 | 武汉平行世界网络科技有限公司 | Method and device for filling colors |
CN115690288A (en) * | 2022-11-03 | 2023-02-03 | 北京大学 | Automatic coloring algorithm and device guided by color marker |
Also Published As
Publication number | Publication date |
---|---|
CN110717953B (en) | 2024-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110717953B (en) | Coloring method and system for black-and-white pictures based on CNN-LSTM (convolutional neural network-long short-term memory) combination model | |
CN110084281B (en) | Image generation method, neural network compression method, related device and equipment | |
US11537873B2 (en) | Processing method and system for convolutional neural network, and storage medium | |
CN109949255B (en) | Image reconstruction method and device | |
CN113066017B (en) | Image enhancement method, model training method and equipment | |
CN107679466B (en) | Information output method and device | |
CN109711426B (en) | Pathological image classification device and method based on GAN and transfer learning | |
CN109472270A (en) | Image style conversion method, device and equipment | |
CN108780508A (en) | System and method for normalized image | |
CN110263801B (en) | Image processing model generation method and device and electronic equipment | |
CN115457531A (en) | Method and device for recognizing text | |
CN111986125A (en) | Method for multi-target task instance segmentation | |
US11449707B2 (en) | Method for processing automobile image data, apparatus, and readable storage medium | |
CN113936143B (en) | Image identification generalization method based on attention mechanism and generation countermeasure network | |
CN115588055A (en) | Color standardization method and system for digital pathological image | |
CN117853596A (en) | Unmanned aerial vehicle remote sensing mapping method and system | |
CN114187515A (en) | Image segmentation method and image segmentation device | |
CN115953330B (en) | Texture optimization method, device, equipment and storage medium for virtual scene image | |
CN116109656A (en) | Interactive image segmentation method based on unsupervised learning | |
Chen et al. | Hyperspectral remote sensing IQA via learning multiple kernels from mid-level features | |
CN110781936A (en) | Construction method of threshold learnable local binary network based on texture description and deep learning and remote sensing image classification method | |
Huang et al. | Edge device-based real-time implementation of CycleGAN for the colorization of infrared video | |
CN113989528B (en) | Hyperspectral image characteristic representation method based on depth joint sparse-collaborative representation | |
CN115688234A (en) | Building layout generation method, device and medium based on conditional convolution | |
Hernandez et al. | Classification of color textures with random field models and neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |