CN109949387B - Scenic image post-production method based on deep learning - Google Patents

Scenic image post-production method based on deep learning

Info

Publication number
CN109949387B
CN109949387B (application CN201910220637.4A)
Authority
CN
China
Prior art keywords
image
images
processing
processed
edge
Prior art date
Legal status
Active
Application number
CN201910220637.4A
Other languages
Chinese (zh)
Other versions
CN109949387A (en)
Inventor
张晖
叶子皓
何辉
Current Assignee
Nupt Institute Of Big Data Research At Yancheng
Nanjing University of Posts and Telecommunications
Original Assignee
Nupt Institute Of Big Data Research At Yancheng
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nupt Institute Of Big Data Research At Yancheng, Nanjing University of Posts and Telecommunications filed Critical Nupt Institute Of Big Data Research At Yancheng
Priority to CN201910220637.4A
Publication of CN109949387A
Application granted
Publication of CN109949387B
Legal status: Active
Anticipated expiration

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 - Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The scenic image post-production method based on deep learning first labels an original picture with a fully convolutional neural network, distinguishing and identifying each image region to obtain a semantic segmentation image D of the original image. Key coordinates of each segmented region are then recorded from D, corresponding image masks are generated, and each mask is multiplied with the original image to obtain a series of segmented images {x}. The images in {x} are routed to the corresponding system modules according to their processing targets and recognition labels; each system module adjusts the parameters of its image and redraws it according to the processing target, and the module outputs are collected into an image set {y}. Parameter smoothing and edge optimization are applied to {y}, and finally the processed region images are merged by image addition to obtain the final processed image. By designing a general system architecture for intelligent post-production of scenic images, the invention realizes automatic post-production of various kinds of scenic images.

Description

Scenic image post-production method based on deep learning
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a scenic image post-production method based on deep learning.
Background
Post-production of images is an important link in fields such as film, television, photography, publishing, and the Internet. Editing and drawing tools represented by Photoshop can edit pictures effectively. In image post-production, the production of scenic images is quite common; producing a satisfactory scenic image often requires extensive global parameter adjustment and detail refinement, and sometimes special effects added through local redrawing. Although mature algorithms exist for adjusting global parameters such as brightness, saturation, and white balance, processing and redrawing local image details still requires manual work to achieve a satisfactory effect.
Disclosure of Invention
The invention aims to solve the technical problem of overcoming the defects of the prior art and providing a scenic image post-production method based on deep learning.
The invention provides a scenic image post-production method based on deep learning, which comprises the following steps:
Step S1, labeling an original picture I with a fully convolutional neural network, distinguishing and identifying each image region in the picture I to obtain a semantic segmentation image D of the original image;
Step S2, recording key coordinates of each segmented region according to the semantic segmentation image D, generating corresponding image masks, and multiplying each mask with the original image to obtain a series of segmented images {x};
Step S3, routing each image in {x} to the corresponding system module according to its processing target and recognition label, each system module adjusting the parameters of the image and redrawing it according to the processing target, and collecting the outputs of all system modules into an image set {y};
Step S4, carrying out parameter smoothing and edge optimization on the processed image set {y};
Step S5, merging the processed region images by image addition to obtain the final processed image I'.
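For orientation only, the following minimal Python sketch shows how steps S1 to S5 might chain together; segment, build_masks, modules, and smooth_and_blend are hypothetical placeholders for the components detailed below, not interfaces defined by the invention.

```python
import numpy as np

def post_produce(image, segment, build_masks, modules, smooth_and_blend):
    """image: HxWx3 float array in [0, 1]; returns the processed image I'."""
    D = segment(image)                                          # S1: HxW label map
    masks = build_masks(D)                                      # S2: {label: soft mask}
    xs = {lb: image * m[..., None] for lb, m in masks.items()}  # S2: series {x}
    ys = {lb: modules[lb](x) for lb, x in xs.items()}           # S3: per-label modules -> {y}
    ys = smooth_and_blend(ys, D)                                # S4: smoothing, edge fixes
    return np.clip(sum(ys.values()), 0.0, 1.0)                  # S5: merge by image addition
```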
In step S1, the fully convolutional neural network is a dilated convolutional neural network, which can accept an input image of any size. A deconvolution layer up-samples the feature map of the last convolutional layer back to the size of the input image, so that a prediction is generated for each pixel while the spatial information of the original input image is preserved; finally, the pixels are classified on the up-sampled feature map.
Further, in step S2, when the mask is generated, pixels just outside the edge of the marked region are retained, and these edge-band pixels are set to 0.5.
Further, in step S5, image coverage is performed from low stacking level to high, i.e., a region with a higher stacking level covers a region with a lower one, and the stacked image is denoised.
The invention constructs a general system architecture for intelligent post-production of scenic images. A fully convolutional network with dilated convolution (Dilated Convolution) labels the original picture and makes full use of its information to ensure high segmentation accuracy; the processing modules adjust the parameters of the segmented series of images appropriately, so that the generated images are more realistic; and parameter smoothing prevents problems such as abrupt color changes and abrupt transitions between regions in the processed image.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a structural diagram of landscape image processing according to the present invention.
Detailed Description
Referring to FIGS. 1 and 2, the present embodiment provides a scenic image post-production method based on deep learning, comprising the following steps:
step S101, training a full convolution neural network (Fully Convolutional Networks, FCN) by using a landscape image set, wherein the FCN classifies images at a pixel level, thereby solving the problem of image segmentation at a semantic level. Unlike classical CNN, which uses a full-connection layer to obtain a Feature vector with a fixed length in a convolution layer to classify, FCN can accept an input image with any size, and uses a deconvolution layer to up-sample a Feature Map (Feature Map) of a last roll layer, so that it is restored to the same size as the input image, thereby generating a prediction for each pixel, retaining spatial information in the original input image, and finally classifying the pixels by parity up-sampling the Feature Map. The original picture I to be processed is input into the FCN, and the FCN outputs the semantic segmentation image D with the same size as the input picture. Meanwhile, the types of the areas after semantic segmentation are identified and can be roughly classified into mountain, water area, plain, forest, building, road and the like. In addition, the FCN adopting the expansion convolution (Dilated Convolution) is adopted to mark the original picture, so that the information of the original picture can be fully utilized to ensure higher segmentation accuracy.
In step S102, a mask is generated for each region of the semantic segmentation image D: the pixel values of the region are set to 1 and the remaining pixels to 0. In regions such as forests, fine structures such as branches and leaves make the recognized edge coarse and inaccurate; therefore, parts with complex edges receive special treatment when the mask is generated, with the pixels of a small band just outside the edge set to 0.5. Keeping this information outside the region prevents, to a certain extent, pixels from being lost due to inaccurate recognition. A mask is generated for each semantic region according to D, and each mask is multiplied with the original image to obtain the segmented image set {x}.
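A sketch of the mask generation under stated assumptions: the width of the 0.5 band and the use of morphological dilation to obtain it are illustrative choices; the description only specifies that a small band of pixels outside the region edge is kept at 0.5.

```python
import numpy as np
import cv2  # OpenCV, assumed available

def build_masks(D, edge_width=5):
    """For each label in the segmentation map D (HxW integer array), build
    a soft mask: 1.0 inside the region, 0.5 in a thin band just outside its
    edge (so fine structures such as branches missed by the segmentation
    are not cut off), and 0.0 elsewhere."""
    kernel = np.ones((2 * edge_width + 1, 2 * edge_width + 1), np.uint8)
    masks = {}
    for label in np.unique(D):
        region = (D == label).astype(np.uint8)
        dilated = cv2.dilate(region, kernel)            # region plus outer band
        mask = region.astype(np.float32)
        mask[(dilated == 1) & (region == 0)] = 0.5      # the 0.5 edge band
        masks[int(label)] = mask
    return masks

# Segmented series {x}: x_label = image * mask[..., None] for each mask.
```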
Step S103, each image in {x} is passed to the corresponding processing module according to its recognition label. Taking snow-scene generation for a house as an example, the image carrying the house label and a snow-scene generation instruction are passed to the house processing module. To generate a realistic snow-scene image, the module adjusts the parameters of the image and adds accumulated and falling snow. A processing module can be implemented with a simple empirically tuned algorithm, or with a deep learning model such as a generative adversarial network (Generative Adversarial Networks, GAN). To make the generated image more realistic, a processing module may also contain a higher-level FCN for finer semantic segmentation; for example, an FCN trained on a building data set can identify roofs, eaves, chimneys, doors, and windows during snow-scene generation. With this finer segmentation, the module can draw accumulated snow where snow would settle, adjust brightness to simulate the ambient light of a snow scene according to the shadow relations of the building, and draw falling snow according to the occlusion relations. Each processing module outputs its processed image, yielding the processed image set {y}, and also outputs image parameters such as brightness, saturation, contrast, and white balance for the subsequent parameter smoothing.
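The dispatch itself can be sketched as a table from recognition labels to module callables; _brighten below is a toy stand-in for the empirically tuned algorithms or GAN-based models mentioned above, and the label names and the returned statistics are assumptions for illustration.

```python
import numpy as np

def _brighten(img, gain=1.15):
    """Toy placeholder module: lift brightness to mimic snow-scene light."""
    return np.clip(img * gain, 0.0, 1.0)

def process_regions(regions):
    """regions: {label_name: HxWx3 float image in [0, 1]}. Route each
    segmented image to the module registered for its label (unknown labels
    pass through unchanged) and collect the parameters needed later for
    smoothing; here only mean brightness is recorded."""
    modules = {"building": _brighten, "forest": _brighten, "road": _brighten}
    processed, params = {}, {}
    for label, img in regions.items():
        processed[label] = modules.get(label, lambda x: x)(img)
        params[label] = {"brightness": float(processed[label].mean())}
    return processed, params
```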
Step S104, according to the position of each image in {y}, the image parameters of the surrounding regions are combined to smooth the parameters of each region, preventing abrupt color changes and abrupt transitions between regions. Because distant and near scenery occupy different positions, the illumination environments of the regions may differ considerably, so the smoothing must be performed in combination with the semantic segmentation image D: where the parameters of neighboring regions differ greatly, they are constrained region by region according to D, and a region's parameters should change gradually toward those of its neighbors rather than abruptly. In addition, a scene scoring mechanism is introduced to distinguish the foreground-background relations of the scenery and to analyze the occlusion relations from the completeness and sharpness of the scenery in each region. According to the analysis results, foreground objects and occluders are given higher processing priority and a higher overlay ranking. After parameter smoothing, the combined image obtained by adding the images in {y} may still show picture tearing: because the processing modules may redraw or erase parts of an image, redrawing and erasing at image edges can leave the edges of the semantic segmentation regions in the composite image as gaps or black lines. Occlusion is handled by the principle that the foreground covers the background, whereas picture tearing, i.e., the black lines at the edges, requires edge optimization. With accurate semantic segmentation the width of a black line is very limited, and edge optimization can fill it by feathering the image edges on both sides of the line; however, feathering tends to blur the edges, which can be alleviated by adjusting the sharpness of the composite image. In addition, isolated black spots where image edges are discontinuous can be erased by neighborhood pixel replacement or denoising algorithms.
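The edge optimization can be sketched as follows; the boundary-band width, the near-black threshold, the Gaussian kernel sizes, and the unsharp-mask weight are all illustrative assumptions rather than values specified by the invention.

```python
import numpy as np
import cv2  # OpenCV, assumed available

def edge_optimize(composite, D, band=3):
    """Refill the 'black lines' left along semantic-region boundaries:
    near-black pixels on the boundary band are replaced with a feathered
    (Gaussian-blurred) neighborhood average, then a mild unsharp mask
    compensates for the blur the feathering introduces."""
    boundary = np.zeros(D.shape, dtype=bool)            # inter-region boundary
    boundary[:, 1:] |= D[:, 1:] != D[:, :-1]
    boundary[1:, :] |= D[1:, :] != D[:-1, :]
    kernel = np.ones((2 * band + 1, 2 * band + 1), np.uint8)
    boundary = cv2.dilate(boundary.astype(np.uint8), kernel).astype(bool)
    seam = boundary & (composite.mean(axis=2) < 0.05)   # torn, near-black pixels
    feathered = cv2.GaussianBlur(composite, (7, 7), 0)
    out = composite.copy()
    out[seam] = feathered[seam]
    sharp = out + 0.3 * (out - cv2.GaussianBlur(out, (5, 5), 0))
    return np.clip(sharp, 0.0, 1.0)
```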
Step S105, the region images after parameter smoothing and edge optimization are merged by image addition to obtain the final processed image I'.
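One way to reconcile merging by image addition with the low-to-high coverage of step S5 is to composite region by region in overlay order, letting a higher-ranked (foreground or occluding) region overwrite lower-ranked ones where the soft masks overlap; the overlay_rank mapping below is an assumed stand-in for the output of the scene scoring mechanism.

```python
import numpy as np

def merge_regions(ys, masks, overlay_rank):
    """ys: {label: processed HxWx3 image}; masks: {label: HxW soft mask};
    overlay_rank: {label: rank}, higher rank = closer to the viewer.
    Composites from the lowest rank upward so higher levels cover lower."""
    out = np.zeros_like(next(iter(ys.values())))
    for label in sorted(ys, key=lambda lb: overlay_rank.get(lb, 0)):
        m = np.clip(masks[label], 0.0, 1.0)[..., None]
        out = out * (1.0 - m) + ys[label] * m   # higher rank overwrites overlap
    return out
```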
The foregoing has shown and described the basic principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the invention is not limited to the specific embodiments described above; the embodiments and descriptions merely illustrate its principles, and various changes and modifications may be made without departing from the spirit and scope of the invention as defined by the appended claims. The scope of the invention is defined by the claims and their equivalents.

Claims (2)

1. A scenic image post-production method based on deep learning, characterized by comprising the following steps: step S1, labeling an original picture I with a fully convolutional neural network FCN, distinguishing and identifying each image region in the picture I to obtain a semantic segmentation image D of the original image;
step S2, recording key coordinates of each segmented region according to the semantic segmentation image D, generating corresponding image masks, and multiplying each mask with the original image to obtain a series of segmented images {x};
step S3, routing each image in {x} to the corresponding system module according to its processing target and recognition label, each system module adjusting the parameters of the image and redrawing it according to the processing target, and collecting the outputs of all system modules into an image set {y};
step S4, carrying out parameter smoothing and edge optimization on the processed image set {y};
step S5, merging the processed region images by image addition to obtain the final processed image I';
in step S2, when the mask is generated, pixels just outside the edge of the marked region are retained, and these edge-band pixels are set to 0.5;
in step S3, each image in {x} is passed to the corresponding processing module according to its recognition label; when a house snow scene is generated, the image with the house label and a snow-scene generation instruction are passed to the house processing module; the processing module adjusts the parameters of the image and adds accumulated and falling snow; the processing module uses FCNs for semantic segmentation; during house processing and snow-scene generation, an FCN trained on a building data set identifies the components of the house where snow would settle; the processing module draws accumulated snow where snow would settle, adjusts brightness to simulate the ambient light of a snow scene according to the shadow relations of the building, and draws falling snow according to the occlusion relations; each processing module outputs its processed image to obtain the processed image set {y}, and also outputs image parameters for the subsequent parameter smoothing;
in step S4, according to the position of each image in {y}, the image parameters of the surrounding regions are combined to smooth the parameters of each region; the parameter smoothing is performed in combination with the semantic segmentation image D, i.e., where the parameters of neighboring regions differ greatly they are constrained region by region according to D, and a region's parameters change gradually toward those of its neighbors rather than abruptly; after the parameter smoothing is completed, the combined image obtained by adding the images in {y} may show picture tearing; because the processing modules may redraw or erase parts of an image, redrawing and erasing at image edges can leave the edges of the semantic segmentation regions in the composite image as gaps or black lines; occlusion is handled by the principle that the foreground covers the background, whereas picture tearing, i.e., the black lines at the edges, is handled by edge optimization; with accurate semantic segmentation, the edge optimization fills a black line by feathering the image edges on both sides of it; the edge blurring caused by feathering is alleviated by adjusting the sharpness of the composite image; in addition, black spots where image edges are discontinuous are erased by neighborhood pixel replacement or denoising algorithms;
in step S5, a scene scoring mechanism is introduced to distinguish the foreground-background relations of the scenery, and the occlusion relations are analyzed from the completeness and sharpness of the scenery in each region; according to the analysis results, foreground objects and occluders are given higher processing priority and a higher overlay ranking; image coverage is performed from low stacking level to high, i.e., a region with a higher stacking level covers a region with a lower one, and the stacked image is denoised.
2. The method according to claim 1, characterized in that in step S1 the fully convolutional neural network is a dilated convolutional neural network that accepts an input image of any size; the feature map of the last convolutional layer is up-sampled by a deconvolution layer back to the size of the input image, thereby generating a prediction for each pixel while preserving the spatial information of the original input image, and the pixels are finally classified on the up-sampled feature map.
CN201910220637.4A 2019-03-22 2019-03-22 Scenic image post-production method based on deep learning Active CN109949387B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910220637.4A CN109949387B (en) 2019-03-22 2019-03-22 Scenic image post-production method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910220637.4A CN109949387B (en) 2019-03-22 2019-03-22 Scenic image post-production method based on deep learning

Publications (2)

Publication Number Publication Date
CN109949387A (en) 2019-06-28
CN109949387B (en) 2023-07-07

Family

ID=67011362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910220637.4A Active CN109949387B (en) 2019-03-22 2019-03-22 Scenic image post-production method based on deep learning

Country Status (1)

Country Link
CN (1) CN109949387B (en)


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108335306A (en) * 2018-02-28 2018-07-27 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Learning and Understanding FCN: Fully Convolutional Networks for Semantic Segmentation; Developer Knowledge Base (开发者知识库); https://www.itdaan.com/blog/2018/05/23/c2a5f1192769d3d4ce7a5398053b68df.html; 2018-05-23; pp. 1-83 *
Research on Image-Based Protein Crystal Classification Methods; Yang Xi (杨曦); CNKI (知网); 2007-06-15; pp. 1-9 *

Also Published As

Publication number Publication date
CN109949387A (en) 2019-06-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant