CN110838131A - Method and device for realizing automatic cutout, electronic equipment and medium


Info

Publication number
CN110838131A
Authority
CN
China
Prior art keywords
image
matting
processed
determining
category
Legal status
Granted
Application number
CN201911066619.1A
Other languages
Chinese (zh)
Other versions
CN110838131B (en)
Inventor
吴凯琳
姜波
Current Assignee
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Application filed by Netease Hangzhou Network Co Ltd
Priority to CN201911066619.1A
Publication of CN110838131A
Application granted
Publication of CN110838131B
Legal status: Active
Anticipated expiration

Classifications

    • G06T 7/194: Image analysis; segmentation; edge detection involving foreground-background segmentation
    • G06F 18/241: Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/254: Pattern recognition; fusion techniques of classification results, e.g. of results related to same input data
    • G06T 5/40: Image enhancement or restoration using histogram techniques
    • G06T 2207/20081: Indexing scheme for image analysis or image enhancement; training; learning
    • G06T 2207/20084: Indexing scheme for image analysis or image enhancement; artificial neural networks [ANN]
    • G06T 2207/20221: Indexing scheme for image analysis or image enhancement; image fusion; image merging


Abstract

The embodiments of the present disclosure provide an automatic matting implementation method and apparatus, an electronic device, and a computer-readable storage medium, relating to the technical field of computers. The method for realizing automatic matting comprises the following steps: classifying an image to be processed from multiple dimensions to obtain multiple classification results, and determining a matting parameter according to the multiple classification results; fusing the classification results according to preset dimension weights to obtain a score corresponding to the image to be processed, and determining the category to which the image to be processed belongs according to the score; determining the similarity between a foreground region and a background region in the image to be processed, and determining a matting algorithm according to the similarity and the category; and performing matting processing on the image to be processed according to the matting algorithm and the matting parameter. Implementing the embodiments of the present disclosure can therefore realize automatic matting and adaptively select the required algorithm according to the regions to be matted in different images, thereby improving both matting efficiency and matting effect.

Description

Method and device for realizing automatic cutout, electronic equipment and medium
Technical Field
Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to an implementation method of automatic matting, an implementation apparatus of automatic matting, an electronic device, and a computer-readable storage medium.
Background
In the field of design, it is often necessary to cut a figure out of one image so that it can be applied in another image. In general, a person can draw, with a matting tool in image processing software, a boundary line that coincides with the edge of the portion of the image to be matted; but if that portion is irregular, drawing the boundary line can take a long time. Alternatively, a general-purpose algorithm can be designed and applied to all matting work to reduce labor cost, but an image usually needs to satisfy certain rules (e.g., a simple picture with clear lines) for such a general algorithm to apply. In this case, a single matting mode lacks adaptability to complex images, so the universality of this kind of matting method is not high; and manual matting likewise suffers from low matting efficiency.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
Therefore, the matting methods in the related art are limited when applied to large batches of matting work: when facing a large amount of matting work, manual matting greatly reduces matting efficiency. In addition, because the regions that need to be matted differ from image to image, applying one preset universal matting algorithm to all images easily leads to a single matting mode that lacks adaptability to complex image scenes. An improved method for realizing automatic matting is therefore needed, one that performs matting automatically and adaptively selects the required algorithm according to the regions to be matted in different images, so as to improve matting efficiency and matting effect.
In this context, embodiments of the present disclosure provide an implementation method of automatic matting, an implementation apparatus of automatic matting, an electronic device, and a computer-readable storage medium.
According to a first aspect of the embodiments of the present disclosure, a method for implementing automatic cutout is disclosed, which includes:
classifying the image to be processed from multiple dimensions to obtain multiple classification results, and determining a matting parameter according to the multiple classification results;
fusing the classification results according to preset dimension weights to obtain a score corresponding to the image to be processed, and determining the category of the image to be processed according to the score;
determining the similarity of a foreground region and a background region in an image to be processed, and determining a matting algorithm according to the similarity and the category;
and carrying out cutout processing on the image to be processed according to the cutout algorithm and the cutout parameters.
In one embodiment, based on the foregoing scheme, classifying the image to be processed from multiple dimensions includes:
carrying out convolution processing on an image to be processed through a deep convolution neural network to obtain a first image characteristic;
performing average pooling on the first image features to obtain second image features, and converting the second image features into third image features through a full-connection layer;
and determining the score of each category corresponding to the image to be processed in the multiple dimensions according to the third image characteristics so as to obtain multiple classification results corresponding to each dimension.
In one embodiment, based on the foregoing scheme, determining the category to which the image to be processed belongs according to the score includes:
determining a target score range where the score is located according to a preset classification rule, and determining the category corresponding to the target score range as the category to which the image to be processed belongs; the preset classification rule comprises a plurality of mutually non-intersecting score ranges, and each score range corresponds to one category.
In one embodiment, based on the foregoing scheme, determining a matting parameter from a plurality of classification results includes:
selecting at least two target categories from various categories corresponding to the images to be processed in multiple dimensions, and normalizing scores corresponding to the at least two target categories; wherein the score of the target category is higher than the scores of other categories in the corresponding multiple dimensions;
calculating a matting parameter according to the normalization processing result and a preset weight parameter.
In one embodiment, based on the foregoing scheme, determining the similarity between the foreground region and the background region in the image to be processed includes:
performing region division on an image to be processed to determine a foreground region and a background region, and calculating a foreground color histogram corresponding to the foreground region;
determining a color histogram corresponding to each background area, and fusing the color histograms according to a preset area weight to obtain a background color histogram;
and calculating the similarity of the foreground color histogram and the background color histogram as the similarity of a foreground region and a background region in the image to be processed.
In one embodiment, based on the foregoing scheme, performing matting processing on an image to be processed according to a matting algorithm and a matting parameter includes:
adjusting the discrimination of the image to be processed according to the matting parameter;
and separating the foreground and the background of the image to be processed after the discrimination adjustment according to a matting algorithm, and taking the foreground obtained by separation as a matting processing result.
In one embodiment, based on the foregoing scheme, taking the separated foreground as a matting processing result includes:
and softening the edges of the separated foreground through the edge softening parameters, and performing gradual change processing on the area in the preset range of the edges through the feather range parameters so as to take the foreground after the gradual change processing as a matting processing result.
According to the second aspect of the embodiment of the present disclosure, an implementation apparatus for automatic matting is disclosed, which includes an image classification unit, a classification result fusion unit, a region similarity determination unit, and a matting unit, wherein:
the image classification unit is used for classifying the image to be processed from multiple dimensions to obtain multiple classification results and determining a matting parameter according to the multiple classification results;
the classification result fusion unit is used for fusing the classification results according to the preset dimensionality weight to obtain a score corresponding to the image to be processed and determining the category of the image to be processed according to the score;
the region similarity determining unit is used for determining the similarity between a foreground region and a background region in the image to be processed and determining a matting algorithm according to the similarity and the category;
and the matting unit is used for carrying out matting processing on the image to be processed according to the matting algorithm and the matting parameters.
In an embodiment, based on the foregoing scheme, a manner of classifying the image to be processed from multiple dimensions by the image classification unit may specifically be:
the image classification unit performs convolution processing on the image to be processed through a deep convolution neural network to obtain a first image characteristic;
the image classification unit performs average pooling on the first image features to obtain second image features, and the second image features are converted into third image features through a full connection layer;
and the image classification unit determines the scores of all classes corresponding to the images to be processed in the multiple dimensions according to the third image characteristics so as to obtain multiple classification results corresponding to each dimension.
In an embodiment, based on the foregoing scheme, the manner in which the classification result fusion unit determines the category to which the image to be processed belongs according to the score may specifically be:
the classification result fusion unit determines a target score range where the score is located according to a preset classification rule, and determines the category corresponding to the target score range as the category to which the image to be processed belongs; the preset classification rule comprises a plurality of mutually non-intersecting score ranges, and each score range corresponds to one category.
In an embodiment, based on the foregoing scheme, the manner in which the image classification unit determines the matting parameter according to the multiple classification results may specifically be:
the image classification unit selects at least two target categories from all categories corresponding to the images to be processed in multiple dimensions, and normalizes scores corresponding to the at least two target categories; wherein the score of the target category is higher than the scores of other categories in the corresponding multiple dimensions;
and the image classification unit calculates the matting parameters according to the normalization processing result and the preset weight parameters.
In an embodiment, based on the foregoing scheme, the manner in which the classification result fusion unit determines the similarity between the foreground region and the background region in the image to be processed may specifically be:
the classification result fusion unit performs region division on the image to be processed to determine a foreground region and a background region and calculates a foreground color histogram corresponding to the foreground region;
the classification result fusion unit determines color histograms corresponding to the background regions, and fuses the color histograms according to preset region weights to obtain a background color histogram;
and the classification result fusion unit calculates the similarity of the foreground color histogram and the background color histogram as the similarity of a foreground region and a background region in the image to be processed.
In an embodiment, based on the foregoing scheme, the way for the matting unit to perform the matting processing on the to-be-processed image according to the matting algorithm and the matting parameter may specifically be:
the sectional drawing unit adjusts the discrimination of the image to be processed according to the sectional drawing parameters;
the matting unit separates the foreground and the background of the image to be processed after the discrimination adjustment according to a matting algorithm, and takes the separated foreground as a matting processing result.
In an embodiment, based on the foregoing scheme, a manner in which the matting unit takes the separated foreground as a matting processing result may specifically be:
the matting unit carries out softening treatment on the edges of the separated foreground through the edge softening parameters, and carries out gradual change treatment on the regions in the preset range of the edges through the feather range parameters, so that the foreground after the gradual change treatment is used as a matting treatment result.
According to a third aspect of the embodiments of the present disclosure, there is disclosed an electronic device comprising: a processor; and a memory having computer readable instructions stored thereon, the computer readable instructions when executed by the processor implementing the method of implementing automatic matting as disclosed in the first aspect.
According to a fourth aspect of the embodiments of the present disclosure, a computer program medium is disclosed, on which computer readable instructions are stored, which, when executed by a processor of a computer, cause the computer to perform the implementation method of automatic matting disclosed according to the first aspect of the present disclosure.
The image matting method and the device can classify an image to be processed (namely, the image to be subjected to matting) from multiple dimensions (such as scene dimensions, object dimensions, image complexity dimensions and the like) to obtain multiple classification results (such as the scene is indoor dim light, the object is a person, the image complexity belongs to a simple class and the like), and determine matting parameters according to the multiple classification results; furthermore, the classification results can be fused according to preset dimension weights to obtain a score corresponding to the image to be processed, and the category (such as an easy-to-understand category and a difficult-to-understand category) to which the image to be processed belongs is determined according to the score; furthermore, the similarity of a foreground region and a background region in the image to be processed can be determined, and a matting algorithm is determined according to the similarity and the category; and then, the image to be processed can be subjected to matting processing according to the matting algorithm and the matting parameters. Compared with the prior art, on one hand, automatic matting can be realized, and a required algorithm is selected in a self-adaptive manner according to areas needing matting of different images, so that the matting efficiency is improved, and the matting effect is improved; on the other hand, the matching degree of the determined matting algorithm and the image to be processed can be improved through multi-dimensional classification of the image to be processed, and therefore the matting effect is further improved.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
FIG. 1 illustrates a flow diagram of an implementation of automatic matting according to an example embodiment of the present disclosure;
FIG. 2 illustrates an architectural diagram of a deep convolutional neural network, according to an example embodiment of the present disclosure;
FIG. 3 illustrates another architectural schematic of a deep convolutional neural network according to an example embodiment of the present disclosure;
FIG. 4 is a diagram illustrating the result of region partitioning an image to be processed according to an example embodiment of the present disclosure;
FIG. 5 illustrates an architectural diagram of an implementation system for automatic matting according to an example embodiment of the present disclosure;
FIG. 6 is a block diagram illustrating an apparatus for automatic matting according to an example embodiment of the present disclosure;
fig. 7 is a block diagram illustrating an implementation of automatic matting according to another alternative example embodiment of the present disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present disclosure will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present disclosure, and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to the embodiment of the disclosure, an implementation method of automatic cutout, an implementation device of automatic cutout, an electronic device and a computer-readable storage medium are provided.
Any number of elements in the drawings are by way of example and not by way of limitation, and any nomenclature is used solely for differentiation and not by way of limitation.
The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments of the present disclosure.
Summary of The Invention
Matting refers to the technique of accurately extracting a certain portion from an original image, and it is one of the most common operations in image processing. A person can manually outline the region that needs to be matted, and the computer then cuts out the outlined region for subsequent use. However, manually outlining the region to be matted is often time-consuming, labor-intensive, and not very accurate. In view of this problem, one may design a matting algorithm through which image matting is achieved. Specifically, such a matting algorithm binarizes the image to obtain a binarized image (i.e., a black-and-white image), from which the edge information of the part to be matted can be obtained, so that matting can be performed according to the edge information. However, some images suffer from problems such as low contrast and complex content, so edge information cannot easily be obtained after binarization. As a result, the application range of such a matting algorithm is small and its requirements on the image are high (for example, high contrast, a pure picture, and a long exposure time); otherwise the matting algorithm cannot easily perform matting.
Based on the above problems, the applicant finds that they can be solved by designing a matting method with higher universality, one capable of matting different kinds of images. However, since there are very many kinds of images, the contents contained in images vary widely, and the contents to be matted also vary greatly. Therefore, the applicant proposes classifying the images that need matting as accurately as possible from multiple angles, with each classification corresponding to a matting algorithm specialized for that classification. Assigning a corresponding matting algorithm to each image can guarantee matting efficiency and matting effect to a certain extent, reduce the waste of manpower, and improve on the existing matting modes.
Having described the general principles of the present disclosure, various non-limiting embodiments of the present disclosure are described in detail below.
Application scene overview
It should be noted that the following application scenarios are merely illustrated to facilitate understanding of the spirit and principles of the present disclosure, and embodiments of the present disclosure are not limited in this respect. Rather, embodiments of the present disclosure may be applied to any scenario where applicable.
The embodiments of the present disclosure can be applied to scenes such as graphic design and image synthesis. For graphic design, image materials can be analyzed from multiple angles through the technical scheme of the present disclosure, and a suitable matting algorithm is assigned to each image material according to its classification so as to cut out the elements in it, making it convenient for designers to use those elements in their designs. For image synthesis, the category of the image to be matted can be analyzed from multiple angles through the technical scheme of the present disclosure, and a suitable matting algorithm is assigned to the image according to that category so as to cut out the required part, making it convenient to apply it in a scene where it is synthesized with other images.
Exemplary method
In combination with the application scenarios described above, an implementation method of automatic matting according to an exemplary embodiment of the present disclosure is described below with reference to fig. 1 to 4.
Referring to fig. 1, fig. 1 is a flow chart illustrating an implementation method of automatic matting according to an example embodiment of the present disclosure, which may be implemented by a server or a terminal device.
As shown in fig. 1, an implementation method of automatic matting according to one embodiment of the present disclosure includes:
step S110: classifying the image to be processed from multiple dimensions to obtain multiple classification results, and determining the matting parameters according to the multiple classification results.
Step S120: and fusing the classification results according to the preset dimension weight to obtain a score corresponding to the image to be processed, and determining the category to which the image to be processed belongs according to the score.
Step S130: and determining the similarity of the foreground region and the background region in the image to be processed, and determining a matting algorithm according to the similarity and the category.
Step S140: and carrying out cutout processing on the image to be processed according to the cutout algorithm and the cutout parameters.
These steps are described in detail below.
In step S110, the image to be processed is classified from multiple dimensions to obtain multiple classification results, and the matting parameter is determined according to the multiple classification results.
The multiple dimensions may include at least two dimensions of a scene dimension, an image complexity dimension, and an object dimension, and the multiple dimensions may further include a viewing angle dimension, a hue dimension, a brightness dimension, and the like.
In addition, each dimension also corresponds to a plurality of categories, wherein the image complexity dimension can correspond to at least two categories of a simple category, a general category, a medium category, a complex category, a difficult-to-solve category, a special category and the like. The scene dimensions may correspond to at least two of the categories of indoor bright light, indoor medium light, indoor dim light, outdoor bright light, outdoor medium light, outdoor low light, artificial dim light, artificial medium light, and artificial dim light. The object dimensions may correspond to at least two of the categories of people, animals, goods, food, buildings, vehicles, and plants.
In addition, the classification result is used for representing the class to which the image to be processed belongs in the dimension.
In addition, the matting parameter is used for adjusting the discrimination of the foreground and the background in the image to be processed, so that the foreground part in the image to be processed can be separated more easily.
In an alternative embodiment, classifying the image to be processed from multiple dimensions includes:
carrying out convolution processing on an image to be processed through a deep convolution neural network to obtain a first image characteristic;
performing average pooling on the first image features to obtain second image features, and converting the second image features into third image features through a full-connection layer;
and determining the score of each category corresponding to the image to be processed in the multiple dimensions according to the third image characteristics so as to obtain multiple classification results corresponding to each dimension.
The method for obtaining the first image feature by performing convolution processing on the image to be processed through the deep convolution neural network can be as follows:
determining a pixel matrix corresponding to an image to be processed, and multiplying a preset n x n convolution kernel in a deep convolution neural network by the pixel matrix according to a preset step length to obtain a convolution result matrix, wherein n is a positive integer; the deep convolutional neural network comprises a plurality of convolutional layers, so that the server executes the convolution step for a plurality of times, and the convolution result matrixes obtained by the convolutional layers are the first image characteristics, so that the number of the first image characteristics can be multiple. In addition, the predetermined convolution kernels of the plurality of convolution layers may be the same or different, and the embodiment of the present disclosure is not limited thereto.
After the first image feature is obtained, there are two possible execution modes: one is to perform the above-mentioned average pooling of the first image feature, and the other is to perform maximum pooling of the first image feature. The embodiments of the present disclosure illustrate the average-pooling mode, but this does not prevent an embodiment from obtaining the second image feature by maximum pooling of the first image feature and then performing the above-mentioned conversion of the second image feature into the third image feature through the fully-connected layer.
The method for obtaining the second image features by performing average pooling on the first image features may be as follows: the first image features are averaged and pooled by a predetermined window size (e.g., 2 x 2) to obtain second image features. The second image features may be multiple, and the deep convolutional neural network may include multiple pooling layers, similarly to the above multiple convolutional layers. In addition, the dimension of the second image feature may be 2048 dimensions, 1024 dimensions, or other dimensions, and the embodiment of the present disclosure is not limited. In addition, the second image features corresponding to the to-be-processed image obtained after the convolution and average pooling of the image can be multiplexed in the full connection layer, and therefore, the waste of computer resources can be reduced and the matting efficiency can be improved by performing the feature extraction operation once and multiplexing the extracted features.
The method for converting the second image feature into the third image feature through the full connection layer may be: performing feature splicing on the plurality of second image features through a full-connection layer to obtain third image features so that the third image features are suitable for the classification task of the dimension, wherein the third image features are high-dimension vectors used for representing the part to be subjected to image matting under the dimension; wherein the third image feature may be plural and a dimension of the third image feature may be the same as a dimension of the second image feature. In addition, optionally, after the second image feature is converted into the third image feature through the full connection layer, the method may further include the following steps: the random inactivation process is performed on the third image feature, which can also be understood as setting the partial value in the third image feature to 0 to avoid the over-fitting problem to some extent.
The method for determining the scores of the categories corresponding to the image to be processed in the multiple dimensions according to the third image feature, so as to obtain the multiple classification results corresponding to each dimension, may be as follows: perform feature conversion on the third image feature through another fully-connected layer to obtain a fourth image feature of the same dimensionality, and determine the score of each category corresponding to the image to be processed in the multiple dimensions according to the fourth image feature, where each dimension corresponds to multiple classification results and a classification result represents the probability of dividing the image to be processed into the corresponding category in that dimension. The scores of the categories are determined from the fourth image feature through a softmax function; the fourth image feature is a recombination of the third image feature and, compared with the third image feature, can represent the part to be matted more accurately.
In addition, optionally, determining the scores of the categories in the multiple dimensions according to the third image feature, so as to obtain the multiple classification results of each dimension, belongs to the classification stage for the image to be matted, while the convolution, average pooling, and full connection performed on the image to be processed belong to the feature extraction stage (which can also be understood as a feature conversion stage). The feature extraction stage and the classification stage can be split, with the categories of different dimensions computed along multiple parallel paths, so as to improve the classification efficiency and classification accuracy for the image to be matted.
It should be noted that the softmax function is an activation function: it maps the third image feature to the range from 0 to 1, so that the mapped values represent the probabilities of the categories of each dimension corresponding to the image to be processed, and the probabilities of the categories of each dimension sum to 1. Each dimension thus corresponds to one classification result, which improves the accuracy of classification. For example, if the scores corresponding to the simple, general, medium, complex, difficult-to-solve, and special categories in the image complexity dimension are 0.1, 0.1, 0.1, 0.5, 0.1, and 0.1 respectively, then the category corresponding to the image to be processed in the image complexity dimension should be the complex category.
Referring to fig. 2, fig. 2 is a schematic diagram illustrating the architecture of a deep convolutional neural network according to an exemplary embodiment of the present disclosure. As shown in fig. 2, the deep convolutional neural network includes a convolution layer 202, an average pooling layer 204, a fully-connected layer 206, a fully-connected layer 208, and an activation function 210. After the image 201 to be processed is input into the convolution layer 202, the pixel matrix corresponding to the image to be processed is determined, and a convolution result matrix, namely the first image feature 203, is obtained by multiplying a preset n × n convolution kernel in the deep convolutional neural network by the pixel matrix according to a preset step length. Further, the first image feature 203 may be input into the average pooling layer 204, so that the average pooling layer 204 average-pools the first image feature over a preset window size to obtain the second image feature 205. Furthermore, a plurality of second image features 205 may be feature-spliced through the fully-connected layer 206 to obtain a third image feature 207, and feature conversion may be performed on the third image feature 207 through another fully-connected layer 208 to obtain a fourth image feature 209 with the same dimensionality. Further, the scores 211 of each category corresponding to the image to be processed in the multiple dimensions may be determined through the activation function 210 and the fourth image feature 209.
It should be noted that the to-be-processed image 201, the first image feature 203, the second image feature 205, the third image feature 207, the fourth image feature 209, and the scores 211 of the categories corresponding to the to-be-processed images in multiple dimensions shown in fig. 2 respectively correspond to the scores of the categories corresponding to the to-be-processed image, the first image feature, the second image feature, the third image feature, the fourth image feature, and the to-be-processed images in multiple dimensions in the above-described alternative embodiment.
Referring to fig. 3 in conjunction with the schematic architecture diagram of the deep convolutional neural network shown in fig. 2, fig. 3 is another schematic architecture diagram of the deep convolutional neural network according to an exemplary embodiment of the present disclosure. As shown in fig. 3, another architecture diagram includes an input layer 301, a CNN layer 302, an output layer 303, a full connection layer 304, and an active layer 305; the CNN layer 302 is a convolutional neural network layer, and is used for feature extraction.
The input to the input layer 301 may be the image to be processed 201 shown in fig. 2, the CNN layer 302 may include the convolution layer 202 and the average pooling layer 204 shown in fig. 2, and the first image feature 203 shown in fig. 2 may be obtained as an input to the average pooling layer 204 after the convolution layer 202 convolves the image to be processed 201. Further, the average pooling layer 204 may obtain the second image feature 205 as the image feature output by the output layer 303 by performing an average pooling process on the first image feature 203, so as to complete the feature extraction stage. In addition, the fully-connected layer 304 may include the fully-connected layer 206 and the fully-connected layer 208 shown in fig. 2, wherein the fully-connected layer 206 may perform feature stitching on the plurality of second image features 205 to obtain a third image feature 207, so as to complete the work of the feature conversion stage. In the classification stage, the third image features 207 are subjected to feature conversion through another full-connection layer 208 to obtain fourth image features 209 with the same dimension, so that the converted image features can adapt to the classification task of the dimension; wherein another fully-connected layer 208 may multiplex the features obtained in the feature extraction stage, and by performing the feature extraction operation only once, the repetitive operation and time consumption may be reduced. In addition, the activation layer 305 may include the activation function 210 shown in fig. 2, scores 211 of each category corresponding to the image to be processed in the multiple dimensions may be determined through the activation function 210 and the fourth image feature 209, and the sum of the scores of each category of each dimension may be guaranteed to be 1 through the activation function 210 in the activation layer 305, so that it is guaranteed that each dimension has one or only one classification result, and accuracy of the classification result is improved. In addition, it should be noted that, in the feature conversion stage and the classification stage, multiple parallel computations can be performed to increase the computation speed.
Therefore, by implementing the optional embodiment, the important features in the image to be processed can be extracted by performing multi-layer feature processing on the image to be processed, and then the probability that the image to be processed is divided into various categories in each dimension can be determined according to the important features, so that the matting effect on the image to be processed can be improved according to the category to which the image to be processed belongs.
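For illustration only, the classification pipeline described above (convolution, average pooling, a shared fully-connected layer, a per-dimension fully-connected head, and a softmax activation, as in figs. 2 and 3) might be sketched as follows. This is a minimal sketch, not the network disclosed in the patent: the layer sizes, the 2048-dimensional feature width, and the three example dimensions with their category counts are assumptions.

```python
import torch
import torch.nn as nn

class MultiDimensionClassifier(nn.Module):
    """Sketch of the network in figs. 2-3: shared CNN features, one
    fully-connected head per classification dimension, softmax scores."""

    def __init__(self, dim_num_classes=None):
        super().__init__()
        if dim_num_classes is None:
            # Assumed example dimensions and category counts for illustration.
            dim_num_classes = {"scene": 9, "object": 7, "complexity": 6}
        # Feature extraction stage (CNN layer 302): convolution + average pooling.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),   # preset n x n kernel
            nn.ReLU(),
            nn.AvgPool2d(2),                              # average pooling window
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                      # global average pooling
        )
        # Feature conversion stage (layer 206): shared fully-connected layer,
        # followed by random inactivation (dropout) to limit over-fitting.
        self.shared_fc = nn.Sequential(nn.Linear(128, 2048), nn.ReLU(), nn.Dropout(0.5))
        # Classification stage (layer 208): one head per dimension; the heads
        # multiplex the shared features and can be evaluated in parallel.
        self.heads = nn.ModuleDict(
            {dim: nn.Linear(2048, n) for dim, n in dim_num_classes.items()}
        )

    def forward(self, x):
        feat = self.backbone(x).flatten(1)   # first/second image features
        feat = self.shared_fc(feat)          # third image feature
        # Softmax maps each head's output into [0, 1] with the per-dimension
        # scores summing to 1, i.e. per-category probabilities.
        return {dim: torch.softmax(head(feat), dim=1) for dim, head in self.heads.items()}

# Example: score one 224 x 224 RGB image to be processed in every dimension.
model = MultiDimensionClassifier()
scores = model(torch.randn(1, 3, 224, 224))
print({dim: s.shape for dim, s in scores.items()})
```

Because every head consumes the same extracted features, the feature extraction pass runs only once, matching the feature multiplexing described above.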
In another alternative embodiment, determining the matting parameter from a plurality of classification results comprises:
selecting at least two target categories from various categories corresponding to the images to be processed in multiple dimensions, and normalizing scores corresponding to the at least two target categories; wherein the score of the target category is higher than the scores of other categories in the corresponding multiple dimensions;
And calculating the matting parameters according to the normalization processing result and the preset weight parameters.
The manner of selecting at least two target categories from the categories corresponding to the image to be processed in the multiple dimensions may be: selecting at least two target categories from those categories according to a preset selection rule, where the preset selection rule may be: selecting the categories corresponding to the first N scores, with the scores in the dimension sorted from high to low, as the target categories, N being a positive integer. For example, if the scores corresponding to the simple, general, medium, complex, difficult-to-solve, and special categories in the image complexity dimension are 0.1, 0.1, 0.1, 0.2, 0.3, and 0.2 respectively, and the preset selection rule is to select the categories corresponding to the first 3 scores in the dimension sorted from high to low, then the complex, difficult-to-solve, and special categories, corresponding to the scores 0.2, 0.3, and 0.2, are the target categories.
The method for normalizing the scores corresponding to the at least two target categories may be as follows: determine the normalization parameter S_i and the score b_i corresponding to each category in the dimension, and calculate the weight coefficient W_i corresponding to each category in the dimension as the normalization result according to an expression of the form

W_i = (S_i · b_i) / Σ_j (S_j · b_j), j = 1, ..., n,

where i denotes a category in the dimension and n is the number of categories in the dimension.
The method for calculating the matting parameter according to the normalization result and the preset weight parameters may be as follows: calculate the matting parameter of the image to be processed according to an expression of the form

matting parameter = Σ_i W_i · p_i, i = 1, ..., n,

where W_i is the normalization result and p_i is the preset weight parameter, i.e., the preset optimal parameter of the i-th category in the dimension.
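As a numeric sketch of the two steps above, under the reconstructed expressions (which are assumptions, since the patent renders the original expressions only as images): select the top-N categories, normalize S_i · b_i over them, and take the weighted sum with the preset optimal parameters p_i. The values of s and p below are illustrative assumptions.

```python
import numpy as np

def matting_parameter(b, s, p, top_n=3):
    """Sketch: b are the category scores of one dimension, s the normalization
    parameters S_i, p the preset optimal parameters p_i (reconstructed forms)."""
    b, s, p = (np.asarray(a, dtype=float) for a in (b, s, p))
    # Target categories: the categories with the top-N scores in the dimension.
    target = np.argsort(b)[::-1][:top_n]
    # Normalization result: W_i = S_i * b_i / sum_j (S_j * b_j) over the targets.
    w = s[target] * b[target]
    w /= w.sum()
    # Matting parameter: weighted sum with the preset optimal parameters.
    return float(np.dot(w, p[target]))

# Scores for simple/general/medium/complex/difficult-to-solve/special from the
# selection example; s and p are illustrative assumptions.
b = [0.1, 0.1, 0.1, 0.2, 0.3, 0.2]
s = [1.0] * 6
p = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
print(matting_parameter(b, s, p))
```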
Therefore, by implementing the optional embodiment, the category with higher importance can be selected as the target category according to the score corresponding to the category, and the matting parameter for image matting is calculated according to the normalization of the score of the target category, so as to improve the matting effect.
In step S120, the classification results are fused according to the preset dimensionality weight to obtain a score corresponding to the image to be processed, and a category to which the image to be processed belongs is determined according to the score.
The method for fusing the classification results according to the preset dimension weight to obtain the score corresponding to the image to be processed can be as follows: determining a weight value corresponding to each category in each dimension in preset dimension weights, calculating a weighted sum corresponding to each dimension according to the weight value and the score of each category in each dimension, and determining the weighted sum as the score corresponding to the image to be processed in the dimension. Since the image to be processed corresponds to one score in each dimension, the score corresponding to the image to be processed may be multiple.
In addition, the weighted sum corresponding to each dimension is calculated from the weight value and the score of each category in the dimension by

score = Σ_i X_i · C_i, i = 1, ..., N,

where X_i is the weight value of each category in the dimension, C_i is the score of each category in the dimension, and N is the number of categories contained in the dimension. For example, if a dimension contains 5 categories, then N = 5; if a dimension contains 8 categories, then N = 8. If the weight values corresponding to the simple, general, medium, complex, difficult-to-solve, and special categories in the image complexity dimension are X_i (X_i = i, i ∈ [1, 6]), and the scores corresponding to those categories are 0.1, 0.1, 0.1, 0.1, 0.5, and 0.1 respectively, then the weighted sum corresponding to the image complexity dimension is 0.1 × 1 + 0.1 × 2 + 0.1 × 3 + 0.1 × 4 + 0.5 × 5 + 0.1 × 6 = 4.1, and 4.1 is therefore the score corresponding to the image to be processed in the image complexity dimension.
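This fusion step can be reproduced in a few lines; the weight values X_i = i and the scores are the example values from the text:

```python
def fuse_dimension(weights, scores):
    """Weighted sum of the per-category scores of one dimension:
    score = sum_i X_i * C_i."""
    return sum(x * c for x, c in zip(weights, scores))

# Image complexity dimension from the example: weight values X_i = i and
# scores C_i for the six categories.
weights = [1, 2, 3, 4, 5, 6]
scores = [0.1, 0.1, 0.1, 0.1, 0.5, 0.1]
print(fuse_dimension(weights, scores))  # 4.1
```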
In an alternative embodiment, determining the category to which the image to be processed belongs according to the score includes:
determining a target score range where the score is located according to a preset classification rule, and determining the category corresponding to the target score range as the category to which the image to be processed belongs; the preset classification rule comprises a plurality of mutually non-intersecting score ranges, and each score range corresponds to one category.
The preset classification rule is used for classifying the scores. The method for determining the target score range of a score according to the preset classification rule may be as follows: determine the plurality of score ranges in the preset classification rule corresponding to each dimension, and determine, according to the dimension to which the score belongs, the target score range of the score within the preset classification rule of that dimension, the plurality of score ranges including the target score range. In addition, there is no intersection between the score ranges within the preset classification rule of a single dimension, but score ranges of different dimensions may intersect.
The manner of determining the category corresponding to the target score range as the category to which the image to be processed belongs may be: and determining a category corresponding to the target score range, and determining the category as the category to which the image to be processed belongs.
For example, suppose the score corresponding to the image to be processed in the image complexity dimension is 4.1, and the score ranges in the preset classification rule corresponding to the image complexity dimension are (0,1], (1,2], (2,3], (3,4], (4,5], and (5,6], whose corresponding categories are the simple, general, medium, complex, difficult-to-solve, and special categories respectively. The score 4.1 falls in the score range (4,5], so (4,5] is the above-mentioned target score range; and since the category corresponding to (4,5] is the difficult-to-solve category, the category to which the image to be processed belongs in the image complexity dimension is the difficult-to-solve category.
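A sketch of this preset classification rule, using the disjoint half-open ranges from the example:

```python
# Preset classification rule for the image complexity dimension: mutually
# non-intersecting half-open score ranges (low, high], one category each.
COMPLEXITY_RULE = [
    ((0, 1), "simple"),
    ((1, 2), "general"),
    ((2, 3), "medium"),
    ((3, 4), "complex"),
    ((4, 5), "difficult-to-solve"),
    ((5, 6), "special"),
]

def classify(score, rule=COMPLEXITY_RULE):
    for (low, high), category in rule:
        if low < score <= high:  # target score range containing the score
            return category
    raise ValueError("score outside all preset score ranges")

print(classify(4.1))  # difficult-to-solve
```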
Generally speaking, existing classification methods usually determine the category corresponding to the maximum score among the category scores as the category of the image to be processed in that dimension; but when the scores of the categories are close to each other, the accuracy of the category determined in this way is not high. With the score fusion method of the present disclosure, the scores of the categories in a dimension are fused according to the weight values corresponding to the categories, which avoids, to a certain extent, the inaccurate classification results caused by that situation, improves the classification effect on the image to be processed, and thereby improves the subsequent matting effect.
In step S130, the similarity between the foreground region and the background region in the image to be processed is determined, and a matting algorithm is determined according to the similarity and the category.
The matting algorithm may be a threshold method, a flooding method, a watershed method, an image segmentation method, a deep learning method, or the like, and the embodiment of the disclosure is not limited.
In an exemplary embodiment of the present disclosure, optionally, determining the similarity between the foreground region and the background region in the image to be processed includes:
performing region division on an image to be processed to determine a foreground region and a background region, and calculating a foreground color histogram corresponding to the foreground region;
determining a color histogram corresponding to each background area, and fusing the color histograms according to a preset area weight to obtain a background color histogram;
and calculating the similarity of the foreground color histogram and the background color histogram as the similarity of a foreground region and a background region in the image to be processed.
The method for dividing the image to be processed into regions to determine the foreground region and the background region may be as follows: divide the image to be processed into N (N is a positive integer) areas, determine each area that intersects the frame of the image to be processed as a background area, and determine each area that does not intersect the frame as a foreground area. For example, referring to fig. 4, fig. 4 is a schematic diagram illustrating a result of dividing an image to be processed into regions according to an exemplary embodiment of the disclosure. As shown in fig. 4, the image to be processed can be divided into 9 regions, of which regions 2-9 intersect the frame of the image to be processed while region 1 does not. Therefore, region 1 can be determined as a foreground region and regions 2-9 can be determined as background regions.
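A sketch of this 3 × 3 division, assuming the image is a NumPy array; the single interior block plays the role of region 1 in fig. 4:

```python
import numpy as np

def divide_regions(image, grid=3):
    """Split the image into grid x grid blocks; blocks intersecting the image
    frame become background regions, interior blocks foreground regions."""
    h, w = image.shape[:2]
    ys = np.linspace(0, h, grid + 1, dtype=int)
    xs = np.linspace(0, w, grid + 1, dtype=int)
    foreground, background = [], []
    for i in range(grid):
        for j in range(grid):
            block = image[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            # A block intersects the frame iff it lies on the outer ring.
            if i in (0, grid - 1) or j in (0, grid - 1):
                background.append(block)
            else:
                foreground.append(block)  # region 1 in fig. 4
    return foreground, background

fg, bg = divide_regions(np.zeros((300, 300, 3), dtype=np.uint8))
print(len(fg), len(bg))  # 1 foreground region, 8 background regions
```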
Here, the color histogram is a color feature widely adopted in many image retrieval systems. It describes the proportions of different colors in the whole image without caring about the spatial position of each color, i.e., it cannot describe the objects in the image. The foreground color histogram and the background color histogram represent the distribution curves of the colors of each color channel.
The method for obtaining the background color histogram by fusing the color histograms according to the preset region weight value may be as follows: and determining preset region weights corresponding to a plurality of regions obtained by region division, and calculating the weighted sum of the preset region weights and the color histograms of the corresponding background regions to obtain a background color histogram. Wherein, the background color histogram is integrated with the color distribution of each region under different region weights. Therefore, corresponding weights can be distributed according to the importance degrees of different background areas, so that the finally obtained background color histogram is more accurate, and the improvement on the matting effect of the to-be-processed image is facilitated.
The method for calculating the similarity between the foreground color histogram and the background color histogram as the similarity between the foreground region and the background region in the image to be processed may be: calculating a cosine distance s between the foreground color histogram and the background color histogram, and calculating the similarity of the foreground color histogram and the background color histogram through an expression (s + 1)/2; the cosine distance is used for measuring the difference between two individuals through the cosine value of an included angle between two vectors in a vector space, and the numerical range of the cosine distance belongs to [ -1,1 ].
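Putting these three steps together as a sketch (per-channel color histograms, a region-weighted background histogram, and the cosine-based similarity (s + 1)/2); the uniform region weights and the bin count are assumptions:

```python
import numpy as np

def color_histogram(region, bins=32):
    """Concatenated per-channel color histogram, normalized to proportions."""
    hist = np.concatenate(
        [np.histogram(region[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    ).astype(float)
    return hist / hist.sum()

def fg_bg_similarity(fg_blocks, bg_blocks, region_weights=None):
    # Foreground color histogram over all foreground pixels.
    fg_hist = color_histogram(np.concatenate([b.reshape(-1, 3) for b in fg_blocks]))
    # Background color histogram: fuse the per-region histograms according to
    # the preset region weights (uniform weights assumed here).
    if region_weights is None:
        region_weights = [1.0 / len(bg_blocks)] * len(bg_blocks)
    bg_hist = sum(w * color_histogram(b) for w, b in zip(region_weights, bg_blocks))
    # Cosine distance s lies in [-1, 1]; map it to a similarity via (s + 1) / 2.
    s = float(np.dot(fg_hist, bg_hist) / (np.linalg.norm(fg_hist) * np.linalg.norm(bg_hist)))
    return (s + 1) / 2

rng = np.random.default_rng(0)
fg = [rng.integers(0, 256, (100, 100, 3))]                     # one foreground region
bg = [rng.integers(0, 256, (100, 100, 3)) for _ in range(8)]   # eight background regions
print(fg_bg_similarity(fg, bg))
```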
Therefore, by implementing the optional embodiment, the similarity degree of the image foreground and the background can be considered, and the matting algorithm more suitable for the image to be processed is determined according to the similarity degree, so that the matting efficiency of the image to be processed can be improved.
In step S140, the image to be processed is subjected to matting processing according to the matting algorithm and the matting parameter.
In an exemplary embodiment of the present disclosure, optionally, performing matting processing on an image to be processed according to a matting algorithm and a matting parameter includes:
adjusting the discrimination of the image to be processed according to the matting parameter;
and separating the foreground and the background of the image to be processed after the discrimination adjustment according to a matting algorithm, and taking the foreground obtained by separation as a matting processing result.
Specifically, taking the separated foreground as the matting processing result comprises the following step:
softening the edges of the separated foreground through the edge softening parameter, and applying gradual-change processing to the region within the preset range of the edges through the feather range parameter, so that the foreground after the gradual-change processing is taken as the matting processing result.
The discrimination of the image to be processed may be adjusted according to the matting parameters as follows: adjust the contrast, brightness, sharpness, and dynamic range of the image to be processed according to the matting parameters. Adjusting the discrimination, i.e. the degree of distinction between foreground and background, can be understood as follows: if the image to be processed was shot in an over-bright outdoor scene, its brightness can be reduced through the matting parameters, improving the distinction between foreground and background; if it was shot in a dark outdoor scene, its brightness can be raised through the matting parameters to the same end.
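A sketch of such an adjustment as a linear brightness/contrast transform; the mapping from the matting parameters to (alpha, beta) is an assumption here:

```python
import cv2

def adjust_discrimination(image, alpha=1.2, beta=-20):
    """Linear contrast (alpha) and brightness (beta) adjustment: alpha > 1
    stretches contrast, beta < 0 darkens an over-bright outdoor shot,
    beta > 0 brightens a dark one. Values are illustrative."""
    return cv2.convertScaleAbs(image, alpha=alpha, beta=beta)
```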
The foreground and the background of the discrimination-adjusted image may be separated according to the matting algorithm by a threshold method, a flood-fill method, a watershed method, an image segmentation method, or a deep learning method.
Specifically and optionally, the threshold method may separate the foreground and the background of the discrimination-adjusted image as follows: determine the pixels that meet a preset threshold requirement as foreground pixels and the pixels that do not as background pixels, converting the image to be processed into a binary image and thereby separating the foreground from the background.
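For instance, with Otsu's method standing in for the preset threshold (an assumption; the patent does not fix how the threshold is chosen):

```python
import cv2

def separate_by_threshold(gray):
    """Binarize a grayscale image: pixels above the Otsu threshold become
    foreground (255), the rest background (0)."""
    _, mask = cv2.threshold(gray, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask
```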
Specifically and optionally, the flood-fill method may separate the foreground and the background of the discrimination-adjusted image as follows: starting from a target vertex of the image to be processed, traverse the adjacent pixels in each direction in turn, marking the pixels that meet the preset threshold requirement; when the traversal finishes, the foreground is separated from the background.
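A sketch using OpenCV's flood fill seeded at the top-left vertex, assuming that corner belongs to the background; tol is an illustrative tolerance:

```python
import cv2
import numpy as np

def separate_by_floodfill(gray, tol=10):
    """Flood-fill from the (0, 0) vertex: neighbors within tol of the seed
    are marked as background; everything left unmarked is foreground."""
    h, w = gray.shape
    mask = np.zeros((h + 2, w + 2), np.uint8)  # floodFill needs a 1-px border on each side
    cv2.floodFill(gray.copy(), mask, (0, 0), 255, loDiff=tol, upDiff=tol)
    filled = mask[1:-1, 1:-1]          # 1 where the background was reached
    return (1 - filled) * 255          # 255 = foreground
```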
Specifically and optionally, the watershed method may separate the foreground and the background of the discrimination-adjusted image as follows: connect the pixels whose spatial distance is smaller than a spatial threshold and whose gray-value difference is smaller than a gray threshold into a closed contour, thereby separating the foreground from the background.
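A sketch based on OpenCV's marker-driven watershed; seeding it from the region division above is an assumption, since the patent does not specify how the initial markers are obtained:

```python
import cv2
import numpy as np

def separate_by_watershed(image_bgr, fg_seed, bg_seed):
    """Grow foreground/background seed masks with the watershed transform;
    the flooding stops where the two meet, tracing the closed contour
    between foreground and background."""
    markers = np.zeros(image_bgr.shape[:2], np.int32)
    markers[bg_seed > 0] = 1   # known background
    markers[fg_seed > 0] = 2   # known foreground
    markers = cv2.watershed(image_bgr, markers)  # boundary pixels become -1
    return (markers == 2).astype(np.uint8) * 255
```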
The edges of the separated foreground may be softened through the edge softening parameters as follows: resample the edges of the separated foreground according to the edge softening parameters so that abrupt changes along the edges are reduced, making the edges smoother and achieving the softening effect.
Applying gradual-change processing to the region within the preset range of the edges through the feather range parameter produces a feathering effect, and the foreground with the feathering effect can then be taken as the matting processing result.
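A sketch combining both post-processing steps on a binary foreground mask; soften_ksize and feather_px stand in for the edge-softening and feather-range parameters, and the exact mapping is an assumption:

```python
import cv2
import numpy as np

def soften_and_feather(fg_mask, soften_ksize=5, feather_px=10):
    """Soften the hard mask edge with a blur, then feather it with a
    distance-based alpha ramp inside feather_px of the edge. Returns a
    per-pixel opacity for compositing the cut-out foreground."""
    alpha = fg_mask.astype(np.float32) / 255.0
    alpha = cv2.GaussianBlur(alpha, (soften_ksize, soften_ksize), 0)  # softening
    dist = cv2.distanceTransform(fg_mask, cv2.DIST_L2, 5)             # px to edge
    alpha = np.minimum(alpha, np.clip(dist / feather_px, 0.0, 1.0))   # feathering
    return alpha
```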
Therefore, by implementing the optional embodiment, the matting effect on the image to be processed can be improved by adjusting the foreground and background region indexes, and the softening processing and the gradual change processing on the foreground region can further improve the matting effect and improve the use experience of the user.
Therefore, implementing the method for realizing automatic matting shown in fig. 1 enables automatic matting, adaptively selecting the required algorithm according to the region of each image that needs matting, which improves both the matting efficiency and the matting effect; moreover, the multi-dimensional classification of the image to be processed improves the match between the determined matting algorithm and the image, further improving the matting effect.
Turning to fig. 5, fig. 5 is an architectural diagram of an implementation system for automatic matting according to an example embodiment of the present disclosure. As shown in fig. 5, the system for implementing automatic matting comprises an input module 501, a classifier module 502, an adaptive algorithm and parameter selection module 503, an algorithm execution module 504 and an output module 505, specifically:
an input module 501 is used for inputting an image to be processed.
The classifier module 502 is configured to perform convolution processing on the image to be processed through a deep convolution neural network to obtain a first image feature; carrying out average pooling on the first image characteristics to obtain second image characteristics, and converting the second image characteristics into third image characteristics through a full connection layer; determining the score of each category corresponding to the image to be processed in the multiple dimensions according to the third image characteristics so as to obtain multiple classification results corresponding to each dimension; and determining a target score range where the score is located according to a preset classification rule, and determining a category corresponding to the target score range as a category to which the image to be processed belongs.
The adaptive algorithm and parameter selection module 503 is configured to select at least two target categories from the categories corresponding to the image to be processed in the multiple dimensions, normalize the scores corresponding to the at least two target categories, determine a classification result according to the normalization result, and determine a matting parameter according to the classification result.
An algorithm execution module 504, configured to: perform region division on the image to be processed to determine a foreground region and a background region, and calculate a foreground color histogram corresponding to the foreground region; determine a color histogram corresponding to each background region, and fuse the color histograms according to the preset region weights to obtain a background color histogram; calculate the similarity of the foreground color histogram and the background color histogram as the similarity of the foreground region and the background region in the image to be processed; adjust the discrimination of the image to be processed according to the matting parameter; separate the foreground and the background of the discrimination-adjusted image according to the matting algorithm, taking the separated foreground as the matting processing result; and soften the edges of the separated foreground through the edge softening parameters and apply gradual-change processing to the region within the preset range of the edges through the feather range parameter, so that the foreground after the gradual-change processing is taken as the matting processing result.
And an output module 505 for outputting the matting processing result.
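Putting the modules together, a minimal orchestration of the fig. 5 pipeline, reusing the helper sketches from the method section above (split_regions, color_histogram, fused_background_histogram, histogram_similarity, adjust_discrimination, separate_by_threshold, soften_and_feather); the fixed algorithm choice and parameter values stand in for modules 502/503:

```python
import cv2

def automatic_matting_pipeline(image_bgr, region_weights=(1.0,) * 8):
    # Input module 501: image_bgr is the image to be processed.
    fg_patches, bg_patches = split_regions(image_bgr)
    fg_hist = color_histogram(fg_patches[0])
    bg_hist = fused_background_histogram(bg_patches, region_weights)
    similarity = histogram_similarity(fg_hist, bg_hist)
    # Modules 502/503 would pick the algorithm and parameters from the
    # classifier scores and this similarity; here a threshold method with
    # fixed parameters stands in for that selection.
    adjusted = adjust_discrimination(image_bgr, alpha=1.2, beta=0)
    gray = cv2.cvtColor(adjusted, cv2.COLOR_BGR2GRAY)
    fg_mask = separate_by_threshold(gray)        # algorithm execution module 504
    alpha = soften_and_feather(fg_mask)
    return alpha                                 # output module 505
```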
Therefore, the implementation of the automatic matting implementation system shown in fig. 5 can realize automatic matting, and the required algorithm is adaptively selected according to the regions of different images to be matted, so as to improve matting efficiency and improve matting effect; and the matching degree of the determined matting algorithm and the image to be processed can be improved through multi-dimensional classification of the image to be processed, so that the matting effect is further improved.
Moreover, although the steps of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in this particular order, or that all of the depicted steps must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions, etc.
Exemplary Medium
Having described the methods of the exemplary embodiments of the present disclosure, the media of the exemplary embodiments of the present disclosure will now be described.
In some possible embodiments, various aspects of the present disclosure may also be implemented as a medium having program code stored thereon; when the program code is executed by a processor of a device, it implements the steps in the method for realizing automatic matting according to the various exemplary embodiments of the present disclosure described in the above-mentioned "Exemplary Methods" section of this specification.
Specifically, the processor of the device, when executing the program code, is configured to implement the following steps: classifying the image to be processed from multiple dimensions to obtain multiple classification results, and determining a matting parameter according to the multiple classification results; fusing the classification results according to preset dimension weights to obtain a score corresponding to the image to be processed, and determining the category of the image to be processed according to the score; determining the similarity of a foreground region and a background region in an image to be processed, and determining a matting algorithm according to the similarity and the category; and carrying out cutout processing on the image to be processed according to the cutout algorithm and the cutout parameters.
In some embodiments of the disclosure, the program code is further configured to, when executed by the processor of the device, perform the following steps: carrying out convolution processing on an image to be processed through a deep convolution neural network to obtain a first image characteristic; performing average pooling on the first image features to obtain second image features, and converting the second image features into third image features through a full-connection layer; determining the score of each category corresponding to the image to be processed in multiple dimensions according to the third image characteristics; and fusing the scores of all categories corresponding to the images to be processed in the dimensions to determine the classification result corresponding to each dimension.
In some embodiments of the disclosure, the program code is further configured to, when executed by the processor of the device, perform the following steps: determining a target score range where the score is located according to a preset classification rule, and determining a category corresponding to the target score range as the category to which the image to be processed belongs; the preset classification rule comprises a plurality of non-intersecting score ranges, and each score range corresponds to one category.
In some embodiments of the disclosure, the program code is further configured to, when executed by the processor of the device, perform the following steps: selecting at least two target categories from various categories corresponding to the images to be processed in multiple dimensions, and normalizing scores corresponding to the at least two target categories; determining a classification result according to the normalization processing result and determining a matting parameter according to the classification result; wherein the score of the target category is higher than the scores of other categories in the corresponding plurality of dimensions.
In some embodiments of the disclosure, the program code is further configured to, when executed by the processor of the device, perform the following steps: performing region division on an image to be processed to determine a foreground region and a background region, and calculating a foreground color histogram corresponding to the foreground region; determining a color histogram corresponding to each background area, and fusing the color histograms according to a preset area weight to obtain a background color histogram; and calculating the similarity of the foreground color histogram and the background color histogram as the similarity of a foreground region and a background region in the image to be processed.
In some embodiments of the disclosure, the program code is further configured to, when executed by the processor of the device, perform the following steps: adjusting the discrimination of the image to be processed according to the matting parameter; and separating the foreground and the background of the image to be processed after the discrimination adjustment according to a matting algorithm, and taking the foreground obtained by separation as a matting processing result.
In some embodiments of the disclosure, the program code is further configured to, when executed by the processor of the device, perform the following steps: and softening the edges of the separated foreground through the edge softening parameters, and performing gradual change processing on the area in the preset range of the edges through the feather range parameters so as to take the foreground after the gradual change processing as a matting processing result.
It should be noted that: the above-mentioned medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take a variety of forms, including, but not limited to: an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the internet using an internet service provider).
Exemplary Devices
Having described the media of the exemplary embodiments of the present disclosure, an implementation apparatus for automatic matting of the exemplary embodiments of the present disclosure will next be described with reference to fig. 6.
Referring to fig. 6, fig. 6 is a block diagram illustrating an implementation apparatus for automatic matting according to an example embodiment of the present disclosure. As shown in fig. 6, an implementation apparatus for automatic matting according to an example embodiment of the present disclosure includes: an image classification unit 601, a classification result fusion unit 602, a region similarity determination unit 603, and a matting unit 604, wherein:
the image classification unit 601 is configured to classify an image to be processed from multiple dimensions to obtain multiple classification results, and determine a matting parameter according to the multiple classification results;
a classification result fusion unit 602, configured to fuse the multiple classification results according to preset dimension weights to obtain a score corresponding to the image to be processed, and determine the category to which the image to be processed belongs according to the score;
a region similarity determining unit 603, configured to determine similarity between a foreground region and a background region in the image to be processed, and determine a matting algorithm according to the similarity and the category;
and a matting unit 604, configured to perform matting processing on the image to be processed according to the matting algorithm and the matting parameter.
Therefore, the implementation of the device for realizing automatic matting shown in fig. 6 can realize automatic matting, and adaptively select a required algorithm according to areas needing matting of different images, so as to improve matting efficiency and improve matting effect; and the matching degree of the determined matting algorithm and the image to be processed can be improved through multi-dimensional classification of the image to be processed, so that the matting effect is further improved.
As an optional implementation manner, the manner of classifying the image to be processed from multiple dimensions by the image classification unit 601 may specifically be:
the image classification unit 601 performs convolution processing on the image to be processed through a deep convolution neural network to obtain a first image characteristic;
the image classification unit 601 performs average pooling on the first image features to obtain second image features, and converts the second image features into third image features through a full connection layer;
the image classification unit 601 determines scores of each category corresponding to the to-be-processed image in multiple dimensions according to the third image features, so as to obtain multiple classification results corresponding to each dimension.
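A minimal PyTorch sketch of this convolution, average pooling, and fully connected pipeline; the ResNet-18 backbone and the per-dimension category counts are assumptions, since the patent only specifies the layer structure:

```python
import torch.nn as nn
import torchvision

class MultiDimensionClassifier(nn.Module):
    """Conv backbone -> global average pooling -> fully connected heads,
    one score head per classification dimension."""
    def __init__(self, categories_per_dim=(4, 3, 5)):
        super().__init__()
        resnet = torchvision.models.resnet18(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])  # first image feature
        self.pool = nn.AdaptiveAvgPool2d(1)                           # second image feature
        self.heads = nn.ModuleList(nn.Linear(512, n)                  # third image feature
                                   for n in categories_per_dim)

    def forward(self, x):
        feat = self.pool(self.backbone(x)).flatten(1)
        return [head(feat) for head in self.heads]  # per-dimension category scores
```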
Therefore, by implementing the optional implementation mode, the important features in the image to be processed can be extracted by performing multi-layer feature processing on the image to be processed, and then the probability that the image to be processed is divided into various categories in each dimension can be determined according to the important features, so that the matting effect on the image to be processed can be improved according to the category to which the image to be processed belongs.
As an optional implementation manner, the manner of determining the category to which the to-be-processed image belongs according to the score by the classification result fusion unit 602 may specifically be:
the classification result fusion unit 602 determines the target score range in which the score lies according to a preset classification rule, and determines the category corresponding to the target score range as the category to which the image to be processed belongs; the preset classification rule comprises a plurality of non-intersecting score ranges, each of which corresponds to one category.
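For illustration, the rule can be encoded as a table of disjoint ranges; the boundaries and category names below are hypothetical:

```python
def category_from_score(score, rule):
    """Return the category whose (non-intersecting) score range contains
    the fused score."""
    for low, high, category in rule:
        if low <= score < high:
            return category
    raise ValueError("score falls outside every preset range")

rule = [(0.0, 0.4, "simple background"),
        (0.4, 0.7, "medium background"),
        (0.7, 1.01, "complex background")]
```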
Generally, existing classification methods determine the category with the maximum score among the category scores as the category of the image in that dimension; when the scores of several categories are close, the category determined this way is not very accurate. With the score fusion of the present disclosure, the scores of the categories in each dimension are fused according to the weights corresponding to the categories, which avoids, to a certain extent, inaccurate classification results in that situation, thereby improving the classification of the image to be processed and, in turn, the subsequent matting effect.
As an optional implementation manner, the manner of determining the matting parameter according to the plurality of classification results by the image classification unit 601 may specifically be:
the image classification unit 601 selects at least two target categories from the categories corresponding to the images to be processed in multiple dimensions, and normalizes the scores corresponding to the at least two target categories; wherein the score of the target category is higher than the scores of other categories in the corresponding multiple dimensions;
the image classification unit 601 calculates a matting parameter according to the normalization processing result and a preset weight parameter.
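A sketch of this step; the final combination of the normalized scores with the preset weight parameter is an assumption, as the patent states only that the matting parameter is computed from the two:

```python
import numpy as np

def matting_parameters(scores, preset_weight=0.5, top_k=2):
    """Normalize the top_k target-category scores to sum to 1 and scale
    them by the preset weight parameter."""
    scores = np.asarray(scores, dtype=np.float32)
    top = np.sort(scores)[-top_k:]          # target categories: highest scores
    normalized = top / top.sum()
    return preset_weight * normalized       # one value per target category
```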
Therefore, by implementing the optional implementation mode, the category with higher importance can be selected as the target category according to the score corresponding to the category, and the matting parameter for image matting is calculated according to the normalization of the score of the target category so as to improve the matting effect.
As an optional implementation manner, the manner of determining the similarity between the foreground region and the background region in the image to be processed by the classification result fusion unit 602 may specifically be:
the classification result fusion unit 602 performs region division on the image to be processed to determine a foreground region and a background region, and calculates a foreground color histogram corresponding to the foreground region;
the classification result fusion unit 602 determines color histograms corresponding to the background regions, and fuses the color histograms according to a preset region weight to obtain a background color histogram;
the classification result fusion unit 602 calculates the similarity between the foreground color histogram and the background color histogram as the similarity between the foreground region and the background region in the image to be processed.
Therefore, by implementing the optional implementation mode, the similarity degree of the image foreground and the background can be considered, and the matting algorithm more suitable for the image to be processed is determined according to the similarity degree, so that the matting efficiency of the image to be processed can be improved.
As an optional implementation manner, the way of performing the matting processing on the image to be processed according to the matting algorithm and the matting parameter by the matting unit 604 may specifically be:
the matting unit 604 adjusts the degree of distinction of the image to be processed according to the matting parameters;
the matting unit 604 separates the foreground and the background of the to-be-processed image after the discrimination adjustment according to the matting algorithm, and takes the separated foreground as a matting processing result.
The way for the matting unit 604 to use the separated foreground as the matting processing result may specifically be:
the matting unit 604 performs softening processing on the edge of the separated foreground through the edge softening parameter, and performs gradual change processing on the region within the preset range of the edge through the feather range parameter, so as to take the foreground after the gradual change processing as a matting processing result.
Therefore, by implementing the optional implementation mode, the matting effect on the image to be processed can be improved by adjusting the foreground and background region indexes, and the softening processing and the gradual change processing on the foreground region can further improve the matting effect and improve the use experience of the user.
Since each functional module of the implementation apparatus for automatic matting of the example embodiment of the present disclosure corresponds to a step of the example embodiment of the implementation method for automatic matting, for details not disclosed in the apparatus embodiment of the present disclosure, please refer to the embodiment of the implementation method for automatic matting of the present disclosure.
It should be noted that although in the above detailed description reference is made to several modules or units of the implementation means of automatic matting, such division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
Exemplary Electronic Device
Having described the method, medium, and apparatus of the exemplary embodiments of the present disclosure, an electronic device according to another exemplary embodiment of the present disclosure is described next.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method, or program product. Accordingly, various aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
An implementation apparatus 700 for automatic matting according to yet another alternative example embodiment of the present disclosure is described below with reference to fig. 7. The implementation apparatus 700 for automatic matting shown in fig. 7 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in fig. 7, the implementation apparatus 700 for automatic matting is represented in the form of an electronic device. The components of the implementation device 700 for automatic matting can include, but are not limited to: the at least one processing unit 710, the at least one memory unit 720, and a bus 730 that couples various system components including the memory unit 720 and the processing unit 710.
The storage unit stores program code executable by the processing unit 710 to cause the processing unit 710 to perform the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification. For example, the processing unit 710 may perform the steps shown in fig. 1 and fig. 2.
The storage unit 720 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)7201 and/or a cache memory unit 7202, and may further include a read only memory unit (ROM) 7203.
The storage unit 720 may also include a program/utility 7204 having a set (at least one) of program modules 7205, such program modules 7205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 730 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The implementation apparatus 700 for automatic matting may also communicate with one or more external devices 800 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the implementation apparatus 700, and/or with any device (e.g., a router, a modem, etc.) that enables the implementation apparatus 700 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 750. Moreover, the implementation apparatus 700 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the internet) via the network adapter 760. As shown in fig. 7, the network adapter 760 communicates with the other modules of the implementation apparatus 700 over the bus 730. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the implementation apparatus 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, a network device, etc.) to execute the method according to the embodiments of the present disclosure. In some embodiments, the implementation apparatus 700 for automatic matting may further include an AI (artificial intelligence) processor for processing computing operations related to machine learning.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
Machine Learning (ML) is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how a computer can simulate or realize human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to endow computers with intelligence; it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Machine learning is of great significance to the field of automatic matting. With the development of technology, the initial manual matting has gradually given way to algorithmic matting, and machine learning plays a crucial role in this. By having a computer learn the rules of matting, a set of mathematical expressions for automatic matting can be derived, and the matting requirements of any image can then be met through these expressions. The embodiments of the present disclosure use the ideas of machine learning and, through improvements to the matting algorithm, enlarge the application range of the matting algorithm, thereby improving the matting efficiency and the matting effect.
While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the present disclosure is not limited to the particular embodiments disclosed, nor does the division of aspects imply that features in these aspects cannot be combined to advantage; that division is for convenience of description only. The disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. An implementation method for automatic cutout is characterized by comprising the following steps:
classifying the image to be processed from multiple dimensions to obtain multiple classification results, and determining a matting parameter according to the multiple classification results;
fusing the classification results according to preset dimension weight to obtain a score corresponding to the image to be processed, and determining the category of the image to be processed according to the score;
determining the similarity of a foreground region and a background region in the image to be processed, and determining a matting algorithm according to the similarity and the category;
and carrying out cutout processing on the image to be processed according to the cutout algorithm and the cutout parameters.
2. The method of claim 1, wherein classifying the image to be processed from multiple dimensions comprises:
carrying out convolution processing on the image to be processed through a deep convolution neural network to obtain a first image characteristic;
performing average pooling on the first image features to obtain second image features, and converting the second image features into third image features through a full-connection layer;
and determining the score of each category corresponding to the image to be processed in the multiple dimensions according to the third image characteristics so as to obtain multiple classification results corresponding to each dimension.
3. The method according to claim 2, wherein determining the category to which the image to be processed belongs according to the score comprises:
determining a target score range where the score is located according to a preset classification rule, and determining a category corresponding to the target score range as the category to which the image to be processed belongs; wherein the preset classification rule comprises a plurality of non-intersecting score ranges, and each score range corresponds to one category.
4. The method of claim 2, wherein determining a matting parameter from the plurality of classification results comprises:
selecting at least two target categories from the categories corresponding to the images to be processed in the multiple dimensions, and normalizing scores corresponding to the at least two target categories; wherein the score of the target category is higher than the scores of other categories within the corresponding plurality of dimensions;
and calculating the matting parameters according to the normalization processing result and the preset weight parameters.
5. The method of claim 1, wherein determining the similarity between the foreground region and the background region in the image to be processed comprises:
performing region division on the image to be processed to determine a foreground region and a background region, and calculating a foreground color histogram corresponding to the foreground region;
determining a color histogram corresponding to each background area, and fusing the color histograms according to a preset area weight to obtain a background color histogram;
and calculating the similarity of the foreground color histogram and the background color histogram as the similarity of a foreground region and a background region in the image to be processed.
6. The method of claim 1, wherein matting the to-be-processed image according to the matting algorithm and the matting parameters comprises:
performing discrimination adjustment on the image to be processed according to the matting parameter;
and separating the foreground and the background of the image to be processed after the discrimination adjustment according to the matting algorithm, and taking the foreground obtained by separation as a matting processing result.
7. The method of claim 6, wherein the separating the foreground as a matting processing result comprises:
and softening the separated edges of the foreground through edge softening parameters, and performing gradual change processing on the area in the preset range of the edges through feather range parameters so as to take the foreground after the gradual change processing as a matting processing result.
8. An automatic cutout implementation device is characterized by comprising:
the image classification unit is used for classifying the image to be processed from multiple dimensions to obtain multiple classification results, and determining a matting parameter according to the multiple classification results;
the classification result fusion unit is used for fusing the classification results according to preset dimension weights to obtain a score corresponding to the image to be processed, and determining the category of the image to be processed according to the score;
the region similarity determining unit is used for determining the similarity between a foreground region and a background region in the image to be processed and determining a matting algorithm according to the similarity and the category;
and the matting unit is used for carrying out matting processing on the image to be processed according to the matting algorithm and the matting parameters.
9. An electronic device, comprising:
a processor; and
a memory having stored thereon computer-readable instructions that, when executed by the processor, implement a method of implementing automatic matting as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, implements a method of implementing automatic matting as claimed in any one of claims 1 to 7.
CN201911066619.1A 2019-11-04 2019-11-04 Method and device for realizing automatic cutout, electronic equipment and medium Active CN110838131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911066619.1A CN110838131B (en) 2019-11-04 2019-11-04 Method and device for realizing automatic cutout, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911066619.1A CN110838131B (en) 2019-11-04 2019-11-04 Method and device for realizing automatic cutout, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN110838131A true CN110838131A (en) 2020-02-25
CN110838131B CN110838131B (en) 2022-05-17

Family

ID=69576128

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911066619.1A Active CN110838131B (en) 2019-11-04 2019-11-04 Method and device for realizing automatic cutout, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN110838131B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1873656A (en) * 2005-06-03 2006-12-06 中国科学院自动化研究所 Detection method of natural target in robot vision navigation
CN102385753A (en) * 2011-11-17 2012-03-21 江苏大学 Illumination-classification-based adaptive image segmentation method
US20140003686A1 (en) * 2012-06-28 2014-01-02 Technologie Avanzate T.A. Srl Multimodality Image Segmentation of Volumetric Data Sets
CN102968782A (en) * 2012-09-12 2013-03-13 苏州大学 Automatic digging method for remarkable objects of color images
US20150170389A1 (en) * 2013-12-13 2015-06-18 Konica Minolta Laboratory U.S.A., Inc. Automatic selection of optimum algorithms for high dynamic range image processing based on scene classification
CN107507206A (en) * 2017-06-09 2017-12-22 合肥工业大学 A kind of depth map extracting method based on conspicuousness detection
CN107301405A (en) * 2017-07-04 2017-10-27 上海应用技术大学 Method for traffic sign detection under natural scene
CN108711161A (en) * 2018-06-08 2018-10-26 Oppo广东移动通信有限公司 A kind of image partition method, image segmentation device and electronic equipment

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112330634A (en) * 2020-11-05 2021-02-05 恒信东方文化股份有限公司 Method and system for fine edge matting of clothing
CN112381222A (en) * 2020-11-12 2021-02-19 北京乐学帮网络技术有限公司 Sample generation method and device, computer equipment and storage medium
CN112562056A (en) * 2020-12-03 2021-03-26 广州博冠信息科技有限公司 Control method, device, medium and equipment for virtual light in virtual studio
CN112598694A (en) * 2020-12-31 2021-04-02 深圳市即构科技有限公司 Video image processing method, electronic device and storage medium
WO2022156587A1 (en) * 2021-01-25 2022-07-28 Huawei Technologies Co., Ltd. Artificial intelligence based cut copy paste
CN114926705A (en) * 2022-05-12 2022-08-19 网易(杭州)网络有限公司 Cover design model training method, medium, device and computing equipment
CN114926705B (en) * 2022-05-12 2024-05-28 网易(杭州)网络有限公司 Cover design model training method, medium, device and computing equipment
CN114820686A (en) * 2022-05-16 2022-07-29 北京百度网讯科技有限公司 Matting method and device, electronic equipment and storage medium
CN115375535A (en) * 2022-10-21 2022-11-22 珠海金山办公软件有限公司 Interactive matting method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110838131B (en) 2022-05-17

Similar Documents

Publication Publication Date Title
CN110838131B (en) Method and device for realizing automatic cutout, electronic equipment and medium
CN112163465B (en) Fine-grained image classification method, fine-grained image classification system, computer equipment and storage medium
CN111507993B (en) Image segmentation method, device and storage medium based on generation countermeasure network
US10088600B2 (en) Weather recognition method and device based on image information detection
CN112150821B (en) Lightweight vehicle detection model construction method, system and device
CN108765336B (en) Image defogging method based on dark and bright primary color prior and adaptive parameter optimization
Yang et al. Single image haze removal via region detection network
CN111738064B (en) Haze concentration identification method for haze image
CN111368846B (en) Road ponding identification method based on boundary semantic segmentation
CN112464911A (en) Improved YOLOv 3-tiny-based traffic sign detection and identification method
CN111553837B (en) Artistic text image generation method based on neural style migration
CN109753878B (en) Imaging identification method and system under severe weather
CN112163628A (en) Method for improving target real-time identification network structure suitable for embedded equipment
CN111950723A (en) Neural network model training method, image processing method, device and terminal equipment
CN112348036A (en) Self-adaptive target detection method based on lightweight residual learning and deconvolution cascade
CN109033978B (en) Error correction strategy-based CNN-SVM hybrid model gesture recognition method
CN112990331A (en) Image processing method, electronic device, and storage medium
CN114283431B (en) Text detection method based on differentiable binarization
CN114627269A (en) Virtual reality security protection monitoring platform based on degree of depth learning target detection
CN112418032A (en) Human behavior recognition method and device, electronic equipment and storage medium
CN114708615A (en) Human body detection method based on image enhancement in low-illumination environment, electronic equipment and storage medium
CN114187515A (en) Image segmentation method and image segmentation device
CN110188693B (en) Improved complex environment vehicle feature extraction and parking discrimination method
CN111738964A (en) Image data enhancement method based on modeling
CN114549340A (en) Contrast enhancement method, computer program product, storage medium, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant