CN109583362B - Image cartoon method and device - Google Patents

Image cartoon method and device

Info

Publication number
CN109583362B
CN109583362B (application CN201811421628.3A)
Authority
CN
China
Prior art keywords
image
original image
reference image
original
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811421628.3A
Other languages
Chinese (zh)
Other versions
CN109583362A (en)
Inventor
阮仕海
洪炜冬
许清泉
张伟
王旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Meitu Technology Co Ltd
Original Assignee
Xiamen Meitu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Meitu Technology Co Ltd filed Critical Xiamen Meitu Technology Co Ltd
Priority to CN201811421628.3A priority Critical patent/CN109583362B/en
Publication of CN109583362A publication Critical patent/CN109583362A/en
Application granted granted Critical
Publication of CN109583362B publication Critical patent/CN109583362B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • G06T3/04
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Abstract

The invention provides an image cartoon method and device, which relate to the technical field of image processing. The method comprises the following steps: extracting features of an original image and a reference image to obtain feature identifications respectively corresponding to the original image and the reference image; obtaining semantic labels corresponding to each pixel in the original image and the reference image; establishing a mapping relation between at least one original image block and at least one reference image block according to the feature identifications respectively corresponding to the original image and the reference image and the semantic labels corresponding to each pixel in the original image and the reference image; and processing the original image according to the mapping relation and the reference image to obtain a target image with a cartoon effect. By determining the feature identifications and semantic labels respectively corresponding to the original image and the reference image, establishing the mapping relation according to both, and processing the original image according to the mapping relation, the restoration degree of the target image is improved and the flexibility of image cartoonization is improved.

Description

Image cartoon method and device
Technical Field
The invention relates to the technical field of image processing, in particular to an image cartoon method and device.
Background
With the continuous development of science and technology, when a user processes an image through an application program, the user can not only beautify the image but also carry out cartoon processing on it, so that the processed image has a cartoon effect.
In the related art, the image may be cartoon processed in a PatchMatch manner, that is, when the original image is cartoon processed with the reference image as a sample, a mapping relationship between the original image and the reference image may be established first, and according to each image block in the mapping relationship, the image block of the reference image is used to fill the image block of the original image, so as to obtain the target image.
However, because there are differences in color and texture between the original image and the reference image, after the original image is cartoonized, the target image differs greatly from the reference image, and the cartoon effect of the reference image cannot be sufficiently restored.
Disclosure of Invention
The present invention is directed to providing an image cartoon method and apparatus, to solve the problem that, after an original image is cartoonized, the target image and the reference image differ greatly and the cartoon effect of the reference image cannot be fully restored.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, an embodiment of the present invention provides an image cartoon method, where the method includes:
extracting features of an original image and a reference image to obtain feature identifications corresponding to the original image and the reference image respectively;
obtaining semantic labels corresponding to each pixel in the original image and the reference image;
establishing a mapping relation between at least one original image block and at least one reference image block according to the feature identifications respectively corresponding to the original image and the reference image and the semantic labels corresponding to each pixel in the original image and the reference image, wherein the original image block is any one area in the original image, and the reference image block is any one area in the reference image;
and processing the original image according to the mapping relation and the reference image to obtain a target image with a cartoon effect.
Optionally, the establishing a mapping relationship between at least one original image block and at least one reference image block according to the feature identifiers respectively corresponding to the original image and the reference image and the semantic label corresponding to each pixel in the original image and the reference image includes:
calculating the characteristic distance and the semantic distance between each original image block and each reference image block according to the characteristic identifications respectively corresponding to the original image and the reference image and the semantic labels corresponding to each pixel in the original image and the reference image;
calculating according to the characteristic distance and the semantic distance between each original image block and each reference image block and the preset characteristic weight and semantic weight to obtain the comprehensive distance between each original image block and each reference image block;
and establishing a mapping relation between the at least one original image block and the at least one reference image block according to the comprehensive distance between each original image block and each reference image block.
Optionally, the obtaining of the semantic label corresponding to each pixel in the original image and the reference image includes:
inputting the original image and the reference image into a semantic segmentation network to obtain a classified original image and a classified reference image;
and determining the semantic label corresponding to each pixel according to the category to which each pixel in the classified original image and the classified reference image belongs.
Optionally, the processing the original image according to the mapping relationship and the reference image to obtain a target image with a cartoon effect includes:
for each original image block, determining a target reference image block corresponding to the original image block according to the mapping relation;
and filling the original image blocks according to the target reference image blocks to obtain the target image.
Optionally, before the feature extraction is performed on the original image and the reference image to obtain the feature identifiers corresponding to the original image and the reference image, the method further includes:
identifying the original image and at least one initial image, and determining an original face point corresponding to the original image and an initial face point corresponding to each initial image;
calculating the similarity between the original image and each initial image according to the original face points and the initial face points corresponding to each initial image;
and selecting the reference image from the at least one initial image according to each similarity.
In a second aspect, an embodiment of the present invention further provides an image cartoon apparatus, where the apparatus includes:
the extraction module is used for extracting the characteristics of an original image and a reference image to obtain characteristic identifications corresponding to the original image and the reference image respectively;
the acquisition module is used for acquiring semantic labels corresponding to each pixel in the original image and the reference image;
a mapping relation establishing module, configured to establish a mapping relation between at least one original image block and at least one reference image block according to feature identifiers corresponding to the original image and the reference image, respectively, and semantic tags corresponding to each pixel in the original image and the reference image, where the original image block is any one region in the original image, and the reference image block is any one region in the reference image;
and the processing module is used for processing the original image according to the mapping relation and the reference image to obtain a target image with a cartoon effect.
Optionally, the mapping relationship establishing module is specifically configured to calculate a feature distance and a semantic distance between each original image block and each reference image block according to the feature identifiers respectively corresponding to the original image and the reference image and the semantic label corresponding to each pixel in the original image and the reference image; calculating according to the characteristic distance and the semantic distance between each original image block and each reference image block and the preset characteristic weight and semantic weight to obtain the comprehensive distance between each original image block and each reference image block; and establishing a mapping relation between the at least one original image block and the at least one reference image block according to the comprehensive distance between each original image block and each reference image block.
Optionally, the obtaining module is specifically configured to input the original image and the reference image into a semantic segmentation network to obtain a classified original image and a classified reference image; and determine the semantic label corresponding to each pixel according to the category to which each pixel in the classified original image and the classified reference image belongs.
Optionally, the processing module is specifically configured to, for each original image block, determine, according to the mapping relationship, a target reference image block corresponding to the original image block; and filling the original image blocks according to the target reference image blocks to obtain the target image.
Optionally, the apparatus further comprises:
the identification module is used for identifying the original image and at least one initial image and determining an original face point corresponding to the original image and an initial face point corresponding to each initial image;
the calculation module is used for calculating the similarity between the original image and each initial image according to the original face points and the initial face points corresponding to each initial image;
a selecting module for selecting the reference image from the at least one initial image according to each similarity.
The invention has the beneficial effects that:
the image cartoon method and the device provided by the embodiment of the invention can be used for obtaining the characteristic identifications respectively corresponding to the original image and the reference image by extracting the characteristics of the original image and the reference image, obtaining the semantic label corresponding to each pixel in the original image and the reference image, establishing the mapping relation between at least one original image block and at least one reference image block according to the characteristic identifications respectively corresponding to the original image and the reference image and the semantic labels corresponding to each pixel in the original image and the reference image, and finally processing the original image according to the mapping relation and the reference image to obtain the target image with the cartoon effect. The feature identification and the semantic label which correspond to the original image and the reference image respectively are determined, and the mapping relation is established according to the feature identification and the semantic label, so that the original image is processed according to the mapping relation, the condition that the original image is processed only according to colors is avoided, the reduction degree of a target image is improved, and the flexibility of image cartoon is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a schematic structural diagram of an image cartoon system according to the present invention;
fig. 2 is a schematic flowchart of an image cartoon method according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating an image cartoon method according to another embodiment of the present invention;
FIG. 4 is a schematic diagram of an image cartoon apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an image cartoonizing apparatus according to another embodiment of the present invention;
fig. 6 is a schematic diagram of an image cartoonizing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention.
Fig. 1 is a schematic structural diagram of an image cartoonizing system according to an image cartoonizing method provided by the present invention; as shown in fig. 1, the system includes: a server 100 and a terminal 200.
Wherein the server 100 and the terminal 200 are connected by a link.
Specifically, in the process of shooting by the user through the terminal 200, if the original image obtained by shooting needs to be subjected to the cartoonification processing, the original image obtained by shooting can be transmitted to the server 100.
Accordingly, the server 100 may receive the original image sent by the terminal 200, and perform a search among a plurality of pre-stored initial images according to the original image to obtain a reference image matching the original image. The server 100 may continue to process the original image and the reference image to obtain feature identifiers corresponding to the original image and the reference image, respectively, obtain a semantic label corresponding to each pixel in the original image and the reference image, then establish a mapping relationship between each original image block in the original image and each reference image block in the reference image according to the obtained feature identifiers and semantic labels, and finally fill each original image block in the original image with the reference image block according to the mapping relationship to obtain a target image with a cartoon effect, thereby sending the target image to the terminal 200.
The terminal 200 may receive the target image sent by the server 100, and complete the cartoon processing of the original image.
In practical applications, the terminal 200 may also store a plurality of initial images in advance and perform the cartoon processing on the original image locally according to the plurality of initial images, which is not limited in the embodiment of the present invention.
Fig. 2 is a schematic flow chart of an image cartoonizing method according to an embodiment of the present invention, which is applied to the server shown in fig. 1, and as shown in fig. 2, the method includes:
step 201, feature extraction is performed on the original image and the reference image to obtain feature identifiers corresponding to the original image and the reference image respectively.
The reference image is obtained by the server through matching among a plurality of pre-stored initial images according to the original image. For example, the selection may be performed according to the Euclidean distance between the original image and each initial image, or may be performed in other manners, which is not limited in the embodiment of the present invention.
In order to improve the similarity between the converted target image and the original image, the feature identifiers of the original image and the reference image may be extracted, so that in the subsequent steps, the mapping relationship between each original image block in the original image and each reference image block in the reference image may be established according to the extracted feature identifiers, and the original image is processed according to the mapping relationship.
Specifically, the server may input both the original image and the reference image into a preset neural network, and perform convolution processing on the original image and the reference image through each convolution layer in the neural network to obtain the feature maps of the original image and the reference image, so as to obtain the feature identifiers corresponding to the original image and the reference image respectively according to the feature maps.
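The patent does not name a concrete network for this step; the following is a minimal sketch, assuming a PyTorch environment and a pretrained VGG-19 (VGG is mentioned later in this description) whose intermediate activation serves as the feature identifier. The file names and the chosen cut-off layer are illustrative assumptions.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Convolutional part of VGG-19 up to relu4_1; the cut-off layer is an assumption.
vgg = models.vgg19(pretrained=True).features[:21].eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_feature_map(path: str) -> torch.Tensor:
    """Return the feature map (the 'feature identifier') of one image."""
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return vgg(image)  # shape: (1, C, H', W')

# Hypothetical file names, for illustration only.
original_features = extract_feature_map("original.jpg")
reference_features = extract_feature_map("reference.jpg")
```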
Step 202, obtaining semantic labels corresponding to each pixel in the original image and the reference image.
After the feature identifiers corresponding to the original image and the reference image are obtained, semantic labels of the original image and the reference image can be obtained, so that in the subsequent step, in the process of establishing the mapping relationship, the semantic labels can be added, and the matching degree of the established mapping relationship is higher.
Specifically, the server may input the original image and the reference image into a semantic segmentation network, and identify scenes in the original image and the reference image through the semantic segmentation network, thereby determining semantic tags corresponding to respective pixels in the original image and the reference image.
It should be noted that the server may also determine the semantic label corresponding to each pixel in the reference image in a manual annotation manner. For example, each region in the reference image may be divided in advance, and a semantic label is set for each region, so that the semantic label corresponding to each pixel in each region is the semantic label of the region.
Correspondingly, when step 202 is executed, the semantic tags do not need to be acquired through a semantic segmentation network, and the semantic tag corresponding to each pixel in the reference image can be acquired according to the preset semantic tags corresponding to each region.
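As a rough sketch of the network-based branch of this step (the patent does not specify which segmentation network is used; torchvision's DeepLabV3 is assumed here purely for illustration), the per-pixel semantic label can be taken as the arg-max over the class scores:

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

seg_net = deeplabv3_resnet50(pretrained=True).eval()

def semantic_labels(image_tensor: torch.Tensor) -> torch.Tensor:
    """image_tensor: (1, 3, H, W), normalized as in the previous sketch.
    Returns an (H, W) tensor whose entries are the class index, i.e. the
    semantic label, of each pixel."""
    with torch.no_grad():
        logits = seg_net(image_tensor)["out"]  # (1, num_classes, H, W)
    return logits.argmax(dim=1).squeeze(0)
```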
Step 203, establishing a mapping relationship between at least one original image block and at least one reference image block according to the feature identifiers respectively corresponding to the original image and the reference image and the semantic label corresponding to each pixel in the original image and the reference image.
The original image block is any one area in the original image, and the reference image block is any one area in the reference image.
In the process of processing the original image according to the reference image, the original image needs to be divided into a plurality of original image blocks, and different original image blocks are processed respectively. Therefore, a mapping relationship between each original image block and each reference image block can be established, so that in the subsequent step, the original image can be processed according to the established mapping relationship.
Since the original image and the reference image respectively correspond to different feature identifiers and semantic labels, the feature identifiers corresponding to the original image blocks and the semantic labels corresponding to each pixel in the original image blocks can be compared with the feature identifiers of the reference image blocks and the semantic labels in the reference image blocks; the feature distance and the semantic distance between each original image block and each reference image block are thus calculated, and the mapping relation between the original image blocks and the reference image blocks is established according to the feature distances and the semantic distances.
Specifically, the server may calculate the feature distance between each original image block and each reference image block from the extracted feature maps, and express different semantic labels as different colors in an RGB (Red-Green-Blue) image so as to calculate the semantic distance between each original image block and each reference image block. The calculated feature distance and semantic distance are then combined into a comprehensive distance; finally, the original image block and reference image block with the smallest comprehensive distance are selected and a correspondence is established between them, thereby determining the reference image block corresponding to each original image block.
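A condensed sketch of this matching step follows, assuming each image has already been cut into blocks and each block summarized by a feature vector and a semantic descriptor (for example, a label histogram); the concrete weight values are placeholders, since the patent only states that the weights are preset.

```python
import numpy as np

FEATURE_WEIGHT = 0.7   # preset weights; these concrete values are assumptions
SEMANTIC_WEIGHT = 0.3

def build_mapping(orig_feats, ref_feats, orig_sems, ref_sems):
    """orig_feats: (N, D) and ref_feats: (M, D) per-block feature vectors;
    orig_sems: (N, K) and ref_sems: (M, K) per-block semantic descriptors.
    Returns mapping[i] = index of the reference block assigned to the i-th
    original block, chosen by the smallest comprehensive distance."""
    mapping = np.empty(len(orig_feats), dtype=int)
    for i in range(len(orig_feats)):
        feat_d = np.linalg.norm(ref_feats - orig_feats[i], axis=1)    # feature distance
        sem_d = np.linalg.norm(ref_sems - orig_sems[i], axis=1)       # semantic distance
        combined = FEATURE_WEIGHT * feat_d + SEMANTIC_WEIGHT * sem_d  # comprehensive distance
        mapping[i] = int(np.argmin(combined))
    return mapping
```

In a real PatchMatch-style implementation, the exhaustive inner loop would be replaced by randomized search and propagation; the brute-force version above only illustrates the distance definition.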
Step 204, processing the original image according to the mapping relation and the reference image to obtain a target image with a cartoon effect.
After the mapping relationship is established, the server can fill the original image blocks in the original image through the reference image blocks in the reference image according to the mapping relationship and in combination with the reference image, and after the filling of the original image blocks is completed, the target image with the cartoon effect can be obtained.
Specifically, the server may select a certain original image block in the original image, find a reference image block corresponding to the original image block in the mapping relationship, and fill the selected original image block according to each pixel in the reference image block, so as to fill each original image block according to the above manner, and obtain the target image with the cartoon effect after filling.
It should be noted that, in the filling process, the server may fill one line in the original image according to the size of the original image block, and fill the next line of the original image after the filling is completed; the server can also fill one column in the original image in a similar way, and the next column is filled after the filling is finished; of course, the original image may also be filled in by other ways, which is not limited in the embodiment of the present invention.
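A minimal sketch of the filling step under simplifying assumptions: both images share a size, the blocks form a non-overlapping grid scanned row by row (the patent also allows column order or other orders, and allows a block to be any region), and mapping comes from the block-matching sketch above.

```python
import numpy as np

def fill_image(original, reference, mapping, block=8):
    """original, reference: (H, W, 3) uint8 arrays of equal size.
    mapping[i]: index of the reference block chosen for the i-th original
    block, with blocks enumerated in row-major order. The block size of 8
    is an illustrative assumption."""
    target = original.copy()
    h, w = original.shape[:2]
    cols = w // block                                # blocks per row
    i = 0
    for y in range(0, h - block + 1, block):         # fill one row of blocks...
        for x in range(0, w - block + 1, block):     # ...before moving to the next
            j = mapping[i]
            ry, rx = (j // cols) * block, (j % cols) * block
            target[y:y + block, x:x + block] = reference[ry:ry + block, rx:rx + block]
            i += 1
    return target
```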
In summary, in the image cartoon method provided in the embodiment of the present invention, feature identifications corresponding to an original image and a reference image are obtained by performing feature extraction on the original image and the reference image, a semantic label corresponding to each pixel in the original image and the reference image is obtained, a mapping relationship between at least one original image block and at least one reference image block is established according to the feature identifications corresponding to the original image and the reference image and the semantic label corresponding to each pixel in the original image and the reference image, and finally the original image is processed according to the mapping relationship and the reference image to obtain a target image with a cartoon effect. Because the feature identifications and the semantic labels respectively corresponding to the original image and the reference image are determined, and the mapping relationship is established according to both, processing the original image according to the mapping relationship avoids processing the image only according to colors, improves the restoration degree of the target image, and improves the flexibility of image cartoonization.
Fig. 3 is a schematic flowchart of an image cartoonizing method according to another embodiment of the present invention, which is applied to the server shown in fig. 1, and as shown in fig. 3, the method includes:
step 301, identifying an original image and at least one initial image, and determining an original face point corresponding to the original image and an initial face point corresponding to each initial image.
After receiving the original image sent by the terminal, the server needs to determine a reference image matched with the original image, so that in the subsequent steps, the original image can be processed according to the reference image to obtain a target image with a cartoon effect.
Therefore, the server can firstly identify the original image and the plurality of initial images to obtain a plurality of original face points corresponding to the original image and a plurality of initial face points corresponding to each initial image.
Specifically, the server may perform image recognition on the original image to obtain a plurality of original face points corresponding to each feature of the face image, and normalize the original face points of each feature with respect to their minimum circumscribed rectangle.
Correspondingly, the server can identify each initial image in the above manner to obtain a plurality of normalized initial face points, so as to obtain the initial face points corresponding to the plurality of initial images; in the subsequent step, the similarity between the original image and each initial image can be determined according to the original face points and the initial face points.
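The normalization mentioned above might look like the following sketch, where the points of each feature part are rescaled by their minimum circumscribed (bounding) rectangle; mapping into the unit square is an assumption, since the patent does not spell out the exact normalization.

```python
import numpy as np

def normalize_points(points: np.ndarray) -> np.ndarray:
    """points: (K, 2) array of (x, y) face-point coordinates for one feature."""
    mins = points.min(axis=0)                            # top-left corner of the rectangle
    span = np.maximum(points.max(axis=0) - mins, 1e-8)   # avoid division by zero
    return (points - mins) / span                        # coordinates now lie in [0, 1]
```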
Step 302, calculating the similarity between the original image and each initial image according to the original face points and the initial face points corresponding to each initial image.
After the server identifies and obtains the original face points and the initial face points, calculation can be carried out according to the face points, and the similarity between the original image and each initial image is determined, so that in the subsequent steps, a reference image can be selected according to the value of the similarity parameter.
The plurality of original face points acquired from the original image correspond to different feature parts, and likewise each initial face point in an initial image corresponds to a feature part. Therefore, the original face points and the initial face points corresponding to the same feature part can be compared, and the distance between them can be determined according to their coordinates.
Correspondingly, after the distance between the original face point and the initial face point of each identical feature part is determined according to their coordinates, the similarity between the original image and a certain initial image can be calculated according to the plurality of calculated distances and the weight corresponding to each feature part; the smaller the calculated distance, the more similar the original image and the initial image.
For example, if the server identifies 106 original face points in the original image according to a preset algorithm, and likewise identifies 106 initial face points in each initial image, the Euclidean distance between the original image and a certain initial image can be determined, so that the similarity between the original image and that initial image is determined according to each Euclidean distance, the Euclidean distance being inversely proportional to the similarity.
The formula for calculating the Euclidean distance between the original image and a certain initial image may be d = w1*||Pa1 - Pb1||^2 + w2*||Pa2 - Pb2||^2 + ... + w106*||Pa106 - Pb106||^2, where d is the distance between the original image and that initial image, w1 is the weight corresponding to the 1st feature point, w2 the weight corresponding to the 2nd, and w106 the weight corresponding to the 106th; Pa1, Pa2, ..., Pa106 are the coordinates of the 1st, 2nd, ..., 106th original face points, and Pb1, Pb2, ..., Pb106 are the coordinates of the corresponding initial face points.
It should be noted that the features of the face image in the original image and the initial image may include eyebrows, eyes, nose, mouth and face contour, and each feature corresponds to at least one original face point or initial face point.
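The formula above transcribes directly into code; the following sketch treats the weights as per-point values derived from the feature part each point belongs to (the concrete weight values are not given by the patent):

```python
import numpy as np

def face_distance(orig_pts: np.ndarray, init_pts: np.ndarray,
                  weights: np.ndarray) -> float:
    """orig_pts, init_pts: (106, 2) arrays of normalized face-point
    coordinates; weights: (106,) per-point weights. Returns the weighted
    distance d; a smaller d means the two faces are more alike."""
    sq_dists = np.sum((orig_pts - init_pts) ** 2, axis=1)  # ||Pa_i - Pb_i||^2
    return float(np.sum(weights * sq_dists))
```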
Step 303, selecting a reference image from at least one initial image according to each similarity.
After the similarity between the original image and each initial image is obtained through calculation, a reference image can be selected from the plurality of initial images according to the parameter values corresponding to the similarities, so that the original image can be processed according to the selected reference image in the subsequent step.
Specifically, the server may determine the parameter value corresponding to each similarity, traverse the plurality of parameter values to obtain the maximum target parameter value among them, and determine the initial image corresponding to the target parameter value, so as to use that initial image as the reference image.
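Putting the two previous sketches together, reference-image selection reduces to an arg-max over similarities; here the similarity is taken as the negated weighted distance so that, as in the text, the maximum parameter value marks the chosen image (this sign convention is an assumption):

```python
import numpy as np

def select_reference(orig_pts, initial_pts_list, weights):
    """Return the index of the initial image whose face points are most
    similar to the original image's, reusing face_distance() from above."""
    similarities = [-face_distance(orig_pts, pts, weights)
                    for pts in initial_pts_list]
    return int(np.argmax(similarities))
```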
Step 304, performing feature extraction on the original image and the reference image to obtain feature identifications corresponding to the original image and the reference image respectively.
Step 304 is similar to step 201 and will not be described herein again.
Step 305, inputting the original image and the reference image into a semantic segmentation network to obtain a classified original image and a classified reference image.
In the process of establishing the mapping relationship between the original image block and the reference image block, not only the feature identifiers of the original image and the reference image are needed, but also the semantic label corresponding to each pixel in the original image and the reference image is needed to be considered.
Therefore, the server can input the original image and the reference image into a preset semantic segmentation network, and the original image and the reference image are identified and classified through the semantic segmentation network to obtain the classified original image and the classified reference image, so that in the subsequent steps, the semantic label corresponding to each pixel can be determined according to the classified original image and the classified reference image.
The classified original image or the classified reference image is an image containing a small number of colors, where each color corresponds to one category of scene content, that is, to one semantic label.
Step 306, determining the semantic label corresponding to each pixel according to the category of each pixel in the classified original image and the classified reference image.
After obtaining the classified original image or the classified reference image, the server may search for a label corresponding to each region according to the regions corresponding to different colors in the image, thereby determining a semantic label corresponding to each pixel in the region as the label corresponding to the region.
For example, each feature of the face region in the classified original image is displayed in different colors, and each pixel in the region corresponding to the different colors may be set as semantic labels such as eyebrow, eye, nose, mouth, face contour, and the like.
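Recovering labels from the colored classification image can be a simple table lookup, as in the sketch below; the specific colors and label names are illustrative assumptions:

```python
import numpy as np

# Hypothetical color-to-label table for a classified face image.
COLOR_TO_LABEL = {
    (255, 0, 0): "eyebrow",
    (0, 255, 0): "eye",
    (0, 0, 255): "nose",
    (255, 255, 0): "mouth",
    (255, 0, 255): "face_contour",
}

def label_map(classified: np.ndarray, default="background") -> np.ndarray:
    """classified: (H, W, 3) uint8 classification image.
    Returns an (H, W) array holding the semantic label of each pixel."""
    h, w = classified.shape[:2]
    labels = np.full((h, w), default, dtype=object)
    for color, name in COLOR_TO_LABEL.items():
        mask = np.all(classified == np.array(color, dtype=np.uint8), axis=-1)
        labels[mask] = name
    return labels
```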
Step 307, calculating a feature distance and a semantic distance between each original image block and each reference image block according to the feature identifiers respectively corresponding to the original image and the reference image and the semantic label corresponding to each pixel in the original image and the reference image.
After the server obtains the feature identifiers corresponding to the original image and the reference image respectively and the semantic label corresponding to each pixel, the server can divide the original image and the reference image to obtain a plurality of original image blocks and a plurality of reference image blocks, and calculate to obtain the feature distance and the semantic distance between each original image block and each reference image block.
Since the processes of calculating the feature distance and calculating the semantic distance are similar to the process of calculating each similarity in step 302, a detailed description thereof is omitted.
Step 308, calculating according to the characteristic distance and the semantic distance between each original image block and each reference image block and the preset characteristic weight and semantic weight to obtain the comprehensive distance between each original image block and each reference image block.
After the feature distance and the semantic distance are obtained through calculation, the server performs a weighted calculation on them according to the preset feature weight and semantic weight, so as to obtain a comprehensive distance that integrates the feature identifier and the semantic label; a mapping relationship between the original image block and the reference image block can then be established according to the comprehensive distance in the subsequent steps.
Specifically, the server may multiply the feature distance and the feature weight, multiply the semantic distance and the semantic weight, and finally add the two products obtained by the multiplication to obtain a comprehensive distance between a certain original image block and a certain reference image block.
Step 309, establishing a mapping relationship between at least one original image block and at least one reference image block according to the integrated distance between each original image block and each reference image block.
The server may determine a reference image block similar to any one of the original image blocks according to the integrated distance between each original image block and each reference image block, thereby establishing a mapping relationship between the original image block and the reference image block.
Specifically, for each original image block, the server may select, from the parameter values corresponding to the multiple comprehensive distances, the reference image block corresponding to the comprehensive distance with the smallest parameter value according to the comprehensive distance between the original image block and each reference image block, so as to establish a mapping relationship between the original image block and the selected reference image block.
Correspondingly, after the corresponding relationship between each original image block and the corresponding reference image block is established, the establishment of the mapping relationship between at least one original image block and at least one reference image block is completed, and the original image can be processed according to the established corresponding relationship in the subsequent steps.
The mapping relationship between at least one original image block and at least one reference image block may be established as follows: the mapping relationship between 1 original image block and 1 reference image block is established, or the mapping relationship between n original image blocks and n reference image blocks is established, where n is a positive integer.
For example, suppose the original image includes 10 original image blocks and the reference image also includes 10 reference image blocks. For the first original image block, the 10 integrated distances to the reference image blocks are calculated, and if the reference image block corresponding to the integrated distance with the smallest parameter value is the first reference image block, a mapping relationship is established between the first original image block and the first reference image block. Similarly, the mapping relationship between every other original image block and its corresponding reference image block can be established according to the parameter values of the integrated distances between that original image block and each reference image block.
Step 310, processing the original image according to the mapping relation and the reference image to obtain a target image with a cartoon effect.
The server can respectively fill each original image block in the original image by combining the established mapping relation in the process of processing the original image, and can finish the processing of the original image after each original image block is filled, so as to obtain the target image with the cartoon effect.
Optionally, for each original image block, the server may determine a target reference image block corresponding to the original image block according to the mapping relationship, and fill the original image block according to the target reference image block to obtain the target image.
Specifically, in the process of processing a certain original image block, the original image block may be searched in the established mapping relationship, and a reference image block corresponding to the original image block is determined according to the mapping relationship, so that the determined reference image block may be used as a target reference image block, and finally, each pixel in the original image block is filled according to each pixel in the target reference image block, so as to obtain a filled original image block.
In summary, in the image cartoon method provided in the embodiment of the present invention, feature identifications corresponding to an original image and a reference image are obtained by performing feature extraction on the original image and the reference image, a semantic label corresponding to each pixel in the original image and the reference image is obtained, a mapping relationship between at least one original image block and at least one reference image block is established according to the feature identifications corresponding to the original image and the reference image and the semantic label corresponding to each pixel in the original image and the reference image, and finally the original image is processed according to the mapping relationship and the reference image to obtain a target image with a cartoon effect. Because the feature identifications and the semantic labels respectively corresponding to the original image and the reference image are determined, and the mapping relationship is established according to both, processing the original image according to the mapping relationship avoids processing the image only according to colors, improves the restoration degree of the target image, and improves the flexibility of image cartoonization.
It should be noted that, in practical applications, the processes in step 201 to step 204 and the processes in step 304 to step 310 may be carried out in a coarse-to-fine manner, may also be carried out with a VGG (Visual Geometry Group, the computer vision group of the University of Oxford) network, and may also be carried out in other manners, which is not limited in this embodiment of the present invention.
Fig. 4 is a schematic diagram of an image cartoon apparatus according to an embodiment of the present invention, and as shown in fig. 4, the apparatus specifically includes:
an extraction module 401, configured to perform feature extraction on an original image and a reference image to obtain feature identifiers corresponding to the original image and the reference image, respectively;
an obtaining module 402, configured to obtain a semantic tag corresponding to each pixel in the original image and the reference image;
a mapping relationship establishing module 403, configured to establish a mapping relationship between at least one original image block and at least one reference image block according to feature identifiers corresponding to the original image and the reference image, respectively, and semantic tags corresponding to each pixel in the original image and the reference image, where the original image block is any one region in the original image, and the reference image block is any one region in the reference image;
a processing module 404, configured to process the original image according to the mapping relationship and the reference image, so as to obtain a target image with a cartoon effect.
Optionally, the mapping relationship establishing module 403 is specifically configured to calculate a feature distance and a semantic distance between each original image block and each reference image block according to the feature identifiers respectively corresponding to the original image and the reference image and the semantic label corresponding to each pixel in the original image and the reference image; calculating according to the characteristic distance and the semantic distance between each original image block and each reference image block and the preset characteristic weight and semantic weight to obtain the comprehensive distance between each original image block and each reference image block; and establishing a mapping relation between the at least one original image block and the at least one reference image block according to the comprehensive distance between each original image block and each reference image block.
Optionally, the obtaining module 402 is specifically configured to input the original image and the reference image into a semantic segmentation network, so as to obtain a classified original image and a classified reference image; and determine the semantic label corresponding to each pixel according to the category to which each pixel in the classified original image and the classified reference image belongs.
Optionally, the processing module 404 is specifically configured to, for each original image block, determine, according to the mapping relationship, a target reference image block corresponding to the original image block; and filling the original image block according to the target reference image block to obtain the target image.
Optionally, referring to fig. 5, the apparatus further includes:
an identification module 405, configured to identify the original image and at least one initial image, and determine an original face point corresponding to the original image and an initial face point corresponding to each initial image;
a calculating module 406, configured to calculate, according to the original face point and the initial face point corresponding to each initial image, the similarity between the original image and each initial image;
a selecting module 407 configured to select the reference image from the at least one initial image according to each similarity.
In summary, the image cartoon device provided in the embodiment of the present invention obtains the feature identifiers corresponding to the original image and the reference image respectively by performing feature extraction on the original image and the reference image, obtains the semantic label corresponding to each pixel in the original image and the reference image, establishes the mapping relationship between at least one original image block and at least one reference image block according to the feature identifiers corresponding to the original image and the reference image respectively and the semantic label corresponding to each pixel in the original image and the reference image, and finally processes the original image according to the mapping relationship and the reference image to obtain the target image with the cartoon effect. Because the feature identifiers and the semantic labels respectively corresponding to the original image and the reference image are determined, and the mapping relationship is established according to both, processing the original image according to the mapping relationship avoids processing the image only according to colors, improves the restoration degree of the target image, and improves the flexibility of image cartoonization.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
These modules may be one or more integrated circuits configured to implement the above methods, such as: one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs), among others. For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. For another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
Fig. 6 is a schematic diagram of an image cartoon apparatus according to an embodiment of the present invention, where the apparatus may be integrated in a terminal device or a chip of the terminal device, and the terminal may be a computing device with an image processing function.
The device includes: memory 601, processor 602.
The memory 601 is used for storing programs, and the processor 602 calls the programs stored in the memory 601 to execute the above method embodiments. The specific implementation and technical effects are similar, and are not described herein again.
Optionally, the invention also provides a program product, for example a computer-readable storage medium, comprising a program which, when executed by a processor, is adapted to carry out the above method embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Claims (6)

1. An image cartoonizing method, comprising:
extracting features of an original image and a reference image to obtain feature identifications corresponding to the original image and the reference image respectively;
obtaining semantic labels corresponding to each pixel in the original image and the reference image;
establishing a mapping relation between at least one original image block and at least one reference image block according to the feature identifications respectively corresponding to the original image and the reference image and the semantic labels corresponding to each pixel in the original image and the reference image, wherein the original image block is any one area in the original image, and the reference image block is any one area in the reference image;
processing the original image according to the mapping relation and the reference image to obtain a target image with a cartoon effect;
the obtaining of the semantic label corresponding to each pixel in the original image and the reference image includes:
inputting the original image and the reference image into a semantic segmentation network to obtain a classified original image and a classified reference image;
determining the semantic label corresponding to each pixel according to the category to which each pixel in the classified original image and the classified reference image belongs;
the establishing a mapping relationship between at least one original image block and at least one reference image block according to the feature identifiers respectively corresponding to the original image and the reference image and the semantic label corresponding to each pixel in the original image and the reference image comprises:
calculating the characteristic distance and the semantic distance between each original image block and each reference image block according to the characteristic identifications respectively corresponding to the original image and the reference image and the semantic labels corresponding to each pixel in the original image and the reference image;
calculating according to the characteristic distance and the semantic distance between each original image block and each reference image block and the preset characteristic weight and semantic weight to obtain the comprehensive distance between each original image block and each reference image block;
and establishing a mapping relation between the at least one original image block and the at least one reference image block according to the comprehensive distance between each original image block and each reference image block.
2. The method according to claim 1, wherein the processing the original image according to the mapping relationship and the reference image to obtain a target image with a cartoon effect comprises:
for each original image block, determining a target reference image block corresponding to the original image block according to the mapping relation;
and filling the original image blocks according to the target reference image blocks to obtain the target image.
3. The method according to any one of claims 1 to 2, wherein before the extracting features of the original image and the reference image to obtain the feature identifiers corresponding to the original image and the reference image respectively, the method further comprises:
identifying the original image and at least one initial image, and determining an original face point corresponding to the original image and an initial face point corresponding to each initial image;
calculating the similarity between the original image and each initial image according to the original face points and the initial face points corresponding to each initial image;
and selecting the reference image from the at least one initial image according to each similarity.
4. An image cartoonizing apparatus, comprising:
the extraction module is used for extracting the characteristics of an original image and a reference image to obtain characteristic identifications corresponding to the original image and the reference image respectively;
the acquisition module is used for acquiring semantic labels corresponding to each pixel in the original image and the reference image;
a mapping relation establishing module, configured to establish a mapping relation between at least one original image block and at least one reference image block according to feature identifiers corresponding to the original image and the reference image, respectively, and semantic tags corresponding to each pixel in the original image and the reference image, where the original image block is any one region in the original image, and the reference image block is any one region in the reference image;
the processing module is used for processing the original image according to the mapping relation and the reference image to obtain a target image with a cartoon effect;
the acquisition module is specifically used for inputting the original image and the reference image into a semantic segmentation network to obtain a classified original image and a classified reference image; and determining the semantic label corresponding to each pixel according to the category to which each pixel in the classified original image and the classified reference image belongs;
the mapping relationship establishing module is specifically configured to calculate a feature distance and a semantic distance between each original image block and each reference image block according to the feature identifiers respectively corresponding to the original image and the reference image and the semantic labels corresponding to each pixel in the original image and the reference image; calculating according to the characteristic distance and the semantic distance between each original image block and each reference image block and the preset characteristic weight and semantic weight to obtain the comprehensive distance between each original image block and each reference image block; and establishing a mapping relation between the at least one original image block and the at least one reference image block according to the comprehensive distance between each original image block and each reference image block.
5. The apparatus according to claim 4, wherein the processing module is specifically configured to, for each original image block, determine, according to the mapping relationship, a target reference image block corresponding to the original image block; and filling the original image blocks according to the target reference image blocks to obtain the target image.
6. The apparatus of any of claims 4 to 5, wherein the apparatus further comprises:
the identification module is used for identifying the original image and at least one initial image and determining an original face point corresponding to the original image and an initial face point corresponding to each initial image;
the calculation module is used for calculating the similarity between the original image and each initial image according to the original face points and the initial face points corresponding to each initial image;
a selecting module for selecting the reference image from the at least one initial image according to each similarity.
CN201811421628.3A 2018-11-26 2018-11-26 Image cartoon method and device Active CN109583362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811421628.3A CN109583362B (en) 2018-11-26 2018-11-26 Image cartoon method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811421628.3A CN109583362B (en) 2018-11-26 2018-11-26 Image cartoon method and device

Publications (2)

Publication Number Publication Date
CN109583362A CN109583362A (en) 2019-04-05
CN109583362B true CN109583362B (en) 2021-11-30

Family

ID=65924728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811421628.3A Active CN109583362B (en) 2018-11-26 2018-11-26 Image cartoon method and device

Country Status (1)

Country Link
CN (1) CN109583362B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107730573A (en) * 2017-09-22 2018-02-23 西安交通大学 A kind of personal portrait cartoon style generation method of feature based extraction
CN108734749A (en) * 2017-04-20 2018-11-02 微软技术许可有限责任公司 The visual style of image converts
CN108805803A (en) * 2018-06-13 2018-11-13 衡阳师范学院 A kind of portrait style moving method based on semantic segmentation Yu depth convolutional neural networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7657060B2 (en) * 2004-03-31 2010-02-02 Microsoft Corporation Stylization of video

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108734749A (en) * 2017-04-20 2018-11-02 微软技术许可有限责任公司 The visual style of image converts
CN107730573A (en) * 2017-09-22 2018-02-23 西安交通大学 A kind of personal portrait cartoon style generation method of feature based extraction
CN108805803A (en) * 2018-06-13 2018-11-13 衡阳师范学院 A kind of portrait style moving method based on semantic segmentation Yu depth convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Structure Guided Photorealistic Style Transfer; Yuheng Zhi et al.; MM '18: Proceedings of the 26th ACM International Conference on Multimedia; 2018-10-15; pp. 365-373 *
Cutting-edge advances in deep-learning-based image style transfer technology; 丁晓龙; 《电子制作》; 2018-09-15 (No. 18); pp. 86-87, 93 *

Also Published As

Publication number Publication date
CN109583362A (en) 2019-04-05

Similar Documents

Publication Publication Date Title
CN110874594B (en) Human body appearance damage detection method and related equipment based on semantic segmentation network
US9741137B2 (en) Image-based color palette generation
US9396560B2 (en) Image-based color palette generation
CN111950638B (en) Image classification method and device based on model distillation and electronic equipment
US9552656B2 (en) Image-based color palette generation
CN107679466B (en) Information output method and device
CN113160257B (en) Image data labeling method, device, electronic equipment and storage medium
CN113343826A (en) Training method of human face living body detection model, human face living body detection method and device
CN111080670A (en) Image extraction method, device, equipment and storage medium
CN112651451B (en) Image recognition method, device, electronic equipment and storage medium
CN112967315B (en) Target tracking method and device and electronic equipment
CN112991180A (en) Image splicing method, device, equipment and storage medium
CN110807110A (en) Image searching method and device combining local and global features and electronic equipment
CN112132812A (en) Certificate checking method and device, electronic equipment and medium
CN111614959B (en) Video coding method and device and electronic equipment
CN109741380B (en) Textile picture fast matching method and device
CN114792355A (en) Virtual image generation method and device, electronic equipment and storage medium
CN109784379B (en) Updating method and device of textile picture feature library
CN109583362B (en) Image cartoon method and device
CN109657083B (en) Method and device for establishing textile picture feature library
CN115797661A (en) Image processing method and device, electronic device and storage medium
US8423552B2 (en) Method of calculating connectivity of N-dimensional space
CN113781653B (en) Object model generation method and device, electronic equipment and storage medium
CN113903071A (en) Face recognition method and device, electronic equipment and storage medium
CN113554037A (en) Feature extraction method and device based on model simplification

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant