CN113537259A - Automatic generation method and device of object simplified strokes - Google Patents


Info

Publication number
CN113537259A
CN113537259A (application CN202010303809.7A)
Authority
CN
China
Prior art keywords
registered
image
feature vector
feature
sketch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010303809.7A
Other languages
Chinese (zh)
Inventor
徐婷
邹建法
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN202010303809.7A
Publication of CN113537259A
Legal status: Pending

Classifications

    • G06F 18/214: Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/22: Pattern recognition; Matching criteria, e.g. proximity measures
    • G06F 18/241: Pattern recognition; Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06T 11/00: 2D [Two-Dimensional] image generation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and device for automatically generating object sketches. By performing object detection, object feature extraction, and feature matching on an image, the method quickly selects the sketch material corresponding to an object. This addresses the prior art's demanding requirements on image size, image labeling, and system resources when generating object sketches from captured images: it avoids the influence of image size on the algorithm, reduces the cost of complex image labeling, and effectively improves the processing efficiency of the system.

Description

Automatic generation method and device of object simplified strokes
Technical Field
The invention relates to the field of image processing, and in particular to a method and device for automatically generating object sketches.
Background
With the improving performance of intelligent terminal devices such as smartphones, more and more people are accustomed to taking photos and recording videos with their phones. Smartphone cameras are improving in two main directions: first, hardware performance keeps advancing, bringing photo quality ever closer to that of a professional single-lens reflex camera; second, software features are becoming richer, offering special scene modes, object recognition, object beautification, and the like to meet people's varied shooting needs.
One currently popular feature is to recognize a specific object in a captured image and generate a sketch of it, i.e., an object sketch generation scheme, which makes shooting more fun and interactive. For example, clouds in the sky take many shapes; drawing a sketch along the edges of a cloud enhances the shooting effect when photographing the sky in a particular scene.
The object sketch generation scheme can also be used in industry, for example in producing product posters. Existing poster production is manual: image-processing software is used to work from a photograph of the product. This is inefficient and unsuited to poster production for large numbers of products. A sketch generation method is therefore needed that automatically produces sketches from captured images, completing product poster production automatically and improving its efficiency.
However, most conventional schemes for generating sketches from images use structured learning algorithms to generate images, in particular generative adversarial networks (GANs). These algorithms suffer from the following disadvantages: 1. they need a large amount of complex manually labeled data, so the labeling cost of samples for the same function is extremely high; 2. image generation algorithms are sensitive to image size; 3. to achieve the same function, structured algorithms require more GPU memory and computing resources during training.
Therefore, there is a need for an automatic object sketch generation method that can generate sketches from captured images of all sizes and types, reduces the manpower and material cost of manual labeling, and, by using deep learning, avoids heavy consumption of the intelligent terminal's processing resources.
Disclosure of Invention
In view of the above, the present invention provides a method and device for automatically generating object sketches, which solve the prior art's demanding requirements on image size, image labeling, and system resources when generating object sketches from captured images, avoid the influence of image size on the algorithm, reduce the cost of complex image labeling, and effectively improve the processing efficiency of the system.
In order to solve the above technical problems, the proposed solution is as follows:
A method for automatically generating object sketches comprises the following steps:
performing object detection on an input image to acquire an object region in the image;
extracting a feature vector of the object region;
performing similarity matching between the feature vector of the object region and the feature vector templates in a registered sketch feature library to obtain the registered sketch material corresponding to the most similar feature vector template;
and generating an object sketch on the input image according to the registered sketch material.
Preferably, the performing object detection on the input image to obtain the object region in the image specifically includes:
determining the number of objects in the image and an object region of at least one object based on an object detection model D;
wherein the object region comprises a position coordinate range of an object;
the training method of the object detection model D specifically comprises the following steps:
acquiring an image set annotated with object contour coordinate information as training sample A;
and, based on a MobileNetV2 backbone network, merging candidate boxes of training sample A with a non-maximum suppression (NMS) algorithm, and training to obtain the object detection model D.
Preferably, the extracting the feature vector of the object region specifically includes:
extracting 512-dimensional feature vectors of the object region based on a feature extraction model E;
the training method of the feature extraction model E specifically comprises the following steps:
selecting at least one image from the registered sketch material library, performing edge detection with different threshold parameters, and acquiring an edge detection data set of each image as training sample B;
and, based on the 18-layer residual neural network ResNet18 and a squeeze-and-excitation block (SE-Block) structure, training on sample B with the ArcFace loss function to obtain the feature extraction model E.
Preferably, performing similarity matching between the feature vector of the object region and the feature vector templates in the registered sketch feature library to obtain the sketch material corresponding to the most similar feature vector template specifically comprises:
calculating the similarity between the feature vector f1 of the object region and at least one feature vector template f2 in the registered sketch feature library based on the similarity function of formula (1);

    sim(f1, f2) = (f1 · f2) / (||f1|| · ||f2||)    (1)

and determining the registered sketch material corresponding to the feature vector template f2 most similar to the feature vector.
Preferably, the method for acquiring the feature vector template in the registered sketch feature library specifically includes:
performing edge detection on all images in the registered sketch material library to obtain a registered sketch edge detection image set;
and extracting the feature vector template set of the registered sketch feature library from the registered sketch edge detection image set based on the feature extraction model E.
Preferably, generating the object sketch on the input image according to the registered sketch material specifically comprises:
scaling the obtained sketch material to an appropriate size based on the position coordinate range of the object region, and rendering the sketch material on the input image.
A device for automatically generating object sketches, comprising:
an object region detection module, configured to perform object detection on an input image and acquire an object region in the image;
a feature vector extraction module, configured to extract a feature vector of the object region;
a feature vector matching module, configured to perform similarity matching between the feature vector of the object region and the feature vector templates in a registered sketch feature library to obtain the registered sketch material corresponding to the most similar feature vector template;
and a sketch generation module, configured to generate an object sketch on the input image according to the registered sketch material.
Preferably, the object region detection module specifically includes:
determining the number of objects in the image and an object region of at least one object based on an object detection model D;
wherein the object region comprises a position coordinate range of an object;
the training method of the object detection model D specifically comprises the following steps:
acquiring an image set annotated with object contour coordinate information as training sample A;
and, based on a MobileNetV2 backbone network, merging candidate boxes of training sample A with a non-maximum suppression (NMS) algorithm, and training to obtain the object detection model D.
Preferably, the feature vector extraction module specifically includes:
extracting 512-dimensional feature vectors of the object region based on a feature extraction model E;
the training method of the feature extraction model E specifically comprises the following steps:
selecting at least one image from the registered sketch material library, performing edge detection with different threshold parameters, and acquiring an edge detection data set of each image as training sample B;
and, based on the 18-layer residual neural network ResNet18 and a squeeze-and-excitation block (SE-Block) structure, training on sample B with the ArcFace loss function to obtain the feature extraction model E.
Preferably, the feature vector matching module specifically comprises:
a similarity calculation module, configured to calculate the similarity between the feature vector f1 of the object region and at least one feature vector template f2 in the registered sketch feature library based on the similarity function of formula (1);

    sim(f1, f2) = (f1 · f2) / (||f1|| · ||f2||)    (1)

and a material determination module, configured to determine the registered sketch material corresponding to the feature vector template f2 most similar to the feature vector.
Preferably, the sketch generation module specifically comprises:
scaling the obtained sketch material to an appropriate size based on the position coordinate range of the object region, and rendering the sketch material on the input image.
A device for automatically generating object sketches comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above method for automatically generating object sketches.
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the above method for automatically generating object sketches.
According to the technical solution above, the automatic object sketch generation method quickly selects the sketch material corresponding to an object through object detection, object feature extraction, and matching on the image, solving the prior art's demanding requirements on image size, image labeling, and system resources when generating object sketches from captured images, avoiding the influence of image size on the algorithm, reducing the cost of complex image labeling, and effectively improving the processing efficiency of the system.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of the method for automatically generating object sketches according to the present invention.
FIG. 2 is a flow chart of the training of the object detection model D in the method for automatically generating object sketches according to the present invention.
FIG. 3 is a flow chart of the training of the feature extraction model E in the method for automatically generating object sketches according to the present invention.
FIG. 4 is a flow chart of the method for acquiring the feature vector templates of the registered sketch feature library in the method for automatically generating object sketches according to the present invention.
FIG. 5 is a first schematic structural diagram of a device for automatically generating object sketches according to the present invention.
FIG. 6 is a second schematic structural diagram of a device for automatically generating object sketches according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The method for automatically generating object sketches is suitable for the field of image processing, and in particular for processes such as recognizing and enhancing objects in an image during image capture and image processing on an intelligent terminal.
Image processing refers to techniques that analyze an image with a computer to achieve a desired result. A digital image, captured by an industrial camera, video camera, scanner, or similar device, is a large two-dimensional array of elements called pixels whose values are called gray levels. Image processing techniques generally comprise three parts: image compression; enhancement and restoration; and matching, description, and recognition.
Generative Adversarial Network (GAN): a deep learning model whose framework contains (at least) two modules, a generative model and a discriminative model, which produce good output by learning through a mutual game. In practice, deep neural networks are generally used for the generator G and discriminator D, and a good training method is needed. GANs are most frequently used for image generation, for example for data augmentation.
As shown in FIG. 1, the method for automatically generating object sketches provided by the present invention specifically comprises:
step 101, performing object detection on an input image to acquire an object region in the image.
Object detection in an image refers to finding a rectangular area in the image that contains a target object and obtaining the object's position and size.
The input may be images of various sizes, or a large batch of images.
Step 101 specifically determines the number of objects in the image and the object region of at least one object based on the trained object detection model D, wherein the object region comprises the position coordinate range of the object.
Conventional GAN algorithms are sensitive to image size. By cropping the object region from the image, the present invention can resize an input image of any size before feature matching, avoiding any restriction on input image size.
In order to improve the accuracy of object detection, deep learning of the object detection model D is important. As shown in fig. 2, the training method of the object detection model D specifically includes:
and acquiring an image set of the marked object outline coordinate information as a training sample A.
And based on a MobileNet V2 backbone network, merging candidate frames of the training sample A by using a non-maximum suppression NMS algorithm, and training to obtain an object detection model D.
The main idea of MobileNetV2 is to combine the depthwise separable convolutions of MobileNetV1 with the residual units of the residual network ResNet, replacing the bottleneck of the residual unit with depthwise convolutions. Most importantly, its blocks are inverted relative to ordinary residual blocks: an ordinary residual block first uses a 1×1 convolution to reduce the number of feature map channels, then applies a 3×3 convolution, and finally uses a 1×1 convolution to expand the channels again, whereas MobileNetV2 expands first and projects back down afterwards. In addition, to avoid the ReLU damaging the features, the ReLU nonlinearity after the layer with fewer channels is replaced with a linear layer.
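To make the inverted-residual structure concrete, the following is a minimal NumPy forward-pass sketch, not the patent's implementation; the function name, weight shapes, and use of ReLU6 are illustrative assumptions based on the public MobileNetV2 design:

```python
import numpy as np

def inverted_residual(x, w_expand, w_depthwise, w_project):
    """Sketch of a MobileNetV2 inverted residual on a (C, H, W) feature map:
    1x1 expand + ReLU6 -> 3x3 depthwise + ReLU6 -> 1x1 linear projection,
    with a skip connection when input and output shapes match."""
    c, h, w = x.shape
    relu6 = lambda t: np.clip(t, 0.0, 6.0)
    # a 1x1 convolution is a per-pixel matrix multiply over channels
    expanded = relu6(np.einsum('oc,chw->ohw', w_expand, x))
    # depthwise 3x3: each channel convolved with its own kernel (zero padding)
    padded = np.pad(expanded, ((0, 0), (1, 1), (1, 1)))
    dw = np.zeros_like(expanded)
    for i in range(3):
        for j in range(3):
            dw += w_depthwise[:, i, j][:, None, None] * padded[:, i:i + h, j:j + w]
    dw = relu6(dw)
    # linear projection: no nonlinearity after the narrow layer
    out = np.einsum('oc,chw->ohw', w_project, dw)
    return out + x if out.shape == x.shape else out
```

In a real network the per-channel expansion factor is typically 6 and the convolutions run on a GPU; the loop here only makes the depthwise step explicit.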
The detection network here may be replaced by other general-purpose networks.
Non-maximum suppression (NMS) algorithm: its essence is to search for local maxima and suppress non-maximum elements. It is mainly used in the post-processing module of a target detection framework to remove redundant candidate boxes and keep the most representative result, accelerating target detection.
Function of the algorithm: when multiple candidate boxes are generated for one object, select the box with the highest score and suppress the object's other candidate boxes.
Applicable scenario: multiple targets in one image. If there is only one target, simply take the candidate box with the highest score.
Input of the algorithm: all candidate boxes generated for an image, the score of each box, and a threshold thresh. The boxes can be represented by a 5-dimensional array dets, where the first 4 dimensions are the box coordinates (two corner points) and the 5th dimension is the score.
Output of the algorithm: the retained set of candidate boxes, a subset of dets.
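The NMS procedure described above, operating on the 5-dimensional dets array with a threshold thresh, can be sketched as follows; this is the standard greedy IoU-based formulation, not code taken from the patent:

```python
import numpy as np

def nms(dets, thresh):
    """Greedy non-maximum suppression over dets = [[x1, y1, x2, y2, score], ...].

    Returns indices of the retained boxes (a subset of dets), keeping the
    highest-scoring box and suppressing candidates that overlap it too much."""
    x1, y1, x2, y2, scores = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3], dets[:, 4]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # box indices by descending score
    keep = []
    while order.size > 0:
        i = order[0]                        # current highest-scoring box
        keep.append(int(i))
        # intersection of box i with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= thresh]    # drop boxes overlapping box i too much
    return keep
```

For example, two heavily overlapping boxes for one object collapse to the higher-scoring one, while a distant box for a second object survives.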
Little training data is needed: only the detection box data for object detection must be annotated, avoiding complex manual labeling of training sample images.
Step 102, extracting a feature vector of the object region.
Image edge detection means detecting, from an input image, the pixels that carry edge information. Edge detection greatly reduces the data volume, removes information that can be considered irrelevant, and retains the important structural attributes of the image. Three common methods are the Sobel operator, the Laplacian operator, and the Canny operator.
Feature extraction extracts the most expressive vector from a fixed-size image.
Specifically, based on the feature extraction model E, a 512-dimensional feature vector f of each object region is extracted.
As shown in fig. 3, the training method of the feature extraction model E specifically includes:
At least one image is selected from the registered sketch material library, edge detection is performed with different threshold parameters, and an edge detection data set is acquired for each image as training sample B.
A sketch yields a series of edge images through Canny edge detection with different thresholds; these data can be used to train the network. Different thresholds produce different richness of edge detail.
Based on the 18-layer residual neural network ResNet18 with a squeeze-and-excitation block (SE-Block) structure as the classification network, training sample B is processed with the ArcFace loss function as the recognition algorithm to obtain the feature extraction model E.
Clouds in the sky are used here as an example. The detection model D yields the cloud's position coordinates, from which the cloud region image is obtained. Since a cloud itself has sparse edges, it is difficult to obtain a good edge image with a fixed Canny threshold. When the Canny threshold is small, edges are rich, and interior details of the cloud frequently interfere with matching; when the threshold is large, edges are few and contour information may even be missing, and the lack of contour edges reduces matching accuracy.
To solve this, the scheme extracts Canny edges with a dynamic threshold. Statistics show that, among images binarized with different thresholds, choosing one whose edge pixels make up a certain proportion of the whole image preserves the integrity of the cloud's outer contour while keeping its interior details sparse.
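The dynamic-threshold idea can be sketched as below. To keep the snippet self-contained, the Canny operator is replaced by a toy gradient-magnitude edge detector; the `edge_map` stand-in, the candidate threshold list, and the ratio bounds `lo`/`hi` are illustrative assumptions, not values from the patent:

```python
import numpy as np

def edge_map(img, thresh):
    """Toy stand-in for cv2.Canny: binarize the gradient magnitude.
    (The actual scheme would run the Canny operator here.)"""
    gy, gx = np.gradient(img.astype(float))
    return (np.hypot(gx, gy) > thresh).astype(np.uint8)

def dynamic_threshold_edges(img, thresholds, lo=0.02, hi=0.10):
    """Scan candidate thresholds and keep the first edge map whose edge-pixel
    ratio falls inside [lo, hi]: contour stays intact, interior stays sparse."""
    for t in thresholds:                 # e.g. scanned from strict to permissive
        edges = edge_map(img, t)
        ratio = edges.mean()             # fraction of pixels marked as edges
        if lo <= ratio <= hi:
            return t, edges
    # fall back to the last threshold if no ratio lands in the target band
    return thresholds[-1], edge_map(img, thresholds[-1])
```

The acceptable edge-pixel ratio band would in practice be tuned on the statistics the description mentions.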
The registered sketch material library is a dynamically updated database of sketch materials, comprising existing sketch materials and those gradually acquired and stored during image training and processing.
The residual neural network ResNet18 is an 18-layer convolutional neural network. Residual networks are easy to optimize and can gain accuracy from considerably increased depth. Their internal residual blocks use skip connections, alleviating the vanishing-gradient problem caused by increasing depth in deep neural networks.
Here, the network structure of the feature extraction model may be replaced with other general networks. For example: MobileNet, ShuffleNet, etc.
The squeeze-and-excitation block (SE-Block, from Squeeze-and-Excitation Networks) is an image recognition structure that improves accuracy by modeling the correlation between feature channels and strengthening important features. It adaptively recalibrates the response of each channel and models the interdependencies between channels. Adding SE blocks to an existing state-of-the-art network adds only a small computational cost but can greatly improve network performance.
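The squeeze, excitation, and rescaling steps can be sketched as a NumPy forward pass; weight shapes and the reduction layout are illustrative assumptions based on the public SE-Net design, not the patent's code:

```python
import numpy as np

def se_block(feature_map, w1, b1, w2, b2):
    """Forward pass of a squeeze-and-excitation block on a (C, H, W) map.

    squeeze: global average pooling per channel -> (C,)
    excitation: FC -> ReLU -> FC -> sigmoid, modeling channel interdependencies
    scale: reweight each channel by its learned importance in (0, 1)."""
    squeezed = feature_map.mean(axis=(1, 2))             # (C,)
    hidden = np.maximum(0.0, w1 @ squeezed + b1)         # (C/r,) with reduction r
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))  # per-channel gate
    return feature_map * weights[:, None, None]          # channel recalibration
```

With zero weights the gate is sigmoid(0) = 0.5 for every channel, i.e. uniform downscaling; training is what makes the gates differentiate important channels.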
In face recognition, algorithmic improvements are mainly reflected in the design of the loss function. The characteristic of the ArcFace loss function is that it directly maximizes the classification margin in angular space, which guides the optimization of the whole network.
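The angular-margin idea can be sketched as follows: the logit of the target class is computed from cos(theta + m) instead of cos(theta), then everything is scaled by s. The defaults s = 64 and m = 0.5 are common values from the public ArcFace work, not values stated in this patent:

```python
import numpy as np

def arcface_logits(embedding, class_centers, label, s=64.0, m=0.5):
    """ArcFace-style logits: add an angular margin m to the target class.

    embedding and each row of class_centers are L2-normalized, so their dot
    products are cosines of angles in angular space."""
    emb = embedding / np.linalg.norm(embedding)
    centers = class_centers / np.linalg.norm(class_centers, axis=1, keepdims=True)
    cos = centers @ emb                                   # cos(theta_j) per class
    theta_y = np.arccos(np.clip(cos[label], -1.0, 1.0))
    logits = s * cos
    logits[label] = s * np.cos(theta_y + m)               # penalize the target angle
    return logits                                         # feed into cross-entropy
```

Because the target logit is shrunk by the margin, the network must pull embeddings closer to their class center than any other, which is what produces discriminative features for the matching step.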
Step 103, performing similarity matching between the feature vector of the object region and the feature vector templates in the registered sketch feature library to obtain the registered sketch material corresponding to the most similar feature vector template.
Feature matching refers to calculating the similarity of two vectors from the distance between the two features.
Specifically, the similarity between the feature vector f1 of the object region and at least one feature vector template f2 in the registered sketch feature library is calculated based on the similarity function of formula (1):

    sim(f1, f2) = (f1 · f2) / (||f1|| · ||f2||)    (1)

The registered sketch material corresponding to the feature vector template f2 most similar to the object region's feature vector is then determined.
Each feature vector template in the registered sketch feature library has corresponding registered sketch material; based on this correspondence, the sketch material used for the object in the image is determined.
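The matching step can be sketched as below, assuming formula (1) is the standard cosine similarity, which would be consistent with ArcFace-style angular features (the original formula is only present as an image placeholder):

```python
import numpy as np

def cosine_similarity(f1, f2):
    """Similarity as in formula (1): cosine of the angle between two vectors."""
    return float(np.dot(f1, f2) / (np.linalg.norm(f1) * np.linalg.norm(f2)))

def match_sketch(f1, templates):
    """Return the index of the most similar registered template (argmax)."""
    sims = [cosine_similarity(f1, f2) for f2 in templates]
    return int(np.argmax(sims))
```

The index returned by `match_sketch` then looks up the registered sketch material via the template-to-material correspondence described above.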
As shown in FIG. 4, the method for acquiring the feature vector templates in the registered sketch feature library specifically comprises:
performing edge detection on all images in the registered sketch material library to obtain a registered sketch edge detection image set;
and extracting the feature vector template set of the registered sketch feature library from the registered sketch edge detection image set based on the feature extraction model E.
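The two steps above amount to a simple pipeline over the material library; `edge_fn` and `extract_fn` below are stand-ins for the Canny step and the feature extraction model E, both hypothetical names introduced for illustration:

```python
import numpy as np

def build_feature_library(materials, edge_fn, extract_fn):
    """Build the registered sketch feature library: one feature vector template
    per sketch material, extracted from its edge-detection image.

    materials: dict mapping a material id to its image array."""
    library = {}
    for name, img in materials.items():
        edges = edge_fn(img)                # edge detection on the material
        library[name] = extract_fn(edges)   # e.g. a 512-dim vector from model E
    return library
```

Keeping the library keyed by material id preserves the template-to-material correspondence used in the matching step.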
Step 104, generating the object sketch on the input image according to the registered sketch material.
The obtained sketch material is scaled to an appropriate size based on the position coordinate range of the object region and rendered on the input image.
Once the best-matching sketch material is selected, it is scaled to the size appropriate for the cloud; the material is not distorted, and the output is insensitive to size.
The method can be applied to product poster production, i.e., as a method for producing product posters: a photograph of the product is obtained and, using the automatic object sketch generation method, a clean-lined sketch is automatically generated from the captured image, completing poster production automatically and improving its efficiency.
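The scale-and-render step can be sketched for grayscale images as follows; nearest-neighbor resizing and the "dark strokes win" compositing rule are illustrative choices, not details given in the patent:

```python
import numpy as np

def render_sketch(image, sketch, box):
    """Scale the matched sketch material to the object's bounding box and draw
    it onto the input image (nearest-neighbor resize; darker pixel wins)."""
    x1, y1, x2, y2 = box
    h, w = y2 - y1, x2 - x1
    ys = np.arange(h) * sketch.shape[0] // h      # nearest-neighbor row indices
    xs = np.arange(w) * sketch.shape[1] // w      # nearest-neighbor col indices
    resized = sketch[ys[:, None], xs[None, :]]
    region = image[y1:y2, x1:x2]
    image[y1:y2, x1:x2] = np.minimum(region, resized)  # overlay the dark strokes
    return image
```

Because the sketch is resized to the detected box rather than the whole frame, the output stays insensitive to the input image size, matching the point made above.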
Based on the same concept as the method for automatically generating object sketches described above, the present invention further provides a device for automatically generating object sketches. As shown in FIG. 5, the device comprises: an object region detection module 100, a feature vector extraction module 200, a feature vector matching module 300, and a sketch generation module 400.
The object region detection module 100 is configured to perform object detection on an input image, and acquire an object region in the image.
The feature vector extraction module 200 is configured to extract a feature vector of the object region.
The feature vector matching module 300 is configured to perform similarity matching on the feature vectors of the object region and the feature vector templates in the registered sketch feature library to obtain registered sketch materials corresponding to the most similar feature vector template.
The sketch generation module 400 is configured to generate the object sketch on the input image according to the registered sketch material.
Based on the same concept as the method for automatically generating object sketches described above, the present invention further provides a device for automatically generating object sketches. As shown in FIG. 6, the device comprises a memory 101, a processor 102, and a computer program stored in the memory and executable on the processor 102. The processor 102, when executing the computer program, implements the steps of the method for automatically generating object sketches.
Finally, it should also be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart and block diagrams may represent a module, segment, or portion of code, which comprises one or more computer-executable instructions for implementing the logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. It will also be noted that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (13)

1. A method for automatically generating an object sketch, the method comprising:
carrying out object detection on an input image to acquire an object region in the image;
extracting a feature vector of the object region;
carrying out similarity matching between the feature vector of the object region and feature vector templates in a registered sketch feature library to obtain the registered sketch material corresponding to the most similar feature vector template;
and generating an object sketch on the input image according to the registered sketch material.
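The four claimed steps can be illustrated with a minimal, non-claimed pipeline sketch. Everything here is a stand-in: `detect_objects` returns the whole image as a single region instead of running the trained detector D, and `extract_feature` uses a simple histogram instead of the 512-dimensional feature extraction model E.

```python
import numpy as np

def detect_objects(image):
    # Step 1 (illustrative): object detection -> bounding boxes (x, y, w, h).
    # A real system would use the trained detector D; here we return the
    # whole image as one region.
    h, w = image.shape[:2]
    return [(0, 0, w, h)]

def extract_feature(region):
    # Step 2 (illustrative): fixed-length feature vector (512-dim in the
    # claim). Stand-in: an intensity histogram instead of model E.
    hist, _ = np.histogram(region, bins=512, range=(0, 256))
    return hist.astype(np.float64)

def match_template(feature, library):
    # Step 3: cosine-similarity matching against registered templates.
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return max(library, key=lambda name: cos(feature, library[name]))

def generate_sketch(image, box, material_name):
    # Step 4 (illustrative): record which sketch material to place at the
    # detected region; a real system would composite the material image.
    return {"box": box, "material": material_name}

image = np.full((64, 64), 128, dtype=np.uint8)
library = {"cup": extract_feature(np.full((64, 64), 128, dtype=np.uint8)),
           "hat": extract_feature(np.zeros((64, 64), dtype=np.uint8))}
box = detect_objects(image)[0]
result = generate_sketch(image, box,
                         match_template(extract_feature(image), library))
print(result["material"])  # → cup
```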
2. The method according to claim 1, wherein the performing object detection on the input image to obtain an object region in the image specifically includes:
determining the number of objects in the image and an object region of at least one object based on an object detection model D;
wherein the object region comprises a position coordinate range of an object;
the training method of the object detection model D specifically comprises the following steps:
acquiring a set of images annotated with object outline coordinate information as training sample A;
and based on a MobileNet V2 backbone network, merging candidate boxes of training sample A using the non-maximum suppression (NMS) algorithm, and training to obtain the object detection model D.
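The candidate-box merging step in the claim refers to standard non-maximum suppression. The following self-contained sketch shows NMS on scored boxes; the `(x1, y1, x2, y2, score)` box format and the 0.5 IoU threshold are assumptions for illustration, not values from the patent.

```python
def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, iou_thresh=0.5):
    # boxes: list of (x1, y1, x2, y2, score). Keep boxes in descending
    # score order, dropping any box that overlaps an already-kept box
    # above the IoU threshold.
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box[:4], k[:4]) <= iou_thresh for k in kept):
            kept.append(box)
    return kept

candidates = [(0, 0, 10, 10, 0.9), (1, 1, 11, 11, 0.8), (20, 20, 30, 30, 0.7)]
print(len(nms(candidates)))  # → 2 (the two overlapping boxes merge into one)
```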
3. The method according to claim 1 or 2, wherein the extracting the feature vector of the object region specifically comprises:
extracting 512-dimensional feature vectors of the object region based on a feature extraction model E;
the training method of the feature extraction model E specifically comprises the following steps:
selecting at least one image from a registered sketch material library, carrying out edge detection with different threshold parameters, and acquiring an edge detection data set for each image as training sample B;
and training on sample B with the ArcFace loss function, based on the 18-layer residual neural network ResNet-18 with a squeeze-and-excitation (SE-Block) structure, to obtain the feature extraction model E.
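The construction of training sample B amounts to running edge detection on each material image under several threshold parameters. The sketch below uses a simple gradient-magnitude detector with placeholder thresholds; a real implementation might use Canny edge detection instead, and the threshold values here are arbitrary assumptions.

```python
import numpy as np

def edge_map(img, thresh):
    # Crude edge detector: horizontal + vertical gradient magnitude,
    # binarized at the given threshold (stand-in for e.g. Canny).
    gx = np.abs(np.diff(img.astype(np.int32), axis=1))[:-1, :]
    gy = np.abs(np.diff(img.astype(np.int32), axis=0))[:, :-1]
    return ((gx + gy) > thresh).astype(np.uint8)

img = np.zeros((8, 8), dtype=np.uint8)
img[2:6, 2:6] = 200  # a bright square whose border shows up as edges

# One edge image per threshold parameter; together they form the
# per-image edge detection data set of training sample B.
dataset = [edge_map(img, t) for t in (50, 100, 150)]
print(len(dataset), dataset[0].shape)  # → 3 (7, 7)
```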
4. The method according to claim 3, wherein the matching of the similarity between the feature vector of the object region and the feature vector templates in the registered sketch feature library to obtain the registered sketch material corresponding to the most similar feature vector template specifically comprises:
calculating the similarity of the feature vector f1 of the object region and at least one feature vector template f2 in the registered sketch feature library based on a similarity function formula (1);
sim(f1, f2) = (f1 · f2) / (‖f1‖ × ‖f2‖)    (1)
and determining the registered sketch material corresponding to the feature vector template f2 that is most similar to the feature vector f1.
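Formula (1) is presumed here to be cosine similarity between the query feature f1 and each registered template f2, the usual choice for ArcFace-trained features; the original formula image is not legible in this record, so treat this as an assumption.

```python
import numpy as np

def similarity(f1, f2):
    # Presumed formula (1): cosine similarity of two feature vectors.
    return float(np.dot(f1, f2)) / (np.linalg.norm(f1) * np.linalg.norm(f2))

f1 = np.array([1.0, 0.0, 1.0])
templates = {"cat": np.array([1.0, 0.0, 1.0]),   # hypothetical library
             "dog": np.array([0.0, 1.0, 0.0])}

# The most similar template determines the matched sketch material.
best = max(templates, key=lambda k: similarity(f1, templates[k]))
print(best)  # → cat
```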
5. The method according to claim 4, wherein the method for obtaining the feature vector templates in the registered sketch feature library specifically comprises:
performing edge detection on all images in the registered sketch material library to obtain a registered sketch edge-detection image set;
and extracting the feature vector template set of the registered sketch feature library from the registered sketch edge-detection image set based on the feature extraction model E.
6. The method of claim 5, wherein generating an object sketch on the input image according to the registered sketch material specifically comprises:
scaling the obtained sketch material to an appropriate size based on the position coordinate range of the object region, and rendering the sketch material on the input image.
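A minimal sketch of this final step: scale the material to the detected box and paste it onto the input image. Nearest-neighbour scaling and paste-by-overwrite compositing are illustrative assumptions; the patent does not specify the interpolation or blending method.

```python
import numpy as np

def scale_nearest(material, out_h, out_w):
    # Nearest-neighbour resize of a 2-D material image to (out_h, out_w).
    h, w = material.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return material[rows][:, cols]

def paste(image, material, box):
    # box: (x, y, w, h) — the position coordinate range of the object region.
    x, y, w, h = box
    out = image.copy()
    out[y:y + h, x:x + w] = scale_nearest(material, h, w)
    return out

image = np.zeros((32, 32), dtype=np.uint8)          # input image stand-in
material = np.full((8, 8), 255, dtype=np.uint8)     # sketch material stand-in
result = paste(image, material, (4, 4, 16, 16))
print(result[12, 12], result[0, 0])  # → 255 0
```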
7. An apparatus for automatically generating object sketches, the apparatus comprising:
an object region detection module, used for carrying out object detection on an input image and acquiring an object region in the image;
a feature vector extraction module, used for extracting a feature vector of the object region;
a feature vector matching module, used for matching the similarity between the feature vector of the object region and feature vector templates in a registered sketch feature library to obtain the registered sketch material corresponding to the most similar feature vector template;
and a sketch generating module, used for generating an object sketch on the input image according to the registered sketch material.
8. The apparatus according to claim 7, wherein the object region detection module specifically includes:
determining the number of objects in the image and an object region of at least one object based on an object detection model D;
wherein the object region comprises a position coordinate range of an object;
the training method of the object detection model D specifically comprises the following steps:
acquiring a set of images annotated with object outline coordinate information as training sample A;
and based on a MobileNet V2 backbone network, merging candidate boxes of training sample A using the non-maximum suppression (NMS) algorithm, and training to obtain the object detection model D.
9. The apparatus according to claim 7 or 8, wherein the feature vector extraction module specifically comprises:
extracting 512-dimensional feature vectors of the object region based on a feature extraction model E;
the training method of the feature extraction model E specifically comprises the following steps:
selecting at least one image from a registered sketch material library, carrying out edge detection with different threshold parameters, and acquiring an edge detection data set for each image as training sample B;
and training on sample B with the ArcFace loss function, based on the 18-layer residual neural network ResNet-18 with a squeeze-and-excitation (SE-Block) structure, to obtain the feature extraction model E.
10. The apparatus of claim 9, wherein the feature vector matching module specifically comprises:
the similarity calculation module is used for calculating the similarity between the feature vector f1 of the object region and at least one feature vector template f2 in the registered sketch feature library based on a similarity function formula (1);
sim(f1, f2) = (f1 · f2) / (‖f1‖ × ‖f2‖)    (1)
and the material determining module is used for determining the registered sketch material corresponding to the feature vector template f2 which is most similar to the feature vector.
11. The apparatus according to claim 10, wherein the sketch generating module specifically comprises:
scaling the obtained sketch material to an appropriate size based on the position coordinate range of the object region, and rendering the sketch material on the input image.
12. An apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to any one of claims 1-6 when executing the computer program.
13. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN202010303809.7A 2020-04-17 2020-04-17 Automatic generation method and device of object simplified strokes Pending CN113537259A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010303809.7A CN113537259A (en) 2020-04-17 2020-04-17 Automatic generation method and device of object simplified strokes


Publications (1)

Publication Number Publication Date
CN113537259A true CN113537259A (en) 2021-10-22

Family

ID=78093465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010303809.7A Pending CN113537259A (en) 2020-04-17 2020-04-17 Automatic generation method and device of object simplified strokes

Country Status (1)

Country Link
CN (1) CN113537259A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106873893A (en) * 2017-02-13 2017-06-20 北京光年无限科技有限公司 For the multi-modal exchange method and device of intelligent robot
CN107316333A (en) * 2017-07-07 2017-11-03 华南理工大学 It is a kind of to automatically generate the method for day overflowing portrait
CN108257194A (en) * 2018-01-23 2018-07-06 哈尔滨工程大学 Face simple picture generation method based on convolutional neural networks
CN109087376A (en) * 2018-07-31 2018-12-25 Oppo广东移动通信有限公司 Image processing method, device, storage medium and electronic equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination