WO2021139557A1 - Portrait stick figure generation method and system, and drawing robot - Google Patents

Portrait stick figure generation method and system, and drawing robot Download PDF

Info

Publication number
WO2021139557A1
WO2021139557A1 (PCT/CN2020/140335)
Authority
WO
WIPO (PCT)
Prior art keywords
portrait
stick
image
photo
style
Prior art date
Application number
PCT/CN2020/140335
Other languages
French (fr)
Chinese (zh)
Inventor
朱静洁
高飞
李鹏
俞泽远
王韬
Original Assignee
杭州未名信科科技有限公司
浙江省北大信息技术高等研究院
Priority date
Filing date
Publication date
Application filed by 杭州未名信科科技有限公司 and 浙江省北大信息技术高等研究院
Publication of WO2021139557A1 publication Critical patent/WO2021139557A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation

Definitions

  • This application belongs to the field of image processing technology, and specifically relates to a portrait stick figure generation method, a system, and a painting robot.
  • the present invention proposes a portrait stick figure generation method, a system, and a painting robot, aiming to solve the problem that stick figure generation methods in the prior art cannot be well applied by a painting robot to draw vivid portrait stick figures.
  • a method for generating stick figures of portraits including the following steps:
  • the stick figure image is obtained through the convolutional neural network model.
  • the convolutional neural network model is specifically:
  • the high-level semantic features of the pre-processed portrait image and stick figure style photo are obtained through the VGG encoder
  • image preprocessing is performed according to the portrait photo to obtain the preprocessed portrait image, and the image preprocessing specifically includes:
  • the encoder adopts a VGG encoder
  • the adaptive instance normalization module adopts the AdaIN network structure
  • the decoder adopts the decoder structure of the AdaIN network.
  • the loss function used for optimization of the convolutional neural network model includes a content loss function, a style loss function, a local sparse loss function, and a consistency loss function.
  • the method further includes:
  • the post-processing of the stick figure is performed according to the stick figure image to obtain the final stick figure image suitable for the painting robot.
  • the post-processing of the stick figure includes Gaussian blur processing, adaptive binarization processing, and line dilation processing.
  • the post-processing of the stick figure specifically includes:
  • the binary image is obtained by Otsu's histogram-based adaptive binarization method
  • a portrait stick figure generation system, which specifically includes:
  • a portrait photo preprocessing module, used to perform image preprocessing according to a portrait photo to obtain a preprocessed portrait image;
  • a stick figure generation module, used to obtain a stick figure image through a convolutional neural network model according to the preprocessed portrait image and a stick figure style photo.
  • the portrait photo preprocessing module includes:
  • a face key point detection model, used to detect the face bounding box and facial feature key points according to the portrait photo, to obtain face bounding box information and the position coordinates of the facial feature key points;
  • a face alignment unit, used to obtain a face-aligned portrait image according to the face bounding box information and the position coordinates of the facial feature key points;
  • a face parsing model, used to obtain a portrait photo parsing mask from the face-aligned portrait image;
  • an image background removal unit, used to obtain a background-removed portrait image according to the portrait photo parsing mask.
  • a painting robot which specifically includes a processor, a communication module, a camera module, and a portrait execution module, wherein the processor can execute the above portrait stick figure generation method.
  • a preprocessed portrait image is obtained by image preprocessing according to a portrait photo; then a stick figure image is obtained through the convolutional neural network model according to the preprocessed portrait image and a stick figure style photo.
  • the convolutional neural network model is: obtain the high-level semantic features of the preprocessed portrait image and the stick figure style photo through an encoder; input the high-level semantic features into the adaptive instance normalization module to obtain statistical features; input the statistical features into the decoder to obtain an image with a stick figure style.
  • This application realizes that high-quality stick figures can be quickly generated from portrait photos, which is suitable for painting robots and makes it possible to draw a portrait stick figure in a short time. This solves the problem that stick figure generation methods of the prior art cannot be well applied by a painting robot to draw vivid portrait stick figures.
  • Fig. 1 shows a flow chart of the steps of a method for generating stick figures of portraits according to an embodiment of the present application
  • FIG. 2 shows a schematic diagram of a network structure of a deep convolutional neural network model according to an embodiment of the present application
  • FIG. 3 shows a schematic diagram of a specific network structure of an encoder and a decoder in a deep convolutional neural network model according to an embodiment of the present application
  • Figure 4 shows a schematic structural diagram of a system for generating stick figures of portraits according to an embodiment of the present application
  • Fig. 5 shows a schematic diagram of the design process of a system for generating simple strokes of portraits according to another embodiment of the present application.
  • an embodiment of the present application provides a method for generating stick figures for portraits.
  • A preprocessed portrait image is obtained by image preprocessing according to a portrait photo; then the preprocessed portrait image and a stick figure style photo are passed through the convolutional neural network model to obtain the stick figure image. This makes it possible to quickly generate high-quality stick figures from portrait photos and is suitable for painting robots, which can draw portrait stick figures in a short time, solving the problem that stick figure generation methods of the prior art cannot be well applied by a painting robot to draw vivid portrait stick figures.
  • this application discloses a multi-style portrait stick figure generation method for painting robots, which can perform face recognition and face cutting operations through portrait photos, and then perform portrait-stick figure style conversion.
  • the stick figure generation model used in this application generates richer details in each part.
  • the stick figure generation model adopted in this application is suitable for multiple stick figure styles, and has robustness to adapt to multiple stick figure styles and retain the details of character identity information;
  • this application integrates the algorithm into a painting robot to quickly generate portrait stick figure images and meet the needs of family companionship.
  • Fig. 1 shows a flow chart of the steps of a method for generating stick figures of portraits according to an embodiment of the present application.
  • the method for generating stick figures in this embodiment specifically includes the following steps:
  • S102 Obtain the stick figure image through the convolutional neural network model according to the preprocessed portrait image and stick figure style photo.
  • image preprocessing is performed according to the portrait photo to obtain the preprocessed portrait image, and the image preprocessing specifically includes:
  • the face bounding box and the facial feature key points are detected, and the face bounding box information and the position coordinates of the facial feature key points are obtained.
  • the face bounding box and key point detection are performed through the face key point prediction model to obtain the face bounding box information of the portrait photo and the corresponding position coordinates of the facial feature key points.
  • the facial feature key points are the centers of the left and right eyes, the tip of the nose, and the left and right corners of the mouth.
  • 1000 face images are randomly selected from each of the CelebA and CelebA-HQ datasets (with sizes 178×218 and 1024×1024, respectively), and data augmentation (Gaussian blur, horizontal flip, and mirror flip operations) is performed on these 2000 face images to obtain 6000 content images I_C in the training set.
  • character photos I_T taken with mobile phones are used as the test set.
  • MTCNN (Multi-task Convolutional Neural Network) is used for face detection: a cascaded network that first proposes candidate bounding boxes and whose final stage, O-Net, generates the final bounding box and the five facial key points.
  • The five key points finally obtained are the center of the left eye, the center of the right eye, the tip of the nose, the left mouth corner, and the right mouth corner: Landmark = {p_leye, p_reye, p_nose, p_lmouth, p_rmouth}.
  • This step belongs to the face alignment step.
  • the position coordinates of the left and right eye centers among the facial key points are subjected to an affine transformation operation to align the face.
  • the two eye key points are kept horizontal and at a fixed distance from the upper boundary of the image through affine transformation and image cropping operations to perform face alignment.
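The eye-leveling affine transform described above can be sketched as follows. This is a minimal illustration, not the embodiment's actual implementation: the function names are assumptions, and the choice of rotating about the eye midpoint (the subsequent cropping step is omitted) is one common convention.

```python
import math

def alignment_transform(p_leye, p_reye):
    """2x3 affine matrix [R | t] that rotates about the eye midpoint
    so the two eye centres end up on a horizontal line."""
    dx = p_reye[0] - p_leye[0]
    dy = p_reye[1] - p_leye[1]
    angle = math.atan2(dy, dx)              # tilt of the inter-eye line
    cx = (p_leye[0] + p_reye[0]) / 2.0      # rotation centre: eye midpoint
    cy = (p_leye[1] + p_reye[1]) / 2.0
    c, s = math.cos(-angle), math.sin(-angle)
    # t = centre - R * centre, so the midpoint stays fixed
    tx = cx - (c * cx - s * cy)
    ty = cy - (s * cx + c * cy)
    return [[c, -s, tx], [s, c, ty]]

def apply_affine(m, p):
    """Apply the 2x3 affine matrix to a point (x, y)."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```

Applying the matrix to both eye centers yields two points with equal y coordinates, which is exactly the "kept in a horizontal position" condition.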
  • the background color of the face image is set to white to achieve the background removal operation, and the processed content image I_TT is obtained.
  • during training, all images {I_TT, I_S} are proportionally scaled to a width of 512 and 256×256 patches are randomly cropped; during testing, all images are proportionally scaled to a width of 512.
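The training-time random cropping can be sketched as below. This is an illustration only: images are modeled as nested lists of pixel values, an assumption made for self-containment, and the proportional scaling step is omitted.

```python
import random

def random_patch(img, size=256):
    """Randomly crop a size x size patch from a 2-D image
    (a list of rows), as for the 256x256 training patches."""
    h, w = len(img), len(img[0])
    y = random.randint(0, h - size)
    x = random.randint(0, w - size)
    return [row[x:x + size] for row in img[y:y + size]]
```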
  • the stick figure image is obtained through the convolutional neural network model according to the preprocessed portrait image and stick figure style photos.
  • Fig. 2 shows a schematic diagram of a network structure of a deep convolutional neural network model according to an embodiment of the present application.
  • the convolutional neural network model is specifically: according to the preprocessed portrait image and the stick figure style photo, the high-level semantic features of the preprocessed portrait image and stick figure style photo are obtained through the VGG encoder;
  • the loss function used in the optimization of the convolutional neural network model includes the content loss function, the style loss function, the local sparse loss function, and the consistency loss function.
  • Convolutional neural network model generation steps: first, inspired by the AdaIN network structure, the deep convolutional neural network model obtains the high-level semantic features of the content image and the style image through the encoder; then, the last feature map of the encoder is used as the input of the adaptive instance normalization (AdaIN) module, which combines the content features of the preprocessed portrait image obtained in S101 with the style features of the stick figure style photo through learned feature statistics; finally, the statistical features are inversely transformed into image space by the decoder to obtain an image with a stick figure style.
  • AdaIN Adaptive Instance Normalization
  • FIG. 3 shows a schematic diagram of the specific network structure of the encoder and the decoder in the deep convolutional neural network model according to an embodiment of the present application.
  • v(·) is the VGG encoder with pre-trained model parameters
  • g_c is the high-level semantic feature obtained by inputting the content image into the VGG encoder
  • g_s is the high-level semantic feature obtained by inputting the style image into the VGG encoder.
  • the output of the first few layers of the VGG model (for example, the result of relu4_1) is used as the output feature of the encoder, and this output feature is input into the AdaIN module to learn feature statistics.
  • the learned feature statistics o are computed as o = AdaIN(g_c, g_s).
  • AdaIN is the adaptive instance normalization module, which learns feature statistics through a combination of mean and standard deviation.
  • the specific formula of AdaIN is: AdaIN(g_c, g_s) = σ(g_s) · ((g_c − μ(g_c)) / σ(g_c)) + μ(g_s), where
  • ⁇ ( ⁇ ) is the mean value of the calculated feature
  • ⁇ ( ⁇ ) is the standard deviation of the calculated feature
  • the statistical features obtained by the adaptive instance normalization (AdaIN) module are decoded and inversely converted into image space.
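The AdaIN computation above can be sketched per channel as follows. This is a toy illustration on flat lists of activations, not the actual tensor implementation; the epsilon term is an assumption added for numerical safety.

```python
import math

def mean_std(xs):
    """Mean and standard deviation of one feature channel."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, math.sqrt(var + 1e-8)        # eps guards against zero std

def adain(content, style):
    """AdaIN(g_c, g_s): normalise the content channel to zero mean and
    unit std, then rescale with the style channel's std and mean."""
    mu_c, sd_c = mean_std(content)
    mu_s, sd_s = mean_std(style)
    return [sd_s * (x - mu_c) / sd_c + mu_s for x in content]
```

After the transfer, the output channel carries the style's first- and second-order statistics while keeping the content's normalized shape.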
  • In the decoder network structure shown in Figure 3, the decoder is divided into 12 modules (blocks): the second, seventh, and tenth modules are upsampling layers; the last module consists of reflection padding and convolution (Convolutional Neural Networks, CNN); the remaining modules each have three operation components: reflection padding, convolution, and rectified linear units (Rectified Linear Units, ReLU).
  • d( ⁇ ) is the decoder
  • cs is the stylized image produced by the decoder.
  • the optimization of the neural network model combines several loss functions, as follows:
  • v(cs) represents the features obtained by inputting the image cs produced by the decoder back into the VGG encoder
  • o is the feature statistics output by the AdaIN module
  • the content loss is the Euclidean distance between the target features o and the output image features v(cs).
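That content loss can be written down directly. A minimal sketch, assuming the features are flattened into plain lists (the real features are multi-channel tensors):

```python
import math

def content_loss(v_cs, o):
    """Euclidean distance between the VGG features v(cs) of the
    decoded image and the AdaIN target statistics o (flattened)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v_cs, o)))
```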
  • each φ_i(·) denotes one layer of VGG-19 used to calculate the style loss.
  • the embodiment of the present application uses relu1_1, relu2_1, relu3_1, and relu4_1 layer features with equal weights.
  • ⊙ denotes element-wise multiplication of corresponding elements
  • M′ is the label mask obtained by updating M
  • M has n categories in total.
  • the areas where the contours of the eyebrows, eyes, glasses, nose, mouth, face, and background are extracted are all marked 0, the remaining areas are all marked 1, and M′ (of the same size as M) is obtained.
  • the purpose is to sparsify the areas labeled 1 so that the generated result better fits the drawing trajectory of the painting robot.
  • the consistency loss is the Euclidean distance between the two, which keeps the image generated by the global generator pixel-wise consistent with the stick figure style photo.
  • ⁇ 1 , ⁇ 2 , ⁇ 3 , and ⁇ 4 are custom weights.
  • S103 Perform post-processing of the stick figure according to the stick figure image to obtain a final stick figure image suitable for the painting robot.
  • the post-processing of the stick figure includes Gaussian blur processing, adaptive binarization processing, and line dilation processing.
  • the post-processing of stick figures includes:
  • the binary image is obtained by Otsu's histogram-based adaptive binarization method
  • stick figure post-processing realizes the transition optimization from the stick figure generation result to the drawing result of the painting robot.
  • the Gaussian blur is essentially a low-pass filter: each pixel in the output image is a weighted sum of the corresponding pixel and its surrounding pixels in the original image.
  • the weight function of this low-pass filter is the two-dimensional Gaussian G(x, y) = exp(−(x² + y²) / (2σ²)) / (2πσ²).
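The Gaussian weight matrix can be built directly from that function; a minimal sketch, where the kernel radius and sigma are free parameters chosen by the caller:

```python
import math

def gaussian_kernel(radius, sigma):
    """Normalised 2-D Gaussian weight matrix built from
    G(x, y) = exp(-(x^2 + y^2) / (2 sigma^2)); the normalisation
    makes the weights sum to 1, absorbing the 1/(2 pi sigma^2) factor."""
    rng = range(-radius, radius + 1)
    k = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
          for x in rng] for y in rng]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]
```

Convolving this matrix with the image gives the blurred result described in the next step.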
  • the Gaussian distribution weight matrix is convolved with the original image matrix to obtain a Gaussian blurred image. Because binarization with a fixed, specified threshold would cause unnecessary dark spots, this embodiment uses Otsu's histogram-based adaptive binarization
  • method to find the optimal threshold and binarize the image. The specific process is as follows:
  • Otsu's adaptive binarization method yields a binary image with black pixels in the foreground and white pixels in the background.
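Otsu's threshold selection can be sketched on a grey-level histogram as follows; it picks the level that maximises the between-class variance. This is a hedged stand-in for the embodiment's unspecified implementation:

```python
def otsu_threshold(hist):
    """Otsu's method: choose the grey level that maximises the
    between-class variance w_b * w_f * (m_b - m_f)^2 of the histogram."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w_b = 0.0       # background pixel count so far
    sum_b = 0.0     # background intensity sum so far
    best_t, best_var = 0, -1.0
    for t, h in enumerate(hist):
        w_b += h
        if w_b == 0 or w_b == total:
            continue                           # one class is empty
        sum_b += t * h
        m_b = sum_b / w_b                      # background mean
        m_f = (sum_all - sum_b) / (total - w_b)  # foreground mean
        between = w_b * (total - w_b) * (m_b - m_f) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

For a bimodal histogram the returned threshold separates the two modes, so pixels at or below it can be mapped to one class and the rest to the other.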
  • f is the binary image
  • b is the convolution template
  • the value of the template is defined as
  • the dilation of the image by b at any position (x, y) is defined as the maximum value of f over the region of the image overlapped by b.
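That definition (the maximum of f over the region covered by b) can be sketched as follows for a flat structuring element; representing images as nested lists is an assumption made for illustration:

```python
def dilate(img, se):
    """Grayscale/binary dilation with a flat structuring element:
    out(x, y) = max of img over the region covered by se centred there."""
    h, w = len(img), len(img[0])
    sh, sw = len(se), len(se[0])
    oy, ox = sh // 2, sw // 2              # origin at the SE centre
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = img[y][x]
            for j in range(sh):
                for i in range(sw):
                    yy, xx = y + j - oy, x + i - ox
                    if se[j][i] and 0 <= yy < h and 0 <= xx < w:
                        if img[yy][xx] > best:
                            best = img[yy][xx]
            out[y][x] = best
    return out
```

On a binary line drawing this thickens every stroke by roughly the radius of the structuring element, which is the line dilation used to make the strokes drawable by the robot pen.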
  • Fig. 4 shows a schematic structural diagram of a system for generating stick figures of portraits according to an embodiment of the present application.
  • a portrait stick figure generation system based on portrait photos specifically includes:
  • a portrait photo preprocessing module 10, used to perform image preprocessing according to the portrait photo to obtain a preprocessed portrait image;
  • a stick figure generation module 20, used to obtain the stick figure image through the convolutional neural network model according to the preprocessed portrait image and the stick figure style photo.
  • the portrait photo preprocessing module 10 includes:
  • a face key point detection model, used to detect the face bounding box and facial feature key points according to the portrait photo, to obtain face bounding box information and the position coordinates of the facial feature key points;
  • a face alignment unit, used to obtain a face-aligned portrait image according to the face bounding box information and the position coordinates of the facial feature key points;
  • a face parsing model, used to obtain a portrait photo parsing mask from the face-aligned portrait image;
  • an image background removal unit, used to obtain a background-removed portrait image according to the portrait photo parsing mask.
  • Fig. 5 shows a schematic diagram of the design process of a portrait stick figure generation system according to another embodiment of the present application.
  • the portrait stick figure generation system of the embodiment shown in FIG. 5 adds a stick figure post-processing module.
  • the stick figure post-processing module performs stick figure post-processing according to the stick figure image to obtain the final stick figure image.
  • the stick figure post-processing includes Gaussian blur processing, adaptive binarization processing, and line dilation processing.
  • the portrait stick figure generation method, system, and painting robot in the embodiments of this application obtain the preprocessed portrait image by image preprocessing according to the portrait photo, and then use the convolutional neural network model to obtain the stick figure image according to the preprocessed portrait image and the stick figure style photo.
  • This makes it possible to quickly generate high-quality stick figures from portrait photos and is suitable for painting robots, which can draw portrait stick figures in a short time, solving the problem that stick figure generation methods of the prior art cannot be well applied by a painting robot to draw vivid portrait stick figures.
  • the stick figure generation model used in this application generates richer details in each part. Specifically, through feature statistics between the content image and the style image, local sparse constraints, and post-processing, the details of the generated portrait stick figures are more abundant than those of methods based on rule generation or direct global generation.
  • the stick figure generation model adopted in this application is suitable for multiple stick figure styles, and has robustness to adapt to multiple stick figure styles and retain the details of character identity information;
  • this application integrates the algorithm into a painting robot to quickly generate portrait stick figure images and meet the needs of family companionship.
  • This embodiment also provides a painting robot, which specifically includes a processor, a communication module, a camera module, and a portrait execution module, wherein the processor can execute the above portrait stick figure generation method.
  • the embodiments of the present application also provide a computer program product. Since the principle by which the computer program product solves the problem is similar to the method provided in the first embodiment of the present application, the implementation of the computer program product can refer to the implementation of the method, and the repeated parts will not be described again.
  • this application can be provided as methods, systems, or computer program products. Therefore, this application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program codes.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and
  • the instruction device implements the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are executed on the computer or other programmable equipment to produce computer-implemented processing; the instructions executed on the computer or other programmable equipment
  • thus provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more blocks in the block diagram.


Abstract

A portrait stick figure generation method and system, and a drawing robot. The portrait stick figure generation method comprises: performing image preprocessing according to a portrait photo to obtain a preprocessed portrait image (S101); and obtaining a stick figure image by means of a convolutional neural network model according to the preprocessed portrait image and a stick figure style photo (S102). According to the method, a high-quality stick figure can be quickly generated from a portrait photo, and the method is applicable to the drawing robot; a portrait stick figure can be drawn in a short time; the problem that stick figure generation methods in the prior art cannot be well applied to drawing robots to draw vivid portrait stick figures is solved.

Description

Method and system for generating portrait stick figures, and painting robot

Technical Field

This application belongs to the field of image processing technology, and specifically relates to a portrait stick figure generation method, a system, and a painting robot.

Background Art

With the development of artificial intelligence, more and more people have begun to study the combination of artificial intelligence and art, that is, computational art. On the other hand, artificial intelligence technology is increasingly intertwined with daily life, and the field of family companion robots is booming: family companion robots have not only entered our lives but also brightened our spiritual world. At present, family companion robots can perform artistic creation such as cartoons and sketches. Stick figures, which use simple elements such as dots and lines, are sufficient to express a person's characteristics vividly, and are well suited to the requirement that a robot draw a lively portrait in a short time.

At present, there is relatively little research on producing stick figures or applying them to robots. Most existing algorithms based on convolutional neural networks transform a target photo into a photo bearing the style of an existing style photo while preserving the target photo's content. These algorithms replace content features with style features in a patch-to-patch manner, and each data-driven training run yields a model adapted to only one painting style.

Because the content of real face portraits is complex, the details that different facial parts need to present vary, and current painting robots have certain limitations, face portrait stick figure algorithms for painting robots face great challenges.

Specifically, the main difficulties are as follows. (1) Many algorithms can generate good style paintings, but the results of generating portraits in the same style are unsatisfactory: the contours of the face are messy, with ghosting, and the details of the facial features cannot be accurately expressed, making the output entirely unsuitable for a painting robot to extract trajectories from and draw portraits. Therefore, when generating portrait stick figures for a painting robot, preserving the person's identity information while conforming to the robot's drawing process is a problem that must be solved. (2) Current style photo conversion is basically website-based, and generating a stylized image takes some time. For the elderly and children, operating a website is very difficult, and the result cannot be kept as a souvenir. For entertainment and daily life, a painting robot based on robot interaction, with improved portrait generation and drawing speed, is also an important aspect.

Therefore, there is an urgent need for a generation method that can produce high-quality, vivid portrait stick figures and is suitable for drawing by a painting robot.
Summary of the Invention

The present invention proposes a portrait stick figure generation method, a system, and a painting robot, aiming to solve the problem that stick figure generation methods in the prior art cannot be well applied by a painting robot to draw vivid portrait stick figures.

According to the first aspect of the embodiments of the present application, a portrait stick figure generation method is provided, including the following steps:

performing image preprocessing according to a portrait photo to obtain a preprocessed portrait image;

obtaining a stick figure image through a convolutional neural network model according to the preprocessed portrait image and a stick figure style photo, where the convolutional neural network model is specifically:

obtaining high-level semantic features of the preprocessed portrait image and the stick figure style photo through a VGG encoder;

inputting the high-level semantic features into the adaptive instance normalization (AdaIN) module to obtain statistical features;

inputting the statistical features into the decoder to obtain an image with a stick figure style.
Optionally, image preprocessing is performed according to the portrait photo to obtain the preprocessed portrait image, and the image preprocessing specifically includes:

performing face bounding box and facial feature key point detection according to the portrait photo to obtain face bounding box information and the position coordinates of the facial feature key points;

obtaining a face-aligned portrait image according to the face bounding box information and the position coordinates of the facial feature key points;

obtaining a portrait photo parsing mask according to the face-aligned portrait image;

obtaining a background-removed portrait image according to the portrait photo parsing mask.

Optionally,

the encoder adopts a VGG encoder;

the adaptive instance normalization module adopts the AdaIN network structure;

the decoder adopts the decoder structure of the AdaIN network.
Optionally, the loss functions used for optimizing the convolutional neural network model include a content loss function, a style loss function, a local sparse loss function, and a consistency loss function.

Optionally, after the stick figure image is obtained through the convolutional neural network model according to the preprocessed portrait image, the method further includes:

performing stick figure post-processing according to the stick figure image to obtain a final stick figure image suitable for the painting robot.

Optionally, the stick figure post-processing includes Gaussian blur processing, adaptive binarization processing, and line dilation processing.

Optionally, the stick figure post-processing specifically includes:

inputting the stick figure image into a low-pass filter for Gaussian blur processing to obtain a Gaussian blurred image;

obtaining a binary image from the Gaussian blurred image using Otsu's adaptive binarization method;

performing line dilation processing on the binary image to obtain the final stick figure image.
According to a second aspect of the embodiments of the present application, a portrait stick-figure generation system is provided, which specifically includes:
a portrait-photo preprocessing module, configured to perform image preprocessing on a portrait photo to obtain a preprocessed portrait image;
a stick-figure generation module, configured to obtain a stick-figure image from the preprocessed portrait image and a stick-figure-style photo through a convolutional neural network model.
Optionally, the portrait-photo preprocessing module includes:
a face key-point detection model, configured to perform face bounding-box and facial-feature key-point detection on the portrait photo to obtain face bounding-box information and the position coordinates of the facial-feature key points;
a face alignment unit, configured to obtain a face-aligned portrait image according to the face bounding-box information and the position coordinates of the facial-feature key points;
a face parsing model, configured to obtain a portrait-photo parsing mask according to the face-aligned portrait image;
an image background-removal unit, configured to obtain a background-removed portrait image according to the portrait-photo parsing mask.
According to a third aspect of the embodiments of the present application, a drawing robot is provided, which specifically includes a processor, a communication module, a camera module, and a portrait execution module, wherein the processor can execute the portrait stick-figure generation method described above.
With the portrait stick-figure generation method and system and the drawing robot of the embodiments of the present application, a preprocessed portrait image is obtained by performing image preprocessing on a portrait photo; a stick-figure image is then obtained from the preprocessed portrait image and a stick-figure-style photo through a convolutional neural network model, in which: the high-level semantic features of the preprocessed portrait image and the stick-figure-style photo are obtained through an encoder; the high-level semantic features are fed into an adaptive instance normalization module to obtain statistical features; and the statistical features are fed into a decoder to obtain an image in the stick-figure style. The present application can quickly turn a portrait photo into a high-quality stick figure and is suitable for a drawing robot, which can draw the portrait stick figure in a short time. This solves the problem that prior-art stick-figure generation methods cannot be readily applied to a drawing robot to draw vivid portrait stick figures.
Description of the drawings
The drawings described here are provided for a further understanding of the present application and constitute a part of it; the exemplary embodiments of the present application and their descriptions are used to explain the application and do not constitute an improper limitation of it. In the drawings:
Fig. 1 shows a flowchart of the steps of a portrait stick-figure generation method according to an embodiment of the present application;
Fig. 2 shows a schematic diagram of the network structure of the deep convolutional neural network model according to an embodiment of the present application;
Fig. 3 shows a schematic diagram of the specific network structures of the encoder and the decoder in the deep convolutional neural network model according to an embodiment of the present application;
Fig. 4 shows a schematic structural diagram of a portrait stick-figure generation system according to an embodiment of the present application;
Fig. 5 shows a schematic diagram of the design flow of a portrait stick-figure generation system according to another embodiment of the present application.
Detailed description of the embodiments
In the course of realizing the present application, the inventors found that, with the continuous development of artificial-intelligence technology, drawing robots are used more and more in daily life, and portrait drawing is widely applied in multimedia such as virtual reality, augmented reality, and robot portrait-drawing systems, as well as in personalized entertainment and the Internet. Because real face portraits are complex in content, the details that different facial parts need to present vary, and current drawing robots still have certain limitations, so applying a face-portrait stick-figure algorithm to a drawing robot faces great challenges. Therefore, there is an urgent need for a generation method that can convert a photo into a high-quality, vivid portrait stick figure and is suitable for drawing by a drawing robot.
In view of the above problems, an embodiment of the present application provides a portrait stick-figure generation method: a preprocessed portrait image is obtained by performing image preprocessing on a portrait photo; a stick-figure image is then obtained from the preprocessed portrait image and a stick-figure-style photo through a convolutional neural network model. The method can quickly turn a portrait photo into a high-quality stick figure and is suitable for a drawing robot, which can draw the portrait stick figure in a short time. This solves the problem that prior-art stick-figure generation methods cannot be readily applied to a drawing robot to draw vivid portrait stick figures.
Compared with the prior art, the present application discloses a multi-style portrait stick-figure generation method for drawing robots, which can perform face recognition, face cropping, and other operations on a portrait photo and then perform portrait-to-stick-figure style transfer; each part generated by the stick-figure generation model adopted in this application is richer in detail.
Specifically, during portrait-to-stick-figure style transfer, the stick-figure generation model adopted in this application supports multiple stick-figure styles and is robust in adapting to them while preserving the details of the subject's identity.
After the portrait-to-stick-figure style transfer, for display, this application integrates the algorithm into the drawing robot so that portrait stick-figure images can be generated quickly, meeting the needs of family companionship.
To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not an exhaustive list. It should be noted that, as long as there is no conflict, the embodiments of the present application and the features therein may be combined with each other.
Embodiment 1
Fig. 1 shows a flowchart of the steps of a portrait stick-figure generation method according to an embodiment of the present application.
As shown in Fig. 1, the portrait stick-figure generation method of this embodiment specifically includes the following steps:
S101: performing image preprocessing on a portrait photo to obtain a preprocessed portrait image;
S102: obtaining a stick-figure image from the preprocessed portrait image and a stick-figure-style photo through a convolutional neural network model.
In S101, the image preprocessing performed on the portrait photo to obtain the preprocessed portrait image specifically includes:
1) performing face bounding-box and facial-feature key-point detection on the portrait photo to obtain face bounding-box information and the position coordinates of the facial-feature key points.
Specifically, for a given portrait photo, face bounding-box and key-point detection is performed by a face key-point prediction model to obtain the face bounding-box information of the portrait photo and the corresponding position coordinates of the facial-feature key points. The facial-feature key points are the left-eye center, the right-eye center, the nose tip, and the mouth corners.
In this embodiment, 1000 face images each are randomly selected from the CelebA and CelebA-HQ datasets, with sizes of 178×218 and 1024×1024 respectively, and these 2000 face images are augmented by Gaussian blurring, horizontal flipping, and mirror flipping, yielding 6000 content images I_C for the training set. n stick-figure-style images are randomly selected as the style images I_S of the training set. Photos of people I_T taken with a mobile phone are used as the test set.
Specifically, in this embodiment, key-point detection is performed on the content images I_T based on MTCNN. MTCNN (multi-task convolutional neural network) is roughly divided into a three-part network structure: P-Net, which quickly generates candidate bounding boxes; R-Net, which performs high-precision filtering of the candidate boxes; and O-Net, which outputs the final bounding box and the five face key points. The five key points finally obtained are the positions of the left-eye center, the right-eye center, the nose tip, the left mouth corner, and the right mouth corner: Landmark = {p_leye, p_reye, p_nose, p_lmouth, p_rmouth}.
2) Obtaining the face-aligned portrait image according to the face bounding-box information and the position coordinates of the facial-feature key points.
This is the face alignment step: the face is aligned by an affine transformation based on the position coordinates of the left- and right-eye centers among the face key points.
First, the horizontal deviation angle of the line through the two eye centers is computed from their vertical coordinates, and the image is rotated so that the two eye centers are level; then the distance between the two eyes is kept fixed by scaling.
Using the left- and right-eye center key points, affine transformation and image cropping keep the two key points horizontal and at a fixed distance from the upper image boundary, thereby aligning the face.
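The alignment step above can be sketched as follows. This is a minimal illustration, not the patent's actual implementation: the canonical left-eye position and the target inter-eye distance are assumed values, and a real pipeline would apply the resulting matrix to the image with a warp routine.

```python
import numpy as np

def eye_alignment_matrix(left_eye, right_eye, target_dist=90.0, target_left=(83.0, 110.0)):
    """Build a 2x3 affine matrix (rotation + uniform scale + translation) that
    maps the detected eye centers to level, fixed-distance canonical positions.
    target_dist and target_left are illustrative values, not from the patent."""
    lx, ly = left_eye
    rx, ry = right_eye
    angle = np.arctan2(ry - ly, rx - lx)              # horizontal deviation angle of the eye line
    scale = target_dist / np.hypot(rx - lx, ry - ly)  # scaling keeps the eye distance fixed
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    # Rotate by -angle (so the eye line becomes horizontal), then translate
    # so the left eye lands on its canonical position.
    M = np.array([[ c,  s, 0.0],
                  [-s,  c, 0.0]])
    tx, ty = M[:, :2] @ np.array([lx, ly])
    M[0, 2] = target_left[0] - tx
    M[1, 2] = target_left[1] - ty
    return M

def apply_affine(M, point):
    """Apply the 2x3 affine matrix to a 2D point."""
    return M[:, :2] @ np.asarray(point, dtype=float) + M[:, 2]
```

After this transform the two eye centers sit on the same horizontal line at the chosen distance, which is exactly the invariant the text describes.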
3) Obtaining the portrait-photo parsing mask according to the face-aligned portrait image.
4) Obtaining the background-removed portrait image according to the portrait-photo parsing mask. In this step, using the region of the mask whose class is background, the color of the portrait photo within that region is set to white, achieving background removal.
Specifically, the face-aligned portrait image is processed by a portrait parsing method to obtain a labeled parsing mask M_{m×n} = {k_{i,j} = 0, 1, ..., n}, where m×n is the same size as the detected face image and k_{i,j} = 0, 1, ..., n is the class of each pixel, including background, face, left and right eyes, and other classes.
According to the detected background region, the color of the face image there is set to white to remove the background, yielding the processed content image I_TT.
During training, this embodiment uniformly rescales all images {I_TT, I_S} to a width of 512 while preserving the aspect ratio and randomly crops 256×256 patches; during testing, all images are uniformly rescaled to a width of 512 while preserving the aspect ratio.
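The mask-based background removal and the rescale/crop steps above can be sketched as follows. This is an assumed, minimal NumPy illustration: the background label id is taken to be 0, and nearest-neighbour indexing stands in for the interpolated resize a real pipeline would use.

```python
import numpy as np

BACKGROUND = 0  # assumed label id of the "background" class in the parsing mask

def remove_background(image, mask):
    """Whiten every pixel whose parsing-mask class is background.
    image: HxWx3 uint8, mask: HxW integer labels."""
    out = image.copy()
    out[mask == BACKGROUND] = 255
    return out

def rescale_width(image, width=512):
    """Nearest-neighbour proportional rescale to a fixed width (toy stand-in
    for a real bilinear resize)."""
    h, w = image.shape[:2]
    new_h = round(h * width / w)
    rows = (np.arange(new_h) * h / new_h).astype(int)
    cols = (np.arange(width) * w / width).astype(int)
    return image[rows][:, cols]

def random_crop(image, size=256, rng=None):
    """Random size x size patch, as used for the 256*256 training patches."""
    rng = rng or np.random.default_rng(0)
    h, w = image.shape[:2]
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    return image[y:y + size, x:x + size]
```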
In S102, the stick-figure image is obtained from the preprocessed portrait image and the stick-figure-style photo through the convolutional neural network model.
Fig. 2 shows a schematic diagram of the network structure of the deep convolutional neural network model according to an embodiment of the present application.
As shown in Fig. 2, the convolutional neural network model works as follows: the high-level semantic features of the preprocessed portrait image and the stick-figure-style photo are obtained through the VGG encoder;
the high-level semantic features are fed into the adaptive instance normalization (AdaIN) module to obtain statistical features;
the statistical features are fed into the decoder to obtain an image in the stick-figure style.
The loss functions used to optimize the convolutional neural network model include a content loss function, a style loss function, a local sparse loss function, and a consistency loss function.
Specifically:
Generation steps of the convolutional neural network model: first, inspired by the AdaIN network structure, the deep convolutional neural network model obtains the high-level semantic features of the content image and the style image through the encoder; then, the last feature map of the encoder serves as the input of the adaptive instance normalization (AdaIN) module, which combines, via learned feature statistics, the content features of the preprocessed portrait image from S101 with the style features of the stick-figure-style photo; finally, the statistical features pass through the decoder and are inverted back to image space, yielding an image in the stick-figure style.
Fig. 3 shows a schematic diagram of the specific network structures of the encoder and the decoder in the deep convolutional neural network model according to an embodiment of the present application.
As shown in Fig. 3, since training an encoder from scratch consumes a great deal of time and computing power, the existing VGG network with its pre-trained weights is adopted as the encoder. The preprocessed portrait image and the stick-figure-style photo are fed into the VGG encoder separately; the encoding formulas are:
g_c = v(I_TT)                                          Formula (1)
g_s = v(I_S)                                           Formula (2)
where v(·) is the VGG encoder with pre-trained model parameters, g_c is the high-level semantic feature obtained by feeding the content image into the VGG encoder, and g_s is the high-level semantic feature obtained by feeding the style image into the VGG encoder.
The output of the first few layers of the VGG network, e.g., the result of relu4_1, serves as the output feature of the encoder, and this output feature is fed into the AdaIN module for learned feature statistics. The learned feature statistics o are:
o = AdaIN(g_c, g_s)                                       Formula (3)
where AdaIN is the adaptive instance normalization module, which learns feature statistics by combining the mean and the standard deviation. The specific AdaIN formula is:
AdaIN(g_c, g_s) = σ(g_s) · ((g_c − μ(g_c)) / σ(g_c)) + μ(g_s)                  Formula (4)
where μ(·) computes the feature mean and σ(·) computes the feature standard deviation.
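Formula (4) can be sketched in NumPy for (C, H, W) feature maps as follows; this is a minimal illustration of the operation, not the trained model (which applies it to VGG relu4_1 features):

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive instance normalization, Formula (4): normalize the content
    feature channel-wise, then rescale and shift it with the style feature's
    channel-wise standard deviation and mean. Features have shape (C, H, W)."""
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    return s_std * (content_feat - c_mean) / (c_std + eps) + s_mean
```

After this operation each channel of the output carries the style feature's mean and standard deviation while keeping the content feature's spatial structure.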
The statistical features obtained by the adaptive instance normalization (AdaIN) module are decoded and inverted back to image space.
As shown in the decoder network structure of Fig. 3, the decoder is divided into 12 blocks: the 2nd, 7th, and 10th blocks are upsampling layers; the last block consists of reflection padding and convolution (CNN); and each of the remaining blocks consists of three operations: reflection padding, convolution, and a rectified linear unit (ReLU).
The image in the stick-figure style is obtained through the decoder:
cs = d(o)                       Formula (5)
where d(·) is the decoder and cs is the image obtained through the decoder.
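One ordinary decoder block (reflection padding → convolution → ReLU) and the upsampling layer can be sketched as follows. This is a naive NumPy illustration of the block composition under assumed layer shapes, not the trained decoder; a real implementation would use a deep-learning framework's layers.

```python
import numpy as np

def reflection_pad(x, p=1):
    """Mirror-pad the spatial dims of a (C, H, W) feature map."""
    return np.pad(x, ((0, 0), (p, p), (p, p)), mode="reflect")

def conv3x3(x, weight):
    """Naive 3x3 convolution; weight has shape (C_out, C_in, 3, 3)."""
    c_out = weight.shape[0]
    _, h, w = x.shape
    out = np.zeros((c_out, h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = x[:, i:i + 3, j:j + 3]
            out[:, i, j] = (weight * patch).sum(axis=(1, 2, 3))
    return out

def upsample2x(x):
    """Nearest-neighbour 2x upsampling layer."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def decoder_block(x, weight):
    """One ordinary decoder block: reflection padding -> conv -> ReLU."""
    return np.maximum(conv3x3(reflection_pad(x), weight), 0.0)
```

Reflection padding keeps the spatial size unchanged through each 3×3 convolution, so only the three upsampling blocks grow the feature map back toward image resolution.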
Specific computation of the loss functions: multiple loss functions are combined to optimize the neural network model, as follows:
For the content loss, the content loss function L_content is computed as:
L_content = ||v(cs) − o||_2                   Formula (6)
where v(cs) is the feature obtained by feeding the image produced by the decoder back into the VGG encoder, o is the feature statistics from Formula (3), and ||·||_2 denotes the Euclidean distance between the target feature and the output-image feature.
For the style loss, the statistics of the mean and the standard deviation of the transferred style features are optimized. The style loss function L_style is:
L_style = Σ_i ||μ(φ_i(cs)) − μ(φ_i(I_S))||_2 + Σ_i ||σ(φ_i(cs)) − σ(φ_i(I_S))||_2                  Formula (7)
where each φ_i(·) denotes computing the style loss with one layer of VGG-19. This embodiment uses the relu1_1, relu2_1, relu3_1, and relu4_1 layer features with equal weights.
For the local sparse loss, each facial part is optimized separately on the basis of the existing face-structure parsing mask. The local sparse loss function L_lsparse is:
L_lsparse = ||M′ ⊙ (1 − d(o))||_1          Formula (8)
where ⊙ denotes element-wise multiplication and M′ is the label mask obtained by updating M, which has n classes in total.
In this embodiment, the regions of the eyebrows, eyes, glasses, nose, mouth, extracted face contour, and extracted background contour are all labeled 0 and all remaining regions are labeled 1, yielding M′_{m×n} of the same size as M. The aim is to sparsify the regions labeled 1 so that the generated result better fits the drawing trajectory of the drawing robot.
For the consistency loss, the consistency loss function is:
L_consist = ||d(AdaIN(g_s, g_s)) − I_S||_1           Formula (9)
where ||·||_1 denotes the distance between the two terms; it makes the generated image pixel-wise consistent with the stick-figure-style photo.
The total loss function of the neural network is finally:
L = λ_1 L_content + λ_2 L_style + λ_3 L_lsparse + λ_4 L_consist          Formula (10)
where λ_1, λ_2, λ_3, and λ_4 are user-defined weights.
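The four loss terms and their weighted combination in Formulas (6)–(10) can be sketched as follows. This is a minimal illustration: the λ weights in the example are assumed, and the feature arguments are stand-ins for the VGG activations the text describes.

```python
import numpy as np

def mean_std(f):
    """Channel-wise mean and std of a (C, H, W) feature map."""
    return f.mean(axis=(1, 2)), f.std(axis=(1, 2))

def content_loss(v_cs, o):
    """Formula (6): Euclidean distance between the re-encoded output and the AdaIN target."""
    return np.linalg.norm(v_cs - o)

def style_loss(phi_cs, phi_s):
    """Formula (7): match per-layer channel mean/std statistics.
    phi_cs / phi_s are lists of VGG-layer features of the output and the style photo."""
    loss = 0.0
    for fc, fs in zip(phi_cs, phi_s):
        mc, sc = mean_std(fc)
        ms, ss = mean_std(fs)
        loss += np.linalg.norm(mc - ms) + np.linalg.norm(sc - ss)
    return loss

def local_sparse_loss(m_prime, decoded):
    """Formula (8): L1 norm of M' * (1 - d(o)); pushes non-contour regions toward white."""
    return np.abs(m_prime * (1.0 - decoded)).sum()

def consistency_loss(reconstructed_style, style_img):
    """Formula (9): pixel distance between the style->style reconstruction and the style photo."""
    return np.abs(reconstructed_style - style_img).sum()

def total_loss(lc, ls, lsp, lcon, lambdas=(1.0, 10.0, 1.0, 1.0)):
    """Formula (10); the lambda weights here are illustrative, not from the patent."""
    l1, l2, l3, l4 = lambdas
    return l1 * lc + l2 * ls + l3 * lsp + l4 * lcon
```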
Embodiment 2
Embodiment 2 adds the following step after the stick-figure image is obtained from the preprocessed portrait image and the stick-figure-style photo through the convolutional neural network model in S102 of Embodiment 1:
S103: performing stick-figure post-processing on the stick-figure image to obtain a final stick-figure image suitable for the drawing robot.
Specifically, in S103, the stick-figure post-processing includes Gaussian blurring, adaptive binarization, and line dilation.
The stick-figure post-processing specifically includes:
inputting the stick-figure image into a low-pass filter for Gaussian blurring to obtain a Gaussian-blurred image;
obtaining a binary image from the Gaussian-blurred image with the histogram-based (Otsu) adaptive binarization method;
performing line dilation on the binary image to obtain the final stick-figure image.
To reduce redundant stray edges in the stick-figure image, the post-processing optimizes the transition from the generated stick figure to the result drawn by the drawing robot.
Specifically, a Gaussian blur is applied first. A Gaussian blur is essentially a low-pass filter: each pixel of the output image is a weighted sum of the corresponding pixel of the original image and its surrounding pixels. The low-pass filter formula is:
G(x, y) = (1 / (2πσ²)) e^{−(x² + y²) / (2σ²)}                  Formula (11)
The Gaussian-blurred image is obtained by convolving the original image matrix with the Gaussian weight matrix. Because binarizing with a fixed threshold would produce unnecessary dark blotches, this embodiment uses the histogram-based Otsu adaptive binarization method to find the optimal threshold and binarize, as follows:
① compute the normalized histogram of the input image, denoting its components by p_i, i = 0, 1, ..., l−1;
② for k = 0, 1, ..., l−1, compute the cumulative sum P_1(k) and the cumulative mean m(k);
③ compute the global gray-level mean m_G;
④ for k = 0, 1, ..., l−1, compute the between-class variance σ_B²(k) = [m_G P_1(k) − m(k)]² / [P_1(k)(1 − P_1(k))];
⑤ obtain the Otsu threshold k*, i.e., the value of k that maximizes the between-class variance; if the maximum is not unique, k* is taken as the average of the maximizing values of k, from which the separability measure η* is obtained.
The histogram-based Otsu adaptive binarization yields a binary image whose foreground pixels are black and whose background pixels are white.
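Steps ① to ⑤ can be sketched directly in NumPy:

```python
import numpy as np

def otsu_threshold(gray, levels=256):
    """Otsu's adaptive threshold following steps 1-5 above: normalized
    histogram, cumulative sums/means, global mean, between-class variance,
    and the argmax (averaged when the maximum is not unique)."""
    hist = np.bincount(gray.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()                      # step 1: normalized histogram
    p1 = np.cumsum(p)                          # step 2: cumulative sum P1(k)
    m = np.cumsum(np.arange(levels) * p)       # step 2: cumulative mean m(k)
    m_g = m[-1]                                # step 3: global mean mG
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (m_g * p1 - m) ** 2 / (p1 * (1.0 - p1))  # step 4
    sigma_b = np.nan_to_num(sigma_b)
    best = np.flatnonzero(sigma_b == sigma_b.max())
    return int(best.mean())                    # step 5: average if not unique

def binarize(gray, levels=256):
    """Black foreground (strokes) on a white background, as in the text."""
    k = otsu_threshold(gray, levels)
    return np.where(gray <= k, 0, 255).astype(np.uint8)
```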
Finally, line dilation is applied to the binary image. The dilation formula is:
[f ⊕ b](x, y) = max_{(s,t)∈b} f(x − s, y − t)                  Formula (12)
where f is the binary image and b is the convolution template (structuring element); the dilation of the image by b at any position (x, y) is defined as the maximum value of f over the region where b overlaps the image.
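The dilation above can be sketched as a naive NumPy implementation. Here foreground ink is encoded as 1 (i.e., the black-on-white binary image is assumed to be inverted first), which is an implementation choice rather than something the patent specifies:

```python
import numpy as np

def dilate(f, b):
    """Morphological dilation: the output at (x, y) is the maximum of f over
    the neighbourhood selected by the template b. f is a binary image with
    foreground 1; b is a small 0/1 structuring element (e.g. 3x3 of ones)."""
    bh, bw = b.shape
    ph, pw = bh // 2, bw // 2
    padded = np.pad(f, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros_like(f)
    h, w = f.shape
    for y in range(h):
        for x in range(w):
            window = padded[y:y + bh, x:x + bw]
            out[y, x] = (window * b).max()   # max of f where b overlaps
    return out
```

Dilating with an all-ones 3×3 template thickens every stroke by one pixel on each side, which is what makes the lines continuous and solid enough for the robot's pen.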
After this stick-figure post-processing, a stick-figure image is finally obtained from which the drawing robot can draw continuous, smooth, non-hollow lines.
Embodiment 3
Fig. 4 shows a schematic structural diagram of a portrait stick-figure generation system according to an embodiment of the present application.
As shown in Fig. 4, a stick-figure generation system based on portrait photos specifically includes:
a portrait-photo preprocessing module 10, configured to perform image preprocessing on a portrait photo to obtain a preprocessed portrait image;
a stick-figure generation module 20, configured to obtain a stick-figure image from the preprocessed portrait image and a stick-figure-style photo through a convolutional neural network model.
Specifically, the portrait-photo preprocessing module 10 includes:
a face key-point detection model, configured to perform face bounding-box and facial-feature key-point detection on the portrait photo to obtain face bounding-box information and the position coordinates of the facial-feature key points;
a face alignment unit, configured to obtain a face-aligned portrait image according to the face bounding-box information and the position coordinates of the facial-feature key points;
a face parsing model, configured to obtain a portrait-photo parsing mask according to the face-aligned portrait image;
an image background-removal unit, configured to obtain a background-removed portrait image according to the portrait-photo parsing mask.
图5中示出了根据本申请另一实施例的一种肖像简笔画生成系统的设计流程示意图。Fig. 5 shows a schematic diagram of the design process of a system for generating simple strokes of portraits according to another embodiment of the present application.
如图5所示实施例的肖像简笔画生成系统,增加了简笔画后处理模块。The portrait stick figure generation system of the embodiment shown in FIG. 5 adds a stick figure post-processing module.
具体的,简笔画后处理模块根据简笔画图像进行简笔画后处理得到最终简笔画图像,简笔画后处理包括高斯模糊处理、自适应二值化处理以及线条膨胀处理。Specifically, the stick figure post-processing module performs stick figure post-processing according to the stick figure image to obtain the final stick figure image. The stick figure post-processing includes Gaussian blur processing, adaptive binarization processing, and line expansion processing.
本申请实施例中的肖像简笔画生成方法、系统及绘画机器人,通过根据肖像照片进行图像预处理得到预处理肖像图像;然后根据预处理肖像图像以及简笔画风格照片通过卷积神经网络模型得到简笔画图像,实现了能够快速将肖像照片生成高质量的简笔画,并适用于绘画机器人,可在短的时间内绘制出肖像简笔画。解决了现有技术的简笔画生成方法不能很好地应用于绘画机器人绘画出生动形象的肖像简笔画的问题。The portrait stick figure generation method, system and painting robot in the embodiments of this application obtain the pre-processed portrait image by image preprocessing according to the portrait photo; then according to the pre-processed portrait image and stick figure style photo, the convolutional neural network model is used to obtain the simple figure. The stroke image realizes the ability to quickly generate high-quality stick figures from portrait photos, and is suitable for painting robots, which can draw portrait stick figures in a short time. This solves the problem that the stick figure generation method of the prior art cannot be well applied to the portrait stick figure of the animated figure drawn by the painting robot.
可以通过肖像照片进行人脸识别、人脸切割等操作,然后进行肖像-简笔画风格转换,本申请采用的简笔画生成模型生成的各个部分的细节更加丰富。具体地,通过内容图像和风格图像之间的特征统计、局部稀疏的约束以及后期的处理,使得生成的人物肖像简笔画的各个细节相比于基于规则生成或者直接全局生成的方法更加丰富。Operations such as face recognition and face cutting can be performed through portrait photos, and then portrait-to-stick stroke style conversion. The stick figure generation model used in this application has richer details in each part generated by the stick figure generation model. Specifically, through the feature statistics between the content image and the style image, local sparse constraints, and post-processing, the details of the generated character portrait stick figure are more abundant than the method based on rule generation or direct global generation.
Specifically, during portrait-to-stick-figure style transfer, the stick-figure generation model adopted in this application supports multiple stick-figure styles, and is robust both in adapting to those styles and in preserving the identity details of the subject.
After the portrait-to-stick-figure style transfer, for presentation this application integrates the algorithm into a drawing robot that can quickly generate portrait stick-figure images, meeting the needs of family companionship.
This embodiment also provides a drawing robot, which specifically includes a processor, a communication module, a camera module, and a drawing execution module, where the processor can execute the portrait stick figure generation method described above.
Based on the same inventive concept, the embodiments of this application also provide a computer program product. Since the principle by which the computer program product solves the problem is similar to the method provided in the first embodiment of this application, its implementation can refer to the implementation of that method, and repeated details are not described again.
Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Although preferred embodiments of this application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of this application.
Obviously, those skilled in the art can make various changes and variations to this application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of this application and their technical equivalents, this application is also intended to include them.

Claims (10)

  1. A portrait stick figure generation method, characterized in that it comprises the following steps:
    performing image preprocessing on a portrait photo to obtain a preprocessed portrait image;
    obtaining a stick-figure image from the preprocessed portrait image and a stick-figure style photo through a convolutional neural network model, wherein the convolutional neural network model is configured to:
    obtain high-level semantic features of the preprocessed portrait image and the stick-figure style photo through an encoder;
    input the high-level semantic features into an adaptive instance normalization module to obtain statistical features; and
    input the statistical features into a decoder to obtain an image with a stick-figure style.
  2. The portrait stick figure generation method according to claim 1, wherein image preprocessing is performed on the portrait photo to obtain the preprocessed portrait image, the image preprocessing specifically comprising:
    performing face bounding-box and facial-landmark detection on the portrait photo to obtain face bounding-box information and the position coordinates of the facial landmarks;
    obtaining a face-aligned portrait image according to the face bounding-box information and the position coordinates of the facial landmarks;
    obtaining a portrait photo parsing mask from the face-aligned portrait image; and
    obtaining a background-removed portrait image according to the portrait photo parsing mask.
  3. The portrait stick figure generation method according to claim 1, wherein the encoder is a VGG encoder;
    the adaptive instance normalization module adopts the AdaIN network structure; and
    the decoder adopts the AdaIN network structure.
  4. The portrait stick figure generation method according to claim 1, wherein the loss functions used to optimize the convolutional neural network model include a content loss function, a style loss function, a local sparsity loss function, and a consistency loss function.
  5. The portrait stick figure generation method according to claim 1, characterized in that, after the stick-figure image is obtained from the preprocessed portrait image through the convolutional neural network model, the method further comprises:
    performing stick-figure post-processing on the stick-figure image to obtain a final stick-figure image suitable for a drawing robot.
  6. The portrait stick figure generation method according to claim 5, wherein the stick-figure post-processing includes Gaussian blurring, adaptive binarization, and line dilation.
  7. The portrait stick figure generation method according to claim 5 or 6, wherein the stick-figure post-processing specifically comprises:
    inputting the stick-figure image into a low-pass filter for Gaussian blurring to obtain a Gaussian-blurred image;
    obtaining a binary image from the Gaussian-blurred image using an adaptive binarization method with histogram equalization; and
    performing line dilation on the binary image to obtain the final stick-figure image.
  8. A portrait stick figure generation system, characterized in that it specifically comprises:
    a portrait photo preprocessing module, configured to perform image preprocessing on a portrait photo to obtain a preprocessed portrait image; and
    a stick-figure generation module, configured to obtain a stick-figure image from the preprocessed portrait image and a stick-figure style photo through a convolutional neural network model,
    wherein the convolutional neural network model is configured to:
    obtain high-level semantic features of the preprocessed portrait image and the stick-figure style photo through a VGG encoder;
    input the high-level semantic features into an adaptive instance normalization (AdaIN) module to obtain statistical features; and
    input the statistical features into a decoder to obtain an image with a stick-figure style.
  9. The portrait stick figure generation system according to claim 8, wherein the portrait photo preprocessing module comprises:
    a face landmark detection model, configured to perform face bounding-box and facial-landmark detection on the portrait photo to obtain face bounding-box information and the position coordinates of the facial landmarks;
    a face alignment unit, configured to obtain a face-aligned portrait image according to the face bounding-box information and the position coordinates of the facial landmarks;
    a face parsing model, configured to obtain a portrait photo parsing mask from the face-aligned portrait image; and
    an image background removal unit, configured to obtain a background-removed portrait image according to the portrait photo parsing mask.
  10. A drawing robot, characterized by comprising: a processor, a communication module, a camera module, and a drawing execution module, wherein the processor can execute the portrait stick figure generation method according to any one of claims 1 to 6.
PCT/CN2020/140335 2020-01-08 2020-12-28 Portrait stick figure generation method and system, and drawing robot WO2021139557A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010016519.4 2020-01-08
CN202010016519.4A CN111243050B (en) 2020-01-08 2020-01-08 Portrait simple drawing figure generation method and system and painting robot

Publications (1)

Publication Number Publication Date
WO2021139557A1 true WO2021139557A1 (en) 2021-07-15

Family

ID=70879943

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/140335 WO2021139557A1 (en) 2020-01-08 2020-12-28 Portrait stick figure generation method and system, and drawing robot

Country Status (2)

Country Link
CN (1) CN111243050B (en)
WO (1) WO2021139557A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763498A (en) * 2021-08-11 2021-12-07 杭州妙绘科技有限公司 Portrait simple-stroke region self-adaptive color matching method and system for industrial manufacturing
CN114582002A (en) * 2022-04-18 2022-06-03 华南理工大学 Facial expression recognition method combining attention module and second-order pooling mechanism
CN116503524A (en) * 2023-04-11 2023-07-28 广州赛灵力科技有限公司 Virtual image generation method, system, device and storage medium
CN117745904A (en) * 2023-12-14 2024-03-22 山东浪潮超高清智能科技有限公司 2D playground speaking portrait synthesizing method and device

Families Citing this family (6)

Publication number Priority date Publication date Assignee Title
CN111243050B (en) * 2020-01-08 2024-02-27 杭州未名信科科技有限公司 Portrait simple drawing figure generation method and system and painting robot
CN111833420B (en) * 2020-07-07 2023-06-30 北京奇艺世纪科技有限公司 True person-based automatic drawing generation method, device and system and storage medium
CN112991358A (en) * 2020-09-30 2021-06-18 北京字节跳动网络技术有限公司 Method for generating style image, method, device, equipment and medium for training model
CN113223103A (en) * 2021-02-02 2021-08-06 杭州妙绘科技有限公司 Method, device, electronic device and medium for generating sketch
CN113658291A (en) * 2021-08-17 2021-11-16 青岛鱼之乐教育科技有限公司 Automatic rendering method of simplified strokes
CN117333580B (en) * 2023-10-18 2024-08-02 北京阿派朗创造力科技有限公司 Mechanical arm drawing method and device, electronic equipment and storage medium

Citations (6)

Publication number Priority date Publication date Assignee Title
CN102096934A (en) * 2011-01-27 2011-06-15 电子科技大学 Human face cartoon generating method based on machine learning
CN106873893A (en) * 2017-02-13 2017-06-20 北京光年无限科技有限公司 For the multi-modal exchange method and device of intelligent robot
CN107945244A (en) * 2017-12-29 2018-04-20 哈尔滨拓思科技有限公司 A kind of simple picture generation method based on human face photo
CN109147010A (en) * 2018-08-22 2019-01-04 广东工业大学 Band attribute Face image synthesis method, apparatus, system and readable storage medium storing program for executing
US20190354791A1 (en) * 2018-05-17 2019-11-21 Idemia Identity & Security France Character recognition method
CN111243050A (en) * 2020-01-08 2020-06-05 浙江省北大信息技术高等研究院 Portrait simple stroke generation method and system and drawing robot

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN1710608A (en) * 2005-07-07 2005-12-21 上海交通大学 Picture processing method for robot drawing human-face cartoon
CN108596024B (en) * 2018-03-13 2021-05-04 杭州电子科技大学 Portrait generation method based on face structure information
CN109741247B (en) * 2018-12-29 2020-04-21 四川大学 Portrait cartoon generating method based on neural network
CN109766895A (en) * 2019-01-03 2019-05-17 京东方科技集团股份有限公司 The training method and image Style Transfer method of convolutional neural networks for image Style Transfer
CN110490791B (en) * 2019-07-10 2022-10-18 西安理工大学 Clothing image artistic generation method based on deep learning style migration
CN110570377A (en) * 2019-09-11 2019-12-13 辽宁工程技术大学 group normalization-based rapid image style migration method

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
CN102096934A (en) * 2011-01-27 2011-06-15 电子科技大学 Human face cartoon generating method based on machine learning
CN106873893A (en) * 2017-02-13 2017-06-20 北京光年无限科技有限公司 For the multi-modal exchange method and device of intelligent robot
CN107945244A (en) * 2017-12-29 2018-04-20 哈尔滨拓思科技有限公司 A kind of simple picture generation method based on human face photo
US20190354791A1 (en) * 2018-05-17 2019-11-21 Idemia Identity & Security France Character recognition method
CN109147010A (en) * 2018-08-22 2019-01-04 广东工业大学 Band attribute Face image synthesis method, apparatus, system and readable storage medium storing program for executing
CN111243050A (en) * 2020-01-08 2020-06-05 浙江省北大信息技术高等研究院 Portrait simple stroke generation method and system and drawing robot

Cited By (6)

Publication number Priority date Publication date Assignee Title
CN113763498A (en) * 2021-08-11 2021-12-07 杭州妙绘科技有限公司 Portrait simple-stroke region self-adaptive color matching method and system for industrial manufacturing
CN113763498B (en) * 2021-08-11 2024-04-26 杭州妙绘科技有限公司 Industrial manufacturing-oriented portrait simple drawing region self-adaptive color matching method and system
CN114582002A (en) * 2022-04-18 2022-06-03 华南理工大学 Facial expression recognition method combining attention module and second-order pooling mechanism
CN116503524A (en) * 2023-04-11 2023-07-28 广州赛灵力科技有限公司 Virtual image generation method, system, device and storage medium
CN116503524B (en) * 2023-04-11 2024-04-12 广州赛灵力科技有限公司 Virtual image generation method, system, device and storage medium
CN117745904A (en) * 2023-12-14 2024-03-22 山东浪潮超高清智能科技有限公司 2D playground speaking portrait synthesizing method and device

Also Published As

Publication number Publication date
CN111243050B (en) 2024-02-27
CN111243050A (en) 2020-06-05

Similar Documents

Publication Publication Date Title
WO2021139557A1 (en) Portrait stick figure generation method and system, and drawing robot
US12039454B2 (en) Microexpression-based image recognition method and apparatus, and related device
CN112800903B (en) Dynamic expression recognition method and system based on space-time diagram convolutional neural network
US20220261968A1 (en) Image optimization method and apparatus, computer storage medium, and electronic device
CN106960202B (en) Smiling face identification method based on visible light and infrared image fusion
WO2018121777A1 (en) Face detection method and apparatus, and electronic device
CN108875935B (en) Natural image target material visual characteristic mapping method based on generation countermeasure network
US8692830B2 (en) Automatic avatar creation
WO2023050992A1 (en) Network training method and apparatus for facial reconstruction, and device and storage medium
CN110660037A (en) Method, apparatus, system and computer program product for face exchange between images
US20230081982A1 (en) Image processing method and apparatus, computer device, storage medium, and computer program product
Li et al. Learning symmetry consistent deep cnns for face completion
CN111680550B (en) Emotion information identification method and device, storage medium and computer equipment
CN111243051B (en) Portrait photo-based simple drawing generation method, system and storage medium
WO2024001095A1 (en) Facial expression recognition method, terminal device and storage medium
US20230082715A1 (en) Method for training image processing model, image processing method, apparatus, electronic device, and computer program product
Mirani et al. Object recognition in different lighting conditions at various angles by deep learning method
US10803677B2 (en) Method and system of automated facial morphing for eyebrow hair and face color detection
CN113313631B (en) Image rendering method and device
Qu et al. Facial expression recognition based on deep residual network
CN113076916A (en) Dynamic facial expression recognition method and system based on geometric feature weighted fusion
Shukla et al. Deep Learning Model to Identify Hide Images using CNN Algorithm
CN111160327A (en) Expression recognition method based on lightweight convolutional neural network
CN116681579A (en) Real-time video face replacement method, medium and system
CN117152352A (en) Image processing method, deep learning model training method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20912669

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20912669

Country of ref document: EP

Kind code of ref document: A1
