CN112581358A - Training method of image processing model, image processing method and device - Google Patents

Training method of image processing model, image processing method and device

Info

Publication number
CN112581358A
Authority
CN
China
Prior art keywords
image
hair
sample
style
processing model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011497173.0A
Other languages
Chinese (zh)
Other versions
CN112581358B (en)
Inventor
方轲
宋丛礼
郭益林
郑文
万鹏飞
黄星
张知行
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202011497173.0A priority Critical patent/CN112581358B/en
Publication of CN112581358A publication Critical patent/CN112581358A/en
Application granted granted Critical
Publication of CN112581358B publication Critical patent/CN112581358B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure provides a training method and apparatus for an image processing model, an electronic device, and a storage medium, and belongs to the field of multimedia technologies. The method includes the following steps: inputting a first hair image corresponding to a first sample face image into a hair line migration network in an image processing model to obtain a second hair image; acquiring a third hair image corresponding to a first sample style image; and inputting the first hair image, the second hair image, and the third hair image into a style image generation network to be trained in the image processing model, and performing iterative training on the style image generation network to be trained with the third hair image as supervision information to obtain a trained image processing model. In this scheme, a hair image containing enhanced hair edge information is added to the training process of the style image generation network, so that the style image output by the image processing model includes more lines representing hair, which improves the accuracy of the style image.

Description

Training method of image processing model, image processing method and device
Technical Field
The present disclosure relates to the field of multimedia technologies, and in particular, to a training method for an image processing model, an image processing method, and an image processing apparatus.
Background
With the development of multimedia technology, users increasingly need to convert face images captured in real scenes into images of different styles, for example into hand-drawing, cartoon, or realistic styles.
At present, to meet this need, an image processing model is trained with a face image as the model input and a style image drawn by a professional from that face image as the training target, so that the trained image processing model can output a style image of the corresponding style for an input face image.
This approach has the problem that, compared with a style image drawn by a professional, the hair in the style image generated by the image processing model lacks lines representing hair, so the accuracy of the style image output by the image processing model is low.
Disclosure of Invention
The present disclosure provides a training method of an image processing model, an image processing method, and an apparatus, in which a hair image containing enhanced hair edge information is added to the training process of a style image generation network, so that the style image output by the image processing model includes more lines representing hair, which improves the accuracy of the style image. The technical solution of the present disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a training method of an image processing model, including:
inputting a first hair image corresponding to a first sample face image into a hair line migration network in an image processing model to obtain a second hair image, wherein the first hair image comprises first hair edge information in the first sample face image, and the second hair image comprises second hair edge information obtained by enhancing the first hair edge information based on the hair line migration network;
acquiring a third hair image corresponding to a first sample style image, wherein the first sample style image and the first sample face image comprise the same content with different styles, and the third hair image comprises third hair edge information in the first sample style image;
inputting the first hair image, the second hair image and the third hair image into a style image generation network to be trained in the image processing model, and performing iterative training on the style image generation network to be trained by taking the third hair image as supervision information to obtain a trained image processing model.
In some embodiments, the inputting the first hair image corresponding to the first sample face image into the hair line migration network in the image processing model to obtain the second hair image includes:
inputting a first hair image corresponding to the first sample face image into a hair line migration network in an image processing model to obtain hair line enhancement information and hair direction information of the first sample face image;
and adding the hair line enhancement information and the hair direction information in the first hair image to obtain the second hair image.
In some embodiments, the training of the hair line migration network comprises:
inputting a first sample hair image corresponding to a second sample face image into a hair line migration network to be trained to obtain a first output hair image, wherein the first sample hair image comprises hair edge information and hair direction information of hair in the second sample face image;
determining image loss according to the first output hair image and a second sample hair image, wherein the second sample hair image comprises hair edge information and hair direction information of hair in a second sample style image corresponding to the second sample face image;
and training the hair line migration network to be trained according to the image loss.
In some embodiments, said determining image loss from said first output hair image and second sample hair image comprises:
extracting a sample hair line and a sample hair direction from the second sample hair image;
determining hair line loss according to the distance between the pixel points included by the sample hair lines and the pixel points included by the corresponding hair lines in the first output hair image;
determining hair direction loss according to the included angle between the sample hair direction and the corresponding hair direction in the first output hair image;
and obtaining the image loss according to the hair line loss and the hair direction loss.
In some embodiments, before the first sample hair image corresponding to the second sample face image is input to the hair line migration network to be trained to obtain the first output hair image, the method further includes:
acquiring a third sample hair image according to the second sample face image, wherein the third sample hair image comprises hair edge information of hair in the second sample face image;
inputting the third sample hair image into a hair direction network in the image processing model to obtain hair direction characteristics;
and fusing the hair direction characteristic and the third sample hair image to obtain the first sample hair image.
In some embodiments, the inputting the first hair image corresponding to the first sample face image into the hair line migration network in the image processing model to obtain the second hair image includes:
inputting a first hair image corresponding to the first sample face image into a hair line migration network in an image processing model to obtain hair line enhancement information of the first sample face image;
and adding the hair line enhancement information in the first hair image to obtain the second hair image.
In some embodiments, the training of the hair line migration network comprises:
inputting a fourth sample hair image corresponding to a third sample face image into a hair line migration network to be trained to obtain a second output hair image, wherein the fourth sample hair image comprises hair edge information of hair in the third sample face image;
determining hair line loss according to the second output hair image and a fifth sample hair image, wherein the fifth sample hair image comprises hair edge information of hair in a third sample style image corresponding to the third sample face image;
and training the hair line migration network to be trained according to the hair line loss.
In some embodiments, said determining hair line loss from said second output hair image and a fifth sample hair image comprises:
extracting a sample hair line from the fifth sample hair image;
and determining the hair line loss according to the distance between the pixel point included by the sample hair line and the pixel point included by the corresponding hair line in the second output hair image.
In some embodiments, the inputting the first hair image, the second hair image, and the third hair image into a style image generation network to be trained in the image processing model, and performing iterative training on the style image generation network to be trained by using the third hair image as supervision information to obtain a trained image processing model includes:
inputting the first sample face image, the second hair image and the third hair image into the style image generation network to be trained to obtain a style result image;
acquiring a fourth hair image of the style result image, wherein the fourth hair image comprises hair edge information in the style result image;
and adjusting parameters of the style image generation network to be trained according to the difference between the third hair image and the fourth hair image to obtain a trained image processing model.
According to a second aspect of the embodiments of the present disclosure, there is provided an image processing method including:
inputting the hair image to be processed corresponding to the image to be processed into the image processing model obtained by training according to the first aspect;
obtaining an intermediate image based on the hair line migration network in the image processing model, wherein the intermediate image comprises fifth hair edge information obtained by enhancing fourth hair edge information in the hair image to be processed based on the hair line migration network;
and inputting the image to be processed and the intermediate image into a style image generation network in the image processing model to obtain a target style image, wherein the target style image and the image to be processed include the same content with different styles.
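As a rough illustration only, the sketch below outlines this inference flow in PyTorch; the attribute names line_migration_net and style_gen_net, the helper stylize, and the channel-concatenation of the two inputs are assumptions for illustration and are not specified by the disclosure.

```python
import torch

def stylize(model, image_to_process, hair_image_to_process):
    # Hair line migration network: enhances the hair edge information of the
    # hair image to be processed, yielding the intermediate image.
    intermediate = model.line_migration_net(hair_image_to_process)
    # Style image generation network: consumes the image to be processed together
    # with the intermediate image and outputs the target style image.
    return model.style_gen_net(torch.cat([image_to_process, intermediate], dim=1))
```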
According to a third aspect of the embodiments of the present disclosure, there is provided a training apparatus for an image processing model, including:
the line migration unit is configured to input a first hair image corresponding to a first sample face image into a hair line migration network in an image processing model to obtain a second hair image, wherein the first hair image comprises first hair edge information in the first sample face image, and the second hair image comprises second hair edge information obtained by enhancing the first hair edge information based on the hair line migration network;
an image acquisition unit configured to acquire a third hair image corresponding to a first sample style image, wherein the first sample style image and the first sample face image include the same content with different styles, and the third hair image includes third hair edge information in the first sample style image;
and the network training unit is configured to input the first hair image, the second hair image and the third hair image into a style image generation network to be trained in the image processing model, and perform iterative training on the style image generation network to be trained by taking the third hair image as supervision information to obtain a trained image processing model.
In an optional implementation manner, the line migration unit is configured to input a first hair image corresponding to the first sample face image into the hair line migration network in the image processing model to obtain hair line enhancement information and hair direction information of the first sample face image, and to add the hair line enhancement information and the hair direction information to the first hair image to obtain the second hair image.
In an alternative implementation, the step of training the hair line migration network includes:
inputting a first sample hair image corresponding to a second sample face image into a hair line migration network to be trained to obtain a first output hair image, wherein the first sample hair image comprises hair edge information and hair direction information of hair in the second sample face image;
determining image loss according to the first output hair image and a second sample hair image, wherein the second sample hair image comprises hair edge information and hair direction information of hair in a second sample style image corresponding to the second sample face image;
and training the hair line migration network to be trained according to the image loss.
In an alternative implementation, the determining an image loss according to the first output hair image and the second sample hair image includes:
extracting a sample hair line and a sample hair direction from the second sample hair image;
determining hair line loss according to the distance between the pixel points included by the sample hair lines and the pixel points included by the corresponding hair lines in the first output hair image;
determining hair direction loss according to the included angle between the sample hair direction and the corresponding hair direction in the first output hair image;
and obtaining the image loss according to the hair line loss and the hair direction loss.
In an optional implementation manner, before the inputting the first sample hair image corresponding to the second sample face image into the hair line migration network to be trained to obtain the first output hair image, the method further includes:
acquiring a third sample hair image according to the second sample face image, wherein the third sample hair image comprises hair edge information of hair in the second sample face image;
inputting the third sample hair image into a hair direction network in the image processing model to obtain hair direction characteristics;
and fusing the hair direction characteristic and the third sample hair image to obtain the first sample hair image.
In an optional implementation manner, the line migration unit is configured to input a first hair image corresponding to the first sample face image into the hair line migration network in the image processing model to obtain hair line enhancement information of the first sample face image, and to add the hair line enhancement information to the first hair image to obtain the second hair image.
In an alternative implementation, the step of training the hair line migration network includes:
inputting a fourth sample hair image corresponding to a third sample face image into a hair line migration network to be trained to obtain a second output hair image, wherein the fourth sample hair image comprises hair edge information of hair in the third sample face image;
determining hair line loss according to the second output hair image and a fifth sample hair image, wherein the fifth sample hair image comprises hair edge information of hair in a third sample style image corresponding to the third sample face image;
and training the hair line migration network to be trained according to the hair line loss.
In an optional implementation, the determining a hair line loss according to the second output hair image and the fifth sample hair image includes:
extracting a sample hair line from the fifth sample hair image;
and determining the hair line loss according to the distance between the pixel point included by the sample hair line and the pixel point included by the corresponding hair line in the second output hair image.
In an optional implementation manner, the network training unit is configured to input the first sample face image, the second hair image and the third hair image into the style image generation network to be trained to obtain a style result image; acquire a fourth hair image of the style result image, wherein the fourth hair image comprises hair edge information in the style result image; and adjust parameters of the style image generation network to be trained according to the difference between the third hair image and the fourth hair image to obtain a trained image processing model.
According to a fourth aspect of the embodiments of the present disclosure, there is provided an image processing apparatus including:
an input unit configured to input a hair image to be processed corresponding to an image to be processed into the image processing model trained according to the first aspect;
a first processing unit configured to obtain an intermediate image based on the hair line migration network in the image processing model, wherein the intermediate image includes fifth hair edge information obtained by enhancing fourth hair edge information in the hair image to be processed based on the hair line migration network;
a second processing unit configured to input the image to be processed and the intermediate image into the style image generation network in the image processing model to obtain a target style image, wherein the target style image and the image to be processed include the same content with different styles.
According to a fifth aspect of embodiments of the present disclosure, there is provided an electronic apparatus including:
one or more processors;
a memory for storing the processor executable program code;
wherein the processor is configured to execute the program code to implement the above training method of an image processing model or the above image processing method.
According to a sixth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein when program code in the computer-readable storage medium is executed by a processor of an electronic device, the electronic device is enabled to execute the above-mentioned training method of an image processing model, or an image processing method.
According to a seventh aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the above training method of an image processing model or the above image processing method.
The technical scheme provided by the embodiment of the disclosure at least has the following beneficial effects:
the embodiment of the disclosure provides a training method of an image processing model, which adds a hair image comprising enhanced hair edge information in the training process of a style image generation network, so that the style image output by the image processing model comprises more lines representing hair, and the accuracy of the style image is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a schematic diagram illustrating an implementation environment of a method for training an image processing model according to an exemplary embodiment.
FIG. 2 is a flow diagram illustrating a method of training an image processing model according to an exemplary embodiment.
FIG. 3 is a flow diagram illustrating another method of training an image processing model in accordance with an exemplary embodiment.
FIG. 4 illustrates a comparative schematic of a hair image according to an exemplary embodiment.
Fig. 5 is a graph illustrating a comparison of the output effects of a hair line migration network, according to an exemplary embodiment.
Fig. 6 is a schematic diagram illustrating a hair direction according to an exemplary embodiment.
FIG. 7 illustrates an output effect contrast diagram of another hair line migration model, according to an exemplary embodiment.
FIG. 8 is a flowchart illustrating another method of training an image processing model according to an exemplary embodiment.
FIG. 9 is a flow diagram illustrating an image processing method according to an exemplary embodiment.
FIG. 10 is a block diagram illustrating an apparatus for training an image processing model according to an exemplary embodiment.
Fig. 11 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment.
Fig. 12 is a block diagram illustrating a terminal according to an example embodiment.
FIG. 13 is a block diagram illustrating a server in accordance with an exemplary embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the following claims.
The user information to which the present disclosure relates may be information authorized by the user or sufficiently authorized by each party.
The electronic device may be provided as a terminal or a server, and when the electronic device is provided as a terminal, the terminal implements operations performed by the training method of the image processing model; when provided as a server, the server implements operations performed by a training method of an image processing model; or an operation performed by a training method in which the server and the terminal interact to implement the image processing model. Alternatively, when the electronic device is provided as a terminal, the terminal implements an operation performed by the image processing method; when provided as a server, the server implements operations performed by the image processing method; or the server and the terminal interact to implement the operations performed by the image processing method.
FIG. 1 is a schematic diagram illustrating an implementation environment of a method for training an image processing model according to an exemplary embodiment. Taking the electronic device as an example provided as a server, referring to fig. 1, the implementation environment specifically includes: a terminal 101 and a server 102.
In some embodiments, the terminal 101 is a smartphone, a smartwatch, a desktop computer, a laptop computer, an MP3 player, an MP4 player, or the like. An application program is installed and run on the terminal 101, and a user can log in to the application program through the terminal 101 to obtain the service provided by the application program; optionally, the application program is a gallery application program, a camera application program, a social application program, or the like, which is not limited in this disclosure. The terminal 101 can be connected to the server 102 through a wireless network or a wired network. Optionally, the terminal 101 is configured to implement the image processing method.
In some embodiments, the terminal 101 generally refers to one of a plurality of terminals, and this embodiment is illustrated only with the terminal 101. Those skilled in the art will appreciate that the number of terminals may be greater or fewer; for example, there may be only a few terminals, or tens, hundreds, or more. The number of terminals and the device types are not limited in the embodiments of the present disclosure.
In some embodiments, the server 102 is a single server, a plurality of servers, a cloud computing platform, or a virtualization center. The server 102 is configured to provide a background service for the application program installed on the terminal 101. Optionally, the number of servers may be greater or smaller, which is not limited in the embodiments of the present disclosure. Of course, the server 102 can also include other functional servers, such as a database server, a user management server, a model management server, and the like, in order to provide more comprehensive and diversified services.
For example, the server 102 obtains the image processing model through training by using the training method for the image processing model provided by the embodiment of the present disclosure, and then sends the image processing model to the terminal 101, and the terminal 101 calls the image processing model through an application program to process the image to be processed, so as to obtain the target style image, such as a hand-drawing style image, a cartoon style image, a realistic style image, and the like.
Fig. 2 is a flowchart illustrating a training method of an image processing model according to an exemplary embodiment. Referring to Fig. 2, the method is applied to an electronic device and includes the following steps:
in step S201, a first hair image corresponding to a first sample human face image is input to a hair line migration network in an image processing model, so as to obtain a second hair image, where the first hair image includes first hair edge information in the first sample human face image, and the second hair image includes second hair edge information enhanced by the first hair edge information based on the hair line migration network.
In the embodiment of the present disclosure, the electronic device can perform segmentation processing on the first sample face image to obtain a hair region representing hair, then perform edge extraction on the hair region, and extract hair lines in the hair region to obtain the first hair image. Then, the electronic device enhances the first hair edge information included in the first hair image based on the hair line migration network to obtain the second hair image. The hair line migration network is a network in the image processing model. Optionally, the hair line migration network is trained separately, or trained together with the style image generation network in the image processing model.
In step S202, a third hair image corresponding to a first sample style image is acquired, the first sample style image and the first sample face image include the same content with different styles, and the third hair image includes third hair edge information in the first sample style image.
In the embodiment of the present disclosure, the electronic device is able to acquire a first sample style image that includes the same content as the first sample face image but with a different style, the first sample style image being drawn by a professional painter from the first sample face image. For example, the first sample face image is a female face, and a professional painter draws a first sample style image having a hand-drawing style from the female face. The electronic device obtains the third hair image from the first sample style image in the same processing manner as the first hair image is obtained from the first sample face image.
In step S203, the first hair image, the second hair image, and the third hair image are input into the style image generation network to be trained in the image processing model, and the style image generation network to be trained is iteratively trained by using the third hair image as the monitoring information, so as to obtain the trained image processing model.
In the embodiment of the present disclosure, taking one iteration as an example, the electronic device inputs the first sample face image, the second hair image, and the third hair image into the image processing model to train the style image generation network corresponding to this iteration, and adjusts parameters of the style image generation network corresponding to this iteration according to the difference between the result output by the style image generation network and the third hair image serving as the supervision information, so as to obtain the trained image processing model.
According to the scheme provided by the embodiment of the disclosure, the hair image comprising the enhanced hair edge information is added in the training process of the style image generation network, so that the style image output by the image processing model comprises more lines representing hair, and the accuracy of the style image is improved.
In an alternative implementation manner, the inputting a first hair image corresponding to a first sample face image into a hair line migration network in an image processing model to obtain a second hair image includes:
inputting a first hair image corresponding to the first sample face image into a hair line migration network in an image processing model to obtain hair line enhancement information and hair direction information of the first sample face image;
and adding the hair line enhancement information and the hair direction information in the first hair image to obtain the second hair image.
The first hair image corresponding to the first sample face image is further processed based on the hair line migration network, and hair line enhancement information and hair direction information are added to the first hair image, so that a second hair image containing more and longer hair lines with a more realistic hair direction is obtained.
In an alternative implementation, the step of training the hair line migration network includes:
inputting a first sample hair image corresponding to a second sample face image into a hair line migration network to be trained to obtain a first output hair image, wherein the first sample hair image comprises hair edge information and hair direction information of hair in the second sample face image;
determining image loss according to the first output hair image and a second sample hair image, wherein the second sample hair image comprises hair edge information and hair direction information of hair in a second sample style image corresponding to the second sample face image;
and training the hair line migration network to be trained according to the image loss.
The image loss is determined based on the difference between the second sample hair image indicating the hair edge information and the hair direction information in the second sample style image and the first output hair image output by the hair line migration network, so that the image loss can represent the difference related to the hair line and the hair direction, the hair line migration network can learn more hair edge information and hair direction information, and the hair line migration network after the parameters are adjusted can better realize the generation of the hair line.
In an alternative implementation, the determining image loss from the first output hair image and the second sample hair image includes:
extracting a sample hair line and a sample hair direction from the second sample hair image;
determining hair line loss according to the distance between the pixel point included by the sample hair line and the pixel point included by the corresponding hair line in the first output hair image;
determining the loss of the hairline direction according to the included angle between the sample hairline direction and the corresponding hairline direction in the first output hair image;
and obtaining the image loss according to the hair line loss and the hair direction loss.
Based on the distance between the pixel points of corresponding hair lines in the second sample hair image and the first output hair image, and the included angle between the hair directions in the two images, the difference between the hair lines generated by the hair line migration network and the hair lines and hair directions manually drawn by a painting technician can be determined, so that the determined image loss helps the hair line migration network, after parameter adjustment, to reduce this difference.
In an optional implementation manner, before the first sample hair image corresponding to the second sample face image is input to the hair line migration network to be trained to obtain the first output hair image, the method further includes:
acquiring a third sample hair image according to the second sample face image, wherein the third sample hair image comprises hair edge information of hair in the second sample face image;
inputting the third sample hair image into a hair direction network in the image processing model to obtain hair direction characteristics;
and fusing the hair direction characteristic and the third sample hair image to obtain the first sample hair image.
By superimposing the hair direction feature of the hair on the third sample hair image, the first sample hair image includes hair direction information, so that the hair line migration network can learn hair direction information from the first sample hair image, and the hair image output by the hair line migration network has a stronger sense of line.
In an alternative implementation manner, the inputting a first hair image corresponding to a first sample face image into a hair line migration network in an image processing model to obtain a second hair image includes:
inputting a first hair image corresponding to the first sample face image into a hair line migration network in an image processing model to obtain hair line enhancement information of the first sample face image;
and adding the hair line enhancement information in the first hair image to obtain the second hair image.
The first hair image corresponding to the first sample face image is further processed based on the hair line migration network, and hair line enhancement information is added to the first hair image, so that a second hair image containing more and longer hair lines is obtained.
In an alternative implementation, the step of training the hair line migration network includes:
inputting a fourth sample hair image corresponding to the third sample face image into a hair line migration network to be trained to obtain a second output hair image, wherein the fourth sample hair image comprises hair edge information of hair in the third sample face image;
determining hair line loss according to the second output hair image and a fifth sample hair image, wherein the fifth sample hair image comprises hair edge information of hair in a third sample style image corresponding to the third sample face image;
and training the hair line migration network to be trained according to the hair line loss.
The hair line loss is determined based on the difference between the fifth sample hair image, which indicates the hair edge information in the third sample style image, and the second output hair image output by the hair line migration network, so that the hair line loss can represent the difference related to hair lines; the hair line migration network can thus learn more hair edge information, and the network after parameter adjustment can better generate hair lines.
In an alternative implementation, the determining a hair line loss from the second output hair image and the fifth sample hair image includes:
extracting a sample hair line from the fifth sample hair image;
and determining the hair line loss according to the distance between the pixel point included by the sample hair line and the pixel point included by the corresponding hair line in the second output hair image.
Based on the distance between the pixel points of the sample hair line in the fifth sample hair image and the pixel points of the corresponding hair line in the second output hair image, the difference between the hair lines generated by the hair line migration network and the hair lines manually drawn by a painting technician can be determined, so that the determined hair line loss helps the hair line migration network, after parameter adjustment, to reduce this difference.
In an optional implementation manner, the inputting the first hair image, the second hair image, and the third hair image into a style image generation network to be trained in the image processing model, and performing iterative training on the style image generation network to be trained by using the third hair image as supervision information to obtain a trained image processing model, includes:
inputting the first sample face image, the second hair image and the third hair image into the style image generation network to be trained to obtain a style result image;
acquiring a fourth hair image of the style result image, wherein the fourth hair image comprises hair edge information in the style result image;
and adjusting parameters of the style image generation network to be trained according to the difference between the third hair image and the fourth hair image to obtain a trained image processing model.
Parameters of the style image generation network are adjusted according to the difference between the fourth hair image of the style result image output by the network and the third hair image serving as the supervision information, so that the adjusted style image generation network can reduce this difference and generate a style image closer to a work drawn by a painting technician.
Fig. 2 shows the basic flow of the training method of the image processing model of the present disclosure; the training method provided by the present disclosure is further explained below based on an application scenario. Fig. 3 is a flowchart illustrating another training method of an image processing model according to an exemplary embodiment. Taking as an example an electronic device provided as a server and a scenario in which the image processing model is trained for a social application program, the trained image processing model processes a face image into a hand-drawing style image, so that when a user inputs a face image, the social application program can output a hand-drawing style image based on the trained image processing model. As shown in Fig. 3, the method includes:
in step S301, a hair line migration network in the image processing model for increasing hair edge information in the input hair image is trained.
In the embodiment of the present disclosure, the server can segment a hair region from the acquired face image and then extract a hair image including hair edge information from the hair region. The server can segment the hair region from the face image through a hair segmentation model, and then extract the edge information of the hair region using a Laplacian operator, thereby obtaining a hair image. However, compared with the hair image obtained by the server from the hand-drawn image corresponding to the face image, the hair image obtained by the server from the face image has fewer and shorter hair lines, which means that the hair cannot be accurately represented.
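As an illustrative sketch only (not part of the disclosure), the following shows how a hair image containing hair edge information could be extracted with OpenCV, assuming a binary hair mask produced by some hair segmentation model is already available; the function and variable names are hypothetical.

```python
import cv2
import numpy as np

def extract_hair_image(face_bgr: np.ndarray, hair_mask: np.ndarray) -> np.ndarray:
    """face_bgr: HxWx3 uint8 image; hair_mask: HxW uint8 mask (255 = hair)."""
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY)
    # Keep only the hair region, as in the segmentation step described above.
    hair_region = cv2.bitwise_and(gray, gray, mask=hair_mask)
    # Laplacian operator extracts edge (hair line) information.
    edges = cv2.Laplacian(hair_region, cv2.CV_64F, ksize=3)
    edges = cv2.convertScaleAbs(edges)
    # Zero out responses outside the hair mask so only hair edges remain.
    return cv2.bitwise_and(edges, edges, mask=hair_mask)
```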
For example, referring to Fig. 4, Fig. 4 illustrates a comparative schematic of hair images according to an exemplary embodiment. As shown in Fig. 4, 401 is the hair region that the server segmented from the face image, 402 is the hair region that the server segmented from the hand-drawn image corresponding to the face image, 403 is the hair image that the server extracted from 401, and 404 is the hair image that the server extracted from 402. A comparison of 403 and 404 shows that the hair image obtained by the server from the hand-drawn image corresponding to the face image includes more hair lines, and the hair lines are longer.
In order to solve the problems, the server takes a hair image obtained according to a face image as input information and takes a hair image obtained according to a hand-drawn image corresponding to the face image as a training target to train a hair line migration network in an image processing model.
In some embodiments, the hair line migration network outputs a hair image that includes more hair edge information and hair direction information. Correspondingly, taking one iteration as an example, the server trains the hair line migration network as follows. The server inputs a first sample hair image corresponding to the second sample face image into the hair line migration network to be trained to obtain a first output hair image, where the first sample hair image includes hair edge information and hair direction information of hair in the second sample face image. The server then determines an image loss from the first output hair image and a second sample hair image that includes hair edge information and hair direction information of hair in a second sample style image corresponding to the second sample face image. The server then trains the hair line migration network to be trained according to the image loss, that is, adjusts the parameters of the hair line migration network corresponding to this iteration. Finally, in response to the hair line migration network corresponding to this iteration meeting the training end condition, the server determines that it is the trained hair line migration network; in response to the hair line migration network corresponding to this iteration not meeting the training end condition, the server executes the next iteration and continues training the hair line migration network to be trained. The image loss is determined based on the difference between the second sample hair image, which indicates the hair edge information and hair direction information in the second sample style image, and the first output hair image output by the hair line migration network, so the image loss represents differences related to hair lines and hair direction; the hair line migration network can thus learn more hair edge information and hair direction information, and the network after parameter adjustment can better generate hair lines.
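A minimal sketch of one such training iteration is given below, assuming PyTorch; the tiny stand-in architecture, the optimizer settings, and the image_loss_fn callback are illustrative assumptions rather than the network actually used in the disclosure.

```python
import torch
import torch.nn as nn

# Stand-in for the hair line migration network (2 input channels: edges + direction).
migration_net = nn.Sequential(
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
)
optimizer = torch.optim.Adam(migration_net.parameters(), lr=1e-4)

def train_step(first_sample_hair, second_sample_hair, image_loss_fn):
    """first_sample_hair: Nx2xHxW input; second_sample_hair: Nx1xHxW training target."""
    first_output = migration_net(first_sample_hair)          # first output hair image
    loss = image_loss_fn(first_output, second_sample_hair)   # image loss (lines + direction)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```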
For example, referring to Fig. 5, Fig. 5 is a diagram illustrating a comparison of the output of a hair line migration network according to an exemplary embodiment. As shown in Fig. 5, two sets of hair images are included, where 501 is the hair image obtained by the server from the face image and used as input to the hair line migration network, 502 is the hair image obtained by the server from the hand-drawn image corresponding to the face image, and 503 is the hair image output by the hair line migration network. Compared with the hair image obtained by the server from the face image, the hair image output by the hair line migration network contains noticeably more lines, and the sense of line of the image is enhanced.
In some embodiments, the image loss comprises a hair line loss and a hair direction loss. Correspondingly, the server determines the image loss as follows. First, the server extracts a sample hair line and a sample hair direction from the second sample hair image. The server then determines the hair line loss according to the distance between the pixel points of the sample hair line and the pixel points of the corresponding hair line in the first output hair image, and determines the hair direction loss according to the included angle between the sample hair direction and the corresponding hair direction in the first output hair image. Finally, the server obtains the image loss from the hair line loss and the hair direction loss. Based on the distance between the pixel points of corresponding hair lines in the second sample hair image and the first output hair image, and the included angle between the hair directions in the two images, the difference between the hair lines generated by the hair line migration network and the hair lines and hair directions manually drawn by a painting technician can be determined, so that the determined image loss helps the hair line migration network, after parameter adjustment, to reduce this difference.
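A hedged sketch of how the hair direction loss and the combined image loss could be computed, assuming per-pixel hair direction angles in radians and a binary hair mask; the angular representation and the weighted-sum combination are assumptions, not taken from the disclosure.

```python
import torch

def hair_direction_loss(pred_angle, sample_angle, hair_mask):
    """pred_angle, sample_angle: NxHxW angles in radians; hair_mask: NxHxW in {0, 1}."""
    # Wrap the per-pixel angle difference into [-pi, pi] before averaging over hair pixels.
    diff = torch.atan2(torch.sin(pred_angle - sample_angle),
                       torch.cos(pred_angle - sample_angle))
    return (diff.abs() * hair_mask).sum() / hair_mask.sum().clamp(min=1)

def image_loss(hair_line_loss, direction_loss, w_line=1.0, w_dir=1.0):
    # The disclosure combines hair line loss and hair direction loss into one image loss;
    # the weighted sum and the weights used here are illustrative.
    return w_line * hair_line_loss + w_dir * direction_loss
```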
In some embodiments, the server obtains the first sample hair image by fusion. Correspondingly, before the server inputs the first sample hair image corresponding to the second sample face image into the hair line migration network to be trained to obtain the first output hair image, the server first acquires a third sample hair image according to the second sample face image, the third sample hair image including hair edge information of hair in the second sample face image. The server then inputs the third sample hair image into the hair direction network in the image processing model to obtain the hair direction feature. Finally, the server fuses the hair direction feature and the third sample hair image to obtain the first sample hair image. By superimposing the hair direction feature of the hair on the third sample hair image, the first sample hair image includes hair direction information, so that the hair line migration network can learn hair direction information from the first sample hair image, and the hair image output by the hair line migration network has a stronger sense of line.
For example, referring to Fig. 6, Fig. 6 is a schematic diagram of hair direction according to an exemplary embodiment. As shown in Fig. 6, the server estimates the hair direction, such as angle and length values, of each pixel point in the hair region through the hair direction network. The hair direction feature extracted by the server is a 1xHxW matrix, and the third sample hair image is also a 1xHxW matrix; the server stacks the two matrices to obtain a 2xHxW matrix as the first sample hair image. Thus, the hair line migration network obtains additional hair direction information during training. H denotes the image height and W denotes the image width.
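The stacking step can be pictured with NumPy as below; the array sizes are arbitrary and the variable names are assumptions made for illustration.

```python
import numpy as np

H, W = 256, 256
third_sample_hair = np.zeros((1, H, W), dtype=np.float32)   # hair edge information
hair_direction = np.zeros((1, H, W), dtype=np.float32)      # hair direction feature
# Stacking along the channel axis yields the 2xHxW first sample hair image.
first_sample_hair = np.concatenate([third_sample_hair, hair_direction], axis=0)
assert first_sample_hair.shape == (2, H, W)
```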
In some embodiments, the hair image output by the hair line migration network includes more hair edge information. Correspondingly, the server trains the hair line migration network as follows. First, the server inputs a fourth sample hair image corresponding to a third sample face image into the hair line migration network to be trained to obtain a second output hair image, the fourth sample hair image including hair edge information of hair in the third sample face image. The server then determines a hair line loss from the second output hair image and a fifth sample hair image that includes hair edge information of hair in a third sample style image corresponding to the third sample face image. Finally, the server trains the hair line migration network to be trained according to the hair line loss. The hair line loss is determined based on the difference between the fifth sample hair image, which indicates the hair edge information in the third sample style image, and the second output hair image output by the hair line migration network, so that the hair line loss can represent the difference related to hair lines; the hair line migration network can thus learn more hair edge information, and the network after parameter adjustment can better generate hair lines.
For example, the server obtains a CNN (Convolutional Neural Network) model for detecting white lines in an image. The server then detects and extracts at least one sample hair line from the second sample hair image based on the CNN model, the extracted set of sample hair lines being denoted as {x1, x2, …, xn}. For any sample hair line xi, the server calculates the distance between the pixel points of that sample hair line and the first output hair image by means of a distance map. Finally, the server aggregates the distances between the pixel points of the n sample hair lines and the first output hair image to obtain the line loss between the first output hair image and the second sample hair image. Referring to Fig. 7, Fig. 7 illustrates an output effect comparison diagram of another hair line migration network according to an exemplary embodiment. As shown in Fig. 7, 701 is the hair image obtained by the server from the hand-drawn image, 702 is the output hair image of a hair line migration network whose parameters were not adjusted with the line loss, and 703 is the output hair image of a hair line migration network whose parameters were adjusted with the line loss. By comparison, the hair lines in 703 are longer than those in 702.
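A hedged sketch of this distance-map computation, assuming NumPy/SciPy, a binarization threshold of 0.5, and sample hair lines given as arrays of pixel coordinates; none of these details are specified in the disclosure.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def hair_line_loss(output_hair, sample_lines):
    """output_hair: HxW array in [0, 1]; sample_lines: list of (K_i, 2) pixel coordinates."""
    # Distance map: for every pixel, the distance to the nearest hair pixel of the
    # output hair image (pixels >= 0.5 are treated as hair).
    dist_map = distance_transform_edt(output_hair < 0.5)
    total, count = 0.0, 0
    for line in sample_lines:                 # one extracted sample hair line x_i
        rows, cols = line[:, 0], line[:, 1]
        total += dist_map[rows, cols].sum()   # distances of this line's pixels to the prediction
        count += len(line)
    return total / max(count, 1)
```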
In some embodiments, the server determines the hair line loss as follows. The server extracts a sample hair line from the fifth sample hair image, and then determines the hair line loss according to the distance between the pixel points of the sample hair line and the pixel points of the corresponding hair line in the second output hair image. Based on the distance between the pixel points of the sample hair line in the fifth sample hair image and the pixel points of the corresponding hair line in the second output hair image, the difference between the hair lines generated by the hair line migration network and the hair lines manually drawn by a painting technician can be determined, so that the determined hair line loss helps the hair line migration network, after parameter adjustment, to reduce this difference.
In step S302, a first hair image corresponding to the first sample face image is input into the hair line migration network in the image processing model to obtain a second hair image, where the first hair image includes first hair edge information in the first sample face image, and the second hair image includes second hair edge information obtained by enhancing the first hair edge information based on the hair line migration network.
In the embodiment of the present disclosure, the server may perform segmentation processing on the first sample face image to obtain a hair region representing hair, and then perform edge extraction on the hair region, that is, extract hair lines in the hair region to obtain the first hair image. The server can then process the first hair image based on the hair line migration network trained in step S301 to obtain the second hair image.
In some embodiments, the server inputs the first hair image corresponding to the first sample face image into the hair line migration network in the image processing model to obtain hair line enhancement information and hair direction information of the first sample face image, and then adds the hair line enhancement information and the hair direction information to the first hair image to obtain the second hair image. The first hair image corresponding to the first sample face image is further processed based on the hair line migration network, and the hair line enhancement information and the hair direction information are added to the first hair image, so that the obtained second hair image includes more and longer hair lines whose directions are more realistic.
In some embodiments, the server inputs the first hair image corresponding to the first sample face image into the hair line migration network in the image processing model to obtain hair line enhancement information of the first sample face image, and then adds the hair line enhancement information to the first hair image to obtain the second hair image. The first hair image corresponding to the first sample face image is further processed based on the hair line migration network, and the hair line enhancement information is added to the first hair image, so that the obtained second hair image includes more and longer hair lines.
In step S303, a third hair image corresponding to a first sample style image is acquired, the first sample style image and the first sample face image include the same content with different styles, and the third hair image includes third hair edge information in the first sample style image.
In the embodiment of the disclosure, the electronic device is capable of acquiring a first sample style image that includes the same content as the first sample face image in a different style, the first sample style image being drawn by a professional painter from the first sample face image. For example, the first sample face image is a female face, and a professional painter draws, from the female face, a first sample style image having a hand-drawing style. The electronic device obtains the third hair image from the first sample style image in the same processing manner as the first hair image is obtained from the first sample face image.
In step S304, the first sample face image, the second hair image, and the third hair image are input into the style image generation network to be trained in the image processing model, and the third hair image is used as the supervision information to perform iterative training on the style image generation network to be trained, so as to obtain the trained image processing model.
In the embodiment of the disclosure, taking one iteration as an example, the server inputs the first sample face image, the second hair image and the third hair image into the style image generation network to be trained corresponding to this iteration to obtain a style result image. The server then obtains a fourth hair image of the style result image, the fourth hair image including hair edge information in the style result image. Finally, the server adjusts the parameters of the style image generation network to be trained according to the difference between the third hair image and the fourth hair image. Because the parameters of the style image generation network are adjusted according to the difference between the fourth hair image of the generated style result image and the third hair image serving as the supervision information, the adjusted style image generation network can reduce this difference and generate a style image closer to a work drawn by a professional painter.
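As a non-limiting sketch of one such iteration, the following code supervises the style image generation network with the third hair image through an L1 term; the names generator and edge_extractor and the weight lambda_edge are assumptions introduced for illustration and are not taken from the disclosure.

    import torch
    import torch.nn.functional as F

    def train_step(generator, edge_extractor, optimizer,
                   face_img, hair_img_plus, gt_edge, lambda_edge=10.0):
        optimizer.zero_grad()
        # Condition the generator on the first sample face image and the second hair image.
        style_result = generator(torch.cat([face_img, hair_img_plus], dim=1))
        # Fourth hair image: hair edge information of the generated style result.
        result_edge = edge_extractor(style_result)
        # Compare with the third hair image serving as the supervision information.
        loss = lambda_edge * F.l1_loss(result_edge, gt_edge)
        loss.backward()
        optimizer.step()
        return loss.item()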
It should be noted that, in some embodiments, the style image generation network and the hair line migration network are pix2pix networks.
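For reference, the standard pix2pix generator objective combines a conditional adversarial term with an L1 reconstruction term. The sketch below restates that objective; the weight lambda_l1=100.0 comes from the original pix2pix formulation and is an assumption here, since the present disclosure only names the architecture.

    import torch
    import torch.nn.functional as F

    def pix2pix_generator_loss(discriminator, condition, fake, real, lambda_l1=100.0):
        # Conditional adversarial loss: the discriminator sees (condition, generated) pairs.
        pred_fake = discriminator(torch.cat([condition, fake], dim=1))
        adv_loss = F.binary_cross_entropy_with_logits(pred_fake, torch.ones_like(pred_fake))
        # L1 reconstruction loss against the real target image.
        return adv_loss + lambda_l1 * F.l1_loss(fake, real)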
It should be noted that the flow described in step S301 to step S304 is an optional flow of the training method of the image processing model provided in the embodiment of the present disclosure. Correspondingly, the server can train the hair line migration network and the style image generation network simultaneously. Of course, the server can also directly acquire a trained hair line migration network, which is pre-trained by the server or acquired by the server from a plurality of networks stored in a database. Referring to fig. 8, fig. 8 is a flow diagram illustrating another method for training an image processing model according to an exemplary embodiment. As shown in fig. 8, first, a sample face image 801 is subjected to hair segmentation to obtain an image 802 representing the hair region. A hair image 803 including hair lines, denoted a_edge, is then extracted from the image 802. Next, the hair image 803 is processed by the pre-trained hair line migration network to obtain a hair image 804 with enhanced edge information, denoted a_edge_plus. Meanwhile, a sample hand-drawn image 805 serving as the supervision information is subjected to hair segmentation to obtain an image 806 indicating the hair region, and a hair image 807 including edge information, denoted gt_edge, is extracted from the image 806. Then, the sample face image 801 and the hair image 804 are input into the style image generation network in the current iteration, with the hair image 804 assisting the generation of a hand-drawn result image 808. The hair image 807 is used to supervise the generation of the hand-drawn result image 808, and the parameters of the style image generation network in the current iteration are adjusted according to the difference between the hair image 807 and the hand-drawn result image 808.
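The data flow of fig. 8 can be summarized by the following sketch, in which segment_hair, extract_edges and line_migration_net are placeholders for the components described above rather than concrete implementations from the disclosure.

    def prepare_training_pair(sample_face, sample_hand_drawn,
                              segment_hair, extract_edges, line_migration_net):
        a_mask = segment_hair(sample_face)                    # 802: hair region of 801
        a_edge = extract_edges(sample_face, a_mask)           # 803: hair lines (a_edge)
        a_edge_plus = line_migration_net(a_edge)              # 804: enhanced edges (a_edge_plus)
        gt_mask = segment_hair(sample_hand_drawn)             # 806: hair region of 805
        gt_edge = extract_edges(sample_hand_drawn, gt_mask)   # 807: supervision edges (gt_edge)
        return a_edge_plus, gt_edge

    # In each iteration, (801, 804) are fed to the style image generation network,
    # and 807 supervises the hair edges of the generated result 808.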
The embodiment of the present disclosure provides a training method of an image processing model in which a hair image including enhanced hair edge information is added to the training process of the style image generation network, so that the style image output by the image processing model includes more lines representing hair, and the accuracy of the style image is improved.
All the above optional technical solutions may be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
An image processing method is further provided in the embodiments of the present disclosure, and referring to fig. 9, fig. 9 is a flowchart illustrating an image processing method according to an exemplary embodiment, and is applied to an electronic device, where the method includes:
in step S901, the to-be-processed hair image corresponding to the to-be-processed image is input into the image processing model obtained by training according to the above training method of the image processing model.
In step S902, an intermediate image is obtained based on the hair line migration network in the image processing model, where the intermediate image includes fifth hair edge information enhanced based on the hair line migration network for the fourth hair edge information in the hair image to be processed.
In step S903, the to-be-processed image and the intermediate image are input to a style image generation network in the image processing model, and a target style image is obtained, which includes the same content with a different style from the to-be-processed image.
By processing the image to be processed based on the image processing model, a target style image such as a hand-drawing style image, a cartoon style image, or a realistic style image can be output, meeting the image processing requirements of users.
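By way of illustration, the inference flow of steps S901 to S903 may look like the sketch below, assuming the trained image processing model exposes its hair line migration network and style image generation network as attributes; these attribute names are illustrative and not taken from the disclosure.

    import torch

    @torch.no_grad()
    def stylize(model, image, hair_image):
        # S902: enhance the hair edge information of the hair image to be processed.
        intermediate = model.hair_line_migration(hair_image)
        # S903: generate the target style image conditioned on the image and the intermediate image.
        return model.style_generator(torch.cat([image, intermediate], dim=1))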
FIG. 10 is a block diagram illustrating an apparatus for training an image processing model according to an exemplary embodiment. Referring to fig. 10, the apparatus includes: a line migration unit 1001, an image acquisition unit 1002, and a network training unit 1003.
A line migration unit 1001 configured to perform inputting a first hair image corresponding to a first sample face image into a hair line migration network in an image processing model to obtain a second hair image, where the first hair image includes first hair edge information in the first sample face image, and the second hair image includes second hair edge information that is enhanced from the first hair edge information based on the hair line migration network;
an image obtaining unit 1002 configured to perform obtaining a third hair image corresponding to a first sample style image, the first sample style image including the same content having a different style from the first sample face image, the third hair image including third hair edge information in the first sample style image;
a network training unit 1003 configured to perform inputting the first hair image, the second hair image and the third hair image into a style image generation network to be trained in the image processing model, and perform iterative training on the style image generation network to be trained by using the third hair image as supervision information.
According to the device provided by the embodiment of the disclosure, the hair image comprising the enhanced hair edge information is added in the training process of the style image generation network, so that the style image output by the image processing model comprises more lines representing hair, and the accuracy of the style image is improved.
In an alternative implementation manner, the line migration unit 1001 is configured to perform inputting the first hair image corresponding to the first sample face image into the hair line migration network in the image processing model to obtain hair line enhancement information and hair direction information of the first sample face image; and adding the hair line enhancement information and the hair direction information to the first hair image to obtain the second hair image.
In an alternative implementation, the step of training the hair line migration network includes:
inputting a first sample hair image corresponding to a second sample face image into a hair line migration network to be trained to obtain a first output hair image, wherein the first sample hair image comprises hair edge information and hair direction information of hair in the second sample face image;
determining image loss according to the first output hair image and a second sample hair image, wherein the second sample hair image comprises hair edge information and hair direction information of hair in a second sample style image corresponding to the second sample face image;
and training the hair line migration network to be trained according to the image loss.
In an alternative implementation, the determining image loss from the first output hair image and the second sample hair image includes:
extracting a sample hair line and a sample hair direction from the second sample hair image;
determining hair line loss according to the distance between the pixel point included by the sample hair line and the pixel point included by the corresponding hair line in the first output hair image;
determining a hair direction loss according to the included angle between the sample hair direction and the corresponding hair direction in the first output hair image;
and obtaining the image loss according to the hair line loss and the hair direction loss.
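A non-limiting sketch of this image loss is given below; it assumes the direction information is stored as per-pixel unit vectors (two channels) and that the line and direction terms are combined by simple weighting, both of which are assumptions introduced for illustration.

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def image_loss(sample_edge, sample_dir, output_edge, output_dir,
                   w_line=1.0, w_dir=1.0):
        # Hair line loss: distance from sample line pixels to the nearest output line pixel.
        dist = distance_transform_edt(1 - (output_edge > 0.5))
        on_line = sample_edge > 0.5
        line_loss = dist[on_line].mean() if on_line.any() else 0.0
        # Hair direction loss: included angle between sample and output direction
        # vectors, evaluated on the sample hair-line pixels.
        cos_angle = np.clip((sample_dir * output_dir).sum(axis=0), -1.0, 1.0)
        dir_loss = np.arccos(cos_angle)[on_line].mean() if on_line.any() else 0.0
        return w_line * line_loss + w_dir * dir_loss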
In an optional implementation manner, before the inputting the first sample hair image corresponding to the second sample face image into the hair line migration network to be trained to obtain the first output hair image, the method further includes:
acquiring a third sample hair image according to the second sample face image, wherein the third sample hair image comprises hair edge information of hair in the second sample face image;
inputting the third sample hair image into a hair direction network in the image processing model to obtain a hair direction characteristic;
and fusing the hair direction characteristic and the third sample hair image to obtain the first sample hair image.
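As an illustrative assumption (the disclosure does not specify the fusion operation), the hair direction characteristic and the third sample hair image may be fused by channel concatenation:

    import torch

    def fuse(direction_feature: torch.Tensor, edge_image: torch.Tensor) -> torch.Tensor:
        # direction_feature: N x C x H x W output of the hair direction network
        # edge_image:        N x 1 x H x W hair edge map
        return torch.cat([edge_image, direction_feature], dim=1)   # first sample hair image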
In an alternative implementation manner, the line migration unit 1001 is configured to perform inputting the first hair image corresponding to the first sample face image into the hair line migration network in the image processing model to obtain hair line enhancement information of the first sample face image; and adding the hair line enhancement information to the first hair image to obtain the second hair image.
In an alternative implementation, the step of training the hair line migration network includes:
inputting a fourth sample hair image corresponding to the third sample face image into a hair line migration network to be trained to obtain a second output hair image, wherein the fourth sample hair image comprises hair edge information of hair in the third sample face image;
determining hair line loss according to the second output hair image and a fifth sample hair image, wherein the fifth sample hair image comprises hair edge information of hair in a third sample style image corresponding to the third sample face image;
and training the hair line migration network to be trained according to the hair line loss.
In an alternative implementation, the determining a hair line loss from the second output hair image and the fifth sample hair image includes:
extracting a sample hair line from the fifth sample hair image;
and determining the hair line loss according to the distance between the pixel point included by the sample hair line and the pixel point included by the corresponding hair line in the second output hair image.
In an alternative implementation manner, the network training unit 1003 is configured to perform inputting the first sample face image, the second hair image and the third hair image into the style image generation network to be trained to obtain a style result image; acquiring a fourth hair image of the style result image, wherein the fourth hair image comprises hair edge information in the style result image; and adjusting parameters of the style image generation network to be trained according to the difference between the third hair image and the fourth hair image to obtain a trained image processing model.
Fig. 11 is a block diagram illustrating an image processing apparatus according to an exemplary embodiment. Referring to fig. 11, the apparatus includes: an input unit 1101, a first processing unit 1102, and a second processing unit 1103.
An input unit 1101 configured to perform inputting a hair image to be processed corresponding to an image to be processed into an image processing model trained according to the first aspect;
a first processing unit 1102 configured to perform obtaining an intermediate image based on the hair line migration network in the image processing model, where the intermediate image includes fifth hair edge information obtained by enhancing fourth hair edge information in the hair image to be processed based on the hair line migration network;
a second processing unit 1103 configured to perform inputting the image to be processed and the intermediate image into a style image generation network in the image processing model, resulting in a target style image comprising the same content with a different style than the image to be processed.
It should be noted that, when the training apparatus for an image processing model provided in the above embodiment trains the image processing model, the division of the above functional units is merely used as an example; in practical applications, the above functions may be allocated to different functional units as needed, that is, the internal structure of the electronic device is divided into different functional units to complete all or part of the functions described above. In addition, the training apparatus for an image processing model provided in the above embodiment and the embodiment of the training method for an image processing model belong to the same concept, and the specific implementation process thereof is described in detail in the method embodiment and is not described herein again.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
When the electronic device is provided as a terminal, fig. 12 is a block diagram illustrating a terminal 1200 according to an exemplary embodiment of the disclosure. The terminal 1200 may be: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. The terminal 1200 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so forth.
In general, terminal 1200 includes: a processor 1201 and a memory 1202.
The processor 1201 may include one or more processing cores, such as a 4-core processor, an 8-core processor, or the like. The processor 1201 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1201 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 1201 may be integrated with a GPU (Graphics Processing Unit) that is responsible for rendering and drawing content that the display screen needs to display. In some embodiments, the processor 1201 may further include an AI (Artificial Intelligence) processor for processing a computing operation related to machine learning.
Memory 1202 may include one or more computer-readable storage media, which may be non-transitory. Memory 1202 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 1202 is used to store at least one program code for execution by the processor 1201 to implement a training method of an image processing model, or an image processing method, provided by method embodiments in the present disclosure.
In some embodiments, the terminal 1200 may further optionally include: a peripheral interface 1203 and at least one peripheral. The processor 1201, memory 1202, and peripheral interface 1203 may be connected by a bus or signal line. Various peripheral devices may be connected to peripheral interface 1203 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1204, display 1205, camera assembly 1206, audio circuitry 1207, positioning assembly 1208, and power supply 1209.
The peripheral interface 1203 may be used to connect at least one peripheral associated with I/O (Input/Output) to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, memory 1202, and peripheral interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202 and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 1204 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with a communication network and other communication devices by electromagnetic signals. The radio frequency circuit 1204 converts an electric signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 1204 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1204 may communicate with other terminals through at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1204 may further include NFC (Near Field Communication) related circuits, which are not limited by this disclosure.
The display screen 1205 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to acquire touch signals on or over the surface of the display screen 1205. The touch signal may be input to the processor 1201 as a control signal for processing. At this point, the display 1205 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the display 1205 may be one, providing the front panel of the terminal 1200; in other embodiments, the display 1205 can be at least two, respectively disposed on different surfaces of the terminal 1200 or in a folded design; in still other embodiments, the display 1205 may be a flexible display disposed on a curved surface or on a folded surface of the terminal 1200. Even further, the display screen 1205 may be arranged in a non-rectangular irregular figure, i.e., a shaped screen. The Display panel 1205 can be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.
Camera assembly 1206 is used to capture images or video. Optionally, camera assembly 1206 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 1206 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuitry 1207 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals into the processor 1201 for processing or inputting the electric signals into the radio frequency circuit 1204 to achieve voice communication. For stereo capture or noise reduction purposes, multiple microphones may be provided at different locations of terminal 1200. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 1201 or the radio frequency circuit 1204 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, the audio circuitry 1207 may also include a headphone jack.
The positioning component 1208 is configured to locate a current geographic Location of the terminal 1200 to implement navigation or LBS (Location Based Service). The Positioning component 1208 can be a Positioning component based on the united states GPS (Global Positioning System), the chinese beidou System, the russian graves System, or the european union galileo System.
The power supply 1209 is used to provide power to various components within the terminal 1200. The power source 1209 may be alternating current, direct current, disposable or rechargeable. When the power source 1209 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 1200 also includes one or more sensors 1210. The one or more sensors 1210 include, but are not limited to: acceleration sensor 1211, gyro sensor 1212, pressure sensor 1213, fingerprint sensor 1214, optical sensor 1215, and proximity sensor 1216.
The acceleration sensor 1211 can detect magnitudes of accelerations on three coordinate axes of the coordinate system established with the terminal 1200. For example, the acceleration sensor 1211 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 1201 may control the display screen 1205 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1211. The acceleration sensor 1211 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 1212 may detect a body direction and a rotation angle of the terminal 1200, and the gyro sensor 1212 may collect a 3D motion of the user on the terminal 1200 in cooperation with the acceleration sensor 1211. The processor 1201 can implement the following functions according to the data collected by the gyro sensor 1212: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
Pressure sensors 1213 may be disposed on the side frames of terminal 1200 and/or underlying display 1205. When the pressure sensor 1213 is disposed on the side frame of the terminal 1200, the user's holding signal of the terminal 1200 can be detected, and the processor 1201 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 1213. When the pressure sensor 1213 is disposed at a lower layer of the display screen 1205, the processor 1201 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 1205. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 1214 is used for collecting a fingerprint of the user, and the processor 1201 identifies the user according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 identifies the user according to the collected fingerprint. When the user identity is identified as a trusted identity, the processor 1201 authorizes the user to perform relevant sensitive operations, including unlocking a screen, viewing encrypted information, downloading software, paying, changing settings, and the like. The fingerprint sensor 1214 may be provided on the front, back, or side of the terminal 1200. When a physical button or vendor Logo is provided on the terminal 1200, the fingerprint sensor 1214 may be integrated with the physical button or vendor Logo.
The optical sensor 1215 is used to collect the ambient light intensity. In one embodiment, the processor 1201 may control the display brightness of the display screen 1205 according to the ambient light intensity collected by the optical sensor 1215. Specifically, when the ambient light intensity is high, the display brightness of the display screen 1205 is increased; when the ambient light intensity is low, the display brightness of the display screen 1205 is turned down. In another embodiment, the processor 1201 may also dynamically adjust the shooting parameters of the camera assembly 1206 according to the ambient light intensity collected by the optical sensor 1215.
The proximity sensor 1216, also known as a distance sensor, is typically disposed on the front panel of the terminal 1200. The proximity sensor 1216 is used to collect the distance between the user and the front surface of the terminal 1200. In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front surface of the terminal 1200 gradually decreases, the processor 1201 controls the display 1205 to switch from the screen-on state to the screen-off state; when the proximity sensor 1216 detects that the distance between the user and the front surface of the terminal 1200 gradually increases, the processor 1201 controls the display 1205 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 12 is not intended to be limiting of terminal 1200 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
When the electronic device is provided as a server, fig. 13 is a block diagram of a server 1300 according to an exemplary embodiment. The server 1300 may vary greatly in configuration or performance, and may include one or more processors (CPUs) 1301 and one or more memories 1302, where the memory 1302 stores at least one program code, and the at least one program code is loaded and executed by the processor 1301 to implement the training method of the image processing model or the image processing method provided by the above-mentioned method embodiments. Certainly, the server may further have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and the server 1300 may further include other components for implementing the functions of the device, which are not described herein again.
In some embodiments, there is also provided a computer readable storage medium, such as a memory, comprising program code, which is executable by a processor of a terminal or a processor of a server to perform the above method. Alternatively, the computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In some embodiments, a computer program product is also provided, which comprises a computer program that, when executed by a processor, implements the above-described method of training an image processing model, or the image processing method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method of training an image processing model, the method comprising:
inputting a first hair image corresponding to a first sample face image into a hair line migration network in an image processing model to obtain a second hair image, wherein the first hair image comprises first hair edge information in the first sample face image, and the second hair image comprises second hair edge information obtained by enhancing the first hair edge information based on the hair line migration network;
acquiring a third hair image corresponding to a first sample style image, wherein the first sample style image and the first sample face image comprise the same content with different styles, and the third hair image comprises third hair edge information in the first sample style image;
inputting the first hair image, the second hair image and the third hair image into a style image generation network to be trained in the image processing model, and performing iterative training on the style image generation network to be trained by taking the third hair image as supervision information to obtain a trained image processing model.
2. The method for training an image processing model according to claim 1, wherein the step of inputting a first hair image corresponding to a first sample face image into a hair line migration network in the image processing model to obtain a second hair image comprises:
inputting a first hair image corresponding to the first sample face image into a hair line migration network in an image processing model to obtain hair line enhancement information and hair direction information of the first sample face image;
and adding the hair line enhancement information and the hair direction information in the first hair image to obtain the second hair image.
3. The method for training an image processing model according to claim 2, wherein the step of training the hair line migration network comprises:
inputting a first sample hair image corresponding to a second sample face image into a hair line migration network to be trained to obtain a first output hair image, wherein the first sample hair image comprises hair edge information and hair direction information of hair in the second sample face image;
determining image loss according to the first output hair image and a second sample hair image, wherein the second sample hair image comprises hair edge information and hair direction information of hair in a second sample style image corresponding to the second sample face image;
and training the hair line migration network to be trained according to the image loss.
4. The method of claim 3, wherein determining image loss from the first output hair image and the second sample hair image comprises:
extracting a sample hair line and a sample hair direction from the second sample hair image;
determining hair line loss according to the distance between the pixel points included by the sample hair lines and the pixel points included by the corresponding hair lines in the first output hair image;
determining a hair direction loss according to the included angle between the sample hair direction and the corresponding hair direction in the first output hair image;
and obtaining the image loss according to the hair line loss and the hair direction loss.
5. An image processing method, characterized in that the method comprises:
inputting an image of hair to be processed corresponding to the image to be processed into an image processing model obtained by training according to any one of claims 1 to 4;
obtaining an intermediate image based on the hair line migration network in the image processing model, wherein the intermediate image comprises fifth hair edge information obtained by enhancing fourth hair edge information in the hair image to be processed based on the hair line migration network;
and inputting the image to be processed and the intermediate image into a style image generation network in the image processing model to obtain a target style image, wherein the target style image comprises the same content with the style different from that of the image to be processed.
6. An apparatus for training an image processing model, the apparatus comprising:
the line migration unit is configured to input a first hair image corresponding to a first sample face image into a hair line migration network in an image processing model to obtain a second hair image, wherein the first hair image comprises first hair edge information in the first sample face image, and the second hair image comprises second hair edge information obtained by enhancing the first hair edge information based on the hair line migration network;
an image acquisition unit configured to perform acquisition of a third hair image corresponding to a first sample style image, the first sample style image including the same content having a different style from the first sample face image, the third hair image including third hair edge information in the first sample style image;
and the network training unit is configured to input the first hair image, the second hair image and the third hair image into a style image generation network to be trained in the image processing model, and perform iterative training on the style image generation network to be trained by taking the third hair image as supervision information to obtain a trained image processing model.
7. An image processing apparatus, characterized in that the apparatus comprises:
an input unit configured to perform inputting a hair image to be processed corresponding to an image to be processed into an image processing model obtained by training according to any one of claims 1 to 4;
a first processing unit configured to perform obtaining an intermediate image based on the hair line migration network in the image processing model, wherein the intermediate image includes fifth hair edge information obtained by enhancing fourth hair edge information in the hair image to be processed based on the hair line migration network;
a second processing unit configured to perform inputting the image to be processed and the intermediate image into a style image generation network in the image processing model, resulting in a target style image that includes the same content with a different style from the image to be processed.
8. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory for storing the processor executable program code;
wherein the processor is configured to execute the program code to implement the method of training an image processing model according to any one of claims 1 to 4 or to implement the method of image processing according to claim 5.
9. A computer-readable storage medium, characterized in that, when the program code in the storage medium is executed by a processor of an electronic device, the electronic device is enabled to execute the method of training an image processing model according to any one of claims 1 to 4, or the electronic device is enabled to execute the method of image processing according to claim 5.
10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the method of training an image processing model according to any one of claims 1 to 4, or, when being executed by a processor, implements the method of image processing according to claim 5.
CN202011497173.0A 2020-12-17 2020-12-17 Training method of image processing model, image processing method and device Active CN112581358B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011497173.0A CN112581358B (en) 2020-12-17 2020-12-17 Training method of image processing model, image processing method and device

Publications (2)

Publication Number Publication Date
CN112581358A true CN112581358A (en) 2021-03-30
CN112581358B CN112581358B (en) 2023-09-26

Family

ID=75135880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011497173.0A Active CN112581358B (en) 2020-12-17 2020-12-17 Training method of image processing model, image processing method and device

Country Status (1)

Country Link
CN (1) CN112581358B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE8809979U1 (en) * 1988-08-04 1988-09-29 Fa. Gustav Herzig, 6830 Schwetzingen Model head for creating hairstyles
CN110023989A (en) * 2017-03-29 2019-07-16 华为技术有限公司 A kind of generation method and device of sketch image
US20200151559A1 (en) * 2018-11-14 2020-05-14 Nvidia Corporation Style-based architecture for generative neural networks
CN109859096A (en) * 2018-12-28 2019-06-07 北京达佳互联信息技术有限公司 Image Style Transfer method, apparatus, electronic equipment and storage medium
CN109816764A (en) * 2019-02-02 2019-05-28 深圳市商汤科技有限公司 Image generating method and device, electronic equipment and storage medium
CN109903257A (en) * 2019-03-08 2019-06-18 上海大学 A kind of virtual hair-dyeing method based on image, semantic segmentation
CN110070483A (en) * 2019-03-26 2019-07-30 中山大学 A kind of portrait cartooning method based on production confrontation network
CN109978930A (en) * 2019-03-27 2019-07-05 杭州相芯科技有限公司 A kind of stylized human face three-dimensional model automatic generation method based on single image
CN111524204A (en) * 2020-05-06 2020-08-11 杭州趣维科技有限公司 Portrait hair animation texture generation method
CN111784565A (en) * 2020-07-01 2020-10-16 北京字节跳动网络技术有限公司 Image processing method, migration model training method, device, medium and equipment
CN111860485A (en) * 2020-07-24 2020-10-30 腾讯科技(深圳)有限公司 Training method of image recognition model, and image recognition method, device and equipment

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113409342A (en) * 2021-05-12 2021-09-17 北京达佳互联信息技术有限公司 Training method and device for image style migration model and electronic equipment
CN114187633A (en) * 2021-12-07 2022-03-15 北京百度网讯科技有限公司 Image processing method and device, and training method and device of image generation model
CN114187633B (en) * 2021-12-07 2023-06-16 北京百度网讯科技有限公司 Image processing method and device, and training method and device for image generation model
CN114387160A (en) * 2022-03-23 2022-04-22 北京大甜绵白糖科技有限公司 Training method, image processing method, device, electronic equipment and storage medium
CN114387160B (en) * 2022-03-23 2022-06-24 北京大甜绵白糖科技有限公司 Training method, image processing method, device, electronic equipment and storage medium
CN114758391A (en) * 2022-04-08 2022-07-15 北京百度网讯科技有限公司 Hairstyle image determining method and device, electronic equipment, storage medium and product
CN114758391B (en) * 2022-04-08 2023-09-12 北京百度网讯科技有限公司 Hair style image determining method, device, electronic equipment, storage medium and product

Also Published As

Publication number Publication date
CN112581358B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
CN110189340B (en) Image segmentation method and device, electronic equipment and storage medium
CN110992493B (en) Image processing method, device, electronic equipment and storage medium
CN108401124B (en) Video recording method and device
CN110650379B (en) Video abstract generation method and device, electronic equipment and storage medium
CN112907725B (en) Image generation, training of image processing model and image processing method and device
CN112581358B (en) Training method of image processing model, image processing method and device
CN110110787A (en) Location acquiring method, device, computer equipment and the storage medium of target
CN110533585B (en) Image face changing method, device, system, equipment and storage medium
CN109522863B (en) Ear key point detection method and device and storage medium
CN111028144B (en) Video face changing method and device and storage medium
US11386586B2 (en) Method and electronic device for adding virtual item
CN111723803B (en) Image processing method, device, equipment and storage medium
CN109978996B (en) Method, device, terminal and storage medium for generating expression three-dimensional model
CN113763228A (en) Image processing method, image processing device, electronic equipment and storage medium
CN110807769B (en) Image display control method and device
CN111083513B (en) Live broadcast picture processing method and device, terminal and computer readable storage medium
CN110991445B (en) Vertical text recognition method, device, equipment and medium
CN110837300B (en) Virtual interaction method and device, electronic equipment and storage medium
CN110677713B (en) Video image processing method and device and storage medium
CN111613213A (en) Method, device, equipment and storage medium for audio classification
CN111982293B (en) Body temperature measuring method and device, electronic equipment and storage medium
CN111797754B (en) Image detection method, device, electronic equipment and medium
CN111757146B (en) Method, system and storage medium for video splicing
CN112399080A (en) Video processing method, device, terminal and computer readable storage medium
CN111898488A (en) Video image identification method and device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant