CN113269719A - Model training method, image processing method, device, equipment and storage medium

Model training method, image processing method, device, equipment and storage medium

Info

Publication number
CN113269719A
CN113269719A
Authority
CN
China
Prior art keywords
face image
makeup
model
determining
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110409627.2A
Other languages
Chinese (zh)
Inventor
Wang Di (王迪)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110409627.2A
Publication of CN113269719A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Abstract

The disclosure provides a model training method, an image processing method, a device, equipment, and a storage medium, and relates to the field of image processing, in particular to the technical fields of computer vision, augmented reality, and deep learning. The specific implementation scheme is as follows: acquiring a face image set, wherein the face image set comprises a pre-makeup face image subset and a post-makeup face image subset, and each pre-makeup face image corresponds one-to-one to a post-makeup face image; based on the face image set, executing the following training steps multiple times: inputting each pre-makeup face image or each post-makeup face image into the model to obtain an intermediate face image output by the model; performing region segmentation on the intermediate face image and the face image corresponding to the input face image; determining a loss function value based on comparison results between each region in the intermediate face image and the corresponding region in the face image corresponding to the input face image; and adjusting the parameters of the model according to the loss function value. This implementation can apply makeup to, or remove makeup from, a face image.

Description

Model training method, image processing method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, in particular to the fields of computer vision, augmented reality, and deep learning, and more particularly to a model training method, an image processing method, an apparatus, a device, and a storage medium.
Background
Traditionally, makeup is applied to portrait images through manual operation of image-editing software, covering common processing such as skin smoothing, whitening, and pupil beautification; this approach is time-consuming and labor-intensive. Image processing algorithms, such as Gaussian blur and the like, have since been developed to replace this manual work. However, these algorithms typically apply makeup to the image as a whole.
Disclosure of Invention
A model training method, an image processing method, an apparatus, a device and a storage medium are provided.
According to a first aspect, there is provided a model training method comprising: acquiring a face image set, wherein the face image set comprises a pre-makeup face image subset and a post-makeup face image subset, and each pre-makeup face image in the pre-makeup face image subset corresponds one-to-one to a post-makeup face image in the post-makeup face image subset; and, based on the face image set, executing the following training steps multiple times: inputting each pre-makeup face image in the pre-makeup face image subset or each post-makeup face image in the post-makeup face image subset into a model to obtain an intermediate face image output by the model; performing region segmentation on the intermediate face image and the face image corresponding to the input face image; determining a loss function value based on comparison results between each region in the intermediate face image and the corresponding region in the face image corresponding to the input face image; and adjusting the parameters of the model according to the loss function value.
According to a second aspect, there is provided an image processing method comprising: acquiring a target face image; determining a matching model from at least one model trained in advance according to the target face image, wherein the at least one model is obtained by training through the model training method described in the first aspect; and determining a processed face image corresponding to the target face image according to the target face image and the matching model.
According to a third aspect, there is provided a model training apparatus comprising: a first acquisition unit configured to acquire a face image set, wherein the face image set includes a pre-makeup face image subset and a post-makeup face image subset, and each pre-makeup face image in the pre-makeup face image subset corresponds one-to-one to a post-makeup face image in the post-makeup face image subset; a model training unit configured to perform the training steps a plurality of times based on the set of face images by: an input module configured to input each pre-makeup face image in the pre-makeup face image subset or each post-makeup face image in the post-makeup face image subset into a model, resulting in an intermediate face image output by the model; a segmentation module configured to perform region segmentation on the intermediate face image and the face image corresponding to the input face image; a determining module configured to determine a loss function value based on comparison results of regions in the intermediate face image and regions in the face image corresponding to the input face image; and an adjustment module configured to adjust parameters of the model according to the loss function value.
According to a fourth aspect, there is provided an image processing apparatus comprising: a second acquisition unit configured to acquire a target face image; a model determining unit configured to determine a matching model from at least one model trained in advance according to the target face image, wherein the at least one model is trained by the model training method as described in the first aspect; and the image processing unit is configured to determine a processed face image corresponding to the target face image according to the target face image and the matching model.
According to a fifth aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in the first aspect or to perform the method as described in the second aspect.
According to a sixth aspect, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method as described in the first aspect or to perform the method as described in the second aspect.
According to a seventh aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method as described in the first aspect or the method as described in the second aspect.
The technology according to the present disclosure provides an image processing model that can apply makeup to, or remove makeup from, a face image.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become readily apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of a model training method according to the present disclosure;
FIG. 3 is a flow diagram of another embodiment of a model training method according to the present disclosure;
FIG. 4 is a flow diagram for one embodiment of an image processing method according to the present disclosure;
FIG. 5 is a schematic diagram of an application scenario of a model training method, an image processing method according to the present disclosure;
FIG. 6 is a schematic block diagram of one embodiment of a model training apparatus according to the present disclosure;
FIG. 7 is a schematic block diagram of one embodiment of an image processing apparatus according to the present disclosure;
FIG. 8 is a block diagram of an electronic device for implementing a model training method, an image processing method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 for embodiments of a model training method, an image processing method, or a model training apparatus, an image processing apparatus, to which the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104, e.g. to receive trained target models, etc. Various communication client applications, such as an image processing application, a social platform application, and the like, may be installed on the terminal devices 101, 102, 103. The user may process the facial image through an image processing application and a model obtained from the server 105.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smart phones, tablet computers, e-book readers, car computers, laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. They may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. This is not specifically limited herein.
The server 105 may be a server that provides various services, such as a background server that processes face images provided on the terminal devices 101, 102, 103. The background server may train the initial model by using the face image set to obtain a target model, and feed the target model back to the terminal devices 101, 102, and 103.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server 105 is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. This is not specifically limited herein.
It should be noted that the model training method provided by the embodiment of the present disclosure is generally executed by the server 105, and the image processing method provided by the embodiment of the present disclosure may be executed by the terminal devices 101, 102, and 103, or may be executed by the server 105. Accordingly, the model training apparatus is generally provided in the server 105, and the image processing apparatus may be provided in the terminal devices 101, 102, and 103, or may be provided in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a model training method according to the present disclosure is shown. The model training method of the embodiment comprises the following steps:
step 201, acquiring a face image set.
In this embodiment, an executing subject of the model training method (for example, the server 105 shown in FIG. 1) may first obtain a face image set. The face image set includes a pre-makeup face image subset and a post-makeup face image subset, and each pre-makeup face image in the pre-makeup face image subset corresponds one-to-one to a post-makeup face image in the post-makeup face image subset. A pre-makeup face image is an image captured of a face without makeup, or an image obtained by processing a face image to remove makeup. A post-makeup face image may be an image captured of a made-up face, or an image obtained by applying makeup processing to a face image. The face images in this embodiment may come from a public dataset, or may be obtained with the authorization of the users they depict.
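As an illustration of the one-to-one pairing described above, a minimal sketch of a paired dataset follows; the class name, the two-directory layout, and the assumption that sorted filenames align across the directories are hypothetical, not part of the disclosure:

```python
import os
from PIL import Image
from torch.utils.data import Dataset
import torchvision.transforms as T

class MakeupPairDataset(Dataset):
    """Hypothetical paired dataset: the i-th file in pre_dir is assumed
    to correspond one-to-one to the i-th file in post_dir."""

    def __init__(self, pre_dir, post_dir, size=256):
        self.pre_paths = sorted(os.path.join(pre_dir, f) for f in os.listdir(pre_dir))
        self.post_paths = sorted(os.path.join(post_dir, f) for f in os.listdir(post_dir))
        assert len(self.pre_paths) == len(self.post_paths), "subsets must pair one-to-one"
        self.tf = T.Compose([T.Resize((size, size)), T.ToTensor()])

    def __len__(self):
        return len(self.pre_paths)

    def __getitem__(self, i):
        pre = self.tf(Image.open(self.pre_paths[i]).convert("RGB"))
        post = self.tf(Image.open(self.post_paths[i]).convert("RGB"))
        return pre, post
```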
In some specific applications, the post-makeup face images are all made up by the same person, or all represent a single makeup style (e.g., a Tang-dynasty style, a Qing-dynasty style, etc.). It will be appreciated that each makeup style has characteristics unique to it. For example, in one makeup style the eyebrows are thin and either arched or upswept, with an upswept brow rising from about two-thirds of the way along the eye and extending past the outer corner of the eye, while the eye makeup itself is relatively simple.
Step 202, based on the face image set, executing the following training steps multiple times:
step 2021, inputting each pre-makeup face image in the pre-makeup face image subset or each post-makeup face image in the post-makeup face image subset into the model to obtain a middle face image output by the model.
In this embodiment, after the execution subject acquires the face image set, it may input each pre-makeup face image in the pre-makeup face image subset, or each post-makeup face image in the post-makeup face image subset, into the model; the output of the model is recorded as an intermediate face image. Specifically, when training a makeup model, the execution subject may input each pre-makeup face image in the pre-makeup face image subset into the model; when training a makeup removal model, it may input each post-makeup face image in the post-makeup face image subset into the model. The model may be a neural network, such as a convolutional neural network or a generative adversarial network.
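The disclosure leaves the architecture open, so purely as an assumed example, a minimal convolutional encoder-decoder that maps a face image to an intermediate face image of the same size might look like this:

```python
import torch.nn as nn

class FaceTranslationNet(nn.Module):
    """A minimal encoder-decoder sketch; the actual architecture is not
    specified by the disclosure, so this exact design is an assumption."""

    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),       # downsample
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1), nn.ReLU(inplace=True),  # downsample
            nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),  # upsample
            nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1), nn.Sigmoid(),       # back to RGB
        )

    def forward(self, x):
        return self.net(x)  # intermediate face image
```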
Step 2022, performing region segmentation on the intermediate face image and the face image corresponding to the input face image.
The execution subject may perform region segmentation on the intermediate face image and the face image corresponding to the input face image. Specifically, the execution subject may segment the intermediate face image and the corresponding face image into regions according to the facial features. The purpose of this is to refine the treatment of each facial feature during makeup application or makeup removal.
Here, if the face image input to the model is a pre-makeup face image, the face image corresponding to the input face image is the post-makeup face image; if the face image input to the model is a post-makeup face image, the corresponding face image is the pre-makeup face image.
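One way to realize the region segmentation of step 2022, sketched here under the assumption that 68-point facial landmarks are available from any off-the-shelf detector (the disclosure does not prescribe a particular segmentation method), is to crop a padded bounding box around each facial feature:

```python
import numpy as np

def segment_regions(image, landmarks, pad=8):
    """Split an H x W x 3 face image into per-feature regions using the
    common 68-point landmark convention (an illustrative assumption)."""
    groups = {
        "brows": landmarks[17:27],
        "nose":  landmarks[27:36],
        "eyes":  landmarks[36:48],
        "mouth": landmarks[48:68],
    }
    h, w = image.shape[:2]
    regions = {}
    for name, pts in groups.items():
        x0, y0 = pts.min(axis=0).astype(int) - pad
        x1, y1 = pts.max(axis=0).astype(int) + pad
        regions[name] = image[max(y0, 0):min(y1, h), max(x0, 0):min(x1, w)]
    return regions
```

Applying the same function to both the intermediate face image and its corresponding face image yields region pairs that can then be compared in step 2023.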
Step 2023, determining a loss function value based on the comparison result of each region in the intermediate face image and each region in the face image corresponding to the input face image.
The execution subject may compare each region of the intermediate face image with the corresponding region in the face image corresponding to the input face image to obtain comparison results, and determine the loss function value according to those results. Specifically, the execution subject may compare the facial features in the intermediate face image with the facial features in the face image corresponding to the input face image. The comparison result may include the similarity of the facial features across the two images, from which the loss function value may be determined.
Step 2024, adjust the parameters of the model according to the loss function values.
After obtaining the loss function value, the execution subject may adjust parameters of the model in combination with a preset threshold. The parameters of the model may include weights of convolution layers, parameters of convolution kernels, and so on.
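Taken together, steps 2021 to 2024 amount to a standard supervised update loop. The sketch below assumes a compute_loss helper implementing the region comparison of step 2023 (one concrete form of it is sketched after step 3023 below); the batch size, learning rate, and choice of optimizer are illustrative assumptions:

```python
import torch
from torch.utils.data import DataLoader

def train(model, dataset, compute_loss, epochs=10, lr=2e-4, apply_makeup=True):
    """Steps 2021-2024 repeated over the face image set. apply_makeup=True
    trains a makeup model (pre-makeup in, post-makeup target); False trains
    a makeup removal model with the pairing reversed."""
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for pre, post in loader:
            inp, target = (pre, post) if apply_makeup else (post, pre)
            intermediate = model(inp)                  # step 2021
            loss = compute_loss(intermediate, target)  # steps 2022-2023
            optimizer.zero_grad()
            loss.backward()                            # step 2024
            optimizer.step()
    return model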
According to the model training method provided by this embodiment of the disclosure, the face image set can be used to train either a face makeup model or a face makeup removal model. By comparing each region of the intermediate face image output by the model with the corresponding region of the face image corresponding to the input face image, refined region-level processing can be realized.
With continued reference to FIG. 3, a flow 300 of another embodiment of a model training method according to the present disclosure is shown. As shown in fig. 3, the method of the present embodiment may include the following steps:
step 301, acquiring a face image set.
Step 302, based on the face image set, executing the following training steps multiple times:
step 3021, inputting each pre-makeup face image in the pre-makeup face image subset or each post-makeup face image in the post-makeup face image subset into the model to obtain a middle face image output by the model.
Step 3022, performing region segmentation on the intermediate face image and the face image corresponding to the input face image.
Step 3023, for each region in the intermediate face image, determining a first comparison result according to the comparison between the region and the corresponding region in the face image corresponding to the input face image; for each pixel in the intermediate face image, determining a second comparison result according to the comparison between the pixel and the corresponding pixel in the face image corresponding to the input face image; and determining a loss function value according to the first comparison result and the second comparison result.
In this embodiment, when determining the first comparison result, the execution subject may compare each region in the intermediate face image with the corresponding region in the face image corresponding to the input face image. Specifically, if a pre-makeup face image is input, each region of the intermediate face image is compared with the corresponding region in the post-makeup face image; if a post-makeup face image is input, each region of the intermediate face image is compared with the corresponding region in the pre-makeup face image.
The execution subject may extract features from each region of the intermediate face image and from each region of the corresponding face image. The distance between the features of corresponding regions is taken as the similarity of those regions, and this similarity is taken as the first comparison result.
The execution subject may also compare the pixel value of each pixel in the intermediate face image with the pixel value of the corresponding pixel in the corresponding face image. Specifically, the execution subject may take the per-pixel differences between the two face images as the second comparison result.
The execution subject may weight the first comparison result and the second comparison result, and determine a loss function value according to the weighted result.
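As a hedged sketch of this weighting: the first comparison result is a feature distance accumulated over corresponding region pairs, the second is a per-pixel difference over the whole image, and feat_extractor, w_region, and w_pixel are illustrative assumptions rather than values fixed by the disclosure:

```python
import torch.nn.functional as F

def combined_loss(inter_regions, target_regions, inter_img, target_img,
                  feat_extractor, w_region=1.0, w_pixel=10.0):
    """Weighted sum of the region-level and pixel-level comparison results.
    `feat_extractor` stands in for whatever network extracts region features."""
    first = 0.0
    for name in inter_regions:                     # first comparison result
        fa = feat_extractor(inter_regions[name])
        fb = feat_extractor(target_regions[name])
        first = first + F.mse_loss(fa, fb)         # distance between region features
    second = F.l1_loss(inter_img, target_img)      # second comparison result
    return w_region * first + w_pixel * second
```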
In some optional implementations of the present embodiment, determining the first comparison result may be realized by the following steps, not shown in FIG. 3: determining a local color histogram of the corresponding region in the face image corresponding to the input face image; and determining the first comparison result according to the local color histogram.
In this implementation, the execution subject may calculate a local color histogram for each region in the intermediate face image and for the corresponding region in the corresponding face image. A color feature is a global feature that describes the surface properties of the scene corresponding to an image or image region. Typical color features are computed from individual pixels, with every pixel in the image or region contributing. Because color is insensitive to changes in the orientation, size, and similar properties of an image or region, color features alone do not capture local properties of objects well; moreover, when retrieval relies only on color features, many unwanted images may be returned if the database is large. The color histogram is the most commonly used way to express color features: it is invariant to image rotation and translation, and, with normalization, also to changes in image scale. The execution subject may therefore determine the first comparison result using local color histograms, which enables training on local regions.
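A minimal sketch of this comparison for one region pair follows; the bin count and the chi-square distance are assumptions, since this implementation specifies the local color histogram but not a particular distance measure:

```python
import numpy as np

def local_color_histogram(region, bins=16):
    """Normalized per-channel color histogram of one face region; the
    normalization makes the comparison insensitive to region size."""
    hists = []
    for c in range(3):  # R, G, B
        h, _ = np.histogram(region[..., c], bins=bins, range=(0, 255))
        hists.append(h.astype(np.float64))
    h = np.concatenate(hists)
    return h / (h.sum() + 1e-8)

def first_comparison(region_a, region_b, bins=16):
    """Chi-square distance between the local color histograms of two
    corresponding regions (the distance choice is an assumption)."""
    ha = local_color_histogram(region_a, bins)
    hb = local_color_histogram(region_b, bins)
    return 0.5 * np.sum((ha - hb) ** 2 / (ha + hb + 1e-8))
```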
In some optional implementations of the present embodiment, determining the second comparison result may be realized by the following steps, not shown in FIG. 3: determining the difference between the pixel value of the pixel and the pixel value of the corresponding pixel in the face image corresponding to the input face image; and determining the second comparison result according to the difference.
In this implementation, the execution subject may calculate a difference between each pixel in the intermediate face image and a corresponding pixel in the corresponding face image, and use the difference as the second comparison result.
Step 3024, adjusting the parameters of the model according to the loss function values.
The model training method provided by the above embodiments of the present disclosure may determine the loss function value using both the differences between regions and the differences between pixels. The parameters of the model are then adjusted according to the loss function value, so that the model learns the details of the face image.
Referring to fig. 4, a flow 400 of one embodiment of an image processing method according to the present disclosure is shown. As shown in fig. 4, the method of the present embodiment may include the following steps:
step 401, a target face image is obtained.
In this embodiment, the execution subject of the image processing method (for example, the terminal devices 101, 102, 103 shown in FIG. 1) may acquire the target face image in various ways, with the image provided under the user's authorization.
Step 402, determining a matching model from at least one model trained in advance according to the target face image.
The execution subject may determine a matching model from at least one pre-trained model according to the target face image. The at least one model is obtained through the model training method described in the embodiments shown in FIG. 2 or FIG. 3, and may include a makeup model and a makeup removal model. The execution subject may analyze the target face image to determine whether it is a made-up face image. If the target face image is a made-up face image, the makeup removal model may be used as the matching model; if not, the makeup model may be used as the matching model.
Step 403, determining a processed face image corresponding to the target face image according to the target face image and the matching model.
After the execution subject determines the matching model, the target face image can be input directly into the matching model to obtain the processed face image corresponding to the target face image, thereby applying makeup to, or removing makeup from, the target face image.
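End to end, steps 401 to 403 reduce to a short dispatch. In the sketch below, is_made_up stands in for whatever analysis decides whether the target face already wears makeup; it is an assumption, as this embodiment only states that the target face image is analyzed:

```python
import torch

@torch.no_grad()
def process_face(target_img, makeup_model, removal_model, is_made_up):
    """Steps 401-403: choose the matching model and apply it to the
    target face image (target_img is a 3 x H x W tensor)."""
    matching = removal_model if is_made_up(target_img) else makeup_model
    matching.eval()
    return matching(target_img.unsqueeze(0)).squeeze(0)  # processed face image
```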
According to the image processing method provided by this embodiment of the disclosure, makeup can be applied to or removed from the target face image, and the details of the processed face image are better preserved because the model has learned the details of the face image.
With continued reference to FIG. 5, a schematic diagram of an application scenario of the model training method and the image processing method according to the present disclosure is shown. In the application scenario of FIG. 5, the server 501 trains a convolutional neural network using a face image set. Specifically, the server 501 may take the pre-makeup face images in the face image set as input and the corresponding post-makeup face images as expected output, and train a makeup model. A makeup removal model can likewise be obtained by taking the post-makeup face images as input and the corresponding pre-makeup face images as expected output. The trained makeup model and makeup removal model are sent to the terminal device 502. When a user of the terminal device 502 processes a face image, a target face image is first acquired, and then, for example, the makeup model is selected to obtain a post-makeup face image.
With further reference to fig. 6, as an implementation of the method shown in fig. 2 to fig. 3, the present disclosure provides an embodiment of a model training apparatus, which corresponds to the embodiment of the method shown in fig. 2, and which can be applied in various electronic devices.
As shown in fig. 6, the model training apparatus 600 of the present embodiment includes: a first acquisition unit 601 and a model training unit 602. The model training unit 602 may include: an input module 6021, a segmentation module 6022, a determination module 6023, and an adjustment module 6024.
A first obtaining unit 601 configured to obtain a set of face images. The face image set comprises a pre-makeup face image subset and a post-makeup face image subset, and each pre-makeup face image in the pre-makeup face image subset corresponds one-to-one to a post-makeup face image in the post-makeup face image subset.
A model training unit 602 configured to perform the training step a plurality of times based on the set of face images by:
an input module 6021 configured to input each pre-makeup face image of the subset of pre-makeup face images or each post-makeup face image of the subset of post-makeup face images into the model, resulting in an intermediate face image output by the model.
A segmentation module 6022 configured to perform region segmentation on the intermediate face image and the face image corresponding to the input face image.
A determining module 6023 configured to determine a loss function value based on a comparison result of the regions in the intermediate face image and the regions in the face image corresponding to the input face image.
An adjustment module 6024 configured to adjust parameters of the model based on the loss function values.
In some optional implementations of this embodiment, the determining module 6023 may be further configured to: for each region in the intermediate face image, determine a first comparison result according to the comparison between the region and the corresponding region in the face image corresponding to the input face image; for each pixel in the intermediate face image, determine a second comparison result according to the comparison between the pixel and the corresponding pixel in the face image corresponding to the input face image; and determine the loss function value according to the first comparison result and the second comparison result.
In some optional implementations of this embodiment, the determining module 6023 may be further configured to: determining a local color histogram of a corresponding region in the face image corresponding to the input face image; a first comparison result is determined from the local color histogram.
In some optional implementations of this embodiment, the determining module 6023 may be further configured to: determine the difference between the pixel value of the pixel and the pixel value of the corresponding pixel in the face image corresponding to the input face image; and determine a second comparison result according to the difference.
It should be understood that the units 601 to 602 and the modules 6021 to 6024 recited in the model training apparatus 600 correspond to the respective steps in the method described with reference to fig. 2, respectively. Thus, the operations and features described above with respect to the model training method are equally applicable to the apparatus 600 and the units included therein, and are not described in detail here.
With further reference to fig. 7, as an implementation of the method shown in fig. 4 described above, the present disclosure provides an embodiment of an image processing apparatus, which corresponds to the embodiment of the method shown in fig. 4, and which is particularly applicable to various electronic devices.
As shown in fig. 7, the image processing apparatus 700 of the present embodiment includes: a second acquisition unit 701, a model determination unit 702, and an image processing unit 703.
A second acquisition unit 701 configured to acquire a target face image;
a model determining unit 702 configured to determine a matching model from at least one model trained in advance according to the target face image. At least one model is obtained by training through the model training method described in the embodiment of fig. 2 or fig. 3.
And the image processing unit 703 is configured to determine a processed face image corresponding to the target face image according to the target face image and the matching model.
It should be understood that the units 701 to 703 recited in the image processing apparatus 700 correspond to respective steps in the method described with reference to fig. 4, respectively. Thus, the operations and features described above for the image processing method are equally applicable to the apparatus 700 and the units included therein, and are not described in detail here.
In the technical scheme of the disclosure, the acquisition, storage, and application of the personal information of the users involved all comply with the provisions of relevant laws and regulations, and do not violate public order or good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to an embodiment of the present disclosure.
FIG. 8 illustrates a block diagram of an electronic device 800 that performs the model training method and the image processing method according to an embodiment of the disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in FIG. 8, the electronic device 800 includes a processor 801 that may perform various suitable actions and processes in accordance with a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a memory 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the electronic apparatus 800 can also be stored. The processor 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the electronic device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a memory 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the electronic device 800 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
Processor 801 may be any of a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of processor 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The processor 801 performs the various methods and processes described above, such as the model training method and the image processing method. For example, in some embodiments, the model training method and the image processing method may be implemented as a computer software program tangibly embodied in a machine-readable storage medium, such as the memory 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the electronic device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the processor 801, it may perform one or more steps of the model training method and the image processing method described above. Alternatively, in other embodiments, the processor 801 may be configured to perform the model training method and the image processing method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages, and may be packaged as a computer program product. The program code or computer program product may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, partly on the machine and partly on a remote machine as a stand-alone software package, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable storage medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in cloud computing service systems that overcomes the drawbacks of difficult management and weak service scalability found in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions of the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (13)

1. A model training method, comprising:
acquiring a face image set, wherein the face image set comprises a pre-makeup face image subset and a post-makeup face image subset, and each pre-makeup face image in the pre-makeup face image subset corresponds one-to-one to a post-makeup face image in the post-makeup face image subset;
based on the face image set, executing the following training steps multiple times:
inputting each pre-makeup face image in the pre-makeup face image subset or each post-makeup face image in the post-makeup face image subset into a model to obtain an intermediate face image output by the model;
carrying out region segmentation on the intermediate face image and a face image corresponding to the input face image;
determining a loss function value based on the comparison results of each region in the intermediate face image and each region in the face image corresponding to the input face image;
and adjusting parameters of the model according to the loss function value.
2. The method of claim 1, wherein determining the loss function value based on a comparison of regions in the intermediate face image and regions in a face image corresponding to the input face image comprises:
for each region in the intermediate face image, determining a first comparison result according to the comparison between the region and the corresponding region in the face image corresponding to the input face image;
for each pixel in the intermediate face image, determining a second comparison result according to the comparison between the pixel and the corresponding pixel in the face image corresponding to the input face image;
and determining a loss function value according to the first comparison result and the second comparison result.
3. The method of claim 2, wherein determining the first comparison result according to the comparison between the region and the corresponding region in the face image corresponding to the input face image comprises:
determining a local color histogram of a corresponding region in the face image corresponding to the input face image;
determining a first comparison result according to the local color histogram.
4. The method of claim 2, wherein determining the second comparison result according to the comparison between the pixel and the corresponding pixel in the face image corresponding to the input face image comprises:
determining the difference between the pixel value of the pixel and the pixel value of the corresponding pixel in the face image corresponding to the input face image;
and determining a second comparison result according to the difference.
5. An image processing method comprising:
acquiring a target face image;
determining a matching model from at least one model trained in advance according to the target face image, wherein the at least one model is obtained by training according to the model training method of any one of claims 1 to 4;
and determining a processed face image corresponding to the target face image according to the target face image and the matching model.
6. A model training apparatus comprising:
a first acquisition unit configured to acquire a face image set, wherein the face image set includes a pre-makeup face image subset and a post-makeup face image subset, and each pre-makeup face image in the pre-makeup face image subset corresponds to each post-makeup face image in the post-makeup face image subset one to one;
a model training unit configured to perform training steps a plurality of times based on the set of face images by:
an input module configured to input each of the pre-makeup face images in the subset of pre-makeup face images or each of the post-makeup face images in the subset of post-makeup face images into a model, resulting in an intermediate face image output by the model;
a segmentation module configured to perform region segmentation on the intermediate face image and a face image corresponding to an input face image;
a determination module configured to determine a loss function value based on a comparison result of each region in the intermediate face image and each region in a face image corresponding to an input face image;
an adjustment module configured to adjust parameters of the model according to the loss function values.
7. The apparatus of claim 6, wherein the determination module is further configured to:
for each region in the intermediate face image, determining a first comparison result according to the comparison between the region and the corresponding region in the face image corresponding to the input face image;
for each pixel in the intermediate face image, determining a second comparison result according to the comparison between the pixel and the corresponding pixel in the face image corresponding to the input face image;
and determining a loss function value according to the first comparison result and the second comparison result.
8. The apparatus of claim 7, wherein the determination module is further configured to:
determining a local color histogram of a corresponding region in the face image corresponding to the input face image;
determining a first comparison result according to the local color histogram.
9. The apparatus of claim 7, wherein the determination module is further configured to:
determining the difference between the pixel value of the pixel and the pixel value of the corresponding pixel in the face image corresponding to the input face image;
and determining a second comparison result according to the difference.
10. An image processing apparatus comprising:
a second acquisition unit configured to acquire a target face image;
a model determining unit configured to determine a matching model from at least one model trained in advance according to the target face image, wherein the at least one model is obtained by training according to the model training method of any one of claims 1 to 4;
and the image processing unit is configured to determine a processed face image corresponding to the target face image according to the target face image and the matching model.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4 or to perform the method of claim 5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-4 or to perform the method of claim 5.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method of any one of claims 1-4 or performs the method of claim 5.
CN202110409627.2A 2021-04-16 2021-04-16 Model training method, image processing method, device, equipment and storage medium Pending CN113269719A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110409627.2A CN113269719A (en) 2021-04-16 2021-04-16 Model training method, image processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110409627.2A CN113269719A (en) 2021-04-16 2021-04-16 Model training method, image processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113269719A (en) 2021-08-17

Family

ID=77228810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110409627.2A Pending CN113269719A (en) 2021-04-16 2021-04-16 Model training method, image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113269719A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114037814A (en) * 2021-11-11 2022-02-11 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and medium
CN114407913A (en) * 2022-01-27 2022-04-29 星河智联汽车科技有限公司 Vehicle control method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018214115A1 (en) * 2017-05-25 2018-11-29 华为技术有限公司 Face makeup evaluation method and device
CN112149635A (en) * 2020-10-23 2020-12-29 北京百度网讯科技有限公司 Cross-modal face recognition model training method, device, equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018214115A1 (en) * 2017-05-25 2018-11-29 华为技术有限公司 Face makeup evaluation method and device
CN112149635A (en) * 2020-10-23 2020-12-29 北京百度网讯科技有限公司 Cross-modal face recognition model training method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘施乐 (Liu Shile): "Research on Face Recognition Technology Based on Deep Learning" (基于深度学习的人脸识别技术研究), 电子制作 (Electronic Production), no. 24

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114037814A (en) * 2021-11-11 2022-02-11 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and medium
CN114037814B (en) * 2021-11-11 2022-12-23 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and medium
CN114407913A (en) * 2022-01-27 2022-04-29 星河智联汽车科技有限公司 Vehicle control method and device
CN114407913B (en) * 2022-01-27 2022-10-11 星河智联汽车科技有限公司 Vehicle control method and device

Similar Documents

Publication Publication Date Title
US20210174072A1 (en) Microexpression-based image recognition method and apparatus, and related device
CN107330904A (en) Image processing method, image processing device, electronic equipment and storage medium
CN113343826B (en) Training method of human face living body detection model, human face living body detection method and human face living body detection device
CN108734078B (en) Image processing method, image processing apparatus, electronic device, storage medium, and program
CN113221771B (en) Living body face recognition method, device, apparatus, storage medium and program product
CN113407850B (en) Method and device for determining and acquiring virtual image and electronic equipment
CN113269719A (en) Model training method, image processing method, device, equipment and storage medium
WO2021127916A1 (en) Facial emotion recognition method, smart device and computer-readabel storage medium
WO2022227765A1 (en) Method for generating image inpainting model, and device, medium and program product
CN111709875A (en) Image processing method, image processing device, electronic equipment and storage medium
KR20220100812A (en) Facial biometric detection method, device, electronics and storage media
CN113553961B (en) Training method and device of face recognition model, electronic equipment and storage medium
CN112884889B (en) Model training method, model training device, human head reconstruction method, human head reconstruction device, human head reconstruction equipment and storage medium
CN113627361B (en) Training method and device for face recognition model and computer program product
CN114817612A (en) Method and related device for calculating multi-modal data matching degree and training calculation model
CN113052962A (en) Model training method, information output method, device, equipment and storage medium
CN113177466A (en) Identity recognition method and device based on face image, electronic equipment and medium
CN112925412A (en) Control method and device of intelligent mirror and storage medium
CN116052288A (en) Living body detection model training method, living body detection device and electronic equipment
CN112560848B (en) Training method and device for POI (Point of interest) pre-training model and electronic equipment
CN115116111A (en) Anti-disturbance human face living body detection model training method and device and electronic equipment
CN115393488A (en) Method and device for driving virtual character expression, electronic equipment and storage medium
CN112200169B (en) Method, apparatus, device and storage medium for training a model
CN114049290A (en) Image processing method, device, equipment and storage medium
CN113221766A (en) Method for training living body face recognition model and method for recognizing living body face and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination