CN116091304A - Image processing method, device, electronic equipment and storage medium

Image processing method, device, electronic equipment and storage medium

Info

Publication number
CN116091304A
Authority
CN
China
Prior art keywords
image
style
feature
target area
generation model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310147136.4A
Other languages
Chinese (zh)
Inventor
陈朗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202310147136.4A priority Critical patent/CN116091304A/en
Publication of CN116091304A publication Critical patent/CN116091304A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Abstract

An image processing method, an image processing device, an electronic device and a storage medium are provided. A first style image and a second style image, which are generated based on an original image and have different styles, are acquired; a third image is generated based on target area information of the first style image and the second style image; the original image is input into an image generation model to obtain a fourth image; and the image generation model is adjusted using the first style image, the third image and the fourth image as samples. This improves the degree to which the output image of the image generation model matches the original image, while keeping the output image stylized to a certain degree.

Description

Image processing method, device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the technical field of computers, and in particular to an image processing method, an image processing apparatus, an electronic device and a storage medium.
Background
In some application scenarios, a user may wish to adjust the display style of an image, for example to give a real-life photo a cartoon style. To achieve this, the related art typically first uses a certain number of sample images having the target style (e.g., cartoon character images) to train an unpaired model generator in an unsupervised manner, uses this unpaired model generator to generate stylized images with the corresponding target style from real images, and then uses the real images and the stylized images as paired data to train a paired large model generator.
However, because the number and diversity of the sample images are limited, and the training is unsupervised and unpaired, the stylized images generated by the trained unpaired model generator are prone to errors that the user does not expect. In particular, when the input real-person image contains an exaggerated, unusual expression such as wide-open eyes or an open mouth, the degree to which the eyes and mouth of the face open and close in the stylized image generated by the unpaired model generator sometimes does not match the real-person image. For example, if the eyes of the person in the original real-person photo are tightly closed, the eyes in the stylized image that the unpaired model generator produces from that photo may not be fully closed. Directly training the paired large model generator with such flawed stylized images in turn degrades the output of the trained paired large model generator. Although a designer can manually retouch these flawed stylized images before training, doing so involves a heavy workload and low processing efficiency.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In a first aspect, according to one or more embodiments of the present disclosure, there is provided an image processing method including:
acquiring a first style image and a second style image; the first style image and the second style image are images of different styles generated based on the original image;
generating a third image based on target area information of the first style image and the second style image;
inputting the original image into an image generation model to obtain a fourth image; and
adjusting the image generation model based on the first style image, the third image, and the fourth image.
In a second aspect, according to one or more embodiments of the present disclosure, there is provided an apparatus for image processing, including:
a style image acquisition unit configured to acquire a first style image and a second style image; the first style image and the second style image are images of different styles generated based on the original image;
a third image generation unit configured to generate a third image based on target area information of the first-style image and the second-style image;
a fourth image acquisition unit, configured to input the original image into an image generation model to obtain a fourth image;
an adjustment unit configured to adjust the image generation model based on the first-style image, the third image, and the fourth image.
In a third aspect, according to one or more embodiments of the present disclosure, there is provided an electronic device comprising: at least one memory and at least one processor; wherein the memory is for storing program code, and the processor is for invoking the program code stored by the memory to cause the electronic device to perform a method provided in accordance with one or more embodiments of the present disclosure.
In a fourth aspect, according to one or more embodiments of the present disclosure, there is provided a non-transitory computer storage medium storing program code which, when executed by a computer device, causes the computer device to perform a method provided according to one or more embodiments of the present disclosure.
According to one or more embodiments of the present disclosure, by acquiring a first style image and a second style image of different styles generated based on an original image, generating a third image based on target area information of the first style image and the second style image, inputting the original image into an image generation model to obtain a fourth image, and adjusting the image generation model using the first style image, the third image, and the fourth image as samples, the degree to which the output image of the image generation model matches the original image can be improved while the output image retains a certain degree of stylization.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 shows a flow chart of an image processing method provided in accordance with an embodiment of the present disclosure;
fig. 2 is a schematic structural view of an image processing apparatus according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the steps recited in the embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Furthermore, embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., including, but not limited to. The term "based on" is based at least in part on. The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments. The term "responsive to" and related terms mean that one signal or event is affected to some extent by another signal or event, but not necessarily completely or directly. If event x occurs "in response to" event y, x may be directly or indirectly in response to y. For example, the occurrence of y may ultimately lead to the occurrence of x, but other intermediate events and/or conditions may exist. In other cases, y may not necessarily result in the occurrence of x, and x may occur even though y has not yet occurred. Furthermore, the term "responsive to" may also mean "at least partially responsive to".
The term "determining" broadly encompasses a wide variety of actions, which may include obtaining, calculating, computing, processing, deriving, investigating, looking up (e.g., looking up in a table, a database or another data structure), ascertaining and the like, and may also include receiving (e.g., receiving information), accessing (e.g., accessing data in a memory) and the like, as well as parsing, selecting, choosing, establishing and the like. Related definitions of other terms will be given in the description below. Related definitions of other terms will be given in the description below.
It will be appreciated that the data involved in the present technical solution (including but not limited to the data itself and the acquisition or use of the data) should comply with the relevant laws and regulations.
It will be appreciated that, before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with the relevant laws and regulations, of the type, scope of use, usage scenarios, etc. of the personal information involved in the present disclosure, and the user's authorization should be obtained. For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly prompt the user that the requested operation will require obtaining and using the user's personal information, so that the user may autonomously decide, according to the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, application, server or storage medium, that performs the operations of the technical solution of the present disclosure.
As an alternative but non-limiting implementation, in response to receiving an active request from the user, the prompt information may be sent to the user, for example, in a popup window, where the prompt information may be presented as text. In addition, the popup window may carry a selection control for the user to choose to "agree" or "disagree" to provide personal information to the electronic device.
It will be appreciated that images generated according to the methods provided by the embodiments of the present disclosure should be processed in compliance with the relevant laws and regulations, for example by adding, by technical means, an identification that does not affect the user's use, as stipulated, or by prompting the public about the deep-synthesis situation with a conspicuous mark in a reasonable position or area.
It will be appreciated that the notification and user-authorization process and the image processing described above are merely illustrative and do not limit the implementations of the present disclosure; other ways of satisfying the relevant laws and regulations may also be applied to the implementations of the present disclosure.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that the modifiers "a", "an" and "a plurality of" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
For the purposes of this disclosure, the phrase "a and/or B" means (a), (B), or (a and B).
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
Referring to fig. 1, fig. 1 shows a flowchart of an image processing method 100 according to an embodiment of the disclosure, where the method 100 includes steps S110 to S140.
Step S110: acquiring a first style image and a second style image; the first style image and the second style image are images of different styles generated based on the original image.
In some embodiments, the first style image is an image to be repaired having a first style, generated based on the original image; the second style image is an image having a second style, generated based on the original image.
Step S120: a third image is generated based on target region information of the first-style image and the second-style image.
The above steps S110 to S120 are used to acquire the sample data required for adjusting the parameters of the image generation model.
In some embodiments, the target region of the first style image to be repaired does not match the target region of the original image; the target area of the second style image matches the target area of the original image.
In some embodiments, the target region includes a five-sense-organs (facial feature) region, including but not limited to an eye region or a mouth region.
An exemplary description is given below.
To address the tendency of stylization to go wrong on real-person closed-eye images, a number of real-person closed-eye images may be obtained in advance as original images, and each original image may be input into the unpaired model to generate an image to be repaired. This image has the first style, but the eye shape of the face in it (for example, the degree of eye closure) matches the eye shape in the original image only poorly.
Further, the original image is input into a trained second style model to generate a second style image having a second style. The second style is a style different from the first style, and the second style model is a trained image generation model with the second stylization capability, able to generate second style images whose degree of eye opening and closing accurately matches the original image. By reusing an existing second style model in this way, a second style image whose eye shape (for example, the degree of eye closure) closely matches the eye shape in the original image can be generated.
Further, the target region information of the first style image (for example, the color and texture information of the face) may be migrated to the second style image to obtain a third image, so that the third image has the same or a similar color and style as the first style image. A sketch of one way to do this is given below.
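To make this migration concrete, the following Python sketch shows one plausible way to build the third image, assuming a simple Reinhard-style per-channel color statistics transfer restricted to a face mask. The patent does not fix a particular transfer algorithm; the function names, the mask convention and the statistics-matching approach are illustrative assumptions, not the patent's method.

```python
# A minimal sketch of step S120, under the stated assumptions: keep the
# geometry of the second style image (whose eyes/mouth already match the
# original photo) and borrow the first style image's color statistics
# inside the face region. All inputs are (H, W, 3) float arrays in [0, 1].
import numpy as np

def transfer_color_stats(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Match the per-channel mean/std of `target` to those of `source`."""
    out = target.copy()
    for c in range(3):
        s_mean, s_std = source[..., c].mean(), source[..., c].std() + 1e-6
        t_mean, t_std = target[..., c].mean(), target[..., c].std() + 1e-6
        out[..., c] = (target[..., c] - t_mean) / t_std * s_std + s_mean
    return np.clip(out, 0.0, 1.0)

def build_third_image(first_style: np.ndarray, second_style: np.ndarray,
                      face_mask: np.ndarray) -> np.ndarray:
    """Second-style geometry everywhere; first-style color inside the mask."""
    recolored = transfer_color_stats(first_style, second_style)
    mask = face_mask[..., None].astype(np.float32)  # (H, W, 1), 1 inside the face
    return recolored * mask + second_style * (1.0 - mask)
```

In this sketch the eye and mouth shapes come entirely from the second style image, and only the color is borrowed from the first style image, matching the division of labor described above.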
Step S130: inputting the original image into an image generation model to obtain a fourth image;
step S140: the image generation model is adjusted based on the first style image, the third image, and the fourth image.
According to one or more embodiments of the present disclosure, by acquiring a first style image and a second style image of different styles generated based on an original image, generating a third image based on target area information of the first style image and the second style image, inputting the original image into an image generation model to obtain a fourth image, and adjusting the image generation model using the first style image, the third image, and the fourth image as samples, the degree to which the output image of the image generation model matches the original image can be improved while the output image retains a certain degree of stylization.
In some embodiments, the first feature of the target region of the fourth image obtained by the adjusted image generation model is capable of approximating the first feature of the target region of the third image, and the second feature of the target region of the fourth image obtained by the adjusted image generation model is capable of approximating the second feature of the target region of the first style image.
In some embodiments, parameters of the image generation model may be adjusted based on a first loss between a first feature of a target region of the third image and a first feature of a target region of the fourth image, and a second loss between a second feature of a target region of the first style image and a second feature of a target region of the fourth image.
In some embodiments, the first loss corresponds to a higher weight than the second loss when parameters of the image generation model are adjusted based on the first loss and the second loss.
It should be noted that the first feature and the second feature may be completely identical, partially identical, or completely different.
In some embodiments, the first feature of the target region comprises an edge profile feature of the target region and the second feature of the target region comprises a color (e.g., skin tone) feature of the target region.
In some embodiments, the original image comprises a closed-eye face image and the target region comprises an eye region; alternatively, the original image comprises an open-mouth face image and the target region comprises a mouth region.
The exemplary description is continued below. After the original image is input into the image generation model, it is processed by the model's parameters to obtain the output image (namely, the fourth image). In principle, the contour shape of the eye region of the output image of the image generation model is expected to be as close as possible to the edge contour shape of the eye region of the third image, while features other than the edge contour (e.g., color features) in the eye region of the output image are expected to be as close as possible to those of the eye region of the first style image. To this end, a loss function may be additionally applied to the eye region alone to constrain the image generation model, so as to constrain the stylized texture details of the eye region.
In a specific embodiment, the loss function may be constructed from a first loss between the edge contour feature of the eye region of the fourth image and the edge contour feature of the eye region of the third image, and a second loss between the color feature of the eye region of the fourth image and the color feature of the eye region of the first style image, where the first loss is given a higher weight than the second loss. For example, the color feature may be extracted by a preset feature extractor, and edge extraction may then be performed on top of the color feature to obtain the edge contour feature. The edge extraction may be performed with the Laplacian operator on the output of the preset feature extractor, but the present disclosure is not limited thereto.
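As an illustration of how such a region loss could be wired up, the following PyTorch sketch assumes that a preset feature extractor has already produced color/texture feature maps for the three images, that the eye region is supplied as a binary mask, and that the edge contour feature is obtained by running a fixed Laplacian kernel over those features. The function names, the L1 distance and the weights (w1 > w2) are illustrative assumptions, not values from the patent.

```python
# Hypothetical sketch of the eye-region loss: the first loss constrains edge
# contours (fourth image vs. third image), the second loss constrains color
# features (fourth image vs. first style image); the first loss is weighted
# higher, as the text above specifies.
import torch
import torch.nn.functional as F

LAPLACIAN = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).view(1, 1, 3, 3)

def edge_map(feat: torch.Tensor) -> torch.Tensor:
    """Depthwise Laplacian over a (B, C, H, W) feature map -> edge contours."""
    k = LAPLACIAN.to(feat.device).repeat(feat.shape[1], 1, 1, 1)
    return F.conv2d(feat, k, padding=1, groups=feat.shape[1])

def region_loss(feat_fourth, feat_third, feat_first, mask, w1=10.0, w2=1.0):
    """`mask` is a (B, 1, H, W) binary eye-region mask; w1 > w2 by assumption."""
    first_loss = F.l1_loss(edge_map(feat_fourth) * mask, edge_map(feat_third) * mask)
    second_loss = F.l1_loss(feat_fourth * mask, feat_first * mask)
    return w1 * first_loss + w2 * second_loss
```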
It should be noted that the loss function provided above is additionally applied to the target area (for example, the eye area) alone, so that the target area of the output image of the image generation model can match the target area of the original image while maintaining a certain degree of the first-style stylization. Those skilled in the art will appreciate that, for areas other than the target area, the image generation model may be parameter-adjusted according to the training samples and/or loss functions employed by conventional techniques in the art, and the disclosure is not limited thereto.
Thus, according to one or more embodiments of the present disclosure, by acquiring a first style image and a second style image, generating a third image based on their target area information, and adjusting the parameters of the image generation model using the first style image and the third image as samples, the first feature of the target area of the fourth image that the adjusted image generation model produces from the original image can approach the first feature of the target area of the third image, and the second feature of the target area of the fourth image can approach the second feature of the target area of the first style image. The target area can therefore be constrained independently, so that the target area of the output image of the image generation model matches the original image while maintaining a certain degree of stylization.
In one embodiment, an image discrimination model corresponding to the target region may additionally be set up, and after the parameters of the image generation model are adjusted, the image discrimination model and the image generation model may be trained adversarially in alternating iterations. In this embodiment, on top of training the generator with the loss function, a new discriminator may be introduced for the target region alone to further constrain its shape, so that the parameters of the generator are further optimized through generative adversarial training. The training may follow conventional training schemes for generative adversarial networks, and this embodiment is not limited in this regard.
In one embodiment, during the generative adversarial training, the image discrimination model may be trained as follows: fix the parameters of the image generation model; combine the target region of the original image and the target region of the third image as a positive sample; combine the target region of the original image and the target region of the first style image as a negative sample; combine the target region of the original image and the target region of the output image (namely, the fourth image) that the current image generation model produces from the original image as a negative sample; and input these combinations into the image discrimination model for training. For example, after the original image and the third image are juxtaposed, the target region (such as the eye region) is cropped out on its own and input into the discriminator, which is told that the current combination is a positive sample; after the original image and the first style image to be repaired are juxtaposed, the target region is cropped out and input into the discriminator, which is told that the current combination is a negative sample, so that the discriminator is constrained to judge whether the target regions match; and after the original image and the output image of the current generator (namely, the fourth image) are juxtaposed, the target region is cropped out and input into the discriminator, which is told that the current combination is a negative sample. This realizes the adversarial training process between the discriminator and the generator.
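As a hedged illustration of one such discriminator update, the PyTorch sketch below channel-concatenates the original photo with each candidate image, crops out the target region, and trains the region discriminator D to call the (original, third image) pairing a match and the other two pairings mismatches. `crop_region`, the box coordinates and the binary cross-entropy objective are assumptions for illustration; the patent fixes only the sample pairings.

```python
# One discriminator step with the generator's parameters frozen.
import torch
import torch.nn.functional as F

def crop_region(img: torch.Tensor, box) -> torch.Tensor:
    """Cut the target region (e.g. the eyes) out of a (B, C, H, W) batch."""
    y0, y1, x0, x1 = box
    return img[:, :, y0:y1, x0:x1]

def discriminator_step(D, d_opt, original, third, first_style, fourth, box):
    d_opt.zero_grad()
    pos = D(crop_region(torch.cat([original, third], dim=1), box))             # positive pair
    neg1 = D(crop_region(torch.cat([original, first_style], dim=1), box))      # negative pair
    neg2 = D(crop_region(torch.cat([original, fourth.detach()], dim=1), box))  # negative; G frozen
    loss = (F.binary_cross_entropy_with_logits(pos, torch.ones_like(pos))
            + F.binary_cross_entropy_with_logits(neg1, torch.zeros_like(neg1))
            + F.binary_cross_entropy_with_logits(neg2, torch.zeros_like(neg2)))
    loss.backward()
    d_opt.step()
    return loss.item()
```

Here D is assumed to accept the six-channel crop formed by the juxtaposed pair, which is one straightforward reading of "juxtaposing" the two images.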
In one embodiment, during the generative adversarial training, the image generation model may be trained as follows: combine the target region of the original image and the target region of the output image (namely, the fourth image) that the current image generation model produces from the original image as a positive sample, and input it into the image discrimination model; then adjust the parameters of the image generation model based on the output of the image discriminator. For example, after the original image and the output image of the current generator are juxtaposed, the target region is cropped out on its own and input into the discriminator, which is told that the current combination is a positive sample; the error is computed from the discriminator's output against the positive label, and the generator's parameters are updated with an error back-propagation algorithm.
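A matching sketch of the generator update, reusing `crop_region` and the imports from the previous block: the (original, current output) pairing is shown to the discriminator with a positive label, and the resulting error is back-propagated into the generator G.

```python
# One generator step: D's parameters are left alone; only g_opt updates G.
def generator_step(G, D, g_opt, original, box):
    g_opt.zero_grad()
    fourth = G(original)                         # current output (fourth image)
    logits = D(crop_region(torch.cat([original, fourth], dim=1), box))
    # The generator wants D to judge the pairing a match (positive label).
    loss = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    loss.backward()                              # error back-propagation into G
    g_opt.step()
    return loss.item()
```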
In some embodiments, after the parameters of the image generation model have been adjusted, an original image may be input into the image generation model to obtain a face image in the first style, with the shape and color of the target region of the output face image matching those of the target region of the input original image.
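After tuning, inference is a single forward pass; a hypothetical usage sketch follows (the checkpoint name and loading style are placeholders, not from the patent):

```python
import torch

G = torch.load("stylizer_tuned.pt")  # the adjusted image generation model (placeholder file)
G.eval()
with torch.no_grad():
    # original_photo: a (1, 3, H, W) tensor holding the real-person photo
    stylized = G(original_photo)     # eye/mouth shape and color now track the input
```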
Accordingly, referring to fig. 2, there is provided an image processing apparatus 500 according to an embodiment of the present disclosure, including:
A style image acquisition unit 501 for acquiring a first style image and a second style image; the first style image and the second style image are images of different styles generated based on the original image;
a third image generation unit 502 for generating a third image based on target area information of the first-style image and the second-style image;
a fourth image acquisition unit 503, configured to input the original image into an image generation model to obtain a fourth image;
an adjustment unit 504, configured to adjust the image generation model based on the first style image, the third image, and the fourth image.
In some embodiments, the first feature of the target region of the fourth image obtained by the adjusted image generation model approximates the first feature of the target region of the third image, and the second feature of the target region of the fourth image obtained by the adjusted image generation model approximates the second feature of the target region of the first style image.
In some embodiments, the style image acquisition unit includes:
the first sub-acquisition unit is used for inputting the original image into a first style model to obtain the first style image; the first style image is an image to be repaired with a first style;
The second sub-acquisition unit is used for inputting the original image into a second style model to obtain the second style image; the second-style image is an image having a second style.
In some embodiments, the original image comprises a face image and the target region comprises a facial region.
In some embodiments, the target region information includes color texture information of the face.
In some embodiments, the adjustment unit adjusts the parameters of the image generation model based on a first loss between a first feature of the target region of the third image and a first feature of the target region of the fourth image, and a second loss between a second feature of the target region of the first style image and a second feature of the target region of the fourth image.
In some embodiments, the first feature comprises an edge profile feature and the second feature comprises a color feature.
In some embodiments, the first loss corresponds to a higher weight than the second loss when adjusting parameters of the image generation model based on the first loss and the second loss.
In some embodiments, the image processing apparatus further includes:
A setting unit configured to set an image discrimination model corresponding to the target area;
a parameter adjustment unit, configured to combine the target region of the original image and the target region of the third image as a positive sample, combine the target region of the original image and the target region of the first style image as a negative sample, and combine the target region of the original image and the target region of the fourth image as a negative sample, for parameter adjustment of the image discrimination model; and/or to combine the target region of the original image and the target region of the fourth image as a positive sample, input it into the image discrimination model, and adjust the parameters of the image generation model based on the output result of the image discriminator.
For the apparatus embodiments, since they substantially correspond to the method embodiments, reference may be made to the description of the method embodiments for the relevant parts. The apparatus embodiments described above are merely illustrative, and the modules illustrated as separate modules may or may not be separate. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment, and those of ordinary skill in the art can understand and implement it without undue burden.
Accordingly, in accordance with one or more embodiments of the present disclosure, there is provided an electronic device comprising:
at least one memory and at least one processor;
wherein the memory is for storing program code, and the processor is for invoking the program code stored by the memory to cause the electronic device to perform an image processing method provided in accordance with one or more embodiments of the present disclosure.
Accordingly, in accordance with one or more embodiments of the present disclosure, there is provided a non-transitory computer storage medium storing program code executable by a computer device to cause the computer device to perform an image processing method provided in accordance with one or more embodiments of the present disclosure.
Referring now to fig. 3, a schematic diagram of an electronic device (e.g., a terminal device or server) 800 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 3 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 3, the electronic device 800 may include a processing means (e.g., a central processor, a graphics processor, etc.) 801, which may perform various appropriate actions and processes according to programs stored in a Read Only Memory (ROM) 802 or programs loaded from a storage 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the electronic device 800 are also stored. The processing device 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
In general, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 807 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, etc.; storage 808 including, for example, magnetic tape, hard disk, etc.; communication means 809. The communication means 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While fig. 3 shows an electronic device 800 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 809, or installed from storage device 808, or installed from ROM 802. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 801.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, the clients and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the methods of the present disclosure described above.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided an image processing method including: acquiring a first style image and a second style image; the first style image and the second style image are images of different styles generated based on the original image; generating a third image based on target region information of the first-style image and the second-style image; inputting the original image into an image generation model to obtain a fourth image; the image generation model is adjusted based on the first style image, the third image, and the fourth image.
According to one or more embodiments of the present disclosure, the first feature of the target region of the fourth image obtained by the adjusted image generation model approaches the first feature of the target region of the third image, and the second feature of the target region of the fourth image obtained by the adjusted image generation model approaches the second feature of the target region of the first-style image.
According to one or more embodiments of the present disclosure, the acquiring a first style image and a second style image includes: inputting the original image into a first style model to obtain the first style image; the first style image is an image to be repaired with a first style; inputting the original image into a second style model to obtain the second style image; the second-style image is an image having a second style.
According to one or more embodiments of the present disclosure, the original image includes a face image, and the target region includes an eye region and/or a mouth region.
According to one or more embodiments of the present disclosure, the target region information includes color texture information of a face.
According to one or more embodiments of the present disclosure, the adjusting the image generation model based on the first style image, the third image, and the fourth image includes: parameters of the image generation model are adjusted based on a first loss between a first feature of a target region of the third image and a first feature of a target region of the fourth image, and a second loss between a second feature of a target region of the first style image and a second feature of a target region of the fourth image.
According to one or more embodiments of the present disclosure, the target region comprises a five sense organ region, the first feature comprises an edge profile feature, and the second feature comprises a color feature.
According to one or more embodiments of the present disclosure, when the parameters of the image generation model are adjusted based on the first loss and the second loss, the first loss corresponds to a higher weight than the second loss.
According to one or more embodiments of the present disclosure, the method further comprises: setting an image discrimination model corresponding to the target area; combining the target area of the original image and the target area of the third image as positive samples, combining the target area of the original image and the target area of the first style image as negative samples, combining the target area of the original image and the target area of the fourth image as negative samples, and performing parameter adjustment of the image discrimination model; and/or, combining the target area of the original image and the target area of the fourth image as positive samples, and inputting the image discrimination model; and adjusting parameters of the image generation model based on the output result of the image discriminator.
According to one or more embodiments of the present disclosure, there is provided an image processing apparatus including: a style image acquisition unit configured to acquire a first style image and a second style image; the first style image and the second style image are images of different styles generated based on the original image; a third image generation unit configured to generate a third image based on target area information of the first-style image and the second-style image; a fourth image acquisition unit, configured to input the original image into an image generation model to obtain a fourth image; an adjustment unit configured to adjust the image generation model based on the first-style image, the third image, and the fourth image.
According to one or more embodiments of the present disclosure, there is provided an electronic device including: at least one memory and at least one processor; wherein the memory is for storing program code, and the processor is for invoking the program code stored by the memory to cause the electronic device to perform an image processing method provided in accordance with one or more embodiments of the present disclosure.
According to one or more embodiments of the present disclosure, there is provided a non-transitory computer storage medium storing program code which, when executed by a computer device, causes the computer device to perform an image processing method provided according to one or more embodiments of the present disclosure.
The foregoing description is merely a description of the preferred embodiments of the present disclosure and of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure involved herein is not limited to technical solutions formed by the specific combinations of the features described above, but also covers other technical solutions formed by arbitrarily combining the features described above, or their equivalents, without departing from the concept of the disclosure, for example technical solutions formed by substituting the features described above with technical features having similar functions disclosed in the present disclosure (but not limited thereto).
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (12)

1. An image processing method, comprising:
acquiring a first style image and a second style image; the first style image and the second style image are images of different styles generated based on the original image;
Generating a third image based on target region information of the first-style image and the second-style image;
inputting the original image into an image generation model to obtain a fourth image;
the image generation model is adjusted based on the first style image, the third image, and the fourth image.
2. The method of claim 1, wherein a first feature of a target region of a fourth image obtained from the adjusted image generation model approximates a first feature of a target region of the third image, and a second feature of a target region of a fourth image obtained from the adjusted image generation model approximates a second feature of a target region of the first style image.
3. The method of claim 1, wherein the acquiring the first and second style images comprises:
inputting the original image into a first style model to obtain the first style image; the first style image is an image to be repaired with a first style;
inputting the original image into a second style model to obtain the second style image; the second-style image is an image having a second style.
4. The method of claim 2, wherein
the original image includes a face image, and the target region includes a five sense organs region.
5. The method of claim 1, wherein the target region information comprises color texture information of a face.
6. The method of claim 1, wherein the adjusting the image generation model based on the first style image, the third image, and the fourth image comprises:
parameters of the image generation model are adjusted based on a first loss between a first feature of a target region of the third image and a first feature of a target region of the fourth image, and a second loss between a second feature of a target region of the first style image and a second feature of a target region of the fourth image.
7. The method of claim 6, wherein the first feature comprises an edge profile feature and the second feature comprises a color feature.
8. The method of claim 6, wherein the first loss corresponds to a higher weight than the second loss when adjusting parameters of the image generation model based on the first loss and the second loss.
9. The method as recited in claim 2, further comprising:
setting an image discrimination model corresponding to the target area;
combining the target area of the original image and the target area of the third image as positive samples, combining the target area of the original image and the target area of the first style image as negative samples, combining the target area of the original image and the target area of the fourth image as negative samples, and performing parameter adjustment of the image discrimination model; and/or combining the target area of the original image and the target area of the fourth image as positive samples, inputting the image discrimination model, and adjusting parameters of the image generation model based on the output result of the image discriminator.
10. An image processing apparatus, comprising:
a style image acquisition unit configured to acquire a first style image and a second style image; the first style image and the second style image are images of different styles generated based on the original image;
a third image generation unit configured to generate a third image based on target area information of the first-style image and the second-style image;
A fourth image acquisition unit, configured to input the original image into an image generation model to obtain a fourth image;
an adjustment unit configured to adjust the image generation model based on the first-style image, the third image, and the fourth image.
11. An electronic device, comprising:
at least one memory and at least one processor;
wherein the memory is for storing program code and the processor is for invoking the program code stored in the memory to cause the electronic device to perform the method of any of claims 1-9.
12. A non-transitory computer storage medium comprising,
the non-transitory computer storage medium stores program code that, when executed by a computer device, causes the computer device to perform the method of any of claims 1 to 9.
CN202310147136.4A 2023-02-15 2023-02-15 Image processing method, device, electronic equipment and storage medium Pending CN116091304A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310147136.4A CN116091304A (en) 2023-02-15 2023-02-15 Image processing method, device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN116091304A 2023-05-09

Family

ID=86200772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310147136.4A Pending CN116091304A (en) 2023-02-15 2023-02-15 Image processing method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116091304A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination