WO2023124697A1 - Image enhancement method, apparatus, storage medium, and electronic device - Google Patents


Info

Publication number
WO2023124697A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
information
target
sample
appearance
Prior art date
Application number
PCT/CN2022/134845
Other languages
French (fr)
Chinese (zh)
Inventor
唐斯伟
郑程耀
吴文岩
钱晨
Original Assignee
上海商汤智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023124697A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/08 Learning methods
    • G06T5/60
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G06T7/13 Edge detection
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • The present disclosure relates to artificial intelligence technology, and in particular to an image enhancement method and apparatus, a storage medium, and an electronic device.
  • Image enhancement is widely used in various scenarios. For example, when training a neural network, more and richer sample images can be obtained by performing image enhancement on existing sample images. As another example, image enhancement can be used to implement face image applications such as makeup transfer and face driving.
  • Image enhancement methods in the related art either use traditional image processing operations such as stretching and interpolation, which yield enhanced images of low quality and can only perform enhancement under limited conditions, so the variety of enhancements obtainable is small.
  • Or a neural network is used for image enhancement, but training the neural network requires obtaining enough sample images. For example, a video of a certain duration of a single-ID user is often required in order to obtain multiple face images of that user from the video. The cost of obtaining training samples in this way is relatively high, and it is also very inconvenient for users.
  • Embodiments of the present disclosure at least provide an image enhancement method and device, a storage medium, and an electronic device.
  • In a first aspect, an image enhancement method is provided, comprising: performing feature extraction on a target image to obtain appearance information of the target image, wherein the target image includes a first object, and the appearance information represents surface visual features in the target image; acquiring structure information of a second object, wherein the first object and the second object are target objects of the same type, and the structure information represents contour features of the second object; and generating an enhanced image based on the appearance information and the structure information, wherein the enhanced image includes a target object having the appearance information and the structure information.
  • In some embodiments, the method is performed by an image enhancement device in which an image enhancement network is deployed, the image enhancement network including an appearance extractor and a generator. Performing feature extraction on the target image to obtain the appearance information of the target image includes: performing feature extraction on the target image through the appearance extractor in the image enhancement network. Generating the enhanced image based on the appearance information and the structure information includes: generating the enhanced image through the generator in the image enhancement network.
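As an illustration of this two-stage decomposition, here is a minimal NumPy sketch. The toy `appearance_extractor` and `generator` below are hypothetical placeholders for the trained networks, not the disclosed implementation:

```python
import numpy as np

def appearance_extractor(target_image):
    # Placeholder for a trained CNN: summarize surface visual
    # features (here, just the per-channel mean color) as a vector.
    return target_image.mean(axis=(0, 1))

def generator(appearance, structure_map):
    # Placeholder for a trained generator: paint the appearance
    # vector wherever the structure map is active.
    h, w = structure_map.shape
    out = np.zeros((h, w, appearance.shape[0]))
    out[structure_map > 0.5] = appearance
    return out

# Toy 4x4 RGB target image and a 4x4 binary structure map.
target = np.full((4, 4, 3), 0.8)
structure = np.zeros((4, 4))
structure[1:3, 1:3] = 1.0

enhanced = generator(appearance_extractor(target), structure)
print(enhanced.shape)  # (4, 4, 3)
```

The point of the split is that appearance (what the object looks like) and structure (its contour) travel through separate paths and are only combined in the generator.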
  • In some embodiments, acquiring the structure information of the second object includes: acquiring an initial image including the second object; performing key point detection on the initial image to obtain key points of the second object; and obtaining the structure information of the second object according to the key points of the second object.
  • In some embodiments, the second object is included in an auxiliary image. The method further includes: acquiring an initial image including the target object; performing key point detection on the initial image to obtain key points of the target object in the initial image; and cropping the initial image according to the key points of the target object to obtain the target image or the auxiliary image including the target object.
  • In some embodiments, after the enhanced image is generated based on the appearance information and the structure information, the method further includes replacing the corresponding image portion in the initial image with the enhanced image.
  • the first object and the second object are the same target object, or different target objects of the same type, and the target object is one of the facial features in a human face.
  • In a second aspect, a training method of an image enhancement network is provided, comprising: acquiring a sample image including a first object and structure information of a second object, wherein the first object and the second object are the same target object with different structure information, and the structure information represents contour features of the second object; performing feature extraction on the sample image through the image enhancement network to obtain appearance information of the sample image; performing image generation processing on the appearance information and the structure information through the image enhancement network, and outputting a sample enhanced image, wherein the sample enhanced image includes the target object having the appearance information and the structure information; and adjusting network parameters of the image enhancement network according to the sample enhanced image.
  • In some embodiments, the second object is included in an auxiliary image, and the image enhancement network includes an appearance extractor and a generator. Adjusting the network parameters of the image enhancement network according to the sample enhanced image includes: adjusting network parameters of the appearance extractor and the generator according to the difference between the sample enhanced image and the auxiliary image.
  • an image enhancement device comprising:
  • an appearance extraction module, configured to perform feature extraction on a target image to obtain appearance information of the target image, wherein the target image includes a first object, and the appearance information represents surface visual features in the target image;
  • a structure acquisition module configured to acquire structure information of a second object, the first object and the second object are target objects of the same type; the structure information represents the outline feature of the second object;
  • An image generating module configured to generate an enhanced image based on the appearance information and the structure information, wherein the enhanced image includes a target object with the appearance information and the structure information.
  • a training device for an image enhancement network comprising:
  • an information acquisition module, configured to acquire a sample image including a first object and structure information of a second object, wherein the first object and the second object are the same target object with different structure information, and the structure information represents contour features of the second object;
  • a feature extraction module configured to perform feature extraction on the sample image through an image enhancement network to obtain appearance information of the sample image, wherein the appearance information represents surface visual features in the sample image;
  • an image output module, configured to perform image generation processing on the appearance information and the structure information through the image enhancement network, and output a sample enhanced image, wherein the sample enhanced image includes the target object having the appearance information and the structure information;
  • a parameter adjustment module configured to adjust the network parameters of the image enhancement network according to the sample enhanced image.
  • an electronic device, including a memory and a processor, wherein the memory is configured to store computer-readable instructions, and the processor is configured to invoke the computer-readable instructions to implement the method in any embodiment of the present disclosure.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the method in any embodiment of the present disclosure is implemented.
  • The image enhancement method and apparatus, storage medium, and electronic device provided by the embodiments of the present disclosure can enhance a sample image according to various types of structure information. Since the structure information can be varied without limitation, richer sample enhanced images can be obtained, making the sample types more abundant. When the generated sample enhanced images are applied to tasks such as model training, rich and diverse samples can improve the robustness and generalization of the trained model. Compared with previous sample acquisition methods, this approach reduces the cost of sample acquisition and makes samples easier to obtain.
  • In addition, the method uses an image enhancement network to generate the sample enhanced image, which yields higher image quality than conventional image processing methods such as interpolation and stretching.
  • Fig. 1 shows a schematic flowchart of a method for training an image enhancement network provided by at least one embodiment of the present disclosure
  • Fig. 2 shows a schematic framework diagram of image enhancement provided by at least one embodiment of the present disclosure
  • Fig. 3A shows a schematic diagram of structure information of a first object provided by at least one embodiment of the present disclosure
  • Fig. 3B shows a schematic diagram of the structure information of the second object provided by at least one embodiment of the present disclosure
  • Fig. 4 shows a schematic diagram of structural information of another eye provided by at least one embodiment of the present disclosure
  • Fig. 5 shows a schematic diagram of network training provided by at least one embodiment of the present disclosure
  • Fig. 6 shows a schematic flowchart of an image enhancement method provided by at least one embodiment of the present disclosure
  • Fig. 7 shows a schematic structural diagram of an image enhancement device provided by at least one embodiment of the present disclosure
  • Fig. 8 shows a schematic structural diagram of an image enhancement network training device provided by at least one embodiment of the present disclosure.
  • Embodiments of the present disclosure aim to provide an image enhancement method, which can generate an enhanced image through a trained neural network.
  • the neural network may be called an image enhancement network
  • the enhanced image may be an image obtained after enhancement processing is performed on the basis of an initial image.
  • the enhancement process can be, for example, deforming the image.
  • Taking the enhancement of a human face image as an example, the enhancement can include, but is not limited to, changes in the angle of the face, changes in facial expression, changes in the orientation of the face, changes in the size of the facial features, and so on.
  • For example, if the initial image is a human face image in which the mouth is closed, the mouth in the face image can be transformed into a smiling mouth to obtain an enhanced image.
  • the training process of the image enhancement network will be described first, and then how to generate an enhanced image through the trained image enhancement network will be described.
  • Fig. 1 shows a schematic flowchart of a method for training an image enhancement network provided by at least one embodiment of the present disclosure. As shown in Fig. 1, the method may include the following processing:
  • In step 100, a sample image and structure information of a second object are acquired.
  • The training method of this embodiment can be performed by a training device of the image enhancement network. The training device can be deployed on an electronic device (such as a server) and can include the image enhancement network to be trained.
  • the training device of the image enhancement network can obtain the sample image to be enhanced, and in this embodiment, the image to be enhanced in the training stage can be called a sample image.
  • the sample image includes the first object.
  • the sample image may be an image including eyes, and the first object may be the eyes in the sample image.
  • the sample image may be an image including trees, and the first object may be the trees in the sample image.
  • the training device may also obtain structural information of a second object, which is the same target object with different structural information from the first object.
  • the structure information can be understood as representing the contour features of the second object, for example, the size and structure of the object.
  • For example, taking the five sense organs of a human face as an example, the acquired structure information can be contour features of the mouth, contour features of the nose, and so on; it can also be feature information such as the height of the nose.
  • The record form of the contour features includes but is not limited to: a contour line, or a plurality of key points distributed on the contour line, for which the position coordinates or key point identifiers of these key points are recorded.
  • the target object is an eye
  • the structural information may be an outline feature of the eye.
  • The structure information of the first object illustrated in Fig. 3A and the structure information of the second object illustrated in Fig. 3B can be seen as eyes with two different sets of structure information. The two objects can be the eyes of the same person, with one squinting and the other wide open, so the structure information of the two eyes is different.
  • The first object and the second object with different structure information can also be as in the following example: if the target object is a mouth, the first object can be a closed mouth and the second object an open mouth. Even if the two mouths belong to the same person, their structure information differs because of their different states; for example, due to the different mouth states, the positions of the contour key points recorded in the contour features of the closed mouth differ from those recorded in the contour features of the open mouth.
  • In step 102, feature extraction is performed on the sample image through the image enhancement network to obtain appearance information of the sample image.
  • The appearance information may be acquired through feature extraction by an appearance extractor in the image enhancement network. That is, the image enhancement network may include an appearance extractor 21, which performs feature extraction on the sample image to obtain the appearance information of the sample image.
  • the appearance extractor may include various modules such as a convolution layer, a residual module, an activation layer, and a pooling layer.
  • the appearance information represents surface visual features in the target image.
  • the surface visual features include but not limited to texture, color, lighting information, etc. in the target image.
  • the appearance information obtained can include: the brightness of the face area, the texture of the face, the color of the face, etc.
  • the appearance information of the sample image output by the appearance extractor 21 may be expressed as a one-dimensional tensor, which may be a 64*1 tensor.
  • The appearance extractor 21 may extract all or part of the appearance information, which may be determined according to actual business requirements. For example, take an eye picture: in addition to the eyes, it also includes part of the face area around the eyes and the eyebrow area. The appearance extractor 21 can extract the brightness, color, and texture of all of these areas, or only the appearance information of the eyebrow area, or only that of the face area around the eyes. Extraction of the appearance information of at least part of the region in the sample image can be realized by designing and training the appearance extractor 21.
  • the function of the appearance extractor 21 can also be designed to realize the extraction of at least part of the appearance information, for example, only the texture and texture in the sample image are extracted. color without extracting brightness.
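To make the shape of the appearance representation concrete, here is a hedged toy sketch that reduces a 64x64 grayscale patch to a 64-element vector by block averaging. A real appearance extractor would instead use trained convolution, residual, activation, and pooling layers; only the input/output shapes are taken from the text above:

```python
import numpy as np

def toy_appearance_vector(image):
    """Reduce a 64x64 grayscale image to a 64-element appearance
    vector by 8x8 block averaging (a stand-in for the conv /
    residual / pooling stack of a real appearance extractor)."""
    h, w = image.shape
    # Split into an 8x8 grid of 8x8 blocks and average each block.
    blocks = image.reshape(8, h // 8, 8, w // 8).mean(axis=(1, 3))
    return blocks.flatten()  # shape (64,)

img = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)
vec = toy_appearance_vector(img)
print(vec.shape)  # (64,)
```

Because the blocks tile the image evenly, the mean of the vector equals the mean of the whole patch, so no brightness information is lost at this granularity.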
  • Similarly, the structure information of the second object obtained in step 100 may be all or part of the structure information, determined according to actual business requirements. For example, taking the second object as an eye, full structure information can include the key points of the outer contour of the eye, the key points of the contour of the eyeball, and the center point of the eyeball, whereas partial structure information may include only the outer contour points of the eye, excluding the eyeball contour key points and the eyeball center point.
  • the structural information obtained in step 100 and the appearance information obtained in step 102 in this embodiment are at least part of the extracted information, and these information will participate in the image generation process.
  • In step 104, image generation processing is performed on the appearance information and the structure information through the image enhancement network, and a sample enhanced image is output, wherein the sample enhanced image includes the target object having the appearance information and the structure information.
  • the image enhancement network can generate a sample enhanced image according to the appearance information and structure information obtained above.
  • image generation processing may be performed by the generator 22 to output a sample enhanced image.
  • the sample enhanced image can have both appearance information and structure information.
  • the structure information may be possessed by a target object in the sample enhanced image, and the target object may be the aforementioned first object or second object.
  • the sample image is an image containing eyes
  • the structure information is a structure map of the eyes in another state.
  • In the output sample enhanced image, the structure information of the first object in the sample image is replaced with the structure information of the second object, while other information may remain unchanged; for example, the face texture around the eyes, the face color, the eyebrows, the eyeball position inside the eyes, and the eye color may be the same as in the sample image.
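One way to picture this replacement is an explicit composite: keep the sample image wherever the new structure is inactive and use regenerated pixels where it is active. This is only an illustrative sketch; the disclosed network learns the combination end-to-end rather than compositing explicitly:

```python
import numpy as np

def composite(sample, generated_region, mask):
    # Keep everything in the sample untouched except where the
    # (new) structure mask is active; there, use generated pixels.
    mask3 = mask[..., None]  # broadcast over the channel axis
    return sample * (1 - mask3) + generated_region * mask3

sample = np.full((4, 4, 3), 0.6)     # original eye patch
generated = np.full((4, 4, 3), 0.2)  # re-generated eye region
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                 # where the new structure lives

out = composite(sample, generated, mask)
print(out[0, 0, 0], out[1, 1, 0])  # 0.6 0.2
```

Outside the mask the surrounding texture, eyebrows, and color survive unchanged, which is exactly the behavior described above.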
  • In step 106, network parameters of the image enhancement network are adjusted according to the sample enhanced image.
  • the image serving as the label of the sample enhanced image may be an auxiliary image where the second object is located.
  • the auxiliary image may have the same image size as the sample image, and the auxiliary image and the sample image may include the same area.
  • For example, if the sample image in Fig. 2 includes an eye and an eyebrow, the auxiliary image corresponding to the sample image can also include an eye and an eyebrow, that is, it includes the same area as the sample image, and the two images can have the same size. The difference is that the structure information of the eyes differs between the sample image and the auxiliary image; for example, the eyes in the sample image are wide open while the eyes in the auxiliary image are squinted.
  • In this step, the network parameters of the image enhancement network can be adjusted according to the sample enhanced image. For example, based on the difference between the sample enhanced image and the auxiliary image, an L1 norm loss (L1 loss) between the two images can be computed, and the network parameters of the appearance extractor and the generator can be adjusted according to this L1 loss.
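The L1 loss between the sample enhanced image and the auxiliary image can be sketched as follows (a minimal NumPy example with made-up pixel values):

```python
import numpy as np

def l1_loss(pred, target):
    # Mean absolute difference between the sample enhanced image
    # and the auxiliary image used as its label.
    return np.mean(np.abs(pred - target))

pred = np.array([[0.2, 0.4], [0.6, 0.8]])   # toy enhanced image
label = np.array([[0.0, 0.5], [0.5, 1.0]])  # toy auxiliary image
print(l1_loss(pred, label))  # 0.15
```

In training, the gradient of this loss with respect to the network parameters is what drives the adjustment of the appearance extractor and the generator.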
  • With the above training method, the sample image can be enhanced according to various types of structure information. Since the structure information can be varied without limitation, richer sample enhanced images can be obtained, making the sample types more abundant.
  • When the generated sample enhanced images are applied to tasks such as model training, rich and diverse samples can improve the robustness and generalization of the trained model. Compared with previous sample acquisition methods, the cost of sample acquisition is reduced and sample acquisition is more convenient.
  • In addition, the method uses an image enhancement network to generate the sample enhanced image, so the quality of the generated image is higher than that of conventional image processing methods such as interpolation and stretching.
  • the first object included in the sample image may be determined according to actual application requirements.
  • For example, if the actual application requires sample enhanced images that include eyes, the first object in the sample enhanced image is an eye; if it requires images that include mouths, the first object is a mouth.
  • other organs in the facial features can also be enhanced, such as eyebrows, nose, etc.
  • the sample image containing the organ to be enhanced and the corresponding structural information of the organ can be used to generate the sample enhanced image.
  • The sample image shown in Fig. 2 is an image including eyes, but in actual implementation the initially obtained image may cover a relatively large range, such as the entire face. In that case, the initial image can be preprocessed before performing the image enhancement process shown in Fig. 2.
  • FIG. 4 schematically shows the face image obtained after cropping the initial image, and some key points of the face in the face image, for example, key point 41 , key point 42 and so on.
  • the image in Figure 4 is further cropped to obtain an image including the mouth.
  • the mouth in the mouth image is the mouth in the face image.
  • the mouth image can be used as a sample image in the training phase of the image enhancement network, or can also be used as an auxiliary image.
  • the structure information of the corresponding mouth can also be obtained.
  • the structure information may be a structure map (heatmap) corresponding to the mouth. Keypoints for the mouth may be included in the structure map.
  • the structure graph can be input into the image enhancement network to assist the sample image to generate a corresponding sample enhanced image.
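A common way to render such a structure map from keypoints is one Gaussian heatmap per keypoint. The patent does not fix the exact encoding, so the following is an assumed sketch with toy mouth keypoints:

```python
import numpy as np

def keypoints_to_heatmaps(keypoints, size, sigma=1.5):
    """Render one Gaussian heatmap per (x, y) keypoint; a common
    way to encode a structure map, though the exact encoding here
    is an assumption rather than the disclosed one."""
    h, w = size
    ys, xs = np.mgrid[0:h, 0:w]
    maps = []
    for x, y in keypoints:
        d2 = (xs - x) ** 2 + (ys - y) ** 2
        maps.append(np.exp(-d2 / (2 * sigma ** 2)))
    return np.stack(maps)  # (num_keypoints, h, w)

# Toy mouth keypoints: corners plus upper/lower lip centers.
mouth_kps = [(10, 16), (22, 16), (16, 12), (16, 20)]
hm = keypoints_to_heatmaps(mouth_kps, (32, 32))
print(hm.shape)  # (4, 32, 32)
```

Each channel peaks exactly at its keypoint, so the stacked heatmaps carry the contour geometry in a form a convolutional network can consume alongside the sample image.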
  • the following data can be prepared:
  • A small number of face images with the same ID, for example, 15 face images of the same person Xiao Zhang, as well as face images of 15,000 other IDs.
  • the same ID refers to the same person, for example, multiple face images of Xiao Wang belong to the same ID, and the ID is Xiao Wang's identification.
  • each ID has a certain number of face images with different expressions and different angles.
  • the 15,000 other IDs may be face images of Xiao Wang, Xiao Dong and other people.
  • the prepared training data may include face images of multiple IDs, and each ID may include face images of multiple expressions and different angles, and different expressions and angles may correspond to different structural information.
  • the sample image and the auxiliary image can be two face images belonging to the same ID randomly selected from the above training data.
  • For example, two face images of Xiao Zhang can be selected. Both images are of Xiao Zhang's face, but in one image Xiao Zhang's eyes are squinted while in the other his eyes are wide open. The structure information of the eyes in the two images is therefore different, while the appearance information other than the structure information is the same.
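The pairing step described above, randomly drawing two face images of the same ID as the sample image and the auxiliary image, can be sketched as follows; the index structure and filenames are hypothetical:

```python
import random

# Toy training index: ID -> list of face image filenames (made up).
train_index = {
    "xiao_zhang": ["zhang_00.png", "zhang_01.png", "zhang_02.png"],
    "xiao_wang":  ["wang_00.png", "wang_01.png"],
}

def sample_pair(index, rng):
    # Randomly pick one ID, then two distinct images of that ID:
    # one serves as the sample image, the other as the auxiliary
    # image (the label for this enhancement).
    person = rng.choice(sorted(index))
    sample_img, auxiliary_img = rng.sample(index[person], 2)
    return person, sample_img, auxiliary_img

rng = random.Random(0)
person, s, a = sample_pair(train_index, rng)
print(person, s != a)  # prints the chosen ID and True
```

Drawing both images from the same ID is what guarantees that only the structure information differs while the appearance information is shared.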
  • the auxiliary image is used as the label of this enhancement, and the network parameters of the image enhancement network are subsequently adjusted according to the difference between the auxiliary image and the sample enhanced image output by the image enhancement network.
  • each face image in the above training data may be preprocessed as shown in FIG. 4 .
  • Specifically, the key points of the face in each face image can be identified, and then, according to these key points, the face image can be cropped to obtain an image including one of the five sense organs of the face.
  • each image in the above training data may be cropped to obtain an eye image including eyes.
  • The two eye images belonging to the same person are used as the auxiliary image and the sample image respectively, and the image enhancement network shown in Fig. 2 outputs an enhanced eye image; that is, in the enhanced eye image, the structure information of the eye in the sample image is replaced with the structure information of the eye in the auxiliary image.
  • Fig. 5 shows another schematic diagram of network training provided by at least one embodiment of the present disclosure.
  • In addition to adjusting the network parameters according to the difference between the sample enhanced image and the auxiliary image as described above, the image enhancement network can also be trained using the method shown in Fig. 5.
  • the sample enhanced image and the corresponding label may be input into the discriminator 23 to obtain a discriminant value output by the discriminator 23 .
  • the discriminant value may be a numerical value between 0 and 1, which is used to represent the probability of authenticity of the sample enhanced image.
  • A first loss is obtained according to the difference between the discriminant value and the discriminant ground truth, and a second loss is obtained according to the difference between the sample enhanced image and the auxiliary image. Network parameters of at least one of the appearance extractor, the generator, and the discriminator are then adjusted according to the first loss and the second loss.
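A hedged sketch of how the two losses might be combined. The binary cross-entropy form of the first loss and the weighting factor `lam` are assumptions; the text only specifies that both losses drive the parameter updates:

```python
import numpy as np

def first_loss(d_value, real_label=1.0):
    # Binary cross-entropy between the discriminator's output in
    # (0, 1) and the ground-truth label for a real image.
    return (-real_label * np.log(d_value)
            - (1 - real_label) * np.log(1 - d_value))

def second_loss(pred, target):
    # L1 between the sample enhanced image and the auxiliary image.
    return np.mean(np.abs(pred - target))

d_value = 0.8                 # discriminator says "80% real"
pred = np.array([0.2, 0.4])
target = np.array([0.0, 0.5])
lam = 10.0                    # hypothetical weighting factor
total = first_loss(d_value) + lam * second_loss(pred, target)
print(round(total, 4))  # 1.7231
```

Weighting the reconstruction term more heavily (as `lam` does here) is a common GAN practice to keep the generated image anchored to the label while the adversarial term sharpens realism.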
  • The aforementioned generator and discriminator may adopt a conventional generative adversarial network (GAN) structure, which is not limited in this embodiment.
  • the network structure may include convolutional layers, residual modules, pooling layers, linear layers, activation layers, etc.
  • Training the image enhancement network as a generative adversarial network in this way makes the discriminant value output by the discriminator as close to the true value as possible, thereby improving the fidelity of enhanced image generation and helping to produce higher-quality enhanced images.
  • FIG. 6 shows a schematic flowchart of an image enhancement method provided by at least one embodiment of the present disclosure. As shown in Fig. 6, the method may be executed by an image enhancement device, and the method may include the following processing:
  • In step 600, feature extraction is performed on the target image to obtain appearance information of the target image; the target image includes a first object.
  • the target image may be an image including eyes, for example, the sample image shown in FIG. 2 includes an image of human eyes.
  • The eyes in the target image may be referred to as the first object, and the purpose of this embodiment may be to enhance the target image by enhancing and deforming the eyes in it.
  • the appearance information of the target image can be obtained by extracting the features of the target image through the appearance extractor in the trained image enhancement network.
  • the initial image can be preprocessed to obtain a target image including eyes.
  • the key point detection network can be used to detect the key points of the face in the initial image to obtain the key points of the face in the initial image.
  • the initial image can be cropped according to these face key points to obtain the above-mentioned target image including eyes.
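The preprocessing described above (detecting face key points in the initial image, then cropping an eye region around them) can be sketched roughly as follows. The bounding-box logic and the margin factor are illustrative assumptions; the patent does not specify how the crop is computed.

```python
import numpy as np

def crop_around_keypoints(image, keypoints, margin=0.2):
    """Crop the region around the given key points (e.g. eye landmarks),
    expanded by a relative margin and clipped to the image bounds."""
    h, w = image.shape[:2]
    xs, ys = keypoints[:, 0], keypoints[:, 1]
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    mx, my = (x1 - x0) * margin, (y1 - y0) * margin
    left = max(int(x0 - mx), 0)
    right = min(int(np.ceil(x1 + mx)), w)
    top = max(int(y0 - my), 0)
    bottom = min(int(np.ceil(y1 + my)), h)
    return image[top:bottom, left:right]

# Toy example: crop a 100x100 "face" around six hypothetical eye key points.
face = np.zeros((100, 100), dtype=np.uint8)
eye_pts = np.array([[30, 40], [35, 38], [40, 37], [45, 38], [50, 40], [40, 43]])
eye_crop = crop_around_keypoints(face, eye_pts)
```

The key points themselves would come from a face key point detection network, which is outside the scope of this sketch.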
  • step 602 the structure information of the second object is obtained according to the key points of the second object in the auxiliary image, and both the first object and the second object are target objects of the same type.
  • the second object in the auxiliary image is the same type of object as the first object, for example, both objects are eyes, or both objects are mouths.
  • the object of the same type may be referred to as a target object.
  • the eyes in the auxiliary image and the target image are different.
  • the eyes in the target image are referred to as the first object, and the eyes in the auxiliary image are referred to as the second object.
  • the first object and the second object in this embodiment may be the same target object, for example, both are Xiao Wang's eyes, with the eyes in different states in the two images (e.g., one wide open and the other squinting).
  • the first object and the second object may also belong to different target objects, for example, the first object is Xiao Wang's eyes, and the second object is Xiao Zhang's eyes.
  • the structure information of the second object can be obtained according to the key points of the second object in the auxiliary image.
  • the image enhancement network may include a network module for extracting key points; in that case, after the auxiliary image is input into the image enhancement network, the key points in the auxiliary image are extracted by this module, and the structure information of the second object is then obtained according to these key points.
  • alternatively, the image enhancement network may not include a network module for extracting key points; the structure information of the second object may instead be obtained by a processing module outside the image enhancement network and then input into the image enhancement network.
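One plausible form of the "structure information" derived from key points is a map that marks the key point locations on an image-sized grid, which can then be fed to the generator alongside the appearance features. The patent does not fix this representation; the sketch below simply rasterizes key points into a binary structure map as an assumed encoding.

```python
import numpy as np

def keypoints_to_structure_map(keypoints, height, width):
    """Rasterize (x, y) key points of the second object into a binary
    structure map of the given size (an assumed encoding of 'structure
    information'); points outside the grid are ignored."""
    smap = np.zeros((height, width), dtype=np.float32)
    for x, y in keypoints:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= xi < width and 0 <= yi < height:
            smap[yi, xi] = 1.0
    return smap

# Hypothetical eye landmarks rasterized onto a 32x32 grid.
eye_keypoints = [(10.2, 8.0), (14.9, 6.1), (20.0, 8.0)]
structure = keypoints_to_structure_map(eye_keypoints, height=32, width=32)
```

In practice, smoother encodings (e.g. Gaussian heatmaps or contour curves drawn through the points) are also common choices for such structure maps.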
  • step 604 an enhanced image is generated based on the appearance information and the structure information; in the enhanced image, the structure information of the first object in the target image is replaced with the structure information of the second object.
  • the generator in the image enhancement network can perform image generation processing according to the acquired appearance information and structure information, and finally generate an enhanced image.
  • the enhanced image includes the appearance information of the target image and the structure information of the second object in the auxiliary image; compared with the target image, the enhanced image thus replaces the structure information of the first object with the structure information of the second object.
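At the generator's input side, one common way (assumed here; the patent does not specify the fusion mechanism) to combine the appearance and structure information is to broadcast the appearance feature vector over the spatial grid of the structure map and concatenate the two along the channel axis before the generator's convolutions:

```python
import numpy as np

def combine_appearance_and_structure(appearance_vec, structure_map):
    """Broadcast a (C,) appearance feature vector over an (H, W) structure
    map and stack them into a (C + 1, H, W) generator input tensor
    (one assumed fusion scheme among several possible)."""
    h, w = structure_map.shape
    planes = np.repeat(appearance_vec[:, None, None], h, axis=1)
    planes = np.repeat(planes, w, axis=2)
    return np.concatenate([planes, structure_map[None]], axis=0)

appearance = np.array([0.1, 0.5, 0.9], dtype=np.float32)  # C = 3 appearance features
structure = np.zeros((32, 32), dtype=np.float32)          # H = W = 32 structure map
gen_input = combine_appearance_and_structure(appearance, structure)
```

Other schemes, such as injecting appearance features via normalization layers, would serve the same purpose of conditioning generation on both kinds of information.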
  • the enhanced image including eyes that is output by the image enhancement device of this embodiment through the image enhancement network can be used directly for subsequent network training; in that case, the enhanced image may not undergo further processing.
  • in other cases, the final output is an image of the entire face.
  • the initial image may be a face image of Xiao Wang, and it is desired to obtain an enhanced image that changes the structural information of Xiao Wang's eyes.
  • the structure information of Xiao Zhang's eyes can be obtained and combined with the image of Xiao Wang's eyes cropped from Xiao Wang's face image; image generation is then performed through the image enhancement network, and in the resulting enhanced image the structure information of Xiao Wang's eyes is replaced with the structure information of Xiao Zhang's eyes.
  • the enhanced image output by the image enhancement network is an image including Xiao Wang's eyes; this enhanced image can also be pasted back into the original face image, that is, the enhanced image replaces the corresponding part of Xiao Wang's face image, yielding an updated face image of Xiao Wang, which may also be called the enhanced face image of Xiao Wang.
  • an eye image (an image including the eyes) and a mouth image (an image including the mouth) are cropped from the initial face image according to the face key points; the eye image and the mouth image are then enhanced separately through the image enhancement network to obtain the corresponding enhanced images, for example, an eye-enhanced image and a mouth-enhanced image; finally, the eye-enhanced image and the mouth-enhanced image are pasted back into the initial face image, replacing the corresponding parts.
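The paste-back step above (replacing the corresponding part of the initial face image with an enhanced crop) can be sketched as a simple array assignment; in practice, blending at the seam would likely be needed, which is omitted here.

```python
import numpy as np

def paste_back(face_image, enhanced_crop, top, left):
    """Replace the region of the face image at (top, left) with the
    enhanced crop, producing the updated (enhanced) face image."""
    out = face_image.copy()
    h, w = enhanced_crop.shape[:2]
    out[top:top + h, left:left + w] = enhanced_crop
    return out

# Toy example: paste hypothetical eye and mouth enhanced crops back into a face.
face = np.zeros((100, 100), dtype=np.uint8)
eye_enhanced = np.full((10, 28), 255, dtype=np.uint8)
mouth_enhanced = np.full((12, 30), 128, dtype=np.uint8)
updated = paste_back(face, eye_enhanced, top=35, left=26)
updated = paste_back(updated, mouth_enhanced, top=70, left=35)
```

The `top`/`left` offsets here would be the crop coordinates recorded during the key-point-based cropping step, so each enhanced part returns to its original location.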
  • the process of generating an enhanced image in FIG. 6 above can be applied to network training scenarios. For example, if a neural network is to be trained but the training samples are insufficient, enhanced images can be generated as in FIG. 6 of the embodiments of the present disclosure to obtain richer sample images.
  • the image enhancement network provided by the above embodiments of the present disclosure can combine arbitrary structure information to generate an enhanced image. Taking face enhancement as an example, this method can generate richer enhanced face images, including enhanced face images with various angles and expressions.
  • such rich and diverse enhanced images, when applied to training a neural network model, help to improve the generalization and robustness of the trained model. Moreover, because the method generates enhanced images through a trained image enhancement network, and a generative adversarial approach is used in the training process, the generated enhanced images are of higher quality, more realistic and clearer.
  • small amounts of data can be enriched through the image enhancement network of the embodiments of the present disclosure, thereby reducing the difficulty of data acquisition.
  • the process of generating an enhanced image in FIG. 6 can also be applied to other scenarios; for example, it can be applied to face image enhancement applications such as makeup transfer and face driving.
  • the eye makeup of one person, Xiao Zhang, can be transferred to the eyes of another person, Xiao Wang: the appearance information related to Xiao Zhang's eye makeup is extracted through the appearance extractor in the image enhancement network, and then combined with the structure information of Xiao Wang's eyes to generate an enhanced image.
  • in the generated enhanced image, the structure of Xiao Wang's eyes is unchanged, but the eyes now have Xiao Zhang's eye makeup.
  • suppose Xiao Zhang's facial expressions are used to drive Xiao Wang's face to make the same expressions, and the specific movements are mouth movements; the appearance information of Xiao Wang's face image can then be combined with the structure information of Xiao Zhang's mouth to generate an enhanced image, so that the enhanced image is still Xiao Wang's face, but the movements and expressions of the mouth are replaced with Xiao Zhang's.
  • an embodiment of the present disclosure further provides an image enhancement device.
  • the image enhancement device may include: an appearance extraction module 71 , a structure acquisition module 72 and an image generation module 73 .
  • the appearance extraction module 71 is configured to perform feature extraction on the target image to obtain appearance information of the target image, wherein the target image includes a first object; and the appearance information represents surface visual features in the target image.
  • the structure acquisition module 72 is configured to acquire the structure information of the second object, the first object and the second object are target objects of the same type; the structure information represents the outline feature of the second object.
  • An image generating module 73 configured to generate an enhanced image based on the appearance information and the structure information, where the enhanced image includes a target object with the appearance information and the structure information.
  • the appearance extraction module 71, when used to perform feature extraction on the target image to obtain the appearance information of the target image, is configured to: perform feature extraction on the target image through an appearance extractor in the image enhancement network to obtain the appearance information of the target image.
  • the image generation module 73, when used to generate an enhanced image based on the appearance information and the structure information, is configured to: generate the enhanced image based on the appearance information and the structure information through the generator in the image enhancement network.
  • the structure acquisition module 72, when used to acquire the structure information of the second object, is configured to: acquire an initial image, which includes the second object; perform key point detection on the initial image to obtain key points of the second object in the initial image; and obtain the structure information of the second object according to the key points of the second object.
  • the device further includes: a preprocessing module.
  • the preprocessing module is configured to: acquire an initial image, which includes the target object; perform key point detection on the initial image to obtain key points of the target object in the initial image; and crop the initial image according to the key points of the target object to obtain the target image, or an auxiliary image including the target object; wherein the second object is included in the auxiliary image.
  • an embodiment of the present disclosure further provides an image enhancement network training device.
  • the training device of the image enhancement network may include: an information acquisition module 81 , a feature extraction module 82 , an image output module 83 and a parameter adjustment module 84 .
  • An information acquisition module 81 configured to acquire a sample image including a first object and structural information of a second object, wherein the first object and the second object are the same target object with different structural information; the structural information represents the outline feature of the second object.
  • the feature extraction module 82 is configured to perform feature extraction on the sample image through an image enhancement network to obtain appearance information of the sample image, and the appearance information represents surface visual features in the sample image.
  • An image output module 83 configured to perform image generation processing on the appearance information and the structure information through the image enhancement network, and output a sample enhanced image, wherein the sample enhanced image includes the appearance information and the structure information.
  • the parameter adjustment module 84 is configured to adjust network parameters of the image enhancement network according to the sample enhanced image.
  • the parameter adjustment module 84, when used to adjust the network parameters of the image enhancement network according to the sample enhanced image, is configured to: adjust the network parameters of the appearance extractor and the generator according to the difference between the sample enhanced image and the auxiliary image; wherein the second object is included in the auxiliary image, and the appearance extractor and the generator are included in the image enhancement network.
  • the parameter adjustment module 84, when used to adjust the network parameters of the image enhancement network according to the sample enhanced image, is also configured to: input the sample enhanced image into the discriminator to obtain the discriminant value output by the discriminator; obtain the first loss according to the difference between the discriminant value and the discriminant ground truth, and obtain the second loss according to the difference between the sample enhanced image and the auxiliary image; and adjust the network parameters of at least one of the appearance extractor, the generator and the discriminator according to the first loss and the second loss; wherein the second object is included in the auxiliary image.
  • one or more embodiments of the present disclosure may be provided as a method, system or computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • An embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the image enhancement method or the training method of the image enhancement network described in any embodiment of the present disclosure is implemented.
  • An embodiment of the present disclosure also provides an electronic device. The electronic device includes a memory and a processor, where the memory is used to store computer-readable instructions, and the processor is used to call the computer instructions to implement the image enhancement method or the training method of the image enhancement network described in any embodiment of the present disclosure.
  • Embodiments of the subject matter and functional operations described in this disclosure can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed in this disclosure and their structural equivalents, or in a combination of one or more of them.
  • Embodiments of the subject matter described in this disclosure can be implemented as one or more computer programs, that is, one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus.
  • the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver apparatus for execution by a data processing apparatus.
  • a computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
  • the processes and logic flows described in this disclosure can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).
  • Computers suitable for the execution of a computer program include, for example, general and/or special purpose microprocessors, or any other type of central processing unit.
  • a central processing unit will receive instructions and data from a read only memory and/or a random access memory.
  • the essential components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, such as magnetic, magneto-optical or optical disks, to receive data from them, transfer data to them, or both.
  • a computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • the processor and memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Abstract

Provided in embodiments of the present disclosure are an image enhancement method, an apparatus, a storage medium, and an electronic device. The method may comprise: performing feature extraction on a target image to obtain appearance information of the target image, wherein the target image comprises a first object, and the appearance information represents a surface visual feature in the target image; obtaining structural information of a second object, wherein the first object and the second object are target objects of a same type, and the structural information represents a contour feature of the second object; and generating an enhanced image on the basis of the appearance information and the structural information, wherein the enhanced image comprises a target object having the appearance information and the structural information.

Description

Image enhancement method and device, storage medium and electronic equipment
Cross-Reference to Related Applications
This disclosure claims priority to Chinese Patent Application No. 202111669721.8, filed on December 31, 2021, which is incorporated herein by reference.
Technical Field
The present disclosure relates to artificial intelligence technology, and in particular to an image enhancement method and device, a storage medium and electronic equipment.
Background
Image enhancement has a wide range of applications in various scenarios. For example, in the scenario of training a neural network, more and richer sample images can be obtained by performing image enhancement on the sample images. For another example, image enhancement can also be used to implement face image enhancement applications such as makeup transfer and face driving.
The image enhancement methods in the related art either use traditional image processing methods such as stretching and interpolation, in which case the quality of the enhanced images is not high and enhancement can usually only be performed under limited conditions, with few enhancement types; or they use a neural network for image enhancement, in which case training the neural network requires enough sample images. For example, a video of a certain length from a user with a single ID is often required in order to obtain multiple face images of that user from the video. The cost of obtaining training samples in this way is relatively high, and it is also inconvenient for users.
Summary of the Invention
Embodiments of the present disclosure at least provide an image enhancement method and device, a storage medium and an electronic device.
In a first aspect, an image enhancement method is provided, the method comprising:
performing feature extraction on a target image to obtain appearance information of the target image, wherein the target image includes a first object, and the appearance information represents surface visual features in the target image;
acquiring structure information of a second object, wherein the first object and the second object are target objects of the same type, and the structure information represents contour features of the second object;
generating an enhanced image based on the appearance information and the structure information, wherein the enhanced image includes a target object having the appearance information and the structure information.
In some examples, the method is performed by an image enhancement device in which an image enhancement network is deployed, the image enhancement network including an appearance extractor and a generator; the performing feature extraction on the target image to obtain the appearance information of the target image includes: performing feature extraction on the target image through the appearance extractor in the image enhancement network to obtain the appearance information of the target image; and the generating an enhanced image based on the appearance information and the structure information includes: generating the enhanced image based on the appearance information and the structure information through the generator in the image enhancement network.
In some examples, the acquiring structure information of a second object includes: acquiring an initial image, the initial image including the second object; performing key point detection on the initial image to obtain key points of the second object in the initial image; and obtaining the structure information of the second object according to the key points of the second object.
In some examples, the second object is included in an auxiliary image; the method further includes: acquiring an initial image, the initial image including the target object; performing key point detection on the initial image to obtain key points of the target object in the initial image; and cropping the initial image according to the key points of the target object to obtain the target image, or an auxiliary image, including the target object.
In some examples, the method further includes: after the generating an enhanced image based on the appearance information and the structure information, replacing the corresponding image part in the initial image with the enhanced image.
In some examples, the first object and the second object are the same target object, or different target objects of the same type, and the target object is one of the facial features of a human face.
In a second aspect, a training method for an image enhancement network is provided, the method comprising:
acquiring a sample image including a first object and structure information of a second object, wherein the first object and the second object are the same target object with different structure information, and the structure information represents contour features of the second object;
performing feature extraction on the sample image through an image enhancement network to obtain appearance information of the sample image, wherein the appearance information represents surface visual features in the sample image;
performing image generation processing on the appearance information and the structure information through the image enhancement network, and outputting a sample enhanced image, wherein the sample enhanced image includes the target object having the appearance information and the structure information;
adjusting network parameters of the image enhancement network according to the sample enhanced image.
In some examples, the second object is included in an auxiliary image, and the image enhancement network includes an appearance extractor and a generator; the adjusting network parameters of the image enhancement network according to the sample enhanced image includes: adjusting the network parameters of the appearance extractor and the generator according to the difference between the sample enhanced image and the auxiliary image.
In some examples, the second object is included in an auxiliary image; the adjusting network parameters of the image enhancement network according to the sample enhanced image includes: inputting the sample enhanced image into a discriminator to obtain a discriminant value output by the discriminator; obtaining a first loss according to the difference between the discriminant value and a discriminant ground truth, and obtaining a second loss according to the difference between the sample enhanced image and the auxiliary image; and adjusting network parameters of at least one of the appearance extractor, the generator and the discriminator according to the first loss and the second loss.
In a third aspect, an image enhancement device is provided, the device comprising:
an appearance extraction module, configured to perform feature extraction on a target image to obtain appearance information of the target image, wherein the target image includes a first object, and the appearance information represents surface visual features in the target image;
a structure acquisition module, configured to acquire structure information of a second object, wherein the first object and the second object are target objects of the same type, and the structure information represents contour features of the second object;
an image generation module, configured to generate an enhanced image based on the appearance information and the structure information, wherein the enhanced image includes a target object having the appearance information and the structure information.
In a fourth aspect, a training device for an image enhancement network is provided, the device comprising:
an information acquisition module, configured to acquire a sample image including a first object and structure information of a second object, wherein the first object and the second object are the same target object with different structure information, and the structure information represents contour features of the second object;
a feature extraction module, configured to perform feature extraction on the sample image through an image enhancement network to obtain appearance information of the sample image, wherein the appearance information represents surface visual features in the sample image;
an image output module, configured to perform image generation processing on the appearance information and the structure information through the image enhancement network, and output a sample enhanced image, wherein the sample enhanced image includes the target object having the appearance information and the structure information;
a parameter adjustment module, configured to adjust network parameters of the image enhancement network according to the sample enhanced image.
In a fifth aspect, an electronic device is provided, including a memory and a processor, the memory being used to store computer-readable instructions, and the processor being used to call the computer instructions to implement the method of any embodiment of the present disclosure.
In a sixth aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the method of any embodiment of the present disclosure is implemented.
The image enhancement method and device, storage medium and electronic equipment provided by the embodiments of the present disclosure can enhance a sample image according to various types of structure information. Since the structure information can be varied without restriction, richer sample enhanced images can be obtained, making the types of samples more abundant. When the generated sample enhanced images are applied to tasks such as model training, rich and diverse samples can improve the robustness and generalization of model training. Moreover, obtaining richer sample types in this way reduces the cost of sample acquisition compared with previous acquisition methods, making sample acquisition easier. In addition, the method generates sample enhanced images through an image enhancement network, which yields higher-quality images than conventional image processing methods such as interpolation and stretching.
Description of the Drawings
To describe the technical solutions in one or more embodiments of the present disclosure or in the related art more clearly, the accompanying drawings required for describing the embodiments or the related art are briefly introduced below. Apparently, the accompanying drawings described below are merely some embodiments recorded in one or more embodiments of the present disclosure, and a person of ordinary skill in the art may further derive other drawings from these drawings without creative efforts.
FIG. 1 is a schematic flowchart of a method for training an image enhancement network according to at least one embodiment of the present disclosure;
FIG. 2 is a schematic framework diagram of image enhancement according to at least one embodiment of the present disclosure;
FIG. 3A is a schematic diagram of structure information of a first object according to at least one embodiment of the present disclosure;
FIG. 3B is a schematic diagram of structure information of a second object according to at least one embodiment of the present disclosure;
FIG. 4 is a schematic diagram of structure information of another eye according to at least one embodiment of the present disclosure;
FIG. 5 is a schematic diagram of network training according to at least one embodiment of the present disclosure;
FIG. 6 is a schematic flowchart of an image enhancement method according to at least one embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an image enhancement apparatus according to at least one embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of an apparatus for training an image enhancement network according to at least one embodiment of the present disclosure.
Detailed Description
To enable a person skilled in the art to better understand the technical solutions in one or more embodiments of the present disclosure, the technical solutions in one or more embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings in one or more embodiments of the present disclosure. Apparently, the described embodiments are merely some rather than all of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on one or more embodiments of the present disclosure without creative efforts shall fall within the protection scope of the present disclosure.
Embodiments of the present disclosure aim to provide an image enhancement method that can generate an enhanced image through a trained neural network. The neural network may be referred to as an image enhancement network, and the enhanced image may be an image obtained by performing enhancement processing on an initial image. The enhancement processing may be, for example, deforming the image. Taking enhancement of a face image as an example, it may include, but is not limited to, changes in face angle, changes in facial expression, changes in face orientation, and changes in the size of facial features. As an example, assume that the initial image is a face image in which the mouth is closed; transforming the mouth in the face image into an open, smiling mouth yields an enhanced image.
In the following embodiments, the training process of the image enhancement network is described first, and then how an enhanced image is generated through the trained image enhancement network is described.
FIG. 1 is a schematic flowchart of a method for training an image enhancement network according to at least one embodiment of the present disclosure. As shown in FIG. 1, the method may include the following steps.
In step 100, a sample image and structure information of a second object are acquired.
The training method of this embodiment may be performed by an apparatus for training an image enhancement network. For example, the training apparatus may be deployed on an electronic device (e.g., a server) and may include the image enhancement network to be trained.
In this step, the apparatus for training the image enhancement network may acquire a sample image to be enhanced; in this embodiment, the image to be enhanced in the training stage is referred to as the sample image. The sample image includes a first object. For example, the sample image may be an image including an eye, and the first object may be the eye in the sample image. As another example, the sample image may be an image including a tree, and the first object may be the tree in the sample image.
The training apparatus may further acquire structure information of a second object, where the second object and the first object are the same target object having different structure information.
The structure information may be understood as representing contour features of the second object, for example, the size and structure of the object. Taking the facial features of a human face as an example, the acquired structure information of the facial features may be contour features of the mouth, contour features of the nose, and the like, or feature information such as the height of the nose. The recording forms of the contour features include, but are not limited to, a contour line, or a plurality of key points distributed along the contour line, for which the position coordinates or key point identifiers of these key points may be recorded.
For example, taking an eye as the target object, the structure information may be the contour features of the eye. The structure information of the first object illustrated in FIG. 3A and the structure information of the second object illustrated in FIG. 3B show eyes with two different kinds of structure information; however, the two objects may be the eyes of the same person, with one squinting and the other wide open, so the structure information of the two eyes is different.
Similarly, the first object and the second object having different structure information may also be exemplified as follows: if the target object is a mouth, the first object may be a closed mouth and the second object may be an open mouth. Even if the two mouths belong to the same person, their structure information differs because their states differ. For example, because of the different states of the mouth, the positions of the contour key points recorded in the contour features of the closed mouth differ from the positions of the contour key points recorded in the contour features of the open mouth.
In step 102, feature extraction is performed on the sample image through the image enhancement network to obtain appearance information of the sample image.
In an example, the appearance information may be obtained through feature extraction by an appearance extractor in the image enhancement network. As shown in FIG. 2, after acquiring the sample image, the training apparatus may input the sample image into the image enhancement network. The image enhancement network may include an appearance extractor, and feature extraction may be performed on the sample image through the appearance extractor 21 to obtain the appearance information of the sample image. This embodiment does not limit the network structure of the appearance extractor; for example, the appearance extractor may be composed of various modules such as convolutional layers, residual modules, activation layers, and pooling layers.
The appearance information represents surface visual features in the target image, including but not limited to texture, color, and illumination information in the target image. Taking a face image as the sample image, the appearance information obtained after feature extraction by the appearance extractor may include the illumination brightness of the face region, the texture of the face, the color of the face, and the like.
For example, the appearance information of the sample image output by the appearance extractor 21 may be represented as a one-dimensional tensor, e.g., a 64*1 tensor.
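The disclosure does not specify the exact layers of the appearance extractor 21, only that it may combine convolutional, residual, activation, and pooling modules and output a 64-dimensional appearance vector. The following is a minimal, illustrative sketch of such an extractor assuming a PyTorch implementation; the channel widths, kernel sizes, and the 128x128 input resolution are assumptions, not values from the disclosure:

```python
import torch
import torch.nn as nn

class AppearanceExtractor(nn.Module):
    """Encodes an image into a compact appearance vector (here 64-dimensional)."""
    def __init__(self, appearance_dim: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),   # downsample
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # downsample
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),        # global pooling -> (B, 64, 1, 1)
        )
        self.fc = nn.Linear(64, appearance_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)     # (B, 64)
        return self.fc(h)                   # (B, appearance_dim)

extractor = AppearanceExtractor()
sample = torch.randn(1, 3, 128, 128)        # a dummy RGB sample image
appearance = extractor(sample)              # appearance vector of shape (1, 64)
```

The global average pooling step is what collapses spatial layout, so the output tends to summarize surface attributes (color, texture, lighting) rather than geometry, matching the appearance/structure split described above.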
In addition, for the appearance information included in the sample image, the appearance extractor 21 may extract all of the appearance information or only part of it, which may be determined according to actual service requirements. For example, take an eye picture that, in addition to the eye, also includes part of the face region around the eye and the eyebrow region. The appearance extractor 21 may extract the brightness, color, and texture appearance information of all these regions, or may extract only the appearance information of the eyebrow region, or only that of the face region around the eye. Extraction of the appearance information of at least part of the regions in the sample image can be achieved by designing and training the appearance extractor 21.
Optionally, for the above appearance information such as brightness, color, and texture, the appearance extractor 21 may also be designed to extract only some of it, for example, extracting only the texture and color in the sample image without extracting the brightness.
In addition, by the same reasoning, for the structure information of the second object acquired in step 100, at least part of the structure information of the second object may be acquired according to actual service requirements. For example, taking the second object as an eye, acquiring all the structure information of the eye may include the outer-contour key points of the eye, the contour key points of the eyeball, the center point of the eyeball, and so on; acquiring only part of the structure information may include only the outer-contour points of the eye, excluding the contour key points of the eyeball and the center point of the eyeball.
As described above, the structure information acquired in step 100 and the appearance information acquired in step 102 of this embodiment are at least part of the extracted information, and this information will participate in the image generation processing.
In step 104, image generation processing is performed on the appearance information and the structure information through the image enhancement network, and a sample enhanced image is output, where the sample enhanced image includes the appearance information and a target object having the structure information.
In this embodiment, the image enhancement network may generate the sample enhanced image according to the appearance information and the structure information obtained above. For example, as shown in FIG. 2, image generation processing may be performed by a generator 22 to output the sample enhanced image. The sample enhanced image may have both the appearance information and the structure information, where the structure information belongs to the target object in the sample enhanced image, and the target object may be the aforementioned first object or second object.
In an example, referring to FIG. 2, assume that the sample image is an image containing an eye, and the structure information is a structure map of the eye in another state. After processing by the image enhancement network, compared with the sample image, the output sample enhanced image may replace the structure information of the first object in the sample image with the structure information of the second object, while other information in the sample image may remain unchanged; for example, the face texture around the eye, the face color, the eyebrow, the position of the eyeball inside the eye, and the color of the eyeball may remain the same as in the sample image.
In step 106, network parameters of the image enhancement network are adjusted according to the sample enhanced image.
In this embodiment, the image serving as the label of the sample enhanced image may be the auxiliary image in which the second object is located. The auxiliary image may have the same image size as the sample image, and the two may contain the same regions.
For example, the sample image in FIG. 2 includes an eye and an eyebrow, and the auxiliary image corresponding to the sample image may likewise include an eye and an eyebrow, i.e., it includes the same regions as the sample image, and the two images may have the same size. The difference is that the structure information of the eye differs between the sample image and the auxiliary image; for example, the eye in the sample image is wide open, while the eye in the auxiliary image is squinting.
After the sample enhanced image is obtained, the network parameters of the image enhancement network can be adjusted according to the sample enhanced image. For example, an L1-norm loss (L1 loss) between the sample enhanced image and the auxiliary image may be computed according to the difference between the two, and the network parameters of the appearance extractor and the generator may be adjusted according to the L1 loss.
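The parameter update described above can be sketched as follows, assuming a PyTorch implementation. The `extractor` and `generator` here are hypothetical stand-ins for the appearance extractor 21 and the generator 22 (tiny linear modules on flattened 8x8 images, chosen only to show the update mechanics); the optimizer choice is likewise an assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins for the appearance extractor 21 and generator 22.
extractor = nn.Linear(3 * 8 * 8, 64)            # image -> appearance vector
generator = nn.Linear(64 + 8 * 8, 3 * 8 * 8)    # appearance + structure -> image

# One optimizer over both modules, so the L1 loss adjusts both sets of parameters.
optimizer = torch.optim.Adam(
    list(extractor.parameters()) + list(generator.parameters()), lr=1e-3
)

sample_image = torch.rand(1, 3 * 8 * 8)     # flattened sample image
structure = torch.rand(1, 8 * 8)            # flattened structure map of the second object
auxiliary_image = torch.rand(1, 3 * 8 * 8)  # label: auxiliary image containing the second object

appearance = extractor(sample_image)
enhanced = generator(torch.cat([appearance, structure], dim=1))

l1_loss = F.l1_loss(enhanced, auxiliary_image)  # difference to the auxiliary image
optimizer.zero_grad()
l1_loss.backward()                              # gradients flow into both modules
optimizer.step()
```

Because the auxiliary image shares the sample image's appearance but carries the second object's structure, minimizing this L1 loss pushes the generator to reproduce exactly the "same appearance, new structure" combination.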
With the training method of the image enhancement network in this embodiment, a sample image can be enhanced according to various types of structure information. Since the structure information can be varied without restriction, richer sample enhanced images can be obtained, making the types of samples more diverse. When the generated sample enhanced images are applied to tasks such as model training, the rich and diverse samples can improve the robustness and generalization of model training. Moreover, compared with previous sample acquisition approaches, obtaining a richer variety of samples in this manner reduces the cost of sample acquisition and makes sample acquisition easier. In addition, the method generates the sample enhanced image through an image enhancement network, which yields images of higher quality than conventional image processing such as interpolation and stretching.
In addition, the first object included in the sample image may be determined according to the requirements of the actual application. For example, for a sample enhanced image obtained according to an embodiment of the present disclosure, if the actual application requires an image including an eye, the first object in the sample enhanced image is an eye. As another example, if the actual application requires an image including a mouth, the first object in the sample enhanced image is a mouth. Besides, other facial organs, such as the eyebrows or the nose, may also be enhanced. Correspondingly, a sample image containing the organ to be enhanced and the corresponding structure information of that organ may be used to generate the sample enhanced image.
The sample enhanced image illustrated in FIG. 2 is an image including an eye. In actual implementation, however, the initially obtained image may sometimes cover a larger range that includes the entire face, in which case the initial image may be preprocessed before the image enhancement procedure illustrated in FIG. 2 is performed.
Referring to FIG. 4, assume that an image, which may be called an "initial image", is obtained first, and that the initial image includes not only the face but also multiple regions such as the person's hands, neck, and clothes. In this case, face key points in the initial image (for example, 106 key points) may first be detected through a pre-trained key point detection network. The initial image is then cropped according to the detected key points to obtain an image including the face, with regions other than the face, such as the hands and the neck, removed. The size of the cropped face image may be 1024*1024. FIG. 4 illustrates the face image obtained after cropping the initial image, as well as some of the face key points in the face image, for example, key point 41 and key point 42.
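The key-point-based cropping described above can be sketched as a bounding-box crop in pure NumPy. The margin ratio and the example coordinates below are illustrative assumptions; the disclosure only states that the crop is derived from the detected key points:

```python
import numpy as np

def crop_by_keypoints(image: np.ndarray, keypoints: np.ndarray,
                      margin: float = 0.2) -> np.ndarray:
    """Crop the region bounded by the key points, expanded by a relative margin.

    image:     H x W x 3 array.
    keypoints: N x 2 array of (x, y) pixel coordinates.
    """
    x_min, y_min = keypoints.min(axis=0)
    x_max, y_max = keypoints.max(axis=0)
    dx = (x_max - x_min) * margin           # extra context around the region
    dy = (y_max - y_min) * margin
    h, w = image.shape[:2]
    x0 = max(int(x_min - dx), 0)
    y0 = max(int(y_min - dy), 0)
    x1 = min(int(x_max + dx) + 1, w)
    y1 = min(int(y_max + dy) + 1, h)
    return image[y0:y1, x0:x1]

face = np.zeros((1024, 1024, 3), dtype=np.uint8)        # cropped 1024*1024 face image
mouth_keypoints = np.array([[400, 700], [624, 700],     # hypothetical mouth key points
                            [512, 650], [512, 760]])
mouth_crop = crop_by_keypoints(face, mouth_keypoints)
```

The same helper applies equally to cutting the face out of the initial image (using all face key points) or cutting a single organ such as the mouth out of the face image (using only that organ's key points).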
Further, if one of the organ regions in the face image is to be enhanced and deformed through the image enhancement network shown in FIG. 2, for example, the mouth, the face image shown in FIG. 4 may be further cropped according to the above face key points to obtain an image including the mouth. As shown in FIG. 4, the mouth in the mouth image is the mouth in the face image. Moreover, the mouth image may serve as the sample image in the training stage of the image enhancement network, or may serve as the auxiliary image.
Still further, the structure information of the corresponding mouth can also be obtained according to the above key points of the mouth. As shown in FIG. 4, the structure information may be a structure map (heatmap) corresponding to the mouth, which may include the key points of the mouth. The structure map may be input into the image enhancement network to assist in generating the sample enhanced image corresponding to the sample image.
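A structure map of this kind is commonly built by rendering a small Gaussian peak at each key point. The sketch below assumes a single shared channel and an illustrative sigma (one channel per key point is an equally common convention; the disclosure does not specify either):

```python
import numpy as np

def keypoints_to_heatmap(keypoints, size, sigma: float = 2.0) -> np.ndarray:
    """Render key points as Gaussian peaks in a single-channel heatmap.

    keypoints: iterable of (x, y) pixel coordinates.
    size:      (height, width) of the output map.
    """
    h, w = size
    ys, xs = np.mgrid[0:h, 0:w]
    heatmap = np.zeros((h, w), dtype=np.float32)
    for x, y in keypoints:
        g = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
        heatmap = np.maximum(heatmap, g)    # keep the strongest response per pixel
    return heatmap

# Two hypothetical mouth key points rendered into a 64x64 structure map.
hm = keypoints_to_heatmap([(10, 20), (40, 30)], size=(64, 64))
```

Encoding the key points as a spatial map rather than a coordinate list lets the generator consume the structure information with ordinary convolutions, alongside the image features.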
In addition, taking the enhancement of face images as an example, the following data may be prepared as training data for the image enhancement network:
1) A small number of face images of the same ID: for example, 15 face images of the same person, Xiao Zhang. The same ID refers to the same person; for example, multiple face images of Xiao Wang belong to the same ID, which is Xiao Wang's identifier.
2) A larger number of face images of other IDs, where each ID has a certain number of face images with different expressions and at different angles. For example, the 15,000 other IDs may correspond to face images of other people such as Xiao Wang and Xiao Dong.
As above, the prepared training data may include face images of multiple IDs, and each ID may include face images with multiple expressions and at different angles, where different expressions and angles may correspond to different structure information.
When the image enhancement network is trained using the above training data, the sample image and the auxiliary image may be two face images belonging to the same ID randomly drawn from the training data. For example, two face images of Xiao Zhang may be drawn, both showing Xiao Zhang's face: in one image Xiao Zhang is squinting, and in the other his eyes are wide open. The structure information of the eyes differs between the two images, while the appearance information other than the structure information is the same. The auxiliary image thus serves as the label for this enhancement, and the network parameters of the image enhancement network are subsequently adjusted according to the difference between the auxiliary image and the sample enhanced image output by the image enhancement network.
In an example, each face image in the above training data may be preprocessed as illustrated in FIG. 4. For example, the face key points in each face image are identified, and then a face image, as well as an image including one of the facial organs, is obtained by cropping according to the face key points. As an example, if an organ image including an eye is required, each image in the above training data may be cropped to obtain an eye image. Two eye images belonging to the same person are then used as the auxiliary image and the sample image respectively, and an enhanced eye image is output through the image enhancement network shown in FIG. 2; that is, in the enhanced eye image, the structure information of the eye in the sample image is replaced with the structure information of the eye in the auxiliary image.
FIG. 5 illustrates another network training principle provided by at least one embodiment of the present disclosure. When training the image enhancement network, in addition to adjusting the network parameters according to the difference between the sample enhanced image and the auxiliary image as mentioned above, the training manner shown in FIG. 5 may also be adopted.
As shown in FIG. 5, the sample enhanced image and the corresponding label (for example, the label may be the auxiliary image) may be input into a discriminator 23 to obtain a discriminant value output by the discriminator 23. For example, the discriminant value may be a value between 0 and 1 representing the probability that the sample enhanced image is real. A first loss is obtained according to the difference between the discriminant value and the discriminant ground truth, and a second loss is obtained according to the difference between the sample enhanced image and the auxiliary image. Network parameters of at least one of the appearance extractor, the generator, and the discriminator are further adjusted according to the first loss and the second loss.
In addition, the above generator and discriminator may adopt the network structure of a conventional generative adversarial network (Generative Adversarial Nets, GAN), which is not limited in this embodiment. For example, the network structure may include convolutional layers, residual modules, pooling layers, linear layers, activation layers, and the like.
In this manner of generating the sample enhanced image by training the image enhancement network as a generative adversarial network, training drives the discriminant value output by the discriminator as close to the true value as possible, which improves the realism of the generated enhanced images and helps produce enhanced images of higher quality.
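The combination of the first (adversarial) loss and the second (reconstruction) loss can be sketched as follows, assuming a PyTorch implementation. The binary-cross-entropy form of the adversarial term and the weighting factor `lambda_l1` are standard GAN conventions assumed here, not values stated in the disclosure:

```python
import torch
import torch.nn.functional as F

def generator_objective(disc_value, enhanced, auxiliary, lambda_l1: float = 10.0):
    """First loss: discriminant value vs. the 'real' ground truth (1.0).
    Second loss: pixel difference between the enhanced and auxiliary images."""
    real_label = torch.ones_like(disc_value)
    first_loss = F.binary_cross_entropy(disc_value, real_label)
    second_loss = F.l1_loss(enhanced, auxiliary)
    return first_loss + lambda_l1 * second_loss

disc_value = torch.tensor([[0.3]])       # discriminator 23 output in (0, 1)
enhanced = torch.rand(1, 3, 64, 64)      # sample enhanced image
auxiliary = torch.rand(1, 3, 64, 64)     # auxiliary (label) image
total = generator_objective(disc_value, enhanced, auxiliary)
```

In a full training loop the discriminator would also be updated with its own loss (real label for the auxiliary image, fake label for the enhanced image), alternating with the generator/extractor update shown here.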
After the above image enhancement network is trained, it can be used to generate enhanced images. FIG. 6 is a schematic flowchart of an image enhancement method according to at least one embodiment of the present disclosure. As shown in FIG. 6, the method may be performed by an image enhancement apparatus and may include the following steps.
In step 600, feature extraction is performed on a target image to obtain appearance information of the target image, where the target image includes a first object.
In an example, the target image may be an image including an eye, such as the sample image shown in FIG. 2, which includes an image of a human eye. In this embodiment, the eye in the target image may be called the first object, and the purpose of this embodiment may be to enhance the target image by enhancing and deforming the eye in the target image.
Feature extraction may be performed on the target image through the appearance extractor in the trained image enhancement network to obtain the appearance information of the target image.
In addition, if the initial image is an image including a complete face, the initial image may be preprocessed to obtain the target image including the eye. For example, face key points in the initial image may be detected through a key point detection network, and the initial image may be cropped according to these face key points to obtain the above target image including the eye.
In step 602, structure information of a second object is obtained according to key points of the second object in an auxiliary image, where the first object and the second object are target objects of the same type.
In this step, the second object in the auxiliary image is an object of the same type as the first object; for example, both objects are eyes, or both are mouths. Objects of this same type may be called target objects. The eyes in the auxiliary image and in the target image differ: the eye in the target image is called the first object, and the eye in the auxiliary image is called the second object.
The first object and the second object in this embodiment may belong to the same target object, for example, both are Xiao Wang's eyes, with the two objects in different eye states (e.g., one wide open and one squinting). Alternatively, the first object and the second object may belong to different target objects; for example, the first object is Xiao Wang's eye, and the second object is Xiao Zhang's eye.
In this embodiment, the structure information of the second object may be obtained according to the key points of the second object in the auxiliary image. The image enhancement network may include a network module for extracting key points; after the auxiliary image is input into the image enhancement network, the key points in the auxiliary image may be extracted through this module, and the structure information of the second object may then be obtained according to the key points. Alternatively, the image enhancement network may not include a key point extraction module; instead, the structure information of the second object may be obtained through a processing module outside the image enhancement network and input into the image enhancement network.
在步骤604中,基于所述外观信息和结构信息生成增强图像,所述增强图像将所述目标图像中的第一对象的结构信息替换为所述第二对象的结构信息。In step 604, an enhanced image is generated based on the appearance information and structural information, and the enhanced image replaces the structural information of the first object in the target image with the structural information of the second object.
例如,图像增强网络中的生成器可以根据获取到的外观信息和结构信息进行图像生成处理,最终生成增强图像。该增强图像包括了目标图像的外观信息、以及辅助图像中的第二对象的结构信息,那么该增强图像与目标图像相比,将目标图像中的第一对象的结构信息替换为第二对象的结构信息。For example, the generator in the image enhancement network can perform image generation processing according to the acquired appearance information and structure information, and finally generate an enhanced image. The enhanced image includes the appearance information of the target image and the structure information of the second object in the auxiliary image; compared with the target image, the enhanced image thus replaces the structure information of the first object in the target image with the structure information of the second object.
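The disclosure does not fix a generator architecture. As a minimal toy sketch of the combination step only (the tiling, the 1x1 projection, and the function name are all assumptions made for illustration, not the patented network), the appearance code can be broadcast over the spatial grid, concatenated with the structure map, and projected to an image:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_generate(appearance_code, structure_map, weight=None):
    """Toy stand-in for a generator forward pass: tile the appearance code
    over the spatial grid, concatenate it channel-wise with the structure
    map, and project to 3 channels with a 1x1 linear layer."""
    c = appearance_code.shape[0]
    k, h, w = structure_map.shape
    tiled = np.broadcast_to(appearance_code[:, None, None], (c, h, w))
    feats = np.concatenate([tiled, structure_map], axis=0)  # (c + k, h, w)
    if weight is None:
        weight = rng.standard_normal((3, c + k)) * 0.1      # 1x1 conv weights
    out = np.tensordot(weight, feats, axes=([1], [0]))      # (3, h, w)
    return np.tanh(out)                                     # values in [-1, 1]
```

A real generator would replace the single 1x1 projection with a learned decoder, but the input pattern — appearance features combined with a spatial structure representation — is the same.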
在一个例子中,根据实际的应用需求,如果本实施例的图像增强装置通过图像增强网络输出的包括眼睛的增强图像,能够用于后续的网络训练,则该增强图像可以不再进行后续处理。在另一个例子中,尽管可以通过图2所示的图像增强网络对单个的器官部分进行增强处理,但是希望最终输出的是整张人脸的图像。例如,初始图像可以是小王的一张人脸图像,想要得到一张改变小王的眼睛的结构信息的增强图像。那么,可以获取小张的眼睛的结构信息,并结合该小张眼睛的结构信息、以及由小王的人脸图像裁剪得到的小王眼睛图像,通过图像增强网络进行图像生成处理,得到的增强图像中将小王的眼睛的结构信息替换为了小张眼睛的结构信息。但是此时图像增强网络输出的增强图像是包括小王眼睛的图像,还可以将该增强图像贴回到最初的小王人脸图像中,即将增强图像替换到小王人脸图像中的对应部分,可得到更新后的小王人脸图像,也可以称为小王的增强人脸图像。In one example, according to actual application requirements, if the enhanced image including the eyes output by the image enhancement device of this embodiment through the image enhancement network can be used directly for subsequent network training, no further processing of the enhanced image is needed. In another example, although a single facial part can be enhanced through the image enhancement network shown in FIG. 2, the desired final output is an image of the entire face. For example, the initial image may be a face image of Xiao Wang, and an enhanced image that changes the structure information of Xiao Wang's eyes is desired. In that case, the structure information of Xiao Zhang's eyes can be obtained and combined with the image of Xiao Wang's eyes cropped from Xiao Wang's face image; image generation processing is performed through the image enhancement network, and in the resulting enhanced image the structure information of Xiao Wang's eyes is replaced with that of Xiao Zhang's eyes. At this point, however, the enhanced image output by the image enhancement network is an image that only includes Xiao Wang's eyes. The enhanced image can then be pasted back into the original face image of Xiao Wang, i.e., the enhanced image replaces the corresponding part of Xiao Wang's face image, yielding an updated face image of Xiao Wang, which may also be called Xiao Wang's enhanced face image.
在又一个例子中,如果想要改变了眼睛和嘴巴等多个器官的增强人脸图像,可以如下处理:由初始人脸图像中根据人脸关键点分别裁剪得到眼睛图像(包括眼睛的图像)和嘴巴图像(包括嘴巴的图像),然后,将眼睛图像和嘴巴图像分别通过图像增强网络进行增强处理,得到各自对应的增强图像,例如,眼睛增强图像和嘴巴增强图像。最后,再分别将所述的眼睛增强图像和嘴巴增强图像贴回到初始图像中,替换掉上述的初始人脸图像中的对应部分。In yet another example, if an enhanced face image in which multiple parts such as the eyes and the mouth are changed is desired, the processing can be as follows: an eye image (an image including the eyes) and a mouth image (an image including the mouth) are respectively cropped from the initial face image according to the face key points; then, the eye image and the mouth image are each enhanced through the image enhancement network to obtain their corresponding enhanced images, for example, an enhanced eye image and an enhanced mouth image. Finally, the enhanced eye image and the enhanced mouth image are pasted back into the initial image, replacing the corresponding parts of the initial face image.
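The crop-by-keypoints and paste-back steps above can be sketched as follows. This is a simplified illustration with hard box replacement (a real pipeline would typically add alignment and soft blending at the seams), and the function names are assumptions:

```python
import numpy as np

def crop_by_keypoints(image, keypoints, margin=4):
    """Crop the tight bounding box around a set of facial key points,
    expanded by a small margin. Returns the patch and its box."""
    xs, ys = keypoints[:, 0], keypoints[:, 1]
    h, w = image.shape[:2]
    x0 = max(int(xs.min()) - margin, 0)
    y0 = max(int(ys.min()) - margin, 0)
    x1 = min(int(xs.max()) + margin + 1, w)
    y1 = min(int(ys.max()) + margin + 1, h)
    return image[y0:y1, x0:x1].copy(), (x0, y0, x1, y1)

def paste_back(image, patch, box):
    """Paste an enhanced patch back into its original location."""
    x0, y0, x1, y1 = box
    out = image.copy()
    out[y0:y1, x0:x1] = patch
    return out
```

For several parts (eyes, mouth), each set of key points yields its own box; each patch is enhanced independently and pasted back with `paste_back`.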
上述图6的生成增强图像的流程,可以应用在网络的训练场景,比如要训练一个神经网络,而训练样本不够,则通过本公开实施例的上述图6的方式生成增强图像,以得到更为丰富的样本图像。如上本公开实施例提供的图像增强网络,可以结合任意的结构信息生成增强图像,以人脸的增强为例,通过该方法生成增强图像时,可以生成较为丰富的人脸增强图像,可以包括多种角度、表情的人脸增强图像。这种丰富多样的增强图像,在应用于训练神经网络模型时,有助于提高所训练的神经网络模型的泛化性和鲁棒性,而且该方法通过训练好的图像增强网络来生成增强图像,并且还在训练过程中使用了生成对抗的方式进行训练,使得生成的增强图像的质量较高,更为逼真和清晰。The process of generating an enhanced image in FIG. 6 above can be applied to network training scenarios. For example, if a neural network is to be trained but the training samples are insufficient, enhanced images can be generated in the manner of FIG. 6 according to the embodiments of the present disclosure to obtain richer sample images. The image enhancement network provided by the embodiments of the present disclosure can generate enhanced images in combination with arbitrary structure information. Taking face enhancement as an example, this method can generate a rich variety of enhanced face images, covering multiple angles and expressions. Such rich and diverse enhanced images, when used to train a neural network model, help improve the generalization and robustness of the trained model. Moreover, since this method generates enhanced images through a trained image enhancement network, and generative adversarial training is used during the training process, the generated enhanced images are of higher quality, more realistic, and clearer.
在具有一定的数据获取难度的场景下,例如,只能获取到少量的同一ID的数据,那么可以通过本公开实施例的图像增强网络对这些少量的数据进行丰富,这样在获取数据时就降低了数据获取的难度。In scenarios where data acquisition is difficult, for example, where only a small amount of data of the same ID can be obtained, this small amount of data can be enriched through the image enhancement network of the embodiments of the present disclosure, thereby reducing the difficulty of data acquisition.
此外,图6的生成增强图像的流程,还可以应用于其他场景,例如,可以应用于妆容迁移、人脸驱动等人脸图像增强类应用。In addition, the process of generating an enhanced image in FIG. 6 can also be applied to other scenarios, for example, face image enhancement applications such as makeup transfer and face driving.
例如,想要将初始图像中的人脸的眼睛进行变换,则通过本方法将包含眼睛的图像进行增强,并将增强后的眼睛图像替换掉初始图像中的眼睛。For example, if you want to transform the eyes of the human face in the original image, you can use this method to enhance the image containing the eyes, and replace the eyes in the original image with the enhanced eye image.
又例如,在妆容迁移的场景中,可以将一个人小张的眼睛妆容迁移到另一个人小王的眼睛,那么可以通过图像增强网络中的外观提取器提取小张眼睛的妆容相关的外观信息,然后再结合小王的眼睛的结构信息,生成增强图像,在该增强图像中,小王的眼睛结构没变,但是其已经具备了小张的眼睛妆容。For another example, in a makeup transfer scenario, the eye makeup of one person, Xiao Zhang, can be transferred to the eyes of another person, Xiao Wang. The appearance information related to Xiao Zhang's eye makeup can be extracted through the appearance extractor in the image enhancement network, and then combined with the structure information of Xiao Wang's eyes to generate an enhanced image. In the enhanced image, the structure of Xiao Wang's eyes is unchanged, but the eyes now have Xiao Zhang's eye makeup.
再例如,在人脸驱动的场景中,假设要用小张的人脸表情去驱动小王的人脸做同样的表情动作,并且假设具体是嘴部的动作。那可以结合小王的人脸图片的外观信息、以及小张的嘴部的结构信息,生成增强图像,使得增强图像中还是小王的人脸,只是嘴部的动作表情换成了小张的表情。For yet another example, in a face-driving scenario, suppose Xiao Zhang's facial expression is used to drive Xiao Wang's face to make the same expression, specifically a mouth movement. The appearance information of Xiao Wang's face image can be combined with the structure information of Xiao Zhang's mouth to generate an enhanced image, so that the enhanced image still shows Xiao Wang's face, but with the mouth expression replaced by Xiao Zhang's.
为了实现上述本公开任一实施例的图像增强方法,本公开实施例还提供了图像增强装置。如图7所示,该图像增强装置可以包括:外观提取模块71、结构获取模块72和图像生成模块73。In order to implement the image enhancement method in any embodiment of the present disclosure above, an embodiment of the present disclosure further provides an image enhancement device. As shown in FIG. 7 , the image enhancement device may include: an appearance extraction module 71 , a structure acquisition module 72 and an image generation module 73 .
外观提取模块71,用于对目标图像进行特征提取,得到所述目标图像的外观信息,其中,所述目标图像中包括第一对象;所述外观信息表示所述目标图像中的表面视觉特征。The appearance extraction module 71 is configured to perform feature extraction on the target image to obtain appearance information of the target image, wherein the target image includes a first object; and the appearance information represents surface visual features in the target image.
结构获取模块72,用于获取第二对象的结构信息,所述第一对象和所述第二对象为同种类的目标对象;所述结构信息表示所述第二对象的轮廓特征。The structure acquisition module 72 is configured to acquire the structure information of the second object, the first object and the second object are target objects of the same type; the structure information represents the outline feature of the second object.
图像生成模块73,用于基于所述外观信息和所述结构信息生成增强图像,所述增强图像包括具有所述外观信息以及所述结构信息的目标对象。An image generating module 73, configured to generate an enhanced image based on the appearance information and the structure information, where the enhanced image includes a target object with the appearance information and the structure information.
在一个例子中,所述外观提取模块71,在用于对目标图像进行特征提取,得到所述目标图像的外观信息时,包括:通过图像增强网络中的外观提取器对所述目标图像进行特征提取,得到所述目标图像的外观信息。In one example, when the appearance extraction module 71 is configured to perform feature extraction on the target image to obtain the appearance information of the target image, this includes: performing feature extraction on the target image through an appearance extractor in the image enhancement network to obtain the appearance information of the target image.
所述图像生成模块73,在用于基于所述外观信息和所述结构信息生成增强图像时,包括:通过所述图像增强网络中的所述生成器基于所述外观信息和所述结构信息生成增强图像。When the image generation module 73 is configured to generate an enhanced image based on the appearance information and the structure information, this includes: generating the enhanced image based on the appearance information and the structure information through the generator in the image enhancement network.
在一个例子中,所述结构获取模块72,在用于获取第二对象的结构信息时,包括:获取初始图像,所述初始图像中包括所述第二对象;对所述初始图像进行关键点检测,得到所述初始图像中所述第二对象的关键点;根据所述第二对象的所述关键点,得到所述第二对象的所述结构信息。In one example, when the structure acquisition module 72 is configured to acquire the structure information of the second object, this includes: acquiring an initial image that includes the second object; performing key point detection on the initial image to obtain key points of the second object in the initial image; and obtaining the structure information of the second object according to the key points of the second object.
在一个例子中,所述装置还包括:预处理模块。所述预处理模块,用于获取初始图像,所述初始图像中包括所述目标对象;对所述初始图像进行关键点检测,得到所述初始图像中所述目标对象的关键点;根据所述目标对象的所述关键点对所述初始图像进行裁剪,得到包括所述目标对象的所述目标图像或者辅助图像;其中,所述第二对象包括在所述辅助图像中。In one example, the apparatus further includes a preprocessing module. The preprocessing module is configured to: acquire an initial image that includes the target object; perform key point detection on the initial image to obtain key points of the target object in the initial image; and crop the initial image according to the key points of the target object to obtain the target image or an auxiliary image including the target object, where the second object is included in the auxiliary image.
为了实现上述本公开任一实施例的图像增强网络的训练方法,本公开实施例还提供了图像增强网络的训练装置。如图8所示,该图像增强网络的训练装置可以包括:信息获取模块81、特征提取模块82、图像输出模块83和参数调整模块84。In order to implement the image enhancement network training method of any embodiment of the present disclosure, an embodiment of the present disclosure further provides an image enhancement network training device. As shown in FIG. 8 , the training device of the image enhancement network may include: an information acquisition module 81 , a feature extraction module 82 , an image output module 83 and a parameter adjustment module 84 .
信息获取模块81,用于获取包含第一对象的样本图像以及第二对象的结构信息,其中,所述第一对象和所述第二对象是具有不同结构信息的同一目标对象;所述结构信息表示所述第二对象的轮廓特征。An information acquisition module 81, configured to acquire a sample image including a first object and structural information of a second object, wherein the first object and the second object are the same target object with different structural information; the structural information represents the outline feature of the second object.
特征提取模块82,用于通过图像增强网络对所述样本图像进行特征提取,得到样本图像的外观信息,所述外观信息表示所述样本图像中的表面视觉特征。The feature extraction module 82 is configured to perform feature extraction on the sample image through an image enhancement network to obtain appearance information of the sample image, and the appearance information represents surface visual features in the sample image.
图像输出模块83,用于通过所述图像增强网络对所述外观信息和所述结构信息进行图像生成处理,输出样本增强图像,其中,所述样本增强图像包括具有所述外观信息以及所述结构信息的所述目标对象。The image output module 83 is configured to perform image generation processing on the appearance information and the structure information through the image enhancement network and output a sample enhanced image, where the sample enhanced image includes the target object having the appearance information and the structure information.
参数调整模块84,用于根据所述样本增强图像,调整所述图像增强网络的网络参数。The parameter adjustment module 84 is configured to adjust network parameters of the image enhancement network according to the sample enhanced image.
在一个例子中,所述参数调整模块84,在用于根据所述样本增强图像,调整所述图像增强网络的网络参数时,包括:根据所述样本增强图像和辅助图像之间的差异,调整所述外观提取器和生成器的网络参数;其中,所述第二对象包括在所述辅助图像中,所述外观提取器和所述生成器包括在所述图像增强网络中。In one example, when the parameter adjustment module 84 is configured to adjust the network parameters of the image enhancement network according to the sample enhanced image, this includes: adjusting the network parameters of the appearance extractor and the generator according to the difference between the sample enhanced image and an auxiliary image, where the second object is included in the auxiliary image, and the appearance extractor and the generator are included in the image enhancement network.
在一个例子中,所述参数调整模块84,在用于根据所述样本增强图像,调整所述图像增强网络的网络参数时,包括:将所述样本增强图像输入所述判别器,得到所述判别器输出的判别值;根据所述判别值与判别真值之间的差异得到第一损失,并根据所述样本增强图像和辅助图像之间的差异得到第二损失;根据所述第一损失和所述第二损失,调整所述外观提取器、所述生成器和所述判别器中至少一个的网络参数;其中,所述第二对象包括在所述辅助图像中。In one example, when the parameter adjustment module 84 is configured to adjust the network parameters of the image enhancement network according to the sample enhanced image, this includes: inputting the sample enhanced image into the discriminator to obtain a discriminant value output by the discriminator; obtaining a first loss according to the difference between the discriminant value and a discriminant true value, and obtaining a second loss according to the difference between the sample enhanced image and an auxiliary image; and adjusting network parameters of at least one of the appearance extractor, the generator, and the discriminator according to the first loss and the second loss, where the second object is included in the auxiliary image.
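The two loss terms above can be made concrete as follows. The disclosure only names them as differences; binary cross-entropy for the discriminant-value term and an L1 distance for the image term are common choices in adversarial image generation and are shown here as assumptions, not mandated by the text:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy between discriminator outputs and labels."""
    p = np.clip(pred, eps, 1 - eps)
    return float(np.mean(-(target * np.log(p) + (1 - target) * np.log(1 - p))))

def training_losses(disc_value, enhanced, auxiliary):
    """First loss: discriminant value vs. the 'real' label (the adversarial
    term for the generator side). Second loss: L1 reconstruction distance
    between the sample enhanced image and the auxiliary image."""
    first_loss = bce(disc_value, np.ones_like(disc_value))
    second_loss = float(np.mean(np.abs(enhanced - auxiliary)))
    return first_loss, second_loss
```

A weighted sum of the two terms would then drive the parameter updates of the appearance extractor and generator, while the discriminator is trained with the opposite labels in the usual adversarial fashion.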
本领域技术人员应明白,本公开一个或多个实施例可提供为方法、系统或计算机程序产品。因此,本公开一个或多个实施例可采用完全硬件实施例、完全软件实施例或结合软件和硬件方面的实施例的形式。而且,本公开一个或多个实施例可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。Those skilled in the art should understand that one or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
本公开实施例还提供一种计算机可读存储介质,该存储介质上可以存储有计算机程序,所述程序被处理器执行时实现本公开任一实施例描述的图像增强方法或者图像增强网络的训练方法。An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program may be stored; when the program is executed by a processor, the image enhancement method or the image enhancement network training method described in any embodiment of the present disclosure is implemented.
本公开实施例还提供一种电子设备,该电子设备包括:存储器、处理器,所述存储器用于存储计算机可读指令,所述处理器用于调用所述计算机指令,实现本公开任一实施例所述的图像增强方法或者图像增强网络的训练方法。An embodiment of the present disclosure further provides an electronic device, including a memory and a processor, where the memory is configured to store computer-readable instructions, and the processor is configured to invoke the computer instructions to implement the image enhancement method or the image enhancement network training method described in any embodiment of the present disclosure.
其中,本公开实施例所述的“和/或”表示至少具有两者中的其中一个,例如,“A和/或B”包括三种方案:A、B、以及“A和B”。Wherein, "and/or" mentioned in the embodiments of the present disclosure means at least one of the two, for example, "A and/or B" includes three options: A, B, and "A and B".
本公开中的各个实施例均采用递进的方式描述,各个实施例之间相同相似的部分互相参见即可,每个实施例重点说明的都是与其他实施例的不同之处。尤其,对于数据处理设备实施例而言,由于其基本相似于方法实施例,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。Each embodiment in the present disclosure is described in a progressive manner; for the same or similar parts between the embodiments, reference may be made to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the data processing device embodiment is described relatively simply since it is basically similar to the method embodiment; for relevant parts, reference may be made to the description of the method embodiment.
上述对本公开特定实施例进行了描述。其它实施例在所附权利要求书的范围内。在一些情况下,在权利要求书中记载的行为或步骤可以按照不同于实施例中的顺序来执行并且仍然可以实现期望的结果。另外,在附图中描绘的过程不一定要求示出的特定顺序或者连续顺序才能实现期望的结果。在某些实施方式中,多任务处理和并行处理也是可以的或者可能是有利的。The foregoing describes specific embodiments of the present disclosure. Other implementations are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. Multitasking and parallel processing are also possible or may be advantageous in certain embodiments.
本公开中描述的主题及功能操作的实施例可以在以下中实现:数字电子电路、有形体现的计算机软件或固件、包括本公开中公开的结构及其结构性等同物的计算机硬件、或者它们中的一个或多个的组合。本公开中描述的主题的实施例可以实现为一个或多个计算机程序,即编码在有形非暂时性程序载体上以被数据处理装置执行或控制数据处理装置的操作的计算机程序指令中的一个或多个模块。可替代地或附加地,程序指令可以被编码在人工生成的传播信号上,例如机器生成的电、光或电磁信号,该信号被生成以将信息编码并传输到合适的接收机装置以由数据处理装置执行。计算机存储介质可以是机器可读存储设备、机器可读存储基板、随机或串行存取存储器设备、或它们中的一个或多个的组合。Embodiments of the subject matter and functional operations described in this disclosure can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed in this disclosure and their structural equivalents, or in a combination of one or more of them. Embodiments of the subject matter described in this disclosure can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, which is generated to encode information and transmit it to a suitable receiver apparatus for execution by the data processing apparatus. A computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
本公开中描述的处理及逻辑流程可以由执行一个或多个计算机程序的一个或多个可编程计算机执行,以通过根据输入数据进行操作并生成输出来执行相应的功能。所述处理及逻辑流程还可以由专用逻辑电路—例如FPGA(现场可编程门阵列)或ASIC(专用集成电路)来执行,并且装置也可以实现为专用逻辑电路。The processes and logic flows described in this disclosure can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows can also be performed by, and an apparatus can also be implemented as, special-purpose logic circuitry, such as an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit).
适合用于执行计算机程序的计算机包括,例如通用和/或专用微处理器,或任何其他类型的中央处理单元。通常,中央处理单元将从只读存储器和/或随机存取存储器接收指令和数据。计算机的基本组件包括用于实施或执行指令的中央处理单元以及用于存储指令和数据的一个或多个存储器设备。通常,计算机还将包括用于存储数据的一个或多个大容量存储设备,例如磁盘、磁光盘或光盘等,或者计算机将可操作地与此大容量存储设备耦接以从其接收数据或向其传送数据,抑或两种情况兼而有之。然而,计算机不是必须具有这样的设备。此外,计算机可以嵌入在另一设备中,例如移动电话、个人数字助理(PDA)、移动音频或视频播放器、游戏操纵台、全球定位系统(GPS)接收机、或例如通用串行总线(USB)闪存驱动器的便携式存储设备,仅举几例。Computers suitable for the execution of a computer program include, by way of example, general and/or special-purpose microprocessors, or any other kind of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The essential elements of a computer include a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name just a few.
适合于存储计算机程序指令和数据的计算机可读介质包括所有形式的非易失性存储器、媒介和存储器设备,例如包括半导体存储器设备(例如EPROM、EEPROM和闪存设备)、磁盘(例如内部硬盘或可移动盘)、磁光盘以及CD ROM和DVD-ROM盘。处理器和存储器可由专用逻辑电路补充或并入专用逻辑电路中。Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD ROM and DVD-ROM disks. The processor and memory can be supplemented by, or incorporated in, special purpose logic circuitry.
虽然本公开包含许多具体实施细节,但是这些不应被解释为限制任何公开的范围或所要求保护的范围,而是主要用于描述特定公开的具体实施例的特征。本公开内在多个实施例中描述的某些特征也可以在单个实施例中被组合实施。另一方面,在单个实施例中描述的各种特征也可以在多个实施例中分开实施或以任何合适的子组合来实施。此外,虽然特征可以如上所述在某些组合中起作用并且甚至最初如此要求保护,但是来自所要求保护的组合中的一个或多个特征在一些情况下可以从该组合中去除,并且所要求保护的组合可以指向子组合或子组合的变型。While this disclosure contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but rather as primarily describing features of particular embodiments of particular disclosures. Certain features that are described in this disclosure in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be removed from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.
类似地,虽然在附图中以特定顺序描绘了操作,但是这不应被理解为要求这些操作以所示的特定顺序执行或顺次执行、或者要求所有例示的操作被执行,以实现期望的结果。在某些情况下,多任务和并行处理可能是有利的。此外,上述实施例中的各种系统模块和组件的分离不应被理解为在所有实施例中均需要这样的分离,并且应当理解,所描述的程序组件和系统通常可以一起集成在单个软件产品中,或者封装成多个软件产品。Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
由此,主题的特定实施例已被描述。其他实施例在所附权利要求书的范围以内。在某些情况下,权利要求书中记载的动作可以以不同的顺序执行并且仍实现期望的结果。此外,附图中描绘的处理并非必需所示的特定顺序或顺次顺序,以实现期望的结果。在某些实现中,多任务和并行处理可能是有利的。Thus, certain embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.
以上所述仅为本公开一个或多个实施例的较佳实施例而已,并不用以限制本公开一个或多个实施例,凡在本公开一个或多个实施例的精神和原则之内,所做的任何修改、等同替换、改进等,均应包含在本公开一个或多个实施例保护的范围之内。The above descriptions are merely preferred embodiments of one or more embodiments of the present disclosure and are not intended to limit one or more embodiments of the present disclosure. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of one or more embodiments of the present disclosure shall fall within the protection scope of one or more embodiments of the present disclosure.

Claims (13)

  1. 一种图像增强方法,包括:A method of image enhancement, comprising:
    对目标图像进行特征提取,得到所述目标图像的外观信息,其中,所述目标图像中包括第一对象;所述外观信息表示所述目标图像中的表面视觉特征;performing feature extraction on the target image to obtain appearance information of the target image, wherein the target image includes a first object; the appearance information represents surface visual features in the target image;
    获取第二对象的结构信息,其中,所述第一对象和所述第二对象为同种类的目标对象;所述结构信息表示所述第二对象的轮廓特征;Acquiring structural information of a second object, wherein the first object and the second object are target objects of the same type; the structural information represents an outline feature of the second object;
    基于所述外观信息和所述结构信息生成增强图像,其中,所述增强图像包括具有所述外观信息以及所述结构信息的目标对象。An enhanced image is generated based on the appearance information and the structure information, wherein the enhanced image includes a target object having the appearance information and the structure information.
  2. 根据权利要求1所述的方法,其特征在于,所述方法由图像增强装置执行,所述图像增强装置中部署有图像增强网络,所述图像增强网络包括:外观提取器和生成器;The method according to claim 1, wherein the method is executed by an image enhancement device, and an image enhancement network is deployed in the image enhancement device, and the image enhancement network includes: an appearance extractor and a generator;
    所述对目标图像进行特征提取,得到所述目标图像的外观信息,包括:通过所述图像增强网络中的外观提取器对所述目标图像进行特征提取,得到所述目标图像的外观信息;The performing feature extraction on the target image to obtain the appearance information of the target image includes: performing feature extraction on the target image by an appearance extractor in the image enhancement network to obtain the appearance information of the target image;
    所述基于所述外观信息和所述结构信息生成增强图像,包括:通过所述图像增强网络中的所述生成器基于所述外观信息和所述结构信息生成增强图像。The generating an enhanced image based on the appearance information and the structure information includes: generating an enhanced image based on the appearance information and the structure information by the generator in the image enhancement network.
  3. 根据权利要求1所述的方法,其特征在于,所述获取第二对象的结构信息,包括:The method according to claim 1, wherein said obtaining the structural information of the second object comprises:
    获取初始图像,所述初始图像中包括所述第二对象;acquiring an initial image, the initial image including the second object;
    对所述初始图像进行关键点检测,得到所述初始图像中所述第二对象的关键点;performing key point detection on the initial image to obtain key points of the second object in the initial image;
    根据所述第二对象的所述关键点,得到所述第二对象的所述结构信息。Obtain the structural information of the second object according to the key points of the second object.
  4. 根据权利要求1所述的方法,其特征在于,所述第二对象包括在辅助图像中;所述方法还包括:The method according to claim 1, wherein the second object is included in the auxiliary image; the method further comprises:
    获取初始图像,所述初始图像中包括所述目标对象;acquiring an initial image, the initial image including the target object;
    对所述初始图像进行关键点检测,得到所述初始图像中所述目标对象的关键点;performing key point detection on the initial image to obtain key points of the target object in the initial image;
    根据所述目标对象的所述关键点对所述初始图像进行裁剪,得到包括所述目标对象的所述目标图像或者所述辅助图像。The initial image is cropped according to the key point of the target object to obtain the target image or the auxiliary image including the target object.
  5. 根据权利要求4所述的方法,其特征在于,所述方法还包括:在所述基于所述外观信息和所述结构信息生成增强图像之后,所述增强图像替换所述初始图像中的对应图像部分。The method according to claim 4, further comprising: after the generating an enhanced image based on the appearance information and the structure information, replacing, by the enhanced image, the corresponding image portion in the initial image.
  6. 根据权利要求1所述的方法,其特征在于,所述第一对象和所述第二对象是同一个目标对象,或者是同种类的不同目标对象,所述目标对象是人脸中的五官之一。The method according to claim 1, wherein the first object and the second object are the same target object or different target objects of the same type, and the target object is one of the five facial features of a human face.
  7. 一种图像增强网络的训练方法,包括:A training method for an image enhancement network, comprising:
    获取包含第一对象的样本图像以及第二对象的结构信息,其中,所述第一对象和所述第二对象是具有不同结构信息的同一目标对象;所述结构信息表示所述第二对象的轮廓特征;acquiring a sample image including a first object and structure information of a second object, wherein the first object and the second object are the same target object with different structure information; the structure information represents contour features of the second object;
    通过图像增强网络对所述样本图像进行特征提取,得到所述样本图像的外观信息,其中,所述外观信息表示所述样本图像中的表面视觉特征;performing feature extraction on the sample image through an image enhancement network to obtain appearance information of the sample image, wherein the appearance information represents surface visual features in the sample image;
    通过所述图像增强网络对所述外观信息和所述结构信息进行图像生成处理,输出样本增强图像,其中,所述样本增强图像包括具有所述外观信息以及所述结构信息的所述目标对象;performing image generation processing on the appearance information and the structure information through the image enhancement network, and outputting a sample enhanced image, wherein the sample enhanced image includes the target object having the appearance information and the structure information;
    根据所述样本增强图像,调整所述图像增强网络的网络参数。and adjusting network parameters of the image enhancement network according to the sample enhanced image.
  8. 根据权利要求7所述的训练方法,其特征在于,所述第二对象包括在辅助图像中,所述图像增强网络包括:外观提取器和生成器;所述根据所述样本增强图像,调整所述图像增强网络的网络参数,包括:The training method according to claim 7, wherein the second object is included in an auxiliary image, and the image enhancement network comprises an appearance extractor and a generator; the adjusting network parameters of the image enhancement network according to the sample enhanced image comprises:
    根据所述样本增强图像和所述辅助图像之间的差异,调整所述外观提取器和所述生成器的网络参数。Adjusting network parameters of the appearance extractor and the generator based on the difference between the sample enhanced image and the auxiliary image.
  9. 根据权利要求7所述的训练方法,其特征在于,所述第二对象包括在辅助图像中;所述图像增强网络包括:外观提取器和生成器;The training method according to claim 7, wherein the second object is included in the auxiliary image; the image enhancement network comprises: an appearance extractor and a generator;
    所述根据所述样本增强图像,调整所述图像增强网络的网络参数,包括:the adjusting network parameters of the image enhancement network according to the sample enhanced image comprises:
    将所述样本增强图像输入判别器,得到所述判别器输出的判别值;inputting the sample enhanced image into a discriminator to obtain a discriminant value output by the discriminator;
    根据所述判别值与判别真值之间的差异得到第一损失,并根据所述样本增强图像和所述辅助图像之间的差异得到第二损失;Obtaining a first loss based on the difference between the discriminant value and the discriminant true value, and obtaining a second loss based on the difference between the sample enhanced image and the auxiliary image;
    根据所述第一损失和所述第二损失,调整所述外观提取器、所述生成器和所述判别器中至少一个的网络参数。A network parameter of at least one of the appearance extractor, the generator, and the discriminator is adjusted based on the first loss and the second loss.
  10. An image enhancement apparatus, comprising:
    an appearance extraction module, configured to perform feature extraction on a target image to obtain appearance information of the target image, wherein the target image includes a first object, and the appearance information represents surface visual features in the target image;
    a structure acquisition module, configured to acquire structure information of a second object, wherein the first object and the second object are target objects of the same type, and the structure information represents contour features of the second object; and
    an image generation module, configured to generate an enhanced image based on the appearance information and the structure information, wherein the enhanced image includes a target object having the appearance information and the structure information.
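The three modules of claim 10 form a pipeline: extract appearance from the target image, acquire structure from the second object, then generate an image combining both. The sketch below is a deliberately simplified toy, not the claimed neural network: appearance is reduced to a mean pixel intensity, structure to a binary mask, and generation to painting the appearance where the mask is set. All three reductions are hypothetical simplifications for illustration only.

```python
def extract_appearance(target_image):
    # Toy stand-in for the appearance extraction module: summarize the
    # surface visual features of the target image as its mean intensity.
    flat = [p for row in target_image for p in row]
    return sum(flat) / len(flat)

def acquire_structure(second_object_mask):
    # Toy stand-in for the structure acquisition module: the contour
    # features of the second object are taken directly as a binary mask.
    return second_object_mask

def generate_enhanced(appearance, structure):
    # Toy stand-in for the image generation module: produce an image that
    # has the extracted appearance wherever the structure mask is set.
    return [[appearance if cell else 0 for cell in row] for row in structure]

# Pipeline: target image supplies appearance, second object supplies structure.
target_image = [[2, 4], [6, 8]]
appearance = extract_appearance(target_image)
structure = acquire_structure([[1, 0], [0, 1]])
enhanced = generate_enhanced(appearance, structure)
```

Here `enhanced` is `[[5.0, 0], [0, 5.0]]`: the first object's appearance (mean intensity 5.0) rendered in the second object's shape, which mirrors the claimed decoupling of appearance and structure.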
  11. A training apparatus for an image enhancement network, comprising:
    an information acquisition module, configured to acquire a sample image including a first object and structure information of a second object, wherein the first object and the second object are the same target object with different structure information, and the structure information represents contour features of the second object;
    a feature extraction module, configured to perform feature extraction on the sample image through an image enhancement network to obtain appearance information of the sample image, wherein the appearance information represents surface visual features in the sample image;
    an image output module, configured to perform image generation processing on the appearance information and the structure information through the image enhancement network, and output a sample enhanced image, wherein the sample enhanced image includes the target object having the appearance information and the structure information; and
    a parameter adjustment module, configured to adjust the network parameters of the image enhancement network according to the sample enhanced image.
  12. An electronic device, comprising a memory and a processor, wherein the memory is configured to store computer-readable instructions, and the processor is configured to call the computer instructions to implement the method according to any one of claims 1 to 6, or the method according to any one of claims 7 to 9.
  13. A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the method according to any one of claims 1 to 6 or the method according to any one of claims 7 to 9 is implemented.
PCT/CN2022/134845 2021-12-31 2022-11-29 Image enhancement method, apparatus, storage medium, and electronic device WO2023124697A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111669721.8 2021-12-31
CN202111669721.8A CN114331906A (en) 2021-12-31 2021-12-31 Image enhancement method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2023124697A1 true WO2023124697A1 (en) 2023-07-06

Family

ID=81019990

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/134845 WO2023124697A1 (en) 2021-12-31 2022-11-29 Image enhancement method, apparatus, storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN114331906A (en)
WO (1) WO2023124697A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114331906A (en) * 2021-12-31 2022-04-12 北京大甜绵白糖科技有限公司 Image enhancement method and device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111881926A (en) * 2020-08-24 2020-11-03 Oppo广东移动通信有限公司 Image generation method, image generation model training method, image generation device, image generation equipment and image generation medium
CN113327212A (en) * 2021-08-03 2021-08-31 北京奇艺世纪科技有限公司 Face driving method, face driving model training device, electronic equipment and storage medium
CN113838076A (en) * 2020-06-24 2021-12-24 深圳市中兴微电子技术有限公司 Method and device for labeling object contour in target image and storage medium
CN114331906A (en) * 2021-12-31 2022-04-12 北京大甜绵白糖科技有限公司 Image enhancement method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN114331906A (en) 2022-04-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913955

Country of ref document: EP

Kind code of ref document: A1