CN110570383B - Image processing method and device, electronic equipment and storage medium - Google Patents

Image processing method and device, electronic equipment and storage medium

Info

Publication number
CN110570383B
CN110570383B
Authority
CN
China
Prior art keywords
vector
feature vector
attribute
image
portrait
Prior art date
Legal status: Active
Application number
CN201910913224.4A
Other languages
Chinese (zh)
Other versions
CN110570383A (en)
Inventor
李华夏
Current Assignee
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN201910913224.4A
Publication of CN110570383A
Application granted
Publication of CN110570383B

Classifications

    • G06T5/77
    • H04N23/80 Camera processing pipelines; Components thereof
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; Image merging
    • G06T2207/30201 Face

Abstract

The present disclosure provides an image processing method, an image processing apparatus, an electronic device, and a storage medium. The method includes the following steps: obtaining an attribute feature vector and a portrait feature vector of a person in an initial image through an adversarial network model; adjusting the attribute feature vector of the person to be consistent with a preset attribute feature vector; and generating a target image according to the adjusted attribute feature vector and the portrait feature vector. With the scheme of the embodiments of the present disclosure, adding a function for adjusting the attribute features of persons in an image satisfies users' diverse image processing needs. In particular, in image capture scenarios, the number of re-shots is greatly reduced and shooting efficiency is improved.

Description

Image processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to image processing technologies, and in particular, to an image processing method and apparatus, an electronic device, and a storage medium.
Background
At present, the photographing function has become standard on most terminal devices, allowing users to capture memorable moments around them at any time with a portable terminal device.
With the rapid development of intelligent terminal devices, image processing functions are commonly available under the photographing function of a terminal device. Common processing operations include beautification, filters, and adding pictures or text. However, these operations cannot adjust the attribute features of a person in a captured image (such as closed eyes or a missing smile). When the attribute features of a person in a captured image impair the visual effect, the only remedy is to shoot another image. In particular, when multiple people pose for a group photo, achieving a satisfactory visual effect may require repeated shots, making the operation cumbersome; this situation needs improvement.
Disclosure of Invention
The present disclosure provides an image processing method, apparatus, electronic device, and storage medium that satisfy users' diverse image processing needs by adding a function for adjusting the attribute features of persons in an image. In particular, in image capture scenarios, the number of re-shots is greatly reduced and shooting efficiency is improved.
In a first aspect, an embodiment of the present disclosure provides an image processing method, including:
obtaining an attribute feature vector and a portrait feature vector of a person in an initial image through an adversarial network model;
adjusting the attribute feature vector of the person to be consistent with a preset attribute feature vector;
and generating a target image according to the adjusted attribute feature vector and the portrait feature vector.
In a second aspect, an embodiment of the present disclosure further provides an image processing apparatus, including:
a feature vector determining module, configured to obtain an attribute feature vector and a portrait feature vector of a person in an initial image through an adversarial network model;
a feature vector adjusting module, configured to adjust the attribute feature vector of the person to be consistent with a preset attribute feature vector;
and an image generating module, configured to generate a target image according to the adjusted attribute feature vector and the portrait feature vector.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the image processing method according to any embodiment of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure provides a readable medium having a computer program stored thereon, where the computer program, when executed by a processor, implements the image processing method according to any embodiment of the present disclosure.
The embodiments of the present disclosure provide an image processing method and apparatus, an electronic device, and a storage medium. An attribute feature vector and a portrait feature vector of a person in an initial image are obtained through an adversarial network model, the attribute feature vector of the person is adjusted to be consistent with a preset attribute feature vector, and a target image is then generated according to the adjusted attribute feature vector and the original portrait feature vector of the initial image. With the scheme of the embodiments of the present disclosure, adding a function for adjusting the attribute features of persons in an image satisfies users' diverse image processing needs. In particular, in an image capture scenario, when person attribute features such as closed eyes or a poor expression impair the visual effect, there is no need to shoot again: a high-quality target image can be obtained by adjusting the person attribute features in the image, improving shooting efficiency.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
FIG. 1A is a flowchart of an image processing method provided by an embodiment of the present disclosure;
FIG. 1B is a schematic structural diagram of an adversarial network model provided by an embodiment of the present disclosure;
FIG. 1C is a schematic diagram of an image processing process and effect provided by an embodiment of the present disclosure;
FIG. 2A is a flowchart of another image processing method provided by an embodiment of the present disclosure;
FIG. 2B is a schematic flowchart of verifying an initial network model provided by an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and completely. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It should be noted that the modifiers "a", "an", and "the" in the present disclosure are intended to be illustrative rather than limiting, and those skilled in the art will understand that they should be read as "one or more" unless the context clearly indicates otherwise. The names of messages or information exchanged between multiple parties in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
FIG. 1A is a flowchart of an image processing method provided by an embodiment of the present disclosure, FIG. 1B is a schematic structural diagram of an adversarial network model provided by an embodiment of the present disclosure, and FIG. 1C is a schematic diagram of an image processing process and effect provided by an embodiment of the present disclosure. This embodiment is applicable to processing an image to eliminate cases where certain person attribute features impair the visual effect. For example, it can be applied to adjust the eye-state attribute feature of a person in an image to open eyes and the expression attribute feature to smiling, eliminating the impact of closed eyes or a missing smile on the overall visual effect. The method may be performed by an image processing apparatus or an electronic device; the apparatus may be implemented in software and/or hardware and configured in the electronic device, and the method may specifically be performed by an image processing process in the electronic device.
Optionally, as shown in FIGS. 1A to 1C, the method in this embodiment may include the following steps:
and S101, obtaining attribute feature vectors and portrait feature vectors of people in the initial image through the confrontation network model.
The initial image is the person image to be processed in this embodiment. It may be an original person image captured by a camera of the electronic device, or a stored image selected from the local gallery of the electronic device in response to a user's click operation.
Optionally, the adversarial network model of this embodiment may be a neural network model that performs person feature decoupling and classification on an input image to obtain an attribute feature vector and a portrait feature vector of a person. Optionally, the adversarial network model of this embodiment may include a portrait decoupling network and a classification network, and the process of obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the adversarial network model may be: obtaining feature vectors of the person in the initial image through the portrait decoupling network, and obtaining the attribute feature vector and the portrait feature vector of the person through the classification network. The attribute feature vector describes certain attribute features of the person in the image and may include, but is not limited to, at least one of the following sub-vectors: a person expression vector, an orientation angle vector, and an eye state vector. The portrait feature vector describes detailed features of certain regions of the person in the image and may include, but is not limited to, at least one of the following sub-vectors: a facial feature vector, a hair feature vector, and a limb feature vector.
Specifically, as shown in FIG. 1B, the adversarial network model 10 may include a portrait decoupling network 11 and a classification network 12, and the portrait decoupling network 11 of this embodiment further includes an attribute feature decoupling network 13 and a portrait feature decoupling network 14. The attribute feature decoupling network 13 decouples the attribute feature vector of the person from the initial image, and the portrait feature decoupling network 14 decouples the portrait feature vector of the person from the initial image. However, once the attribute feature decoupling network 13 and the portrait feature decoupling network 14 have extracted the two feature vectors, the adversarial network model 10 cannot tell which one is the portrait feature vector and which one is the attribute feature vector; it only has a group of person feature vectors. In this case, classification can be performed by the classification network 12. Optionally, the classification network 12 may compare the feature vectors of the person in the initial image with a standard vector in the classification network to determine the attribute feature vector and the portrait feature vector. Specifically, the classification network 12 may store a standard vector in advance, which corresponds either to attribute features or to portrait features. The two feature vectors decoupled by the portrait decoupling network are compared with the standard vector in terms of similarity, and the feature vector more similar to the standard vector is assigned to the same feature category as the standard vector. For example, if the standard vector corresponds to attribute features, the more similar of the two feature vectors is determined to be the attribute feature vector of the person, and the other is determined to be the portrait feature vector.
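As a minimal illustrative sketch (not the patented implementation), the similarity comparison performed by the classification network 12 can be pictured as a cosine-similarity test against a stored standard vector; the tensor shapes and the assumption that the standard vector corresponds to attribute features are placeholders for demonstration:

```python
import torch
import torch.nn.functional as F

def classify_feature_vectors(vec_a: torch.Tensor,
                             vec_b: torch.Tensor,
                             attribute_standard: torch.Tensor):
    """Label the two decoupled person feature vectors.

    The vector more similar to the stored standard vector (assumed here
    to correspond to attribute features) is taken as the attribute
    feature vector; the other is taken as the portrait feature vector.
    """
    sim_a = F.cosine_similarity(vec_a, attribute_standard, dim=0)
    sim_b = F.cosine_similarity(vec_b, attribute_standard, dim=0)
    if sim_a >= sim_b:
        return vec_a, vec_b  # (attribute vector, portrait vector)
    return vec_b, vec_a
```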
Optionally, a Kullback-Leibler (KL) divergence loss function constraint is provided in the portrait decoupling network. For example, the KL divergence loss function constraint may be set for the attribute feature decoupling network and/or the portrait feature decoupling network in the portrait decoupling network to constrain the decoupled feature vectors. Optionally, in the embodiment of the present disclosure, of the two decoupling networks in the portrait decoupling network (i.e., the attribute feature decoupling network and the portrait feature decoupling network), one may be constrained by a KL divergence loss function and the other by the classification network described above. For example, the KL divergence loss function constraint may be set for the attribute feature decoupling network to push its decoupled attribute feature vector toward a standard normal distribution as far as possible, eliminating the influence of the portrait features so that the attribute feature vector relates only to attribute features; the classification network then constrains the portrait feature vector decoupled by the portrait feature decoupling network so that it is strongly correlated with the portrait features.
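A minimal sketch of such a constraint, under the assumption that the attribute feature decoupling network outputs a mean and log-variance in the style of a variational encoder (the disclosure only states that the KL divergence loss pushes the decoupled vector toward a standard normal distribution):

```python
import torch

def kl_to_standard_normal(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Closed-form KL divergence between N(mu, diag(sigma^2)) and N(0, I).

    Minimizing this term constrains the decoupled attribute feature
    vector toward a standard normal distribution, suppressing residual
    portrait information as described above.
    """
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
```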
Optionally, in this embodiment, after an initial image to be processed is obtained through the camera or local gallery of the electronic device, the initial image may be input into the adversarial network model. The portrait decoupling network decouples a group of feature vectors of the person in the initial image, and the classification network then identifies, based on the locally stored standard vector, which vector in the group is the attribute feature vector and which is the portrait feature vector. Illustratively, the initial image in FIG. 1C is input into the adversarial network model, which decouples the person features of the input image to obtain the attribute feature vector of the person: the person expression vector is a closed-mouth feature vector, the orientation angle vector is a frontal-face feature vector, and the eye state vector is an open-eye feature vector; the portrait feature vector consists of the facial feature vector and the hair feature vector of the person in the image.
S102: adjust the attribute feature vector of the person to be consistent with the preset attribute feature vector.
The preset attribute feature vector may be the person attribute feature vector corresponding to the optimal visual effect preset by the system. For example, the preset attribute feature vector may include at least one of the following: a person expression vector that is a smiling feature vector, an orientation angle vector that is a frontal-face feature vector, and an eye state vector that is an open-eye feature vector. The preset attribute feature vector may also be a set of attribute feature vectors configured by the user according to actual needs. Specifically, at least one feature vector may be selected from the person expression vector, the orientation angle vector, and the eye state vector, and one of the candidate vectors of each selected feature vector is then chosen to form the preset attribute feature vector. For example, the user selects all three attribute vectors, then chooses a smiling feature vector from the candidate expression vectors, a frontal-face feature vector from the candidate orientation angle vectors, and an open-eye feature vector from the candidate eye state vectors to constitute the preset attribute feature vector, as illustrated by the sketch below.
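Purely as an illustration of how such a preset attribute feature vector could be assembled from user-selected candidates, consider the following sketch; the candidate names and the 16-dimensional random placeholder vectors are hypothetical, not values from the disclosure:

```python
import torch

# Hypothetical candidate sub-vectors per attribute type; in a real system
# these would come from the trained model, not from random initialization.
CANDIDATES = {
    "expression":  {"smiling": torch.randn(16), "closed_mouth": torch.randn(16)},
    "orientation": {"frontal": torch.randn(16), "profile": torch.randn(16)},
    "eye_state":   {"open": torch.randn(16), "closed": torch.randn(16)},
}

def build_preset_attribute_vector(selection: dict) -> dict:
    """Pick one candidate sub-vector for each selected attribute type,
    e.g. {"expression": "smiling", "orientation": "frontal", ...}."""
    return {attr: CANDIDATES[attr][choice] for attr, choice in selection.items()}

preset = build_preset_attribute_vector(
    {"expression": "smiling", "orientation": "frontal", "eye_state": "open"})
```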
Optionally, before this step is executed, it may be determined whether the attribute feature vector of the person obtained by decoupling in S101 is already consistent with the preset attribute feature vector. If so, this step and the subsequent image processing operations are unnecessary; if not, this step is triggered. Specifically, the operation may take at least the following two possible forms:
in the first implementation mode, an attribute feature judgment network is set in the confrontation network model, and the attribute feature judgment network can judge whether the attribute feature vector of the character in the initial image is consistent with the preset attribute feature vector according to the trained network parameters and the algorithm during training. The image processing process of the application program judges whether the attribute feature vector of the person is consistent with the preset attribute feature vector through the attribute feature judgment network; and if not, performing adjustment to ensure that the attribute feature vector of the character is consistent with the preset attribute feature vector. Optionally, if the attribute feature determination network determines that the attribute feature vector of the person is inconsistent with the preset attribute feature vector, the determination output result of the attribute feature determination network may further include a sub-vector, which is inconsistent with the preset attribute feature vector, in the attribute feature vector of the person. At this time, the image processing of the application program can find out the preset subvectors which are inconsistent with the attribute characteristic judgment network judgment and belong to the same type from the preset attribute eigenvectors, and replace the inconsistent subvectors in the attribute eigenvectors of the person obtained by the decoupling step S101. Therefore, the adjustment of the attribute feature vector of the person is realized. For example, as shown in fig. 1C, it is assumed that the preset attribute feature vector is a character expression vector which is a smiling feature vector; the azimuth angle vector is a frontal feature vector; the eye state vector is an open eye feature vector. Comparing the preset attribute feature vector with the attribute feature vector of the character obtained by decoupling in the step S101 through the attribute feature judgment network, and finding that the character expression vectors in the two attribute feature vectors are inconsistent. At this time, the human expression feature vector in the attribute feature vector of the human decoupled in S101 may be adjusted from the closed mouth feature vector to the smiling feature vector, and the attribute feature vector of the human remains unchanged.
In the second implementation, when the user is dissatisfied with an attribute feature of the person in the initial image, the user selects the unsatisfactory attribute feature in the initial image. When the image processing process of the application program detects this selection operation, it treats the selected attribute feature vector as the sub-vector inconsistent with the preset attribute feature vector, and then, based on the preset attribute feature vector set by the system default or by the user, adjusts the attribute feature vector of the person to be consistent with the preset attribute feature vector using a method similar to the first implementation.
It should be noted that, for the attribute feature vector of the person in the initial image and the preset attribute feature vector, "consistent" sub-vectors means that the candidate attribute types to which the sub-vectors belong are the same, not that the vector values are exactly equal. For example, assume the eye state sub-vectors of both the person's attribute feature vector and the preset attribute feature vector are open-eye feature vectors; the specific appearance of open eyes differs from person to person, so the corresponding vector values are not necessarily identical. Therefore, when determining whether the attribute feature vector of the person is consistent with the preset attribute feature vector, the attribute feature judgment network does not compare the exact vector values, but determines whether the relationship between the two values (such as their similarity) satisfies the consistency requirement.
S103: generate a target image according to the adjusted attribute feature vector and the portrait feature vector.
Optionally, in this step, feature fusion processing may be performed on the person in the initial image according to the attribute feature vector adjusted in S102 based on the preset attribute feature vector and the portrait feature vector obtained in S101, so as to obtain the processed target image. Illustratively, the target image shown in FIG. 1C is generated by re-fusing the adjusted attribute feature vector (in which the orientation angle vector and the eye state vector are the original attribute sub-vectors of the person in the initial image, while the person expression vector is the smiling feature vector taken from the preset attribute feature vector in S102) with the portrait feature vector of the person in the initial image. Compared with the initial image, only the expression of the person in the target image of FIG. 1C has changed, from closed mouth to smiling, while the other features remain unchanged. As the effect diagram in FIG. 1C shows, although an attribute feature of the person has been changed in the target image obtained by this embodiment, the processed region transitions naturally, no modification trace is visible, and the modification effect is lifelike.
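As a hedged sketch of this fusion step (the disclosure does not specify the generator architecture), a toy generator could concatenate the adjusted attribute vector with the unchanged portrait vector and decode an image; all layer sizes here are placeholders:

```python
import torch
import torch.nn as nn

class FusionGenerator(nn.Module):
    """Toy generator fusing attribute and portrait feature vectors into
    an image tensor; a sketch, not the patented architecture."""

    def __init__(self, attr_dim: int = 48, portrait_dim: int = 128):
        super().__init__()
        self.decoder = nn.Sequential(
            nn.Linear(attr_dim + portrait_dim, 1024),
            nn.ReLU(),
            nn.Linear(1024, 3 * 64 * 64),  # small 64x64 RGB output
            nn.Tanh(),
        )

    def forward(self, attr_vec: torch.Tensor, portrait_vec: torch.Tensor):
        # Concatenate the two feature vectors and decode to image space.
        fused = torch.cat([attr_vec, portrait_vec], dim=-1)
        return self.decoder(fused).view(-1, 3, 64, 64)
```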
It should be noted that, when the initial image is an original image captured by the camera, the target image generated in this step is the captured image finally displayed to the user; when the initial image is a stored image selected from the local gallery, the target image generated in this step is a new image generated by processing the stored image.
Optionally, when the initial image is a captured original image, the user's personalized needs should be considered: sometimes the attribute feature vector of the person in the user's intended shot is inconsistent with the preset attribute feature vector, for example, when the user deliberately wants to capture a person image with the eyes closed or turned toward the upper right. To avoid failing such personalized needs by directly processing the captured image, in this embodiment, after the initial image is processed to obtain the target image, the generated target image and the initial image are displayed to the user simultaneously for selection, and the final captured image is determined according to the user's selection.
The embodiment of the present disclosure provides an image processing method in which an attribute feature vector and a portrait feature vector of a person in an initial image are obtained through an adversarial network model, the attribute feature vector of the person is adjusted to be consistent with a preset attribute feature vector, and a target image is then generated according to the adjusted attribute feature vector and the original portrait feature vector of the initial image. With the scheme of the embodiments of the present disclosure, adding a function for adjusting the attribute features of persons in an image satisfies users' diverse image processing needs. In particular, in an image capture scenario, when person attribute features such as closed eyes or a poor expression impair the visual effect, there is no need to shoot again: a high-quality target image can be obtained by adjusting the person attribute features in the image, improving shooting efficiency.
FIG. 2A is a flowchart of another image processing method provided by an embodiment of the present disclosure, and FIG. 2B is a schematic flowchart of verifying an initial network model provided by an embodiment of the present disclosure. This embodiment is optimized on the basis of the alternatives provided by the above embodiment, and specifically describes in detail how to train the adversarial network model before obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the adversarial network model.
Optionally, as shown in FIGS. 2A and 2B, the method in this embodiment may include the following steps:
S201: input sample images into the initial network model and train the initial network model.
The sample images are the training data required for training the initial network model and may consist of a large number of images of at least one person, together with the attribute feature vector and portrait feature vector of the person corresponding to each image, covering, for example, various expressions, orientation angles, and eye states. The initial network model may be a pre-constructed network model that includes a portrait decoupling network and a classification network. The portrait decoupling network identifies feature vectors of the person in the input image, and the classification network classifies them to determine the attribute feature vector and portrait feature vector of the person. Specifically, the classification network compares the feature vectors of the person in the input image with a standard vector in the classification network to obtain the attribute feature vector and portrait feature vector of the person. The portrait decoupling network further includes an attribute feature decoupling network and a portrait feature decoupling network. Optionally, the initial network model may further include an attribute feature judgment network, which determines whether the attribute feature vector of a person in an image is consistent with the preset attribute feature vector.
Optionally, in the process of training the initial network model, the portrait decoupling network is the main training target. Specifically, the attribute feature decoupling network and the portrait feature decoupling network in the portrait decoupling network may be trained separately and independently using the sample image data. For example, the attribute feature decoupling network can be trained using each person image in the sample set together with its attribute feature vector, and the portrait feature decoupling network can be trained using each person image together with its portrait feature vector. Optionally, the two networks may be trained serially, i.e., the portrait feature decoupling network is trained with the sample images after the attribute feature decoupling network has been trained; the two training processes may also be started simultaneously, training both decoupling networks on the sample image data in parallel, as sketched below. Optionally, when the initial network model includes the attribute feature judgment network, this step also trains the attribute feature judgment network based on the preset attribute feature vector and the attribute feature vectors of persons in the sample image data output by the attribute feature decoupling network.
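A compressed sketch of one joint training update, under the assumption that the two decoupling networks and the classification network are trained together with the losses described in this disclosure; `attr_net`, `portrait_net`, `classifier`, and the loss weighting are hypothetical placeholders:

```python
import torch

def train_step(attr_net, portrait_net, classifier, optimizer,
               image, attr_label, portrait_label, kl_weight=0.1):
    """One joint update of the attribute and portrait decoupling networks.

    attr_net is assumed to return (mu, logvar, attr_vec) so the KL
    constraint from the earlier sketch applies; classifier.loss is a
    hypothetical helper computing the classification-based constraint.
    """
    mu, logvar, attr_vec = attr_net(image)
    portrait_vec = portrait_net(image)

    kl_loss = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    cls_loss = classifier.loss(attr_vec, portrait_vec, attr_label, portrait_label)

    loss = cls_loss + kl_weight * kl_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```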
It should be noted that, in this step, the sample images are input into the initial network model to train its parameters; the network structure of the initial network model does not change during training, only the values of its parameters are adjusted.
S202: input a verification image into the trained initial network model to obtain an attribute feature vector and a portrait feature vector of the verification person.
The verification images are image data used to verify whether the trained initial network model meets the requirements. They may be set aside when the sample images are collected; for example, a certain proportion (e.g., 80%) of the collected images may be used as sample images and the remaining proportion (e.g., 20%) as verification images. Person images different from the sample images may also be specifically selected as verification images. Optionally, to verify the trained initial network model more reliably, at least two groups of verification image data may be selected in this embodiment.
Optionally, in this step, a pre-obtained verification image may be input into the trained initial network model, which processes it according to the parameters obtained during training to produce the attribute feature vector and portrait feature vector of the verification person. The specific processing is similar to the process of obtaining the attribute feature vector and portrait feature vector of the person in the initial image through the adversarial network model in the above embodiment, and is not repeated here. For example, as shown in FIG. 2B, the verification image may be input into the portrait decoupling network of the trained initial network model to obtain a group of feature vectors, which are then classified by the classification network to identify the attribute feature vector and portrait feature vector of the verification person.
S203: adjust the attribute feature vector of the verification person to be consistent with the preset attribute feature vector, and generate a verification target image according to the adjusted attribute feature vector and the portrait feature vector.
Optionally, this step is similar to the process in the above embodiment of adjusting the attribute feature vector of the person to be consistent with the preset attribute feature vector and generating the target image from the adjusted attribute feature vector and the portrait feature vector, and is not repeated here. For example, as shown in FIG. 2B, whether the attribute feature vector of the verification person is consistent with the preset attribute feature vector may be determined by the attribute feature judgment network in the trained initial network model, or manually. If they are inconsistent, the attribute feature vector of the verification person is adjusted to be consistent with the preset attribute feature vector, and image feature fusion is then performed on the adjusted attribute feature vector and the portrait feature vector of the verification person to generate the verification target image.
S204: judge whether the visual effect of the verification target image meets the quality evaluation requirement; if so, execute S205; otherwise, obtain a new group of sample images and return to S201.
The visual effect of an image is a multi-faceted standard for judging image quality, set by considering the overall and detail effects of the image from aesthetic, visual, and other perspectives. Optionally, this embodiment may score the visual effect of the generated target image according to at least one scoring parameter, which may include one or a combination of: a region transition naturalness parameter, an image sharpness parameter, an image integrity parameter, and the like. The quality evaluation requirement may be an evaluation requirement set for the scoring parameters of the visual effect: for example, the score each individual scoring parameter must reach, or the total visual effect score that must be reached, or both the total score and the score of each individual parameter.
Optionally, in this embodiment, a scoring scheme may be set in advance for each scoring parameter: for example, the more natural the transitions between feature regions in the image, the higher the region transition naturalness score; the clearer the image and the fewer ghosting artifacts, the higher the image sharpness score; and the more complete each feature in the image, the higher the image integrity score. According to the preset scoring scheme, the verification target image generated in S203 is scored for visual effect (where the score may be a total visual effect score and/or per-parameter scores), and it is then judged whether the score meets the quality evaluation requirement. If it does, the initial network model is well trained, and S205 is executed to use it as the final adversarial network model; otherwise, the trained initial network model is not yet adequate, and a new group of sample images must be obtained to continue training it. Optionally, the visual effect score of the verification target image may be computed by the training process according to a certain algorithm, or obtained through a pre-trained quality evaluation network. For example, as shown in FIG. 2B, the generated verification target image is input to a quality evaluation network, which scores its visual effect and checks whether it meets the quality evaluation requirement; if so, the training of the initial network model is complete, otherwise new sample data must be obtained to continue training. A minimal sketch of such a check follows.
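The quality evaluation itself could be a simple weighted check over the scoring parameters named above; the weights and passing scores in this sketch are illustrative assumptions rather than values from the disclosure:

```python
def meets_quality_requirement(scores: dict,
                              weights: dict = None,
                              total_pass: float = 0.8,
                              per_param_pass: float = 0.6) -> bool:
    """scores, e.g. {"transition_naturalness": 0.9, "sharpness": 0.85,
    "integrity": 0.8}; each value is assumed normalized to [0, 1]."""
    weights = weights or {k: 1.0 / len(scores) for k in scores}
    total = sum(weights[k] * v for k, v in scores.items())
    # Require both the weighted total and every individual parameter to pass.
    return total >= total_pass and all(v >= per_param_pass for v in scores.values())
```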
S205: if the visual effect of the verification target image meets the quality evaluation requirement, use the initial network model as the adversarial network model.
Optionally, if the initial network model does not include the attribute feature judgment network, it may be used as the final adversarial network model once S204 determines that the visual effect of the verification target image meets the quality evaluation requirement. If the initial network model includes the attribute feature judgment network, this step uses the initial network model as the adversarial network model only when the visual effect of the verification target image meets the quality evaluation requirement and the accuracy with which the attribute feature judgment network judges whether the attribute feature vector of the verification person is consistent with the preset attribute feature vector reaches a preset threshold.
Specifically, the judgment in S204 can only verify the accuracy of the portrait decoupling network and the classification network in the initial network model. Therefore, before concluding that the initial network model is fully trained, it is also necessary to measure how accurately the attribute feature judgment network determines whether the attribute feature vector of the verification person is consistent with the preset attribute feature vector; if this accuracy reaches a preset threshold (e.g., 90%), the training of the attribute feature judgment network in the initial network model is also considered complete. Accordingly, the initial network model can be used as the adversarial network model only when both conditions hold, namely the judgment accuracy of the attribute feature judgment network reaches the preset threshold and the visual effect of the verification target image meets the quality evaluation requirement, which together indicate that the training of the initial network model is finished.
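The accuracy check on the attribute feature judgment network reduces to counting correct consistency verdicts over the verification set; a sketch, with the 90% threshold taken from the example above:

```python
def judgment_accuracy(predictions, ground_truth) -> float:
    """Fraction of verification samples where the judgment network's
    consistent/inconsistent verdict matches the ground truth."""
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(ground_truth)

def judgment_network_ready(predictions, ground_truth,
                           threshold: float = 0.9) -> bool:
    return judgment_accuracy(predictions, ground_truth) >= threshold
```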
S206: obtain the attribute feature vector and the portrait feature vector of the person in the initial image through the adversarial network model.
S207: adjust the attribute feature vector of the person to be consistent with the preset attribute feature vector.
S208: generate a target image according to the adjusted attribute feature vector and the portrait feature vector.
When an initial network model is trained for person feature recognition, a conventional solution can only recognize the person features present in its sample images: for example, a network model trained on sample images of person A can recognize only the features of person A during use, so the image processing operations of this embodiment could be performed only on images of person A and not on images of person B. With the method of this embodiment, the sample images used to train the adversarial network model need not cover the person images of every person to be recognized; the trained adversarial network model can perform person feature recognition on any person. For example, an adversarial network model trained on sample images of person A can perform person feature recognition not only on person A but also on person B, and can thus carry out the image processing operations of this embodiment on both.
The embodiment of the present disclosure provides an image processing method: an initial network model is trained on sample images, the trained model is verified with verification images by checking whether the visual effect of the verification target image it generates meets the evaluation requirement, and if so, training is complete and the adversarial network model is obtained. The attribute feature vector and portrait feature vector of a person in an initial image are then obtained through the adversarial network model, the attribute feature vector is adjusted based on the preset attribute feature vector, and a target image is generated from the adjusted attribute feature vector and the original portrait feature vector of the initial image. With the scheme of the embodiments of the present disclosure, an adversarial network model capable of recognizing the portrait features and attribute features of persons in any image can be trained with a small number of sample images, reducing training cost and widening the model's range of application. Using the adversarial network model to adjust the attribute features of persons in an image satisfies users' diverse image processing needs. In particular, in image capture scenarios, the number of re-shots is greatly reduced and shooting efficiency is improved.
FIG. 3 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure, applicable to processing an image to eliminate cases where certain person attribute features impair the visual effect. For example, it can be applied to adjust the eye-state attribute feature of a person in an image to open eyes and the expression attribute feature to smiling, eliminating the impact of closed eyes or a missing smile on the overall visual effect. The apparatus may be implemented in software and/or hardware and integrated in the electronic device that executes the method. As shown in FIG. 3, the apparatus may include:
the feature vector determining module 301 is configured to obtain attribute feature vectors and portrait feature vectors of people in the initial image through the confrontation network model;
a feature vector adjusting module 302, configured to adjust the attribute feature vector of the person to be consistent with a preset attribute feature vector;
and the image generating module 303 is configured to generate a target image according to the adjusted attribute feature vector and the portrait feature vector.
The embodiment of the present disclosure provides an image processing apparatus that obtains an attribute feature vector and a portrait feature vector of a person in an initial image through an adversarial network model, adjusts the attribute feature vector of the person to be consistent with a preset attribute feature vector, and then generates a target image according to the adjusted attribute feature vector and the original portrait feature vector of the initial image. With the scheme of the embodiments of the present disclosure, adding a function for adjusting the attribute features of persons in an image satisfies users' diverse image processing needs. In particular, in an image capture scenario, when person attribute features such as closed eyes or a poor expression impair the visual effect, there is no need to shoot again: a high-quality target image can be obtained by adjusting the person attribute features in the image, improving shooting efficiency.
Further, the adversarial network model includes a portrait decoupling network and a classification network, and the feature vector determining module 301 is specifically configured to:
obtain feature vectors of the person in the initial image through the portrait decoupling network;
and obtain the attribute feature vector and the portrait feature vector of the person in the initial image through the classification network.
Further, when obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the classification network, the feature vector determining module 301 is specifically configured to:
compare the feature vectors of the person in the initial image with a standard vector in the classification network to obtain the attribute feature vector and the portrait feature vector of the person in the initial image.
Further, the portrait decoupling network is provided with a Kullback-Leibler divergence loss function constraint.
Further, the adversarial network model also includes an attribute feature judgment network, and the feature vector adjusting module 302 is specifically configured to:
determine, through the attribute feature judgment network, whether the attribute feature vector of the person is consistent with the preset attribute feature vector;
and if not, adjust the attribute feature vector of the person to be consistent with the preset attribute feature vector.
Further, the apparatus also includes a model training module configured to:
input sample images into an initial network model and train the initial network model;
input a verification image into the trained initial network model to obtain an attribute feature vector and a portrait feature vector of a verification person;
adjust the attribute feature vector of the verification person to be consistent with the preset attribute feature vector, and generate a verification target image according to the adjusted attribute feature vector and the portrait feature vector;
and if the visual effect of the verification target image meets the quality evaluation requirement, use the initial network model as the adversarial network model.
Further, when the adversarial network model includes an attribute feature judgment network, the model training module, in using the initial network model as the adversarial network model when the visual effect of the verification target image meets the quality evaluation requirement, is specifically configured to:
use the initial network model as the adversarial network model if the visual effect of the verification target image meets the quality evaluation requirement and the accuracy with which the attribute feature judgment network judges whether the attribute feature vector of the verification person is consistent with the preset attribute feature vector reaches a preset threshold.
Further, the attribute feature vector includes at least one sub-vector of a person expression vector, an orientation angle vector, and an eye state vector; the portrait feature vector includes at least one sub-vector of a facial feature vector, a hair feature vector, and a limb feature vector.
The image processing apparatus provided by the embodiment of the present disclosure belongs to the same inventive concept as the image processing method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
Referring now to FIG. 4, a block diagram of an electronic device 400 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device in the embodiment of the present disclosure may be a mobile terminal device installed with an application client. In particular, the electronic device may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle-mounted terminal (e.g., a car navigation terminal), etc., and a stationary terminal such as a digital TV, a desktop computer, etc. The electronic device 400 shown in fig. 4 is only an example and should not bring any limitations to the functionality and scope of use of the embodiments of the present disclosure.
As shown in FIG. 4, the electronic device 400 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 401 that may perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 408 into a random access memory (RAM) 403. The RAM 403 also stores various programs and data necessary for the operation of the electronic device 400. The processing device 401, the ROM 402, and the RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
Generally, the following devices may be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 407 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 408 including, for example, tape, hard disk, etc.; and a communication device 409. The communication means 409 may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. While fig. 4 illustrates an electronic device 400 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 409, or from the storage device 408, or from the ROM 402. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 401.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some implementations, electronic devices may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain an attribute feature vector and a portrait feature vector of a person in the initial image through the adversarial network model; adjust the attribute feature vector of the person to be consistent with a preset attribute feature vector; and generate a target image according to the adjusted attribute feature vector and the portrait feature vector.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided an image processing method including:
obtaining an attribute feature vector and a portrait feature vector of a person in an initial image through an adversarial network model;
adjusting the attribute feature vector of the person to be consistent with a preset attribute feature vector;
and generating a target image according to the adjusted attribute feature vector and the portrait feature vector (a code sketch of these three steps follows).
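To make the three steps concrete, here is a minimal sketch in PyTorch. Everything in it — the module names `PortraitDecouplingNet` and `Generator`, the layer shapes, and the assumption that the first `ATTR_DIM` latent dimensions carry the attribute features — is an illustrative assumption, not the architecture claimed by the patent:

```python
import torch
import torch.nn as nn

class PortraitDecouplingNet(nn.Module):
    """Hypothetical encoder: maps an image to a flat feature vector."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, image):
        return self.backbone(image)

class Generator(nn.Module):
    """Hypothetical decoder: maps a feature vector back to an image."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 64 * 8 * 8)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        return self.deconv(self.fc(z).view(-1, 64, 8, 8))

ATTR_DIM = 64  # assumed size of the attribute part of the latent vector

def process(initial_image, preset_attribute, encoder, generator):
    z = encoder(initial_image)       # step 1: extract the feature vector
    portrait = z[:, ATTR_DIM:]       # identity-related (portrait) part
    z_adjusted = torch.cat([preset_attribute, portrait], dim=1)  # step 2: swap in the preset
    return generator(z_adjusted)     # step 3: decode the target image
```

For example, setting `preset_attribute` to an "eyes open, facing the camera" vector would regenerate the photo with those attributes, while the untouched portrait part keeps the person recognizable.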
According to one or more embodiments of the present disclosure, in the above method, the adversarial network model includes a portrait decoupling network and a classification network, and obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the adversarial network model includes:
obtaining a feature vector of the person in the initial image through the portrait decoupling network;
and obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the classification network.
According to one or more embodiments of the present disclosure, in the above method, obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the classification network includes:
comparing the feature vector of the person in the initial image with the standard vectors in the classification network to obtain the attribute feature vector and the portrait feature vector of the person in the initial image.
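The disclosure does not fix how this comparison is carried out; one plausible reading, offered here only as a sketch, is that each sub-vector is matched against the classification network's standard attribute vectors by cosine similarity, with close matches routed to the attribute group and the remainder to the portrait group:

```python
import torch
import torch.nn.functional as F

def split_by_standard_vectors(sub_vectors, standard_vectors, threshold=0.5):
    """Route each sub-vector to the attribute group if it is close to any
    standard vector of the classification network, otherwise to the
    portrait group. The cosine-similarity test and the threshold value
    are assumptions for illustration.

    sub_vectors:      (num_subvectors, dim) from the decoupling network
    standard_vectors: (num_standards, dim) references of the classifier
    """
    sims = F.cosine_similarity(
        sub_vectors.unsqueeze(1), standard_vectors.unsqueeze(0), dim=-1
    )                                     # (num_subvectors, num_standards)
    is_attribute = sims.max(dim=1).values > threshold
    return sub_vectors[is_attribute], sub_vectors[~is_attribute]
```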
According to one or more embodiments of the present disclosure, in the above method, the portrait decoupling network is constrained by a Kullback-Leibler divergence loss function.
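A Kullback-Leibler divergence constraint of this kind is most commonly imposed VAE-style, by having the network predict a mean and log-variance and penalizing their divergence from a standard normal prior. Whether the patent uses exactly this closed form is an assumption; a minimal sketch:

```python
import torch

def kl_divergence_loss(mu, logvar):
    """Closed-form KL divergence between N(mu, sigma^2) and N(0, I).
    Added to the reconstruction term, e.g.
    total = reconstruction_loss + beta * kl_divergence_loss(mu, logvar),
    it keeps the decoupled latent space well behaved.
    """
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
```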
According to one or more embodiments of the present disclosure, in the above method, the adversarial network model further includes an attribute feature determination network, and adjusting the attribute feature vector of the person to be consistent with the preset attribute feature vector includes:
determining, through the attribute feature determination network, whether the attribute feature vector of the person is consistent with the preset attribute feature vector;
and if not, adjusting the attribute feature vector of the person to be consistent with the preset attribute feature vector.
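A rough sketch of this determine-then-adjust step, where both the small discriminator architecture and the simple replace-on-mismatch policy are illustrative assumptions:

```python
import torch
import torch.nn as nn

class AttributeDeterminationNet(nn.Module):
    """Hypothetical determination network: scores whether a person's
    attribute vector is consistent with the preset attribute vector."""
    def __init__(self, attr_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * attr_dim, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Sigmoid(),
        )

    def forward(self, attr, preset):
        return self.net(torch.cat([attr, preset], dim=1))

def adjust_if_needed(attr, preset, determiner, threshold=0.5):
    # If judged inconsistent, replace the attribute vector with the
    # preset one; if judged consistent, leave it unchanged.
    consistent = determiner(attr, preset) > threshold
    return torch.where(consistent, attr, preset)
```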
According to one or more embodiments of the present disclosure, the method further includes:
inputting a sample image into an initial network model, and training the initial network model;
inputting a verification image into the trained initial network model to obtain an attribute feature vector and a portrait feature vector of a verification person;
adjusting the attribute feature vector of the verification person to be consistent with a preset attribute feature vector, and generating a verification target image according to the adjusted attribute feature vector and the portrait feature vector;
and if the visual effect of the verification target image meets the quality evaluation requirement, taking the initial network model as the adversarial network model (see the sketch after this list).
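A condensed sketch of this training-and-validation flow. The method names (`training_step`, `encode`, `generate`) and the `passes_quality_check` callback — standing in for the visual-quality judgment, which may well be a human check — are hypothetical:

```python
def train_and_validate(initial_model, sample_images, validation_images,
                       preset_attribute, passes_quality_check):
    # Phase 1: train the initial network model on the sample images.
    for image in sample_images:
        initial_model.training_step(image)

    # Phase 2: validate by regenerating each verification image with the
    # preset attribute vector and checking the visual result.
    for image in validation_images:
        attr, portrait = initial_model.encode(image)
        target = initial_model.generate(preset_attribute, portrait)
        if not passes_quality_check(target):
            return None  # quality not met: keep training or retune

    return initial_model  # adopt as the adversarial network model
```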
According to one or more embodiments of the present disclosure, in the above method, when the adversarial network model includes an attribute feature determination network, taking the initial network model as the adversarial network model if the visual effect of the verification target image meets the quality evaluation requirement includes:
taking the initial network model as the adversarial network model if the visual effect of the verification target image meets the quality evaluation requirement and the accuracy with which the attribute feature determination network judges whether the attribute feature vector of the verification person is consistent with the preset attribute feature vector reaches a preset threshold.
According to one or more embodiments of the present disclosure, in the above method, the attribute feature vector includes at least one of a facial expression vector, an orientation angle vector, and an eye state vector; the portrait feature vector includes at least one of a facial feature vector, a hair feature vector, and a limb feature vector.
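One way to picture these sub-vectors is as named slices of the two feature vectors; the names, groupings, and use of dataclasses below are illustrative only:

```python
from dataclasses import dataclass
import torch

@dataclass
class AttributeFeatures:
    """Attribute sub-vectors: transient, adjustable properties."""
    expression: torch.Tensor   # facial expression
    orientation: torch.Tensor  # face orientation angle
    eye_state: torch.Tensor    # eyes open / closed

    def as_vector(self) -> torch.Tensor:
        return torch.cat([self.expression, self.orientation, self.eye_state])

@dataclass
class PortraitFeatures:
    """Portrait sub-vectors: identity-bearing properties to preserve."""
    face: torch.Tensor   # facial features
    hair: torch.Tensor   # hair features
    limbs: torch.Tensor  # limb features

    def as_vector(self) -> torch.Tensor:
        return torch.cat([self.face, self.hair, self.limbs])
```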
According to one or more embodiments of the present disclosure, there is provided an image processing apparatus including:
the feature vector determination module is used for obtaining an attribute feature vector and a portrait feature vector of a person in the initial image through the adversarial network model;
the feature vector adjustment module is used for adjusting the attribute feature vector of the person to be consistent with a preset attribute feature vector;
and the image generation module is used for generating a target image according to the adjusted attribute feature vector and the portrait feature vector.
According to one or more embodiments of the present disclosure, the adversarial network model in the above apparatus includes a portrait decoupling network and a classification network, and the feature vector determination module 301 is specifically configured to:
obtain a feature vector of the person in the initial image through the portrait decoupling network;
and obtain the attribute feature vector and the portrait feature vector of the person in the initial image through the classification network.
According to one or more embodiments of the present disclosure, when obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the classification network, the feature vector determination module 301 in the above apparatus is specifically configured to:
compare the feature vector of the person in the initial image with the standard vectors in the classification network to obtain the attribute feature vector and the portrait feature vector of the person in the initial image.
According to one or more embodiments of the present disclosure, the portrait decoupling network in the above apparatus is constrained by a Kullback-Leibler divergence loss function.
According to one or more embodiments of the present disclosure, the adversarial network model in the above apparatus further includes an attribute feature determination network, and the feature vector adjustment module 302 is specifically configured to:
determine, through the attribute feature determination network, whether the attribute feature vector of the person is consistent with the preset attribute feature vector;
and if not, adjust the attribute feature vector of the person to be consistent with the preset attribute feature vector.
According to one or more embodiments of the present disclosure, the apparatus further includes: a model training module to:
input a sample image into an initial network model, and train the initial network model;
input a verification image into the trained initial network model to obtain an attribute feature vector and a portrait feature vector of a verification person;
adjust the attribute feature vector of the verification person to be consistent with a preset attribute feature vector, and generate a verification target image according to the adjusted attribute feature vector and the portrait feature vector;
and if the visual effect of the verification target image meets the quality evaluation requirement, take the initial network model as the adversarial network model.
According to one or more embodiments of the present disclosure, in the above apparatus, when the adversarial network model includes an attribute feature determination network, the model training module is specifically configured to:
take the initial network model as the adversarial network model if the visual effect of the verification target image meets the quality evaluation requirement and the accuracy with which the attribute feature determination network judges whether the attribute feature vector of the verification person is consistent with the preset attribute feature vector reaches a preset threshold.
According to one or more embodiments of the present disclosure, the attribute feature vector in the above apparatus includes at least one of a facial expression vector, an orientation angle vector, and an eye state vector; the portrait feature vector includes at least one of a facial feature vector, a hair feature vector, and a limb feature vector.
An electronic device provided in accordance with one or more embodiments of the present disclosure includes:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement an image processing method as in any embodiment of the present disclosure.
According to one or more embodiments of the present disclosure, a readable medium is provided on which a computer program is stored which, when executed by a processor, implements an image processing method according to any embodiment of the present disclosure.
The foregoing description is merely a description of preferred embodiments of the present disclosure and of the principles of the technology employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to the particular combinations of features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure — for example, a technical solution formed by replacing the above features with (but not limited to) features with similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (9)

1. An image processing method, comprising:
obtaining an attribute feature vector and a portrait feature vector of a person in an initial image through an adversarial network model;
adjusting the attribute feature vector of the person to be consistent with a preset attribute feature vector;
generating a target image according to the adjusted attribute feature vector and the portrait feature vector;
wherein the adversarial network model comprises a portrait decoupling network and a classification network, and the portrait decoupling network is constrained by a Kullback-Leibler divergence loss function;
wherein, if a selection operation by the user is detected, the unsatisfactory attribute feature vector selected by the user is taken as a sub-vector inconsistent with the preset attribute feature vector, and the attribute feature vector of the person is adjusted to be consistent with the preset attribute feature vector;
wherein obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the adversarial network model comprises:
obtaining a feature vector of the person in the initial image through the portrait decoupling network;
and dividing, through the classification network, the feature vector of the person in the initial image into the attribute feature vector and the portrait feature vector of the person in the initial image.
2. The method of claim 1, wherein obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the classification network comprises:
comparing the feature vector of the person in the initial image with the standard vectors in the classification network to obtain the attribute feature vector and the portrait feature vector of the person in the initial image.
3. The method of claim 1, wherein the adversarial network model further comprises an attribute feature determination network, and adjusting the attribute feature vector of the person to be consistent with the preset attribute feature vector comprises:
determining, through the attribute feature determination network, whether the attribute feature vector of the person is consistent with the preset attribute feature vector;
and if not, adjusting the attribute feature vector of the person to be consistent with the preset attribute feature vector.
4. The method of claim 1, wherein before obtaining the attribute feature vector and the portrait feature vector of the person in the initial image through the adversarial network model, the method further comprises:
inputting a sample image into an initial network model, and training the initial network model;
inputting a verification image into the trained initial network model to obtain an attribute feature vector and a portrait feature vector of a verification person;
adjusting the attribute feature vector of the verification person to be consistent with a preset attribute feature vector, and generating a verification target image according to the adjusted attribute feature vector and the portrait feature vector;
and if the visual effect of the verification target image meets the quality evaluation requirement, taking the initial network model as the adversarial network model.
5. The method of claim 4, wherein, when the adversarial network model includes an attribute feature determination network, taking the initial network model as the adversarial network model if the visual effect of the verification target image meets the quality evaluation requirement comprises:
taking the initial network model as the adversarial network model if the visual effect of the verification target image meets the quality evaluation requirement and the accuracy with which the attribute feature determination network judges whether the attribute feature vector of the verification person is consistent with the preset attribute feature vector reaches a preset threshold.
6. The method of any one of claims 1-5, wherein the attribute feature vector comprises at least one of a facial expression vector, an orientation angle vector, and an eye state vector; and the portrait feature vector comprises at least one of a facial feature vector, a hair feature vector, and a limb feature vector.
7. An image processing apparatus, comprising:
the feature vector determination module is used for obtaining an attribute feature vector and a portrait feature vector of a person in the initial image through the adversarial network model;
the feature vector adjustment module is used for adjusting the attribute feature vector of the person to be consistent with a preset attribute feature vector;
the image generation module is used for generating a target image according to the adjusted attribute feature vector and the portrait feature vector;
wherein the adversarial network model comprises a portrait decoupling network and a classification network, and the portrait decoupling network is constrained by a Kullback-Leibler divergence loss function;
wherein, if a selection operation by the user is detected, the unsatisfactory attribute feature vector selected by the user is taken as a sub-vector inconsistent with the preset attribute feature vector, and the attribute feature vector of the person is adjusted to be consistent with the preset attribute feature vector;
wherein the feature vector determination module is specifically configured to:
obtain a feature vector of the person in the initial image through the portrait decoupling network;
and divide, through the classification network, the feature vector of the person in the initial image into the attribute feature vector and the portrait feature vector of the person in the initial image.
8. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the image processing method of any one of claims 1-6.
9. A readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the image processing method of any one of claims 1-6.
CN201910913224.4A 2019-09-25 2019-09-25 Image processing method and device, electronic equipment and storage medium Active CN110570383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910913224.4A CN110570383B (en) 2019-09-25 2019-09-25 Image processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110570383A CN110570383A (en) 2019-12-13
CN110570383B 2022-05-06

Family

ID=68782466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910913224.4A Active CN110570383B (en) 2019-09-25 2019-09-25 Image processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110570383B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145097B (en) * 2019-12-31 2023-09-01 华为技术有限公司 Image processing method, device and system
CN111246113B (en) * 2020-03-05 2022-03-18 上海瑾盛通信科技有限公司 Image processing method, device, equipment and storage medium
CN111192201B (en) * 2020-04-08 2020-08-28 腾讯科技(深圳)有限公司 Method and device for generating face image and training model thereof, and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171770A (en) * 2018-01-18 2018-06-15 中科视拓(北京)科技有限公司 A kind of human face expression edit methods based on production confrontation network
CN108510435A (en) * 2018-03-28 2018-09-07 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN109829959A (en) * 2018-12-25 2019-05-31 中国科学院自动化研究所 Expression edition method and device based on face parsing
CN109934767A (en) * 2019-03-06 2019-06-25 中南大学 A kind of human face expression conversion method of identity-based and expressive features conversion

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180247201A1 (en) * 2017-02-28 2018-08-30 Nvidia Corporation Systems and methods for image-to-image translation using variational autoencoders

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant