WO2023179075A1 - Image processing method and apparatus, and electronic device, storage medium and program product - Google Patents

Info

Publication number
WO2023179075A1
WO2023179075A1 (PCT/CN2022/134943)
Authority
WO
WIPO (PCT)
Prior art keywords
attribute
face
dimensional vector
dimensional
type
Prior art date
Application number
PCT/CN2022/134943
Other languages
French (fr)
Chinese (zh)
Inventor
林纯泽
王权
钱晨
Original Assignee
上海商汤智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 filed Critical 上海商汤智能科技有限公司
Publication of WO2023179075A1 publication Critical patent/WO2023179075A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map

Definitions

  • the present disclosure relates to the field of computer technology, and in particular, to an image processing method and device, electronic equipment, storage media and program products.
  • Face attribute editing refers to manipulating and changing face attributes in face images.
  • facial attribute editing is no longer limited to face deformation, but can edit any facial attribute, such as adding glasses, adding beards, changing eye color, changing facial expressions, etc.
  • The current related technology cannot edit a specific face attribute without affecting other face attributes. For example, if the user wants to add glasses, although the related technology can add glasses, it may also cause facial deformation.
  • the present disclosure proposes an image processing technical solution.
  • An image processing method, including: acquiring a face image to be processed; encoding the face image to obtain a first latent variable of the face image; in response to a setting operation for the attribute editing degree of a face attribute, editing the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain an edited second latent variable, where the attribute editing direction represents the direction of enhancement or weakening of the face attribute, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents the degree of enhancement or weakening of the face attribute; and decoding the second latent variable to obtain a target face image, where the display effects of the face attribute in the target face image and in the face image are different.
  • An image processing device, including: an acquisition part configured to acquire a face image to be processed; an encoding part configured to encode the face image to obtain a first latent variable of the face image; an editing part configured to, in response to a setting operation for the attribute editing degree of a face attribute, edit the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain an edited second latent variable; and a decoding part configured to decode the second latent variable to obtain a target face image, where the display effects of the face attribute in the target face image and in the face image are different.
  • an electronic device including: a processor; a memory configured to store instructions executable by the processor; wherein the processor is configured to call instructions stored in the memory to execute the above method.
  • a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the above method is implemented.
  • A computer program product including computer-readable code. When the computer-readable code is run in an electronic device, a processor in the electronic device executes instructions configured to implement the above method.
  • In the embodiments of the present disclosure, the first latent variable is obtained by encoding the face image, the second latent variable is obtained by editing the first latent variable according to the set attribute editing degree of the face attribute and the attribute editing direction corresponding to the face attribute, and the second latent variable is then decoded to obtain the target face image. Since different face attributes have different attribute editing directions, the face attribute specified by the user can be accurately edited without affecting the display effect of other face attributes.
  • FIG. 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure.
  • FIG. 2 shows a schematic diagram of an operation control according to an embodiment of the present disclosure.
  • Figure 3a shows a schematic diagram of a human face image according to an embodiment of the present disclosure.
  • Figure 3b shows a schematic diagram of a target face image according to an embodiment of the present disclosure.
  • Figure 4a shows a schematic diagram of a human face image according to an embodiment of the present disclosure.
  • Figure 4b shows a schematic diagram of a target face image according to an embodiment of the present disclosure.
  • FIG. 5 shows a schematic diagram of an image processing flow according to an embodiment of the present disclosure.
  • Figure 6 shows a schematic diagram of a sample distribution space according to an embodiment of the present disclosure.
  • FIG. 7 shows a block diagram of an image processing device according to an embodiment of the present disclosure.
  • FIG. 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • "Exemplary" means "serving as an example, instance, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferable to or better than other embodiments.
  • "A and/or B" can mean three situations: A exists alone, both A and B exist, or B exists alone.
  • "At least one" herein means any one of a plurality, or any combination of at least two of a plurality. For example, "including at least one of A, B, and C" can mean including any one or more elements selected from the set composed of A, B, and C.
  • FIG. 1 shows a flow chart of an image processing method according to an embodiment of the present disclosure.
  • the image processing method can be executed by an electronic device such as a terminal device or a server.
  • The terminal device can be a user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc.
  • The method can be implemented by the processor calling computer-readable instructions stored in the memory, or the method can be executed by the server.
  • the image processing method includes steps S11 to S14:
  • In step S11, the face image to be processed is obtained.
  • the face image may be an image collected in real time by an image acquisition device, or may be an image extracted from local storage, or may be an image transmitted by other electronic devices, to which embodiments of the present disclosure are not limited. It should be understood that the face in the face image may be a real face or a virtual face, such as the face of an anime character, etc., and the embodiments of the present disclosure are not limited to this.
  • In step S12, the face image is encoded to obtain the first latent variable of the face image.
  • the face image can be encoded by a face image encoder to obtain the first latent variable of the face image.
  • The first latent variable can be expressed as M first N-dimensional vectors, where M and N are positive integers. For example, the face image encoder can encode the face image into 18 512-dimensional vectors.
  • the face image encoder can be implemented using deep learning technology known in the art.
  • The face image encoder can use a deep neural network to extract features of the face image, and use the extracted deep features as the first latent variable of the face image. It should be understood that the embodiments of the present disclosure do not limit the encoding method of face images.
  • In step S13, in response to the setting operation of the attribute editing degree of the face attribute, the first latent variable is edited according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute, and the edited second latent variable is obtained.
  • The face attributes may include, for example, but are not limited to, at least one of: the face shape and posture of the face; the gender, age, and emotion represented by the face; the beard, glasses, and mask on the face; and the pupil color, hair color, makeup color, and filter color.
  • FIG. 2 shows a schematic diagram of an operation control according to an embodiment of the present disclosure.
  • The attribute editing degree of any face attribute can be set by adjusting the position of the "filled circle" of the operation control corresponding to that face attribute along the line segment representing the corresponding value range.
  • the attribute editing direction can represent the direction of enhancement or weakening of the face attributes.
  • the first latent variable can be expressed as M first N-dimensional vectors.
  • the attribute editing direction can be expressed as a second N-dimensional vector.
  • The direction of the second N-dimensional vector represents the attribute editing direction, and the attribute editing directions corresponding to different face attributes are different. In this way, when the user wishes to edit a certain face attribute, at least one first N-dimensional vector in the first latent variable can be edited along the attribute editing direction corresponding to that face attribute, without affecting other face attributes.
  • the attribute editing direction corresponding to the face attribute is obtained by using an attribute classifier to classify the sample face image into two categories.
  • The attribute classifier can adopt a support vector machine, an image classification network, etc., and the embodiments of the present disclosure are not limited to this. It should be understood that each face attribute can correspond to its own attribute classifier; that is, one attribute classifier is used to classify one face attribute.
  • For example, the attribute classifier corresponding to gender can be used to classify the sample face images into two categories, that is, into male faces and female faces. In this case, the attribute editing direction can represent the enhancement direction of a masculine face (that is, the weakening direction of a feminine face), or the weakening direction of a masculine face (that is, the enhancement direction of a feminine face). Likewise, the attribute classifier corresponding to the beard can be used to classify the sample face images into two categories, that is, into bearded faces and beardless faces. In this case, the attribute editing direction can represent the enhancement direction of the beard (that is, the weakening direction of no beard), or the weakening direction of the beard (that is, the enhancement direction of no beard).
  • The attribute editing degree can represent the degree of enhancement or weakening of a face attribute; in other words, the attribute editing degree can represent the extent to which the user expects to edit the face attribute. For example, if the user expects to enhance a certain face attribute, the degree of enhancement of that face attribute can be the attribute editing degree set by the user.
  • a certain value range can be set for the attribute editing degree.
  • For example, the value range of the attribute editing degree can be set to [-3, 3], [-10, 10], etc., where a positive value means the degree of enhancement of the face attribute and a negative value means the degree of weakening of the face attribute. For example, if the user wants to add a beard to the face, the attribute editing degree can be a positive value: the greater the attribute editing degree, the thicker the beard. Conversely, if the user wants to remove the beard from the face, the attribute editing degree can be a negative value: the smaller the attribute editing degree, the sparser the beard.
  • Editing the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain the edited second latent variable may include: adding the product of the set attribute editing degree and the attribute editing direction to at least one first N-dimensional vector in the first latent variable to obtain the edited second latent variable. In this way, the specified face attribute can be edited.
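The addition described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the patent: the function name `edit_latent`, the (M, N) array layout, and the `rows` parameter are assumptions for demonstration.

```python
import numpy as np

def edit_latent(w1, direction, degree, rows=None):
    """Add the product (degree * direction) to selected first N-dim vectors.

    w1:        (M, N) array, the first latent variable (M first N-dim vectors).
    direction: (N,) array, the attribute editing direction (second N-dim vector).
    degree:    float, the attribute editing degree set by the user.
    rows:      indices of the vectors to edit; None edits all M vectors.
    """
    w2 = w1.copy()
    offset = degree * direction            # product of degree and direction
    if rows is None:
        w2 += offset                       # broadcast over all M vectors
    else:
        w2[np.asarray(rows)] += offset     # edit only the selected vectors
    return w2

# Example with 18 first 512-dimensional vectors, as in the embodiment above.
w1 = np.zeros((18, 512))
direction = np.ones(512)
w2 = edit_latent(w1, direction, 2.0, rows=range(5))  # edit vectors 1-5 only
```

A positive `degree` moves the latent along the editing direction (enhancement); a negative `degree` moves it the opposite way (weakening), matching the value-range convention above.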
  • In step S14, the second latent variable is decoded to obtain a target face image.
  • the display effects of the face attributes in the target face image and the face image are different.
  • the generation network can be used to decode the second latent variable to obtain the target face image. It should be understood that the embodiments of the present disclosure do not limit the network structure, network type, and training method of the generating network.
  • The generating network can be obtained by training a generative adversarial network (GAN).
  • the generation network can be used to generate an image with a preset image style based on M N-dimensional vectors.
  • The image style can include, for example, at least a realistic style and a non-realistic style. The non-realistic style can include at least a comic style, a European and American style, a sketch style, an oil painting style, a printmaking style, etc. That is, the face in the target face image may be a realistic-style face or a non-realistic-style face.
  • Since the second latent variable is obtained by editing the first latent variable of the face image, the face attributes in the target face image generated based on the second latent variable differ in display effect from those in the original face image. When the product of the attribute editing direction and the attribute editing degree is a positive value, the display effect of the face attribute in the target face image obtained based on the second latent variable is enhanced relative to the face image; when the product is a negative value, the display effect of the face attribute is weakened relative to the face image. A positive product means that the face attribute is enhanced by the attribute editing degree along the same direction as the attribute editing direction; a negative product means that the face attribute is weakened by the attribute editing degree along the direction opposite to the attribute editing direction.
  • Figure 3a shows a schematic diagram of a human face image according to an embodiment of the present disclosure
  • Figure 3b shows a schematic diagram of a target human face image according to an embodiment of the present disclosure
  • The target face image shown in Figure 3b can be an image obtained by editing the attributes of the face image shown in Figure 3a: the face in the target face image in Figure 3b is younger than the face in the face image in Figure 3a. Figures 3a and 3b are both realistic-style images.
  • Figure 4a shows a schematic diagram of a human face image according to an embodiment of the present disclosure
  • Figure 4b shows a schematic diagram of a target human face image according to an embodiment of the present disclosure
  • The target face image shown in Figure 4b can be an image obtained by editing the attributes of the face image shown in Figure 4a: the face in the target face image in Figure 4b is younger than the face in the face image in Figure 4a. Figures 4a and 4b are both oil-painting-style images.
  • Figure 5 shows a schematic diagram of an image processing flow according to an embodiment of the present disclosure.
  • The image processing flow may include: inputting the face image 51 into the face image encoder 52 to obtain the first latent variable corresponding to the face image; editing the first latent variable according to the set attribute editing degree of the face attribute and the attribute editing direction of the face attribute to obtain the edited second latent variable 53; and inputting the second latent variable into the generation network 54 to obtain the target face image. Since the attribute editing degree of the face attribute "beard" is positive, there is more beard on the face in the target face image than in the face image; that is, the display effects of the face attribute in the target face image and in the face image are different.
  • In the embodiments of the present disclosure, the first latent variable is obtained by encoding the face image, the second latent variable is obtained by editing the first latent variable according to the set attribute editing degree of the face attribute and the attribute editing direction corresponding to the face attribute, and the second latent variable is then decoded to obtain the target face image. Since different face attributes have different attribute editing directions, the face attribute specified by the user can be accurately edited without affecting the display effect of other face attributes.
  • the attribute editing direction corresponding to the face attribute is obtained by using the attribute classifier to classify the sample face image into two categories.
  • Using an attribute classifier to classify the sample face images into two categories includes: using the attribute classifier corresponding to the face attribute to classify the sample face images into two categories to obtain the attribute classification boundary in the latent space of the sample face images, where the latent space represents the sample distribution space in which the sample latent variables corresponding to the sample face images are distributed; and determining the direction in which the attribute classification boundary faces the positive sample attribute of the face attribute as the attribute editing direction.
  • the sample face image may be a face image randomly generated by the above-mentioned generation network based on M randomly distributed N-dimensional vectors, or it may be a face image actually captured by the image acquisition device, which is not limited by the embodiments of the present disclosure.
  • the sample latent variable corresponding to the sample face image can be understood as the feature vector corresponding to the sample face image, so the latent space can also be understood as the vector distribution space in which the feature vectors corresponding to the sample face image are distributed.
  • The two-class classification of sample face images can be briefly described as finding an interface that divides the sample latent variables in the latent space into two parts: one part represents the positive sample attribute of the face attribute, and the other part represents the negative sample attribute of the face attribute. This interface can be used as the attribute classification boundary.
  • The normal vector or unit normal vector perpendicular to the interface can be obtained; for example, if the interface is the plane ax + by + cz = d, the normal vector is (a, b, c), and the unit normal vector is the normal vector divided by its length.
  • The normal vector (or unit normal vector) pointing to the positive sample attribute side can be determined as the attribute editing direction.
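The boundary-normal idea can be illustrated with a toy two-class fit. This is a hedged sketch, not the patent's method: the patent mentions support vector machines, but here a plain least-squares linear classifier (NumPy only) stands in for the SVM, and the 8-dimensional sample latents are synthetic.

```python
import numpy as np

# Toy sample latent variables: positive samples shifted along axis 0.
rng = np.random.default_rng(0)
neg = rng.normal(0.0, 1.0, size=(200, 8))       # negative sample attribute
pos = rng.normal(0.0, 1.0, size=(200, 8))
pos[:, 0] += 4.0                                # positive sample attribute
X = np.vstack([neg, pos])
y = np.array([-1.0] * 200 + [1.0] * 200)        # two-class labels

# Fit a linear boundary w.x + b = 0 by least squares (stand-in for a linear SVM).
A = np.hstack([X, np.ones((400, 1))])           # append a bias column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
normal = coef[:8]                               # normal vector of the boundary
direction = normal / np.linalg.norm(normal)     # unit normal toward positives
```

Because the labels put positive samples on the +1 side, `direction` points toward the positive sample attribute, which is exactly the attribute editing direction described above.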
  • Figure 6 shows a schematic diagram of a sample distribution space according to an embodiment of the present disclosure.
  • For ease of understanding, the sample distribution space is expressed as a two-dimensional space; the actual sample distribution space can be a high-dimensional distribution space.
  • The square represents the positive sample attribute, the triangle represents the negative sample attribute, and the attribute editing direction is the direction in which the attribute classification boundary faces the positive sample attribute.
  • The attribute editing direction can represent the enhancement direction of the positive sample attribute (or the weakening direction of the negative sample attribute), and the opposite of the attribute editing direction can represent the weakening direction of the positive sample attribute (or the enhancement direction of the negative sample attribute).
  • the direction in which the attribute classification boundary faces the negative sample attribute can also be determined as the attribute editing direction.
  • In this case, the attribute editing direction can represent the enhancement direction of the negative sample attribute (or the weakening direction of the positive sample attribute), and the opposite of the attribute editing direction can represent the weakening direction of the negative sample attribute (or the enhancement direction of the positive sample attribute).
  • positive sample attributes and negative sample attributes of each face attribute can be customized, and this is not limited by the embodiments of the present disclosure.
  • the attribute editing directions of different face attributes can be determined quickly and effectively.
  • the first latent variable can be expressed as M first N-dimensional vectors
  • the attribute editing direction can be expressed as a second N-dimensional vector.
  • N and M are positive integers.
  • In step S13, editing the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain the edited second latent variable includes: determining, according to the attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts; calculating the product of the second N-dimensional vector and the attribute editing degree to obtain a third N-dimensional vector; and adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain M fourth N-dimensional vectors, the second latent variable being represented as the M fourth N-dimensional vectors.
  • The generation network includes multiple network layers with different resolutions. Network layers with different resolutions have different sensitivities to, or different learning effects on, different face attributes. Therefore, the network layers with different resolutions in the generation network can be used to process the second latent variables corresponding to different face attributes.
  • The low-resolution network layer of the generation network is relatively sensitive to the first type of face attributes, such as face shape, posture, and the gender, age, and emotion represented by the face; the medium-resolution network layer is more sensitive to the second type of face attributes, such as beards, glasses, and masks; and the high-resolution network layer is more sensitive to the third type of face attributes, such as pupil color, hair color, makeup color, and filter color.
  • In other words, the low-resolution network layer of the generation network is more sensitive to the first type of face attributes than the medium-resolution and high-resolution network layers; the medium-resolution network layer is more sensitive to the second type of face attributes than the low-resolution and high-resolution network layers; and the high-resolution network layer is more sensitive to the third type of face attributes than the low-resolution and medium-resolution network layers.
  • The attribute editing direction of a certain face attribute can be applied to part of the first N-dimensional vectors of the first latent variable; that is, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts can be determined. Because different face attributes correspond to different attribute editing directions, the impact on other face attributes when editing a certain face attribute can be reduced, or even eliminated.
  • For example, the attribute editing direction of the first type of face attributes can act on the 1st first N-dimensional vector to the i-th first N-dimensional vector; the attribute editing direction of the second type of face attributes can act on the (i+1)-th first N-dimensional vector to the j-th first N-dimensional vector; and the attribute editing direction of the third type of face attributes can act on the (j+1)-th first N-dimensional vector to the M-th first N-dimensional vector, where i ∈ [1, M] and j ∈ [2, M].
  • In this way, face attributes of different attribute types can be made to correspond to different first N-dimensional vectors, so that the generation network can subsequently decode the second latent variable obtained by editing the first N-dimensional vectors to obtain the adjusted target face image.
  • For example, suppose the first latent variable is expressed as three two-dimensional vectors {(a1,b1),(a2,b2),(a3,b3)}, the attribute editing direction of a face attribute of a certain attribute type acts on the two-dimensional vector (a1,b1), the attribute editing degree is 2, and the attribute editing direction is expressed as a two-dimensional vector (m,n). Then the second latent variable is expressed as three two-dimensional vectors {(2m+a1,2n+b1),(a2,b2),(a3,b3)}.
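The worked example above can be checked numerically. The concrete values chosen below for (a1,b1), (a2,b2), (a3,b3), (m,n), and the editing degree are arbitrary illustrations, not from the patent.

```python
import numpy as np

# First latent variable: three 2-dimensional vectors {(a1,b1),(a2,b2),(a3,b3)}.
w1 = np.array([[1.0, 2.0],    # (a1, b1)
               [3.0, 4.0],    # (a2, b2)
               [5.0, 6.0]])   # (a3, b3)
direction = np.array([0.5, -0.5])   # (m, n), the attribute editing direction
degree = 2.0                        # the set attribute editing degree

w2 = w1.copy()
w2[0] += degree * direction         # the direction acts only on (a1, b1)

# w2 is {(2m+a1, 2n+b1), (a2, b2), (a3, b3)} = {(2.0, 1.0), (3, 4), (5, 6)}
```

Only the first vector changes; the other two are carried over unchanged, which is why other face attributes are unaffected.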
  • any facial attribute can be accurately edited while reducing the impact on other facial attributes, or even having no impact on other facial attributes.
  • The attribute types corresponding to the face attributes include at least one of the first type of face attributes, the second type of face attributes, and the third type of face attributes. The first type of face attributes includes at least one of: face shape, posture, and the gender, age, and emotion represented by the face.
  • the second type of face attributes includes at least one of beard, glasses, and mask on the face.
  • the third type of face attributes includes at least one of pupil color, hair color, makeup color, and filter color.
  • Determining, based on the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts includes: in a case where the face attributes include the first type of face attributes, determining that the attribute editing direction acts on the 1st first N-dimensional vector to the i-th first N-dimensional vector; and/or, in a case where the face attributes include the second type of face attributes, determining that the attribute editing direction acts on the (i+1)-th first N-dimensional vector to the j-th first N-dimensional vector; and/or, in a case where the face attributes include the third type of face attributes, determining that the attribute editing direction acts on the (j+1)-th first N-dimensional vector to the M-th first N-dimensional vector.
  • As mentioned above, the low-resolution network layer of the generation network is relatively sensitive to the first type of face attributes, such as face shape, posture, and the gender, age, and emotion represented by the face; the medium-resolution network layer is more sensitive to the second type of face attributes, such as beards, glasses, and masks; and the high-resolution network layer is more sensitive to the third type of face attributes, such as pupil color, hair color, makeup color, and filter color.
  • Accordingly, the 1st first N-dimensional vector to the i-th first N-dimensional vector may correspond to the low-resolution network layer of the generation network, the (i+1)-th first N-dimensional vector to the j-th first N-dimensional vector may correspond to the medium-resolution network layer of the generation network, and the (j+1)-th first N-dimensional vector to the M-th first N-dimensional vector may correspond to the high-resolution network layer of the generation network.
  • Here, i and j can be empirical values determined through experimental testing based on the network structure of the generation network, and the embodiments of the present disclosure are not limited to this. For example, if M is 18, i can be set to 5 and j to 10.
  • Adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain M fourth N-dimensional vectors includes, for example: in a case where the first latent variable is represented by 18 first 512-dimensional vectors, the product of the attribute editing direction and the attribute editing degree is a third 512-dimensional vector, i is 5, and j is 10, then when the face attributes include the first type of face attributes, the 1st to 5th first 512-dimensional vectors among the 18 first 512-dimensional vectors can each be added to the third 512-dimensional vector; when the face attributes include the second type of face attributes, the 6th to 10th first 512-dimensional vectors can each be added to the third 512-dimensional vector; and when the face attributes include the third type of face attributes, the 11th to 18th first 512-dimensional vectors can each be added to the third 512-dimensional vector.
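The i = 5, j = 10 layout can be sketched as a lookup from attribute type to the edited vector indices. The dictionary `ROWS_BY_TYPE`, the type labels, and the function name are hypothetical, not from the patent; indices are zero-based, so vectors 1-5 are rows 0-4.

```python
import numpy as np

M, N = 18, 512        # 18 first 512-dimensional vectors, as in the example
i, j = 5, 10          # empirical boundaries from the embodiment

# Hypothetical mapping from attribute type to the rows (first N-dim vectors)
# to which the edit is applied.
ROWS_BY_TYPE = {
    "first":  list(range(0, i)),    # vectors 1-5:   low-resolution layers
    "second": list(range(i, j)),    # vectors 6-10:  medium-resolution layers
    "third":  list(range(j, M)),    # vectors 11-18: high-resolution layers
}

def edit_by_type(w1, third_vector, attr_type):
    """Add the third N-dimensional vector to the rows chosen by attribute type."""
    w2 = w1.copy()
    w2[ROWS_BY_TYPE[attr_type]] += third_vector
    return w2

w1 = np.zeros((M, N))
w2 = edit_by_type(w1, np.ones(N), "second")   # only vectors 6-10 change
```

Restricting the edit to the rows feeding the matching resolution band is what keeps attributes of the other two types untouched.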
  • any facial attribute can be accurately edited while reducing the impact on other facial attributes, or even having no impact on other facial attributes.
  • the generation network includes multiple network layers with different resolutions.
  • The network layers with different resolutions can be used to process the second latent variables corresponding to different face attributes, and the generation network is used to generate an image with a preset image style based on M N-dimensional vectors; the generation network is then used to decode the second latent variable to obtain the target face image.
  • The second latent variable is represented as M fourth N-dimensional vectors, and the generation network includes M network layers. Using the generation network to decode the second latent variable to obtain the target face image includes: inputting the 1st fourth N-dimensional vector into the 1st network layer of the generation network to obtain the 1st intermediate image; inputting the m-th fourth N-dimensional vector and the (m-1)-th intermediate image into the m-th network layer of the generation network to obtain the m-th intermediate image output by the m-th network layer, m ∈ [2, M); and inputting the M-th fourth N-dimensional vector and the (M-1)-th intermediate image into the M-th network layer of the generation network to obtain the target face image output by the M-th network layer.
  • when m = 2, the (m-1)-th intermediate image is the 1st intermediate image, so the 2nd intermediate image is obtained based on the 2nd fourth N-dimensional vector and the 1st intermediate image.
  • the 2nd to the (M-1)-th intermediate images are all determined based on the m-th fourth N-dimensional vector and the (m-1)-th intermediate image output by the preceding network layer; it follows that the (M-1)-th intermediate image is obtained based on the (M-1)-th fourth N-dimensional vector and the (M-2)-th intermediate image, the (M-2)-th intermediate image is obtained based on the (M-2)-th fourth N-dimensional vector and the (M-3)-th intermediate image, and so on.
  • in this way, the generation network can generate images at progressively increasing resolutions: the input of the 1st network layer is a fourth N-dimensional vector, the input of each subsequent network layer includes a fourth N-dimensional vector and the intermediate image output by the preceding network layer, and the last network layer outputs the target face image.
  • the generative network can also be called a multi-layer transformation generative network.
  • the low-resolution network layers of the generation network (also called shallow network layers) first learn to generate low-resolution intermediate images (such as 4×4 resolution); as the network depth gradually increases, they continue to learn to generate higher-resolution intermediate images (such as 512×512 resolution), and finally the highest-resolution target face image (such as 1024×1024 resolution) is generated.
  • the generation network can be used to decode the second latent variable to effectively obtain the target face image.
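The progressive, layer-wise decoding described above can be sketched as a loop. The layer functions here are toy stand-ins (an "image" is modelled only by its resolution, doubling per layer up to a 1024×1024 cap, mirroring the 4×4 → 1024×1024 progression mentioned above); in the patent, the layers are trained layers of a generation network, and the function names here are assumptions.

```python
from typing import Callable, List

def decode(latent: List, layers: List[Callable]):
    """Layer-wise decoding: the 1st layer takes only the 1st fourth
    N-dimensional vector; each later layer takes the m-th vector plus the
    intermediate image output by the preceding layer; the last layer's
    output is the target face image."""
    image = layers[0](latent[0])                  # 1st intermediate image
    for m in range(1, len(layers)):
        image = layers[m](latent[m], image)       # m-th intermediate image
    return image                                  # target face image

def make_toy_layers(m_layers: int) -> List[Callable]:
    """Toy layers: 'images' are represented by their resolution only,
    starting at 4x4 and doubling per layer up to a 1024x1024 cap."""
    first = lambda v: 4
    later = lambda v, img: min(img * 2, 1024)
    return [first] + [later] * (m_layers - 1)

resolution = decode([None] * 18, make_toy_layers(18))
# With 18 toy layers the final 'image' reaches the 1024x1024 cap.
```

The loop structure, not the toy arithmetic, is the point: each network layer consumes one of the M fourth N-dimensional vectors plus the previous intermediate image, exactly as the decoding steps enumerate.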
  • the user can perform attribute editing at least once on the same face attribute according to the image processing method of steps S11 to S14 in the above embodiments of the present disclosure, or can perform attribute editing at least once on each of several different face attributes, to obtain the target face image.
  • in traditional image-warping methods, the editable face attributes are limited, and existing attribute edits cannot be decoupled, so edits to different attributes easily affect one another.
  • with the method of the embodiments of the present disclosure, a specified face attribute can be edited more accurately, and the impact of editing any face attribute on other face attributes can be reduced or even eliminated; the method can be applied to face attribute editing in different image styles.
  • the present disclosure also provides image processing devices, electronic equipment, computer-readable storage media, and programs, all of which can be used to implement any image processing method provided by the present disclosure.
  • Figure 7 shows a block diagram of an image processing device according to an embodiment of the present disclosure. As shown in Figure 7, the device includes:
  • the acquisition part 101 is configured to acquire the face image to be processed
  • the encoding part 102 is configured to perform encoding processing on the face image to obtain the first latent variable of the face image;
  • the editing part 103 is configured to, in response to a setting operation for the attribute editing degree of a face attribute, edit the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain an edited second latent variable, wherein the attribute editing direction represents the enhancement direction or weakening direction of the face attribute, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents the enhancement degree or weakening degree of the face attribute;
  • the decoding part 104 is configured to decode the second latent variable to obtain a target face image, where the display effects of the face attributes in the target face image and the face image are different.
  • in a possible implementation, the attribute editing direction corresponding to the face attribute is obtained by using an attribute classifier to classify sample face images, which includes: using the attribute classifier corresponding to the face attribute to perform binary classification on the sample face images to obtain an attribute classification boundary in the latent space, where the latent space represents the sample distribution space of the sample latent variables corresponding to the sample face images; and determining the direction in which the attribute classification boundary faces the positive samples of the face attribute as the attribute editing direction.
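The boundary-based direction can be illustrated with a small, self-contained sketch. The patent only specifies "an attribute classifier" performing binary classification; the plain logistic-regression fit below (and taking the boundary's unit normal as the editing direction) is one assumed concrete choice, and the toy latent data are fabricated for illustration.

```python
import numpy as np

def editing_direction(latents: np.ndarray, labels: np.ndarray,
                      steps: int = 500, lr: float = 0.1) -> np.ndarray:
    """Fit a linear binary classifier sigmoid(latents @ w + b) on sample
    latent vectors (labels: 1 = attribute present, 0 = absent) and return
    the unit normal of the resulting classification boundary, oriented
    toward the positive samples. Logistic regression is used only to keep
    the sketch self-contained; the patent does not name a classifier type."""
    n = latents.shape[1]
    w, b = np.zeros(n), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(latents @ w + b)))   # predicted P(positive)
        w -= lr * latents.T @ (p - labels) / len(labels)
        b -= lr * np.mean(p - labels)
    return w / np.linalg.norm(w)   # the attribute editing direction

# Fabricated toy latent space: positives shifted along the first axis.
rng = np.random.default_rng(0)
neg = rng.normal(0.0, 1.0, (200, 8))
pos = rng.normal(0.0, 1.0, (200, 8))
pos[:, 0] += 4.0
samples = np.vstack([neg, pos])
labels = np.concatenate([np.zeros(200), np.ones(200)])
direction = editing_direction(samples, labels)
# `direction` points predominantly along the first (attribute) axis.
```

Because the classifier is linear, its weight vector is exactly the normal of the classification boundary, and normalizing it gives a direction pointing toward the positive-attribute side, matching the construction described above.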
  • the first latent variable is represented as M first N-dimensional vectors
  • the attribute editing direction is represented as a second N-dimensional vector
  • N and M are positive integers
  • the editing part 103 includes: a determination sub-part configured to determine, based on the attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts; a calculation sub-part configured to calculate the product of the second N-dimensional vector and the attribute editing degree to obtain a third N-dimensional vector; and an addition sub-part configured to add the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain M fourth N-dimensional vectors, the second latent variable being represented as the M fourth N-dimensional vectors.
  • the attribute type of the face attribute includes at least one of a first type of face attribute, a second type of face attribute, and a third type of face attribute; the attribute editing direction of the first type of face attribute acts on the 1st first N-dimensional vector to the i-th first N-dimensional vector; the attribute editing direction of the second type of face attribute acts on the (i+1)-th first N-dimensional vector to the j-th first N-dimensional vector; the attribute editing direction of the third type of face attribute acts on the (j+1)-th first N-dimensional vector to the M-th first N-dimensional vector; where i ∈ [1, M], j ∈ [2, M].
  • the first type of face attributes includes at least one of the face shape, the posture, and the gender, age, and emotion represented by the face; the second type of face attributes includes at least one of beards, glasses, and masks on the face; the third type of face attributes includes at least one of pupil color, hair color, makeup color, and filter color.
  • the attribute type includes the first type of face attribute, and determining, based on the attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts includes: when the face attribute includes the first type of face attribute, determining that the attribute editing direction acts on the 1st first N-dimensional vector to the i-th first N-dimensional vector; accordingly, adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain M fourth N-dimensional vectors includes: adding the 1st first N-dimensional vector to the i-th first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors.
  • the attribute type includes the second type of face attribute, and determining, based on the attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts includes: when the face attribute includes the second type of face attribute, determining that the attribute editing direction acts on the (i+1)-th first N-dimensional vector to the j-th first N-dimensional vector, i ∈ [1, M], j ∈ [2, M]; accordingly, adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain M fourth N-dimensional vectors includes: adding the (i+1)-th first N-dimensional vector to the j-th first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors.
  • the attribute type includes the third type of face attribute, and determining, based on the attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts includes: when the face attribute includes the third type of face attribute, determining that the attribute editing direction acts on the (j+1)-th first N-dimensional vector to the M-th first N-dimensional vector, j ∈ [2, M]; accordingly, adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain M fourth N-dimensional vectors includes: adding the (j+1)-th first N-dimensional vector to the M-th first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors.
  • the decoding part 104 includes: a network decoding sub-part configured to use a generation network to decode the second latent variable to obtain the target face image; the generation network is used to generate an image with a preset image style based on M N-dimensional vectors; the generation network includes a plurality of network layers with different resolutions, and the network layers with different resolutions are respectively used to process the second latent variables corresponding to different face attributes.
  • the generation network includes M network layers, the second latent variable is represented as M fourth N-dimensional vectors, and using the generation network to decode the second latent variable to obtain the target face image includes: inputting the 1st fourth N-dimensional vector to the 1st network layer of the generation network to obtain the 1st intermediate image output by the 1st network layer; inputting the m-th fourth N-dimensional vector and the (m-1)-th intermediate image to the m-th network layer of the generation network to obtain the m-th intermediate image output by the m-th network layer, m ∈ [2, M); and inputting the M-th fourth N-dimensional vector and the (M-1)-th intermediate image to the M-th network layer of the generation network to obtain the target face image output by the M-th network layer.
  • when the product of the attribute editing direction and the attribute editing degree is a positive value, the display effect of the face attribute in the target face image obtained based on the second latent variable is enhanced relative to the face image; when the product of the attribute editing direction and the attribute editing degree is a negative value, the display effect of the face attribute in the target face image obtained based on the second latent variable is weakened relative to the face image.
  • in the embodiments of the present disclosure, the first latent variable is obtained by encoding the face image; the second latent variable is obtained by editing the first latent variable according to the set attribute editing degree of the face attribute and the attribute editing direction corresponding to the face attribute; and the target face image is obtained by decoding the second latent variable. Since different face attributes have different attribute editing directions, the face attribute specified by the user can be edited accurately without affecting the display effect of other face attributes.
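Putting the steps together, the overall flow (encode → edit in latent space → decode) can be sketched with stubbed models. The encoder and generation network are trained models in the patent; here they are placeholder functions whose names are assumptions, and a negative degree illustrates the weakening case described above.

```python
import numpy as np

M, N = 18, 512

def encode(face_image) -> np.ndarray:
    """Placeholder for the encoder: in the method, a trained model maps the
    face image to the first latent variable (M first N-dimensional vectors)."""
    return np.zeros((M, N))

def decode(latent: np.ndarray) -> np.ndarray:
    """Placeholder for the generation network: in the method, a trained model
    maps the second latent variable to the target face image. The latent is
    returned unchanged here so the flow stays inspectable."""
    return latent

def edit_face(face_image, direction: np.ndarray, degree: float, rows: slice):
    """End-to-end flow: encode -> edit in latent space -> decode.
    `rows` selects which of the M vectors the attribute direction acts on."""
    w = encode(face_image)                 # first latent variable
    w2 = w.copy()
    w2[rows] += degree * direction         # second latent variable
    return decode(w2)                      # target face image

out = edit_face(None, np.ones(N), degree=-0.5, rows=slice(5, 10))
# A negative degree * direction weakens the attribute (rows 5..9 hold -0.5).
```

Only the latent-editing step differs between enhancing and weakening an attribute: the sign of the degree flips, while the encoder and generation network are unchanged.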
  • the functions of, or the parts included in, the device provided by the embodiments of the present disclosure can be configured to perform the methods described in the above method embodiments; for specific implementations, reference may be made to the descriptions of the above method embodiments.
  • Embodiments of the present disclosure also provide a computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the above method is implemented.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium.
  • An embodiment of the present disclosure also provides an electronic device, including: a processor; a memory configured to store instructions executable by the processor; wherein the processor is configured to call instructions stored in the memory to perform the above method.
  • Embodiments of the present disclosure also provide a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying the computer-readable code.
  • when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
  • the electronic device may be provided as a terminal, a server, or other forms of equipment.
  • FIG. 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic device 1900 may be provided as a server or terminal device.
  • electronic device 1900 includes a processing component 1922 , which may include one or more processors, and memory resources represented by memory 1932 configured to store instructions, such as applications, executable by processing component 1922 .
  • An application stored in memory 1932 may include one or more portions, each corresponding to a set of instructions.
  • the processing component 1922 is configured to execute instructions to perform the above-described methods.
  • Electronic device 1900 may also include a power supply component 1926 configured to perform power management of electronic device 1900, a wired or wireless network interface 1950 configured to connect electronic device 1900 to a network, and an input-output (I/O) interface 1958 .
  • a non-volatile computer-readable storage medium is also provided, such as a memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to complete the above method.
  • Embodiments of the present disclosure may be systems, methods, and/or computer program products.
  • a computer program product may include a computer-readable storage medium having thereon computer-readable program instructions configured to cause a processor to implement aspects of the disclosed embodiments.
  • Computer-readable storage media may be tangible devices that can retain and store instructions for use by an instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above.
  • Computer-readable storage media may include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanical encoding devices such as punched cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber-optic cables), or electrical signals transmitted through electrical wires.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to various computing/processing devices, or to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create an apparatus that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause a computer, programmable data processing apparatus, and/or other equipment to work in a specific manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions implementing aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other equipment, causing a series of operating steps to be performed thereon to produce a computer-implemented process, so that the instructions executed on the computer, other programmable data processing apparatus, or other equipment implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in the flowcharts or block diagrams may represent a part, a program segment, or a portion of instructions that contains one or more executable instructions configured to implement the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two consecutive blocks may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or actions, or can be implemented using a combination of special-purpose hardware and computer instructions.
  • the computer program product may be implemented in hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium.
  • the computer program product is embodied as a software product, such as a software development kit (SDK).
  • the products applying the disclosed technical solution clearly inform individuals of the personal information processing rules and obtain their separate consent before processing personal information.
  • the product applying the disclosed technical solution must obtain the individual's separate consent before processing sensitive personal information, and at the same time meet the requirement of "express consent"; for example, clear and conspicuous signs are set up on personal information collection devices such as cameras to inform individuals that they have entered the scope of personal information collection and that their personal information will be collected.
  • the personal information processing rules may include information such as the personal information processor, the purpose of processing the personal information, the processing methods, and the types of personal information processed.
  • in the embodiments of the present disclosure, a face image to be processed is acquired; the face image is encoded to obtain a first latent variable of the face image; in response to a setting operation for the attribute editing degree of a face attribute, the first latent variable is edited according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain an edited second latent variable, where the attribute editing direction represents the enhancement direction or weakening direction of the face attribute, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents the enhancement degree or weakening degree of the face attribute; and the second latent variable is decoded to obtain a target face image in which the display effect of the face attribute differs from that in the face image.
  • the embodiments of the present disclosure can accurately edit the face attributes specified by the user.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure relates to an image processing method and apparatus, and an electronic device, a storage medium and a program product. The method comprises: acquiring a facial image to be processed; performing encoding processing on the facial image, so as to obtain a first hidden variable of the facial image; in response to a setting operation for an attribute editing degree of a facial attribute, editing the first hidden variable according to the set attribute editing degree and an attribute editing direction corresponding to the facial attribute, so as to obtain an edited second hidden variable, wherein the attribute editing direction represents an enhancement direction or a weakening direction of the facial attribute, different facial attributes correspond to different attribute editing directions, and the attribute editing degree represents an enhancement degree or a weakening degree of the facial attribute; and performing decoding processing on the second hidden variable, so as to obtain a target facial image, wherein the display effect of the facial attribute in the target facial image is different from the display effect of the facial attribute in the facial image.

Description

Image processing method and apparatus, electronic device, storage medium and program product
Cross-reference to related applications
The present disclosure is based on, and claims priority to, Chinese patent application No. 202210279511.6, filed on March 22, 2022 and entitled "Image processing method and device, electronic equipment and storage medium"; the entire content of the Chinese patent application is incorporated into the present disclosure by reference in its entirety.
Technical field
The present disclosure relates to the field of computer technology, and in particular, to an image processing method and apparatus, an electronic device, a storage medium, and a program product.
Background
Face attribute editing refers to manipulating and changing the face attributes in a face image. In the field of deep learning, face attribute editing is no longer limited to deforming the face; any face attribute can be edited, for example, adding glasses, adding a beard, changing eye color, or changing facial expression.
However, current related technologies cannot edit a specified face attribute without affecting other face attributes. For example, if a user wants to add glasses, the related technology can add the glasses, but it may also cause facial deformation at the same time.
Summary
The present disclosure proposes an image processing technical solution.
According to one aspect of the present disclosure, an image processing method is provided, including: acquiring a face image to be processed; encoding the face image to obtain a first latent variable of the face image; in response to a setting operation for the attribute editing degree of a face attribute, editing the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain an edited second latent variable, wherein the attribute editing direction represents the enhancement direction or weakening direction of the face attribute, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents the enhancement degree or weakening degree of the face attribute; and decoding the second latent variable to obtain a target face image, the display effect of the face attribute in the target face image being different from that in the face image.
According to one aspect of the present disclosure, an image processing device is provided, including: an acquisition part configured to acquire a face image to be processed; an encoding part configured to encode the face image to obtain a first latent variable of the face image; an editing part configured to, in response to a setting operation for the attribute editing degree of a face attribute, edit the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain an edited second latent variable, wherein the attribute editing direction represents the enhancement direction or weakening direction of the face attribute, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents the enhancement degree or weakening degree of the face attribute; and a decoding part configured to decode the second latent variable to obtain a target face image, the display effect of the face attribute in the target face image being different from that in the face image.
According to one aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory configured to store instructions executable by the processor; wherein the processor is configured to call the instructions stored in the memory to execute the above method.
According to one aspect of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored; when the computer program instructions are executed by a processor, the above method is implemented.
According to one aspect of the present disclosure, a computer program product is provided, including computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes the above method.
In the embodiments of the present disclosure, the first latent variable is obtained by encoding the face image; the second latent variable is obtained by editing the first latent variable according to the set attribute editing degree of a face attribute and the attribute editing direction corresponding to the face attribute; and the target face image is obtained by decoding the second latent variable. Since different face attributes have different attribute editing directions, the face attribute specified by the user can be edited accurately without affecting the display effect of other face attributes.
It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and do not limit the present disclosure. Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings here are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure.
FIG. 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure.
FIG. 2 shows a schematic diagram of an operation control according to an embodiment of the present disclosure.
FIG. 3a shows a schematic diagram of a face image according to an embodiment of the present disclosure.
FIG. 3b shows a schematic diagram of a target face image according to an embodiment of the present disclosure.
FIG. 4a shows a schematic diagram of a face image according to an embodiment of the present disclosure.
FIG. 4b shows a schematic diagram of a target face image according to an embodiment of the present disclosure.
FIG. 5 shows a schematic diagram of an image processing flow according to an embodiment of the present disclosure.
FIG. 6 shows a schematic diagram of a sample distribution space according to an embodiment of the present disclosure.
FIG. 7 shows a block diagram of an image processing device according to an embodiment of the present disclosure.
FIG. 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features, and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are illustrated in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" as used herein means "serving as an example, instance, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are given in the following detailed description in order to better explain the present disclosure. Those skilled in the art will understand that the present disclosure may be practiced without certain specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the subject matter of the present disclosure.
FIG. 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure. The image processing method may be executed by an electronic device such as a terminal device or a server. The terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. The method may be implemented by a processor invoking computer-readable instructions stored in a memory, or the method may be executed by a server. As shown in FIG. 1, the image processing method includes steps S11 to S14.
In step S11, a face image to be processed is obtained.
The face image may be an image captured in real time by an image acquisition device, an image retrieved from local storage, or an image transmitted by another electronic device; the embodiments of the present disclosure are not limited in this regard. It should be understood that the face in the face image may be a real face or a virtual face, such as the face of an anime character; the embodiments of the present disclosure are not limited in this regard either.
In step S12, the face image is encoded to obtain a first latent variable of the face image.
In a possible implementation, the face image may be encoded by a face image encoder to obtain the first latent variable of the face image. The first latent variable may be expressed as M first N-dimensional vectors, where M and N are positive integers; for example, the face image encoder may encode the face image into 18 vectors of 512 dimensions.
The face image encoder may be implemented using deep learning techniques known in the art. For example, the face image encoder may use a deep neural network to extract features from the face image and take the extracted deep features as the first latent variable of the face image. It should be understood that the embodiments of the present disclosure do not limit the encoding method of the face image.
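To make the shape of the first latent variable concrete, the following minimal sketch treats the encoder as a black box producing M = 18 first N-dimensional vectors of N = 512 dimensions. The function `encode_face` is a hypothetical stand-in (in practice a deep network extracts the features); only the latent's shape is illustrated here.

```python
import numpy as np

M, N = 18, 512  # e.g. 18 first N-dimensional vectors, each of 512 dimensions

def encode_face(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the face image encoder: a real implementation
    would run a deep network; here we only illustrate the latent's shape."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((M, N))  # first latent variable: M x N

latent = encode_face(np.zeros((1024, 1024, 3)))  # dummy 1024x1024 RGB input
print(latent.shape)  # (18, 512)
```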
In step S13, in response to a setting operation on an attribute editing degree of a face attribute, the first latent variable is edited according to the set attribute editing degree and an attribute editing direction corresponding to the face attribute, to obtain an edited second latent variable.
In a possible implementation, the face attributes may include, but are not limited to, at least one of: the face shape and pose of the face; the gender, age, and emotion represented by the face; the beard, glasses, and mask on the face; and pupil color, hair color, makeup color, and filter color.
It should be understood that those skilled in the art may use software development techniques known in the art to design and implement an application program for the image processing method of the embodiments of the present disclosure, together with a corresponding graphical interactive interface. The graphical interactive interface may provide operation controls for setting the attribute editing degree, so that the user can set the attribute editing degree of any face attribute; the embodiments of the present disclosure are not limited in this regard. FIG. 2 shows a schematic diagram of operation controls according to an embodiment of the present disclosure. As shown in FIG. 2, face attributes such as beard 21, age 22, and gender 23 each correspond to a respective value range of the attribute editing degree, and the attribute editing degree of any face attribute can be set by adjusting the position of the "filled circle" operation control corresponding to that face attribute along the corresponding value-range line segment.
The attribute editing direction may represent a direction of enhancement or weakening of a face attribute. As described above, the first latent variable may be expressed as M first N-dimensional vectors; to facilitate editing the first latent variable, the attribute editing direction may be expressed as a second N-dimensional vector. It will be appreciated that a vector has a direction: the direction of the second N-dimensional vector represents the attribute editing direction, and different face attributes correspond to different attribute editing directions. Thus, when the user wishes to edit a certain face attribute, at least one first N-dimensional vector in the first latent variable can be edited according to the attribute editing direction corresponding to that face attribute, without affecting other face attributes.
In a possible implementation, the attribute editing direction corresponding to a face attribute is obtained by performing binary classification on sample face images with an attribute classifier. The attribute classifier may adopt a support vector machine, an image classification network, or the like; the embodiments of the present disclosure are not limited in this regard. It should be understood that each face attribute may correspond to its own attribute classifier; in other words, one attribute classifier may be used to classify one face attribute.
For example, an attribute classifier corresponding to gender may be used to perform binary classification on sample face images, i.e., divide the sample face images into male faces and female faces; the attribute editing direction may then be the enhancement direction of a masculine face (i.e., the weakening direction of a feminine face), or the weakening direction of a masculine face (i.e., the enhancement direction of a feminine face). Likewise, an attribute classifier corresponding to beards may be used to perform binary classification on sample face images, i.e., divide the sample face images into bearded faces and beardless faces; the attribute editing direction may then be the enhancement direction of the beard (i.e., the weakening direction of beardlessness), or the weakening direction of the beard (i.e., the enhancement direction of beardlessness).
The attribute editing degree may represent the degree of enhancement or weakening of a face attribute; in other words, the attribute editing degree may represent the extent to which the user wishes to edit the face attribute. For example, if the user wishes to enhance a certain face attribute, the degree of enhancement of that face attribute may be the attribute editing degree set by the user.
In a possible implementation, the attribute editing degree may be given a certain value range; for example, the value range may be set to [-3, 3], [-10, 10], or the like, where a positive value means a degree of enhancement of the face attribute and a negative value means a degree of weakening of the face attribute. For example, if the user wishes to add a beard to the face, the attribute editing degree may be a positive value: the larger the attribute editing degree, the thicker the beard. Conversely, if the user wishes to remove the beard from the face, the attribute editing degree may be a negative value: the smaller the attribute editing degree, the sparser the beard.
In a possible implementation, editing the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain the edited second latent variable may include: adding the product of the set attribute editing degree and the attribute editing direction to at least one first N-dimensional vector of the first latent variable, to obtain the edited second latent variable. In this way, the specified face attribute can be edited.
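The editing step described above can be sketched as follows. The helper name `edit_latent`, the zero-initialized latent, and the all-ones direction are illustrative assumptions for the sketch, not the disclosure's actual implementation; in practice the direction would be the unit normal obtained from the attribute classifier.

```python
import numpy as np

def edit_latent(w1: np.ndarray, direction: np.ndarray,
                degree: float, rows) -> np.ndarray:
    """Add degree * direction to the selected first N-dimensional vectors.
    `rows` selects which of the M vectors the editing direction acts on;
    the remaining vectors are left unchanged."""
    w2 = w1.copy()
    w2[rows] += degree * direction
    return w2

w1 = np.zeros((18, 512))            # first latent variable (illustrative)
d = np.ones(512)                    # assumed attribute editing direction
w2 = edit_latent(w1, d, degree=2.0, rows=slice(0, 4))
print(w2[0, 0], w2[4, 0])  # 2.0 0.0 -- only the first 4 vectors were edited
```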
In step S14, the second latent variable is decoded to obtain a target face image; the display effect of the face attribute in the target face image differs from that in the face image.
In a possible implementation, a generation network may be used to decode the second latent variable to obtain the target face image. It should be understood that the embodiments of the present disclosure do not limit the network structure, network type, or training method of the generation network; for example, the generation network may be obtained by training a generative adversarial network (GAN).
The generation network may be used to generate an image with a preset image style from M N-dimensional vectors. The image style may include at least a realistic style and a non-realistic style; the non-realistic style may include at least a comic style, a European-American style, a sketch style, an oil painting style, a print style, and the like. That is, the face in the target face image may be a realistic-style face or a non-realistic-style face.
As described above, the second latent variable is obtained by editing the first latent variable of the face image, and the display effect of a face attribute in the target face image generated based on the second latent variable differs from that in the original face image. When the product of the attribute editing direction and the attribute editing degree is a positive value, the display effect of the face attribute in the target face image obtained based on the second latent variable is enhanced relative to the face image; when the product is a negative value, the display effect of the face attribute in the target face image obtained based on the second latent variable is weakened relative to the face image.
A positive product of the attribute editing direction and the attribute editing degree means that the face attribute is enhanced, by the attribute editing degree, along the attribute editing direction; a negative product means that the face attribute is weakened, by the attribute editing degree, along the direction opposite to the attribute editing direction.
FIG. 3a shows a schematic diagram of a face image according to an embodiment of the present disclosure, and FIG. 3b shows a schematic diagram of a target face image according to an embodiment of the present disclosure. The target face image shown in FIG. 3b may be obtained by performing attribute editing on the face image shown in FIG. 3a. As shown in FIGS. 3a and 3b, the face in the target face image of FIG. 3b is younger than the face in the face image of FIG. 3a; both FIG. 3a and FIG. 3b are realistic-style images.
FIG. 4a shows a schematic diagram of a face image according to an embodiment of the present disclosure, and FIG. 4b shows a schematic diagram of a target face image according to an embodiment of the present disclosure. The target face image shown in FIG. 4b may be obtained by performing attribute editing on the face image shown in FIG. 4a. As shown in FIGS. 4a and 4b, the face in the target face image of FIG. 4b is younger than the face in the face image of FIG. 4a; both FIG. 4a and FIG. 4b are oil-painting-style images.
FIG. 5 shows a schematic diagram of an image processing flow according to an embodiment of the present disclosure. As shown in FIG. 5, the image processing flow may include: inputting a face image 51 into a face image encoder 52 to obtain a first latent variable corresponding to the face image; editing the first latent variable according to the set attribute editing degree of a face attribute and the attribute editing direction of that face attribute, to obtain an edited second latent variable 53; and inputting the second latent variable into a generation network 54 to obtain a target face image. As shown in FIG. 5, since the attribute editing degree of the beard attribute is a positive value, the face in the target face image has more beard than the face image; that is, the display effect of the face attribute in the target face image differs from that in the face image.
In the embodiments of the present disclosure, a first latent variable is obtained by encoding a face image; the first latent variable is edited according to the set attribute editing degree of a face attribute and the attribute editing direction corresponding to that face attribute, to obtain a second latent variable; and the second latent variable is then decoded to obtain a target face image. Because different face attributes have different attribute editing directions, the face attribute specified by the user can be edited precisely without affecting the display effect of other face attributes.
As described above, the attribute editing direction corresponding to a face attribute is obtained by performing binary classification on sample face images with an attribute classifier. In a possible implementation, performing binary classification on sample face images with an attribute classifier includes: performing binary classification on the sample face images with the attribute classifier corresponding to the face attribute, to obtain an attribute classification boundary of the sample face images in a latent space, where the latent space represents a sample distribution space in which the sample latent variables corresponding to the sample face images are distributed; and determining the direction in which the attribute classification boundary faces the positive-sample attribute of the face attribute as the attribute editing direction.
The sample face images may be face images randomly generated by the above generation network from M randomly distributed N-dimensional vectors, or face images actually captured by an image acquisition device; the embodiments of the present disclosure are not limited in this regard. The sample latent variable corresponding to a sample face image may be understood as the feature vector corresponding to that sample face image, so the latent space may also be understood as the vector distribution space in which the feature vectors corresponding to the sample face images are distributed.
It will be appreciated that performing binary classification on sample face images can be briefly described as finding a separating surface that divides the sample latent variables in the latent space into two parts, one part representing the positive-sample attribute of the face attribute and the other representing the negative-sample attribute; this separating surface can serve as the attribute classification boundary. It should be understood that once the separating surface is obtained, the normal vector or unit normal vector perpendicular to it can be obtained; for example, for a separating surface expressed as ax+by+cz=d, the normal vector is (a, b, c), and the unit normal vector is that normal vector divided by its length. The normal vector (or unit normal vector) pointing toward the positive-sample-attribute side can then be determined as the attribute editing direction; of course, the normal vector (or unit normal vector) pointing toward the negative-sample-attribute side may also be determined as the attribute editing direction.
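As an illustration of deriving the attribute editing direction from a separating surface, the sketch below uses assumed coefficients for ax+by+cz=d in a three-dimensional latent space (in practice the coefficients come from the attribute classifier, e.g. a trained support vector machine, and the space is N-dimensional):

```python
import numpy as np

# Assumed separating surface 2x + y - 2z = 1 found by a binary attribute
# classifier (illustrative coefficients only).
a, b, c, d = 2.0, 1.0, -2.0, 1.0
normal = np.array([a, b, c])
edit_direction = normal / np.linalg.norm(normal)  # unit normal vector

def side(latent_point: np.ndarray) -> float:
    """> 0: positive-sample-attribute side of the boundary; < 0: negative side."""
    return float(np.dot(normal, latent_point) - d)

print(edit_direction)                    # ~ [0.667, 0.333, -0.667]
print(side(np.array([2.0, 0.0, 0.0])))   # 3.0 -> positive-sample side
```

Editing then moves a latent along `edit_direction` (positive degree) or against it (negative degree).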
FIG. 6 shows a schematic diagram of a sample distribution space according to an embodiment of the present disclosure. As shown in FIG. 6, the sample distribution space is represented as a two-dimensional space for ease of understanding; it will be appreciated that the actual sample distribution space may be a high-dimensional distribution space. In FIG. 6, the squares may represent positive-sample attributes and the triangles may represent negative-sample attributes; the attribute editing direction is the direction in which the attribute classification boundary faces the positive-sample attributes.
It should be understood that once the direction in which the attribute classification boundary faces the positive-sample attribute is determined as the attribute editing direction, the attribute editing direction can represent the enhancement direction of the positive-sample attribute (or the weakening direction of the negative-sample attribute), and the opposite direction can represent the weakening direction of the positive-sample attribute (or the enhancement direction of the negative-sample attribute). Of course, the direction in which the attribute classification boundary faces the negative-sample attribute may also be determined as the attribute editing direction; in that case, the attribute editing direction can represent the enhancement direction of the negative-sample attribute (or the weakening direction of the positive-sample attribute), and its opposite direction can represent the weakening direction of the negative-sample attribute (or the enhancement direction of the positive-sample attribute).
It should be understood that the positive-sample attribute and negative-sample attribute of each face attribute may be set as desired; the embodiments of the present disclosure are not limited in this regard. For example, a bearded face may be set as the positive-sample attribute, in which case a beardless face is the negative-sample attribute; a younger face may be set as the positive-sample attribute, in which case an older face is the negative-sample attribute; a smiling face may be set as the positive-sample attribute, in which case non-smiling faces (such as crying or angry faces) are the negative-sample attribute; and blue pupils may be set as the positive-sample attribute, in which case non-blue pupils (such as green or black pupils) are the negative-sample attribute.
In the embodiments of the present disclosure, the attribute editing directions of different face attributes can be determined quickly and effectively.
As described above, the first latent variable may be expressed as M first N-dimensional vectors, and the attribute editing direction may be expressed as a second N-dimensional vector, where N and M are positive integers. In a possible implementation, in step S13, editing the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain the edited second latent variable includes:
determining, according to the attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts; computing the product of the second N-dimensional vector and the attribute editing degree to obtain a third N-dimensional vector; and adding the third N-dimensional vector to each of the at least one first N-dimensional vector among the M first N-dimensional vectors, to obtain M fourth N-dimensional vectors, where the second latent variable is expressed as the M fourth N-dimensional vectors.
Experiments have found that the generation network includes network layers of multiple different resolutions, and that network layers of different resolutions have different sensitivities to (or learning effects on) different face attributes; therefore, network layers of different resolutions in the generation network may be used to process the second latent variables corresponding to different face attributes. The low-resolution network layers of the generation network are relatively sensitive to the first type of face attributes, such as the face shape and pose of the face and the gender, age, and emotion represented by the face; the medium-resolution network layers are relatively sensitive to the second type of face attributes, such as the beard, glasses, and mask on the face; and the high-resolution network layers are relatively sensitive to the third type of face attributes, such as pupil color, hair color, makeup color, and filter color. That is, the low-resolution network layers of the generation network are more sensitive to the first type of face attributes than the medium-resolution and high-resolution network layers; the medium-resolution network layers are more sensitive to the second type of face attributes than the low-resolution and high-resolution network layers; and the high-resolution network layers are more sensitive to the third type of face attributes than the low-resolution and medium-resolution network layers.
Therefore, the attribute editing direction of a certain face attribute can be applied to only some of the first N-dimensional vectors of the first latent variable; that is, the at least one first N-dimensional vector on which the attribute editing direction acts is determined according to the attribute type corresponding to the face attribute. Since different face attributes correspond to different attribute editing directions, the influence on other face attributes can be reduced, or even eliminated, when editing a certain face attribute.
Based on the above experimental findings, in a possible implementation, the attribute editing direction of the first type of face attributes may be applied to the 1st to the i-th first N-dimensional vectors; the attribute editing direction of the second type of face attributes may be applied to the (i+1)-th to the j-th first N-dimensional vectors; and the attribute editing direction of the third type of face attributes may be applied to the (j+1)-th to the M-th first N-dimensional vectors, where i∈[1,M] and j∈[2,M]. In this way, face attributes of different attribute types can be associated with respective first N-dimensional vectors, which facilitates subsequently using the generation network to decode the second latent variable obtained by editing the first N-dimensional vectors, so as to obtain target face images in which the display effects of different face attributes are adjusted.
For example, suppose the first latent variable is expressed as three two-dimensional vectors {(a1,b1), (a2,b2), (a3,b3)}, the attribute editing direction of a face attribute of a certain attribute type acts on the two-dimensional vector (a1,b1), the attribute editing degree is 2, and the attribute editing direction is expressed as a two-dimensional vector (m,n). Then, based on the above implementation, the second latent variable is expressed as three two-dimensional vectors {(2m+a1, 2n+b1), (a2,b2), (a3,b3)}.
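The worked example above can be checked numerically; the concrete values below are illustrative assumptions standing in for (a1,b1), (a2,b2), (a3,b3) and (m,n):

```python
import numpy as np

# (a1,b1)=(1,2), (a2,b2)=(3,4), (a3,b3)=(5,6); direction (m,n)=(10,20);
# attribute editing degree 2; only the first two-dimensional vector is edited.
w1 = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
direction = np.array([10.0, 20.0])
degree = 2.0
w2 = w1.copy()
w2[0] += degree * direction  # (2m+a1, 2n+b1) = (21, 42)
print(w2.tolist())  # [[21.0, 42.0], [3.0, 4.0], [5.0, 6.0]]
```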
在本公开实施例中,能够精准地对任一人脸属性进行属性编辑,同时减少对其它人脸属性产生的影响,甚至对其它人脸属性不产生影响。In the embodiments of the present disclosure, any facial attribute can be accurately edited while reducing the impact on other facial attributes, or even having no impact on other facial attributes.
如上所述,在一种可能的实现方式中,人脸属性对应的属性种类包括第一类人脸属性、第二类人脸属性以及第三类人脸属性中的至少一种,第一类人脸属性包括:人脸的脸型、位姿,以及人脸表征出的性别、年龄、情绪中的至少一种,第二类人脸属性包括人脸上的胡须、眼镜、口罩中的至少一种,第三类人脸属性包括瞳孔颜色、头发颜色、妆容颜色、滤镜颜色中的至少一种。As mentioned above, in a possible implementation, the attribute types corresponding to the face attributes include at least one of the first type of face attributes, the second type of face attributes and the third type of face attributes. The first type Face attributes include: face shape, posture, and at least one of gender, age, and emotion represented by the face. The second type of face attributes includes at least one of beard, glasses, and mask on the face. The third type of face attributes includes at least one of pupil color, hair color, makeup color, and filter color.
In one possible implementation, determining, according to the attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts includes:
in a case where the face attribute includes the first type of face attribute, determining that the attribute editing direction acts on the 1st through the i-th first N-dimensional vectors; and/or,
in a case where the face attribute includes the second type of face attribute, determining that the attribute editing direction acts on the (i+1)-th through the j-th first N-dimensional vectors; and/or,
in a case where the face attribute includes the third type of face attribute, determining that the attribute editing direction acts on the (j+1)-th through the M-th first N-dimensional vectors, where i∈[1,M] and j∈[2,M].
As described above, the low-resolution network layers of the generation network are relatively sensitive to the first type of face attribute, such as the face shape and pose of the face and the gender, age, and emotion expressed by the face; the medium-resolution network layers are relatively sensitive to the second type of face attribute, such as a beard, glasses, or a mask on the face; and the high-resolution network layers are relatively sensitive to the third type of face attribute, such as pupil color, hair color, makeup color, and filter color. Accordingly, the 1st through the i-th first N-dimensional vectors may correspond to the low-resolution network layers of the generation network, the (i+1)-th through the j-th first N-dimensional vectors may correspond to the medium-resolution network layers, and the (j+1)-th through the M-th first N-dimensional vectors may correspond to the high-resolution network layers.
It should be understood that the values of i and j may be empirical values determined by experimental testing based on the network structure of the generation network, which is not limited in the embodiments of the present disclosure. For example, if M is 18, i may be set to 5 and j to 10.
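Using the example values above (M=18, i=5, j=10), the mapping from attribute type to the edited vector indices can be sketched as follows. The function name, the type labels, and the 0-based indexing are illustrative assumptions, not part of the disclosure.

```python
def edited_indices(attr_type: str, i: int = 5, j: int = 10, m: int = 18) -> range:
    """Return the 0-based indices of the first N-dimensional vectors that the
    editing direction of the given attribute type acts on. The defaults i=5,
    j=10 are the empirical example values for an M=18 generation network."""
    if attr_type == "first":    # face shape, pose, gender, age, emotion
        return range(0, i)      # vectors 1..i
    if attr_type == "second":   # beard, glasses, mask
        return range(i, j)      # vectors (i+1)..j
    if attr_type == "third":    # pupil / hair / makeup / filter color
        return range(j, m)      # vectors (j+1)..M
    raise ValueError(f"unknown attribute type: {attr_type}")
```

The three ranges are disjoint and together cover all M vectors, so an edit of one attribute type never touches the vectors assigned to another type.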
In one possible implementation, adding at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain M fourth N-dimensional vectors includes:
adding the 1st through the i-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors; and/or,
adding the (i+1)-th through the j-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors; and/or,
adding the (j+1)-th through the M-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
For example, assume the first latent variable is represented as 18 first 512-dimensional vectors, the product of the attribute editing direction and the attribute editing degree is the third 512-dimensional vector, i is 5, and j is 10. Then, in a case where the face attribute includes the first type of face attribute, the 1st through the 5th first 512-dimensional vectors among the 18 first 512-dimensional vectors may each be added to the third 512-dimensional vector; in a case where the face attribute includes the second type of face attribute, the 6th through the 10th first 512-dimensional vectors may each be added to the third 512-dimensional vector; and in a case where the face attribute includes the third type of face attribute, the 11th through the 18th first 512-dimensional vectors may each be added to the third 512-dimensional vector.
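A minimal numpy sketch of this 18×512 case follows. The random latent and direction are synthetic stand-ins for real encoder outputs; only the slicing pattern reflects the scheme described above.

```python
import numpy as np

M, N = 18, 512
rng = np.random.default_rng(0)
first_latent = rng.normal(size=(M, N))   # M first N-dimensional vectors
direction = rng.normal(size=N)           # second N-dimensional vector (editing direction)
degree = 1.5                             # attribute editing degree
third = degree * direction               # third N-dimensional vector

i, j = 5, 10
second_latent = first_latent.copy()
second_latent[:i] += third               # first-type attribute: vectors 1..5
# For a second-type attribute one would instead use second_latent[i:j] += third,
# and for a third-type attribute second_latent[j:] += third.

# Vectors outside the edited range are untouched.
assert np.allclose(second_latent[i:], first_latent[i:])
```

Broadcasting adds the same third 512-dimensional vector to each selected row, which is exactly the "respectively added" operation in the text.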
In the embodiments of the present disclosure, any face attribute can be accurately edited while reducing, or even eliminating, the impact on other face attributes.
As described above, the generation network includes a plurality of network layers of different resolutions, where network layers of different resolutions may respectively be used to process the second latent variables corresponding to different face attributes, and the generation network is used to generate an image having a preset image style from M N-dimensional vectors. The second latent variable, represented as M fourth N-dimensional vectors, is decoded using the generation network to obtain the target face image. In one possible implementation, the generation network includes M network layers, and decoding the second latent variable using the generation network to obtain the target face image includes:
inputting the 1st fourth N-dimensional vector into the 1st network layer of the generation network to obtain the 1st intermediate image output by the 1st network layer; inputting the m-th fourth N-dimensional vector and the (m-1)-th intermediate image into the m-th network layer of the generation network to obtain the m-th intermediate image output by the m-th network layer, where m∈[2,M); and inputting the M-th fourth N-dimensional vector and the (M-1)-th intermediate image into the M-th network layer of the generation network to obtain the target face image output by the M-th network layer.
It should be understood that when m is 2, the (m-1)-th intermediate image is the 1st intermediate image, so the 2nd intermediate image is obtained based on the 2nd fourth N-dimensional vector and the 1st intermediate image. For m∈[2,M), each of the 2nd through the (M-1)-th intermediate images is determined based on the m-th fourth N-dimensional vector and the (m-1)-th intermediate image output by the preceding network layer. It follows that the (M-1)-th intermediate image is obtained based on the (M-1)-th fourth N-dimensional vector and the (M-2)-th intermediate image, the (M-2)-th intermediate image is obtained based on the (M-2)-th fourth N-dimensional vector and the (M-3)-th intermediate image, and so on.
In one possible implementation, the generation network may be used to generate images of progressively increasing resolution: the input of the 1st network layer of the generation network is one fourth N-dimensional vector; the input of each subsequent network layer includes one fourth N-dimensional vector and the intermediate image output by the preceding network layer; and the last network layer outputs the target face image. The generation network may also be referred to as a multi-layer-transformation generation network.
It can be understood that the low-resolution network layers of the generation network (which may also be referred to as shallow network layers) first learn and generate low-resolution intermediate images (e.g., at a resolution of 4×4); then, as the network depth increases, higher-resolution intermediate images are learned and generated (e.g., at a resolution of 512×512); and finally the target face image is generated at the highest resolution (e.g., 1024×1024).
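The layer-by-layer data flow above can be sketched as follows. The real network layers are learned convolutions modulated by the style vectors; the `layer` function here is a hypothetical stand-in (nearest-neighbour upsampling plus a scalar offset) that only demonstrates how each layer consumes one fourth N-dimensional vector and the previous intermediate image while the resolution grows from 4×4 toward 1024×1024.

```python
import numpy as np

M, N = 18, 512

def layer(style_vec, prev_img, out_res):
    """Stand-in for one generation-network layer: upsample the previous
    intermediate image to out_res x out_res and mix in a scalar derived from
    the style vector. Illustrates data flow only, not a learned layer."""
    if prev_img is None:
        img = np.zeros((out_res, out_res))
    else:
        reps = out_res // prev_img.shape[0]
        img = np.repeat(np.repeat(prev_img, reps, axis=0), reps, axis=1)
    return img + style_vec[0]

def decode(second_latent):
    """Feed the M fourth N-dimensional vectors through M layers; the last
    layer's output is the (placeholder) target face image."""
    img, res = None, 4
    for m in range(M):                 # layer 1 .. layer M
        img = layer(second_latent[m], img, res)
        res = min(res * 2, 1024)       # resolution grows, e.g. 4x4 -> 1024x1024
    return img
```

With M=18 layers the resolution doubles until it reaches the 1024×1024 cap, mirroring the progressive-growth description above.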
In the embodiments of the present disclosure, the generation network can be used to decode the second latent variable, thereby effectively obtaining the target face image.
It should be understood that, following the image processing of steps S11 to S14 in the above embodiments of the present disclosure, the user may perform attribute editing at least once on the same face attribute, or perform attribute editing at least once on each of several different face attributes, to obtain the target face image.
In the related art, only a limited set of face attributes can be edited through traditional image warping methods, and existing attribute edits cannot be decoupled, so edits to different attributes easily affect one another. According to the embodiments of the present disclosure, a specific face attribute can be edited more accurately, and the impact on other face attributes when editing any face attribute can be reduced or even eliminated; the method can be applied to face attribute editing across different image styles.
It can be understood that the above method embodiments mentioned in the present disclosure may be combined with one another to form combined embodiments without departing from the principles and logic. Those skilled in the art will understand that, in the above methods of the specific implementations, the specific execution order of the steps should be determined by their functions and possible internal logic.
In addition, the present disclosure further provides an image processing apparatus, an electronic device, a computer-readable storage medium, and a program, all of which can be used to implement any image processing method provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding records in the method section.
Figure 7 shows a block diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in Figure 7, the apparatus includes:
an acquisition part 101, configured to acquire a face image to be processed;
an encoding part 102, configured to encode the face image to obtain a first latent variable of the face image;
an editing part 103, configured to, in response to a setting operation for an attribute editing degree of a face attribute, edit the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute, to obtain an edited second latent variable, where the attribute editing direction represents an enhancement direction or a weakening direction of the face attribute, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents a degree of enhancement or weakening of the face attribute; and
a decoding part 104, configured to decode the second latent variable to obtain a target face image, where the display effect of the face attribute differs between the target face image and the face image.
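The four parts above form an acquire–encode–edit–decode pipeline. The sketch below wires them together with placeholder encoder and generator functions (the disclosure does not fix their internals here), so only the data flow between the parts is meaningful; all names and shapes are illustrative assumptions.

```python
import numpy as np

M, N = 18, 512

def encode(face_image: str) -> np.ndarray:
    """Part 102 stand-in: a real encoder maps the face image to M first
    N-dimensional vectors; here a deterministic pseudo-latent is derived
    from the file name, for illustration only."""
    seed = sum(ord(c) for c in face_image)
    return np.random.default_rng(seed).normal(size=(M, N))

def edit(first_latent, direction, degree, indices):
    """Part 103: add degree * direction to the selected first N-dim vectors."""
    second = first_latent.copy()
    second[indices] += degree * direction
    return second

def decode(second_latent):
    """Part 104 stand-in: a real generation network renders the target face
    image; here a placeholder array is returned instead."""
    return second_latent.mean(axis=0)

first = encode("face.png")                   # parts 101 + 102
direction = np.ones(N) / np.sqrt(N)          # a unit editing direction (assumed)
second = edit(first, direction, degree=0.8, indices=slice(0, 5))
target = decode(second)                      # placeholder "target face image"
```

The `indices=slice(0, 5)` argument corresponds to editing a first-type face attribute with i=5 in the implementation described earlier.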
In one possible implementation, the attribute editing direction corresponding to the face attribute is obtained by performing binary classification on sample face images using an attribute classifier. Performing the binary classification on the sample face images using the attribute classifier includes: using the attribute classifier corresponding to the face attribute to perform binary classification on the sample face images to obtain an attribute classification boundary of the sample face images in a latent space, where the latent space represents the sample distribution space of the sample latent variables corresponding to the sample face images; and determining the direction in which the attribute classification boundary faces the positive-sample side of the face attribute as the attribute editing direction.
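A lightweight sketch of deriving such a direction: the disclosure fits an attribute classifier and takes the boundary normal toward the positive-attribute side. As a dependency-free stand-in for a learned linear boundary, the snippet uses the normalized difference of class means over synthetic latent codes, which points the same way as the normal of a linear separator; the data, the toy attribute, and the mean-difference shortcut are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8                                             # toy latent dimension
codes = rng.normal(size=(400, N))                 # synthetic sample latent codes
labels = codes[:, 0] > 0                          # toy "attribute": positive iff coord 0 > 0

# Stand-in for the classification boundary: difference of class means,
# normalized to a unit vector pointing toward the positive samples.
pos_mean = codes[labels].mean(axis=0)
neg_mean = codes[~labels].mean(axis=0)
edit_direction = pos_mean - neg_mean
edit_direction /= np.linalg.norm(edit_direction)  # unit attribute editing direction
```

Because the toy attribute is governed by coordinate 0, the recovered direction is dominated by that coordinate, analogous to how a fitted boundary normal isolates one face attribute in the latent space.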
In one possible implementation, the first latent variable is represented as M first N-dimensional vectors, the attribute editing direction is represented as a second N-dimensional vector, and N and M are positive integers. The editing part 103 includes: a determining sub-part, configured to determine, according to the attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts; a calculating sub-part, configured to calculate the product of the second N-dimensional vector and the attribute editing degree to obtain a third N-dimensional vector; and an adding sub-part, configured to add the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain M fourth N-dimensional vectors, where the second latent variable is represented as the M fourth N-dimensional vectors.
In one possible implementation, the attribute type of the face attribute includes at least one of the first type of face attribute, the second type of face attribute, and the third type of face attribute; the attribute editing direction of the first type of face attribute acts on the 1st through the i-th first N-dimensional vectors; the attribute editing direction of the second type of face attribute acts on the (i+1)-th through the j-th first N-dimensional vectors; and the attribute editing direction of the third type of face attribute acts on the (j+1)-th through the M-th first N-dimensional vectors, where i∈[1,M] and j∈[2,M].
In one possible implementation, the first type of face attribute includes the face shape and pose of the face, and at least one of the gender, age, and emotion expressed by the face; the second type of face attribute includes at least one of a beard, glasses, and a mask on the face; and the third type of face attribute includes at least one of pupil color, hair color, makeup color, and filter color.
In one possible implementation, the attribute type includes the first type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts includes: in a case where the face attribute includes the first type of face attribute, determining that the attribute editing direction acts on the 1st through the i-th first N-dimensional vectors. Correspondingly, adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors includes: adding the 1st through the i-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
In one possible implementation, the attribute type includes the second type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts includes: in a case where the face attribute includes the second type of face attribute, determining that the attribute editing direction acts on the (i+1)-th through the j-th first N-dimensional vectors, where i∈[1,M] and j∈[2,M]. Correspondingly, adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors includes: adding the (i+1)-th through the j-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
In one possible implementation, the attribute type includes the third type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts includes: in a case where the face attribute includes the third type of face attribute, determining that the attribute editing direction acts on the (j+1)-th through the M-th first N-dimensional vectors, where j∈[2,M]. Correspondingly, adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors includes: adding the (j+1)-th through the M-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
In one possible implementation, the decoding part 104 includes a network decoding sub-part, configured to decode the second latent variable using a generation network to obtain the target face image, where the generation network is used to generate an image having a preset image style from M N-dimensional vectors, the generation network includes a plurality of network layers of different resolutions, and the network layers of different resolutions are respectively used to process the second latent variables corresponding to different face attributes.
In one possible implementation, the generation network includes M network layers, the second latent variable is represented as M fourth N-dimensional vectors, and decoding the second latent variable using the generation network to obtain the target face image includes: inputting the 1st fourth N-dimensional vector into the 1st network layer of the generation network to obtain the 1st intermediate image output by the 1st network layer; inputting the m-th fourth N-dimensional vector and the (m-1)-th intermediate image into the m-th network layer of the generation network to obtain the m-th intermediate image output by the m-th network layer, where m∈[2,M); and inputting the M-th fourth N-dimensional vector and the (M-1)-th intermediate image into the M-th network layer of the generation network to obtain the target face image output by the M-th network layer.
In one possible implementation, in a case where the product of the attribute editing direction and the attribute editing degree is positive, the display effect of the face attribute in the target face image obtained based on the second latent variable is enhanced relative to the face image; in a case where the product of the attribute editing direction and the attribute editing degree is negative, the display effect of the face attribute in the target face image obtained based on the second latent variable is weakened relative to the face image.
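This sign convention can be checked on a toy example: project the edited latent onto the editing direction, treat the projection as a stand-in attribute score, and observe that a positive editing degree raises it (attribute enhanced) while a negative degree lowers it (attribute weakened). The 2-D vectors and the scoring function are illustrative assumptions.

```python
import numpy as np

direction = np.array([1.0, 0.0])   # attribute editing direction
latent = np.array([0.2, 0.5])      # toy latent code

def attribute_score(v):
    """Stand-in for how strongly the attribute shows: projection onto the
    editing direction."""
    return float(v @ direction)

enhanced = latent + 1.5 * direction     # positive editing degree
weakened = latent + (-1.5) * direction  # negative editing degree

assert attribute_score(enhanced) > attribute_score(latent)
assert attribute_score(weakened) < attribute_score(latent)
```

Components of the latent orthogonal to the direction (here the second coordinate) are unchanged by either edit, matching the claim that other face attributes are unaffected.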
In the embodiments of the present disclosure, a first latent variable is obtained by encoding the face image; the first latent variable is edited according to the set attribute editing degree of the face attribute and the attribute editing direction corresponding to the face attribute, to obtain a second latent variable; and the second latent variable is then decoded to obtain the target face image. Since different face attributes have different attribute editing directions, the face attribute specified by the user can be accurately edited without affecting the display effect of other face attributes.
In some embodiments, the functions of, or the parts included in, the apparatus provided by the embodiments of the present disclosure may be configured to perform the methods described in the above method embodiments; for specific implementations, refer to the descriptions of the above method embodiments.
An embodiment of the present disclosure further provides a computer-readable storage medium having computer program instructions stored thereon, where the computer program instructions, when executed by a processor, implement the above method. The computer-readable storage medium may be a non-volatile computer-readable storage medium.
An embodiment of the present disclosure further provides an electronic device, including: a processor; and a memory configured to store processor-executable instructions, where the processor is configured to invoke the instructions stored in the memory to perform the above method.
An embodiment of the present disclosure further provides a computer program product, including computer-readable code, or a non-volatile computer-readable storage medium carrying the computer-readable code, where when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device performs the above method.
The electronic device may be provided as a terminal, a server, or a device in another form.
Figure 8 shows a block diagram of an electronic device according to an embodiment of the present disclosure. For example, the electronic device 1900 may be provided as a server or a terminal device. Referring to Figure 8, the electronic device 1900 includes a processing component 1922, which may include one or more processors, and memory resources represented by a memory 1932, configured to store instructions executable by the processing component 1922, such as an application program. The application program stored in the memory 1932 may include one or more parts, each corresponding to a set of instructions. In addition, the processing component 1922 is configured to execute the instructions to perform the above method.
The electronic device 1900 may further include a power supply component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958.
In an exemplary embodiment, a non-volatile computer-readable storage medium is further provided, for example, a memory 1932 including computer program instructions, where the computer program instructions are executable by the processing component 1922 of the electronic device 1900 to complete the above method.
The embodiments of the present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions configured to cause a processor to implement various aspects of the embodiments of the present disclosure.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the above. The computer-readable storage medium may include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory, an erasable programmable read-only memory (EPROM or flash memory), a static random access memory, a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as a punched card or a raised structure in a groove with instructions stored thereon, or any suitable combination of the above. As used herein, the computer-readable storage medium is not to be construed as a transient signal itself, such as a radio wave or another freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or another transmission medium (for example, a light pulse through a fiber-optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein may be downloaded from the computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device via a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in the respective computing/processing device.
Various aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to the embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or the other programmable data processing apparatus, produce an apparatus that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium, where the instructions cause a computer, a programmable data processing apparatus, and/or other devices to operate in a specific manner, such that the computer-readable medium storing the instructions includes an article of manufacture that includes instructions implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, causing a series of operational steps to be performed on the computer, the other programmable data processing apparatus, or the other device to produce a computer-implemented process, such that the instructions executed on the computer, the other programmable data processing apparatus, or the other device implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the accompanying drawings illustrate possible architectures, functions, and operations of systems, methods, and computer program products according to multiple embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a part, a program segment, or a portion of instructions, which contains one or more executable instructions configured to implement the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
该计算机程序产品可以通过硬件、软件或其结合的方式实现。在一个可选实施例中,所述计算机程序产品体现为计算机存储介质,在另一个可选实施例中,计算机程序产品体现为软件产品,例如软件开发包(Software Development Kit,SDK)等等。The computer program product may be implemented in hardware, software or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium. In another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK) and so on.
若本公开技术方案涉及个人信息,应用本公开技术方案的产品在处理个人信息前,已明确告知个人信息处理规则,并取得个人自主同意。若本公开技术方案涉及敏感个人信息,应用本公开技术方案的产品在处理敏感个人信息前,已取得个人单独同意,并且同时满足“明示同意”的要求。例如,在摄像头等个人信息采集装置处,设置明确显著的标识告知已进入个人信息采集范围,将会对个人信息进行采集,若个人自愿进入采集范围即视为同意对其个人信息进行采集;或者在个人信息处理的装置上,利用明显的标识/信息告知个人信息处理规则的情况下,通过弹窗信息或请个人自行上传其个人信息等方式获得个人授权;其中,个人信息处理规则可包括个人信息处理者、个人信息处理目的、处理方式以及处理的个人信息种类等信息。If the disclosed technical solution involves personal information, the products applying the disclosed technical solution will clearly inform the personal information processing rules and obtain the individual's independent consent before processing personal information. If the disclosed technical solution involves sensitive personal information, the product applying the disclosed technical solution must obtain the individual's separate consent before processing the sensitive personal information, and at the same time meet the requirement of "express consent". For example, setting up clear and conspicuous signs on personal information collection devices such as cameras to inform them that they have entered the scope of personal information collection, and that personal information will be collected. If an individual voluntarily enters the collection scope, it is deemed to have agreed to the collection of his or her personal information; or On personal information processing devices, when using obvious logos/information to inform personal information processing rules, obtain personal authorization through pop-up messages or asking individuals to upload their personal information; among them, personal information processing rules may include personal information processing rules. Information such as information processors, purposes of processing personal information, methods of processing, and types of personal information processed.
以上已经描述了本公开的各实施例,上述说明是示例性的,并非穷尽性的,并且也不限于所披露的各实施例。在不偏离所说明的各实施例的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。本文中所用术语的选择,旨在最好地解释各实施例的原理、实际应用或对市场中的技术的改进,或者使本技术领域的其它普通技术人员能理解本文披露的各实施例。The embodiments of the present disclosure have been described above. The above description is illustrative, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen to best explain the principles, practical applications, or improvements to the technology in the market, or to enable other persons of ordinary skill in the art to understand the embodiments disclosed herein.
Industrial applicability
In the embodiments of the present disclosure, a face image to be processed is obtained; the face image is encoded to obtain a first latent variable of the face image; in response to a setting operation for an attribute editing degree of a face attribute, the first latent variable is edited according to the set attribute editing degree and an attribute editing direction corresponding to the face attribute, to obtain an edited second latent variable, where the attribute editing direction represents a direction in which the face attribute is enhanced or weakened, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents a degree to which the face attribute is enhanced or weakened; the second latent variable is decoded to obtain a target face image, in which the display effect of the face attribute differs from that in the original face image. The embodiments of the present disclosure enable accurate editing of the face attribute specified by the user.
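The encode-edit-decode pipeline summarized above can be sketched in a few lines. The shapes used here (M=18 vectors of dimension N=512) and the editing direction are illustrative placeholders, not values from the disclosure; a real system would obtain the direction from a trained attribute classifier and apply it only to a subset of the M vectors.

```python
import numpy as np

def edit_face_latent(first_latent, direction, degree):
    """Edit a latent code along an attribute editing direction.

    first_latent: (M, N) array -- the first latent variable, M N-dimensional vectors.
    direction:    (N,) array   -- the attribute editing direction.
    degree:       float        -- positive enhances the attribute, negative weakens it.
    """
    # Scale the direction by the editing degree and add it to every vector;
    # in the disclosure the offset acts only on the vectors matching the
    # attribute type (this simplification edits all of them).
    offset = degree * direction          # the "third N-dimensional vector"
    return first_latent + offset         # the edited second latent variable

# Toy demonstration with hypothetical sizes M=18, N=512.
rng = np.random.default_rng(0)
w = rng.standard_normal((18, 512))
d = np.zeros(512)
d[0] = 1.0                               # hypothetical editing direction
w2 = edit_face_latent(w, d, degree=3.0)
assert np.allclose(w2[:, 0], w[:, 0] + 3.0)   # edited coordinate shifted
assert np.allclose(w2[:, 1:], w[:, 1:])       # all other coordinates unchanged
```

Decoding `w2` with the generation network (not shown) would then yield the target face image with the attribute enhanced or weakened according to the sign of `degree`.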

Claims (25)

1. An image processing method, comprising:
  obtaining a face image to be processed;
  encoding the face image to obtain a first latent variable of the face image;
  in response to a setting operation for an attribute editing degree of a face attribute, editing the first latent variable according to the set attribute editing degree and an attribute editing direction corresponding to the face attribute, to obtain an edited second latent variable, wherein the attribute editing direction represents a direction in which the face attribute is enhanced or weakened, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents a degree to which the face attribute is enhanced or weakened;
  decoding the second latent variable to obtain a target face image, wherein a display effect of the face attribute in the target face image is different from that in the face image.
2. The method according to claim 1, wherein the attribute editing direction corresponding to the face attribute is obtained by performing binary classification on sample face images with an attribute classifier;
  wherein performing binary classification on the sample face images with the attribute classifier comprises:
  performing binary classification on the sample face images with the attribute classifier corresponding to the face attribute, to obtain an attribute classification boundary of the sample face images in a latent space, the latent space representing a sample distribution space of sample latent variables corresponding to the sample face images;
  determining, as the attribute editing direction, the direction in which the attribute classification boundary faces the positive-sample side of the face attribute.
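As a toy illustration of this claim, the editing direction can be read off as the unit normal of a linear separating boundary, oriented toward the positive-attribute samples. The least-squares separator below is a simple stand-in for the trained attribute classifier, and the synthetic latents are hypothetical:

```python
import numpy as np

def attribute_direction(latents, labels):
    """Estimate an attribute editing direction from binary-labelled sample latents.

    Fits a least-squares linear separator; its unit normal, oriented toward the
    positive-attribute samples, plays the role of the attribute classification
    boundary's facing direction described in the claim.
    """
    X = np.hstack([latents, np.ones((len(latents), 1))])  # add a bias column
    y = np.where(labels > 0, 1.0, -1.0)                   # +1 positive, -1 negative
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    normal = w[:-1]                                       # drop the bias weight
    return normal / np.linalg.norm(normal)                # unit editing direction

# Synthetic sample latents: positive samples shifted along a hidden axis.
rng = np.random.default_rng(1)
true_dir = np.zeros(8)
true_dir[2] = 1.0
neg = rng.standard_normal((200, 8))
pos = rng.standard_normal((200, 8)) + 4.0 * true_dir
d = attribute_direction(np.vstack([neg, pos]), np.array([0] * 200 + [1] * 200))
assert d @ true_dir > 0.9   # recovered direction points toward the positives
```

The recovered unit vector points from the boundary toward the positive-attribute side, which is exactly the role the attribute editing direction plays in the claims.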
3. The method according to claim 1 or 2, wherein the first latent variable is represented as M first N-dimensional vectors, the attribute editing direction is represented as a second N-dimensional vector, N and M are positive integers, and editing the first latent variable according to the set attribute editing degree and the attribute editing direction corresponding to the face attribute to obtain the edited second latent variable comprises:
  determining, according to an attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts;
  calculating a product of the second N-dimensional vector and the attribute editing degree to obtain a third N-dimensional vector;
  adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain M fourth N-dimensional vectors, the second latent variable being represented as the M fourth N-dimensional vectors.
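The vector arithmetic of this claim (scale the second N-dimensional vector by the editing degree, then add the result only to the selected first vectors) can be sketched as follows; the sizes and the index range are illustrative, not values from the disclosure:

```python
import numpy as np

def edit_selected(first_vectors, direction, degree, indices):
    """Apply degree * direction to a chosen subset of the M first vectors.

    first_vectors: (M, N) array -- the M first N-dimensional vectors.
    direction:     (N,) array   -- the second N-dimensional vector.
    degree:        float        -- the attribute editing degree.
    indices:       which of the M vectors the editing direction acts on.
    """
    third = degree * direction          # the third N-dimensional vector
    fourth = first_vectors.copy()       # start from the first vectors
    fourth[indices] += third            # edit only the selected vectors
    return fourth                       # the M fourth N-dimensional vectors

# Hypothetical M=6, N=4 example: the direction acts on the first three vectors.
w = np.zeros((6, 4))
d = np.array([1.0, 0.0, 0.0, 0.0])
w2 = edit_selected(w, d, degree=2.0, indices=range(0, 3))
assert np.allclose(w2[:3, 0], 2.0)      # selected vectors shifted by degree
assert np.allclose(w2[3:], 0.0)         # unselected vectors untouched
```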
4. The method according to any one of claims 1 to 3, wherein the attribute type corresponding to the face attribute comprises at least one of a first type of face attribute, a second type of face attribute, and a third type of face attribute; the first latent variable is represented as M first N-dimensional vectors;
  the attribute editing direction of the first type of face attribute acts on the 1st to the i-th first N-dimensional vectors; the attribute editing direction of the second type of face attribute acts on the (i+1)-th to the j-th first N-dimensional vectors; the attribute editing direction of the third type of face attribute acts on the (j+1)-th to the M-th first N-dimensional vectors; where i∈[1,M] and j∈[2,M].
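The partition of the M vectors by attribute type described in this claim can be written as a small lookup; the type labels and the split points i=4, j=8, M=18 are hypothetical, chosen only to show that the three ranges tile the M vectors:

```python
def vectors_for_attribute_type(attr_type, i, j, m):
    """Map an attribute type to the 0-based indices of the first N-dimensional
    vectors its editing direction acts on, per the three-way partition above.
    The string labels are illustrative, not identifiers from the disclosure."""
    if attr_type == "first":        # e.g. face shape, pose, gender, age, emotion
        return range(0, i)          # vectors 1..i
    if attr_type == "second":       # e.g. beard, glasses, mask
        return range(i, j)          # vectors i+1..j
    if attr_type == "third":        # e.g. pupil/hair/makeup/filter color
        return range(j, m)          # vectors j+1..M
    raise ValueError(attr_type)

# With hypothetical i=4, j=8, M=18 the three ranges exactly cover vectors 1..M.
parts = [vectors_for_attribute_type(t, 4, 8, 18) for t in ("first", "second", "third")]
assert [list(p) for p in parts] == [list(range(0, 4)), list(range(4, 8)), list(range(8, 18))]
```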
5. The method according to claim 4, wherein the first type of face attribute comprises at least one of a face shape, a pose, and a gender, an age, or an emotion represented by the face; the second type of face attribute comprises at least one of a beard, glasses, or a mask on the face; the third type of face attribute comprises at least one of a pupil color, a hair color, a makeup color, or a filter color.
6. The method according to any one of claims 3 to 5, wherein the attribute type comprises a first type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts comprises:
  in a case where the face attribute comprises the first type of face attribute, determining that the attribute editing direction acts on the 1st to the i-th first N-dimensional vectors;
  wherein adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors comprises:
  adding the 1st to the i-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
7. The method according to any one of claims 3 to 5, wherein the attribute type comprises a second type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts comprises:
  in a case where the face attribute comprises the second type of face attribute, determining that the attribute editing direction acts on the (i+1)-th to the j-th first N-dimensional vectors;
  wherein adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors comprises:
  adding the (i+1)-th to the j-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
8. The method according to any one of claims 3 to 5, wherein the attribute type comprises a third type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts comprises:
  in a case where the face attribute comprises the third type of face attribute, determining that the attribute editing direction acts on the (j+1)-th to the M-th first N-dimensional vectors;
  wherein adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors comprises:
  adding the (j+1)-th to the M-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
9. The method according to any one of claims 1 to 8, wherein decoding the second latent variable to obtain the target face image comprises:
  decoding the second latent variable with a generation network to obtain the target face image, the generation network being configured to generate an image having a preset image style from M N-dimensional vectors;
  wherein the generation network comprises a plurality of network layers of different resolutions, and the network layers of different resolutions are respectively configured to process second latent variables corresponding to different face attributes.
10. The method according to claim 9, wherein the generation network comprises M network layers, the second latent variable is represented as M fourth N-dimensional vectors, and decoding the second latent variable with the generation network to obtain the target face image comprises:
  inputting the 1st fourth N-dimensional vector into the 1st network layer of the generation network to obtain a 1st intermediate image output by the 1st network layer;
  inputting the m-th fourth N-dimensional vector and the (m-1)-th intermediate image into the m-th network layer of the generation network to obtain an m-th intermediate image output by the m-th network layer, m∈[2,M);
  inputting the M-th fourth N-dimensional vector and the (M-1)-th intermediate image into the M-th network layer of the generation network to obtain the target face image output by the M-th network layer.
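The layer-by-layer decoding described in this claim can be sketched with stand-in callables in place of the learned network layers; the toy "images" here are scalars, and each layer simply adds its vector to the previous intermediate image, which is of course not the real learned mapping:

```python
def layerwise_generate(layers, fourth_vectors):
    """Claim-10-style decoding: layer m consumes the m-th fourth vector and the
    (m-1)-th intermediate image; layer 1 consumes only its vector; the last
    layer's output is the target face image. `layers` is a list of M callables
    standing in for the generation network's learned layers."""
    intermediate = layers[0](fourth_vectors[0], None)   # layer 1: vector only
    for m in range(1, len(layers)):                     # layers 2..M
        intermediate = layers[m](fourth_vectors[m], intermediate)
    return intermediate                                 # target face image

# Toy check with M=4 scalar "images": each layer adds its vector to the image.
layers = [lambda v, img: v if img is None else img + v] * 4
assert layerwise_generate(layers, [1, 2, 3, 4]) == 10
```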
11. The method according to any one of claims 1 to 10, wherein in a case where the product of the attribute editing direction and the attribute editing degree is a positive value, the display effect of the face attribute in the target face image obtained based on the second latent variable is enhanced relative to the face image;
  in a case where the product of the attribute editing direction and the attribute editing degree is a negative value, the display effect of the face attribute in the target face image obtained based on the second latent variable is weakened relative to the face image.
12. An image processing apparatus, comprising:
  an obtaining part configured to obtain a face image to be processed;
  an encoding part configured to encode the face image to obtain a first latent variable of the face image;
  an editing part configured to, in response to a setting operation for an attribute editing degree of a face attribute, edit the first latent variable according to the set attribute editing degree and an attribute editing direction corresponding to the face attribute, to obtain an edited second latent variable, wherein the attribute editing direction represents a direction in which the face attribute is enhanced or weakened, different face attributes correspond to different attribute editing directions, and the attribute editing degree represents a degree to which the face attribute is enhanced or weakened;
  a decoding part configured to decode the second latent variable to obtain a target face image, wherein a display effect of the face attribute in the target face image is different from that in the face image.
13. The apparatus according to claim 12, wherein the attribute editing direction corresponding to the face attribute is obtained by performing binary classification on sample face images with an attribute classifier; wherein performing binary classification on the sample face images with the attribute classifier comprises:
  performing binary classification on the sample face images with the attribute classifier corresponding to the face attribute, to obtain an attribute classification boundary of the sample face images in a latent space, the latent space representing a sample distribution space of sample latent variables corresponding to the sample face images; and determining, as the attribute editing direction, the direction in which the attribute classification boundary faces the positive-sample side of the face attribute.
14. The apparatus according to claim 12 or 13, wherein the first latent variable is represented as M first N-dimensional vectors, the attribute editing direction is represented as a second N-dimensional vector, N and M are positive integers, and the editing part comprises:
  a determining sub-part configured to determine, according to an attribute type corresponding to the face attribute, at least one first N-dimensional vector on which the attribute editing direction acts;
  a calculating sub-part configured to calculate a product of the second N-dimensional vector and the attribute editing degree to obtain a third N-dimensional vector;
  an adding sub-part configured to add the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain M fourth N-dimensional vectors, the second latent variable being represented as the M fourth N-dimensional vectors.
15. The apparatus according to any one of claims 12 to 14, wherein the attribute type corresponding to the face attribute comprises at least one of a first type of face attribute, a second type of face attribute, and a third type of face attribute; the first latent variable is represented as M first N-dimensional vectors;
  the attribute editing direction of the first type of face attribute acts on the 1st to the i-th first N-dimensional vectors; the attribute editing direction of the second type of face attribute acts on the (i+1)-th to the j-th first N-dimensional vectors; the attribute editing direction of the third type of face attribute acts on the (j+1)-th to the M-th first N-dimensional vectors; where i∈[1,M] and j∈[2,M].
16. The apparatus according to claim 15, wherein the first type of face attribute comprises at least one of a face shape, a pose, and a gender, an age, or an emotion represented by the face; the second type of face attribute comprises at least one of a beard, glasses, or a mask on the face; the third type of face attribute comprises at least one of a pupil color, a hair color, a makeup color, or a filter color.
17. The apparatus according to any one of claims 14 to 16, wherein the attribute type comprises a first type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts comprises:
  in a case where the face attribute comprises the first type of face attribute, determining that the attribute editing direction acts on the 1st to the i-th first N-dimensional vectors;
  wherein adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors comprises:
  adding the 1st to the i-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
18. The apparatus according to any one of claims 14 to 16, wherein the attribute type comprises a second type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts comprises:
  in a case where the face attribute comprises the second type of face attribute, determining that the attribute editing direction acts on the (i+1)-th to the j-th first N-dimensional vectors;
  wherein adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors comprises:
  adding the (i+1)-th to the j-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
19. The apparatus according to any one of claims 14 to 16, wherein the attribute type comprises a third type of face attribute, and determining, according to the attribute type corresponding to the face attribute, the at least one first N-dimensional vector on which the attribute editing direction acts comprises:
  in a case where the face attribute comprises the third type of face attribute, determining that the attribute editing direction acts on the (j+1)-th to the M-th first N-dimensional vectors;
  wherein adding the at least one first N-dimensional vector among the M first N-dimensional vectors to the third N-dimensional vector respectively to obtain the M fourth N-dimensional vectors comprises:
  adding the (j+1)-th to the M-th first N-dimensional vectors among the M first N-dimensional vectors to the third N-dimensional vector respectively, to obtain the M fourth N-dimensional vectors.
20. The apparatus according to any one of claims 12 to 19, wherein the decoding part comprises: a network decoding sub-part configured to decode the second latent variable with a generation network to obtain the target face image, the generation network being configured to generate an image having a preset image style from M N-dimensional vectors;
  wherein the generation network comprises a plurality of network layers of different resolutions, and the network layers of different resolutions are respectively configured to process second latent variables corresponding to different face attributes.
21. The apparatus according to claim 20, wherein the generation network comprises M network layers, the second latent variable is represented as M fourth N-dimensional vectors, and decoding the second latent variable with the generation network to obtain the target face image comprises:
  inputting the 1st fourth N-dimensional vector into the 1st network layer of the generation network to obtain a 1st intermediate image output by the 1st network layer;
  inputting the m-th fourth N-dimensional vector and the (m-1)-th intermediate image into the m-th network layer of the generation network to obtain an m-th intermediate image output by the m-th network layer, m∈[2,M);
  inputting the M-th fourth N-dimensional vector and the (M-1)-th intermediate image into the M-th network layer of the generation network to obtain the target face image output by the M-th network layer.
22. The apparatus according to any one of claims 12 to 21, wherein in a case where the product of the attribute editing direction and the attribute editing degree is a positive value, the display effect of the face attribute in the target face image obtained based on the second latent variable is enhanced relative to the face image;
  in a case where the product of the attribute editing direction and the attribute editing degree is a negative value, the display effect of the face attribute in the target face image obtained based on the second latent variable is weakened relative to the face image.
23. An electronic device, comprising:
  a processor; and
  a memory configured to store processor-executable instructions;
  wherein the processor is configured to invoke the instructions stored in the memory to perform the method according to any one of claims 1 to 11.
24. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 11.
25. A computer program product comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor in the electronic device performs the method according to any one of claims 1 to 11.
PCT/CN2022/134943 2022-03-22 2022-11-29 Image processing method and apparatus, and electronic device, storage medium and program product WO2023179075A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210279511.6 2022-03-22
CN202210279511.6A CN114373215A (en) 2022-03-22 2022-03-22 Image processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023179075A1 true WO2023179075A1 (en) 2023-09-28

Family

ID=81146705

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/134943 WO2023179075A1 (en) 2022-03-22 2022-11-29 Image processing method and apparatus, and electronic device, storage medium and program product

Country Status (2)

Country Link
CN (1) CN114373215A (en)
WO (1) WO2023179075A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114373215A (en) * 2022-03-22 2022-04-19 北京大甜绵白糖科技有限公司 Image processing method and device, electronic equipment and storage medium

Citations (4)

Publication number Priority date Publication date Assignee Title
CN111563427A (en) * 2020-04-23 2020-08-21 中国科学院半导体研究所 Method, device and equipment for editing attribute of face image
CN113255551A (en) * 2021-06-04 2021-08-13 广州虎牙科技有限公司 Training, face editing and live broadcasting method of face editor and related device
CN113822953A (en) * 2021-06-24 2021-12-21 华南理工大学 Processing method of image generator, image generation method and device
CN114373215A (en) * 2022-03-22 2022-04-19 北京大甜绵白糖科技有限公司 Image processing method and device, electronic equipment and storage medium

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN111951153B (en) * 2020-08-12 2024-02-13 杭州电子科技大学 Face attribute refined editing method based on generation of countering network hidden space deconstructment


Also Published As

Publication number Publication date
CN114373215A (en) 2022-04-19

Similar Documents

Publication Publication Date Title
CN110070483B (en) Portrait cartoon method based on generation type countermeasure network
JP7490004B2 (en) Image Colorization Using Machine Learning
CN109815924B (en) Expression recognition method, device and system
Lee et al. Detecting handcrafted facial image manipulations and GAN-generated facial images using Shallow-FakeFaceNet
US11410364B2 (en) Systems and methods for realistic head turns and face animation synthesis on mobile device
TW202105238A (en) Image processing method and device, processor, electronic equipment and storage medium
Chadha et al. Deepfake: an overview
WO2020150689A1 (en) Systems and methods for realistic head turns and face animation synthesis on mobile device
Zhang et al. Bionic face sketch generator
CN107025678A (en) A kind of driving method and device of 3D dummy models
WO2023179074A1 (en) Image fusion method and apparatus, and electronic device, storage medium, computer program and computer program product
CN116363261B (en) Training method of image editing model, image editing method and device
WO2023179075A1 (en) Image processing method and apparatus, and electronic device, storage medium and program product
Wang et al. Learning how to smile: Expression video generation with conditional adversarial recurrent nets
WO2023024653A1 (en) Image processing method, image processing apparatus, electronic device and storage medium
Parihar et al. Everything is there in latent space: Attribute editing and attribute style manipulation by stylegan latent space exploration
CN111639537A (en) Face action unit identification method and device, electronic equipment and storage medium
Hang et al. Language-guided face animation by recurrent StyleGAN-based generator
WO2024066549A1 (en) Data processing method and related device
Liu et al. Mask face inpainting based on improved generative adversarial network
Joshi A brief review of facial expressions recognition system
CN113838159B (en) Method, computing device and storage medium for generating cartoon images
US20230377214A1 (en) Identity-preserving image generation using diffusion models
Song et al. Virtual Human Talking-Head Generation
Xu et al. Character photo selection for mobile platform

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22933126

Country of ref document: EP

Kind code of ref document: A1