CN109934107A

CN109934107A - Image processing method and device, electronic equipment and storage medium

Info

Publication number: CN109934107A
Application number: CN201910095929.XA
Authority: CN
Inventors: 钱晨; 林君仪; 吴文岩; 王权; 钱湦钜
Original assignee: Beijing Sensetime Technology Development Co Ltd
Current assignee: Beijing Sensetime Technology Development Co Ltd
Priority date: 2019-01-31
Filing date: 2019-01-31
Publication date: 2019-06-25
Anticipated expiration: 2039-01-31
Also published as: CN109934107B; CN114581999A

Abstract

The embodiment of the invention discloses a kind of image processing method and devices, electronic equipment and storage medium.Described image processing method includes: the first image of detection, obtains the textural characteristics of the first object in the first image；Obtain structure feature；In conjunction with the textural characteristics and structure feature, generation includes the second image of the second object.

Description

Image processing method and device, electronic equipment and storage medium

Technical field

The present invention relates to information technology field more particularly to a kind of image processing methods and device, electronic equipment and storage Medium.

Background technique

Under some image procossing scenes, user is desired based on an image and generates another image.For example, image A is Image when the serious expression of user A, user want the image that user's A smile expression is generated based on image A.

In the prior art, it is realized in such a way that face operates, but to be only able to carry out face small for the prior art Change, if the amplitude changed is bigger, the image generated will be made very strange, face imaging and the true people of generation The difference of face imaging is very big；This results in image fault degree greatly and expected problem is not achieved in picture quality.

Summary of the invention

In view of this, an embodiment of the present invention is intended to provide a kind of image processing methods and device, electronic equipment and storage to be situated between Matter.

The technical scheme of the present invention is realized as follows:

A kind of image processing method, comprising:

The first image is detected, the textural characteristics of the first object in the first image are obtained；

Obtain structure feature；

In conjunction with the textural characteristics and structure feature, generation includes the second image of the second object.

Based on above scheme, textural characteristics described in the combination and structure feature, generation include the second of the second object Image, comprising:

According to the textural characteristics and the structure feature, the pixel for being included to first object carries out pixel weight Group generates second object；

The second image is formed based on second object.

Based on above scheme, the acquisition structure feature, comprising:

Receive first profile information；

The structure feature is generated based on the first profile information.

Based on above scheme, the acquisition structure feature, comprising:

The profile for detecting third object in third image obtains first structure feature；

Textural characteristics described in the combination and structure feature, generation include the second image of the second object, comprising:

In conjunction with the textural characteristics and the first structure feature, generation includes the second image of second object.

Based on above scheme, the acquisition structure feature, comprising:

The profile for detecting the first object described in the first image obtains the second structure feature；

Second structure feature, which is adjusted, based on debugging instruction obtains third structure feature；

In conjunction with the textural characteristics and the third structure feature, generation includes the second image of second object.

Based on above scheme, the first image of the detection obtains the textural characteristics of the first object in the first image, comprising:

Using the first coder processes the first image of deep learning model, obtain characterizing in first object In spatial structural form and profile between texture incidence relation probability distribution.

Based on above scheme, the acquisition structure feature, comprising:

The first profile information inputted using the second encoder of deep learning model, acquisition are with spatial structural form Observation variable and using texture information as the probability of latent variable.

Based on above scheme, the structural information that the second encoder using the deep learning model inputs is obtained It is observation variable and using texture information as the probability of latent variable using spatial structural form, comprising:

The second profile information based on K cluster carries out the classification of the first profile information；

According to the classification as a result, determining that the K cluster calculates the weight of the probability；

According to the weight, determine that using first profile information be observation variable and using texture information as the general of latent variable Rate.

Convolution sum pixel is carried out in conjunction with the probability distribution and the probability using the deep learning solution to model code device Reorganization generates second object；

Second image is obtained based on second object.

Based on above scheme, the spatial structural form includes at least one of:

Action message；

Expression information；

Orientation information.

Based on above scheme, the weight of the deep learning model is obtained by weight normalized.

A kind of image processing apparatus, comprising:

Detection module obtains the textural characteristics of the first object in the first image for detecting the first image；

Module is obtained, for obtaining structure feature；

Generation module, in conjunction with the textural characteristics and structure feature, generation to include the second figure of the second object Picture.

Based on above scheme, the generation module is specifically used for according to the textural characteristics and the structure feature, right The pixel that first object is included carries out pixel reorganization and generates second object；The is formed based on second object Two images.

Based on above scheme, the acquisition module is specifically used for receiving first profile information；Based on the first profile Information generates the structure feature.

Based on above scheme, the acquisition module obtains the specifically for the profile of third object in detection third image One structure feature；

The generation module is specifically used for generating in conjunction with the textural characteristics and the first structure feature comprising State the second image of the second object.

Based on above scheme, the acquisition module, specifically for the wheel of the first object described in detection the first image Exterior feature obtains the second structure feature；Second structure feature, which is adjusted, based on debugging instruction obtains third structure feature；

The generation module is specifically used for generating in conjunction with the textural characteristics and the third structure feature comprising State the second image of the second object.

Based on above scheme, the detection module, for the described in the first coder processes using deep learning model One image obtains the probability distribution that incidence relation between texture in spatial structural form and profile is characterized in first object.

Based on above scheme, the acquisition structure feature, comprising:

Based on above scheme, the acquisition module carries out described the specifically for the second profile information based on K cluster The classification of one profile information；According to the classification as a result, determining that the K cluster calculates the weight of the probability；According to described Weight determines that using first profile information be observation variable and using texture information as the probability of latent variable.

Based on above scheme, the generation module is specifically used for utilizing the deep learning solution to model code device combination institute It states probability distribution and the probability carries out the processing of convolution sum pixel reorganization and generates second object；

Second image is obtained based on second object.

Based on above scheme, the spatial structural form includes at least one of:

Action message；

Expression information；

Orientation information.

A kind of computer storage medium, the computer storage medium are stored with computer-executable code；The calculating After machine executable code is performed, the image processing method that any one aforementioned technical solution provides can be realized.

A kind of electronic equipment characterized by comprising

Memory, for storing information；

Processor is connect with the memory, for executable by executing the computer being stored on the memory Instruction can be realized the image processing method that aforementioned any technical solution provides.

Technical solution provided in an embodiment of the present invention is no longer direct life when changing the first object in the first image Hard the first object of adjustment obtains the second object；But textural characteristics are extracted from the first object, obtain a structure spy Sign, by the second high object of textural characteristics and one fidelity of the comprehensive generation of structure feature, to obtain one comprising second pair The second image of elephant；It is generated other than desired effect in this way, reducing and being based only upon some preset conditions directly the first object of stiff adjustment Unusual second object the phenomenon that, promoted the second image picture quality.

Detailed description of the invention

Fig. 1 is a kind of flow diagram of image processing method provided in an embodiment of the present invention；

Fig. 2 is a kind of facial contour schematic diagram provided in an embodiment of the present invention；

Fig. 3 A is a kind of structural schematic diagram of deep learning model provided in an embodiment of the present invention；

Fig. 3 B is the structural schematic diagram of another deep learning model provided in an embodiment of the present invention；

Fig. 4 A is the comparison schematic diagram of the image of reconstruction provided in an embodiment of the present invention and the reconstruction image of correlation technique；

Fig. 4 B be replaced for the profile that the embodiment of the present invention provides a kind of first image and face image after the second image Comparison schematic diagram；

Fig. 4 C is the textural characteristics of first image of fusion provided in an embodiment of the present invention, the structure feature of the second image obtains To the schematic diagram of third image；

Fig. 5 is a kind of structural schematic diagram of image processing apparatus provided in an embodiment of the present invention；

Fig. 6 is the structural schematic diagram of a kind of electronic equipment provided in an embodiment of the present invention.

Specific embodiment

Technical solution of the present invention is further described in detail with reference to the accompanying drawings and specific embodiments of the specification.

As shown in Figure 1, the present embodiment provides a kind of image processing methods, comprising:

Step S110: the first image of detection obtains the textural characteristics of the first object in the first image；

Step S120: structure feature is obtained；

Step S130: in conjunction with the textural characteristics and structure feature, generation includes the second image of the second object.

Image processing method provided in this embodiment is applied in various electronic equipments, for example, laptop, hand In machine, tablet computer or server.In some embodiments, image processing method provided in this embodiment can be advantageously employed in It include to carry out image procossing using GPU in the electronic equipment of image processor (GPU), with the high spy of image processing efficiency Point.

In the present embodiment, the first image is original image, for the feature comprising the first object.Described first pair As that can be the planar object extracted from plane (2D) image, it be also possible to the 3D object extracted from three-dimensional (3D) image.

If the first image is 2D image, second image can be 2D image or 3D rendering；If the first image is 3D rendering, second image may be 2D image or 3D rendering.

First object can be the life entities such as human or animal, be also possible to mobile physics or stationary object etc. without life Body.

In the present embodiment, the textural characteristics of the first object in the first image are extracted.

The textural characteristics characterization can include: the outer surface textural characteristics of the first object；For example, the outer surface texture Feature may include dermatoglyph feature；Taking human as example, the dermatoglyph is used to indicate eyelid fold, face wrinkles and/or laughs at Line.For another example the outer surface textural characteristics may also include that the hair feature of outer surface, for example, the hair feature can be with Indicate dense degree and/or the shape etc. of eyebrow.

In short, the textural characteristics are used to indicate the texture feature of the outer surface of the first object.

In the present embodiment, while also structure feature can be obtained, which is used to indicate second pair for wishing to present The space structure feature of elephant.

By taking face as an example, the structure feature be can serve to indicate that: the relative positional relationship between the face of face, five The space structure feature of each organ itself in official.

In the present embodiment, in order to reduce the strange degree for generating the second object, in conjunction with textural characteristics and structure feature, Generate the second image.The structure feature defines the space structure feature of the second object, and textural characteristics define in the space The texture feature of outer surface under design feature.On the one hand, the first object due to the textural characteristics from the first image, then The second object integration textural characteristics of the first object.At the same time, the structure feature of acquisition may be with the first object itself Structure feature changes, so that the second object is the structure feature based on acquisition the textural characteristics of the first object are imaged； In this way, the second object is made, again because of the difference of space structure feature, to there is difference with the first object relative to the first object.

In this way, the second object is equivalent to the reconstructed object of the first object, second image is equivalent to the weight of the first image Build image.

No longer it is the direct adjustment directly to the first object in the first image in the present embodiment, reduces because of equipment Due to constraining improper or adjusting the caused second strange image generated such as excessive during adjustment, generation is improved The fidelity of second image improves the picture quality of the second image of reconstruction.

In some embodiments, the step S130 can include: right according to the textural characteristics and the structure feature The pixel that first object is included carries out pixel reorganization and generates second object；

The second image is formed based on second object.

It in the present embodiment, can be right when generating second object based on the textural characteristics and the structure feature The pixel of first object carries out pixel reorganization.In the present embodiment, the granularity of pixel reorganization is sub-pixel granularity.For example, with For RGB image, a pixel may include three sub-pixels of RGB, can be by changing the first figure when carrying out pixel reorganization As in the first object institute within the pixel one or more sub-pixels sub-pixel value, to generate second object.In this way, knot Textural characteristics and structure feature are closed, the pixel reorganization of sub-pixel granularity is carried out, it is possible to reduce generates lines or the stiff surprise of profile Different image, to promote the picture quality of the second image.For example, locally being wrapped relative to some is carried out based on warp mesh The number of coordinate transform or pixel containing pixel is deleted deformed for, integrated structure feature carry out pixel reorganization, tool Have the characteristics that deformation effect is good and image effect is good.

Specifically such as, the step S130 can include:

The profile of second object is determined according to the structure feature；The profile characterizes the different portions of the second object Divide the position in the second image；

The textural characteristics are then based on, the pixel value of pixel in the profile is changed, the pixel value after change has described Textural characteristics；So generate the second object that at least design feature is changed；Further, it is replaced using the second object The first object in the first image just generates the second image comprising second object.

In some embodiments, the step S120 can include:

Receive first profile information；

The structure feature is generated based on the first profile information.

In the present embodiment, the electronic equipment for running deep learning model can not have to voluntarily obtain first profile information, Can the first profile information directly be received from peripheral hardware.

For example, the first profile information can be facial contour by taking the face of organism as an example.

It is facial contour figure that Fig. 2, which illustrates three kinds of first profile information,.

The first profile information can be comprising contoured contour images, which can be and described Image of one image with picture size.For example, the first image includes W*H pixel, then the contour images also may include There is W*H pixel, in this way, when being subsequently generated the second image, it is possible to reduce the pixel pair between the image of different images size Together.

In the present embodiment, the structure feature can be obtained based on the first profile information.

For example, by taking the facial contour of first profile information description as an example, the structure feature can include: shape of face, five The structure feature of the various characterization design features such as the position of official.

In further embodiments, the step S120 can include:

When carrying out image reconstruction, it is desirable to change the face in image A into face in image B, continue to continue to use image A Middle dermatoglyph；Then at this point, image A is the first image, image B is the third image.The structure feature can be The structure feature that the deep learning model that electronic equipment is run is extracted from image B.For example, the deep learning model Can be handled by convolution sum pondization etc., the characteristics of image of third image is extracted, then knot is extracted from these characteristics of image Structure feature.Specifically such as, profile key point is extracted from third image using neural network even depth learning model, be based on The relative position of these profile key points obtains the structure feature.

By taking face as an example, the profile key point can include: forehead key point, brow ridge key point, nose highest point pass Key point, the complete profile key point of several shapes of face, lip outer profile and upper lip and the key point of lower lip cut-off rule etc..Connect these Profile key point, so that it may obtain the profile diagram of face.

In some embodiments, the step S130 can include:

In some embodiments, the structure feature comes from the first image, but again endless is congruent with first figure Picture.For example, carrying out the extraction of structure feature to the first image first with deep learning model etc., described second is obtained Structure feature, then obtains an adjustment instruction, which can be the user based on man-machine interactive interface and take in generation , it is also possible to the adjustment instruction generation of electronic equipment internal generation.

For example, in some embodiments, image processing application program has smiling face's Reconstruction of The Function；Smiling face's Reconstruction of The Function The instruction that built-in many smiling faces rebuild, the instruction can be one kind of aforementioned adjustment instruction.

For example, the first image includes the face A of serious expression；According to smiling face's Reconstruction of The Function, based on smiling face relative to tight The face of respectful label, it is possible that the features such as lip is grinned out, eyes are narrowed, based on these smiling faces relative to serious expression Difference, can be generated adjustment the second structure feature in corresponding eigenvalue adjustment instruction.In this way, being passed through based on adjustment instruction Adjustment to the second structure feature can generate third structure feature.

In this way, on the one hand third structure feature at least partly inherits the structure feature of the first object, on the other hand relatively Variation is produced again in the first object.In this way, can be based on an image in film or video production field, pass through extraction Structure feature in the image, it is changed so as to generate multiple expressions and/or posture by adjusting the output of instruction Second image.If generating video using these second images as picture frame, can also in the case where only acquiring an image, Automatically generate the video that a fidelity is high and picture quality is good.

In some embodiments, the step S110 can include:

The present embodiment extracts the textural characteristics using deep learning model.

The deep learning model include the first encoder, first encoder can carry out image process of convolution and Pondization processing, obtains the textural characteristics.

For example, first encoder may include the residual error module of N number of number；Each residual error module includes convolutional layer, pond Change layer and splicing layer；The convolutional layer extracts characteristics of image from the first figure by process of convolution, which includes but not It is limited to the feature of the first object.For example, the characteristics of image may include the background distinguished other than the first object and the first object Boundary characteristic etc..After the pond layer is handled by pondization, the image pixel of preceding layer input will sample；It is described Splicing layer can splice the initial input of the residual error module and the feature after the processing of convolution sum pondization, export It is exported to next residual error module or directly as textural characteristics.

In the present embodiment, a residual error module can include: three convolutional layers and 2 pond layers；Convolutional layer and pond Change and is spaced apart between layer.

In some embodiments, the residual error module that first encoder includes can be 5 or 6.

In the present embodiment, using first the first image of coder processes, posture different in the first object, institute are obtained The texture for stating the first object corresponding to posture different in the first object is different.For example, smiling face's skin corresponding with face of crying Skin texture is different.

In the present embodiment, what is obtained by the first encoder is based on being associated with pass between spatial structural form and texture The probability distribution of system, the probability distribution can be indicated with q (z/x, y), wherein what y was referred to is spatial structural form；Z is indicated Apparent textural characteristics, x indicate the first image image data.

Since the probability distribution is to be changed based on spatial structural form and make textural characteristics also changed, by this Probability distribution is as the textural characteristics, and subsequent structure feature is obtained according to spatial structural form, alternatively, saying structure spy Sign is extracted from spatial structural form.If introducing new structure feature, y is changed, in this way, based on above-mentioned Probability distribution can reconstruct the second object that picture quality is high and fidelity is high, to generate fidelity height and image matter Measure the second high image.

In some embodiments, the step S130 can include:

The first profile information inputted using the second encoder of deep learning model, acquisition are with spatial structural form Observation variable and using texture information as the probability of latent variable.Observation variable herein can be considered dependent variable, the latent variable It can be to be considered as independent variable, obtained probability characterization is that spatial structural form changes the probability for causing texture information to change.

The spatial structural form can be extracted from the first profile information, and the spatial structural form can wrap Include at least one of:

Action message, different face actions has corresponded to different face contours, therefore the action message can be for described in conduct One kind of spatial structural form；

Expression information, different expressions have corresponded to different face contours, in this way, expression information can be used as the sky Between structural information one kind；

Face's direction of orientation information, different directions, the first object or the second object is different.

In the present embodiment, which can be indicated with p (z/y)；What y was referred to is spatial structural form；Z indicates outer Textural characteristics in sight.

If structure feature is the structure feature of a side face, and the first object is positive face, is implemented by executing the present invention The step S110 to step S130 of example can make the second object for being generated as side face.

In some embodiments, the structural information that the second encoder using the deep learning model inputs, is obtained Being able to spatial structural form is observation variable and using texture information as the probability of latent variable, comprising:

In order to obtain structure feature of each object under different expressions and/or posture.In the present embodiment, The deep learning model and K cluster can be divided by a large amount of training data in advance, expression and/or appearance in each cluster The similitude with higher of design feature corresponding to gesture, the similitude between space structure feature between cluster and cluster are lower.

In some embodiments, different clusters can correspond to the crowd of all ages and classes, the different colours of skin, different sexes.It is different The crowd of cluster has the characteristics that different space structures, to correspond to different spatial structural forms.

In the present embodiment, it can determine whether out that first profile information carries out the meter of similitude with the second profile information of K cluster It calculates, for example, calculating the similar of the second profile information between the first profile information and each cluster of input based on cosine function Degree.What the second profile information herein can be formed for the central value of the corresponding contour point of each profile information in a cluster.Pass through The cluster that the available first profile information such as cosine function calculating are belonged to, alternatively, between first profile information and each cluster Distance etc..In short, the similitude between available first profile information and the second profile information of each cluster, based on this Similitude determines that each cluster participates in calculating the weight of the probability.In some embodiments, the similitude the high then corresponds to cluster Weight it is bigger, then it is bigger to the influence of probability.

Based on the similarity, each fasciation is determined at the weight of the probability, based on the weight and each cluster Spatial structural form and texture between corresponding relationship, obtain the probability.

For example, the probability is availableTo indicate.Wherein, y indicates to be based on first profile Information obtains spatial structural form；Z indicates texture information；It is to be based onCovariance matrix；K indicates the total of cluster Number；w_kIt is the weight of k-th of cluster；u_kIt is the mean value of the Gaussian Profile of k-th of cluster；σ_kIt is the side of the Gaussian Profile of k-th of cluster Difference.Indicate z withBetween relative entropy.

In some embodiments, the step S130 can include:

Second image is obtained based on second object.

As shown in Fig. 3 A and Fig. 3 B, the probability distribution and the probability are input to deep learning solution to model code device In, decoder generates the second image comprising the second object by the processing such as convolution sum pixel reorganization.

It is that input obtains aforementioned probability distribution q with image as, using y as input, obtained aforementioned Probability p (z/y) in Fig. 3 A (z/x,y).Q (z/x, y) herein can also be write as the form of z~q (/x, y), and what mathematic sign herein referred to is Parameter z；For example, with reference in Fig. 3 B.

E in Fig. 3 B_φIndicate the first encoder；Indicate decoder, E_uIndicate second encoder.Show there is K in Fig. 3 B A cluster, these clusters are respectively designated as: c₁,c₂,...c_k.As shown in Figure 3A and Figure 3B, by between second encoder and decoder Jump connection, the feature after different residual error module convolution is directly inputted to the corresponding convolutional layer of decoder and is spliced Processing, to promote the fidelity and image effect of the second object.

In some embodiments, it in order to promote the image processing effect of deep learning model, is completed in deep learning model After training, the weight of deep learning model can be normalized.Therefore in the present embodiment, the deep learning mould The weight of type is obtained by weight normalized.

Specifically, the weight normalized is carried out using following functional relation:

Y=W*X+B；Wherein, Y is output, and W is weight；X is M dimension input feature vector；B is threshold value.Its In, v is the M dimensional characteristics vector of X；| | v | | it is v Euclid norm.G is | | W | |, independently of v.| | W | | it is several for the Europe W In norm.

Above provide a kind of normalized modes of weight, and in some embodiments, the weight normalization can be with base It is normalized in maximum weight.For example, seek each weight and weight limit into ratio, after normalization Weight.In further embodiments, the weight normalized, which can also be, is normalized place based on maximin Reason, for example, the difference between each weight and maximum weight and minimum weight is compared, the power after being normalized Value.

Weight herein can be the weight of each calculate node in neural network even depth model.

As shown in figure 5, the present embodiment provides a kind of image processing apparatus, comprising:

Detection module 110 obtains the textural characteristics of the first object in the first image for detecting the first image；

Module 120 is obtained, for obtaining structure feature；

Generation module 130, in conjunction with the textural characteristics and structure feature, generation to include the second of the second object Image.

In some embodiments, the detection module 110, acquisition module 120 and generation module 130 can be program module, After described program module is executed by processor, it can be realized detection module 110, acquisition module 120 and generation module 130 and held Capable function.

In further embodiments, the detection module 110, acquisition module 120 and generation module 130 can be soft or hard knot Mold block；The soft or hard binding modules can be various programmable arrays；The programmable array can include: field-programmable battle array Column and/or complex programmable array.

In further embodiments, the detection module 110, acquisition module 120 and generation module 130 can be pure hard Part module；The pure hardware module may include specific integrated circuit.

In some embodiments, the generation module 130 is specifically used for special according to the textural characteristics and the structure Sign, the pixel for being included to first object carry out pixel reorganization and generate second object；Based on second pair of pictograph At the second image.

In some embodiments, the acquisition module 120 is specifically used for receiving first profile information；Based on described first Profile information generates the structure feature.

In some embodiments, the acquisition module 120, specifically for the profile of third object in detection third image Obtain first structure feature；

The generation module 130, specifically for including in conjunction with the textural characteristics and the first structure feature, generation There is the second image of second object.

In some embodiments, the acquisition module 120 is specifically used for described in detection the first image first pair The profile of elephant obtains the second structure feature；Second structure feature, which is adjusted, based on debugging instruction obtains third structure feature；

The generation module 130, specifically for including in conjunction with the textural characteristics and the third structure feature, generation There is the second image of second object.

In some embodiments, the detection module 110, for the first coder processes using deep learning model The first image obtains characterizing in spatial structural form and profile the general of incidence relation between texture in first object Rate distribution.

In some embodiments, the acquisition structure feature, comprising:

In some embodiments, the acquisition module 120 carries out institute specifically for the second profile information based on K cluster State the classification of first profile information；According to the classification as a result, determining that the K cluster calculates the weight of the probability；According to The weight determines that using first profile information be observation variable and using texture information as the probability of latent variable.

In some embodiments, the generation module 130 is specifically used for utilizing the deep learning solution to model code device The processing of convolution sum pixel reorganization, which is carried out, in conjunction with the probability distribution and the probability generates second object；

Second image is obtained based on second object.

In some embodiments, the spatial structural form includes at least one of:

Action message；

Expression information；

Orientation information.

In some embodiments, the weight of the deep learning model is obtained by weight normalized.

Several specific examples are provided below in conjunction with above-mentioned any embodiment:

Example 1:

This exemplary technical solution consists of three parts:

1. one is used for the convolutional neural networks model of edge detection, for the face picture of input, which is responsible for To an accurate face edge line detection result (such as outer eyelid, outer face contour line etc.).

2. a condition encoding and decoding network, in the profile information obtained by the first step, structural characterization is extracted by network, The appearance textural characteristics and structure feature of encoder decomposition input facial image are helped as specific structure feature information.

3. a weight normalization and decoder design based on perceived quality, further help to promote generation quality.

The neural network that this example provides clearly decomposes encoder by the way that the spatial structural form of picture is decomposited The textural characteristics and structure feature of appearance maintain the consistency and specific expression of appearance texture, posture structure, recombinant Decoder obtains a high-fidelity, and various face, which manipulates, to be calculated.Structural information constraint has been explicitly joined in network, so that Network can well decompose appearance texture and structure feature, so that face manipulation has good result.One is based on The weight normalization of perceived quality and decoder design incorporate in network, further help to promote generation quality.

Example 2:

This example provides a kind of image processing method, comprising:

Give an image x；Then need to obtain x withBetween mapping relations G.The G can include: φ_appAnd u_str.It should Mapping relations can pass through textural characteristics z=φ_app(x, c) and y=u_str(c)。

Utilize the available building aforementioned depth learning model of conditional variance autocoding (CVAE) network.

Aforementioned probability distribution and/or probability are solved using following functional relation.

logp(x/y)≥E_q[logp(x/z,y)]-D_KL[q(z/x,y),p(z/y)]；Based on the functional relation, by asking Solve the available q of maximum value (z/x, y) and p (z/y) of p (x/y).Wherein, q (z/x, y) can be approximately p (z/y)²。

The network that this example provides can be trained using the function of following random targets:

q_φ(z/x, y) meets distribution constraint condition N (0, I).

The probability is availableTo indicate.Wherein, y indicates to be based on first profile information Obtain spatial structural form；Z indicates texture information；It is to be based onCovariance matrix；The total number of K expression cluster； w_kIt is the weight of k-th of cluster；u_kIt is the mean value of the Gaussian Profile of k-th of cluster；σ_kIt is the variance of the Gaussian Profile of k-th of cluster.Indicate z withBetween relative entropy.

c_kFor the weight of structure feature.

Q (z/x, c)=N (z/u (x, c), σ²(x,c)I)。

In the training process of deep learning model, costing bio disturbance is carried out using following loss function:

Wherein, λ_lFor the weight of first of hidden layer of neural network；ψ_l(x) input picture x is transformed to for first of hidden layer The transforming function transformation function of feature space；It is first of hidden layer by reconstruction imageTransform to the transforming function transformation function of feature space.

Fig. 4 A is the comparison schematic diagram of the image of reconstruction provided in an embodiment of the present invention and the reconstruction image of correlation technique； The comparison of the second image after Fig. 4 B is replaced for the profile that the embodiment of the present invention provides a kind of first image and face image is shown It is intended to；Fig. 4 C be the fusion textural characteristics of the first image provided in an embodiment of the present invention, the second image structure feature obtain the The schematic diagram of three images.

Known to comparison chart 4A, Fig. 4 B and Fig. 4 C, it is clear that using the effect for the second image that the method that this example provides is formed It is more true to nature.

As shown in fig. 6, the embodiment of the present application provides a kind of image processing equipment, comprising:

Memory, for storing information；

Processor is connect with the memory, for executable by executing the computer being stored on the memory Instruction can be realized the image processing method that aforementioned one or more technical solutions provide, for example, such as Fig. 1, Fig. 3 A and Fig. 3 B Shown in one or more of method.

The memory can be various types of memories, can be random access memory, read-only memory, flash memory etc..It is described to deposit Reservoir can be used for information storage, for example, storage computer executable instructions etc..The computer executable instructions can be various Program instruction, for example, objective program instruction and/or source program instruction etc..

The processor can be various types of processors, for example, central processing unit, microprocessor, Digital Signal Processing Device, programmable array, digital signal processor, specific integrated circuit or image processor etc..

The processor can be connect by bus with the memory.The bus can be IC bus etc..

In some embodiments, the terminal device may also include that communication interface, the communication interface can include: network connects Mouthful, for example, lan interfaces, dual-mode antenna etc..The communication interface is equally connected to the processor, and can be used in information Transmitting-receiving.

In some embodiments, described image processing equipment further includes camera, which can be 2D camera, can To acquire 2D image or 3D rendering.

In some embodiments, the terminal device further includes man-machine interactive interface, for example, the man-machine interactive interface It may include various input-output equipment, for example, keyboard, touch screen etc..

The embodiment of the present application provides a kind of computer storage medium, and the computer storage medium is stored with computer Executable code；After the computer-executable code is performed, it can be realized what aforementioned one or more technical solutions provided Image processing method, for example, one or more of method as shown in Fig. 1, Fig. 3 A and Fig. 3 B.

The storage medium includes: movable storage device, read-only memory (ROM, Read-Only Memory), deposits at random The various media that can store program code such as access to memory (RAM, Random Access Memory), magnetic or disk. The storage medium can be non-moment storage medium.

The embodiment of the present application provides a kind of computer program product, and described program product includes computer executable instructions； After the computer executable instructions are performed, aforementioned any image processing method for implementing to provide can be realized, for example, such as One or more of method shown in Fig. 1, Fig. 3 A and Fig. 3 B.

In several embodiments provided herein, it should be understood that disclosed device and method can pass through Other modes are realized.Apparatus embodiments described above are merely indicative, for example, the division of the unit, only For a kind of logical function partition, there may be another division manner in actual implementation, such as: multiple units or components can combine, Or it is desirably integrated into another system, or some features can be ignored or not executed.In addition, shown or discussed each composition Partially mutual coupling or direct-coupling or communication connection can be through some interfaces, equipment or unit it is indirect Coupling or communication connection, can be electrical, mechanical or other forms.

Above-mentioned unit as illustrated by the separation member, which can be or may not be, to be physically separated, aobvious as unit The component shown can be or may not be physical unit, it can and it is in one place, it may be distributed over multiple networks On unit；Some or all of units can be selected to achieve the purpose of the solution of this embodiment according to the actual needs.

In addition, each functional unit in various embodiments of the present invention can be fully integrated into a processing module, it can also To be each unit individually as a unit, can also be integrated in one unit with two or more units；It is above-mentioned Integrated unit both can take the form of hardware realization, can also realize in the form of hardware adds SFU software functional unit.

Those of ordinary skill in the art will appreciate that: realize that all or part of the steps of above method embodiment can lead to The relevant hardware of program instruction is crossed to complete, program above-mentioned can be stored in a computer readable storage medium, the journey Sequence when being executed, executes step including the steps of the foregoing method embodiments；And storage medium above-mentioned include: movable storage device, only Read memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disk Or the various media that can store program code such as CD.

The above description is merely a specific embodiment, but scope of protection of the present invention is not limited thereto, appoints What those familiar with the art in the technical scope disclosed by the present invention, can easily think of the change or the replacement, answer It is included within the scope of the present invention.Therefore, protection scope of the present invention should be with the scope of protection of the claims It is quasi-.

Claims

1. a kind of image processing method characterized by comprising

Obtain structure feature；

2. the method according to claim 1, wherein textural characteristics described in the combination and structure feature, generate It include the second image of the second object, comprising:

According to the textural characteristics and the structure feature, the pixel for being included to first object carries out pixel reorganization generation Second object；

The second image is formed based on second object.

3. method according to claim 1 or 2, which is characterized in that the acquisition structure feature, comprising:

Receive first profile information；

The structure feature is generated based on the first profile information.

4. method according to claim 1 or 2, which is characterized in that the acquisition structure feature, comprising:

5. method according to claim 1 or 2, which is characterized in that the acquisition structure feature, comprising:

6. method according to claim 1-5, which is characterized in that the first image of the detection obtains the first figure The textural characteristics of the first object as in, comprising:

Using the first coder processes the first image of deep learning model, obtain characterizing space knot in first object In structure information and profile between texture incidence relation probability distribution.

7. method according to any one of claims 1 to 6, which is characterized in that

The acquisition structure feature, comprising:

The first profile information inputted using the second encoder of deep learning model, obtaining with spatial structural form is that observation becomes It measures and using texture information as the probability of latent variable.

8. a kind of image processing apparatus characterized by comprising

Module is obtained, for obtaining structure feature；

Generation module, in conjunction with the textural characteristics and structure feature, generation to include the second image of the second object.

9. a kind of computer storage medium, the computer storage medium is stored with computer-executable code；The computer After executable code is performed, the method that any one of claim 1 to 7 provides can be realized.

10. a kind of electronic equipment characterized by comprising

Memory, for storing information；

Processor is connect with the memory, the computer executable instructions for being stored on the memory by execution, It can be realized the method that any one of claim 1 to 7 provides.