CN112734632B - Image processing method, device, electronic equipment and readable storage medium - Google Patents
Image processing method, device, electronic equipment and readable storage medium
- Publication number
- CN112734632B (grant); application CN202110009523.2A
- Authority
- CN
- China
- Prior art keywords
- image
- migrated
- initial
- gesture
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Abstract
The invention provides an image processing method and device. The image processing method includes: acquiring an image to be migrated and a reference image, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose; acquiring a first key feature of the target object and a second key feature of the reference object; determining a pose migration matrix according to the first key feature and the second key feature; acquiring an initial image; and determining a target composite image according to the pose migration matrix, the image to be migrated, and the initial image. In the embodiment of the invention, no large number of training samples needs to be collected to train a model to obtain the target composite image, which reduces the complexity of image migration; and because the whole image to be migrated is migrated using the pose migration matrix, the image to be migrated, and the initial image, the details of the image to be migrated are guaranteed to appear in the target composite image and no detail is omitted.
Description
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method, an image processing device, an electronic device, and a readable storage medium.
Background
Pose migration means that one image A is processed so that the person P in image A takes on the pose of the person H in another image B, yielding a composite image C.
Currently, to realize pose migration, a number of images A, images B, and images C are used as training samples to train an image migration model; a new image A and a new image B are then processed by the image migration model to obtain a new composite image C.
In the above pose migration method, a large number of training samples must be prepared to train the image migration model, which makes training cumbersome. Moreover, when the image migration model is used and the clothing and body types of the persons in the two images differ greatly, the person in the composite image C cannot keep the details of the person P in the original image A, and the persons in different composite images C differ greatly in shape under different viewing angles and poses. In addition, it may happen that only part of the human body is migrated and the other body parts must be processed again to complete the migration, which makes the migration process cumbersome.
Disclosure of Invention
In view of the above, the present invention provides an image processing method that, to a certain extent, solves the problems of a cumbersome migration process and incomplete migration.
A first aspect of an embodiment of the present invention provides an image processing method, including:
acquiring an image to be migrated and a reference image, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose;
acquiring a first key feature of the target object and a second key feature of the reference object;
determining a pose migration matrix according to the first key feature and the second key feature;
acquiring an initial image;
and determining a target composite image according to the pose migration matrix, the image to be migrated, and the initial image.
A second aspect of an embodiment of the present invention provides an image processing apparatus, including:
a first acquisition module, configured to acquire an image to be migrated and a reference image, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose;
a second acquisition module, configured to acquire a first key feature of the target object and a second key feature of the reference object;
a first determining module, configured to determine a pose migration matrix according to the first key feature and the second key feature;
a third acquisition module, configured to acquire an initial image;
and a second determining module, configured to determine a target composite image according to the pose migration matrix, the image to be migrated, and the initial image.
A third aspect of an embodiment of the present invention provides an electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, the program or instruction when executed by the processor implementing the steps of the method as described in the first aspect.
A fourth aspect of the embodiments of the present invention provides a readable storage medium having stored thereon a program or instructions which when executed by a processor performs the steps of the method according to the first aspect.
In the embodiment of the invention, an image to be migrated and a reference image are acquired, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose; a first key feature of the target object and a second key feature of the reference object are acquired; a pose migration matrix is determined according to the first key feature and the second key feature; an initial image is acquired; and a target composite image is determined according to the pose migration matrix, the image to be migrated, and the initial image. In the embodiment of the invention, no large number of training samples needs to be collected to train a model to obtain the target composite image, which reduces the complexity of image migration; and because the whole image to be migrated is migrated using the pose migration matrix, the image to be migrated, and the initial image, the details of the image to be migrated are guaranteed to appear in the target composite image and no detail is omitted.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
fig. 1 is a flowchart of steps of an image processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an image processing method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another image processing method according to an embodiment of the present invention;
fig. 4 is a block diagram of an image processing apparatus provided by an embodiment of the present invention;
fig. 5 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Referring to fig. 1, a step flowchart of an image processing method provided by an embodiment of the present invention is shown, where the image processing method specifically includes the following steps:
step 101, obtaining an image to be migrated and a reference image; the image to be migrated comprises the following steps: a target object to be converted into a gesture; the reference image comprises the following steps: a reference object of a reference pose is presented.
The image to be migrated is an m1 × n1 × 3 array, where m1 is the width of the image to be migrated, n1 is its height, and 3 indicates that the image to be migrated is an RGB image. The reference image is an m2 × n2 × 3 array, where m2 is the width of the reference image, n2 is its height, and 3 indicates that the reference image is an RGB image.
In the embodiment of the invention, the target object and the reference object generally refer to human body objects in an image. Referring to fig. 2, image A is the image to be migrated and image B is the reference image; the image A to be migrated includes the target object P, and the reference image B includes the reference object H.
In the embodiment of the invention, the image to be migrated and the reference image can be selected from an image memory as required, or can be captured on the spot; this is not limited here.
In addition, a user can select a video as a reference video; each frame of the reference video is used as a reference image, and the image to be migrated is then processed against each frame of reference image.
Step 102, acquiring a first key feature of the target object and a second key feature of the reference object.
In the embodiment of the invention, the image to be migrated is represented by a dimension vector: x(i·m·n + j·m + k) = x(j, k, i), where 3 ≥ i ≥ 1, n ≥ j ≥ 1, and m ≥ k ≥ 1. The reference image is likewise represented by a dimension vector: y(i·m·n + j·m + k) = y(j, k, i), with the same ranges for i, j, and k.
Specifically, each pixel in the image to be migrated can represent its position by a dimension vector or by coordinates, and likewise for each pixel in the reference image. For example, in an image with 10 rows by 10 columns of pixels, the pixel p in the 5th row and 5th column has coordinates (5, 5); as a one-dimensional vector, the same pixel is denoted p(45).
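The row/column-to-linear-index conversion above can be sketched as follows (1-based indices, matching the document's 10 × 10 example; the function name is illustrative):

```python
def to_linear_index(row, col, width):
    # 1-based (row, col) -> 1-based row-major linear index,
    # matching the document's example: (5, 5) in a 10x10 image -> 45
    return (row - 1) * width + col

print(to_linear_index(5, 5, 10))  # -> 45
```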
In the embodiment of the invention, the first key features are the coordinates of a number of feature points that can mark the pose of the target object. For example, the first key features may be the coordinates of the joints of the target object, including the shoulder joint, elbow joint, wrist joint, carpometacarpal joint, hip joint, knee joint, ankle joint, and so on. The first key features may also be the coordinates of the main parts of the human body: the parts characterizing the head pose include the eyes, nose tip, temples, and chin tip; the parts characterizing the arm pose include the shoulder, elbow, and carpometacarpal joints; the parts characterizing the hand pose include the joints and tips of each finger; and the parts characterizing the leg pose include the hip joint, knee joint, and ankle joint.
In the embodiment of the present invention, the first key feature is a preset key feature in the target object; the second key features are in one-to-one correspondence with the first key features.
Specifically, once the first key features are determined, the second key features can be obtained correspondingly. For example, if the first key features include the coordinates of the shoulder joint, elbow joint, wrist joint, carpometacarpal joint, hip joint, knee joint, and ankle joint of the target object in the image to be migrated, the second key features include the coordinates of the same joints of the reference object in the reference image.
In the embodiment of the present invention, if the image to be migrated contains only a face, that is, only the face pose is migrated, the first key features are set to the coordinates of the feature points of the human face, for example the eyes, nose, eyebrows, ears, and mouth.
In the embodiment of the invention, the user can select the part to be migrated as needed; for example, to migrate only the face, only the first key features of the face are selected, and to migrate only the body, only the first key features of the body are selected.
Step 103, determining a pose migration matrix according to the first key features and the second key features.
In an embodiment of the present invention, step 103 includes: determining the coordinate values of each first key feature and the coordinate values of each second key feature;
and determining the pose migration matrix according to the coordinate values of the first key features and the coordinate values of the second key features, where the pose migration matrix converts the coordinate values of each first key feature into the coordinate values of the corresponding second key feature.
In the embodiment of the invention, the pose migration matrix is the matrix required to migrate the coordinates of the first key features to the coordinates of the second key features. For example, suppose the first key features include the coordinates of the temple, (a, b), and of the shoulder joint, (c, d), and the second key features include the corresponding coordinates of the temple, (m, n), and of the shoulder joint, (o, p). The migration then maps (a, b) to (m, n), (c, d) to (o, p), and so on for the elbow joint and the remaining key features. When the first key features include a plurality of points (3 or more), the pose migration matrix W can be obtained from these corresponding coordinate pairs.
In the embodiment of the present invention, denoting the coordinates of the first key features by Px and the coordinates of the second key features by Py, W = W[Px, Py] is the pose migration matrix required to transform the first key features into the second key features.
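As a sketch of how W[Px, Py] might be obtained from three or more corresponding keypoint pairs, the following estimates a 2 × 3 affine transform by least squares. The patent does not fix the exact parameterization of W, so the affine form and the function name are assumptions for illustration:

```python
import numpy as np

def estimate_pose_matrix(px, py):
    """Least-squares 2x3 affine transform mapping source keypoints px
    onto reference keypoints py (both (N, 2) arrays, N >= 3).
    A hypothetical construction of W[Px, Py]."""
    px = np.asarray(px, dtype=float)
    py = np.asarray(py, dtype=float)
    ones = np.ones((px.shape[0], 1))
    src = np.hstack([px, ones])            # homogeneous coords [x, y, 1], shape (N, 3)
    # Solve src @ W.T ~= py in the least-squares sense
    w, *_ = np.linalg.lstsq(src, py, rcond=None)
    return w.T                             # shape (2, 3)
```

Applying the returned W to a homogeneous pixel coordinate [x, y, 1] then gives its migrated position; for a pure translation of the keypoints, W reduces to [[1, 0, tx], [0, 1, ty]].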
After the pose migration matrix W is determined, each pixel of the image to be migrated can be migrated using W.
Step 104, an initial image is acquired.
In the embodiment of the present invention, the initial image is the image that must be input so that the subsequent step can obtain the target composite image in the preset mode.
In an embodiment of the present invention, step 104 includes: inputting the pose migration matrix and the image to be migrated into an initial network model to obtain the initial image.
In an embodiment of the present invention, the initial network model may be trained on data samples that include pose migration matrix samples converting image samples to be migrated into reference image samples, the image samples to be migrated, and target composite image samples; the initial network model is obtained by training with these data samples. Inputting the pose migration matrix and the image to be migrated into the trained initial network model yields the initial image, which is an initial composite of the target object in the image to be migrated under the pose of the reference object. However, details of this initial composite are missing, and it cannot display all the characteristics of the image to be migrated; after the subsequent steps are executed, the details of the image to be migrated are supplemented completely.
In addition, the working principle of the initial network model can also be Z0 = W·x, where Z0 is the dimension vector of the initial image, W is the pose migration matrix, and x is the dimension vector of the image to be migrated. The initial image obtained in this way has some characteristics of the image to be migrated, but it is not clear, and not all pixels of the image to be migrated are migrated. Using this initial image as the basis of the subsequent calculation improves the migration quality of the image to be migrated.
Optionally, step 104 includes: taking a preset image whose dimension vector is zero as the initial image. The preset image can be stored in a memory and called when the image to be migrated is processed.
In the embodiment of the invention, the dimension vector corresponding to the initial image can be assigned zero for the subsequent calculation.
Step 105, determining a target composite image according to the pose migration matrix, the image to be migrated, and the initial image.
In an embodiment of the present invention, step 105 includes: obtaining an intermediate composite image according to a preset mode, the pose migration matrix, the image to be migrated, and the initial image; then taking the intermediate composite image as the new initial image and executing the preset mode cyclically, a preset number of times, with the pose migration matrix, the image to be migrated, and the new initial image.
In the embodiment of the invention, F(Z, Px, Py) denotes the dimension vector of the target composite image obtained by migrating the target object in the image to be migrated from its own pose to the pose of the reference object. What is required is that min Σ‖F[z, Px, Py] − x‖ approaches 0, in which case all details of the image to be migrated appear in the target composite image; here x is the dimension vector of the image to be migrated. The minimization of Σ‖F[z, Px, Py] − x‖ is solved as follows:
1) Optimizing min Σ‖F[z, Px, Py] − x‖ leads to the least-squares problem min ‖W[Px, Py]·z − x‖².
2) Let A = (W[Px, Py])ᵀ·W[Px, Py] and b = (W[Px, Py])ᵀ·x; modeling this as an inverse problem reduces it to solving the linear system A·z = b.
3) Set the solution accuracy e = 0.0000001, and initialize r_0 = b − A·Z_0 and p_0 = r_0. While ‖r_k‖ is larger than e, iterate: α_k = (r_kᵀ·r_k)/(p_kᵀ·A·p_k); Z_{k+1} = Z_k + α_k·p_k; r_{k+1} = r_k − α_k·A·p_k; β_k = (r_{k+1}ᵀ·r_{k+1})/(r_kᵀ·r_k); p_{k+1} = r_{k+1} + β_k·p_k.
4) Arranging the formulas gives Z_{k+1} = f(b, A, Z_k); it can be seen that the target composite image Z_{k+1} depends on the pose migration matrix W, the image to be migrated x, and the initial image Z_k.
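The iterative procedure of steps 1) to 4) can be sketched in NumPy as follows; variable names mirror the document's A, b, r, p, and the function name is illustrative:

```python
import numpy as np

def solve_pose_transfer(W, x, z0, eps=1e-7, max_iter=1000):
    """Solve (W^T W) z = W^T x by conjugate gradient, as in
    steps 1)-4); a minimal sketch, not the patent's exact code."""
    A = W.T @ W
    b = W.T @ x
    z = np.asarray(z0, dtype=float).copy()
    r = b - A @ z                        # r0 = b - A.Z0
    p = r.copy()                         # p0 = r0
    for _ in range(max_iter):
        if r @ r <= eps:                 # solution accuracy e
            break
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        z = z + alpha * p                # Z_{k+1} = Z_k + alpha_k.p_k
        r_new = r - alpha * Ap           # updated residual
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p             # next search direction
        r = r_new
    return z
```

For a diagonal W this recovers z = W⁻¹·x, the exact minimizer of ‖W·z − x‖².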
In the embodiment of the present invention, following steps 1) to 4) above, the preset mode is specifically:
Z_{k+1} = Z_k + α_k·p_k;
where Z_{k+1} is the first dimension vector, that of the intermediate composite image; Z_k is the second dimension vector, that of the initial image; W is the pose migration matrix; α_k = (r_kᵀ·r_k)/(p_kᵀ·A·p_k); r_k = r_{k−1} − α_{k−1}·A·p_{k−1}; p_k = r_k + β_{k−1}·p_{k−1}; β_{k−1} = (r_kᵀ·r_k)/(r_{k−1}ᵀ·r_{k−1}); with A = WᵀW, b = Wᵀ·x, r_0 = b − A·Z_0, and p_0 = r_0. Here x is the third dimension vector, that of the image to be migrated, and Z_0 is the fourth dimension vector, that of the initial image obtained from the initial network model.
In the embodiment of the present invention, the preset mode refers to applying the above formula Z_{k+1} = Z_k + α_k·p_k to obtain the target composite image.
In the embodiment of the invention, the pose migration matrix W is the matrix W[Px, Py] described above.
Specifically, taking the initial image Z_0 obtained above as an example: first, the pose migration matrix W, the image to be migrated x, and the initial image Z_0 are substituted into the formula of the preset mode, with r_0 = b − A·Z_0 = Wᵀ·x − WᵀW·Z_0 and p_0 = r_0; then α_0 = (r_0ᵀ·r_0)/(p_0ᵀ·A·p_0), and Z_1 = Z_0 + α_0·p_0. Here Z_1 is the dimension vector of the intermediate composite image obtained by substituting W, x, and Z_0 into the formula of the preset mode for the first time, and Z_0 is the dimension vector of the acquired initial image.
The intermediate composite image Z_1 obtained above is then taken as the new initial image, and W, x, and Z_1 are substituted into the formula of the preset mode a second time: Z_2 = Z_1 + α_1·p_1, where α_1 = (r_1ᵀ·r_1)/(p_1ᵀ·A·p_1); r_1 = r_0 − α_0·A·p_0; p_1 = r_1 + β_0·p_0; β_0 = (r_1ᵀ·r_1)/(r_0ᵀ·r_0); with A = WᵀW, b = Wᵀ·x, and r_0 = b − A·Z_0. This yields Z_2.
In the embodiment of the present invention, the dimension vector of the image may be a one-dimensional vector, a two-dimensional vector or a three-dimensional vector, which is not limited herein.
In the embodiment of the invention, the preset number of times is greater than or equal to 2; when the preset number of times is 2, the final target composite image is Z_2. If the details of the final target composite image Z_2 are still not clear enough, the cycle can be continued: the intermediate composite image is taken as the new initial image, and the preset mode is executed again with the pose migration matrix, the image to be migrated, and the initial image, until the target composite image satisfies the user.
In an embodiment of the present invention, the method uses multiple frames of reference images, and the reference images carry a temporal sequence. After step 105, the method further includes: arranging the multiple frames of target composite images according to the temporal sequence to obtain a target composite video.
In the embodiment of the invention, the method further includes inputting the target composite image into a complement model to obtain a final composite image. The complement model completes the missing parts of the target composite image; for example, when the image to be migrated input by the user lacks a human face or part of the limbs, the complement model completes those missing parts.
Specifically, the complement model can be obtained by training with a large number of images as training samples; for example, back-side photographs (without a face), photographs without legs, and photographs without arms, together with the corresponding photographs of the complete body, are used as training samples to train the complement model.
The multiple frames of reference images with a temporal sequence form the reference video. The user may click to upload the reference video, which includes multiple frames of reference images with a corresponding temporal sequence; the server or the electronic device sequentially executes steps 101-105 on the image to be migrated and each frame of the reference video, finally obtaining multiple frames of target composite images, which are arranged according to the temporal sequence to obtain the final target composite video.
In an embodiment of the present invention, the method further includes: identifying each frame of the reference video and selecting the images that include a human body object as reference images, while the images that do not contain a human body object are used as transition images; finally, the multiple frames of target composite images and the transition images are arranged according to the temporal sequence to obtain the target composite video.
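The assembly of migrated frames and transition frames described above can be sketched as follows; the (timestamp, image) pairing and the function name are assumptions for illustration:

```python
def assemble_target_video(frames):
    """frames: list of (timestamp, image) pairs, mixing migrated
    target composite images and pass-through transition images.
    Returns the images ordered by their temporal sequence."""
    return [img for _, img in sorted(frames, key=lambda pair: pair[0])]
```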
The reference video may include a dance or any other motion; this is not limited here.
In an embodiment of the present invention, step 105 includes: extracting the target object from the image to be migrated; determining a composite object according to the pose migration matrix, the target object, and the initial image; and combining the background of the reference image with the composite object to obtain the target composite image.
In this case, only the target object in the image to be migrated is migrated; its background is not. The human body object obtained after the target object in the image to be migrated is migrated is the composite object, which is then combined with the background of the reference image. Referring to fig. 3, after the target object in the image A to be migrated is migrated, the background of the reference image B is adopted to obtain the target composite image C.
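The combination of the composite object with the reference image's background can be illustrated with a simple binary-mask paste; the mask source and the function name are assumptions, not the patent's exact compositing procedure:

```python
import numpy as np

def composite_on_background(synth_object, mask, background):
    """Paste the migrated person (synth_object) onto the reference
    image's background wherever mask is True.
    synth_object, background: (H, W, 3) uint8 arrays; mask: (H, W) bool."""
    out = background.copy()
    out[mask] = synth_object[mask]  # keep background pixels outside the mask
    return out
```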
In the embodiment of the present invention, referring to fig. 2, the whole image a to be migrated may also be migrated to obtain the target composite image C.
In the embodiment of the invention, an image to be migrated and a reference image are acquired, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose; a first key feature of the target object and a second key feature of the reference object are acquired; a pose migration matrix is determined according to the first key feature and the second key feature; an initial image is acquired; and a target composite image is determined according to the pose migration matrix, the image to be migrated, and the initial image. In the embodiment of the invention, no large number of training samples needs to be collected to train a model to obtain the target composite image, which reduces the complexity of image migration; and because the whole image to be migrated is migrated using the pose migration matrix, the image to be migrated, and the initial image, the details of the image to be migrated are guaranteed to appear in the target composite image and no detail is omitted.
Fig. 4 is a block diagram of an image processing apparatus according to an embodiment of the present invention; as shown in the figure, the apparatus may include:
the first acquisition module is used for acquiring an image to be migrated and a reference image; the image to be migrated comprises: a target object whose pose is to be converted; the reference image comprises: a reference object presenting a reference pose;
the second acquisition module is used for acquiring the first key features of the target object and the second key features of the reference object;
the first determining module is used for determining a gesture migration matrix according to the first key features and the second key features;
the third acquisition module is used for acquiring an initial image;
and the second determining module is used for determining a target synthetic image according to the gesture migration matrix, the image to be migrated and the initial image.
The image processing apparatus provided by the embodiment of the invention has corresponding functional modules for executing the image processing method; it can execute the image processing method provided by the embodiment of the invention and achieve the same beneficial effects.
In still another embodiment of the present invention, an electronic device is also provided, which may include: a processor, a memory, and a computer program stored in the memory and executable on the processor. When executing the program, the processor implements the processes of the image processing method embodiments above and achieves the same technical effects, which are not repeated here. As illustrated in fig. 5, the electronic device may specifically include: a processor 301, a storage device 302, a display screen 303 with a touch function, an input device 304, an output device 305, and a communication device 306. The number of processors 301 in the electronic device may be one or more; one processor 301 is taken as an example in fig. 5. The processor 301, the storage device 302, the display screen 303, the input device 304, the output device 305, and the communication device 306 of the electronic device may be connected by a bus or in other ways.
In yet another embodiment of the present invention, a computer readable storage medium is provided, in which instructions are stored, which when run on a computer, cause the computer to perform the image processing method according to any one of the above embodiments.
In a further embodiment of the present invention, a computer program product comprising instructions which, when run on a computer, cause the computer to perform the image processing method according to any of the above embodiments is also provided.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between these entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; identical and similar parts of the embodiments may be referred to mutually, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively simply, since they are substantially similar to the method embodiments; for relevant parts, reference may be made to the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.
Claims (10)
1. An image processing method, the method comprising:
acquiring an image to be migrated and a reference image; the image to be migrated comprises: a target object whose pose is to be converted; the reference image comprises: a reference object presenting a reference pose;
acquiring a first key feature of the target object and a second key feature of the reference object;
determining a gesture migration matrix according to the first key features and the second key features;
acquiring an initial image;
determining a target synthetic image according to the gesture migration matrix, the image to be migrated and the initial image;
wherein the determining a target synthetic image according to the gesture migration matrix, the image to be migrated, and the initial image includes:
obtaining an intermediate composite image according to a preset mode, the gesture migration matrix, the image to be migrated and the initial image;
taking the intermediate composite image as a new initial image, and cyclically executing, a preset number of times, the step of obtaining an intermediate composite image according to the preset mode, the gesture migration matrix, the image to be migrated and the initial image, so as to obtain the target synthetic image;
the preset mode is as follows:
Z_{k+1} = Z_k + α_k P_k;

wherein Z_{k+1} is a first dimension vector of the intermediate composite image; Z_k is a second dimension vector of the initial image; r_k = r_{k-1} - α_{k-1} A P_{k-1}; α_{k-1} = (r_{k-1}^T r_{k-1}) / (P_{k-1}^T A P_{k-1});
P_k = r_k + β_{k-1} P_{k-1}; β_{k-1} = (r_k^T r_k) / (r_{k-1}^T r_{k-1}); wherein A = W^T W; P_0 = r_0; r_0 = b - A Z_0; b = W^T x; wherein W is the gesture migration matrix, x is a third dimension vector of the image to be migrated, and Z_0 is a fourth dimension vector of the initial image obtained from an initial network model.
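For illustration only (this sketch is not part of the claims): the iteration in claim 1, with A = W^T W, b = W^T x, P_0 = r_0 and r_0 = b - A Z_0, matches the conjugate gradient method for the normal equations of W Z ≈ x. A minimal NumPy sketch, assuming W is available as a dense matrix and `migrate` is a hypothetical name:

```python
import numpy as np

def migrate(W, x, Z0, iterations=50):
    """Refine Z toward the least-squares solution of W Z ~= x,
    i.e. solve A Z = b with A = W^T W and b = W^T x,
    by conjugate gradient iteration."""
    A = W.T @ W
    b = W.T @ x
    Z = Z0.astype(float)
    r = b - A @ Z          # r_0 = b - A Z_0
    P = r.copy()           # P_0 = r_0
    for _ in range(iterations):
        rr = r @ r
        if rr < 1e-12:     # residual vanished: converged
            break
        alpha = rr / (P @ (A @ P))
        Z = Z + alpha * P              # Z_{k+1} = Z_k + alpha_k P_k
        r = r - alpha * (A @ P)        # residual update
        beta = (r @ r) / rr
        P = r + beta * P               # P_{k+1} = r_{k+1} + beta_k P_k
    return Z

# Small worked example: W maps a 2-vector Z to a 3-vector, x is the target.
W = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
x = np.array([4.0, 9.0, 3.0])
Z = migrate(W, x, np.zeros(2))
print(np.allclose(W.T @ W @ Z, W.T @ x))  # True
```

In exact arithmetic, conjugate gradient converges in at most n iterations for an n-dimensional Z, which is why a fixed preset number of cycles suffices in practice.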
2. The method of claim 1, wherein the acquiring the initial image comprises:
and inputting the gesture migration matrix and the image to be migrated into an initial network model to obtain an initial image.
3. The method of claim 1, wherein the acquiring the initial image comprises:
taking a preset image whose dimension vector is a zero vector as the initial image.
4. The method of claim 1, wherein the first key feature is a preset key feature in the target object; the second key features are in one-to-one correspondence with the first key features.
5. The method of claim 4, wherein determining a pose migration matrix from the first key feature and the second key feature comprises:
determining coordinate values of each first key feature and coordinate values of each second key feature;
and determining the gesture migration matrix according to the coordinate values of the first key feature and the coordinate values of the second key feature, wherein the gesture migration matrix is used for converting the coordinate values of the first key feature into the coordinate values of the second key feature corresponding to the first key feature.
6. The method according to claim 5, wherein the reference image comprises multiple frames, and the multiple frames of reference images follow a temporal sequence;
the determining the target synthetic image according to the gesture migration matrix, the image to be migrated and the initial image further includes:
and arranging a plurality of frames of target synthesized images according to the time sequence to obtain a target synthesized video.
7. The method of claim 1, wherein the determining a target composite image from the pose migration matrix, the image to be migrated, and the initial image comprises:
extracting a target object in the image to be migrated;
determining a synthetic object according to the gesture migration matrix, the target object and the initial image;
and synthesizing the background of the reference image and the synthetic object to obtain the target synthetic image.
8. An image processing apparatus, characterized in that the apparatus comprises:
the first acquisition module is used for acquiring an image to be migrated and a reference image; the image to be migrated comprises: a target object whose pose is to be converted; the reference image comprises: a reference object presenting a reference pose;
the second acquisition module is used for acquiring the first key features of the target object and the second key features of the reference object;
the first determining module is used for determining a gesture migration matrix according to the first key features and the second key features;
the third acquisition module is used for acquiring an initial image;
the second determining module is used for determining a target synthetic image according to the gesture migration matrix, the image to be migrated and the initial image;
wherein the second determining module further comprises:
the first obtaining module is used for obtaining an intermediate composite image according to a preset mode, the gesture migration matrix, the image to be migrated and the initial image;
the second obtaining module is used for taking the intermediate composite image as a new initial image, and cyclically executing, a preset number of times, the step of obtaining an intermediate composite image according to the preset mode, the gesture migration matrix, the image to be migrated and the initial image, so as to obtain the target synthetic image;
the preset mode is as follows:
Z_{k+1} = Z_k + α_k P_k;

wherein Z_{k+1} is a first dimension vector of the intermediate composite image; Z_k is a second dimension vector of the initial image; r_k = r_{k-1} - α_{k-1} A P_{k-1}; α_{k-1} = (r_{k-1}^T r_{k-1}) / (P_{k-1}^T A P_{k-1}); P_k = r_k + β_{k-1} P_{k-1}; β_{k-1} = (r_k^T r_k) / (r_{k-1}^T r_{k-1}); wherein A = W^T W; P_0 = r_0; r_0 = b - A Z_0; b = W^T x; wherein W is the gesture migration matrix, x is a third dimension vector of the image to be migrated, and Z_0 is a fourth dimension vector of the initial image obtained from an initial network model.
9. An electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method of any of claims 1-7.
10. A readable storage medium, characterized in that it stores thereon a program or instructions, which when executed by a processor, implement the steps of the method according to any of claims 1-7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110009523.2A CN112734632B (en) | 2021-01-05 | 2021-01-05 | Image processing method, device, electronic equipment and readable storage medium |
PCT/CN2022/070336 WO2022148379A1 (en) | 2021-01-05 | 2022-01-05 | Image processing method and apparatus, electronic device, and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112734632A CN112734632A (en) | 2021-04-30 |
CN112734632B true CN112734632B (en) | 2024-02-27 |
Family
ID=75591231
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110009523.2A Active CN112734632B (en) | 2021-01-05 | 2021-01-05 | Image processing method, device, electronic equipment and readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112734632B (en) |
WO (1) | WO2022148379A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734632B (en) * | 2021-01-05 | 2024-02-27 | 百果园技术(新加坡)有限公司 | Image processing method, device, electronic equipment and readable storage medium |
CN115423752B (en) * | 2022-08-03 | 2023-07-07 | 荣耀终端有限公司 | Image processing method, electronic equipment and readable storage medium |
CN115713582B (en) * | 2022-12-02 | 2023-10-27 | 北京百度网讯科技有限公司 | Avatar generation method, device, electronic equipment and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109670474A (en) * | 2018-12-28 | 2019-04-23 | 广东工业大学 | A kind of estimation method of human posture based on video, device and equipment |
CN110705625A (en) * | 2019-09-26 | 2020-01-17 | 北京奇艺世纪科技有限公司 | Image processing method and device, electronic equipment and storage medium |
CN111626218A (en) * | 2020-05-28 | 2020-09-04 | 腾讯科技(深圳)有限公司 | Image generation method, device and equipment based on artificial intelligence and storage medium |
CN112115783A (en) * | 2020-08-12 | 2020-12-22 | 中国科学院大学 | Human face characteristic point detection method, device and equipment based on deep knowledge migration |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109960986A (en) * | 2017-12-25 | 2019-07-02 | 北京市商汤科技开发有限公司 | Human face posture analysis method, device, equipment, storage medium and program |
US11030772B2 (en) * | 2019-06-03 | 2021-06-08 | Microsoft Technology Licensing, Llc | Pose synthesis |
CN111027438B (en) * | 2019-12-03 | 2023-06-02 | Oppo广东移动通信有限公司 | Human body posture migration method, mobile terminal and computer storage medium |
CN111652798B (en) * | 2020-05-26 | 2023-09-29 | 浙江大华技术股份有限公司 | Face pose migration method and computer storage medium |
CN112734632B (en) * | 2021-01-05 | 2024-02-27 | 百果园技术(新加坡)有限公司 | Image processing method, device, electronic equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112734632A (en) | 2021-04-30 |
WO2022148379A1 (en) | 2022-07-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112734632B (en) | Image processing method, device, electronic equipment and readable storage medium | |
US11741629B2 (en) | Controlling display of model derived from captured image | |
TWI742690B (en) | Method and apparatus for detecting a human body, computer device, and storage medium | |
JP7015152B2 (en) | Processing equipment, methods and programs related to key point data | |
CN110992454A (en) | Real-time motion capture and three-dimensional animation generation method and device based on deep learning | |
CN104794722A (en) | Dressed human body three-dimensional bare body model calculation method through single Kinect | |
CN109960962B (en) | Image recognition method and device, electronic equipment and readable storage medium | |
US10970849B2 (en) | Pose estimation and body tracking using an artificial neural network | |
CN110660076A (en) | Face exchange method | |
KR20230004837A (en) | Generative nonlinear human shape model | |
JP2017037424A (en) | Learning device, recognition device, learning program and recognition program | |
US11430168B2 (en) | Method and apparatus for rigging 3D scanned human models | |
CN114022645A (en) | Action driving method, device, equipment and storage medium of virtual teacher system | |
WO2022197024A1 (en) | Point-based modeling of human clothing | |
WO2022060229A1 (en) | Systems and methods for generating a skull surface for computer animation | |
CN114638744B (en) | Human body posture migration method and device | |
CN112102451A (en) | Common camera-based wearable virtual live broadcast method and equipment | |
CN113051973A (en) | Method and device for posture correction and electronic equipment | |
US11508121B2 (en) | Method for annotating points on a hand image to create training dataset for machine learning | |
JP7251003B2 (en) | Face mesh deformation with fine wrinkles | |
CN114821791A (en) | Method and system for capturing three-dimensional motion information of image | |
CN114973396B (en) | Image processing method, image processing device, terminal equipment and computer readable storage medium | |
CN113658319B (en) | Gesture migration method and device between heterogeneous frameworks | |
US12033281B2 (en) | Automatic blending of human facial expression and full-body poses for dynamic digital human model creation using integrated photo-video volumetric capture system and mesh-tracking | |
CN118230359A (en) | Human body posture estimation method, device and system and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||