CN112734632B - Image processing method, device, electronic equipment and readable storage medium - Google Patents
Image processing method, device, electronic equipment and readable storage medium
- Publication number
- CN112734632B (grant); application CN202110009523.2A
- Authority
- CN
- China
- Prior art keywords
- image
- migrated
- initial
- gesture
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Abstract
The invention provides an image processing method and device. The image processing method includes: acquiring an image to be migrated and a reference image, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose; acquiring a first key feature of the target object and a second key feature of the reference object; determining a pose migration matrix according to the first key feature and the second key feature; acquiring an initial image; and determining a target composite image according to the pose migration matrix, the image to be migrated, and the initial image. In the embodiment of the invention, no large number of training samples needs to be collected to train a model to obtain the target composite image, which reduces the complexity of image migration; and because the whole image to be migrated is migrated using the pose migration matrix, the image to be migrated, and the initial image, the details of the image to be migrated are guaranteed to appear in the target composite image and no detail is omitted.
Description
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method, an image processing device, an electronic device, and a readable storage medium.
Background
Pose migration means that one image A is processed so that the person P in image A takes on the pose of the person H in another image B, yielding a composite image C.
Currently, to realize pose migration, a number of images A, images B, and images C are used as training samples to train an image migration model; a new image A and a new image B are then processed by the image migration model to obtain a new composite image C.
In the above pose migration method, a large number of training samples must be prepared to train the image migration model, which makes training cumbersome. Moreover, when the image migration model is used and the clothing and body types of the persons in the two images differ greatly, the person in the composite image C cannot keep the details of the person P in the original image A, and the persons in different composite images C differ greatly in shape under different viewing angles and poses. In addition, it may happen that only part of the human body is migrated and the other body parts must be processed again to complete the migration, which makes the migration process cumbersome.
Disclosure of Invention
In view of the above, the present invention provides an image processing method that, to a certain extent, solves the problems of a cumbersome migration process and incomplete migration.
A first aspect of an embodiment of the present invention provides an image processing method, including:
acquiring an image to be migrated and a reference image, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose;
acquiring a first key feature of the target object and a second key feature of the reference object;
determining a pose migration matrix according to the first key feature and the second key feature;
acquiring an initial image;
and determining a target composite image according to the pose migration matrix, the image to be migrated, and the initial image.
A second aspect of an embodiment of the present invention provides an image processing apparatus, including:
a first acquisition module, configured to acquire an image to be migrated and a reference image, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose;
a second acquisition module, configured to acquire a first key feature of the target object and a second key feature of the reference object;
a first determining module, configured to determine a pose migration matrix according to the first key feature and the second key feature;
a third acquisition module, configured to acquire an initial image;
and a second determining module, configured to determine a target composite image according to the pose migration matrix, the image to be migrated, and the initial image.
A third aspect of an embodiment of the present invention provides an electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, the program or instruction when executed by the processor implementing the steps of the method as described in the first aspect.
A fourth aspect of the embodiments of the present invention provides a readable storage medium having stored thereon a program or instructions which when executed by a processor performs the steps of the method according to the first aspect.
In the embodiment of the invention, an image to be migrated and a reference image are acquired, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose; a first key feature of the target object and a second key feature of the reference object are acquired; a pose migration matrix is determined according to the first key feature and the second key feature; an initial image is acquired; and a target composite image is determined according to the pose migration matrix, the image to be migrated, and the initial image. In the embodiment of the invention, no large number of training samples needs to be collected to train a model to obtain the target composite image, which reduces the complexity of image migration; and because the whole image to be migrated is migrated using the pose migration matrix, the image to be migrated, and the initial image, the details of the image to be migrated are guaranteed to appear in the target composite image and no detail is omitted.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
fig. 1 is a flowchart of steps of an image processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an image processing method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another image processing method according to an embodiment of the present invention;
fig. 4 is a block diagram of an image processing apparatus provided by an embodiment of the present invention;
fig. 5 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Referring to fig. 1, a step flowchart of an image processing method provided by an embodiment of the present invention is shown, where the image processing method specifically includes the following steps:
step 101, obtaining an image to be migrated and a reference image; the image to be migrated comprises the following steps: a target object to be converted into a gesture; the reference image comprises the following steps: a reference object of a reference pose is presented.
The image to be migrated is an m1 × n1 × 3 array, where m1 is the width of the image to be migrated, n1 is its height, and 3 indicates that the image to be migrated is an RGB image. The reference image is an m2 × n2 × 3 array, where m2 is the width of the reference image, n2 is its height, and 3 indicates that the reference image is an RGB image.
In the embodiment of the invention, the target object and the reference object generally refer to human body objects in an image. Referring to fig. 2, image A is the image to be migrated and image B is the reference image; the image A to be migrated includes the target object P, and the reference image B includes the reference object H.
In the embodiment of the invention, the image to be migrated and the reference image can be selected from an image memory as required, or can be captured on the spot; this is not limited here.
In addition, a user can select a video as a reference video; each frame of the reference video is used as a reference image, and the image to be migrated is then processed against each frame of reference image.
Step 102, acquiring a first key feature of the target object and a second key feature of the reference object.
In the embodiment of the invention, the image to be migrated is represented by a dimension vector: x(i·m·n + j·m + k) = x(j, k, i), where 3 ≥ i ≥ 1, n ≥ j ≥ 1, and m ≥ k ≥ 1. The reference image is likewise represented by a dimension vector: y(i·m·n + j·m + k) = y(j, k, i), with the same ranges for i, j, and k.
Specifically, each pixel in the image to be migrated can represent its position by a dimension vector or by coordinates, and likewise for each pixel in the reference image. For example, in an image with 10 rows by 10 columns of pixels, the pixel p in the 5th row and 5th column has coordinates (5, 5); as a one-dimensional vector, the same pixel is denoted p(45).
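The row/column-to-linear-index conversion above can be sketched as follows (1-based indices, matching the document's 10 × 10 example; the function name is illustrative):

```python
def to_linear_index(row, col, width):
    # 1-based (row, col) -> 1-based row-major linear index,
    # matching the document's example: (5, 5) in a 10x10 image -> 45
    return (row - 1) * width + col

print(to_linear_index(5, 5, 10))  # -> 45
```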
In the embodiment of the invention, the first key features are the coordinates of a number of feature points that can mark the pose of the target object. For example, the first key features may be the coordinates of the joints of the target object, including the shoulder joint, elbow joint, wrist joint, carpometacarpal joint, hip joint, knee joint, ankle joint, and so on. The first key features may also be the coordinates of the main parts of the human body: the parts characterizing the head pose include the eyes, nose tip, temples, and chin tip; the parts characterizing the arm pose include the shoulder, elbow, and carpometacarpal joints; the parts characterizing the hand pose include the joints and tips of each finger; and the parts characterizing the leg pose include the hip joint, knee joint, and ankle joint.
In the embodiment of the present invention, the first key feature is a preset key feature in the target object; the second key features are in one-to-one correspondence with the first key features.
Specifically, once the first key features are determined, the second key features can be obtained correspondingly. For example, if the first key features include the coordinates of the shoulder joint, elbow joint, wrist joint, carpometacarpal joint, hip joint, knee joint, and ankle joint of the target object in the image to be migrated, the second key features include the coordinates of the same joints of the reference object in the reference image.
In the embodiment of the present invention, if the image to be migrated contains only a face, that is, only the face pose is migrated, the first key features are set to the coordinates of the feature points of the human face, for example the eyes, nose, eyebrows, ears, and mouth.
In the embodiment of the invention, the user can select the part to be migrated as needed; for example, to migrate only the face, only the first key features of the face are selected, and to migrate only the body, only the first key features of the body are selected.
Step 103, determining a pose migration matrix according to the first key features and the second key features.
In an embodiment of the present invention, step 103 includes: determining the coordinate values of each first key feature and the coordinate values of each second key feature;
and determining the pose migration matrix according to the coordinate values of the first key features and the coordinate values of the second key features, where the pose migration matrix converts the coordinate values of each first key feature into the coordinate values of the corresponding second key feature.
In the embodiment of the invention, the pose migration matrix is the matrix required to migrate the coordinates of the first key features to the coordinates of the second key features. For example, suppose the first key features include the coordinates of the temple, (a, b), and of the shoulder joint, (c, d), and the second key features include the corresponding coordinates of the temple, (m, n), and of the shoulder joint, (o, p). The migration then maps (a, b) to (m, n), (c, d) to (o, p), and so on for the elbow joint and the remaining key features. When the first key features include a plurality of points (3 or more), the pose migration matrix W can be obtained from these corresponding coordinate pairs.
In the embodiment of the present invention, denoting the coordinates of the first key features by Px and the coordinates of the second key features by Py, W = W[Px, Py] is the pose migration matrix required to transform the first key features into the second key features.
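As a sketch of how W[Px, Py] might be obtained from three or more corresponding keypoint pairs, the following estimates a 2 × 3 affine transform by least squares. The patent does not fix the exact parameterization of W, so the affine form and the function name are assumptions for illustration:

```python
import numpy as np

def estimate_pose_matrix(px, py):
    """Least-squares 2x3 affine transform mapping source keypoints px
    onto reference keypoints py (both (N, 2) arrays, N >= 3).
    A hypothetical construction of W[Px, Py]."""
    px = np.asarray(px, dtype=float)
    py = np.asarray(py, dtype=float)
    ones = np.ones((px.shape[0], 1))
    src = np.hstack([px, ones])            # homogeneous coords [x, y, 1], shape (N, 3)
    # Solve src @ W.T ~= py in the least-squares sense
    w, *_ = np.linalg.lstsq(src, py, rcond=None)
    return w.T                             # shape (2, 3)
```

Applying the returned W to a homogeneous pixel coordinate [x, y, 1] then gives its migrated position; for a pure translation of the keypoints, W reduces to [[1, 0, tx], [0, 1, ty]].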
After the pose migration matrix W is determined, each pixel of the image to be migrated can be migrated using W.
Step 104, an initial image is acquired.
In the embodiment of the present invention, the initial image is the image that must be input so that the subsequent step can obtain the target composite image in the preset mode.
In an embodiment of the present invention, step 104 includes: inputting the pose migration matrix and the image to be migrated into an initial network model to obtain the initial image.
In an embodiment of the present invention, the initial network model may be trained on data samples that include pose migration matrix samples converting image samples to be migrated into reference image samples, the image samples to be migrated, and target composite image samples; the initial network model is obtained by training with these data samples. Inputting the pose migration matrix and the image to be migrated into the trained initial network model yields the initial image, which is an initial composite of the target object in the image to be migrated under the pose of the reference object. However, details of this initial composite are missing, and it cannot display all the characteristics of the image to be migrated; after the subsequent steps are executed, the details of the image to be migrated are supplemented completely.
In addition, the working principle of the initial network model can also be Z0 = W·x, where Z0 is the dimension vector of the initial image, W is the pose migration matrix, and x is the dimension vector of the image to be migrated. The initial image obtained in this way has some characteristics of the image to be migrated, but it is not clear, and not all pixels of the image to be migrated are migrated. Using this initial image as the basis of the subsequent calculation improves the migration quality of the image to be migrated.
Optionally, step 104 includes: taking a preset image whose dimension vector is zero as the initial image. The preset image can be stored in a memory and called when the image to be migrated is processed.
In the embodiment of the invention, the dimension vector corresponding to the initial image can be assigned zero for the subsequent calculation.
Step 105, determining a target composite image according to the pose migration matrix, the image to be migrated, and the initial image.
In an embodiment of the present invention, step 105 includes: obtaining an intermediate composite image according to a preset mode, the pose migration matrix, the image to be migrated, and the initial image; then taking the intermediate composite image as the new initial image and executing the preset mode cyclically, a preset number of times, with the pose migration matrix, the image to be migrated, and the new initial image.
In the embodiment of the invention, F(Z, Px, Py) denotes the dimension vector of the target composite image obtained by migrating the target object in the image to be migrated from its own pose to the pose of the reference object. What is required is that min Σ‖F[z, Px, Py] − x‖ approaches 0, in which case all details of the image to be migrated appear in the target composite image; here x is the dimension vector of the image to be migrated. The minimization of Σ‖F[z, Px, Py] − x‖ is solved as follows:
1) Optimizing min Σ‖F[z, Px, Py] − x‖ leads to the least-squares problem min ‖W[Px, Py]·z − x‖².
2) Let A = (W[Px, Py])ᵀ·W[Px, Py] and b = (W[Px, Py])ᵀ·x; modeling this as an inverse problem reduces it to solving the linear system A·z = b.
3) Set the solution accuracy e = 0.0000001, and initialize r_0 = b − A·Z_0 and p_0 = r_0. While ‖r_k‖ is larger than e, iterate: α_k = (r_kᵀ·r_k)/(p_kᵀ·A·p_k); Z_{k+1} = Z_k + α_k·p_k; r_{k+1} = r_k − α_k·A·p_k; β_k = (r_{k+1}ᵀ·r_{k+1})/(r_kᵀ·r_k); p_{k+1} = r_{k+1} + β_k·p_k.
4) Arranging the formulas gives Z_{k+1} = f(b, A, Z_k); it can be seen that the target composite image Z_{k+1} depends on the pose migration matrix W, the image to be migrated x, and the initial image Z_k.
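The iterative procedure of steps 1) to 4) can be sketched in NumPy as follows; variable names mirror the document's A, b, r, p, and the function name is illustrative:

```python
import numpy as np

def solve_pose_transfer(W, x, z0, eps=1e-7, max_iter=1000):
    """Solve (W^T W) z = W^T x by conjugate gradient, as in
    steps 1)-4); a minimal sketch, not the patent's exact code."""
    A = W.T @ W
    b = W.T @ x
    z = np.asarray(z0, dtype=float).copy()
    r = b - A @ z                        # r0 = b - A.Z0
    p = r.copy()                         # p0 = r0
    for _ in range(max_iter):
        if r @ r <= eps:                 # solution accuracy e
            break
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        z = z + alpha * p                # Z_{k+1} = Z_k + alpha_k.p_k
        r_new = r - alpha * Ap           # updated residual
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p             # next search direction
        r = r_new
    return z
```

For a diagonal W this recovers z = W⁻¹·x, the exact minimizer of ‖W·z − x‖².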
In the embodiment of the present invention, following steps 1) to 4) above, the preset mode is specifically:
Z_{k+1} = Z_k + α_k·p_k;
where Z_{k+1} is the first dimension vector, that of the intermediate composite image; Z_k is the second dimension vector, that of the initial image; W is the pose migration matrix; α_k = (r_kᵀ·r_k)/(p_kᵀ·A·p_k); r_k = r_{k−1} − α_{k−1}·A·p_{k−1}; p_k = r_k + β_{k−1}·p_{k−1}; β_{k−1} = (r_kᵀ·r_k)/(r_{k−1}ᵀ·r_{k−1}); with A = WᵀW, b = Wᵀ·x, r_0 = b − A·Z_0, and p_0 = r_0. Here x is the third dimension vector, that of the image to be migrated, and Z_0 is the fourth dimension vector, that of the initial image obtained from the initial network model.
In the embodiment of the present invention, the preset mode refers to applying the above formula Z_{k+1} = Z_k + α_k·p_k to obtain the target composite image.
In the embodiment of the invention, the pose migration matrix W is the matrix W[Px, Py] described above.
Specifically, taking the initial image Z_0 obtained above as an example: first, the pose migration matrix W, the image to be migrated x, and the initial image Z_0 are substituted into the formula of the preset mode, with r_0 = b − A·Z_0 = Wᵀ·x − WᵀW·Z_0 and p_0 = r_0; then α_0 = (r_0ᵀ·r_0)/(p_0ᵀ·A·p_0), and Z_1 = Z_0 + α_0·p_0. Here Z_1 is the dimension vector of the intermediate composite image obtained by substituting W, x, and Z_0 into the formula of the preset mode for the first time, and Z_0 is the dimension vector of the acquired initial image.
The intermediate composite image Z_1 obtained above is then taken as the new initial image, and W, x, and Z_1 are substituted into the formula of the preset mode a second time: Z_2 = Z_1 + α_1·p_1, where α_1 = (r_1ᵀ·r_1)/(p_1ᵀ·A·p_1); r_1 = r_0 − α_0·A·p_0; p_1 = r_1 + β_0·p_0; β_0 = (r_1ᵀ·r_1)/(r_0ᵀ·r_0); with A = WᵀW, b = Wᵀ·x, and r_0 = b − A·Z_0. This yields Z_2.
In the embodiment of the present invention, the dimension vector of the image may be a one-dimensional vector, a two-dimensional vector or a three-dimensional vector, which is not limited herein.
In the embodiment of the invention, the preset number of times is greater than or equal to 2; when the preset number of times is 2, the final target composite image is Z_2. If the details of the final target composite image Z_2 are still not clear enough, the cycle can be continued: the intermediate composite image is taken as the new initial image, and the preset mode is executed again with the pose migration matrix, the image to be migrated, and the initial image, until the target composite image satisfies the user.
In an embodiment of the present invention, the method uses multiple frames of reference images, and the reference images carry a temporal sequence. After step 105, the method further includes: arranging the multiple frames of target composite images according to the temporal sequence to obtain a target composite video.
In the embodiment of the invention, the method further includes inputting the target composite image into a complement model to obtain a final composite image. The complement model completes the missing parts of the target composite image; for example, when the image to be migrated input by the user lacks a human face or part of the limbs, the complement model completes those missing parts.
Specifically, the complement model can be obtained by training with a large number of images as training samples; for example, back-side photographs (without a face), photographs without legs, and photographs without arms, together with the corresponding photographs of the complete body, are used as training samples to train the complement model.
The multiple frames of reference images with a temporal sequence form the reference video. The user may click to upload the reference video, which includes multiple frames of reference images with a corresponding temporal sequence; the server or the electronic device sequentially executes steps 101-105 on the image to be migrated and each frame of the reference video, finally obtaining multiple frames of target composite images, which are arranged according to the temporal sequence to obtain the final target composite video.
In an embodiment of the present invention, the method further includes: identifying each frame of the reference video and selecting the images that include a human body object as reference images, while the images that do not contain a human body object are used as transition images; finally, the multiple frames of target composite images and the transition images are arranged according to the temporal sequence to obtain the target composite video.
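The assembly of migrated frames and transition frames described above can be sketched as follows; the (timestamp, image) pairing and the function name are assumptions for illustration:

```python
def assemble_target_video(frames):
    """frames: list of (timestamp, image) pairs, mixing migrated
    target composite images and pass-through transition images.
    Returns the images ordered by their temporal sequence."""
    return [img for _, img in sorted(frames, key=lambda pair: pair[0])]
```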
The reference video may include a dance or any other motion; this is not limited here.
In an embodiment of the present invention, step 105 includes: extracting the target object from the image to be migrated; determining a composite object according to the pose migration matrix, the target object, and the initial image; and combining the background of the reference image with the composite object to obtain the target composite image.
In this case, only the target object in the image to be migrated is migrated; its background is not. The human body object obtained after the target object in the image to be migrated is migrated is the composite object, which is then combined with the background of the reference image. Referring to fig. 3, after the target object in the image A to be migrated is migrated, the background of the reference image B is adopted to obtain the target composite image C.
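The combination of the composite object with the reference image's background can be illustrated with a simple binary-mask paste; the mask source and the function name are assumptions, not the patent's exact compositing procedure:

```python
import numpy as np

def composite_on_background(synth_object, mask, background):
    """Paste the migrated person (synth_object) onto the reference
    image's background wherever mask is True.
    synth_object, background: (H, W, 3) uint8 arrays; mask: (H, W) bool."""
    out = background.copy()
    out[mask] = synth_object[mask]  # keep background pixels outside the mask
    return out
```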
In the embodiment of the present invention, referring to fig. 2, the whole image a to be migrated may also be migrated to obtain the target composite image C.
In the embodiment of the invention, an image to be migrated and a reference image are acquired, where the image to be migrated includes a target object whose pose is to be converted, and the reference image includes a reference object presenting a reference pose; a first key feature of the target object and a second key feature of the reference object are acquired; a pose migration matrix is determined according to the first key feature and the second key feature; an initial image is acquired; and a target composite image is determined according to the pose migration matrix, the image to be migrated, and the initial image. In the embodiment of the invention, no large number of training samples needs to be collected to train a model to obtain the target composite image, which reduces the complexity of image migration; and because the whole image to be migrated is migrated using the pose migration matrix, the image to be migrated, and the initial image, the details of the image to be migrated are guaranteed to appear in the target composite image and no detail is omitted.
Fig. 4 is a block diagram of an image processing apparatus according to an embodiment of the present invention; as shown in the figure, the apparatus may include:
the first acquisition module is used for acquiring an image to be migrated and a reference image; the image to be migrated comprises: a target object whose pose is to be converted; the reference image comprises: a reference object presenting a reference pose;
the second acquisition module is used for acquiring the first key features of the target object and the second key features of the reference object;
the first determining module is used for determining a gesture migration matrix according to the first key features and the second key features;
the third acquisition module is used for acquiring an initial image;
and the second determining module is used for determining a target synthetic image according to the gesture migration matrix, the image to be migrated and the initial image.
The image processing apparatus provided by the embodiment of the invention has corresponding functional modules for executing the image processing method; it can execute the image processing method provided by the embodiment of the invention and achieve the same beneficial effects.
In still another embodiment of the present invention, an electronic device is also provided, which may include: a processor, a memory, and a computer program stored in the memory and executable on the processor. When executing the program, the processor implements the processes of the image processing method embodiments above and achieves the same technical effects, which are not repeated here. As illustrated in fig. 5, the electronic device may specifically include: a processor 301, a storage device 302, a display screen 303 with a touch function, an input device 304, an output device 305, and a communication device 306. The number of processors 301 in the electronic device may be one or more; one processor 301 is taken as an example in fig. 5. The processor 301, the storage device 302, the display screen 303, the input device 304, the output device 305, and the communication device 306 of the electronic device may be connected by a bus or in other ways.
In yet another embodiment of the present invention, a computer readable storage medium is provided, in which instructions are stored, which when run on a computer, cause the computer to perform the image processing method according to any one of the above embodiments.
In a further embodiment of the present invention, a computer program product comprising instructions which, when run on a computer, cause the computer to perform the image processing method according to any of the above embodiments is also provided.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between these entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; identical and similar parts of the embodiments may be referred to mutually, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively simply, since they are substantially similar to the method embodiments; for relevant parts, reference may be made to the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.
Claims (10)
1. An image processing method, the method comprising:
acquiring an image to be migrated and a reference image; the image to be migrated comprises: a target object whose pose is to be converted; the reference image comprises: a reference object presenting a reference pose;
acquiring a first key feature of the target object and a second key feature of the reference object;
determining a gesture migration matrix according to the first key features and the second key features;
acquiring an initial image;
determining a target synthetic image according to the gesture migration matrix, the image to be migrated and the initial image;
wherein the determining a target synthetic image according to the gesture migration matrix, the image to be migrated, and the initial image includes:
obtaining an intermediate composite image according to a preset mode, the gesture migration matrix, the image to be migrated and the initial image;
taking the intermediate composite image as a new initial image, and cyclically executing, a preset number of times, the step of obtaining an intermediate composite image according to the preset mode, the gesture migration matrix, the image to be migrated and the initial image, so as to obtain the target synthetic image;
the preset mode is as follows:
Z_{k+1} = Z_k + α_k P_k;

wherein Z_{k+1} is a first dimension vector of the intermediate composite image; Z_k is a second dimension vector of the initial image; r_k = r_{k-1} - α_{k-1} A P_{k-1}; α_{k-1} = (r_{k-1}^T r_{k-1}) / (P_{k-1}^T A P_{k-1});
P_k = r_k + β_{k-1} P_{k-1}; β_{k-1} = (r_k^T r_k) / (r_{k-1}^T r_{k-1}); wherein A = W^T W; P_0 = r_0; r_0 = b - A Z_0; b = W^T x; wherein W is the gesture migration matrix, x is a third dimension vector of the image to be migrated, and Z_0 is a fourth dimension vector of the initial image obtained from an initial network model.
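For illustration only (this sketch is not part of the claims): the iteration in claim 1, with A = W^T W, b = W^T x, P_0 = r_0 and r_0 = b - A Z_0, matches the conjugate gradient method for the normal equations of W Z ≈ x. A minimal NumPy sketch, assuming W is available as a dense matrix and `migrate` is a hypothetical name:

```python
import numpy as np

def migrate(W, x, Z0, iterations=50):
    """Refine Z toward the least-squares solution of W Z ~= x,
    i.e. solve A Z = b with A = W^T W and b = W^T x,
    by conjugate gradient iteration."""
    A = W.T @ W
    b = W.T @ x
    Z = Z0.astype(float)
    r = b - A @ Z          # r_0 = b - A Z_0
    P = r.copy()           # P_0 = r_0
    for _ in range(iterations):
        rr = r @ r
        if rr < 1e-12:     # residual vanished: converged
            break
        alpha = rr / (P @ (A @ P))
        Z = Z + alpha * P              # Z_{k+1} = Z_k + alpha_k P_k
        r = r - alpha * (A @ P)        # residual update
        beta = (r @ r) / rr
        P = r + beta * P               # P_{k+1} = r_{k+1} + beta_k P_k
    return Z

# Small worked example: W maps a 2-vector Z to a 3-vector, x is the target.
W = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
x = np.array([4.0, 9.0, 3.0])
Z = migrate(W, x, np.zeros(2))
print(np.allclose(W.T @ W @ Z, W.T @ x))  # True
```

In exact arithmetic, conjugate gradient converges in at most n iterations for an n-dimensional Z, which is why a fixed preset number of cycles suffices in practice.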
2. The method of claim 1, wherein the acquiring the initial image comprises:
and inputting the gesture migration matrix and the image to be migrated into an initial network model to obtain an initial image.
3. The method of claim 1, wherein the acquiring the initial image comprises:
taking a preset image whose dimension vector is a zero vector as the initial image.
4. The method of claim 1, wherein the first key feature is a preset key feature in the target object; the second key features are in one-to-one correspondence with the first key features.
5. The method of claim 4, wherein determining a pose migration matrix from the first key feature and the second key feature comprises:
determining coordinate values of each first key feature and coordinate values of each second key feature;
and determining the gesture migration matrix according to the coordinate values of the first key feature and the coordinate values of the second key feature, wherein the gesture migration matrix is used for converting the coordinate values of the first key feature into the coordinate values of the second key feature corresponding to the first key feature.
6. The method according to claim 5, wherein the reference image comprises multiple frames, and the multiple frames of reference images follow a temporal sequence;
the determining the target synthetic image according to the gesture migration matrix, the image to be migrated and the initial image further includes:
and arranging a plurality of frames of target synthesized images according to the time sequence to obtain a target synthesized video.
7. The method of claim 1, wherein the determining a target composite image from the pose migration matrix, the image to be migrated, and the initial image comprises:
extracting a target object in the image to be migrated;
determining a synthetic object according to the gesture migration matrix, the target object and the initial image;
and synthesizing the background of the reference image and the synthetic object to obtain the target synthetic image.
8. An image processing apparatus, characterized in that the apparatus comprises:
the first acquisition module is used for acquiring an image to be migrated and a reference image; the image to be migrated comprises: a target object whose pose is to be converted; the reference image comprises: a reference object presenting a reference pose;
the second acquisition module is used for acquiring the first key features of the target object and the second key features of the reference object;
the first determining module is used for determining a gesture migration matrix according to the first key features and the second key features;
the third acquisition module is used for acquiring an initial image;
the second determining module is used for determining a target synthetic image according to the gesture migration matrix, the image to be migrated and the initial image;
wherein the second determining module further comprises:
the first obtaining module is used for obtaining an intermediate composite image according to a preset mode, the gesture migration matrix, the image to be migrated and the initial image;
the second obtaining module is used for taking the intermediate composite image as a new initial image, and cyclically executing, a preset number of times, the step of obtaining an intermediate composite image according to the preset mode, the gesture migration matrix, the image to be migrated and the initial image, so as to obtain the target synthetic image;
the preset mode is as follows:
Z_{k+1} = Z_k + α_k P_k;

wherein Z_{k+1} is a first dimension vector of the intermediate composite image; Z_k is a second dimension vector of the initial image; r_k = r_{k-1} - α_{k-1} A P_{k-1}; α_{k-1} = (r_{k-1}^T r_{k-1}) / (P_{k-1}^T A P_{k-1}); P_k = r_k + β_{k-1} P_{k-1}; β_{k-1} = (r_k^T r_k) / (r_{k-1}^T r_{k-1}); wherein A = W^T W; P_0 = r_0; r_0 = b - A Z_0; b = W^T x; wherein W is the gesture migration matrix, x is a third dimension vector of the image to be migrated, and Z_0 is a fourth dimension vector of the initial image obtained from an initial network model.
9. An electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method of any of claims 1-7.
10. A readable storage medium, characterized in that it stores thereon a program or instructions, which when executed by a processor, implement the steps of the method according to any of claims 1-7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110009523.2A CN112734632B (en) | 2021-01-05 | 2021-01-05 | Image processing method, device, electronic equipment and readable storage medium |
PCT/CN2022/070336 WO2022148379A1 (en) | 2021-01-05 | 2022-01-05 | Image processing method and apparatus, electronic device, and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112734632A CN112734632A (en) | 2021-04-30 |
CN112734632B true CN112734632B (en) | 2024-02-27 |
Family
ID=75591231
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110009523.2A Active CN112734632B (en) | 2021-01-05 | 2021-01-05 | Image processing method, device, electronic equipment and readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112734632B (en) |
WO (1) | WO2022148379A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734632B (en) * | 2021-01-05 | 2024-02-27 | 百果园技术(新加坡)有限公司 | Image processing method, device, electronic equipment and readable storage medium |
CN115423752B (en) * | 2022-08-03 | 2023-07-07 | 荣耀终端有限公司 | Image processing method, electronic equipment and readable storage medium |
CN115713582B (en) * | 2022-12-02 | 2023-10-27 | 北京百度网讯科技有限公司 | Avatar generation method, device, electronic equipment and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109670474A (en) * | 2018-12-28 | 2019-04-23 | 广东工业大学 | A kind of estimation method of human posture based on video, device and equipment |
CN110705625A (en) * | 2019-09-26 | 2020-01-17 | 北京奇艺世纪科技有限公司 | Image processing method and device, electronic equipment and storage medium |
CN111626218A (en) * | 2020-05-28 | 2020-09-04 | 腾讯科技(深圳)有限公司 | Image generation method, device and equipment based on artificial intelligence and storage medium |
CN112115783A (en) * | 2020-08-12 | 2020-12-22 | 中国科学院大学 | Human face characteristic point detection method, device and equipment based on deep knowledge migration |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109960986A (en) * | 2017-12-25 | 2019-07-02 | 北京市商汤科技开发有限公司 | Human face posture analysis method, device, equipment, storage medium and program |
US11030772B2 (en) * | 2019-06-03 | 2021-06-08 | Microsoft Technology Licensing, Llc | Pose synthesis |
CN111027438B (en) * | 2019-12-03 | 2023-06-02 | Oppo广东移动通信有限公司 | Human body posture migration method, mobile terminal and computer storage medium |
CN111652798B (en) * | 2020-05-26 | 2023-09-29 | 浙江大华技术股份有限公司 | Face pose migration method and computer storage medium |
CN112734632B (en) * | 2021-01-05 | 2024-02-27 | 百果园技术(新加坡)有限公司 | Image processing method, device, electronic equipment and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112734632A (en) | 2021-04-30 |
WO2022148379A1 (en) | 2022-07-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112734632B (en) | Image processing method, device, electronic equipment and readable storage medium | |
US11741629B2 (en) | Controlling display of model derived from captured image | |
TWI742690B (en) | Method and apparatus for detecting a human body, computer device, and storage medium | |
JP7015152B2 (en) | Processing equipment, methods and programs related to key point data | |
CN110992454A (en) | Real-time motion capture and three-dimensional animation generation method and device based on deep learning | |
CN104794722A (en) | Dressed human body three-dimensional bare body model calculation method through single Kinect | |
CN109960962B (en) | Image recognition method and device, electronic equipment and readable storage medium | |
US10970849B2 (en) | Pose estimation and body tracking using an artificial neural network | |
CN110660076A (en) | Face exchange method | |
KR20230004837A (en) | Generative nonlinear human shape model | |
JP2017037424A (en) | Learning device, recognition device, learning program and recognition program | |
US11430168B2 (en) | Method and apparatus for rigging 3D scanned human models | |
CN114022645A (en) | Action driving method, device, equipment and storage medium of virtual teacher system | |
WO2022197024A1 (en) | Point-based modeling of human clothing | |
WO2022060229A1 (en) | Systems and methods for generating a skull surface for computer animation | |
CN114638744B (en) | Human body posture migration method and device | |
CN112102451A (en) | Common camera-based wearable virtual live broadcast method and equipment | |
CN113051973A (en) | Method and device for posture correction and electronic equipment | |
US11508121B2 (en) | Method for annotating points on a hand image to create training dataset for machine learning | |
JP7251003B2 (en) | Face mesh deformation with fine wrinkles | |
CN114821791A (en) | Method and system for capturing three-dimensional motion information of image | |
CN114973396B (en) | Image processing method, image processing device, terminal equipment and computer readable storage medium | |
CN113658319B (en) | Gesture migration method and device between heterogeneous frameworks | |
US12033281B2 (en) | Automatic blending of human facial expression and full-body poses for dynamic digital human model creation using integrated photo-video volumetric capture system and mesh-tracking | |
CN118230359A (en) | Human body posture estimation method, device and system and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||