CN115205964A - Image processing method, apparatus, medium, and device for pose prediction

Info

Publication number
CN115205964A
Authority
CN
China
Prior art keywords
image
target
limb
key points
coordinate
Prior art date
Legal status
Pending
Application number
CN202210692661.XA
Other languages
Chinese (zh)
Inventor
柳阳 (Liu Yang)
Current Assignee
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Bank Co Ltd filed Critical Ping An Bank Co Ltd
Priority to CN202210692661.XA
Publication of CN115205964A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present application provides an image processing method, an image processing apparatus, a storage medium, and an electronic device for pose prediction. The method includes the following steps: acquiring a first image and a second image containing the same photographic subject; identifying limb key points of the photographic subject in the first image and the second image; generating target posture data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image; and adjusting the posture of the photographic subject in the first image and/or the second image according to the target posture data to obtain a target image. The present application can predict postures that were not captured during a snapshot from the postures of the photographic subject in existing images and generate corresponding images, thereby improving the snapshot effect.

Description

Image processing method, apparatus, medium, and device for pose prediction
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus, a storage medium, and an electronic device for pose prediction.
Background
With the development of science and technology, the photographing function has become standard on many devices, and memorable moments can be recorded at any time with a portable photographing device.
However, photographing devices have limited capturing capability, and in some rapidly changing shooting scenes good photographing opportunities are often missed, so the resulting photos are not ideal.
Disclosure of Invention
The embodiments of the present application provide an image processing method and apparatus, a storage medium, and an electronic device for pose prediction, which can improve the snapshot effect.
An embodiment of the present application provides an image processing method for pose prediction, which includes the following steps:
acquiring a first image and a second image containing the same photographic subject;
identifying limb key points of the photographic subject in the first image and the second image;
generating target posture data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image;
and adjusting the posture of the photographic subject in the first image and/or the second image according to the target posture data to obtain a target image.
An embodiment of the present application further provides an image processing apparatus for pose prediction, including:
an acquisition module, configured to acquire a first image and a second image containing the same photographic subject;
a recognition module, configured to identify limb key points of the photographic subject in the first image and the second image;
a generation module, configured to generate target posture data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image;
and an adjustment module, configured to adjust the posture of the photographic subject in the first image and/or the second image according to the target posture data to obtain a target image.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps in the above image processing method for pose prediction.
An embodiment of the present application further provides an electronic device including a processor and a memory, where the memory stores a computer program, and the processor executes the computer program to implement the steps in the above image processing method for pose prediction.
In the embodiments of the present application, a first image and a second image containing the same photographic subject are first acquired; limb key points of the photographic subject in the first image and the second image are then identified; target posture data of the photographic subject is generated according to those limb key points; and the posture of the photographic subject in the first image and/or the second image is then adjusted according to the target posture data to obtain a target image. In this way, postures not captured during a snapshot can be predicted from the postures of the photographic subject in the existing images, and corresponding images can be generated, thereby improving the snapshot effect.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a scene schematic diagram of an image processing method for pose prediction according to an embodiment of the present application.
Fig. 2 is a first flowchart of an image processing method for pose prediction according to an embodiment of the present application.
Fig. 3 is a first schematic diagram of an image processing method for pose prediction according to an embodiment of the present application.
Fig. 4 is a second schematic diagram of an image processing method for pose prediction according to an embodiment of the present application.
Fig. 5 is a second flowchart of an image processing method for pose prediction according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of an image processing apparatus for pose prediction according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of a first electronic device according to an embodiment of the present application.
Fig. 8 is a schematic structural diagram of a second electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. It should be apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art without inventive effort on the basis of the embodiments of this application fall within the scope of protection of the present application.
The terms "first," "second," "third," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the objects so described are interchangeable under appropriate circumstances. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, or apparatus, terminal, system comprising a list of steps is not necessarily limited to those steps or modules or elements expressly listed, and may include other steps or modules or elements not expressly listed, or inherent to such process, method, apparatus, terminal, or system.
Photographing devices have gradually become an important part of people's daily lives, and with the development of science and technology, the image processing capability of various photographing devices keeps increasing. However, owing to the characteristics of photographing devices, neither photography nor videography can, like the human eye, capture every detail and the picture at every moment. Even a video is composed of many continuously shot pictures that merely create the visual effect of continuous motion.
Even a camera with a higher frame rate still misses pictures. The more complicated the processing, the longer it takes, and shortening the shooting interval also places a heavy computational burden on the device.
This greatly limits the snapshot capability of the device. When shooting a moving subject, a good-looking posture can never be fully guaranteed. For example, when photographing a person jumping, the person's posture differs at every moment of the jump, while the actually captured pictures cover only a few frames, so the best-looking posture may be missed. It can happen that the frames immediately before and after the best-looking posture are captured, yet the best-looking posture itself is not, resulting in an unsatisfactory snapshot.
On this basis, an embodiment of the present application provides an image processing method for pose prediction. The execution subject of the method may be the image processing apparatus for pose prediction provided in the embodiments of the present application, or an electronic device in which that apparatus is integrated. The electronic device may be a terminal device such as a smartphone, a tablet, or a personal computer, or may be a server. A detailed analysis follows.
Referring to fig. 1, fig. 1 is a scene schematic diagram of an image processing method for pose prediction according to an embodiment of the present application. The server 100 integrates an image processing apparatus for pose prediction, which executes the image processing method for pose prediction provided by the embodiments of the present application: first, a first image and a second image containing the same photographic subject are acquired; limb key points of the photographic subject in the two images are then identified; target posture data of the photographic subject is generated according to those limb key points; and the posture of the photographic subject in the first image and/or the second image is then adjusted according to the target posture data to obtain a target image. In this way, postures that were not photographed can be predicted from the postures of the photographic subject in the existing images, and corresponding images can be generated and provided to the user. The user can then choose the best-looking snapshot from the newly generated image together with the original images, thereby improving the snapshot effect.
Referring to fig. 2, fig. 2 is a first flowchart illustrating an image processing method for pose prediction according to an embodiment of the present disclosure. The image processing method for pose prediction may include:
110. A first image and a second image containing the same photographic subject are acquired.
The photographic subject may be a human or an animal, for example, a banking staff member or a customer handling banking business.
Referring to fig. 3, fig. 3 is a first schematic diagram of an image processing method for pose prediction according to an embodiment of the present application. The first image and the second image may be adjacent photographs obtained while photographing the subject, for example, two adjacent frames selected from the frame sequence of a video of the subject captured in real time.
In one embodiment, a camera may film the work area of banking staff to monitor compliance with daily codes of conduct; two adjacent frames are then selected from the captured video as the first image and the second image, both containing the same staff member. Because the two images are adjacent frames, the staff member's postures in them may differ, but not by much.
120. The limb key points of the photographic subject in the first image and the second image are identified.
With continued reference to fig. 3, for the acquired first and second images containing the same photographic subject, the limb key points of the subject in both images are identified first. The limb key points include pivot points, inflection points, connection points, and the like of the key limb parts.
In one embodiment, key point recognition may be performed by a pre-trained limb recognition model. Beforehand, a sample set can be prepared that contains a number of image samples annotated with limb key points; the limb recognition model is trained on this sample set, and training stops once the model's recognition accuracy reaches a preset threshold, yielding the trained limb recognition model.
The first image and the second image are input into the trained limb recognition model, which performs limb recognition on both images to obtain the limb key points of the photographic subject in the first image and the second image.
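The patent does not name a concrete key point detector; as an illustration only, the following Python sketch uses the off-the-shelf MediaPipe Pose model as a stand-in for the trained limb recognition model and returns pixel coordinates of the detected limb key points:

```python
import cv2
import mediapipe as mp

def detect_limb_keypoints(image_bgr):
    """Return a list of (x, y) pixel coordinates of limb key points.

    MediaPipe Pose is used purely as a stand-in for the patent's own
    trained limb recognition model (an assumption, not the patent's model).
    """
    with mp.solutions.pose.Pose(static_image_mode=True) as pose:
        results = pose.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks is None:
        return None  # no subject detected in this image
    h, w = image_bgr.shape[:2]
    # landmarks are normalized to [0, 1]; scale them to pixel coordinates
    return [(lm.x * w, lm.y * h) for lm in results.pose_landmarks.landmark]
```

Because the same detector is run on both images and always emits its key points in a fixed order, the one-to-one correspondence between the key points of the two images, relied on in the later steps, comes for free.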
130. Target posture data of the photographic subject is generated according to the limb key points of the photographic subject in the first image and the second image.
After the limb key points of the photographic subject in the first image and the second image are obtained, target posture data of the photographic subject can be generated from them.
The position of each limb key point in an image (the first image or the second image) indicates the position of the corresponding limb of the photographic subject, and hence the subject's posture when that image was taken. The difference in key point positions between the first image and the second image reflects how the subject's posture changed between the two shots, and the trend of that change, including the swinging of limbs, translation of limb positions, and the like.
The target posture data indicates the posture of the photographic subject desired by the user. For example, to capture the subject in mid-jump, suppose the first image captures the crouched, not-yet-airborne posture and the second captures the posture at the highest point of the jump, while the user desires the posture in between. Posture data of that intermediate posture must be generated from the postures in the first image and the second image, so that a target image containing the desired posture can be produced.
Referring to fig. 3, in an embodiment, generating the target posture data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image in step 130 may include:
(1) Acquiring a target timing of the target image to be generated, where the target timing indicates the temporal order of the target image to be generated relative to the first image and the second image.
The target image to be generated refers to the expected image that is to be generated from the first image and the second image; it has not yet been generated, hence "to be generated." The target timing indicates the temporal order of the target image relative to the first image and the second image, and the user may input it through interaction. Assuming the first image temporally precedes the second image, the target timing of the target image to be generated may be: before the first image, between the first image and the second image, or after the second image.
For example, like video editing software, a timeline of a video sequence including a first image and a second image is displayed, and a user interactively decides at which position of the timeline a target image is to be generated.
The target image to be generated and its target timing indicate where the posture the user wants to generate (the desired posture) falls on the timeline on which the first and second images lie, and thus decide how the target image containing the desired posture is generated. Different target timings call for different ways of generating the target image from the first image and the second image.
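One hedged way to make the three timing options concrete is to map each onto a linear interpolation parameter t along the segment between corresponding key points. The specific values below are illustrative assumptions; the patent itself only spells out plain averaging for the in-between case:

```python
def timing_to_t(target_timing):
    """Map a target timing onto an interpolation parameter t.

    t = 0 reproduces the first image's posture and t = 1 the second's;
    t inside (0, 1) interpolates between them, and t outside [0, 1]
    extrapolates. The concrete values here are illustrative assumptions.
    """
    return {
        "before_first": -0.5,  # extrapolate backwards in time
        "between": 0.5,        # the averaging case described in the text
        "after_second": 1.5,   # extrapolate forwards in time
    }[target_timing]
```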
(2) Generating the target posture data of the photographic subject according to the target timing and the limb key points of the photographic subject in the first image and the second image.
After the target timing is acquired, the target posture data of the photographic subject is generated with the target timing as the reference: the limb key points of the subject in the first image and the second image are processed on the basis of the target timing, yielding target posture data consistent with that timing.
In an embodiment, the step of generating the target posture data of the photographic subject according to the target timing and the limb key points of the photographic subject in the first image and the second image may include:
(2.1) Determining the limb parts of the photographic subject in the first image and the second image according to the identified limb key points.
The limb key points identified in the first image and the second image reflect the orientation of each limb, its position, the angles formed between limbs, and the like. Therefore, the limb parts of the photographic subject in the first image and the second image can be determined from the identified limb key points.
Specifically, when determining the limb parts of the subject in the first image, the limb key points in the first image can be grouped according to their positions and according to which of them are inflection points, connection points, and so on, thereby determining the limb parts of the subject in the first image. The limb parts of the subject in the second image are determined on the same principle, which is not repeated here.
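As a minimal sketch of this grouping, the key points can be partitioned with a fixed index table; the part names and indices below are hypothetical and depend entirely on which key point model is used (they happen to follow MediaPipe Pose's landmark layout):

```python
# Hypothetical index layout following MediaPipe Pose's 33-landmark scheme;
# a different limb recognition model would use different indices.
LIMB_PARTS = {
    "left_arm":  [11, 13, 15],  # shoulder (connection), elbow (inflection), wrist
    "right_arm": [12, 14, 16],
    "left_leg":  [23, 25, 27],  # hip (connection), knee (inflection), ankle
    "right_leg": [24, 26, 28],
}

def split_into_limb_parts(keypoints):
    """Group a flat key point list into limb parts by index."""
    return {part: [keypoints[i] for i in idxs]
            for part, idxs in LIMB_PARTS.items()}
```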
(2.2) For each limb part of the photographic subject, generating target limb posture data of the limb part according to the target timing and the limb key points of the limb part in the first image and the second image.
Since the first image and the second image contain the same photographic subject and their shooting times are close together, the two images contain the same limb parts of the subject.
After the limb parts in the first image and the second image have been divided, for each limb part of the subject, the corresponding regions and limb key points of that part can be located in both images. Target limb posture data for the part can then be generated from the target timing and the part's limb key points in the first image and the second image.
In an embodiment, the step of generating the target limb posture data of the limb part according to the target timing and the limb key points of the limb part in the first image and the second image may include:
(2.21) Determining the correspondence between the limb part's key points in the first image and its key points in the second image.
Taking any limb part of the subject as an example: the part has a corresponding region in each of the first image and the second image, and both regions contain the corresponding limb key points. The part's key points in the first image are in correspondence with its key points in the second image, and the correspondence may be one-to-one. In practice, this correspondence can be obtained and recorded at the same time as the limb key points of the two images are identified.
(2.22) For each group of corresponding limb key points of the first image and the second image, generating a target key point according to the target timing, thereby obtaining a plurality of target key points of the limb part.
Corresponding limb key points in the first image and the second image may be called a group of corresponding limb key points. For each such group, one target key point can be generated with the target timing as the reference.
For example, suppose a first limb key point in the first image and a second limb key point in the second image form a group of corresponding limb key points; that group yields one target key point. When generating the target key point from the first limb key point and the second limb key point, step (2.22) may specifically be:
acquiring a first coordinate of the first limb key point in the first image;
acquiring a second coordinate of the second limb key point in the second image;
and calculating the first coordinate and the second coordinate according to the target timing to obtain a target coordinate of the target key point to be generated, and generating the target key point according to the target coordinate.
How the first coordinate and the second coordinate are calculated is determined by the target timing.
Referring to fig. 3 and 4 together, in one embodiment the target timing of the target image to be generated lies between the first image and the second image, i.e., the desired posture is at the intermediate moment between the postures shown in the two images. In this case, calculating the target coordinate of the target key point from the first coordinate and the second coordinate according to the target timing may include:
calculating the average value of the first coordinate and the second coordinate to obtain the average result of the first coordinate and the second coordinate;
and taking the average result as the target coordinate of the target key point to be generated.
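A minimal sketch of this coordinate calculation follows; with t = 0.5 it reduces to the plain averaging described above, while other values of t are an assumption extending the same formula to timings before or after the two images:

```python
def target_keypoint(first_coord, second_coord, t=0.5):
    """Linearly combine two corresponding key point coordinates.

    first_coord / second_coord: (x, y) of a group of corresponding limb
    key points in the first and second image. t = 0.5 is the averaging
    case described in the text; other values of t are an assumption.
    """
    x1, y1 = first_coord
    x2, y2 = second_coord
    return ((1 - t) * x1 + t * x2, (1 - t) * y1 + t * y2)

# Usage: a wrist key point seen at (100, 240) and (140, 200) yields the
# in-between target coordinate (120.0, 220.0) with the default t = 0.5.
```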
(2.23) Generating target limb posture data of the limb part from the plurality of target key points of the limb part.
After the plurality of target key points of a limb part have been obtained, the target limb posture data of that part can be generated from them. The target limb posture data describes the posture, position, and angle of the part in the desired posture, and may specifically include the part's length, position, and angles with other limb parts in the target image to be generated; it is used to produce the final target posture data.
(2.3) Generating the target posture data of the photographic subject according to the target limb posture data of each limb part.
In the embodiments of the present application, target limb posture data can be obtained for every limb part in the above manner, and the target posture data of the photographic subject is then generated from the target limb posture data of all the parts. The target posture data combines the per-part data so that the position, displayed length, and angles formed by each limb part satisfy that part's target limb posture data. The target posture data is then used to generate the target image containing the desired posture.
140. The posture of the photographic subject in the first image and/or the second image is adjusted according to the target posture data to obtain a target image.
After the target posture data is obtained, the posture of the photographic subject in the first image and/or the second image can be adjusted according to it to obtain the target image.
In one embodiment, adjusting the posture of the photographic subject in the first image and/or the second image according to the target posture data in step 140 may include:
determining first posture data of the photographic subject in the first image according to the limb key points identified in the first image;
determining second posture data of the photographic subject in the second image according to the limb key points identified in the second image;
generating posture adjustment information of the photographic subject according to the first posture data, the second posture data, and the target posture data;
and adjusting the posture of the photographic subject in the first image and/or the second image according to the posture adjustment information to obtain the target image.
The first image alone may be adjusted, the second image alone may be adjusted, or both may be adjusted, to obtain the target image.
In one embodiment, only one of the two images is adjusted: the posture of the subject is adjusted on the basis of the first image or the second image to obtain a target image containing a posture that matches the target posture data.
Optionally, the first posture data of the subject may be determined from the limb key points identified in the first image, or the second posture data from those identified in the second image, and the posture adjustment information of the subject is then generated from the first or second posture data combined with the target posture data.
In another embodiment, both images are adjusted. For example, the postures in the first image and the second image are each adjusted to the desired posture conforming to the target posture data, and the two adjusted images are then fused to obtain the target image.
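The patent does not specify how the image pixels are actually moved to realize the posture adjustment. As one plausible realization, offered only as a sketch, the code below warps a single image with OpenCV's thin-plate-spline shape transformer (from opencv-contrib-python) so that the detected key points land on the target key points:

```python
import cv2
import numpy as np

def warp_to_target_posture(image, src_keypoints, dst_keypoints):
    """Warp `image` so that src_keypoints move onto dst_keypoints.

    The thin-plate-spline warp is an assumption standing in for the
    patent's unspecified adjustment step; it requires opencv-contrib.
    """
    src = np.asarray(src_keypoints, dtype=np.float32).reshape(1, -1, 2)
    dst = np.asarray(dst_keypoints, dtype=np.float32).reshape(1, -1, 2)
    matches = [cv2.DMatch(i, i, 0.0) for i in range(src.shape[1])]
    tps = cv2.createThinPlateSplineShapeTransformer()
    # warpImage uses backward mapping, so the target shape is passed first
    tps.estimateTransformation(dst, src, matches)
    return tps.warpImage(image)
```

A production system would additionally need to inpaint regions the warp uncovers and, when both images are adjusted, blend the two warped results; those steps are omitted here.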
The embodiment of the application also provides an image processing method for attitude prediction. Referring to fig. 5, fig. 5 is a second flowchart illustrating an image processing method for pose prediction according to an embodiment of the present disclosure. The image processing method for pose prediction may include:
201. A first image and a second image containing the same photographic subject are acquired.
The photographic subject may be a human or an animal, for example, a banking staff member or a customer handling banking business.
Referring to fig. 3, fig. 3 is a first schematic diagram of an image processing method for pose prediction according to an embodiment of the present application. The first image and the second image may be adjacent photographs obtained while photographing the subject, for example, two adjacent frames selected from the frame sequence of a video of the subject captured in real time.
In one embodiment, a camera may film the work area of banking staff to monitor compliance with daily codes of conduct; two adjacent frames are then selected from the captured video as the first image and the second image, both containing the same staff member. Because the two images are adjacent frames, the staff member's postures in them may differ, but not by much.
202. The limb key points of the photographic subject in the first image and the second image are identified.
With continued reference to fig. 3, for the acquired first and second images containing the same photographic subject, the limb key points of the subject in both images are identified first. The limb key points include pivot points, inflection points, connection points, and the like of the key limb parts.
In one embodiment, key point recognition may be performed by a pre-trained limb recognition model. Beforehand, a sample set containing image samples annotated with limb key points can be prepared; the model is trained on this sample set until its recognition accuracy reaches a preset threshold, yielding the trained limb recognition model.
The first image and the second image are input into the trained limb recognition model, which performs limb recognition on both images to obtain the limb key points of the photographic subject in the first image and the second image.
203. A target timing of the target image to be generated is acquired, where the target timing indicates the temporal order of the target image to be generated relative to the first image and the second image.
After the limb key points of the photographic subject in the first image and the second image are obtained, target posture data of the photographic subject can be generated from them.
The position of each limb key point in an image (the first image or the second image) indicates the position of the corresponding limb of the photographic subject, and hence the subject's posture when that image was taken. The difference in key point positions between the first image and the second image reflects how the subject's posture changed between the two shots, and the trend of that change, including the swinging of limbs, translation of limb positions, and the like.
The target posture data indicates the posture of the photographic subject desired by the user. For example, to capture the subject in mid-jump, suppose the first image captures the crouched, not-yet-airborne posture and the second captures the posture at the highest point of the jump, while the user desires the posture in between. Posture data of that intermediate posture must be generated from the postures in the first image and the second image, so that a target image containing the desired posture can be produced.
The target image to be generated refers to the expected image that is to be generated from the first image and the second image; it has not yet been generated, hence "to be generated." The target timing indicates the temporal order of the target image relative to the first image and the second image, and the user may input it through interaction. Assuming the first image temporally precedes the second image, the target timing of the target image to be generated may be: before the first image, between the first image and the second image, or after the second image.
For example, like video editing software, a timeline of a video sequence including a first image and a second image is displayed, and a user interactively decides at which position of the timeline a target image is to be generated.
The target image to be generated and its target timing indicate where the posture the user wants to generate (the desired posture) falls on the timeline on which the first and second images lie, and thus decide how the target image containing the desired posture is generated. Different target timings call for different ways of generating the target image from the first image and the second image.
204. The limb parts of the photographic subject in the first image and the second image are determined according to the identified limb key points.
Referring to fig. 3, after the target timing is acquired, the target posture data of the photographic subject is generated with the target timing as the reference: the limb key points of the subject in the first image and the second image are processed on the basis of the target timing, yielding target posture data consistent with that timing.
The limb key points identified in the first image and the second image reflect the orientation of each limb, its position, the angles formed between limbs, and the like. Therefore, the limb parts of the photographic subject in the first image and the second image can be determined from the identified limb key points.
Specifically, when determining the limb parts of the subject in the first image, the limb key points in the first image can be grouped according to their positions and according to which of them are inflection points, connection points, and so on, thereby determining the limb parts of the subject in the first image. The limb parts of the subject in the second image are determined on the same principle, which is not repeated here.
205. For each limb part of the photographic subject, the correspondence between the limb part's key points in the first image and its key points in the second image is determined.
Since the first image and the second image contain the same photographic subject and their shooting times are close together, the two images contain the same limb parts of the subject.
After the limb parts in the first image and the second image have been divided, for each limb part of the subject, the corresponding regions and limb key points of that part can be located in both images. Target limb posture data for the part can then be generated from the target timing and the part's limb key points in the first image and the second image.
Taking any limb part of the subject as an example: the part has a corresponding region in each of the first image and the second image, and both regions contain the corresponding limb key points. The part's key points in the first image are in correspondence with its key points in the second image, and the correspondence may be one-to-one. In practice, this correspondence can be obtained and recorded at the same time as the limb key points of the two images are identified.
206. For each group of corresponding limb key points of the first image and the second image, a target key point is generated according to the target timing, thereby obtaining a plurality of target key points of the limb part.
Corresponding limb key points in the first image and the second image may be called a group of corresponding limb key points. For each such group, one target key point can be generated with the target timing as the reference.
For example, suppose a first limb key point in the first image and a second limb key point in the second image form a group of corresponding limb key points; that group yields one target key point. When generating the target key point from the first limb key point and the second limb key point, step 206 may specifically include:
acquiring a first coordinate of the first limb key point in the first image;
acquiring a second coordinate of the second limb key point in the second image;
and calculating the first coordinate and the second coordinate according to the target timing to obtain a target coordinate of the target key point to be generated, and generating the target key point according to the target coordinate.
How the first coordinate and the second coordinate are calculated is determined by the target timing.
Referring to fig. 3 and 4 together, in one embodiment the target timing of the target image to be generated lies between the first image and the second image, i.e., the desired posture is at the intermediate moment between the postures shown in the two images. In this case, calculating the target coordinate of the target key point from the first coordinate and the second coordinate according to the target timing may include:
calculating the average value of the first coordinate and the second coordinate to obtain the average result of the first coordinate and the second coordinate;
and taking the average result as the target coordinate of the target key point to be generated.
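The averaging above covers the in-between timing. For a target timing before the first image or after the second, one hedged possibility, not spelled out by the patent, is linear extrapolation along the motion of each pair of corresponding key points, as sketched below:

```python
def extrapolated_keypoint(first_coord, second_coord, steps=1.0):
    """Extrapolate a key point `steps` frame intervals past the second
    image, assuming roughly linear limb motion between adjacent frames
    (an assumption, not a rule stated in the patent)."""
    x1, y1 = first_coord
    x2, y2 = second_coord
    dx, dy = x2 - x1, y2 - y1  # displacement over one frame interval
    return (x2 + steps * dx, y2 + steps * dy)
```

This is the same linear combination as the averaging case, just evaluated outside the [0, 1] range: extrapolating one interval forward equals interpolating with t = 2.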
207. Target limb posture data of the limb part is generated according to the plurality of target key points of the limb part.
After the plurality of target key points of a limb part have been obtained, the target limb posture data of that part can be generated from them. The target limb posture data describes the posture, position, and angle of the part in the desired posture, and may specifically include the part's length, position, and angles with other limb parts in the target image to be generated; it is used to produce the final target posture data.
208. The target posture data of the photographic subject is generated according to the target limb posture data of each limb part.
In the embodiments of the present application, target limb posture data can be obtained for every limb part in the above manner, and the target posture data of the photographic subject is then generated from the target limb posture data of all the parts. The target posture data combines the per-part data so that the position, displayed length, and angles formed by each limb part satisfy that part's target limb posture data. The target posture data is then used to generate the target image containing the desired posture.
209. First posture data of the photographic subject in the first image is determined according to the limb key points identified in the first image.
210. Second posture data of the photographic subject in the second image is determined according to the limb key points identified in the second image.
211. Posture adjustment information of the photographic subject is generated according to the first posture data, the second posture data, and the target posture data.
212. The posture of the photographic subject in the first image and/or the second image is adjusted according to the posture adjustment information to obtain the target image.
In one embodiment, only one of the two images is adjusted: the posture of the subject is adjusted on the basis of the first image or the second image to obtain a target image containing a posture that matches the target posture data.
Optionally, the first posture data of the subject may be determined from the limb key points identified in the first image, or the second posture data from those identified in the second image, and the posture adjustment information of the subject is then generated from the first or second posture data combined with the target posture data.
In another embodiment, both images are adjusted. For example, the postures in the first image and the second image are each adjusted to the desired posture conforming to the target posture data, and the two adjusted images are then fused to obtain the target image.
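Putting steps 201 through 212 together, the end-to-end sketch below reuses the illustrative helpers defined earlier (detect_limb_keypoints, target_keypoint, and warp_to_target_posture, all stand-ins rather than the patent's concrete implementations) to produce an in-between snapshot from two adjacent frames:

```python
def generate_intermediate_snapshot(first_img, second_img):
    """Predict the in-between posture of the subject in two adjacent
    frames and render it by adjusting the first image (a sketch only)."""
    kp1 = detect_limb_keypoints(first_img)    # steps 201-202
    kp2 = detect_limb_keypoints(second_img)
    if kp1 is None or kp2 is None:
        raise ValueError("subject not detected in both frames")
    # steps 203-208: in-between timing, so average corresponding key points
    target_kps = [target_keypoint(p1, p2, t=0.5) for p1, p2 in zip(kp1, kp2)]
    # steps 209-212: adjust the first image toward the target posture
    return warp_to_target_posture(first_img, kp1, target_kps)
```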
As can be seen from the above, the image processing method for pose prediction provided by the embodiments of the present application first acquires a first image and a second image containing the same photographic subject; then identifies the limb key points of the subject in the two images; generates target posture data of the subject according to those limb key points; and then adjusts the posture of the subject in the first image and/or the second image according to the target posture data to obtain the target image. In this way, postures not captured during a snapshot can be predicted from the postures of the subject in the existing images and corresponding images can be generated, thereby improving the snapshot effect.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an image processing apparatus 300 for pose prediction according to an embodiment of the present disclosure. The image processing apparatus 300 for pose prediction includes an acquisition module 301, a recognition module 302, a generation module 303, and an adjustment module 304:
an acquisition module 301, configured to acquire a first image and a second image containing the same photographic subject;
a recognition module 302, configured to identify limb key points of the photographic subject in the first image and the second image;
a generation module 303, configured to generate target posture data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image;
and an adjustment module 304, configured to adjust the posture of the photographic subject in the first image and/or the second image according to the target posture data to obtain a target image.
In an embodiment, when generating the target posture data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image, the generation module 303 is specifically configured to:
acquire a target timing of a target image to be generated, where the target timing indicates the temporal order of the target image to be generated relative to the first image and the second image;
and generate the target posture data of the photographic subject according to the target timing and the limb key points of the photographic subject in the first image and the second image.
In an embodiment, when generating the target posture data of the photographic subject according to the target timing and the limb key points of the photographic subject in the first image and the second image, the generation module 303 is specifically configured to:
determine the limb parts of the photographic subject in the first image and the second image according to the identified limb key points;
for each limb part of the photographic subject, generate target limb posture data of the limb part according to the target timing and the limb key points of the limb part in the first image and the second image;
and generate the target posture data of the photographic subject according to the target limb posture data of each limb part.
In an embodiment, when generating the target limb posture data of the limb part according to the target timing and the limb key points of the limb part in the first image and the second image, the generation module 303 is specifically configured to:
determine the correspondence between the limb part's key points in the first image and its key points in the second image;
for each group of corresponding limb key points of the first image and the second image, generate a target key point according to the target timing, thereby obtaining a plurality of target key points of the limb part;
and generate the target limb posture data of the limb part according to the plurality of target key points of the limb part.
In an embodiment, a first limb key point in the first image and a second limb key point in the second image form a group of corresponding limb key points, and when generating one target key point according to the target timing for each group of corresponding limb key points of the first image and the second image, the generation module 303 is specifically configured to:
acquire a first coordinate of the first limb key point in the first image;
acquire a second coordinate of the second limb key point in the second image;
and calculate the first coordinate and the second coordinate according to the target timing to obtain a target coordinate of the target key point to be generated, and generate the target key point according to the target coordinate.
In an embodiment, the target timing of the target image to be generated lies between the first image and the second image, and when calculating the first coordinate and the second coordinate according to the target timing to obtain the target coordinate of the target key point to be generated, the generation module 303 is specifically configured to:
calculate the average of the first coordinate and the second coordinate to obtain the averaged result;
and take the averaged result as the target coordinate of the target key point to be generated.
In an embodiment, when adjusting the posture of the photographic subject in the first image and/or the second image according to the target posture data to obtain the target image, the adjustment module 304 is specifically configured to:
determine first posture data of the photographic subject in the first image according to the limb key points identified in the first image;
determine second posture data of the photographic subject in the second image according to the limb key points identified in the second image;
generate posture adjustment information of the photographic subject according to the first posture data, the second posture data, and the target posture data;
and adjust the posture of the photographic subject in the first image and/or the second image according to the posture adjustment information to obtain the target image.
In one embodiment, the photographic subject comprises a banking staff member.
As can be seen from the above, in the image processing apparatus 300 for pose prediction provided by this embodiment, the acquisition module 301 first acquires a first image and a second image containing the same photographic subject; the recognition module 302 then identifies the limb key points of the subject in the two images; the generation module 303 generates target posture data of the subject according to those limb key points; and the adjustment module 304 adjusts the posture of the subject in the first image and/or the second image according to the target posture data to obtain the target image. In this way, postures not captured during a snapshot can be predicted from the postures of the subject in the existing images and corresponding images can be generated, thereby improving the snapshot effect.
The embodiment of the application also provides an electronic device 400. Referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 400 comprises a processor 401 and a memory 402. The processor 401 is electrically connected to the memory 402.
The processor 401 is the control center of the electronic device 400. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device 400 and processes data by running or loading the computer program stored in the memory 402 and invoking the data stored in the memory 402, thereby monitoring the electronic device 400 as a whole.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and performs data processing by running the computer programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, a computer program required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data created according to the use of the electronic device. Furthermore, the memory 402 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.
In the embodiment of the present application, the memory 402 of the electronic device 400 stores a computer program executable on the processor 401, and the processor 401 implements the following functions by executing it:
acquiring a first image and a second image containing the same photographic subject;
identifying limb key points of the photographic subject in the first image and the second image;
generating target posture data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image;
and adjusting the posture of the photographic subject in the first image and/or the second image according to the target posture data to obtain a target image.
Referring to fig. 8, fig. 8 is a schematic view of a second structure of an electronic device according to an embodiment of the present disclosure. In some implementations, the electronic device 400 can also include: a display 403, radio frequency circuitry 404, audio circuitry 405, and a power supply 406. The display 403, the rf circuit 404, the audio circuit 405, and the power source 406 are electrically connected to the processor 401.
The display screen 403 may be used to display information entered by or provided to the user, as well as various graphical user interfaces that may be composed of graphics, text, icons, video, and any combination thereof. The display screen 403 may include a display panel which, in some embodiments, may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The radio frequency circuit 404 may be used to transmit and receive radio frequency signals so as to establish wireless communication with network devices or other electronic devices, and to exchange signals with them.
The audio circuit 405 may be used to provide an audio interface between the user and the electronic device through a speaker and a microphone.
The power supply 406 may be used to power various components of the electronic device 400. In some embodiments, power supply 406 may be logically coupled to processor 401 via a power management system, such that functions to manage charging, discharging, and power consumption management are performed via the power management system.
Although not shown in fig. 7 and 8, the electronic device 400 may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
The present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the image processing method for pose prediction in any of the above embodiments, for example: acquiring a first image and a second image containing the same photographic subject; identifying limb key points of the photographic subject in the first image and the second image; generating target posture data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image; and adjusting the posture of the photographic subject in the first image and/or the second image according to the target posture data to obtain a target image.
In the embodiment of the present application, the storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should be noted that, as those skilled in the art will understand, all or part of the process of implementing the image processing method for pose prediction in the embodiments of the present application may be completed by controlling related hardware through a computer program. The computer program may be stored in a computer-readable storage medium, such as a memory of an electronic device, and executed by at least one processor in the electronic device, and the execution may include the processes of the embodiments of the image processing method for pose prediction. The computer-readable storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, or the like.
In the image processing apparatus for pose prediction according to the embodiments of the present application, the functional modules may be integrated into one processing chip, each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as a stand-alone product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The term "module" as used herein may refer to a software object that executes on a computing system. The various components, modules, engines, and services described herein may be regarded as objects implemented on the computing system. The apparatus and method described herein are preferably implemented in software, but implementations in hardware are also within the scope of the present application.
The image processing method, apparatus, storage medium, and electronic device for pose prediction provided in the embodiments of the present application are described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method and its core idea. Meanwhile, those skilled in the art may make changes to the specific implementations and the application scope according to the idea of the present application. In summary, the contents of this specification should not be construed as limiting the present application.

Claims (11)

1. An image processing method for pose prediction, comprising:
acquiring a first image and a second image containing the same photographic subject;
identifying limb key points of the photographic subject in the first image and the second image;
generating target pose data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image;
and adjusting the pose of the photographic subject in the first image and/or the second image according to the target pose data to obtain a target image.
2. The image processing method for pose prediction according to claim 1, wherein the generating target pose data of the photographic subject according to the limb key points of the photographic subject in the first image and the second image comprises:
acquiring a target time sequence of a target image to be generated, wherein the target time sequence indicates the temporal order of the target image to be generated relative to the first image and the second image;
and generating target pose data of the photographic subject according to the target time sequence and the limb key points of the photographic subject in the first image and the second image.
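As a concrete illustration, and purely as an assumption about the encoding (the claim does not fix one), the target time sequence can be reduced to a scalar timing parameter t by normalizing the intended capture instant of the target image against the capture instants of the two source images:

def timing_parameter(t_first: float, t_second: float, t_target: float) -> float:
    # t = 0.0 places the target image at the first image, t = 1.0 at the
    # second; 0 < t < 1 places it between them, and values outside that
    # range extrapolate before the first or after the second image.
    return (t_target - t_first) / (t_second - t_first)

# Example: source frames captured at 0.0 s and 1.0 s, target wanted at
# 0.25 s, so timing_parameter(0.0, 1.0, 0.25) == 0.25.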
3. The image processing method for pose prediction according to claim 2, wherein the generating target pose data of the photographic subject according to the target time sequence and the limb key points of the photographic subject in the first image and the second image comprises:
determining limb parts of the photographic subject in the first image and the second image according to the identified limb key points;
for each limb part of the photographic subject, generating target limb pose data of the limb part according to the target time sequence and the limb key points of the limb part in the first image and the second image;
and generating target pose data of the photographic subject according to the target limb pose data of each limb part.
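A sketch of one plausible decomposition into limb parts follows. The COCO-style joint names and the particular grouping are assumptions for illustration, since the claim leaves the decomposition open:

# Hypothetical grouping of detected key points into limb parts.
LIMB_PARTS = {
    "left_arm":  ["left_shoulder", "left_elbow", "left_wrist"],
    "right_arm": ["right_shoulder", "right_elbow", "right_wrist"],
    "left_leg":  ["left_hip", "left_knee", "left_ankle"],
    "right_leg": ["right_hip", "right_knee", "right_ankle"],
    "torso":     ["left_shoulder", "right_shoulder", "left_hip", "right_hip"],
}


def split_into_limb_parts(keypoints):
    """Return only the limb parts whose key points were all detected."""
    return {
        part: {name: keypoints[name] for name in names}
        for part, names in LIMB_PARTS.items()
        if all(name in keypoints for name in names)
    }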
4. The image processing method for pose prediction according to claim 3, wherein the generating target limb pose data of the limb part according to the target time sequence and the limb key points of the limb part in the first image and the second image comprises:
determining a correspondence between the limb key points of the limb part in the first image and the limb key points of the limb part in the second image;
for each group of corresponding limb key points of the first image and the second image, generating a target key point according to the target time sequence, thereby obtaining a plurality of target key points of the limb part;
and generating target limb pose data of the limb part according to the plurality of target key points of the limb part.
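Where key points carry stable names or indices, the correspondence in this claim can be established simply by matching identifiers across the two images; a minimal sketch, under that naming assumption:

def corresponding_pairs(kp1, kp2):
    # Pair up key points detected in both images by their shared name;
    # points visible in only one image are simply dropped here.
    return {name: (kp1[name], kp2[name]) for name in kp1.keys() & kp2.keys()}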
5. The image processing method for pose prediction according to claim 4, wherein a first limb key point in the first image and a second limb key point in the second image form a group of corresponding limb key points, and wherein the generating a target key point according to the target time sequence for each group of corresponding limb key points of the first image and the second image comprises:
acquiring a first coordinate of the first limb key point in the first image;
acquiring a second coordinate of the second limb key point in the second image;
and calculating the first coordinate and the second coordinate according to the target time sequence to obtain a target coordinate of the target key point to be generated, and generating the target key point according to the target coordinate.
6. The image processing method for pose prediction according to claim 5, wherein the target time sequence indicates that the target image to be generated falls between the first image and the second image in time, and the calculating the first coordinate and the second coordinate according to the target time sequence to obtain the target coordinate of the target key point to be generated comprises:
calculating an average of the first coordinate and the second coordinate to obtain an averaged result of the first coordinate and the second coordinate;
and taking the averaged result as the target coordinate of the target key point to be generated.
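Claims 5 and 6, read together, are consistent with linear interpolation of corresponding coordinates, with the plain average as the midpoint case. The general linear form is an assumption here; only the averaging step is spelled out by claim 6:

def target_coordinate(p1, p2, t):
    # Combine the first and second coordinates according to the timing
    # parameter t (linear interpolation, assumed for the general case).
    return ((1 - t) * p1[0] + t * p2[0], (1 - t) * p1[1] + t * p2[1])

# Midpoint case of claim 6: a target image halfway between the two frames
# (t = 0.5) reduces to the plain coordinate average, e.g. (100, 200) and
# (120, 240) give (110.0, 220.0).
assert target_coordinate((100, 200), (120, 240), 0.5) == (110.0, 220.0)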
7. The image processing method for pose prediction according to any one of claims 1 to 6, wherein the adjusting the pose of the photographic subject in the first image and/or the second image according to the target pose data to obtain the target image comprises:
determining first pose data of the photographic subject in the first image according to the limb key points identified in the first image;
determining second pose data of the photographic subject in the second image according to the limb key points identified in the second image;
generating pose adjustment information of the photographic subject according to the first pose data, the second pose data, and the target pose data;
and adjusting the pose of the photographic subject in the first image and/or the second image according to the pose adjustment information to obtain the target image.
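One plausible representation of the pose adjustment information, assumed here since the claim leaves the form open, is a per-key-point displacement from the pose observed in the image being adjusted to the target pose:

def pose_adjustment(observed_kps, target_kps):
    # Displacement each key point must travel from the observed pose to
    # the target pose; an image-warping step would then move the subject's
    # limbs along these vectors to produce the target image.
    return {
        name: (target_kps[name][0] - observed_kps[name][0],
               target_kps[name][1] - observed_kps[name][1])
        for name in observed_kps.keys() & target_kps.keys()
    }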
8. The image processing method for pose prediction according to any one of claims 1 to 6, wherein the photographic subject comprises a bank staff member.
9. An image processing apparatus for pose prediction, comprising:
the device comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring a first image and a second image which contain the same shooting object;
the identification module is used for identifying the limb key points of the shooting object in the first image and the second image;
the generation module is used for generating target posture data of the shooting object according to the limb key points of the shooting object in the first image and the second image;
and the adjusting module is used for adjusting the posture of the shooting object in the first image and/or the second image according to the target posture data to obtain a target image.
10. A computer-readable storage medium, in which a computer program is stored which is executable by a processor to implement the steps in the image processing method for pose prediction according to any one of claims 1 to 8.
11. An electronic device, characterized in that the electronic device comprises a processor and a memory, in which a computer program is stored, which computer program is executed by the processor to implement the steps in the image processing method for pose prediction according to any of claims 1 to 8.
CN202210692661.XA 2022-06-17 2022-06-17 Image processing method, apparatus, medium, and device for pose prediction Pending CN115205964A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210692661.XA CN115205964A (en) 2022-06-17 2022-06-17 Image processing method, apparatus, medium, and device for pose prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210692661.XA CN115205964A (en) 2022-06-17 2022-06-17 Image processing method, apparatus, medium, and device for pose prediction

Publications (1)

Publication Number Publication Date
CN115205964A 2022-10-18

Family

ID=83575442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210692661.XA Pending CN115205964A (en) 2022-06-17 2022-06-17 Image processing method, apparatus, medium, and device for pose prediction

Country Status (1)

Country Link
CN (1) CN115205964A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination