CN111179281A - Human body image extraction method and human body action video extraction method - Google Patents


Publication number
CN111179281A
CN111179281A (application CN201911349143.2A)
Authority
CN
China
Prior art keywords
human body
target
human
image
extraction method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911349143.2A
Other languages
Chinese (zh)
Inventor
王楠
雷欢
马敬奇
陈再励
何峰
卢杏坚
钟震宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Institute of Intelligent Manufacturing
Original Assignee
Guangdong Institute of Intelligent Manufacturing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Institute of Intelligent Manufacturing filed Critical Guangdong Institute of Intelligent Manufacturing
Priority to CN201911349143.2A
Publication of CN111179281A
Legal status: Pending (Current)


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/155Segmentation; Edge detection involving morphological operators
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • G06T2207/10021Stereoscopic video; Stereoscopic image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20152Watershed segmentation

Abstract

The invention discloses a human body image extraction method, which comprises the following steps: acquiring an original input picture; extracting human body information from the original input picture based on a skeleton detection method, wherein the human body information comprises a human body number for each human body and the human body skeleton joint point coordinate information corresponding to each human body; constructing target areas based on the human body skeleton joint point coordinate information of all human bodies; extracting a target picture from the original input picture based on the target areas; and extracting the human body image corresponding to each human body from the target picture based on an image segmentation algorithm. The human body image extraction method has reasonably arranged steps, a high execution speed, low hardware requirements and good practicability. In addition, the invention also provides a human body action video extraction method.

Description

Human body image extraction method and human body action video extraction method
Technical Field
The invention relates to the field of picture processing, in particular to a human body image extraction method and a human body action video extraction method.
Background
Human body segmentation is an important step in applications such as three-dimensional human body modeling, posture estimation, pattern recognition, and detection and tracking, and the quality of the segmentation result directly determines the effect of subsequent work, so research on how to obtain an accurate human body segmentation result has practical significance.
In actual scenes, human body segmentation is affected by many factors such as noise, occlusion, similar colors and complex backgrounds, and an ideal result cannot always be obtained, so obtaining an accurate human body segmentation result in a complex scene remains a very challenging task.
At present, research on and application of human body image segmentation are still in the exploration stage, and the commonly used human body segmentation algorithms can be roughly divided into: graph-based image segmentation, image segmentation based on shallow machine learning, and image segmentation based on deep learning.
However, the existing segmentation methods have the problem that when pedestrian areas overlap, the pedestrians in the overlapping areas cannot be accurately segmented, so the segmentation precision is low.
Disclosure of Invention
In order to overcome the defects of the existing segmentation methods, the invention provides a human body image extraction method and a human body action video extraction method.
Correspondingly, the human body image extraction method comprises the following steps:
acquiring an original input picture;
extracting human body information in the original input picture based on a skeleton detection method, wherein the human body information comprises a human body number for each human body and human body skeleton joint point coordinate information corresponding to each human body;
constructing a target area based on coordinate information of human body skeleton joint points of all human bodies;
extracting a target picture from an original input picture based on the target area;
and extracting a human body image corresponding to each human body in the target picture based on an image segmentation algorithm.
In an optional implementation mode, extracting the coordinate information of human skeleton joint points of all human bodies in the original input picture based on a trained deep convolutional neural network;
the coordinate information of all human body skeleton joint points is Pki={(xki,yki) I ═ 0,1,.. n, k ═ 1,2,.. m } where k represents the human body number and i represents the human skeleton joint point number; n and m are integer values generated based on the original input picture.
In an optional embodiment, the constructing the target region based on the coordinate information of the human skeleton joint points of all the human bodies includes:
let the target region of the kth human body be a rectangular region, denoted as R_k(x, y, w, h), where (x, y) is the coordinate of the lower left corner point of the rectangular region, w is the width of the rectangular region, and h is the height of the rectangular region;
wherein x = x_kmin − b, where x_kmin is the minimum x coordinate in the skeleton joint point coordinate information of the kth human body; y = y_kmin − a, where y_kmin is the minimum y coordinate in the skeleton joint point coordinate information of the kth human body; w = |x_kmax − x_kmin| + 2b, where x_kmax is the maximum x coordinate in the skeleton joint point coordinate information of the kth human body; h = |y_kmax − y_kmin| + 2a, where y_kmax is the maximum y coordinate in the skeleton joint point coordinate information of the kth human body; a and b are empirical values.
In an optional embodiment, the extracting a target picture from an original input picture based on the target region includes:
retaining the pixel points within the target areas of all human bodies in the original input picture, and setting the remaining pixel points to a designated color;
the target picture comprises a plurality of unconnected target blocks, and one target block in the plurality of target blocks comprises one target area or more than two target areas.
In an optional embodiment, extracting the human body image from the target picture based on an image segmentation algorithm includes:
and sequentially extracting a corresponding number of human body images from each of the plurality of target blocks.
In an optional implementation manner, the sequentially extracting a corresponding number of human body images from each of the plurality of target blocks includes:
selecting a target block, and counting the number of target areas in the target block;
if the number of the target areas in the target block is one, extracting a human body image from the target block based on an image segmentation algorithm;
if the number of target areas in the target block is two or more, selecting any two mutually overlapping target areas as a processing object, extracting the two connected human body images from the processing object based on an image segmentation algorithm, segmenting the two connected human body images based on a watershed algorithm, and associating the segmentation results with the human bodies having the corresponding human body numbers, this step being traversed until every combination of overlapping target areas has been processed;
and obtaining a corresponding human body image based on a plurality of segmentation results of each human body.
In an optional embodiment, the image segmentation algorithm is one of a graph-cut algorithm, a grab-cut algorithm and a one-cut algorithm.
Correspondingly, the invention provides a human body action video extraction method, which comprises the following steps:
sequentially extracting each frame of video picture of an original input video based on a time axis;
taking each frame of video picture as an original input picture and executing the human body image extraction method to obtain a human body image corresponding to each human body;
and carrying out video recombination on the human body image corresponding to the human body with the specific human body number in each frame of video picture based on the time axis to obtain the human body action video corresponding to the human body with the specific human body number.
In an optional implementation manner, if a human body image in an extracted frame of the human body action video has an occluded area, the occluded area is completed, through the corresponding human body skeleton joint point coordinate information, based on the human body images of the frames other than the extracted frame.
In an optional embodiment, the completing the occluded region based on the body images of the remaining frames according to the corresponding body skeleton joint coordinate information includes the following steps:
determining a human body part where the shielded area is located in the extracted frame and a head joint point and a tail joint point corresponding to the human body part, and obtaining a target human body part posture vector based on the coordinate information of human body skeleton joint points corresponding to the head joint point and the tail joint point;
calculating a reference target human body part posture vector of the same human body part in the rest frames of the human body action video based on the same head joint point and tail joint point;
respectively carrying out similarity matching on the target human body component posture vector and reference target human body component posture vectors in other frames of the human body action video;
and filling the occluded area in the extraction frame by using the reference target human body part with the highest similarity degree.
The invention provides a human body image extraction method and a human body action video extraction method. The human body image extraction method uses the human body skeleton joint point coordinate information to pre-process the original input picture, which greatly reduces the execution complexity of the subsequent human body image extraction and helps to improve the execution speed and reduce the hardware requirements; overlapping human body images are segmented based on the watershed algorithm, the steps are reasonably arranged, the execution speed is high, and the method has good practicability. The human body action video extraction method built on the human body image extraction method can accurately extract the human body action video corresponding to each human body, and completes occluded areas using the relationship between preceding and following frames so as to further restore the complete appearance of the human body, and also has good practicability.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flow chart of a human body image extraction method according to an embodiment of the invention;
FIG. 2 shows a schematic diagram of an original input picture (excluding human skeletal joint points) of an embodiment of the invention;
FIG. 3 shows a schematic diagram of an original input picture including human skeletal joint information according to an embodiment of the present invention;
FIG. 4 shows a schematic diagram of a target picture of an embodiment of the invention;
FIG. 5 shows a schematic representation of a human body image of an embodiment of the invention;
fig. 6 shows a flowchart of a human motion video extraction method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that, in different embodiments, due to the variation of the device structure, several components named by the same name may have different structures, and for components with different structures but the same name in different embodiments, different numbers may be used for distinguishing between different embodiments.
Fig. 1 shows a flow diagram of a human body image extraction method according to an embodiment of the invention.
The embodiment of the invention provides a human body image extraction method, which comprises the following steps:
s101, acquiring an original input picture;
Fig. 2 shows a schematic diagram of an original input picture (without human skeleton joint points) according to an embodiment of the present invention. The original input picture may be a photographed picture or a single frame captured from a video. An actual image is more complex than the schematic shown in Fig. 2; the drawings of the embodiments of the present invention are for illustration only.
It should be noted that the human body image extraction method according to the embodiment of the present invention is to extract a plurality of human body images from an original input picture.
As can be seen from Fig. 2, the number of human bodies in the original input picture is four. For convenience of subsequent description, the different human bodies are labeled separately, and the purpose of the subsequent steps is to extract four human body images from the original input picture.
S102, extracting coordinate information of human body skeleton joint points from the original input picture based on a skeleton detection method;
specifically, the skeleton detection method is mainly used for confirming the position information and the number information of the human body in the original input picture.
Generally, skeleton detection methods are mainly classified into top-down detection methods and bottom-up detection methods. In the top-down detection method, the whole human body is detected first, each component of the human body is then estimated from the detected whole, and the human body skeleton joint point coordinate information is finally confirmed from the posture of the components. In the bottom-up detection method, the components that make up a human body are detected first, the different components are then associated with the corresponding human body to form a whole, and the human body skeleton joint point coordinate information is finally confirmed from the posture of the whole.
Specifically, the difference between the two approaches mainly lies in the type of detector used; if a bottom-up detection method is adopted, the different components additionally need to be associated with the corresponding human body as a whole. Both of the above methods can be implemented based on a neural network.
In the embodiment of the invention, the coordinate information of the human skeleton joint points of all pedestrians in the original input picture can be extracted based on the trained deep convolutional neural network; specifically, the original input picture is input into a trained deep convolutional neural network, and the trained deep convolutional neural network outputs a series of coordinate points and human body attribution of the coordinate points.
Specifically, the coordinate information of the human skeleton joint points of all pedestrians can generally be expressed as P_ki = {(x_ki, y_ki) | i = 0, 1, …, n; k = 1, 2, …, m}, where k denotes the pedestrian number and i denotes the joint point number; n and m are integer values generated based on the original input picture. In this embodiment, m = 4, and the maximum value of n is 13.
It should be noted that the number of detected joint points may sometimes differ between pedestrians because of occlusion. However, since the human skeleton joint point coordinate information of the embodiment of the present invention is only used for extracting images and does not involve posture analysis or the like, the absence of some joint points merely reflects occlusion in the picture (the occluded part of the human body does not appear in the picture) and has little effect on the extraction.
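As a non-limiting illustrative sketch of the data structure described above, the joint coordinates can be organised per human body number into the mapping P_ki as follows; the detector output format and the function name are assumptions made for illustration, not part of the patent.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]

def build_joint_map(
    raw_detections: List[List[Optional[Point]]]
) -> Dict[int, List[Point]]:
    """Organise per-person detector output into P_ki: a mapping from the human
    body number k (1..m) to the list of detected joint coordinates (x_ki, y_ki).
    Joints the detector could not locate (e.g. occluded ones) are assumed to be
    reported as None and are simply omitted for that body."""
    joint_map: Dict[int, List[Point]] = {}
    for k, keypoints in enumerate(raw_detections, start=1):
        joint_map[k] = [pt for pt in keypoints if pt is not None]
    return joint_map
```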
fig. 3 is a schematic diagram of an original input picture including human skeleton joint point information, and it can be seen from fig. 3 that after this step, a series of human skeleton joint point coordinate information for each individual human body can be obtained in the original input picture.
S103: constructing a target area based on the coordinate information of the human body skeleton joint points;
Specifically, the target area is an area containing a human body image, and the purpose of this step is to preliminarily remove the background of the original input picture that is unrelated to the human body image information, so as to simplify the processing flow and reduce the processing load.
Specifically, since this step is mainly used for preliminary processing, the target region of the kth pedestrian may be set as a rectangular region, denoted as R_k(x, y, w, h), where (x, y) is the coordinate of the lower left corner point of the rectangular region, w is the width of the rectangular region, and h is the height of the rectangular region.
Specifically, x = x_kmin − b, where x_kmin is the minimum x coordinate among the kth pedestrian's skeleton joint points; y = y_kmin − a, where y_kmin is the minimum y coordinate among the kth pedestrian's skeleton joint points; w = |x_kmax − x_kmin| + 2b, where x_kmax is the maximum x coordinate among the kth pedestrian's skeleton joint points; h = |y_kmax − y_kmin| + 2a, where y_kmax is the maximum y coordinate among the kth pedestrian's skeleton joint points; a and b are empirical values preset to ensure the integrity of the intercepted area.
S104: extracting a target picture from an original input picture based on the target area;
fig. 4 shows a schematic diagram of a target picture of an embodiment of the invention.
After the processing of step S103, a plurality of target regions (rectangular regions) containing the human body images is obtained. To reduce the number of background pixels in the image, the pixel points inside the target regions of all pedestrians in the original input picture can be retained, and the remaining pixel points are set to a designated color. In this embodiment, the remaining pixels are set to black.
In a specific implementation, the target picture extracted in this step includes a plurality of target blocks, and a target block contains one target area or a plurality of target areas. In the present embodiment, the target areas corresponding to the human body images of the two human bodies k = 1 and k = 4 each form a separate target block, while the target areas corresponding to the human body images of the two human bodies k = 2 and k = 3 form a single, integral target block.
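A minimal sketch of this masking step and of grouping overlapping rectangles into target blocks is given below; the helper names and the image-coordinate convention (each rectangle spanning rows y to y + h and columns x to x + w of the array) are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

import numpy as np

Rect = Tuple[int, int, int, int]  # (x, y, w, h)

def extract_target_picture(image: np.ndarray, regions: Dict[int, Rect]) -> np.ndarray:
    """Keep the pixels inside every body's target rectangle and set all other
    pixels to the designated colour (black in this embodiment)."""
    img_h, img_w = image.shape[:2]
    mask = np.zeros((img_h, img_w), dtype=bool)
    for (x, y, w, h) in regions.values():
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(img_w, x + w), min(img_h, y + h)
        mask[y0:y1, x0:x1] = True
    target = np.zeros_like(image)          # background formatted to black
    target[mask] = image[mask]
    return target

def group_into_blocks(regions: Dict[int, Rect]) -> List[Set[int]]:
    """Group human body numbers whose rectangles overlap into target blocks
    (naive pairwise merging; adequate for a handful of bodies per picture)."""
    def overlaps(r1: Rect, r2: Rect) -> bool:
        x1, y1, w1, h1 = r1
        x2, y2, w2, h2 = r2
        return not (x1 + w1 <= x2 or x2 + w2 <= x1 or
                    y1 + h1 <= y2 or y2 + h2 <= y1)

    blocks: List[Set[int]] = []
    for k, rect in regions.items():
        touching = [b for b in blocks
                    if any(overlaps(rect, regions[j]) for j in b)]
        for b in touching:
            blocks.remove(b)
        merged: Set[int] = {k}
        for b in touching:
            merged |= b
        blocks.append(merged)
    return blocks
```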
S105: and extracting a human body image in the target picture based on an image segmentation algorithm.
A small number of background pixel points are still retained in the target picture and need to be removed (that is, the human body images need to be extracted). Specifically, extracting the human body images from the target picture essentially means extracting them from each target block; according to the number of target areas contained in a target block, a corresponding number of human body images is extracted from it.
Specifically, in step S102, the coordinate information of the human skeleton joint point of each human body is obtained, and when the human body image is extracted, the extraction is generally performed for a specific region of each human body (for example, for a target region of each human body in the present embodiment).
Specifically, in the target picture, some human bodies have independent target regions (for example, the human bodies with k = 1 and k = 4), while the target regions of other human bodies overlap (for example, the human bodies with k = 2 and k = 3).
When the human body image of the target block only comprising one target area is extracted, the human body image corresponding to the target block can be directly extracted by using an image segmentation algorithm;
When extracting the human body images of a target block comprising two or more target areas, all the human body images are first extracted based on an image segmentation algorithm, and the overlapping parts of the human body images are then segmented based on a watershed segmentation algorithm to obtain the corresponding number of human body images.
Specifically, the image segmentation algorithm may be graph-cut, grab-cut, one-cut or another image segmentation algorithm. The watershed segmentation algorithm is an image region segmentation method; during segmentation, the similarity between adjacent pixels is used as an important reference, so that pixel points that are spatially close and have similar gray values (computed via gradients) are connected to each other to form a closed contour, and the overlapping human body images are thereby segmented.
Specifically, in the target picture, overlapping human bodies appear as one human body occluding a partial region of another: the human body in the foreground (the occluding component) is continuous in the target picture, while the human body in the background (the occluded component) is cut apart by the occluding component, so the human bodies in the overlapping region can be quickly separated and assigned using the watershed segmentation algorithm.
Fig. 5 shows a schematic representation of the human body images. Specifically, referring to Fig. 5 for k = 2 and k = 3, the overlapping region of the two human body images is the position framed in white, and analyzing this overlapping target region with the watershed segmentation algorithm allows it to be accurately extracted and assigned.
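The following sketch illustrates one possible way to realise this two-stage separation for a target block containing two overlapping bodies: a grab-cut step (one of the segmentation algorithms named above) extracts the combined foreground, and a marker-based watershed seeded with each body's skeleton joint points assigns the overlapping pixels. The OpenCV calls, the seeding strategy and the assumption of an 8-bit BGR input image are illustrative choices, not the patent's prescribed implementation.

```python
import cv2
import numpy as np

def split_overlapping_pair(block_img, rect_a, rect_b, joints_a, joints_b):
    """Separate the two connected human bodies inside one target block.
    Step 1: grab-cut over the union of the two target rectangles extracts the
    joint foreground.  Step 2: a marker-based watershed, seeded with the two
    bodies' joint points, assigns each foreground pixel to body A or body B."""
    img_h, img_w = block_img.shape[:2]

    # Union of the two rectangles (x, y, w, h), clipped to the image.
    x0 = max(0, min(rect_a[0], rect_b[0]))
    y0 = max(0, min(rect_a[1], rect_b[1]))
    x1 = min(img_w, max(rect_a[0] + rect_a[2], rect_b[0] + rect_b[2]))
    y1 = min(img_h, max(rect_a[1] + rect_a[3], rect_b[1] + rect_b[3]))

    # 1. Foreground extraction (grab-cut initialised with the union rectangle).
    gc_mask = np.zeros((img_h, img_w), np.uint8)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(block_img, gc_mask, (x0, y0, x1 - x0, y1 - y0),
                bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    foreground = np.isin(gc_mask, (cv2.GC_FGD, cv2.GC_PR_FGD))

    # 2. Marker-based watershed: label 1 = background, 2 = body A, 3 = body B.
    markers = np.zeros((img_h, img_w), np.int32)
    markers[~foreground] = 1

    def seed(points, label):
        for (px, py) in points:
            r, c = int(round(py)), int(round(px))
            markers[max(0, r - 2):r + 3, max(0, c - 2):c + 3] = label

    seed(joints_a, 2)
    seed(joints_b, 3)
    cv2.watershed(block_img, markers)

    mask_a = (markers == 2) & foreground
    mask_b = (markers == 3) & foreground
    return mask_a, mask_b
```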
Fig. 6 shows a flowchart of a human motion video extraction method according to an embodiment of the present invention.
Correspondingly, the embodiment of the invention also provides a human body action video extraction method, which comprises the following steps:
s201, sequentially extracting each frame of video picture of an original input video based on a time axis;
decomposing an original input video into a plurality of frames of video pictures based on a time axis;
s202, taking each frame of video picture as an original input picture and executing the human body image extraction method to obtain a human body image corresponding to each human body;
The frames of the video are each processed with the human body image extraction method described above to obtain, for each frame, a picture in which only the human body images are retained.
S203: and carrying out video recombination on the human body image corresponding to the human body with the specific human body number in each frame of video picture based on the time axis to obtain the human body action video corresponding to the human body with the specific human body number.
The human body number of the human body to be studied is determined, the corresponding human body images are extracted from the plurality of processed frames using that number, re-ordered along the time axis, and recombined into a human body action video for the corresponding human body number.
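Steps S201 to S203 can be sketched end to end as follows; the callback extract_person_image stands in for the human body image extraction method applied to one frame, and all names, codecs and parameters are illustrative assumptions.

```python
import cv2

def extract_person_action_video(video_path, person_k, out_path, extract_person_image):
    """Split the input video into frames along the time axis, apply the human
    body image extraction method to every frame (represented by the caller's
    `extract_person_image(frame, k)` callback, which returns the image of body
    number k for that frame), and reassemble the per-frame results into the
    action video of that body."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0   # fall back to 25 fps if unknown
    writer = None
    while True:
        ok, frame = cap.read()                # frames arrive in time-axis order
        if not ok:
            break
        person_img = extract_person_image(frame, person_k)
        if writer is None:
            h, w = person_img.shape[:2]
            writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"),
                                     fps, (w, h))
        writer.write(person_img)
    cap.release()
    if writer is not None:
        writer.release()
```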
In a specific implementation, if a human body image in some frame of the human body action video contains an occluded area, then, since occlusion is not absolute and the occluded positions are not identical across frames, the occluded area can be completed based on the human body image of a preceding or following frame through the corresponding human body skeleton joint point coordinate information, so that the image is restored.
Specifically, if the human body image in the extracted frame in the human body motion video has a blocked area, the blocked area is completed based on the human body images of the other frames except the extracted frame through the corresponding human body skeleton joint point coordinate information.
The step of completing the shielded area based on the human body images of the rest frames through the corresponding human body skeleton joint point coordinate information comprises the following steps:
s301: determining a human body part where the shielded area is located in the extracted frame and a head joint point and a tail joint point corresponding to the human body part, and obtaining a target human body part posture vector based on the coordinate information of human body skeleton joint points corresponding to the head joint point and the tail joint point;
Specifically, the coordinate information of the human body skeleton joint point corresponding to the head joint point is represented as point A, the coordinate information of the human body skeleton joint point corresponding to the tail joint point is represented as point B, and the coordinates are two-dimensional coordinates in the frame image.
Correspondingly, the target human body part posture vector is expressed as the vector from point A to point B, i.e., AB = (xB − xA, yB − yA).
S302: calculating a reference target human body part posture vector of the same human body part in the rest frames of the human body action video based on the same head joint point and tail joint point;
Correspondingly, the reference target human body part posture vector of the wth of the remaining frames can be expressed as AwBw = (xBw − xAw, yBw − yAw), where w = 1, 2, …, u, and u is the total number of frames of the human body action video minus 1.
s303, respectively carrying out similarity matching on the target human body component posture vector and reference target human body component posture vectors in other frames of the human body action video;
In a specific implementation, the similarity matching may take both the vector angle and the vector length into account.
Specifically, the meaning of the vector length similarity matching is as follows. On the one hand, images or videos shot by a camera exhibit a near-far (perspective) effect, so comparing vector lengths makes it more likely to find human body parts with the same magnification or reduction ratio (indicating that their distances to the camera are close or equal); on the other hand, because of the diversity of motion postures of a human body part, such as the rotation of the arm around the shoulder, comparing vector lengths also makes it more likely to find images of the same human body part at different angles. In that case, the part missing from the target human body part can be obtained by suitably rotating the human body image corresponding to the reference target human body part posture vector.
Specifically, the meaning of the vector angle similarity matching is that, when the vector angles of the target human body part posture vector and a reference target human body part posture vector are the same or similar, a reference target human body part in the same or a similar posture as the target human body part is very likely to be obtained. In that case, the part missing from the target human body part can be obtained by suitably enlarging or reducing the human body image corresponding to the reference target human body part posture vector.
In specific implementation, different weighted values are required to be allocated to the vector angle and the vector length so as to obtain a reasonable similarity matching result.
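A minimal sketch of such a weighted similarity score is given below; the particular weights and the way the angle and length similarities are normalised are illustrative assumptions, since the patent does not fix them.

```python
import math

def posture_vector(head_joint, tail_joint):
    """Posture vector from head joint point A to tail joint point B."""
    return (tail_joint[0] - head_joint[0], tail_joint[1] - head_joint[1])

def posture_similarity(target_vec, ref_vec,
                       angle_weight=0.6, length_weight=0.4):
    """Weighted similarity of two posture vectors: angle similarity is the
    cosine of the angle between them mapped to [0, 1]; length similarity is the
    ratio of the shorter length to the longer one."""
    dot = target_vec[0] * ref_vec[0] + target_vec[1] * ref_vec[1]
    len_t = math.hypot(*target_vec)
    len_r = math.hypot(*ref_vec)
    if len_t == 0.0 or len_r == 0.0:
        return 0.0
    angle_sim = (dot / (len_t * len_r) + 1.0) / 2.0     # cosine mapped to [0, 1]
    length_sim = min(len_t, len_r) / max(len_t, len_r)  # 1.0 when lengths match
    return angle_weight * angle_sim + length_weight * length_sim

# The remaining frame with the highest score would then supply the reference
# target human body part used to fill the occluded area, e.g.:
# best_w = max(range(len(ref_vectors)),
#              key=lambda w: posture_similarity(target, ref_vectors[w]))
```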
S304: and filling the occluded area in the extraction frame by using the reference target human body part with the highest similarity degree.
Specifically, the completion method includes, but is not limited to, zooming, stretching, rotating, and the like.
In summary, the embodiments of the present invention provide a human body image extraction method and a human body action video extraction method. The human body image extraction method uses the human body skeleton joint point coordinate information to pre-process the original input picture, which greatly reduces the execution complexity of the subsequent human body image extraction and helps to improve the execution speed and reduce the hardware requirements; overlapping human body images are segmented based on the watershed algorithm, the steps are reasonably arranged, the execution speed is high, and the method has good practicability. The human body action video extraction method built on the human body image extraction method can accurately extract the human body action video corresponding to each human body, and completes occluded areas using the relationship between preceding and following frames so as to further restore the complete appearance of the human body, and also has good practicability.
The human body image extraction method and the human body motion video extraction method provided by the embodiment of the invention are described in detail, a specific example is applied in the text to explain the principle and the implementation mode of the invention, and the description of the embodiment is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A human body image extraction method is characterized by comprising the following steps:
acquiring an original input picture;
extracting human body information in the original input picture based on a skeleton detection method, wherein the human body information comprises a human body number for each human body and human body skeleton joint point coordinate information corresponding to each human body;
constructing a target area based on coordinate information of human body skeleton joint points of all human bodies;
extracting a target picture from an original input picture based on the target area;
and extracting a human body image corresponding to each human body in the target picture based on an image segmentation algorithm.
2. The human image extraction method according to claim 1, wherein human skeleton joint point coordinate information of all human bodies in the original input picture is extracted based on a trained deep convolutional neural network;
the coordinate information of all human body skeleton joint points is Pki={(xki,yki) I ═ 0,1,.. n, k ═ 1,2,.. m } where k represents the human body number and i represents the human skeleton joint point number; n and m are integer values generated based on the original input picture.
3. The human image extraction method of claim 2, wherein the constructing the target region based on the human skeleton joint point coordinate information of all human bodies comprises:
let the target region of the kth human body be a rectangular region, denoted as R_k(x, y, w, h), where (x, y) is the coordinate of the lower left corner point of the rectangular region, w is the width of the rectangular region, and h is the height of the rectangular region;
wherein x = x_kmin − b, where x_kmin is the minimum x coordinate in the skeleton joint point coordinate information of the kth human body; y = y_kmin − a, where y_kmin is the minimum y coordinate in the skeleton joint point coordinate information of the kth human body; w = |x_kmax − x_kmin| + 2b, where x_kmax is the maximum x coordinate in the skeleton joint point coordinate information of the kth human body; h = |y_kmax − y_kmin| + 2a, where y_kmax is the maximum y coordinate in the skeleton joint point coordinate information of the kth human body; a and b are empirical values.
4. The human image extraction method of claim 3, wherein the extracting the target picture from the original input picture based on the target region comprises:
reserving pixel points corresponding to all target areas of human bodies in the original input picture, and formatting the rest pixel points into designated colors;
the target picture comprises a plurality of unconnected target blocks, and one target block in the plurality of target blocks comprises one target area or more than two target areas.
5. The human image extraction method of claim 4, wherein extracting the human image in the target picture based on an image segmentation algorithm comprises:
and sequentially extracting a corresponding number of human body images from each of the plurality of target blocks.
6. The human body image extraction method according to claim 5, wherein said sequentially extracting a corresponding number of human body images from each of the plurality of target blocks comprises:
selecting a target block, and counting the number of target areas in the target block;
if the number of the target areas in the target block is one, extracting a human body image from the target block based on an image segmentation algorithm;
if the number of target areas in the target block is two or more, selecting any two mutually overlapping target areas as a processing object, extracting the two connected human body images from the processing object based on an image segmentation algorithm, segmenting the two connected human body images based on a watershed algorithm, and associating the segmentation results with the human bodies having the corresponding human body numbers, this step being traversed until every combination of overlapping target areas has been processed;
and obtaining a corresponding human body image based on a plurality of segmentation results of each human body.
7. The human image extraction method according to claim 6, wherein the image segmentation algorithm is one of a graph-cut algorithm, a grab-cut algorithm, and a one-cut algorithm.
8. A human motion video extraction method is characterized by comprising the following steps:
sequentially extracting each frame of video picture of an original input video based on a time axis;
taking each frame of video picture as an original input picture and executing the human body image extraction method of any one of claims 1 to 7 on the original input picture to obtain a human body image corresponding to each human body;
and carrying out video recombination on the human body image corresponding to the human body with the specific human body number in each frame of video picture according to the time axis sequence corresponding to each frame of video picture to obtain the human body action video corresponding to the human body with the specific human body number.
9. The human motion extraction method according to claim 8, wherein if the human image in the extracted frame in the human motion video has an occluded region, the occluded region is completed based on the human images of the other frames except the extracted frame by corresponding human skeleton joint point coordinate information.
10. The human motion extraction method according to claim 9, wherein the complementing the occluded region based on the human images of the remaining frames by the corresponding human skeleton joint point coordinate information comprises:
determining a human body part where the shielded area is located in the extracted frame and a head joint point and a tail joint point corresponding to the human body part, and obtaining a target human body part posture vector based on the coordinate information of human body skeleton joint points corresponding to the head joint point and the tail joint point;
calculating a reference target human body part posture vector of the same human body part in the rest frames of the human body action video based on the same head joint point and tail joint point;
respectively carrying out similarity matching on the target human body component posture vector and reference target human body component posture vectors in other frames of the human body action video;
and filling the occluded area in the extraction frame by using the reference target human body part with the highest similarity degree.
CN201911349143.2A 2019-12-24 2019-12-24 Human body image extraction method and human body action video extraction method Pending CN111179281A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911349143.2A CN111179281A (en) 2019-12-24 2019-12-24 Human body image extraction method and human body action video extraction method

Publications (1)

Publication Number Publication Date
CN111179281A true CN111179281A (en) 2020-05-19

Family

ID=70650423

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911349143.2A Pending CN111179281A (en) 2019-12-24 2019-12-24 Human body image extraction method and human body action video extraction method

Country Status (1)

Country Link
CN (1) CN111179281A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104036231A (en) * 2014-05-13 2014-09-10 深圳市菲普莱体育发展有限公司 Human-body trunk identification device and method, and terminal-point image detection method and device
CN107220604A (en) * 2017-05-18 2017-09-29 清华大学深圳研究生院 A kind of fall detection method based on video
CN108986137A (en) * 2017-11-30 2018-12-11 成都通甲优博科技有限责任公司 Human body tracing method, device and equipment
CN109919132A (en) * 2019-03-22 2019-06-21 广东省智能制造研究所 A kind of pedestrian's tumble recognition methods based on skeleton detection
CN110347877A (en) * 2019-06-27 2019-10-18 北京奇艺世纪科技有限公司 A kind of method for processing video frequency, device, electronic equipment and storage medium
CN110472569A (en) * 2019-08-14 2019-11-19 旭辉卓越健康信息科技有限公司 A kind of method for parallel processing of personnel detection and identification based on video flowing

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
RUIBING HOU et al.: "VRSTC: Occlusion-Free Video Person Re-Identification", pages 7183-7186 *
RUIBING HOU et al.: "VRSTC: Occlusion-Free Video Person Re-Identification", https://ui.adsabs.harvard.edu/abs/2019arXiv190708427H/abstract, https://arxiv.org/pdf/1907.08427.pdf, pages 2-5 *
WANG Jiangtao et al.: "Shape-based human body detection in infrared sequence images", Journal of Infrared and Millimeter Waves, vol. 26, no. 6, pages 437-442 *
CHENG Guang et al.: "Botnet Detection Technology", Southeast University Press, 31 October 2014, pages 137-143 *
CAI Guiyan: "A segmentation method for adherent crowds based on distance transform and watershed segmentation", Journal of Qinzhou University, vol. 26, no. 3, pages 41-44 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2022046326A (en) * 2020-09-10 2022-03-23 ソフトバンク株式会社 Information processing device, information processing method and information processing program
GB2613925A (en) * 2021-12-16 2023-06-21 Adobe Inc Generating segmentation masks for objects in digital videos using pose tracking data
US20230196817A1 (en) * 2021-12-16 2023-06-22 Adobe Inc. Generating segmentation masks for objects in digital videos using pose tracking data

Similar Documents

Publication Publication Date Title
Wen et al. Deep color guided coarse-to-fine convolutional network cascade for depth image super-resolution
KR102003015B1 (en) Creating an intermediate view using an optical flow
US20190037150A1 (en) System and methods for depth regularization and semiautomatic interactive matting using rgb-d images
US9715761B2 (en) Real-time 3D computer vision processing engine for object recognition, reconstruction, and analysis
US9626568B2 (en) Use of spatially structured light for dynamic three dimensional reconstruction and reality augmentation
US8824801B2 (en) Video processing
Nakajima et al. Fast and accurate semantic mapping through geometric-based incremental segmentation
CN103443826B (en) mesh animation
Ding et al. Spatio-temporal recurrent networks for event-based optical flow estimation
Sánchez-Riera et al. Simultaneous pose, correspondence and non-rigid shape
KR101969082B1 (en) Optimal Spherical Image Acquisition Method Using Multiple Cameras
CN112102342B (en) Plane contour recognition method, plane contour recognition device, computer equipment and storage medium
CN111179281A (en) Human body image extraction method and human body action video extraction method
Scholz et al. Texture replacement of garments in monocular video sequences
Li et al. Three-dimensional motion estimation via matrix completion
CN111161219B (en) Robust monocular vision SLAM method suitable for shadow environment
Cushen et al. Markerless real-time garment retexturing from monocular 3d reconstruction
Kim et al. Multi-view object extraction with fractional boundaries
Bazin et al. An original approach for automatic plane extraction by omnidirectional vision
Kovačević et al. An improved CamShift algorithm using stereo vision for object tracking
CN111160255B (en) Fishing behavior identification method and system based on three-dimensional convolution network
Gay-Bellile et al. Deformable surface augmentation in spite of self-occlusions
Rotman et al. A depth restoration occlusionless temporal dataset
Hasegawa et al. Distortion-Aware Self-Supervised 360° Depth Estimation from A Single Equirectangular Projection Image
Kim et al. Accurate depth image generation via overfit training of point cloud registration using local frame sets

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination