WO2021192085A1 - Pose identifying apparatus, pose identifying method, and non-transitory computer readable medium storing program - Google Patents
- Publication number: WO2021192085A1 (international application PCT/JP2020/013306)
- Authority: WIPO (PCT)
- Prior art keywords: body region, detected, points, basic pattern, point
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06T7/75—Determining position or orientation of objects or cameras using feature-based methods involving models
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Definitions
- the present disclosure relates to a pose identifying apparatus, a pose identifying method, and a non-transitory computer readable medium storing a program.
- A technique of identifying the pose of each person in an image including a plurality of person images respectively corresponding to a plurality of humans has been proposed (e.g., Patent Literature 1).
- the technique disclosed in Patent Literature 1 detects a plurality of body region points for a human in the image, and identifies the human in the image by identifying the human's head from among the plurality of detected body region points. Then, the pose of the human is identified by associating each detected body region point with another detected body region point.
- In Patent Literature 1, the person in the image is identified based only on his/her head. For this reason, when the resolution of the image is low, for example, the accuracy of the identification may decrease.
- the present inventor has found that the accuracy of identifying a person's pose can be improved by extracting a basic pattern including three or more base body region points as the "core part" of a human.
- the present inventor has also found that the speed of identifying a person's pose can be increased by directly using, as a grouping target detected body region point, a detected body region point that is not adjacent to a base body region point.
- An object of the present disclosure is to provide a pose identifying apparatus, a pose identifying method, and a non-transitory computer-readable medium for storing a program, which can improve an accuracy of identifying a person's pose and increase the speed of identifying a person's pose.
- a first example aspect is a pose identifying apparatus including: basic pattern extracting means for extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and grouping means for counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and for grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern.
- a second example aspect is a pose identifying method including: extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern; and grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern.
- a third example aspect is a non-transitory computer readable medium storing a program for causing a pose identifying apparatus to execute: extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern; and grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern.
- According to the present disclosure, it is possible to provide a pose identifying apparatus, a pose identifying method, and a non-transitory computer-readable medium storing a program, which can improve the accuracy of identifying a person's pose and increase the speed of identifying a person's pose.
- Fig. 1 is a diagram showing an example of a pose identifying apparatus according to a first example embodiment.
- Fig. 2 is a flowchart showing an example of a processing operation of a pose identifying apparatus according to the first example embodiment.
- Fig. 3 is a block diagram showing an example of a pose identifying apparatus according to a second example embodiment.
- Fig. 4 is a diagram showing an example of a plurality of predetermined detection target points for a human.
- Fig. 5 is a diagram for describing an extraction process of a basic pattern.
- Fig. 6 is a diagram for describing types of basic pattern candidates.
- Fig. 7 is a diagram for describing calculation of a base length.
- Fig. 8 is a diagram for describing the grouping process.
- Fig. 9 is a diagram for describing a mid-point expected area.
- Fig. 10 is a diagram for describing the grouping process.
- Fig. 11 is a diagram showing an example of a hardware configuration of the pose identifying apparatus.
- Fig. 1 is a diagram showing an example of a pose identifying apparatus according to a first example embodiment.
- a pose identifying apparatus 10 includes a basic pattern extracting unit 11 and a grouping unit 12.
- the basic pattern extracting unit 11 acquires information about a "position in an image” and a "point type” of each of a plurality of "detected body region points” and a plurality of "detected mid-points”.
- the basic pattern extracting unit 11 extracts a "basic pattern” for each human from the plurality of "detected body region points" and the plurality of "detected mid-points".
- the plurality of "detected body region points" and the plurality of "detected mid-points" are a plurality of predetermined "detection target points" for a human in an image including a plurality of person images respectively corresponding to a plurality of humans.
- the plurality of "detected body region points" and the plurality of “detected mid-points” are detected by, for example, a neural network (not shown) for the plurality of predetermined detection target points including the plurality of "body region points” of the human and “mid-points” for respective "body region point pairs” each composed of two body region points.
- the "basic pattern” includes a plurality of "detected body region points (i.e., detected base body region points)" corresponding to a plurality of "base body region types" that are different from each other.
- each "body region point” included in the plurality of the predetermined “detection target points” relates to a human body region such as a neck, an eye, a nose, an ear, a shoulder, and an elbow of a human.
- the "mid-point" included in the plurality of the predetermined "detection target points" lies on a human body region when the body region point pair is composed of body region points directly connected to each other, for example by an arm, such as the right shoulder and the right elbow (i.e., a body region point pair composed of mutually adjacent body region points).
- the "mid-point” included in the plurality of predetermined “detection target points” may be a spatial point around the human according to the person's pose at that moment.
- the "detected body region point” and the “detected mid-point” are points detected by, for example, a neural network (not shown) for the "body region point” and the "mid-point", respectively, included in the plurality of predetermined "detection target points”.
- the "basic pattern" includes at least one of the following two combinations.
- a first combination is a combination of three base body region points, which correspond to a neck, a left shoulder and a left ear each being a base body region type.
- a second combination is a combination of three base body region points, which correspond to a neck, a right shoulder and a right ear each being a base body region type. That is, the "basic pattern" corresponds to a core part that is most stably detectable in a human body in images.
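The two admissible combinations above can be sketched as a simple membership test. This is an illustrative sketch, not the apparatus itself; the string type names are assumptions introduced here for readability.

```python
# Hypothetical type names; the source defines only the body region types.
FIRST_COMBINATION = {"neck", "left_shoulder", "left_ear"}
SECOND_COMBINATION = {"neck", "right_shoulder", "right_ear"}

def contains_basic_pattern(detected_types):
    """Return True if the detected base body region types include at
    least one of the two three-point combinations (the "core part")."""
    types = set(detected_types)
    return FIRST_COMBINATION <= types or SECOND_COMBINATION <= types
```

For example, a detection set containing the neck, left shoulder, and left ear qualifies even if no ear on the right side was detected.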
- the grouping unit 12 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in a "mid-point expected area" obtained from links between each of a "plurality of grouping evaluation reference body region points" and a "grouping target detected body region point".
- the "plurality of grouping evaluation reference body region points" are composed of some or all of the plurality of detected base body region points included in the extracted basic pattern.
- the "plurality of grouping evaluation reference body region points” may be detected body region points each corresponding to one of the neck, left shoulder and right shoulder each being the base body region type.
- For example, when the extracted basic pattern does not include a detected body region point corresponding to the left shoulder, the two detected body region points corresponding to the neck and the right shoulder serve as the above-mentioned "plurality of grouping evaluation reference body region points".
- the "grouping target detected body region point” is each detected body region point for a body region type not included in the basic pattern. For example, when the body region types of the basic pattern are the neck, the right shoulder, the left shoulder, the right ear, and the left ear, the detected body region points for the eye, the nose, the elbow, etc. are grouping target detected body region points.
- the "mid-point expected area” is an area (middle area) including a "defined mid-point” defined as a "center point” of the above link.
- the grouping unit 12 groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to the plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern.
- FIG. 2 is a flowchart showing an example of the processing operation of the pose identifying apparatus according to the first example embodiment.
- the basic pattern extracting unit 11 extracts a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points (Step S101).
- the grouping unit 12 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in the mid-point expected area obtained from links between each of the plurality of grouping evaluation reference body region points and the grouping target detected body region point (Step S102).
- the grouping unit 12 groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to the plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern (Step S103).
- Steps S102 and S103 constitute the grouping process.
- the processing of Steps S102 and S103 for a plurality of grouping target detected body region points may be performed in order or in parallel.
- the basic pattern extracting unit 11 in the pose identifying apparatus 10 extracts the basic pattern for each human from the plurality of detected body region points and the plurality of "detected mid-points".
- the "basic pattern” includes the plurality of detected base body region points corresponding to the plurality of base body region types that are different from each other.
- the above-described basic pattern including a plurality of detected base body region points can be extracted as a "core part" of a human. By doing so, the accuracy of identifying the person's pose included in the image can be improved.
- the grouping unit 12 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in the mid-point expected area obtained from links between each of the plurality of grouping evaluation reference body region points and the grouping target detected body region point.
- the plurality of grouping evaluation reference body region points are composed of some or all of the plurality of detected base body region points included in the extracted basic pattern.
- the grouping unit 12 groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to the plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern.
- a result of the grouping on a first grouping target detected body region point does not affect a result of the grouping on a second grouping target detected body region point adjacent to the first grouping target detected body region point. It is thus possible to execute the grouping process on the grouping target detected body region points regardless of whether a grouping evaluation reference body region point is adjacent to the grouping target detected body region point. For this reason, the grouping process for a plurality of grouping target detected body region points can be executed in parallel, so that the speed of identifying a person's pose can be increased.
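The independence property described above means each grouping target point can be dispatched to a worker without any ordering constraints. A minimal sketch under the assumption that each basic pattern exposes a hypothetical `count_fn` returning its candidate-link count for a target point:

```python
from concurrent.futures import ThreadPoolExecutor

def group_point(target_point, basic_patterns):
    # Per-point decision: the point joins the basic pattern with the
    # largest candidate-link count (ties go to the first pattern).
    counts = [pattern["count_fn"](target_point) for pattern in basic_patterns]
    return counts.index(max(counts))

def group_all(target_points, basic_patterns):
    # Because each decision is independent of the others, the target
    # points can be processed by a thread pool in any order.
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda p: group_point(p, basic_patterns),
                           target_points)
        return list(results)
```

The same structure works with sequential iteration; the pool merely exploits the independence the text points out.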
- The second example embodiment relates to a more specific example.
- Fig. 3 is a block diagram showing an example of a pose identifying apparatus according to the second example embodiment.
- the pose identifying apparatus 20 includes a basic pattern extracting unit 21 and a grouping unit 22.
- the basic pattern extracting unit 21 acquires information about a "position in an image” and a "point type” of each of a plurality of "detected body region points” and a plurality of "detected mid-points".
- the basic pattern extracting unit 21 extracts a "basic pattern” for each human from the plurality of "detected body region points" and the plurality of "detected mid-points".
- the plurality of "detected body region points" and the plurality of "detected mid-points" are a plurality of predetermined "detection target points" for a human in an image including a plurality of person images respectively corresponding to a plurality of humans.
- the plurality of "detected body region points" and the plurality of "detected mid-points" are detected by, for example, a neural network (not shown) for the plurality of predetermined detection target points including the plurality of "body region points" of the human and "mid-points" for respective "body region point pairs" each composed of two body region points.
- the "basic pattern” includes a plurality of "detected body region points (i.e., detected base body region points)" corresponding to a plurality of "base body region types" that are different from each other.
- Fig. 4 is a diagram showing an example of the plurality of predetermined detection target points for a human.
- the "plurality of predetermined detection target points" for a human include body region points N0 to N17.
- the body region points N0 to N17 correspond, respectively, to a neck, a right shoulder, a left shoulder, a right ear, a left ear, a nose, a right eye, a left eye, a right elbow, a right wrist, a left elbow, a left wrist, a right hip, a left hip, a right knee, a left knee, a right ankle, and a left ankle.
- the "plurality of grouping evaluation reference body region points" described in the first example embodiment are the body region points N0, N1, and N2.
- the "plurality of predetermined detection target points" for a human also include 39 mid-points corresponding to the respective combinations of the body region points N0, N1, and N2 with the body region points N5 to N17.
- a mid-point between a body region point Ni and a body region point Nj is represented by a mid-point Mi_j, and the corresponding detected mid-point is represented by a detected mid-point Mi_j.
- the "defined mid-point" described in the first example embodiment is represented by a defined mid-point M'i_j.
- the basic pattern extracting unit 21 may acquire, for example, five sets of information each including the positions and the point types of the detected body region points N0 to N17 and the 39 detected mid-points M.
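The figure of 39 mid-points follows from pairing the three reference points N0 to N2 with the thirteen points N5 to N17 (3 × 13 = 39). A minimal sketch using point names only (the label format is illustrative):

```python
from itertools import product

# Assumed labels: Ni follows the body region point numbering above.
reference_points = ["N0", "N1", "N2"]            # neck, right shoulder, left shoulder
other_points = [f"N{i}" for i in range(5, 18)]   # N5 .. N17 (13 points)

# One mid-point M_i_j per (reference point, other point) combination.
mid_point_pairs = [f"M{a[1:]}_{b[1:]}" for a, b in product(reference_points, other_points)]
print(len(mid_point_pairs))  # 3 * 13 = 39
```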
- the basic pattern extracting unit 21 extracts a "basic pattern" for each human from the plurality of "detected body region points" and the plurality of "detected mid-points".
- the basic pattern extracting unit 21 includes a basic pattern candidate identifying unit 21A, a base length calculating unit 21B, and a basic pattern forming unit 21C.
- the basic pattern candidate identifying unit 21A identifies a plurality of "basic pattern candidates" by classifying into the same basic pattern candidate each combination whose detected body region points are close to each other in the image, from among a plurality of combinations of the detected base body region points corresponding to the "main type" and the detected body region points corresponding to the "sub types".
- the "main type” is, for example, the neck, and the "sub types” are the right shoulder, the left shoulder, the right ear, and the left ear.
- the basic pattern candidate identifying unit 21A selects, for one detected body region point corresponding to the neck, the detected body region point corresponding to the right shoulder that is closest to it from among the plurality of detected body region points corresponding to the right shoulder. This selection is made for each detected body region point corresponding to the neck. Then, when one detected body region point corresponding to the right shoulder has been selected for the plurality of detected body region points corresponding to the neck, the basic pattern candidate identifying unit 21A selects, from among the plurality of detected body region points corresponding to the neck, the one that is closest to that detected body region point corresponding to the right shoulder.
- the basic pattern candidate identifying unit 21A performs processing using the MLMD (Mutual-Local-Minimum-Distance) algorithm.
- one detected body region point corresponding to the neck and one detected body region point corresponding to the right shoulder are selected, and these detected body region points are classified into the same "basic pattern candidate".
- the processing described above is performed for each of the left shoulder, the right ear, and the left ear.
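A minimal sketch of MLMD-style mutual nearest-neighbor pairing, under the assumptions that points are 2-D image coordinates and Euclidean distance is used; the function and variable names are hypothetical, not from the source:

```python
import math

def mutual_nearest_pairs(necks, shoulders):
    """Pair a neck point with a shoulder point only when each is the
    other's closest counterpart (mutual-local-minimum-distance)."""
    def nearest(p, candidates):
        return min(range(len(candidates)),
                   key=lambda k: math.dist(p, candidates[k]))
    pairs = []
    for i, neck in enumerate(necks):
        j = nearest(neck, shoulders)           # shoulder closest to this neck
        if nearest(shoulders[j], necks) == i:  # ...and this neck closest to it
            pairs.append((i, j))
    return pairs
```

The same pairing is repeated for each base body region type pair (neck/left shoulder, shoulder/ear, and so on) to build the graph G-sub described later.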
- the basic pattern forming unit 21C performs "optimization processing" on a plurality of basic pattern candidates identified by the basic pattern candidate identifying unit 21A to thereby form a plurality of basic patterns for the plurality of humans.
- a first process is a process of cutting one basic pattern candidate that includes a plurality of detected body region points corresponding to the main type, converting it into a plurality of basic pattern candidates each including one detected body region point corresponding to the main type. That is, when one basic pattern candidate includes a plurality of detected body region points corresponding to the neck, that basic pattern candidate is converted into a plurality of basic pattern candidates each including one detected body region point corresponding to the neck.
- a second process is a process of excluding, from each basic pattern candidate, a detected body region point(s) that is included in the basic pattern candidate, that corresponds to the sub type, and whose distance from the detected body region point corresponding to the main type is longer than a "base length for the basic pattern candidate".
- a third process is a process of excluding a basic pattern candidate(s) not including any of a combination of three detected body region points of a "first body region type group” and a combination of three detected body region points of a "second body region type group".
- the "first body region type group” includes the neck, the left shoulder, and the left ear
- the "second body region type group” includes the neck, the right shoulder, and the right ear.
- the base length calculating unit 21B calculates the "base length for each basic pattern candidate" when the above-described first process is completed. The calculation of the "base length for each basic pattern candidate" will be described in detail later.
- the grouping unit 22 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in a "mid-point expected area" obtained from links between each of a "plurality of grouping evaluation reference body region points" and a "grouping target detected body region point".
- the grouping unit 22 groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to the plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern. For example, the grouping unit 22 may group the grouping target detected body region point into a person group corresponding to the basic pattern having the largest count value from among the plurality of extracted basic patterns. Alternatively, the grouping unit 22 may group the grouping target detected body region point into a person group corresponding to the basic pattern having the largest count value which is a predetermined value or greater (e.g., 2 or greater) from among the plurality of extracted basic patterns.
- the grouping unit 22 may group the grouping target detected body region point into a basic pattern including the detected base body region point corresponding to the main type having the smallest distance from the grouping target detected body region point. This grouping process will be described in detail later.
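The decision rule just described can be sketched as follows, assuming 2-D coordinates, a threshold of 2 for the predetermined value, and hypothetical names (the source does not fix an implementation):

```python
import math

def choose_group(counts, neck_points, target_point, min_count=2):
    """Pick a basic pattern for one grouping target detected body region
    point: the pattern with the largest candidate-link count, provided
    that count reaches min_count; otherwise fall back to the pattern
    whose main-type (neck) point is closest to the target point."""
    best = max(range(len(counts)), key=lambda k: counts[k])
    if counts[best] >= min_count:
        return best
    return min(range(len(neck_points)),
               key=lambda k: math.dist(target_point, neck_points[k]))
```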
- the basic pattern extracting unit 21 acquires information about a "position in an image” and a "point type” of each of a plurality of "detected body region points” and a plurality of "detected mid-points”.
- the basic pattern extracting unit 21 extracts a "basic pattern” for each human from the plurality of "detected body region points" and the plurality of "detected mid-points”.
- Fig. 5 is a diagram for describing the basic pattern extraction process.
- the basic pattern extraction process starts from a graph G shown in Fig. 5.
- the graph G includes all detected base body region point pairs corresponding to a group Sc of a "base body region type pair".
- the group Sc includes, as group elements, a pair of the neck and right shoulder, a pair of the neck and left shoulder, a pair of the neck and right ear, a pair of the neck and left ear, a pair of the right shoulder and right ear, and a pair of the left shoulder and left ear.
- the basic pattern candidate identifying unit 21A performs the processing using the MLMD algorithm on each base body region type pair to obtain a graph G-sub, and identifies, as a "basic pattern candidate", a block including a triangle(s) having the respective detected body region points corresponding to the neck in the graph G-sub as vertices.
- Fig. 6 is a diagram for describing types of the basic pattern candidates. As shown in Fig. 6, there may be five types of basic pattern candidates: TA, TB, TC, TD, and TE. In Fig. 5, these five types are collectively referred to as "PATTERN-⁇". For example, the basic pattern candidate corresponding to a person facing the front is likely to be of the type TA. Basic pattern candidates of the types TB, TC, and TD arise in complex environments, e.g., under occlusion.
- the basic pattern forming unit 21C performs the optimization processing on "PATTERN- ⁇ " to form the plurality of basic patterns.
- the basic pattern forming unit 21C divides the basic pattern candidate of the type TE into two basic pattern candidates each including one detected body region point corresponding to the neck (the above-described first process). Then, the basic pattern candidate of the type TB and the basic pattern candidate of the type TC are obtained. As a result, basic pattern candidates corresponding to the types TA, TB, TC, and TD remain.
- the base length calculating unit 21B calculates the "base length" for each basic pattern candidate corresponding to any one of the types TA, TB, TC, and TD.
- the "base length” is a length that is a reference of a size of a human body.
- Fig. 7 is a diagram for describing the calculation of the base length.
- the base length calculating unit 21B calculates lengths La, Lb, and Lc for each basic pattern candidate.
- the length La is calculated as the distance between the detected body region point N0 corresponding to the neck and the detected body region point N1 corresponding to the right shoulder in the basic pattern candidate, or the distance between N0 and the detected body region point N2 corresponding to the left shoulder in the basic pattern candidate. When both shoulder points are detected, the length La is equal to the smaller of these two distances; when only the right shoulder is detected, the length La is the distance between N0 and N1, and when only the left shoulder is detected, the length La is the distance between N0 and N2.
- the length Lb is calculated in the same manner as the distance between the detected body region point N0 corresponding to the neck and the detected body region point N3 corresponding to the right ear in the basic pattern candidate, or the distance between N0 and the detected body region point N4 corresponding to the left ear in the basic pattern candidate.
- the base length calculating unit 21B calculates the base length of each basic pattern candidate based on the calculated lengths La, Lb, and Lc.
- the base length calculating unit 21B calculates the base length by different calculation methods according to the magnitude relation between "Lc" and "La+Lb" and the magnitude relation between "Lb" and "La×2". As shown in Fig. 7, for example, when "Lc" is "La+Lb" or less and "Lb" is "La×2" or less, the base length is "Lc". When "Lc" is "La+Lb" or less and "Lb" is larger than "La×2", the base length is "Lc×1.17".
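The two cases quoted above can be sketched as follows; this covers only the branches stated in the text (the remaining branches of Fig. 7, and the definition of Lc, are not reproduced here):

```python
def base_length(la, lb, lc):
    """Base length for the two cases stated for Fig. 7; the other
    branches of the figure are not covered by the quoted description."""
    if lc <= la + lb:
        # Lb <= La*2: the base length is Lc itself; otherwise Lc*1.17.
        return lc if lb <= la * 2 else lc * 1.17
    raise ValueError("case not covered by the quoted description")
```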
- the basic pattern forming unit 21C excludes, from each basic pattern candidate, a detected base body region point(s) that is included in the basic pattern candidate, that corresponds to the sub type, and whose distance from the detected base body region point corresponding to the main type is longer than the "base length for the basic pattern candidate" (the above second process). Then, for example, in the basic pattern candidate of the type TA including two triangles shown in Fig. 6, when the detected body region point corresponding to the ear included in one of the triangles is far from the detected body region point corresponding to the neck, this detected body region point corresponding to the ear is excluded from the basic pattern candidate. Thus, the basic pattern candidate of the type TA is changed to the basic pattern candidate of the type TC.
- the basic pattern candidate of the type TA including two triangles when the detected body region point corresponding to the shoulder included in one of the triangles is far from the detected body region point corresponding to the neck, this detected body region point corresponding to the shoulder is excluded from the basic pattern candidate.
- the basic pattern candidate of the type TA is changed to the basic pattern candidate of the type TB.
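The second process above can be sketched as a simple distance filter, assuming points are 2-D image coordinates keyed by type name and that "neck" is the main type (the names are illustrative):

```python
import math

def prune_candidate(candidate, base_len):
    """Second process: drop sub-type points whose distance from the
    main-type (neck) point exceeds the base length for the candidate.
    `candidate` maps a point type to its (x, y) position."""
    neck = candidate["neck"]
    return {t: p for t, p in candidate.items()
            if t == "neck" or math.dist(neck, p) <= base_len}
```

Applied to a type-TA candidate, this is exactly how a far-away ear or shoulder point is removed, yielding a type-TC or type-TB candidate.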
- a basic pattern candidate not including any triangle may appear as a result of the processing by the basic pattern forming unit 21C.
- the basic pattern forming unit 21C excludes the basic pattern candidate(s) not including any of the combination of the three detected base body region points of the "first body region type group" and the combination of the three detected base body region points of the "second body region type group" (the above-described third process).
- the "first body region type group” includes the neck, the left shoulder, and the left ear
- the “second body region type group” includes the neck, the right shoulder, and the right ear. That is, the basic pattern candidate(s) not including any of the above triangles is excluded by the processing of the basic pattern forming unit 21C.
- four types of the basic pattern candidates, i.e., the types TA, TB, TC, and TD, may remain. These remaining basic pattern candidates are the "basic patterns".
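The third process (keeping only candidates that still contain at least one of the two triangles) might be sketched as follows; the point-name keys are hypothetical:

```python
# the two triangles that a surviving basic pattern must contain
FIRST_GROUP = {'neck', 'l_shoulder', 'l_ear'}
SECOND_GROUP = {'neck', 'r_shoulder', 'r_ear'}

def has_triangle(candidate):
    """True if the candidate still contains either triangle of points."""
    names = set(candidate)
    return FIRST_GROUP <= names or SECOND_GROUP <= names

def third_process(candidates):
    """Exclude candidates that contain neither triangle."""
    return [c for c in candidates if has_triangle(c)]
```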
- the grouping unit 22 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in a "mid-point expected area" obtained from links between each of a "plurality of grouping evaluation reference body region points" and a "grouping target detected body region point”.
- Fig. 8 is a diagram for describing the grouping process.
- the basic patterns 1, 2, and 3 are extracted.
- the grouping unit 22 connects the grouping target detected body region point N i to each of the grouping evaluation reference body region points N 0 , N 1 , N 2 by a "temporary link".
- the grouping evaluation reference body region points are detected base body region points corresponding to the neck, the right shoulder, and the left shoulder, respectively.
- Fig. 9 is a diagram for describing the mid-point expected area.
- the mid-point expected area corresponding to the temporary link is an elliptical area centered on the defined mid-point M i_j ', i.e., the center point between the two detected body region points N i and N j of the temporary link.
- a major axis length Rmajor of the mid-point expected area is "link distance × 0.75"
- a minor axis length Rminor is "link distance × 0.35".
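A sketch of the membership test for this area, assuming the ellipse's major axis lies along the link and that Rmajor and Rminor are full axis lengths (so the semi-axes are half of them); the function name is hypothetical:

```python
import math

def in_expected_area(ni, nj, mid):
    """True if detected mid-point `mid` lies in the mid-point expected area
    of the temporary link ni-nj: an ellipse centered on the defined
    mid-point, with its major axis along the link direction."""
    cx, cy = (ni[0] + nj[0]) / 2, (ni[1] + nj[1]) / 2  # defined mid-point M'
    link = math.dist(ni, nj)
    a = link * 0.75 / 2  # semi-major axis, assuming Rmajor is the full axis
    b = link * 0.35 / 2  # semi-minor axis, same assumption
    ux, uy = (nj[0] - ni[0]) / link, (nj[1] - ni[1]) / link  # link direction
    dx, dy = mid[0] - cx, mid[1] - cy
    along, across = dx * ux + dy * uy, dy * ux - dx * uy
    return (along / a) ** 2 + (across / b) ** 2 <= 1.0
```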
- the grouping unit 22 determines whether the detected mid-point M i_j is present in the mid-point expected area corresponding to the temporary link of the detected body region points N i and N j . Then, the grouping unit 22 defines the temporary link corresponding to the mid-point expected area where the detected mid-point M i_j is present as a "candidate link”. The grouping unit 22 counts the number of candidate links for each basic pattern. Fig. 8 shows the detected mid-points present in the mid-point expected area. That is, in the example of Fig. 8, the count number of the basic pattern 1 is "1", the count number of the basic pattern 2 is "3", and the count number of the basic pattern 3 is "2".
- the grouping unit 22 may, for example, group the grouping target detected body region point N i into a person group corresponding to the basic pattern 2 having the largest count number.
- Fig. 10 is another diagram for describing the grouping process.
- the count numbers of the basic patterns 2 and 3 are both "2", and the basic patterns 2 and 3 have the largest count number.
- the grouping unit 22 may group the grouping target detected body region point N i into a person group corresponding to a basic pattern (i.e., the basic pattern 2) including a detected base body region point N 0-bp2 corresponding to the main type having the smallest distance from the grouping target detected body region point N i .
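The count-then-tiebreak rule above can be sketched as follows; the dictionary shapes are assumptions (count per basic pattern, and distance from each pattern's main-type point to the grouping target):

```python
def assign_group(counts, main_dist):
    """Pick the person group for one grouping target detected body region point.

    counts: candidate-link count per basic pattern id.
    main_dist: distance from each pattern's main-type point to the target.
    Largest count wins; ties go to the pattern whose main-type point is
    closest to the grouping target.
    """
    best = max(counts.values())
    tied = [p for p, c in counts.items() if c == best]
    return min(tied, key=lambda p: main_dist[p])
```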
- the pose identifying apparatus includes a processor 101 and a memory 102.
- the processor 101 may be, for example, a microprocessor, a Micro Processing Unit (MPU), or a Central Processing Unit (CPU).
- the processor 101 may include a plurality of processors.
- the memory 102 is composed of a combination of a volatile memory and a non-volatile memory.
- the memory 102 may include a storage located separated from the processor 101. In this case, the processor 101 may access the memory 102 via an I/O interface (not shown).
- Each of the pose identifying apparatuses 10 according to the first example embodiment and the pose identifying apparatuses 20 according to the second example embodiment can include the hardware configuration shown in Fig. 11.
- the basic pattern extracting units 11 and 21 and the grouping units 12 and 22 of the pose identifying apparatuses 10 and 20 according to the first and second example embodiments may be achieved by the processor 101 reading a program stored in the memory 102 and executing it.
- the program can be stored and provided to the pose identifying apparatuses 10 and 20 using any type of non-transitory computer readable media.
- Examples of non-transitory computer readable media include magnetic storage media (such as floppy disks, magnetic tapes, hard disk drives, etc.), and optical magnetic storage media (e.g. magneto-optical disks).
- non-transitory computer readable media further include CD-ROM (Read Only Memory), CD-R, and CD-R/W.
- Examples of non-transitory computer readable media further include semiconductor memories.
- the semiconductor memories include, for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (Random Access Memory), etc.
- the program may be provided to the pose identifying apparatuses 10 and 20 using any type of transitory computer readable media.
- Non-transitory computer readable media include any type of tangible storage media, whereas transitory computer readable media can provide the program to the pose identifying apparatuses 10 and 20 via a wired communication line (e.g., electric wires and optical fibers) or a wireless communication line.
- a pose identifying apparatus comprising: basic pattern extracting means for extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and grouping means for counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values counted for the extracted basic patterns.
- the basic pattern extracting means comprises: basic pattern candidate identifying means for identifying a plurality of basic pattern candidates by classifying, into the same basic pattern candidate, combinations that include detected body region points close in distance to each other in the image, from among a plurality of combinations of the plurality of detected body region points corresponding to the main type and the plurality of detected body region points corresponding to the respective sub types; and basic pattern formation means for forming the plurality of basic patterns for the plurality of humans by performing optimization processing on the identified plurality of basic pattern candidates.
- Supplementary note 8 The pose identifying apparatus according to Supplementary note 7, wherein the main type is a neck, the sub types are a left shoulder, a right shoulder, a left ear, and a right ear, the first body region type group includes the neck, the left shoulder, and the left ear, and the second body region type group includes the neck, the right shoulder, and the right ear.
- a pose identifying method comprising: extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern; and grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values counted for the extracted basic patterns.
- a non-transitory computer readable medium storing a program for causing a pose identifying apparatus to execute: extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern; and grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values counted for the extracted basic patterns.
Abstract
A grouping unit (12) of a pose identifying apparatus (10) counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point. The basic pattern includes a plurality of detected body region points corresponding to a plurality of base body region types that are different from each other. The plurality of grouping evaluation reference body region points are composed of some or all of a plurality of detected body region points included in the extracted basic pattern. The grouping unit (12) groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on a count value.
Description
The present disclosure relates to a pose identifying apparatus, a pose identifying method, and a non-transitory computer readable medium storing a program.
A technique of identifying a pose of each person in an image including a plurality of person images respectively corresponding to a plurality of humans has been proposed (e.g., Patent Literature 1). The technique disclosed in Patent Literature 1 detects a plurality of body region points for a human in the image, and identifies the human's head from among the plurality of detected body region points so as to identify the human in the image. Then, the pose of the human is identified by associating the detected human body region point with another detected body region point.
PTL 1: US Patent Application Publication No. 2018/0293753
However, in the technique disclosed in Patent Literature 1, the person in the image is identified based only on his/her head. For this reason, for example, when the resolution of the image is low, the accuracy of the identification may decrease.
The present inventor has found that the accuracy of identifying a person's pose can be improved by extracting a basic pattern including three or more base body region points as a human "core part".
Further, the present inventor has found that the speed of identifying a person's pose can be increased by using a detected body region point not adjacent to a base body region point directly as a grouping target detected body region point.
An object of the present disclosure is to provide a pose identifying apparatus, a pose identifying method, and a non-transitory computer-readable medium for storing a program, which can improve an accuracy of identifying a person's pose and increase the speed of identifying a person's pose.
A first example aspect is a pose identifying apparatus including: basic pattern extracting means for extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
grouping means for counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values counted for the extracted basic patterns.
A second example aspect is a pose identifying method including:
extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values for the extracted basic patterns.
A third example aspect is a non-transitory computer readable medium storing a program for causing a pose identifying apparatus to execute:
extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values for the extracted basic patterns.
According to the present disclosure, it is possible to provide a pose identifying apparatus, a pose identifying method, and a non-transitory computer-readable medium for storing a program, which can improve an accuracy of identifying a person's pose and increase the speed of identifying a person's pose.
Hereinafter, example embodiments will be described with reference to the drawings. In the example embodiments, the same or equivalent elements will be denoted by the same reference signs, and repeated descriptions will be omitted.
First example embodiment
<Configuration example of pose identifying apparatus>
Fig. 1 is a diagram showing an example of a pose identifying apparatus according to a first example embodiment. In Fig. 1, a pose identifying apparatus 10 includes a basic pattern extracting unit 11 and a grouping unit 12.
The basic pattern extracting unit 11 acquires information about a "position in an image" and a "point type" of each of a plurality of "detected body region points" and a plurality of "detected mid-points". The basic pattern extracting unit 11 extracts a "basic pattern" for each human from the plurality of "detected body region points" and the plurality of "detected mid-points". The plurality of "detected body region points" and the plurality of "detected mid-points" are a plurality of predetermined "detection target points" for a human in an image including a plurality of person images respectively corresponding to a plurality of humans. The plurality of "detected body region points" and the plurality of "detected mid-points" are detected by, for example, a neural network (not shown) for the plurality of predetermined detection target points including the plurality of "body region points" of the human and "mid-points" for respective "body region point pairs" each composed of two body region points. The "basic pattern" includes a plurality of "detected body region points (i.e., detected base body region points)" corresponding to a plurality of "base body region types" that are different from each other.
Here, each "body region point" included in the plurality of the predetermined "detection target points" relates to a human body region such as a neck, an eye, a nose, an ear, a shoulder, and an elbow of a human. The "mid-point" included in the plurality of the predetermined "detection target points" relates to a human body region in the case of a body region point pair composed of body region points directly connected by, for example, an arm, such as a right shoulder and a right elbow (i.e., body region point pair composed of body region points adjacent to each other). On the other hand, when the body region point pair is composed of body region points not directly connected to each other such as a right shoulder and a left elbow, the "mid-point" included in the plurality of predetermined "detection target points" may be a spatial point around the human according to the person's pose at that moment. The "detected body region point" and the "detected mid-point" are points detected by, for example, a neural network (not shown) for the "body region point" and the "mid-point", respectively, included in the plurality of predetermined "detection target points".
For example, the "basic pattern" includes at least one of the following two combinations. A first combination is a combination of three base body region points, which correspond to a neck, a left shoulder and a left ear each being a base body region type. A second combination is a combination of three base body region points, which correspond to a neck, a right shoulder and a right ear each being a base body region type. That is, the "basic pattern" corresponds to a core part that is most stably detectable in a human body in images.
The grouping unit 12 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in a "mid-point expected area" obtained from links between each of a "plurality of grouping evaluation reference body region points" and a "grouping target detected body region point". The "plurality of grouping evaluation reference body region points" are composed of some or all of the plurality of detected base body region points included in the extracted basic pattern.
For example, the "plurality of grouping evaluation reference body region points" may be detected body region points each corresponding to one of the neck, left shoulder and right shoulder each being the base body region type. At this time, in the case of the basic pattern that does not include the detected body region point of the left shoulder, the two detected body region points corresponding to the neck and the right shoulder are the above-mentioned "plurality of grouping evaluation reference body region points". The "grouping target detected body region point" is each detected body region point for a body region type not included in the basic pattern. For example, when the body region types of the basic pattern are the neck, the right shoulder, the left shoulder, the right ear, and the left ear, the detected body region points for the eye, the nose, the elbow, etc. are grouping target detected body region points. The "mid-point expected area" is an area (middle area) including a "defined mid-point" defined as a "center point" of the above link.
The grouping unit 12 groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to the plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern.
<Operation example of pose identifying apparatus>
An example of a processing operation of the pose identifying apparatus 10 having the above configuration will be described. Fig. 2 is a flowchart showing an example of the processing operation of the pose identifying apparatus according to the first example embodiment.
The basic pattern extracting unit 11 extracts a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points (Step S101).
The grouping unit 12 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in the mid-point expected area obtained from links between each of the plurality of grouping evaluation reference body region points and the grouping target detected body region point (Step S102).
The grouping unit 12 groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to the plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern (Step S103).
Note that the processing of Steps S102 and S103 (i.e., grouping process) is performed for each grouping target detected body region point. The processing of Steps S102 and S103 for a plurality of grouping target detected body region points may be performed in order or in parallel. By performing the processing of Steps S102 and S103 for the plurality of grouping target detected body region points in parallel, it is possible to increase the speed of identifying a human.
As described above, according to the first example embodiment, the basic pattern extracting unit 11 in the pose identifying apparatus 10 extracts the basic pattern for each human from the plurality of detected body region points and the plurality of "detected mid-points". The "basic pattern" includes the plurality of detected base body region points corresponding to the plurality of base body region types that are different from each other.
According to such a configuration of the pose identifying apparatus 10, the above-described basic pattern including a plurality of detected base body region points can be extracted as a "core part" of a human. By doing so, the accuracy of identifying the person's pose included in the image can be improved.
In the pose identifying apparatus 10, the grouping unit 12 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in the mid-point expected area obtained from links between each of the plurality of grouping evaluation reference body region points and the grouping target detected body region point. The plurality of grouping evaluation reference body region points are composed of some or all of the plurality of detected base body region points included in the extracted basic pattern. Then, the grouping unit 12 groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to the plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern.
With such a configuration of the pose identifying apparatus 10, a result of the grouping on a first grouping target detected body region point does not affect a result of the grouping on a second grouping target detected body region point adjacent to the first grouping target detected body region point. It is thus possible to execute the grouping process on the grouping target detected body region points regardless of whether a grouping evaluation reference body region point is adjacent to the grouping target detected body region point. For this reason, the grouping process for a plurality of grouping target detected body region points can be executed in parallel, so that the speed of identifying a person's pose can be increased.
Second example embodiment
The second example embodiment relates to a more specific example embodiment.
<Configuration example of pose identifying apparatus>
Fig. 3 is a block diagram showing an example of a pose identifying apparatus according to the second example embodiment. In Fig. 3, the pose identifying apparatus 20 includes a basic pattern extracting unit 21 and a grouping unit 22.
Like the basic pattern extracting unit 11 according to the first example embodiment, the basic pattern extracting unit 21 acquires information about a "position in an image" and a "point type" of each of a plurality of "detected body region points" and a plurality of "detected mid-points". The basic pattern extracting unit 21 extracts a "basic pattern" for each human from the plurality of "detected body region points" and the plurality of "detected mid-points". The plurality of "detected body region points" and the plurality of "detected mid-points" are a plurality of predetermined "detection target points" for a human in an image including a plurality of person images respectively corresponding to a plurality of humans. The plurality of "detected body region points" and the plurality of "detected mid-points" are detected by, for example, a neural network (not shown) for the plurality of predetermined detection target points including the plurality of "body region points" of the human and "mid-points" for respective "body region point pairs" each composed of two body region points. The "basic pattern" includes a plurality of "detected body region points (i.e., detected base body region points)" corresponding to a plurality of "base body region types" that are different from each other.
Fig. 4 is a diagram showing an example of the plurality of predetermined detection target points for a human. In Fig. 4, the "plurality of predetermined detection target points" for a human include body region points N0 to N17. As shown in Fig. 4, the body region point N0 corresponds to a neck. The body region point N1 corresponds to a right shoulder. The body region point N2 corresponds to a left shoulder. The body region point N3 corresponds to a right ear. The body region point N4 corresponds to a left ear. The body region point N5 corresponds to a nose. The body region point N6 corresponds to a right eye. The body region point N7 corresponds to a left eye. The body region point N8 corresponds to a right elbow. The body region point N9 corresponds to a right wrist. The body region point N10 corresponds to a left elbow. The body region point N11 corresponds to a left wrist. The body region point N12 corresponds to a right hip. The body region point N13 corresponds to a left hip. The body region point N14 corresponds to a right knee. The body region point N15 corresponds to a left knee. The body region point N16 corresponds to a right ankle. The body region point N17 corresponds to a left ankle.
When the "plurality of grouping evaluation reference body region points" described in the first example embodiment are the body region points N0, N1, and N2, the "plurality of predetermined detection target points" for a human include 39 mid-points corresponding to the respective combinations of the body region points N0, N1, and N2 and the body region points N5 to N17. A mid-point between a body region point Ni and a body region point Nj is represented by a mid-point Mi_j. Likewise, a detected mid-point is represented by a detected mid-point Mi_j, and the "defined mid-point" described in the first example embodiment is represented by a defined mid-point M'i_j.
Thus, when the image includes human full body images of five persons, the basic pattern extracting unit 21 may acquire five sets of information each including the positions and the point types of the detected body region points N0 to N17 and 39 detected mid-points M.
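The count of 39 detection-target mid-points follows from enumerating every combination of the three grouping evaluation reference body region points (N0, N1, N2) with the thirteen body region points N5 to N17. A minimal sketch of this enumeration, using illustrative variable names:

```python
from itertools import product

# Grouping evaluation reference body region points: N0 (neck),
# N1 (right shoulder), N2 (left shoulder).
reference_points = [0, 1, 2]
# Remaining detection targets N5 .. N17 (nose through left ankle).
other_points = list(range(5, 18))

# One mid-point Mi_j per (reference point, other point) combination.
mid_points = [f"M{i}_{j}" for i, j in product(reference_points, other_points)]

print(len(mid_points))  # 3 reference points x 13 other points = 39
```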
Returning to the description of Fig. 3, like the basic pattern extracting unit 11 according to the first example embodiment, the basic pattern extracting unit 21 extracts a "basic pattern" for each human from the plurality of "detected body region points" and the plurality of "detected mid-points".
For example, as shown in Fig. 3, the basic pattern extracting unit 21 includes a basic pattern candidate identifying unit 21A, a base length calculating unit 21B, and a basic pattern forming unit 21C.
The basic pattern candidate identifying unit 21A identifies a plurality of "basic pattern candidates" by classifying, into the same basic pattern candidate, each combination which includes detected body region points that are close in distance to each other in the image from among a plurality of combinations of the plurality of detected base body region points corresponding to the "main type" and the plurality of detected body region points corresponding to the "sub types". The "main type" is, for example, the neck, and the "sub types" are the right shoulder, the left shoulder, the right ear, and the left ear. For example, the basic pattern candidate identifying unit 21A selects, for one detected body region point corresponding to the neck, one detected body region point corresponding to the right shoulder that is closest in distance to the one detected body region point corresponding to the neck from among the plurality of detected body region points corresponding to the right shoulder. This selection is made for each detected body region point corresponding to the neck. Then, when one detected body region point corresponding to the right shoulder is selected for the plurality of detected body region points corresponding to the neck, the basic pattern candidate identifying unit 21A selects one detected body region point corresponding to the neck that is closest in distance to the above-mentioned detected body region point corresponding to the right shoulder from among the plurality of detected body region points corresponding to the neck. That is, the basic pattern candidate identifying unit 21A performs processing using the MLMD (Mutual-Local-Minimum-Distance) algorithm. Thus, one detected body region point corresponding to the neck and one detected body region point corresponding to the right shoulder are selected, and these detected body region points are classified into the same "basic pattern candidate". 
The processing described above is performed for each of the left shoulder, the right ear, and the left ear.
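The mutual selection performed by the MLMD algorithm can be sketched as follows: a neck point and a shoulder point are paired only when each is the other's nearest neighbour. The function name and coordinates are illustrative, not part of the disclosure:

```python
import math

def mlmd_pairs(points_a, points_b):
    """Pair points of two point sets that are mutually nearest neighbours
    (a sketch of the Mutual-Local-Minimum-Distance idea)."""
    def nearest(p, candidates):
        # Index of the candidate closest to point p.
        return min(range(len(candidates)),
                   key=lambda k: math.dist(p, candidates[k]))

    pairs = []
    for i, a in enumerate(points_a):
        j = nearest(a, points_b)                 # nearest B for this A
        if nearest(points_b[j], points_a) == i:  # mutual-minimum check
            pairs.append((i, j))
    return pairs

# Two imagined persons: each neck pairs with the nearby right shoulder.
necks = [(0.0, 0.0), (10.0, 0.0)]
right_shoulders = [(1.0, 1.0), (11.0, 1.0)]
print(mlmd_pairs(necks, right_shoulders))  # [(0, 0), (1, 1)]
```

Because the check is mutual, a shoulder that is merely the closest one to some neck, but itself has a different closer neck, produces no pair.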
The basic pattern forming unit 21C performs "optimization processing" on a plurality of basic pattern candidates identified by the basic pattern candidate identifying unit 21A to thereby form a plurality of basic patterns for the plurality of humans.
The "optimization processing" includes the following processes. A first process is a process of cutting one basic pattern candidate including the plurality of detected body region points corresponding to the main type to convert the one basic pattern candidate into a plurality of the basic pattern candidates each including one detection point corresponding to the main type. That is, when one basic pattern candidate includes a plurality of detected body region points corresponding to the neck, the one basic pattern candidate is converted into a plurality of basic pattern candidates each including one detected body region point corresponding to the neck.
A second process is a process of excluding, from each basic pattern candidate, a detected body region point(s) that is included in the basic pattern candidate, that corresponds to the sub type, and whose distance from the detected body region point corresponding to the main type is longer than a "base length for the basic pattern candidate".
A third process is a process of excluding a basic pattern candidate(s) not including any of a combination of three detected body region points of a "first body region type group" and a combination of three detected body region points of a "second body region type group". For example, the "first body region type group" includes the neck, the left shoulder, and the left ear, and the "second body region type group" includes the neck, the right shoulder, and the right ear.
The base length calculating unit 21B calculates the "base length for each basic pattern candidate" when the above-described first process is completed. The calculation of the "base length for each basic pattern candidate" will be described in detail later.
Like the grouping unit 12 according to the first example embodiment, the grouping unit 22 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in a "mid-point expected area" obtained from links between each of a "plurality of grouping evaluation reference body region points" and a "grouping target detected body region point".
Then, the grouping unit 22 groups the grouping target detected body region point into one of a plurality of person groups respectively corresponding to the plurality of extracted basic patterns based on count values each of which is counted for the corresponding extracted basic pattern. For example, the grouping unit 22 may group the grouping target detected body region point into a person group corresponding to the basic pattern having the largest count value from among the plurality of extracted basic patterns. Alternatively, the grouping unit 22 may group the grouping target detected body region point into a person group corresponding to the basic pattern having the largest count value which is a predetermined value or greater (e.g., 2 or greater) from among the plurality of extracted basic patterns. Further, when there are a plurality of basic patterns having the largest count value, the grouping unit 22 may group the grouping target detected body region point into a basic pattern including the detected base body region point corresponding to the main type having the smallest distance from the grouping target detected body region point. This grouping process will be described in detail later.
<Operation example of pose identifying apparatus>
An example of the processing operation of the pose identifying apparatus 20 having the above configuration will be described.
<Basic pattern extraction process>
The basic pattern extracting unit 21 acquires information about a "position in an image" and a "point type" of each of a plurality of "detected body region points" and a plurality of "detected mid-points". The basic pattern extracting unit 21 extracts a "basic pattern" for each human from the plurality of "detected body region points" and the plurality of "detected mid-points".
Fig. 5 is a diagram for describing the basic pattern extraction process.
First, the basic pattern extraction process starts from a graph G shown in Fig. 5. The graph G includes all detected base body region point pairs corresponding to a group Sc of a "base body region type pair". The group Sc includes, as group elements, a pair of the neck and right shoulder, a pair of the neck and left shoulder, a pair of the neck and right ear, a pair of the neck and left ear, a pair of the right shoulder and right ear, and a pair of the left shoulder and left ear.
Next, the basic pattern candidate identifying unit 21A performs the processing using the MLMD algorithm on each base body region type pair to obtain a graph G-sub, and identifies, as the "basic pattern candidate", a block including a triangle(s) having the respective detected body region points corresponding to the neck in the graph G-sub as vertexes.
Fig. 6 is a diagram for describing types of the basic pattern candidates. As shown in Fig. 6, there may be five types of the basic pattern candidates: TA, TB, TC, TD, and TE. In Fig. 5, these five types of the basic pattern candidates are collectively referred to as "PATTERN-α". For example, the basic pattern candidate corresponding to a person facing the front is likely to be of the type TA. Basic pattern candidates of the types TB, TC, and TD arise in complex environments, for example, due to occlusion.
Then, the basic pattern forming unit 21C performs the optimization processing on "PATTERN-α" to form the plurality of basic patterns.
In the optimization processing, first, since the basic pattern candidate of the type TE shown in Fig. 6 includes two detected body region points corresponding to the neck, the basic pattern forming unit 21C divides the basic pattern candidate of the type TE into two basic pattern candidates each including one detected body region point corresponding to the neck (the above-described first process). Then, the basic pattern candidate of the type TB and the basic pattern candidate of the type TC are obtained. As a result, basic pattern candidates corresponding to the types TA, TB, TC, and TD remain.
Next, the base length calculating unit 21B calculates the "base length" for each basic pattern candidate corresponding to any one of the types TA, TB, TC, and TD. The "base length" is a length serving as a reference for the size of a human body.
Fig. 7 is a diagram for describing the calculation of the base length. First, the base length calculating unit 21B calculates lengths La, Lb, and Lc for each basic pattern candidate.
As shown in Fig. 7, the length La is calculated as a distance between the detected body region point N0 corresponding to the neck and the detected body region point N1 corresponding to the right shoulder in the basic pattern candidate or a distance between the detected body region point N0 corresponding to the neck and the detected body region point N2 corresponding to the left shoulder in the basic pattern candidate. Specifically, when the basic pattern candidate includes both the detected body region point N1 corresponding to the right shoulder and the detected body region point N2 corresponding to the left shoulder, the length La is equal to the smaller one of the distance between the detected body region point N0 corresponding to the neck and the detected body region point N1 corresponding to the right shoulder and the distance between the detected body region point N0 corresponding to the neck and the detected body region point N2 corresponding to the left shoulder. When the basic pattern candidate includes the detected body region point N1 corresponding to the right shoulder but does not include the detected body region point N2 corresponding to the left shoulder, the length La is the distance between the detected body region point N0 corresponding to the neck and the detected body region point N1 corresponding to the right shoulder. When the basic pattern candidate includes the detected body region point N2 corresponding to the left shoulder but does not include the detected body region point N1 corresponding to the right shoulder, the length La is the distance between the detected body region point N0 corresponding to the neck and the detected body region point N2 corresponding to the left shoulder.
As shown in Fig. 7, the length Lb is calculated as a distance between the detected body region point N0 corresponding to the neck and the detected body region point N3 corresponding to the right ear in the basic pattern candidate or a distance between the detected body region point N0 corresponding to the neck and the detected body region point N4 corresponding to the left ear in the basic pattern candidate.
As shown in Fig. 7, the length Lc is calculated as follows. When there are detected mid-points M12_1 and M13_2 corresponding to the chest, the length Lc is a distance between the detected body region point N0 corresponding to the neck in the basic pattern candidate and the detected mid-point M12_1 that corresponds to the right chest and that is closest to the detected body region point N0, or a distance between the detected body region point N0 corresponding to the neck in the basic pattern candidate and the detected mid-point M13_2 that corresponds to the left chest and that is closest to the detected body region point N0. When there are no detected mid-points M12_1 and M13_2 corresponding to the chest, the length Lc is calculated as Lc=La+Lb+1.
Next, the base length calculating unit 21B calculates the base length of each basic pattern candidate based on the calculated lengths La, Lb, and Lc. The base length calculating unit 21B calculates the base length by different calculation methods according to a large/small relation between "Lc" and "La+Lb" and a large/small relation between "Lb" and "La×2". As shown in Fig. 7, for example, when "Lc" is "La+Lb" or less and "Lb" is "La×2" or less, the base length is "Lc". When "Lc" is "La+Lb" or less and "Lb" is larger than "La×2", the base length is "Lc×1.17". When "Lc" is larger than "La+Lb" and "Lb" is "La×2" or less, the base length is "La+Lb". When "Lc" is larger than "La+Lb" and "Lb" is larger than "La×2", the base length is "Lb×1.7". In this example, "Lb" tends to be larger than "La×2" for the basic pattern candidates corresponding to a person facing sideways. Further, "Lb" tends to be "La×2" or less for basic pattern candidates corresponding to a person facing forward or backward. There are cases in which, in the basic pattern candidates corresponding to a person shown at a lower part of the image, his/her chest may not be shown in the image. In this case, "Lc" tends to be larger than "La+Lb".
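The decision table above can be written out directly; the multipliers 1.17 and 1.7 are taken verbatim from the text, and the function name is illustrative:

```python
def base_length(la, lb, lc):
    """Base length per the four cases of Fig. 7 described above."""
    if lc <= la + lb:
        # Chest mid-point is plausible: use Lc, scaled up for a
        # sideways-facing candidate (Lb > 2*La).
        return lc if lb <= la * 2 else lc * 1.17
    # Chest not usable (Lc too large, e.g. chest outside the image).
    return la + lb if lb <= la * 2 else lb * 1.7

# Front-facing candidate with a visible chest: Lc <= La+Lb, Lb <= 2*La.
print(base_length(10.0, 15.0, 20.0))  # 20.0 (base length is Lc)
```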
Returning to the description of Fig. 5, the basic pattern forming unit 21C excludes, from each basic pattern candidate, a detected base body region point(s) that is included in the basic pattern candidate, that corresponds to the sub type, and whose distance from the detected base body region point corresponding to the main type is longer than the "base length for the basic pattern candidate" (the above second process). Then, for example, in the basic pattern candidate of the type TA including two triangles shown in Fig. 6, when the detected body region point corresponding to the ear included in one of the triangles is far from the detected body region point corresponding to the neck, this detected body region point corresponding to the ear is excluded from the basic pattern candidate. Thus, the basic pattern candidate of the type TA is changed to the basic pattern candidate of the type TC. Further, for example, in the basic pattern candidate of the type TA including two triangles, when the detected body region point corresponding to the shoulder included in one of the triangles is far from the detected body region point corresponding to the neck, this detected body region point corresponding to the shoulder is excluded from the basic pattern candidate. Thus, the basic pattern candidate of the type TA is changed to the basic pattern candidate of the type TB. The basic pattern candidate not including any triangle may appear as a result of the processing by this basic pattern forming unit 21C.
The basic pattern forming unit 21C excludes the basic pattern candidate(s) not including any of the combination of the three detected base body region points of the "first body region type group" and the combination of the three detected base body region points of the "second body region type group" (the above-described third process). The "first body region type group" includes the neck, the left shoulder, and the left ear, and the "second body region type group" includes the neck, the right shoulder, and the right ear. That is, the basic pattern candidate(s) not including any of the above triangles is excluded by the processing of the basic pattern forming unit 21C. At this stage, as shown in Fig. 5, four types of the basic pattern candidates, i.e., the types TA, TB, TC, and TD, may remain. These remaining basic pattern candidates are the "basic patterns".
<Grouping process>
The grouping unit 22 counts, for each basic pattern, the number of links in which a corresponding detected mid-point is present in a "mid-point expected area" obtained from links between each of a "plurality of grouping evaluation reference body region points" and a "grouping target detected body region point".
Fig. 8 is a diagram for describing the grouping process. In the example of Fig. 8, the basic patterns 1, 2, and 3 are extracted. For example, as shown in Fig. 8, the grouping unit 22 connects the grouping target detected body region point Ni to each of the grouping evaluation reference body region points N0, N1, and N2 by a "temporary link". Here, the grouping evaluation reference body region points are detected base body region points corresponding to the neck, the right shoulder, and the left shoulder, respectively.
Next, the grouping unit 22 calculates the "mid-point expected area" for each temporary link. Fig. 9 is a diagram for describing the mid-point expected area. As shown in Fig. 9, the mid-point expected area corresponding to the temporary link is an oblong area centered on the center point M'i_j between the two detected body region points Ni and Nj of the temporary link (i.e., the defined mid-point M'i_j). In the example shown in Fig. 9, a major axis length Rmajor of the mid-point expected area is "link distance×0.75", and a minor axis length Rminor is "link distance×0.35".
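A membership test for the mid-point expected area might look like the following sketch. Treating Rmajor and Rminor as semi-axis lengths of an ellipse oriented along the link is an assumption not spelled out above; the ratios 0.75 and 0.35 come from Fig. 9:

```python
import math

def in_expected_area(ni, nj, m):
    """Return True if detected mid-point m falls in the oblong area
    centred on the defined mid-point of the temporary link Ni-Nj.
    Ellipse orientation along the link is assumed."""
    link_dx, link_dy = nj[0] - ni[0], nj[1] - ni[1]
    dist = math.hypot(link_dx, link_dy)                    # link distance
    cx, cy = (ni[0] + nj[0]) / 2.0, (ni[1] + nj[1]) / 2.0  # defined mid-point
    ux, uy = link_dx / dist, link_dy / dist                # unit link vector
    dx, dy = m[0] - cx, m[1] - cy
    along = dx * ux + dy * uy      # offset along the link (major axis)
    across = -dx * uy + dy * ux    # offset across the link (minor axis)
    return (along / (dist * 0.75)) ** 2 + (across / (dist * 0.35)) ** 2 <= 1.0

print(in_expected_area((0, 0), (10, 0), (5, 1)))  # True: near the mid-point
print(in_expected_area((0, 0), (10, 0), (5, 4)))  # False: too far off-axis
```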
Then, the grouping unit 22 determines whether the detected mid-point Mi_j is present in the mid-point expected area corresponding to the temporary link of the detected body region points Ni and Nj. Then, the grouping unit 22 defines the temporary link corresponding to the mid-point expected area where the detected mid-point Mi_j is present as a "candidate link". The grouping unit 22 counts the number of candidate links for each basic pattern. Fig. 8 shows the detected mid-points present in the mid-point expected area. That is, in the example of Fig. 8, the count number of the basic pattern 1 is "1", the count number of the basic pattern 2 is "3", and the count number of the basic pattern 3 is "2".
In this case, the grouping unit 22 may, for example, group the grouping target detected body region point Ni into a person group corresponding to the basic pattern 2 having the largest count number.
Fig. 10 is another diagram for describing the grouping process. In the example of Fig. 10, the count numbers of the basic patterns 2 and 3 are both "2", and the basic patterns 2 and 3 have the largest count number. In this case, the grouping unit 22 may group the grouping target detected body region point Ni into a person group corresponding to a basic pattern (i.e., the basic pattern 2) including a detected base body region point N0-bp2 corresponding to the main type having the smallest distance from the grouping target detected body region point Ni.
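The selection rule, including the tie-break by distance to the detected base body region point of the main type, can be sketched as follows; the pattern names and coordinates are illustrative:

```python
import math

def assign_group(counts, neck_positions, target):
    """Pick the basic pattern with the largest candidate-link count;
    break ties by distance from the grouping target point to each
    pattern's main-type (neck) point."""
    best = max(counts.values())
    tied = [p for p, c in counts.items() if c == best]
    return min(tied, key=lambda p: math.dist(target, neck_positions[p]))

counts = {"bp1": 1, "bp2": 2, "bp3": 2}            # bp2 and bp3 tie at 2
necks = {"bp1": (0, 0), "bp2": (4, 0), "bp3": (9, 0)}
print(assign_group(counts, necks, target=(5, 0)))  # bp2 (closest neck)
```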
Other Embodiments
Fig. 11 is a diagram showing an example of a hardware configuration of the pose identifying apparatus. In Fig. 11, the pose identifying apparatus includes a processor 101 and a memory 102. The processor 101 may be, for example, a microprocessor, a Micro Processing Unit (MPU), or a Central Processing Unit (CPU). The processor 101 may include a plurality of processors. The memory 102 is composed of a combination of a volatile memory and a non-volatile memory. The memory 102 may include a storage located separately from the processor 101. In this case, the processor 101 may access the memory 102 via an I/O interface (not shown).
Each of the pose identifying apparatus 10 according to the first example embodiment and the pose identifying apparatus 20 according to the second example embodiment can include the hardware configuration shown in Fig. 11. The basic pattern extracting units 11 and 21 and the grouping units 12 and 22 of the pose identifying apparatuses 10 and 20 according to the first and second example embodiments may be achieved by the processor 101 reading a program stored in the memory 102 and executing it. The program can be stored and provided to the pose identifying apparatuses 10 and 20 using any type of non-transitory computer readable media. Non-transitory computer readable media include any type of tangible storage media. Examples of non-transitory computer readable media include magnetic storage media (such as floppy disks, magnetic tapes, and hard disk drives) and magneto-optical storage media (e.g., magneto-optical disks). Examples of non-transitory computer readable media further include CD-ROM (Read Only Memory), CD-R, and CD-R/W. Examples of non-transitory computer readable media further include semiconductor memories. The semiconductor memories include, for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, and RAM (Random Access Memory). The program may also be provided to the pose identifying apparatuses 10 and 20 using any type of transitory computer readable media. Transitory computer readable media can provide the program to the pose identifying apparatuses 10 and 20 via a wired communication line (e.g., electric wires and optical fibers) or a wireless communication line.
Although the present disclosure has been described with reference to the example embodiments so far, the present disclosure is not limited by the above. Various modifications that can be understood by a person skilled in the art within the scope of the present disclosure can be made to the configuration and details of the present disclosure.
The whole or part of the example embodiments disclosed above can be described as, but not limited to, the following supplementary notes.
(Supplementary note 1)
A pose identifying apparatus comprising:
basic pattern extracting means for extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
grouping means for counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values counted for the extracted basic patterns.
(Supplementary note 2)
The pose identifying apparatus according to Supplementary note 1, wherein
the grouping means groups the grouping target detected body region point into a person group having the largest count value from among the plurality of extracted basic patterns.
(Supplementary note 3)
The pose identifying apparatus according to Supplementary note 1, wherein
the grouping means groups the grouping target detected body region point into a person group corresponding to a basic pattern whose count value is largest from among the plurality of extracted basic patterns and is equal to or greater than a predetermined value.
(Supplementary note 4)
The pose identifying apparatus according to Supplementary note 2 or 3, wherein
when there are a plurality of the basic patterns having the largest count value, the grouping means groups the grouping target detected body region point into a person group corresponding to a basic pattern whose distance from the grouping target detected body region point in the image is shortest from among the basic patterns having the largest count value.
(Supplementary note 5)
The pose identifying apparatus according to any one of Supplementary notes 1 to 4, wherein
the mid-point expected area is a predetermined middle area including a defined mid-point defined as a center point of the link.
(Supplementary note 6)
The pose identifying apparatus according to any one of Supplementary notes 1 to 5, wherein
the plurality of base body region types include a main type and a plurality of sub types,
the basic pattern extracting means comprises:
basic pattern candidate identifying means for identifying a plurality of basic pattern candidates by classifying, into the same basic pattern candidate, each combination which includes detected body region points that are close in distance to each other in the image from among a plurality of combinations of the plurality of detected body region points corresponding to the main type and the plurality of detected body region points corresponding to the respective sub types; and
basic pattern formation means for forming the plurality of basic patterns for the plurality of humans by performing optimization processing on the identified plurality of basic pattern candidates.
(Supplementary note 7)
The pose identifying apparatus according to Supplementary note 6, wherein the optimization processing comprises:
dividing one basic pattern candidate including the plurality of detected body region points corresponding to the main type and converting the one basic pattern candidate into the plurality of basic pattern candidates each including one detected body region point corresponding to the main type;
excluding, from each basic pattern candidate, the detected body region point that is included in the basic pattern candidate, that corresponds to the sub type, and whose distance from the detected body region point corresponding to the main type is longer than a base length for the basic pattern candidate; and
excluding the basic pattern candidate not including any of a combination of three detected body region points which belong to a first body region type group and a combination of three detected body region points which belong to a second body region type group.
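The three optimization steps of Supplementary note 7 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the candidate data layout, the per-candidate `base_length` argument, and the type names are assumptions (the base length would come from a separate base length calculation, cf. unit 21B).

```python
from math import dist

# Assumed body-region type names; per Supplementary note 8, the main type
# is the neck and the sub types are the shoulders and ears.
MAIN = "neck"
FIRST_GROUP = {"neck", "left_shoulder", "left_ear"}
SECOND_GROUP = {"neck", "right_shoulder", "right_ear"}

def optimize(candidates, base_length):
    """Sketch of the optimization processing of Supplementary note 7.

    Each candidate maps a body-region type to a list of detected (x, y)
    points; `base_length` is assumed to be supplied per candidate.
    """
    # Step 1: divide a candidate holding several main-type points into
    # one candidate per main-type point.
    split = []
    for cand in candidates:
        for main_pt in cand.get(MAIN, []):
            split.append({MAIN: main_pt,
                          **{t: pts for t, pts in cand.items() if t != MAIN}})

    # Step 2: exclude sub-type points whose distance from the main-type
    # point exceeds the base length for the candidate.
    for cand in split:
        main_pt = cand[MAIN]
        for t in [t for t in cand if t != MAIN]:
            cand[t] = [p for p in cand[t] if dist(p, main_pt) <= base_length]

    # Step 3: exclude candidates containing neither a complete first
    # body-region type group nor a complete second group.
    def has_group(cand, group):
        return all(t == MAIN or cand.get(t) for t in group)

    return [c for c in split
            if has_group(c, FIRST_GROUP) or has_group(c, SECOND_GROUP)]
```

With one candidate containing two neck points, step 1 yields two candidates; a candidate surviving step 3 keeps only sub-type points within the base length of its neck point.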
(Supplementary note 8)
The pose identifying apparatus according to Supplementary note 7, wherein
the main type is a neck,
the sub types are a left shoulder, a right shoulder, a left ear, and a right ear,
the first body region type group includes the neck, the left shoulder, and the left ear, and
the second body region type group includes the neck, the right shoulder, and the right ear.
(Supplementary note 9)
The pose identifying apparatus according to Supplementary note 1, wherein the basic pattern includes at least one of a combination of three detected base body region points corresponding to a neck, a left shoulder, and a left ear and a combination of three detected base body region points corresponding to the neck, a right shoulder, and a right ear.
(Supplementary note 10)
A pose identifying method comprising:
extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values for the extracted basic patterns.
(Supplementary note 11)
A non-transitory computer readable medium storing a program for causing a pose identifying apparatus to execute:
extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values for the extracted basic patterns.
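The counting-and-grouping step of Supplementary notes 10 and 11 can be sketched as below. This assumes the mid-point expected area is a circle of a given `radius` around each link's center point (one possible reading of the "predetermined middle area" of claim 5); the function names and data layout are illustrative, not from the disclosure.

```python
from math import dist

def count_supported_links(reference_points, target_point,
                          detected_midpoints, radius):
    """Count links, between each grouping-evaluation reference point and
    the grouping-target point, whose expected mid-point area contains a
    detected mid-point."""
    count = 0
    for ref in reference_points:
        # Center point of the link from the reference point to the target.
        center = ((ref[0] + target_point[0]) / 2,
                  (ref[1] + target_point[1]) / 2)
        if any(dist(mp, center) <= radius for mp in detected_midpoints):
            count += 1
    return count

def group_target_point(basic_patterns, target_point,
                       detected_midpoints, radius):
    """Assign the target point to the person group whose basic pattern
    yields the largest count (ties and thresholds omitted for brevity)."""
    counts = [count_supported_links(pattern, target_point,
                                    detected_midpoints, radius)
              for pattern in basic_patterns]
    return max(range(len(basic_patterns)), key=counts.__getitem__)
```

A target point near one person's basic pattern accumulates a high count for that pattern, since detected mid-points fall on the centers of its links, and is therefore grouped with that person.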
10 POSE IDENTIFYING APPARATUS
11 BASIC PATTERN EXTRACTING UNIT
12 GROUPING UNIT
20 POSE IDENTIFYING APPARATUS
21 BASIC PATTERN EXTRACTING UNIT
21A BASIC PATTERN CANDIDATE IDENTIFYING UNIT
21B BASE LENGTH CALCULATING UNIT
21C BASIC PATTERN FORMING UNIT
22 GROUPING UNIT
Claims (11)
- A pose identifying apparatus comprising:
basic pattern extracting means for extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
grouping means for counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values counted for the extracted basic patterns.
- The pose identifying apparatus according to Claim 1, wherein
the grouping means groups the grouping target detected body region point into a person group corresponding to a basic pattern having the largest count value from among the plurality of extracted basic patterns.
- The pose identifying apparatus according to Claim 1, wherein
the grouping means groups the grouping target detected body region point into a person group corresponding to a basic pattern whose count value is largest from among the plurality of extracted basic patterns and is equal to or greater than a predetermined value.
- The pose identifying apparatus according to Claim 2 or 3, wherein
when there are a plurality of the basic patterns having the largest count value, the grouping means groups the grouping target detected body region point into a person group corresponding to a basic pattern whose distance from the grouping target detected body region point in the image is shortest from among the basic patterns having the largest count value.
- The pose identifying apparatus according to any one of Claims 1 to 4, wherein
the mid-point expected area is a predetermined middle area including a defined mid-point defined as a center point of the link.
- The pose identifying apparatus according to any one of Claims 1 to 5, wherein
the plurality of base body region types include a main type and a plurality of sub types,
the basic pattern extracting means comprises:
basic pattern candidate identifying means for identifying a plurality of basic pattern candidates by classifying, into the same basic pattern candidate, a combination which includes detected body region points that are close in distance to each other in the image, from among a plurality of combinations of the plurality of detected body region points corresponding to the main type and the plurality of detected body region points corresponding to the respective sub types; and
basic pattern formation means for forming the plurality of basic patterns for the plurality of humans by performing optimization processing on the identified plurality of basic pattern candidates.
- The pose identifying apparatus according to Claim 6, wherein the optimization processing comprises:
dividing one basic pattern candidate including the plurality of detected body region points corresponding to the main type and converting the one basic pattern candidate into the plurality of basic pattern candidates each including one detected body region point corresponding to the main type;
excluding, from each basic pattern candidate, the detected body region point that is included in the basic pattern candidate, that corresponds to the sub type, and whose distance from the detected body region point corresponding to the main type is longer than a base length for the basic pattern candidate; and
excluding the basic pattern candidate not including any of a combination of three detected body region points which belong to a first body region type group and a combination of three detected body region points which belong to a second body region type group.
- The pose identifying apparatus according to Claim 7, wherein
the main type is a neck,
the sub types are a left shoulder, a right shoulder, a left ear, and a right ear,
the first body region type group includes the neck, the left shoulder, and the left ear, and
the second body region type group includes the neck, the right shoulder, and the right ear.
- The pose identifying apparatus according to Claim 1, wherein the basic pattern includes at least one of a combination of three detected base body region points corresponding to a neck, a left shoulder, and a left ear and a combination of three detected base body region points corresponding to the neck, a right shoulder, and a right ear.
- A pose identifying method comprising:
extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values for the extracted basic patterns.
- A non-transitory computer readable medium storing a program for causing a pose identifying apparatus to execute:
extracting a basic pattern for each human from a plurality of detected body region points and a plurality of detected mid-points, which are detected, in an image including a plurality of person images respectively corresponding to a plurality of humans, for a plurality of predetermined detection target points for a human, wherein the predetermined detection target points include a plurality of body region points of the human and a mid-point of each body region point pair composed of two body region points, and wherein the basic pattern includes a plurality of detected base body region points corresponding to a plurality of base body region types that are different from each other; and
counting, for each extracted basic pattern, the number of links in which a corresponding detected mid-point is present in a mid-point expected area obtained from links between a plurality of grouping evaluation reference body region points and a grouping target detected body region point, wherein the grouping evaluation reference body region points are composed of some or all of a plurality of detected base body region points included in the extracted basic pattern, and then grouping the grouping target detected body region point into one of a plurality of person groups respectively corresponding to a plurality of the extracted basic patterns based on count values for the extracted basic patterns.
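The basic-pattern-candidate identification of claim 6 can be sketched as follows. Here "close in distance to each other" is approximated by assigning each sub-type point to its nearest main-type point; this is one possible interpretation rather than the claim's exact criterion, and all names are illustrative.

```python
from math import dist

def identify_candidates(main_points, sub_points_by_type):
    """Sketch of basic-pattern-candidate identification: start one
    candidate per main-type point, then attach each sub-type point to
    the candidate whose main-type point is nearest in the image."""
    candidates = [{"main": m} for m in main_points]
    for sub_type, points in sub_points_by_type.items():
        for cand in candidates:
            cand[sub_type] = []
        for p in points:
            nearest = min(candidates, key=lambda c: dist(c["main"], p))
            nearest[sub_type].append(p)
    return candidates
```

With two neck detections far apart, each shoulder or ear detection is classified into the candidate of the neck it lies closest to, yielding one candidate per person for the subsequent optimization processing.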
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2020/013306 WO2021192085A1 (en) | 2020-03-25 | 2020-03-25 | Pose identifying apparatus, pose identifying method, and non-transitory computer readable medium storing program |
JP2022547155A JP7323079B2 (en) | 2020-03-25 | 2020-03-25 | Posture identification device, posture identification method, and program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021192085A1 (en) | 2021-09-30 |
Family
ID=77889999
Country Status (2)
Country | Link |
---|---|
JP (1) | JP7323079B2 (en) |
WO (1) | WO2021192085A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017091377A (en) * | 2015-11-13 | 2017-05-25 | 日本電信電話株式会社 | Attitude estimation device, attitude estimation method, and attitude estimation program |
JP2018057596A (en) * | 2016-10-05 | 2018-04-12 | コニカミノルタ株式会社 | Joint position estimation device and joint position estimation program |
JP2018147313A (en) * | 2017-03-07 | 2018-09-20 | Kddi株式会社 | Object attitude estimating method, program and device |
JP2020042476A (en) * | 2018-09-10 | 2020-03-19 | 国立大学法人 東京大学 | Method and apparatus for acquiring joint position, and method and apparatus for acquiring motion |
Also Published As
Publication number | Publication date |
---|---|
JP2023512318A (en) | 2023-03-24 |
JP7323079B2 (en) | 2023-08-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20927400 Country of ref document: EP Kind code of ref document: A1 |
ENP | Entry into the national phase |
Ref document number: 2022547155 Country of ref document: JP Kind code of ref document: A |
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 20927400 Country of ref document: EP Kind code of ref document: A1 |