WO2022254644A1 - 姿勢推定装置、姿勢推定方法、及びコンピュータ読み取り可能な記録媒体 - Google Patents
姿勢推定装置、姿勢推定方法、及びコンピュータ読み取り可能な記録媒体 (Posture estimation apparatus, posture estimation method, and computer-readable recording medium)
- Publication number
- WO2022254644A1 (PCT/JP2021/021140)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- joint
- person
- posture estimation
- detected
- joints
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims description 45
- 238000006073 displacement reaction Methods 0.000 claims abstract description 74
- 238000004364 calculation method Methods 0.000 claims abstract description 70
- 239000011159 matrix material Substances 0.000 claims description 18
- 238000012545 processing Methods 0.000 description 41
- 238000010586 diagram Methods 0.000 description 22
- 238000013527 convolutional neural network Methods 0.000 description 18
- 238000012986 modification Methods 0.000 description 15
- 230000004048 modification Effects 0.000 description 15
- 210000003127 knee Anatomy 0.000 description 7
- 210000003423 ankle Anatomy 0.000 description 6
- 230000006870 function Effects 0.000 description 6
- 210000000707 wrist Anatomy 0.000 description 6
- 238000000605 extraction Methods 0.000 description 5
- 238000004891 communication Methods 0.000 description 4
- 238000012549 training Methods 0.000 description 4
- 230000005540 biological transmission Effects 0.000 description 3
- 238000013135 deep learning Methods 0.000 description 3
- 238000003384 imaging method Methods 0.000 description 3
- 238000012795 verification Methods 0.000 description 3
- 238000006243 chemical reaction Methods 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000010801 machine learning Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 239000004065 semiconductor Substances 0.000 description 2
- 210000001015 abdomen Anatomy 0.000 description 1
- 210000003451 celiac plexus Anatomy 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/0059—Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence
- A61B5/0077—Devices for viewing the surface of the body, e.g. camera, magnifying lens
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1116—Determining posture transitions
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/103—Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
- A61B5/11—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
- A61B5/1126—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb using a particular sensing technique
- A61B5/1128—Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb using a particular sensing technique using image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B2576/00—Medical imaging apparatus involving image processing or analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30232—Surveillance
Definitions
- the present invention relates to a posture estimation device and a posture estimation method for estimating the posture of a person in an image, and a computer-readable recording medium for realizing them.
- Non-Patent Documents 1 and 2 disclose examples of systems for estimating a person's posture. Specifically, the system disclosed in Non-Patent Document 1 first acquires image data output from a camera, and estimates, from the acquired image data, the joints of the persons in the image and the vector fields between the joints. The system disclosed in Non-Patent Document 1 then determines the inter-joint direction for each pair of two adjacent joints.
- Next, for each pair of two adjacent joints, the system disclosed in Non-Patent Document 1 obtains the inner product of the determined direction and the vector field estimated between the joints, and further calculates, based on the inner product, the likelihood that the joints are connected. After that, the system disclosed in Non-Patent Document 1 identifies the joints to be connected based on the likelihood, and estimates the posture of the person.
- The system disclosed in Non-Patent Document 2 first acquires image data output from a camera, inputs the acquired image data to a detector, and causes the detector to output the reference position of each person in the image and the relative position of each joint from that reference position. Next, the system disclosed in Non-Patent Document 2 estimates the posture of the person in the image based on the output reference position of the person and the relative positions of the joints.
- The detector in this case is constructed by machine learning using, as training data, images together with the reference position of the person in each image and the relative positions of the joints.
- However, in the systems disclosed in Non-Patent Documents 1 and 2, there is a problem that, if a part of the person whose pose is to be estimated is hidden by another person or an object in the image, the pose cannot be estimated accurately.
- Suppose, for example, that the right knee of the person to be estimated is detected in the image, but the right ankle is hidden by another person and is not detected.
- In this case, the right knee of the person to be estimated is likely to be connected to the right ankle of the other person, making it impossible to estimate the posture accurately.
- An example of an object of the present invention is to provide a posture estimation device, a posture estimation method, and a computer-readable recording medium that can improve the accuracy of posture estimation when a part of the person to be estimated is hidden.
- A posture estimation device in one aspect of the present invention includes: a position calculation unit that calculates, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and
- a posture estimation unit that determines, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position.
- A posture estimation method in one aspect of the present invention includes: a position calculation step of calculating, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and
- a posture estimation step of determining, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position.
- A computer-readable recording medium in one aspect of the present invention records a program that includes instructions for causing a computer to execute: a position calculation step of calculating, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and
- a posture estimation step of determining, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position.
- FIG. 1 is a configuration diagram showing a schematic configuration of a posture estimation apparatus according to Embodiment 1.
- FIGS. 2A to 2C are diagrams showing positions calculated by the position calculation unit; FIG. 2A shows an example of joint positions, FIG. 2B shows an example of relative displacements, and FIG. 2C shows an example of temporary reference positions.
- FIG. 3 is a diagram showing a specific configuration and processing of the position calculation unit.
- FIG. 4 is a diagram illustrating an outline of processing by a position calculation unit and a posture estimation unit;
- FIG. 5 is a diagram showing specific processing in the posture estimation unit.
- FIG. 6 is a diagram illustrating an outline of processing by a position calculation unit and a posture estimation unit when a reference position cannot be detected;
- FIG. 7 is a diagram showing specific processing in the posture estimation unit when the reference position cannot be detected.
- FIG. 8 is a flow chart showing the operation of the posture estimation device according to Embodiment 1.
- FIG. 9 is a diagram showing a specific configuration and processing of a position calculation unit according to Embodiment 2.
- FIG. 10 is a diagram illustrating a specific configuration and processing of a position calculation unit in Modification 1 of Embodiment 2.
- FIG. 11 is a diagram illustrating a specific configuration and processing of a position calculation unit in Modification 2 of Embodiment 2.
- FIG. 12 is a block diagram showing an example of a computer that implements the posture estimation apparatus according to Embodiments 1 and 2.
- (Embodiment 1) A posture estimation device, a posture estimation method, and a program according to Embodiment 1 will be described below with reference to FIGS. 1 to 8.
- First, a schematic configuration of the posture estimation apparatus according to Embodiment 1 will be described with reference to FIG. 1. FIG. 1 is a configuration diagram showing the schematic configuration of the posture estimation apparatus according to Embodiment 1.
- the posture estimation device 10 according to Embodiment 1 shown in FIG. 1 is a device that estimates the posture of a person in an image. As shown in FIG. 1 , posture estimation device 10 includes position calculation section 20 and posture estimation section 30 .
- The position calculation unit 20 calculates, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement (hereinafter referred to as the "relative displacement") from the joint to a reference part of the person.
- the posture estimation unit 30 determines the person to whom the joint belongs based on the calculated temporary reference position for each detected joint.
- In this way, in Embodiment 1, a temporary reference position is calculated for each joint using the position of the joint detected on the image data and the relative displacement from the joint to a reference part of the person (for example, the abdomen or the neck), and the temporary reference position is used to determine which person each joint belongs to.
- Even if a part of a person is hidden in the image, each detected joint can be connected to the person as long as the relative displacement from the joint to the reference part is known. Therefore, according to Embodiment 1, it is possible to improve the accuracy of posture estimation even when a part of the person to be estimated is hidden.
- FIGS. 2A to 2C are diagrams showing positions calculated by the position calculation unit; FIG. 2A shows an example of joint positions, FIG. 2B shows an example of relative displacements, and FIG. 2C shows an example of temporary reference positions.
- FIG. 3 is a diagram showing a specific configuration and processing of the position calculation unit.
- As shown in FIG. 2A, the position calculation unit 20 detects the joints of the person and the reference part of the person from the image data.
- the position calculation unit 20 also estimates the detected positions of the joints (hereinafter referred to as “joint positions”) and the detected positions of the reference parts. This estimated position of the reference part is not a temporary position but a true "reference position”.
- the target joints are the right wrist, right elbow, right ankle, right knee, left wrist, left elbow, left ankle, left knee, etc., and are set in advance.
- a reference part is also set in advance. Examples of reference sites include the solar plexus, the base of the neck, and the like.
- In FIG. 2A, the joint positions and the reference position are indicated by different markers.
- the position calculation unit 20 can also estimate the positions of preset parts, such as the position of the head, in addition to the joint positions and the reference positions.
- The position of the head is also indicated by a marker in FIG. 2A.
- the joint position includes the position of the head.
- As shown in FIG. 2B, the position calculation unit 20 also estimates, from the image data, the displacement (x, y) from each joint in the image to the reference part.
- the position calculation unit 20 calculates a temporary reference position for each joint.
- the provisional reference position is the provisional position of the reference part of the person estimated from the joint position of each joint.
- The temporary reference position may differ for each joint. Specifically, as shown in FIG. 2C, for each joint in the image, the position calculation unit 20 adds together the coordinates of the joint position and the relative displacement of the joint to calculate the temporary reference position.
- In FIG. 2C, the temporary reference positions are indicated by markers.
- Note that the position of the reference part (the reference position) and the provisional reference position calculated for each joint do not exactly match, because the joint positions, the reference position, and the displacements are each estimated from the image by image processing.
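- As a concrete illustration of the calculation of the temporary reference positions, the following is a minimal Python sketch (not part of the publication; the joint names, coordinates, and displacements are invented for illustration) of how each joint's estimated position and relative displacement could be added together.

```python
# Minimal sketch: temporary reference position = joint position + relative displacement.
# Joint names and numbers are illustrative only.
joint_positions = {"right_knee": (120.0, 310.0), "right_wrist": (95.0, 180.0)}
relative_displacements = {"right_knee": (-15.0, -120.0), "right_wrist": (10.0, 35.0)}

temporary_reference_positions = {
    name: (x + dx, y + dy)
    for name, (x, y) in joint_positions.items()
    for dx, dy in [relative_displacements[name]]
}
print(temporary_reference_positions)
# Each joint effectively "votes" for where the person's reference part (e.g. solar plexus) should be.
```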
- the image processing used in Embodiment 1 will be described later.
- The position calculation unit 20 includes a convolutional neural network (CNN) 21 and a calculation processing unit 22.
- When image data of a person is input, the CNN 21 outputs a map (hereinafter referred to as the "joint position/reference position map") 23 indicating the existence probability of each reference part and each joint of the person. In addition, the CNN 21 also outputs a map (hereinafter referred to as the "relative displacement map") 24 indicating the relative displacement of each joint of the person.
- the joint position/reference position map 23 is, for example, a two-dimensional heat map that expresses the existence probability of the target in terms of density.
- the relative displacement map 24 is a map that stores the magnitude and direction of relative displacement in elements corresponding to joint positions on the map.
- the CNN 21 is constructed by performing deep learning using an image to be extracted and a label indicating the extraction target as training data.
- When there are multiple persons in the image data, the CNN 21 outputs a joint position/reference position map 23 for each joint in the image.
- each of the joint position/reference position maps 23 is provided with information indicating a joint site (right elbow, left elbow, etc.) or information indicating a reference site.
- the relative displacement map 24 is also provided with information indicating the corresponding joint parts.
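- The publication does not give the architecture of the CNN 21, so the following PyTorch sketch is only an assumed illustration of a network that outputs the two kinds of maps described above: per-joint/per-reference-part existence heatmaps and per-joint (dx, dy) relative displacement channels. The channel counts and layer sizes are illustrative, not taken from the publication.

```python
import torch
import torch.nn as nn

class PoseMapHead(nn.Module):
    """Illustrative sketch only: a small backbone followed by two heads, one producing
    existence heatmaps for the joints and the reference part, and one producing
    per-joint (dx, dy) relative displacement channels."""
    def __init__(self, num_joints=14, num_reference_parts=1, feat=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
        )
        self.heatmap_head = nn.Conv2d(feat, num_joints + num_reference_parts, 1)
        self.displacement_head = nn.Conv2d(feat, 2 * num_joints, 1)

    def forward(self, image):
        features = self.backbone(image)
        return torch.sigmoid(self.heatmap_head(features)), self.displacement_head(features)

heatmaps, displacements = PoseMapHead()(torch.randn(1, 3, 64, 64))
print(heatmaps.shape, displacements.shape)   # (1, 15, 64, 64) and (1, 28, 64, 64)
```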
- When the reference part does not appear in the image, the position calculation unit 20 cannot detect the reference part; in that case, it detects only the joints appearing in the image and estimates only the joint positions of the detected joints.
- the calculation processing unit 22 uses the joint position/reference position map 23 to estimate the joint position and the reference position of each joint.
- the calculation processing unit 22 also uses the relative displacement map 24 to estimate the relative displacement of each joint.
- On the image data, each joint and each reference part is composed of a plurality of pixels.
- As shown in FIG. 3, the calculation processing unit 22 therefore calculates, for each of the joint positions, the reference position, and the relative displacements, the coordinates (x, y) of each pixel constituting it.
- In FIG. 3, one pixel is represented by two rectangles, one corresponding to the x-coordinate and the other to the y-coordinate.
- the calculation processing unit 22 calculates the coordinates of each pixel of each tentative reference position using the coordinates of each pixel of the joint position and the relative displacement.
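- As an illustration of this per-pixel computation, the following NumPy sketch (the array shapes and the probability threshold are assumptions, not taken from the publication) adds the relative displacement stored at each sufficiently probable joint pixel to that pixel's coordinates.

```python
import numpy as np

def temporary_reference_pixels(heatmap, displacement_map, threshold=0.5):
    """For every pixel whose joint existence probability exceeds `threshold`,
    add the relative displacement stored at that pixel to obtain a temporary
    reference position. Shapes: heatmap (H, W), displacement_map (H, W, 2)."""
    ys, xs = np.nonzero(heatmap > threshold)          # pixels belonging to the joint
    dx = displacement_map[ys, xs, 0]
    dy = displacement_map[ys, xs, 1]
    return np.stack([xs + dx, ys + dy], axis=1)       # (N, 2) temporary reference positions

# Illustrative data only: a 64x64 map with one activated joint region.
heatmap = np.zeros((64, 64)); heatmap[30:33, 20:23] = 0.9
displacement = np.zeros((64, 64, 2)); displacement[..., 0] = 5.0; displacement[..., 1] = -12.0
print(temporary_reference_pixels(heatmap, displacement))
```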
- FIG. 4 is a diagram illustrating an outline of processing by a position calculation unit and a posture estimation unit
- FIG. 5 is a diagram showing specific processing in the posture estimation unit.
- The position calculation unit 20 first estimates the joint position, the reference position, and the relative displacement of each joint, and then calculates the temporary reference position of each joint.
- The posture estimation unit 30 then determines, for each joint, whether the joint belongs to the person corresponding to a detected reference position, based on the provisional reference position of the joint and the detected reference positions.
- Specifically, as shown in FIG. 5, the posture estimation unit 30 first obtains, for each detected joint, the distance between its provisional reference position and each estimated reference position, i.e., a distance matrix over the joints and reference positions. The posture estimation unit 30 then associates each joint with the reference position for which this distance is minimized, provided the distance is less than a certain value. In this way, the person to which each joint belongs is determined.
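- One plausible way to realize this matching is sketched below: build a distance matrix between every temporary reference position and every detected reference position, then attach each joint to the closest reference position, provided the distance is below a threshold. The threshold value and the simple per-joint nearest-neighbour assignment are assumptions.

```python
import numpy as np

def assign_joints_to_persons(temp_refs, ref_positions, max_dist=50.0):
    """temp_refs: dict joint_name -> (x, y) temporary reference position.
    ref_positions: (P, 2) array of detected reference positions, one per person.
    Returns dict joint_name -> person index, or -1 if no person is close enough."""
    names = list(temp_refs)
    t = np.array([temp_refs[n] for n in names])                                   # (J, 2)
    dist = np.linalg.norm(t[:, None, :] - ref_positions[None, :, :], axis=2)      # (J, P) distance matrix
    assignment = {}
    for i, name in enumerate(names):
        p = int(np.argmin(dist[i]))
        assignment[name] = p if dist[i, p] < max_dist else -1
    return assignment

refs = np.array([[110.0, 200.0], [300.0, 205.0]])          # two detected persons (illustrative)
temps = {"right_knee": (105.0, 190.0), "left_elbow": (310.0, 210.0)}
print(assign_joints_to_persons(temps, refs))               # e.g. {'right_knee': 0, 'left_elbow': 1}
```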
- FIG. 6 is a diagram illustrating an outline of processing by a position calculation unit and a posture estimation unit when a reference position cannot be detected
- FIG. 7 is a diagram showing specific processing in the attitude estimation unit when the reference position cannot be detected.
- the position calculation unit 20 detects each joint appearing on the image data, estimates the joint position and relative displacement only for the detected joint, and further calculates a provisional reference position.
- The posture estimation unit 30 performs clustering on the provisional reference positions of the detected joints and, based on the clustering result, determines the person to which each detected joint belongs.
- the posture estimation unit 30 develops the temporary reference positions of the detected joints in the feature space.
- The feature space in this case is two-dimensional, because the provisional reference positions are represented by two-dimensional coordinates.
- The posture estimation unit 30 clusters the provisional reference positions developed in the feature space by the following processes (a) to (e).
- the posture estimation unit 30 also prevents provisional reference positions for multiple joints of the same type (for example, right wrist and right wrist) from being included in the same cluster. Then, posture estimation section 30 assumes that the joints at the tentative reference positions included in the same cluster belong to the same person.
- The processing shown in FIGS. 6 and 7 is executed when no reference part is detected for any person in the image data. For example, when a plurality of joints are detected but no reference part is detected, the posture estimation unit 30 determines that no reference part has been detected for any person in the image data, and the processing shown in FIGS. 6 and 7 is executed. Note that the posture estimation unit 30 can also execute the processing shown in FIGS. 6 and 7 when reference parts have been detected for all persons in the image data.
- After the joints are associated with persons, the posture estimation unit 30 estimates the posture of each person based on the positions of the joints belonging to that person. Specifically, when some joints are missing due to non-detection or the like, the posture estimation unit 30 can estimate the final pose of the person by using a machine learning model that estimates the positions of all joints from the available joint position information.
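- The publication does not specify this completion model, so the following is only an interface sketch: a hypothetical regressor maps the known joint coordinates plus a visibility mask to a full set of joint coordinates, and a trivial placeholder stands in for the learned model.

```python
import numpy as np

def complete_pose(joints, joint_order, model):
    """joints: dict joint_name -> (x, y) for detected joints only.
    model: any callable mapping a (2*K + K,) feature vector (masked coordinates
    plus visibility mask) to a (2*K,) vector of all joint coordinates.
    Interface sketch only; the actual model used in the publication is unspecified."""
    k = len(joint_order)
    coords = np.zeros(2 * k)
    mask = np.zeros(k)
    for i, name in enumerate(joint_order):
        if name in joints:
            coords[2 * i:2 * i + 2] = joints[name]
            mask[i] = 1.0
    full = model(np.concatenate([coords, mask]))
    return {name: (full[2 * i], full[2 * i + 1]) for i, name in enumerate(joint_order)}

# Placeholder "model" that simply echoes the known coordinates (for demonstration only).
identity_model = lambda v: v[: (2 * len(v)) // 3]
print(complete_pose({"right_knee": (120.0, 310.0)}, ["right_knee", "right_ankle"], identity_model))
```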
- FIG. 8 is a flow chart showing the operation of the posture estimation device according to Embodiment 1.
- FIGS. 1 to 7 are referred to as appropriate in the following description.
- In Embodiment 1, the posture estimation method is implemented by operating the posture estimation apparatus 10. Therefore, the following description of the operation of the posture estimation apparatus 10 also serves as the description of the posture estimation method in Embodiment 1.
- the position calculation unit 20 first acquires image data (step A1).
- the image data in step A1 may be image data directly output from an imaging device such as a surveillance camera, or may be image data stored in a storage device.
- Next, the position calculation unit 20 detects the joints and reference parts of persons from the image data, and estimates the joint positions, relative displacements, and reference positions (step A2). Specifically, when the image data acquired in step A1 is input to the CNN 21, the joint position/reference position map 23 and the relative displacement map 24 are output. The calculation processing unit 22 then uses the joint position/reference position map 23 to estimate the joint positions and the reference positions, and uses the relative displacement map 24 to estimate the relative displacement of each joint.
- Next, the position calculation unit 20 calculates a temporary reference position for each joint using the joint positions and relative displacements estimated in step A2 (step A3). Specifically, in step A3, as shown in FIG. 2C, the position calculation unit 20 adds together, for each joint, the coordinates of the joint position and the relative displacement to calculate the temporary reference position.
- the posture estimating section 30 determines whether or not reference parts have been detected for at least one person from the image data in step A2 (step A4). Specifically, when at least one reference position is estimated in step A2, posture estimation section 30 determines that a reference part has been detected for at least one person.
- If a reference part has been detected for at least one person, the posture estimation unit 30 executes the processes of steps A5 to A7.
- Specifically, the posture estimation unit 30 obtains, for each detected joint, a distance matrix between the tentative reference position and the estimated reference positions, and calculates the distances from the distance matrix. If a plurality of reference positions have been estimated in step A2, the posture estimation unit 30 obtains the distance to each reference position for each joint.
- Next, the posture estimation unit 30 associates each joint with one of the reference positions such that the distance between the estimated reference position and the provisional reference position is minimized and less than a certain value, thereby determining the person to which each joint belongs.
- the posture estimation unit 30 also determines the person to which each joint belongs, on the condition that multiple joints of the same type (for example, right wrist and right wrist) do not belong to the same person.
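- This same-type condition could be enforced, for example, by a greedy pass over the joint-to-person distances that skips a candidate person when it already has a joint of that type; the greedy strategy below is an assumption, since the publication only states the condition itself.

```python
import numpy as np

def assign_with_type_constraint(joint_types, dist, max_dist=50.0):
    """joint_types: list of joint type strings, one per detected joint.
    dist: (J, P) distance matrix between temporary reference positions and
    detected reference positions. Greedily assigns the globally closest pairs
    first, never giving a person two joints of the same type."""
    order = np.dstack(np.unravel_index(np.argsort(dist, axis=None), dist.shape))[0]
    taken = set()                      # (person, joint_type) pairs already used
    assignment = [-1] * len(joint_types)
    for j, p in order:
        if assignment[j] != -1 or dist[j, p] >= max_dist:
            continue
        if (p, joint_types[j]) in taken:
            continue                   # this person already has a joint of this type
        assignment[j] = int(p)
        taken.add((p, joint_types[j]))
    return assignment

d = np.array([[5.0, 80.0], [8.0, 12.0]])   # two detected "right_knee" joints, two persons
print(assign_with_type_constraint(["right_knee", "right_knee"], d))  # e.g. [0, 1]
```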
- Next, the posture estimation unit 30 determines whether there is any joint that has not been associated with a reference position (step A7). If there is no such joint, the posture estimation unit 30 estimates the posture of each person based on the positions of the joints belonging to that person (step A8). The case where a joint is not associated with any reference position in step A7 is described later.
- On the other hand, if the result of the determination in step A4 is that no reference part has been detected for even one person from the image data, the posture estimation unit 30 executes the processes of steps A9 and A10.
- the posture estimating unit 30 develops the temporary reference positions for each joint in the feature space, and clusters the temporary reference positions developed in the feature space. Specifically, posture estimation section 30 performs clustering by the processes (a) to (e) described above.
- the posture estimation unit 30 determines the person to whom each joint belongs, assuming that the joints at the tentative reference positions included in the same cluster belong to the same person.
- After executing steps A9 and A10, the posture estimation unit 30 also executes step A8 to estimate the posture of each person based on the positions of the joints belonging to that person.
- If it is determined in step A7 that there is a joint that is not associated with any reference position, the posture estimation unit 30 executes steps A9 and A10 for that joint. As a result, the person is determined even for joints that were not associated with a reference position, and the pose estimation in step A8 is performed.
- When steps A1 to A10 are executed as described above, the poses of the persons in the image data are estimated. If the source of the image data is an imaging device such as a surveillance camera, steps A1 to A10 are executed, for example, each time image data is output or each time a set time elapses.
- The program in Embodiment 1 may be any program that causes a computer to execute steps A1 to A10 shown in FIG. 8. By installing this program in a computer and executing it, the posture estimation apparatus 10 and the posture estimation method in Embodiment 1 can be realized.
- In this case, the processor of the computer functions as the position calculation unit 20 and the posture estimation unit 30 and performs the processing. Examples of the computer include general-purpose PCs, smartphones, and tablet-type terminal devices.
- The program in Embodiment 1 may also be executed by a computer system constructed from a plurality of computers.
- In this case, each computer may function as either the position calculation unit 20 or the posture estimation unit 30.
- As described above, according to Embodiment 1, even if a part of the person whose posture is to be estimated is hidden in the image data, the person to whom each detected joint belongs can be accurately determined, and the accuracy of posture estimation can be improved.
- (Embodiment 2) Next, a posture estimation device, a posture estimation method, and a program according to Embodiment 2 will be described with reference to FIGS. 9 to 11.
- The posture estimation device according to Embodiment 2 is configured in the same manner as the posture estimation device according to Embodiment 1 shown in FIG. 1. However, in Embodiment 2, unlike Embodiment 1, the positions of the detected joints and the relative displacements of the joints are expressed in three-dimensional coordinates. The following description focuses on the differences from Embodiment 1.
- the posture estimation device differs from the posture estimation device according to Embodiment 1 in terms of the function of position calculation section 20 .
- In Embodiment 2, the position calculation unit 20 uses the depth of each detected joint and the parameters of the camera that captured the image data to estimate, for each joint, three-dimensional coordinates indicating the joint position and three-dimensional coordinates indicating the relative displacement.
- the position calculation unit 20 also calculates three-dimensional coordinates indicating a temporary reference position of the person based on the three-dimensional coordinates indicating the estimated joint positions and the three-dimensional coordinates indicating the relative displacement.
- FIG. 9 is a diagram showing a specific configuration and processing of a position calculation unit according to Embodiment 2.
- the position calculation unit 20 also includes a CNN 21 and a calculation processing unit 22 in the second embodiment.
- Also in Embodiment 2, when image data of a person is input, the CNN 21 outputs a joint position/reference position map 23 for each reference part and each joint of the person, and a relative displacement map 24.
- the relative displacement map 24 stores the magnitude and direction of the three-dimensional relative displacement to the reference position in the element corresponding to the joint position in the image on the map.
- In Embodiment 2, when the image data is input, the CNN 21 also outputs a depth map 25 for each reference part and each joint of the person.
- the depth map 25 stores the depth (distance) from the reference part or joint to the camera that captured the image data in the element corresponding to the joint position in the image on the map.
- the CNN 21 is constructed by performing deep learning using the image to be extracted, the depth to the extraction target, and the label indicating the extraction target as training data.
- The calculation processing unit 22 uses the camera parameters, the joint position/reference position map 23, and the depth map 25 to estimate the three-dimensional coordinates of each joint position and of the reference position.
- The calculation processing unit 22 also uses the camera parameters, the joint position/reference position map 23, the relative displacement map 24, and the depth map 25 to estimate the three-dimensional coordinates of the relative displacement of each joint.
- the camera parameters are input from the outside in the second embodiment.
- the camera parameters are composed of camera intrinsic parameters and extrinsic parameters.
- the internal parameters are parameters used for coordinate conversion between the three-dimensional coordinates of the camera and the two-dimensional coordinates of the image, with the position of the camera as the origin.
- the internal parameters include the focal length of the camera, the position of the center of the image, and the like.
- External parameters are parameters used for coordinate conversion between three-dimensional world coordinates, which are real world coordinates, and camera coordinates. External parameters include the height of the mounting position of the camera, the angle of depression of the camera, and the like.
- each joint and reference part are composed of a plurality of pixels.
- the calculation processing unit 22 calculates the three-dimensional coordinates (x, y, z) of each pixel that constitutes each of the joint position, the reference position, and the relative displacement.
- one pixel is represented by three rectangles, which correspond to the x-coordinate, y-coordinate, and z-coordinate, respectively.
- the calculation processing unit 22 calculates the three-dimensional coordinates of each pixel of each temporary reference position using the three-dimensional coordinates of each pixel of the joint position and relative displacement.
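- As an illustration of how a pixel position and a depth value can be turned into camera-coordinate and then world-coordinate 3D points using the intrinsic and extrinsic parameters, consider the following sketch; the parameter values are invented and the simple pinhole model is an assumption, not taken from the publication.

```python
import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project an image pixel (u, v) with depth (along the camera's optical
    axis) into 3-D camera coordinates using the intrinsic parameters."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def camera_to_world(p_cam, R, t):
    """Transform camera coordinates into world coordinates with the extrinsic
    parameters (rotation R and translation t of the camera in the world)."""
    return R @ p_cam + t

# Illustrative parameter values only.
fx = fy = 600.0; cx, cy = 320.0, 240.0                 # intrinsic parameters
R = np.eye(3); t = np.array([0.0, 0.0, 2.5])           # extrinsic parameters (camera pose)
joint_cam = pixel_to_camera(u=350.0, v=260.0, depth=4.0, fx=fx, fy=fy, cx=cx, cy=cy)
print(camera_to_world(joint_cam, R, t))
```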
- Also in Embodiment 2, similarly to Embodiment 1, the posture estimation unit 30 determines the person to whom each joint belongs based on the temporary reference position calculated for each detected joint.
- However, in Embodiment 2, the reference positions and the temporary reference positions are obtained as three-dimensional coordinates. Therefore, when the reference parts of the persons are detected from the image data, the posture estimation unit 30 obtains a three-dimensional distance matrix to determine the person to which each joint belongs. When the reference parts are not detected from the image data, the posture estimation unit 30 develops the temporary reference positions in a three-dimensional feature space, performs clustering, and determines the person to which each joint belongs based on the clustering result.
- Also in Embodiment 2, the posture estimation apparatus executes steps A1 to A7 shown in FIG. 8.
- However, unlike Embodiment 1, the position calculation unit 20 estimates, for each joint, three-dimensional coordinates indicating the joint position and three-dimensional coordinates indicating the relative displacement in step A2, and calculates three-dimensional coordinates indicating the temporary reference position of the person in step A3. Also in Embodiment 2, the posture estimation method is implemented by operating the posture estimation apparatus.
- The program in Embodiment 2 may also be a program that causes a computer to execute steps A1 to A7 shown in FIG. 8. By installing this program in a computer and executing it, the posture estimation device and posture estimation method according to Embodiment 2 can be realized.
- FIG. 10 is a diagram illustrating a specific configuration and processing of a position calculation unit in Modification 1 of Embodiment 2.
- the position calculation unit 20 also includes a CNN 21 and a calculation processing unit 22 in the first modification.
- In Modification 1, however, the CNN 21 also outputs, in addition to the joint position/reference position map 23, the relative displacement map 24, and the depth map 25, the camera parameters 26 of the camera that captured the image data.
- the CNN 21 is constructed by performing deep learning using images to be extracted, depths to extraction targets, labels indicating extraction targets, and camera parameters as training data.
- In Modification 1, the calculation processing unit 22 uses the camera parameters output by the CNN 21 to estimate, for each joint, the three-dimensional coordinates indicating the joint position and the three-dimensional coordinates indicating the relative displacement, and further uses these to calculate the three-dimensional coordinates indicating the temporary reference position of the person. According to Modification 1, it is possible to estimate and calculate the three-dimensional coordinates without inputting the camera parameters from the outside.
- FIG. 11 is a diagram illustrating a specific configuration and processing of a position calculation unit in Modification 2 of Embodiment 2.
- As shown in FIG. 11, also in Modification 2, the position calculation unit 20 includes a CNN 21 and a calculation processing unit 22.
- In Modification 2, the CNN 21 outputs only two maps, the joint position/reference position map 23 and the relative displacement map 24, as in the example shown in Embodiment 1.
- depth information and camera parameters are input to the position calculation unit 20 .
- the depth information is information specifying the depth of the object measured by the distance measuring device 40 .
- the depth information specifies the depth of the subject of the image data input to the posture estimation device.
- Examples of the distance measurement device 40 include devices capable of acquiring depth information, such as stereo cameras, TOF (Time Of Flight) cameras, and LiDAR (Laser Imaging Detection and Ranging) sensors.
- In Modification 2, the calculation processing unit 22 uses the camera parameters, the joint position/reference position map 23, and the depth information to estimate the three-dimensional coordinates of each joint position and of the reference position.
- The calculation processing unit 22 also uses the camera parameters, the relative displacement map 24, and the depth information to estimate the three-dimensional coordinates of the relative displacement of each joint. According to Modification 2, it is possible to estimate and calculate the three-dimensional coordinates without having the CNN 21 output the depth of the object.
- As described above, according to Embodiment 2, the temporary reference positions are calculated as three-dimensional coordinates. Therefore, even when a part of the person whose posture is to be estimated is hidden, the person to which each joint belongs can be determined more accurately, and the posture estimation accuracy can be further improved.
- FIG. 12 is a block diagram of an example of a computer that implements the posture estimation apparatus according to the first and second embodiments;
- The computer 110 includes a CPU (Central Processing Unit) 111, a main memory 112, a storage device 113, an input interface 114, a display controller 115, a data reader/writer 116, and a communication interface 117. These units are connected to each other via a bus 121 so as to be able to communicate with each other.
- the computer 110 may include a GPU (Graphics Processing Unit) or an FPGA (Field-Programmable Gate Array) in addition to the CPU 111 or instead of the CPU 111 .
- a GPU or FPGA can execute the programs in the embodiments.
- the CPU 111 expands the program in the embodiment, which is composed of a code group stored in the storage device 113, into the main memory 112 and executes various operations by executing each code in a predetermined order.
- the main memory 112 is typically a volatile storage device such as DRAM (Dynamic Random Access Memory).
- the programs in Embodiments 1 and 2 are provided in a state stored in a computer-readable recording medium 120.
- the programs in Embodiments 1 and 2 may be distributed over the Internet connected via communication interface 117 .
- the storage device 113 includes hard disk drives and semiconductor storage devices such as flash memory.
- Input interface 114 mediates data transmission between CPU 111 and input devices 118 such as a keyboard and mouse.
- the display controller 115 is connected to the display device 119 and controls display on the display device 119 .
- the data reader/writer 116 mediates data transmission between the CPU 111 and the recording medium 120, reads programs from the recording medium 120, and writes processing results in the computer 110 to the recording medium 120.
- Communication interface 117 mediates data transmission between CPU 111 and other computers.
- Specific examples of the recording medium 120 include general-purpose semiconductor storage devices such as CF (CompactFlash (registered trademark)) and SD (Secure Digital) cards, magnetic recording media such as flexible disks, and optical recording media such as CD-ROMs (Compact Disk Read Only Memory).
- the posture estimation apparatus in Embodiments 1 and 2 can also be realized by using hardware corresponding to each part, such as an electronic circuit, instead of a computer in which a program is installed. Furthermore, the posture estimation device may be partly implemented by a program and the rest by hardware.
- (Appendix 1) A posture estimation device comprising:
- a position calculation unit that calculates, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and
- a posture estimation unit that determines, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position.
- (Appendix 2) The posture estimation device according to Appendix 1, wherein the position calculation unit estimates, for each joint of the person, the position of the joint and the displacement of the joint, and calculates the temporary reference position of the person based on the estimated position of the joint and the estimated displacement of the joint.
- (Appendix 3) The posture estimation device according to Appendix 1 or 2, wherein the posture estimation unit estimates, for each person, the posture of the person based on the positions of the joints belonging to the person.
- (Appendix 4) The posture estimation device according to any one of Appendices 1 to 3, wherein, when a reference part of the person is detected from the image data, the posture estimation unit obtains, for each of the detected joints, a distance matrix between the temporary reference position and the position of the detected reference part, and uses the obtained distance matrix to determine the person to whom the joint belongs.
- (Appendix 5) The posture estimation device according to Appendix 4, wherein, when a person whose reference part has not been detected exists in the image data, the posture estimation unit performs clustering on the temporary reference positions of the detected joints and, based on the clustering result, determines, for each detected joint, the person to whom the joint belongs.
- (Appendix 6) The posture estimation device according to any one of Appendices 1 to 3, wherein the posture estimation unit performs clustering on the temporary reference positions of the detected joints and, based on the clustering result, determines, for each detected joint, the person to whom the joint belongs.
- (Appendix 7) The posture estimation device according to any one of Appendices 1 to 6, wherein the positions of the joints and the displacements of the joints are expressed in three-dimensional coordinates.
- (Appendix 8) The posture estimation device according to Appendix 2, wherein the position calculation unit uses the depth of each of the detected joints and the parameters of the camera that captured the image data to estimate, for each of the joints, three-dimensional coordinates indicating the position of the joint and three-dimensional coordinates indicating the displacement of the joint, and calculates three-dimensional coordinates indicating the temporary reference position of the person based on the estimated three-dimensional coordinates indicating the position of the joint and the three-dimensional coordinates indicating the displacement of the joint.
- (Appendix 9) A posture estimation method comprising: a position calculation step of calculating, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and a posture estimation step of determining, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position.
- (Appendix 17) A computer-readable recording medium recording a program including instructions for causing a computer to execute: a position calculation step of calculating, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and a posture estimation step of determining, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position.
- (Appendix 18) The computer-readable recording medium according to Appendix 17, wherein, in the position calculation step, the position of the joint and the displacement of the joint are estimated for each joint of the person, and the temporary reference position of the person is calculated based on the estimated position of the joint and the estimated displacement of the joint.
- (Appendix 19) The computer-readable recording medium according to Appendix 17 or 18, wherein, in the posture estimation step, the posture of each person is estimated based on the positions of the joints belonging to the person.
- (Appendix 20) The computer-readable recording medium according to any one of Appendices 17 to 19, wherein, when a reference part of the person is detected from the image data, in the posture estimation step, a distance matrix between the temporary reference position and the position of the detected reference part is obtained for each of the detected joints, and the obtained distance matrix is used to determine the person to whom the joint belongs.
- (Appendix 21) The computer-readable recording medium according to Appendix 20, wherein, when a person whose reference part has not been detected exists in the image data, in the posture estimation step, clustering is performed on the temporary reference positions of the detected joints and, based on the clustering result, the person to whom each detected joint belongs is determined.
- (Appendix 22) The computer-readable recording medium according to any one of Appendices 17 to 19, wherein, in the posture estimation step, clustering is performed on the temporary reference positions of the detected joints and, based on the clustering result, the person to whom each detected joint belongs is determined.
- (Appendix 23) The computer-readable recording medium according to any one of Appendices 17 to 22, wherein the positions of the joints and the displacements of the joints are expressed in three-dimensional coordinates.
- (Appendix 24) The computer-readable recording medium according to Appendix 18, wherein, in the position calculation step, the depth of each of the detected joints and the parameters of the camera that captured the image data are used to estimate, for each of the joints, three-dimensional coordinates indicating the position of the joint and three-dimensional coordinates indicating the displacement of the joint, and three-dimensional coordinates indicating the temporary reference position of the person are calculated based on the estimated three-dimensional coordinates indicating the position of the joint and the three-dimensional coordinates indicating the displacement of the joint.
- According to the present invention, it is possible to improve the accuracy of posture estimation when a part of the person to be estimated is hidden.
- INDUSTRIAL APPLICABILITY The present invention is useful for systems that require estimation of a person's pose on image data, such as surveillance systems.
- 10 Posture estimation device, 20 Position calculation unit, 21 CNN, 22 Calculation processing unit, 23 Joint position/reference position map, 24 Relative displacement map, 25 Depth map, 26 Camera parameters, 30 Posture estimation unit, 40 Distance measuring device, 110 Computer, 111 CPU, 112 Main memory, 113 Storage device, 114 Input interface, 115 Display controller, 116 Data reader/writer, 117 Communication interface, 118 Input device, 119 Display device, 120 Recording medium, 121 Bus
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Heart & Thoracic Surgery (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Surgery (AREA)
- Animal Behavior & Ethology (AREA)
- Pathology (AREA)
- Biophysics (AREA)
- Veterinary Medicine (AREA)
- Multimedia (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Dentistry (AREA)
- Physiology (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Radiology & Medical Imaging (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Image Analysis (AREA)
Abstract
Description
a position calculation unit that calculates, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and
a posture estimation unit that determines, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position;
characterized by comprising the above.
a position calculation step of calculating, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and
a posture estimation step of determining, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position;
characterized by comprising the above.
a position calculation step of calculating, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and the displacement from the joint to a reference part of the person; and
a posture estimation step of determining, for each of the detected joints, the person to whom the joint belongs based on the calculated temporary reference position;
characterized by recording a program that includes instructions for executing the above steps.
Hereinafter, a posture estimation device, a posture estimation method, and a program according to Embodiment 1 will be described with reference to FIGS. 1 to 8.
First, a schematic configuration of the posture estimation device according to Embodiment 1 will be described with reference to FIG. 1. FIG. 1 is a configuration diagram showing the schematic configuration of the posture estimation device according to Embodiment 1.
(b) Clustering by k-means is performed using the determined cluster centers.
(c) Whether the samples in each obtained cluster follow a Gaussian distribution is verified based on a statistical hypothesis test. This verification is performed under the hypothesis that the samples in the cluster follow a Gaussian distribution.
(d) If the hypothesis is rejected by the verification in (c), the corresponding cluster is split into two. If the hypothesis is not rejected by the verification in (c), the corresponding cluster is finalized.
(e) Steps (b) to (d) are repeated until no more clusters are split in (d).
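The splitting procedure in steps (b) to (e) resembles an x-means style refinement of k-means. A minimal sketch, assuming two-dimensional provisional reference positions and using a standard normality test along each cluster's principal axis as the Gaussianity check, could look as follows; the specific test, the significance level, and the way new centers are seeded are assumptions, not taken from this publication.

```python
import numpy as np
from scipy.stats import normaltest
from sklearn.cluster import KMeans

def split_clusters(points, alpha=0.05, max_iter=20):
    """x-means-like refinement sketch for 2-D provisional reference positions.
    Repeatedly run k-means with the current centers (b), test each cluster for
    Gaussianity along its principal axis (c), split a cluster in two if the
    hypothesis is rejected (d), and stop when nothing splits any more (e)."""
    centers = points.mean(axis=0, keepdims=True)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iter):
        km = KMeans(n_clusters=len(centers), init=centers, n_init=1).fit(points)   # (b)
        labels, centers = km.labels_, km.cluster_centers_
        new_centers, split = [], False
        for c in range(len(centers)):
            members = points[labels == c]
            axis = np.linalg.svd(members - members.mean(0))[2][0]                  # principal axis
            if len(members) >= 8 and normaltest(members @ axis).pvalue < alpha:    # (c)
                offset = (members.max(0) - members.min(0)) / 3.0                   # (d) split in two
                new_centers += [centers[c] - offset, centers[c] + offset]
                split = True
            else:
                new_centers.append(centers[c])
        centers = np.array(new_centers)
        if not split:                                                               # (e)
            return labels
    return labels

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal([0, 0], 3, (40, 2)), rng.normal([60, 5], 3, (40, 2))])
print(np.bincount(split_clusters(pts)))   # ideally two clusters of roughly 40 points each
```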
Next, the operation of the posture estimation device 10 according to Embodiment 1 will be described with reference to FIG. 8. FIG. 8 is a flow chart showing the operation of the posture estimation device according to Embodiment 1. In the following description, FIGS. 1 to 7 are referred to as appropriate. In Embodiment 1, the posture estimation method is implemented by operating the posture estimation device 10; therefore, the following description of the operation of the posture estimation device 10 also serves as the description of the posture estimation method according to Embodiment 1.
The program according to Embodiment 1 may be any program that causes a computer to execute steps A1 to A10 shown in FIG. 8. By installing this program in a computer and executing it, the posture estimation device 10 and the posture estimation method according to Embodiment 1 can be realized. In this case, the processor of the computer functions as the position calculation unit 20 and the posture estimation unit 30 and performs the processing. Examples of the computer include a general-purpose PC, a smartphone, and a tablet-type terminal device.
As described above, according to Embodiment 1, even when a part of the person whose posture is to be estimated is hidden on the image data, the person to whom each detected joint belongs can be accurately determined, and the accuracy of posture estimation is improved.
Next, a posture estimation device, a posture estimation method, and a program according to Embodiment 2 will be described with reference to FIGS. 9 to 11.
Here, Modification 1 of Embodiment 2 will be described with reference to FIG. 10. FIG. 10 is a diagram showing a specific configuration and processing of the position calculation unit in Modification 1 of Embodiment 2. As shown in FIG. 10, also in Modification 1, the position calculation unit 20 includes the CNN 21 and the calculation processing unit 22.
Next, Modification 2 of Embodiment 2 will be described with reference to FIG. 11. FIG. 11 is a diagram showing a specific configuration and processing of the position calculation unit in Modification 2 of Embodiment 2. As shown in FIG. 11, also in Modification 2, the position calculation unit 20 includes the CNN 21 and the calculation processing unit 22.
As described above, according to Embodiment 2, the temporary reference positions are calculated as three-dimensional coordinates; therefore, even when a part of the person whose posture is to be estimated is hidden, the person to which each joint belongs can be determined more accurately, and the posture estimation accuracy is further improved.
A computer that realizes the posture estimation device by executing the programs according to Embodiments 1 and 2 will now be described with reference to FIG. 12. FIG. 12 is a block diagram showing an example of a computer that realizes the posture estimation devices according to Embodiments 1 and 2.
画像データから検出されている人物の関節それぞれについて、当該関節の位置、及び当該関節から前記人物の基準となる部位までの変位に基づいて、前記人物の仮の基準位置を算出する、位置算出部と、
検出されている前記関節毎に、算出された前記仮の基準位置に基づいて、当該関節が属する人物を決定する、姿勢推定部と、
を備えていることを特徴とする姿勢推定装置。
付記1に記載の姿勢推定装置であって、
前記位置算出部は、前記人物の関節点それぞれについて、当該関節の位置、及び当該関節についての前記変位を推定し、推定した当該関節の位置及び当該関節についての前記変位に基づいて、前記人物の仮の基準位置を算出する、
ことを特徴とする姿勢推定装置。
付記1または2に記載の姿勢推定装置であって、
前記姿勢推定部が、前記人物それぞれ毎に、当該人物に属する前記関節の位置に基づいて、当該人物の姿勢を推定する、
ことを特徴とする姿勢推定装置。
付記1~3のいずれかに記載の姿勢推定装置であって、
前記画像データから、前記人物の基準となる部位が検出されている場合に、
前記姿勢推定部が、検出されている前記関節毎に、前記仮の基準位置と検出された前記基準となる部位の位置との間の距離行列を求め、求めた前記距離行列を用いて、当該関節が属する人物を決定する、
ことを特徴とする姿勢推定装置。
付記4に記載の姿勢推定装置であって、
前記画像データ中に、前記基準となる部位が検出されていない人物が存在する場合に、
前記姿勢推定部が、検出されている前記関節それぞれの前記仮の基準位置に対してクラスタリングを実行し、クラスタリングの結果に基づいて、検出された前記関節毎に、当該関節が属する人物を決定する、
ことを特徴とする姿勢推定装置。
付記1~3のいずれかに記載の姿勢推定装置であって、
前記姿勢推定部が、検出されている前記関節それぞれの前記仮の基準位置に対してクラスタリングを実行し、クラスタリングの結果に基づいて、検出された前記関節毎に、当該関節が属する人物を決定する、
ことを特徴とする姿勢推定装置。
付記1~6のいずれかに記載の姿勢推定装置であって、
前記関節の位置、及び前記関節についての変位は、3次元座標上で表現される、
ことを特徴とする姿勢推定装置。
付記2に記載の姿勢推定装置であって、
前記位置算出部が、検出されている前記関節それぞれの深度、及び前記画像データを撮影したカメラのパラメータを用いて、前記関節それぞれについて、当該関節の位置を示す3次元座標、及び当該関節についての前記変位を示す3次元座標、を推定し、推定した当該関節の位置を示す3次元座標、及び当該関節についての前記変位を示す3次元座標に基づいて、前記人物の仮の基準位置を示す3次元座標を算出する、
ことを特徴とする姿勢推定装置。
画像データから検出されている人物の関節それぞれについて、当該関節の位置、及び当該関節から前記人物の基準となる部位までの変位に基づいて、前記人物の仮の基準位置を算出する、位置算出ステップと、
検出されている前記関節毎に、算出された前記仮の基準位置に基づいて、当該関節が属する人物を決定する、姿勢推定ステップと、
を備えていることを特徴とする姿勢推定方法。
付記9に記載の姿勢推定方法であって、
前記位置算出ステップにおいて、前記人物の関節点それぞれについて、当該関節の位置、及び当該関節についての前記変位を推定し、推定した当該関節の位置及び当該関節についての前記変位に基づいて、前記人物の仮の基準位置を算出する、
ことを特徴とする姿勢推定方法。
付記9または10に記載の姿勢推定方法であって、
前記姿勢推定ステップにおいて、前記人物それぞれ毎に、当該人物に属する前記関節の位置に基づいて、当該人物の姿勢を推定する、
ことを特徴とする姿勢推定方法。
付記9~11のいずれかに記載の姿勢推定方法であって、
前記画像データから、前記人物の基準となる部位が検出されている場合に、
前記姿勢推定ステップにおいて、検出されている前記関節毎に、前記仮の基準位置と検出された前記基準となる部位の位置との間の距離行列を求め、求めた前記距離行列を用いて、当該関節が属する人物を決定する、
ことを特徴とする姿勢推定方法。
付記12に記載の姿勢推定方法であって、
前記画像データ中に、前記基準となる部位が検出されていない人物が存在する場合に、
前記姿勢推定ステップにおいて、検出されている前記関節それぞれの前記仮の基準位置に対してクラスタリングを実行し、クラスタリングの結果に基づいて、検出された前記関節毎に、当該関節が属する人物を決定する、
ことを特徴とする姿勢推定方法。
付記9~11のいずれかに記載の姿勢推定方法であって、
前記姿勢推定ステップにおいて、検出されている前記関節それぞれの前記仮の基準位置に対してクラスタリングを実行し、クラスタリングの結果に基づいて、検出された前記関節毎に、当該関節が属する人物を決定する、
ことを特徴とする姿勢推定方法。
付記9~14のいずれかに記載の姿勢推定方法であって、
前記関節の位置、及び前記関節についての変位は、3次元座標上で表現される、
ことを特徴とする姿勢推定方法。
付記10に記載の姿勢推定方法であって、
前記位置算出ステップにおいて、検出されている前記関節それぞれの深度、及び前記画像データを撮影したカメラのパラメータを用いて、前記関節それぞれについて、当該関節の位置を示す3次元座標、及び当該関節についての前記変位を示す3次元座標、を推定し、推定した当該関節の位置を示す3次元座標、及び当該関節についての前記変位を示す3次元座標に基づいて、前記人物の仮の基準位置を示す3次元座標を算出する、
ことを特徴とする姿勢推定方法。
コンピュータに、
画像データから検出されている人物の関節それぞれについて、当該関節の位置、及び当該関節から前記人物の基準となる部位までの変位に基づいて、前記人物の仮の基準位置を算出する、位置算出ステップと、
検出されている前記関節毎に、算出された前記仮の基準位置に基づいて、当該関節が属する人物を決定する、姿勢推定ステップと、
を実行させる命令を含む、プログラムを記録しているコンピュータ読み取り可能な記録媒体。
付記17に記載のコンピュータ読み取り可能な記録媒体であって、
前記位置算出ステップにおいて、前記人物の関節点それぞれについて、当該関節の位置、及び当該関節についての前記変位を推定し、推定した当該関節の位置及び当該関節についての前記変位に基づいて、前記人物の仮の基準位置を算出する、
ことを特徴とするコンピュータ読み取り可能な記録媒体。
付記17または18に記載のコンピュータ読み取り可能な記録媒体であって、
前記姿勢推定ステップにおいて、前記人物それぞれ毎に、当該人物に属する前記関節の位置に基づいて、当該人物の姿勢を推定する、
ことを特徴とするコンピュータ読み取り可能な記録媒体。
付記17~19のいずれかに記載のコンピュータ読み取り可能な記録媒体であって、
前記画像データから、前記人物の基準となる部位が検出されている場合に、
前記姿勢推定ステップにおいて、検出されている前記関節毎に、前記仮の基準位置と検出された前記基準となる部位の位置との間の距離行列を求め、求めた前記距離行列を用いて、当該関節が属する人物を決定する、
ことを特徴とするコンピュータ読み取り可能な記録媒体。
付記20に記載のコンピュータ読み取り可能な記録媒体であって、
前記画像データ中に、前記基準となる部位が検出されていない人物が存在する場合に、
前記姿勢推定ステップにおいて、検出されている前記関節それぞれの前記仮の基準位置に対してクラスタリングを実行し、クラスタリングの結果に基づいて、検出された前記関節毎に、当該関節が属する人物を決定する、
ことを特徴とするコンピュータ読み取り可能な記録媒体。
付記17~19のいずれかに記載のコンピュータ読み取り可能な記録媒体であって、
前記姿勢推定ステップにおいて、検出されている前記関節それぞれの前記仮の基準位置に対してクラスタリングを実行し、クラスタリングの結果に基づいて、検出された前記関節毎に、当該関節が属する人物を決定する、
ことを特徴とするコンピュータ読み取り可能な記録媒体。
付記17~22のいずれかに記載のコンピュータ読み取り可能な記録媒体であって、
前記関節の位置、及び前記関節についての変位は、3次元座標上で表現される、
ことを特徴とするコンピュータ読み取り可能な記録媒体。
付記18に記載のコンピュータ読み取り可能な記録媒体であって、
前記位置算出ステップにおいて、検出されている前記関節それぞれの深度、及び前記画像データを撮影したカメラのパラメータを用いて、前記関節それぞれについて、当該関節の位置を示す3次元座標、及び当該関節についての前記変位を示す3次元座標、を推定し、推定した当該関節の位置を示す3次元座標、及び当該関節についての前記変位を示す3次元座標に基づいて、前記人物の仮の基準位置を示す3次元座標を算出する、
ことを特徴とするコンピュータ読み取り可能な記録媒体。
20 Position calculation unit
21 CNN
22 Calculation processing unit
23 Joint position/reference position map
24 Relative displacement map
25 Depth map
26 Camera parameters
30 Posture estimation unit
40 Distance measuring device
110 Computer
111 CPU
112 Main memory
113 Storage device
114 Input interface
115 Display controller
116 Data reader/writer
117 Communication interface
118 Input device
119 Display device
120 Recording medium
121 Bus
Claims (24)
- 1. A posture estimation device comprising: position calculation means for calculating, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and a displacement from the joint to a reference part of the person; and posture estimation means for determining, for each of the detected joints, a person to whom the joint belongs based on the calculated temporary reference position.
- 2. The posture estimation device according to claim 1, wherein the position calculation means estimates, for each joint of the person, the position of the joint and the displacement of the joint, and calculates the temporary reference position of the person based on the estimated position of the joint and the estimated displacement of the joint.
- 3. The posture estimation device according to claim 1 or 2, wherein the posture estimation means estimates, for each person, the posture of the person based on the positions of the joints belonging to the person.
- 4. The posture estimation device according to any one of claims 1 to 3, wherein, when a reference part of the person is detected from the image data, the posture estimation means obtains, for each of the detected joints, a distance matrix between the temporary reference position and the position of the detected reference part, and uses the obtained distance matrix to determine the person to whom the joint belongs.
- 5. The posture estimation device according to claim 4, wherein, when a person whose reference part has not been detected exists in the image data, the posture estimation means performs clustering on the temporary reference positions of the detected joints and, based on the clustering result, determines, for each detected joint, the person to whom the joint belongs.
- 6. The posture estimation device according to any one of claims 1 to 3, wherein the posture estimation means performs clustering on the temporary reference positions of the detected joints and, based on the clustering result, determines, for each detected joint, the person to whom the joint belongs.
- 7. The posture estimation device according to any one of claims 1 to 6, wherein the positions of the joints and the displacements of the joints are expressed in three-dimensional coordinates.
- 8. The posture estimation device according to claim 2, wherein the position calculation means uses the depth of each of the detected joints and parameters of a camera that captured the image data to estimate, for each of the joints, three-dimensional coordinates indicating the position of the joint and three-dimensional coordinates indicating the displacement of the joint, and calculates three-dimensional coordinates indicating the temporary reference position of the person based on the estimated three-dimensional coordinates indicating the position of the joint and the three-dimensional coordinates indicating the displacement of the joint.
- 9. A posture estimation method comprising: calculating, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and a displacement from the joint to a reference part of the person; and determining, for each of the detected joints, a person to whom the joint belongs based on the calculated temporary reference position.
- 10. The posture estimation method according to claim 9, wherein, in the calculation of the temporary reference position, the position of the joint and the displacement of the joint are estimated for each joint of the person, and the temporary reference position of the person is calculated based on the estimated position of the joint and the estimated displacement of the joint.
- 11. The posture estimation method according to claim 9 or 10, wherein, in the determination of the person, the posture of each person is estimated based on the positions of the joints belonging to the person.
- 12. The posture estimation method according to any one of claims 9 to 11, wherein, when a reference part of the person is detected from the image data, in the determination of the person, a distance matrix between the temporary reference position and the position of the detected reference part is obtained for each of the detected joints, and the obtained distance matrix is used to determine the person to whom the joint belongs.
- 13. The posture estimation method according to claim 12, wherein, when a person whose reference part has not been detected exists in the image data, in the determination of the person, clustering is performed on the temporary reference positions of the detected joints and, based on the clustering result, the person to whom each detected joint belongs is determined.
- 14. The posture estimation method according to any one of claims 9 to 11, wherein, in the determination of the person, clustering is performed on the temporary reference positions of the detected joints and, based on the clustering result, the person to whom each detected joint belongs is determined.
- 15. The posture estimation method according to any one of claims 9 to 14, wherein the positions of the joints and the displacements of the joints are expressed in three-dimensional coordinates.
- 16. The posture estimation method according to claim 10, wherein, in the calculation of the temporary reference position, the depth of each of the detected joints and parameters of a camera that captured the image data are used to estimate, for each of the joints, three-dimensional coordinates indicating the position of the joint and three-dimensional coordinates indicating the displacement of the joint, and three-dimensional coordinates indicating the temporary reference position of the person are calculated based on the estimated three-dimensional coordinates indicating the position of the joint and the three-dimensional coordinates indicating the displacement of the joint.
- 17. A computer-readable recording medium recording a program including instructions for causing a computer to: calculate, for each joint of a person detected from image data, a temporary reference position of the person based on the position of the joint and a displacement from the joint to a reference part of the person; and determine, for each of the detected joints, a person to whom the joint belongs based on the calculated temporary reference position.
- 18. The computer-readable recording medium according to claim 17, wherein, in the calculation of the temporary reference position, the position of the joint and the displacement of the joint are estimated for each joint of the person, and the temporary reference position of the person is calculated based on the estimated position of the joint and the estimated displacement of the joint.
- 19. The computer-readable recording medium according to claim 17 or 18, wherein, in the determination of the person, the posture of each person is estimated based on the positions of the joints belonging to the person.
- 20. The computer-readable recording medium according to any one of claims 17 to 19, wherein, when a reference part of the person is detected from the image data, in the determination of the person, a distance matrix between the temporary reference position and the position of the detected reference part is obtained for each of the detected joints, and the obtained distance matrix is used to determine the person to whom the joint belongs.
- 21. The computer-readable recording medium according to claim 20, wherein, when a person whose reference part has not been detected exists in the image data, in the determination of the person, clustering is performed on the temporary reference positions of the detected joints and, based on the clustering result, the person to whom each detected joint belongs is determined.
- 22. The computer-readable recording medium according to any one of claims 17 to 19, wherein, in the determination of the person, clustering is performed on the temporary reference positions of the detected joints and, based on the clustering result, the person to whom each detected joint belongs is determined.
- 23. The computer-readable recording medium according to any one of claims 17 to 22, wherein the positions of the joints and the displacements of the joints are expressed in three-dimensional coordinates.
- 24. The computer-readable recording medium according to claim 18, wherein, in the calculation of the temporary reference position, the depth of each of the detected joints and parameters of a camera that captured the image data are used to estimate, for each of the joints, three-dimensional coordinates indicating the position of the joint and three-dimensional coordinates indicating the displacement of the joint, and three-dimensional coordinates indicating the temporary reference position of the person are calculated based on the estimated three-dimensional coordinates indicating the position of the joint and the three-dimensional coordinates indicating the displacement of the joint.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
JP2023525270A JPWO2022254644A5 (ja) | 2021-06-03 | | Posture estimation apparatus, posture estimation method, and program
US18/273,431 US20240119620A1 (en) | 2021-06-03 | 2021-06-03 | Posture estimation apparatus, posture estimation method, and computer-readable recording medium
PCT/JP2021/021140 WO2022254644A1 (ja) | 2021-06-03 | 2021-06-03 | Posture estimation apparatus, posture estimation method, and computer-readable recording medium
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
PCT/JP2021/021140 WO2022254644A1 (ja) | 2021-06-03 | 2021-06-03 | Posture estimation apparatus, posture estimation method, and computer-readable recording medium
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022254644A1 true WO2022254644A1 (ja) | 2022-12-08 |
Family
ID=84322910
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/021140 WO2022254644A1 (ja) | | 2021-06-03 | 2021-06-03
Country Status (2)
Country | Link |
---|---|
US (1) | US20240119620A1 (ja) |
WO (1) | WO2022254644A1 (ja) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
JP2009506441A (ja) * | 2005-08-26 | 2009-02-12 | Sony Corporation | Labeling used for motion capture
JP2020052476A (ja) * | 2018-09-23 | 2020-04-02 | Acculus Inc. | Object detection device and object detection program
Also Published As
Publication number | Publication date |
---|---|
JPWO2022254644A1 (ja) | 2022-12-08 |
US20240119620A1 (en) | 2024-04-11 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21944146; Country of ref document: EP; Kind code of ref document: A1
| ENP | Entry into the national phase | Ref document number: 2023525270; Country of ref document: JP; Kind code of ref document: A
| WWE | Wipo information: entry into national phase | Ref document number: 18273431; Country of ref document: US
| NENP | Non-entry into the national phase | Ref country code: DE
| 122 | Ep: pct application non-entry in european phase | Ref document number: 21944146; Country of ref document: EP; Kind code of ref document: A1