WO2021235440A1 - Method and device for acquiring movement feature amount using skin information - Google Patents

Method and device for acquiring movement feature amount using skin information Download PDF

Info

Publication number
WO2021235440A1
WO2021235440A1 (PCT/JP2021/018809)
Authority
WO
WIPO (PCT)
Prior art keywords
shape
representative
vertices
skin
posture
Prior art date
Application number
PCT/JP2021/018809
Other languages
French (fr)
Japanese (ja)
Inventor
仁彦 中村
洋介 池上
添威 張
稔尚 赤瀬
Original Assignee
国立大学法人東京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The University of Tokyo (国立大学法人東京大学)
Publication of WO2021235440A1 publication Critical patent/WO2021235440A1/en

Links

Images

Classifications

    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/103Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
    • A61B5/11Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/117Identification of persons
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion

Definitions

  • the present invention relates to a device and a method for acquiring a motion feature amount using skin information.
  • the motion feature amount acquired by the present invention can be used to quantify individuality in motion and as a personal authentication technique.
  • the posture of the target can be represented by a skeletal model, and the posture of the target is determined from each joint angle and position. By using motion capture, it is possible to acquire the motion data of the target from the time series data of the posture of the target.
  • the shape of the object can be represented by a polygon model or a polygon mesh.
  • the body surface of the target is composed of a set of a large number of polygons (typically triangles), and the shape of the target is determined from the coordinates of the vertices of all the polygons.
  • the polygon model can be obtained by acquiring the three-dimensional coordinates of all the vertices of the polygons on the target's body surface using, for example, a 3D body scanner.
  • the shape of the target (coordinates of the vertices of the polygon) changes depending on the posture of the target.
  • skinning is known as the task of associating a 3DCG model with a skeleton. Skinning determines how each vertex of the model's polygons follows the skeleton, and weight adjustments are made to adjust the effect of the skeleton on each vertex.
  • video motion capture, in which two-dimensional joint positions are estimated from camera images by deep learning and integrated to reconstruct the motion in three dimensions, has been realized (Patent Document 4, Non-Patent Document 2).
  • with video motion capture, it is possible to acquire three-dimensional information on the target's skeleton from camera images without interfering with the target person.
  • for example, OpenPose (Non-Patent Document 3) can be used to estimate two-dimensional joint positions from camera images by deep learning.
  • the video motion capture system estimates the 3D information of the skeleton based on the joint positions, but in addition to the skeleton, techniques for estimating 3D body surface information (shape information) from RGB cameras alone have also been developed.
  • as techniques for estimating shape information from RGB cameras alone, a technique that reconstructs the detailed shape of clothing (Non-Patent Document 4) and techniques that construct a skin model of the unclothed body (Non-Patent Documents 5 and 6) are known.
  • biometric authentication is attracting attention as an authentication method that overcomes the drawbacks of conventional password- and PIN-based authentication.
  • Many biometrics such as face recognition, fingerprint recognition, vein recognition, and iris recognition require the acquisition of biometric information at a short distance after obtaining consent.
  • gait authentication which identifies an individual by walking, is attracting attention as a method for acquiring biometric information from a long distance without the cooperation of the subject.
  • many gait authentication methods use walking silhouette images and joint angles as feature quantities; the amount of information used is still limited, gait is presupposed as a specific motion, and the accuracy is not yet satisfactory.
  • in contrast, by using information obtained from skin polygons having posture information, personal authentication that is not necessarily limited to walking becomes possible.
  • in addition, research on shape retrieval of 3D human models is also being conducted, and Non-Patent Document 7 can be referred to. Further, HKS (Heat Kernel Signature) and WKS (Wave Kernel Signature) have been proposed as shape descriptors; HKS is described in Non-Patent Document 8, and WKS in Patent Document 5 and Non-Patent Document 9. These shape retrieval methods and shape descriptors do not focus on changes in shape during motion.
  • SMPL: A skinned multi-person linear model.
  • SHREC'14 track: Shape retrieval of non-rigid 3D human models. Eurographics Association, 2014.
  • Jian Sun, Maks Ovsjanikov, and Leonidas Guibas. A concise and provably informative multi-scale signature based on heat diffusion. Computer Graphics Forum, Vol. 28, No. 5, pp. 1383-1392, 2009.
  • M. Aubry, U. Schlickewei, and D. Cremers. The wave kernel signature: A quantum mechanical approach to shape analysis. In 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).
  • the present invention aims to extract motion feature amounts from the dynamics of skin polygons that deform during motion.
  • the technical means adopted by the present invention is a method of acquiring a motion feature amount in which: the shape of the target is specified by skin polygons, and each vertex of the skin polygons has coordinates depending on the posture of the target;
  • the shape of the target is represented by one or more representative regions selected from the skin polygons, and each representative region is a vertex group consisting of a plurality of vertices;
  • time-series data of the skin polygons during the motion of the target are prepared, and in a plurality of frames, a shape representative value representing the shape of the target in each frame is calculated using the one or more representative regions;
  • using the time-series data of the shape representative values, a value representing the temporal change of the one or more representative regions accompanying the motion of the target is acquired as the motion feature amount.
  • the shape of the object is represented by a plurality of representative regions.
  • the temporal change of the plurality of representative regions includes a temporal change of the spatial relationship between the representative regions with the movement of the subject.
  • the spatial relationship between the representative regions is defined by a function (shape representative value or shape descriptor) of the vertex coordinates of any two representative regions.
  • the spatial relationship between the representative regions is defined by the distance between vertices (shape representative value or shape descriptor) between any two representative regions.
  • the representative region is an annular vertex group (ring) in which the vertices are arranged in a loop.
  • the annular vertex group (ring) is not limited to a circular arrangement and may be, for example, a group of vertices arranged in a roughly rectangular shape.
  • the annular vertex group is a group of vertices arranged along the circumference of a part of the human body. In one embodiment, the annular vertex group is placed on the surface of a part of the human body.
  • the annular vertex group is obtained by acquiring the HKS (Heat Kernel Signature) value for all vertices, dividing all vertices into two groups according to a threshold, and taking the set of vertices arranged in a ring at the boundary between the two groups; by changing the threshold, a plurality of annular vertex groups are obtained.
  • the representative region may also be a planar vertex group consisting of a plurality of vertices arranged so as to form a surface (the annular vertex group is, so to speak, a linear vertex group).
  • the representative region may represent the shape of the target portion.
  • the entire skin polygon may be selected and used as a representative area.
  • the posture of the target is specified by a skeletal model, and a function that associates each vertex of the skin polygon model with the skeleton has been obtained.
  • the coordinates (initial coordinates) of each vertex of the target's skin polygon model are obtained for a specific posture (initial posture).
  • the coordinates of each vertex in an arbitrary posture can then be obtained from the initial coordinates, the initial posture, and the arbitrary posture by using the function, as illustrated by the sketch below.
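The patent does not spell out this skinning function (it cites Patent Documents 1 to 3 and Non-Patent Document 1); the following is a minimal linear blend skinning sketch, assuming per-vertex bone weights and per-bone rest-to-target transforms are available. All names and shapes are illustrative, not the patent's exact formulation.

```python
import numpy as np

def linear_blend_skinning(rest_vertices, weights, bone_transforms):
    """Minimal linear blend skinning sketch (illustrative only).

    rest_vertices:   (V, 3) vertex coordinates in the initial (rest) posture
    weights:         (V, B) per-vertex bone weights, each row summing to 1
    bone_transforms: (B, 4, 4) homogeneous transforms mapping each bone from
                     the rest posture to the target posture
    returns:         (V, 3) vertex coordinates in the target posture
    """
    V = rest_vertices.shape[0]
    homo = np.hstack([rest_vertices, np.ones((V, 1))])            # (V, 4)
    # Position of every vertex under every bone transform: (B, V, 4)
    per_bone = np.einsum('bij,vj->bvi', bone_transforms, homo)
    # Blend the per-bone positions by the skinning weights: (V, 4)
    blended = np.einsum('vb,bvi->vi', weights, per_bone)
    return blended[:, :3]
```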
  • the posture of the subject is acquired by markerless motion capture using one or more images.
  • in another technical means adopted by the present invention, a method of acquiring shape representative information, the shape of the target is specified by skin polygons, and each vertex of the skin polygon model has coordinates depending on the posture of the target; the HKS (Heat Kernel Signature) value is obtained for all vertices, all vertices are divided into two groups according to a threshold, and an annular vertex group (ring) consisting of the set of vertices arranged in a ring at the boundary between the two groups is acquired as a shape representative region. In one embodiment, the shape of the target is represented by a plurality of annular vertex groups, which are determined by changing the threshold value.
  • in one embodiment, a plurality of annular vertex groups representing the shape of the target are represented by a function (shape representative value or shape descriptor) of the vertex coordinates of two annular vertex groups.
  • in one embodiment, a plurality of annular vertex groups representing the shape of the target are represented by the inter-vertex distances (shape representative value or shape descriptor) between two annular vertex groups.
  • in one embodiment, a plurality of annular vertex groups representing the shape of the target are represented by the area and/or the perimeter of the region enclosed by each annular vertex group.
  • the method for acquiring the motion feature amount and the method for acquiring the shape representative information are executed by a computer, and the present invention is also provided as a computer program for causing the computer to execute these methods.
  • the storage unit stores time-series data of skin polygons that specify the shape of the target during exercise, and each vertex of the skin polygon has a vertex ID and coordinates depending on the posture of the target.
  • the shape of the object is represented by one or more representative regions selected from the skin polygons, and each representative region is a group of vertices specified by the vertex IDs and coordinates of the plurality of vertices.
  • the shape representative value calculation unit calculates a shape representative value that represents the shape of the target depending on the posture by using the one or a plurality of representative regions.
  • the motion feature amount calculation unit uses the time-series data of the shape representative values acquired over a plurality of frames to calculate, as the motion feature amount, a value representing the temporal change of the one or more representative regions accompanying the motion of the target; this constitutes a device for acquiring motion feature amounts.
  • the shape of the object is represented by a plurality of representative regions.
  • the shape representative value calculation unit calculates a shape representative value that represents the shape of the target depending on the posture from the spatial relationship between the representative regions.
  • the motion feature amount calculation unit uses the time-series data of the shape representative values acquired over a plurality of frames to calculate, as the motion feature amount, a value representing the change in the spatial relationship between the representative regions accompanying the motion of the target.
  • the shape representative value is defined by a function of vertex coordinates of any two representative regions.
  • the shape representative value is defined by the distance between vertices between any two representative regions.
  • figure captions: the coordinate set forming a ring extracted from the vertex set S3 by clustering and the plane obtained by principal component analysis; the sets of vertices located at the boundaries detected for S3, S8, S38, S70, and S80, in order from the left; skin models of different body shapes and postures, with all 17 boundaries of the sets S3, S8, S38, S70, and S80 displayed together in one skin model; the 10 postures of the SHREC model (Non-Patent Document 7); the set of boundary vertex sets consisting of S5, S15, S30, and S80; and the joint positions acquired by video motion capture applied to the skin model acquired by HMR.
  • the skeletal model used in video motion capture is shown, and the numbers represent joints.
  • the skeletal model used in video motion capture is shown, and the numbers represent bones.
  • a figure showing the skin model acquired by the method shown in FIG.; the walking motion of the target on the treadmill is shown every 20 frames (0.3 seconds).
  • the feature amount acquisition system using shape information is composed of one or more video cameras that acquire a moving image of the target, and one or more computers that receive the image information, execute predetermined calculations, and calculate and output the motion feature amount of the target. More specifically, the feature amount acquisition system includes a posture information acquisition unit that acquires the posture information of the target from the images constituting the moving image,
  • a shape information acquisition unit that acquires the shape information of the target (skin polygons) using the images, or the images and the posture information,
  • a shape representative value calculation unit that calculates a shape representative value using the shape information,
  • and a feature amount calculation unit that calculates a feature amount using the shape representative value.
  • the posture information acquisition unit, shape information acquisition unit, shape representative value calculation unit, and feature amount calculation unit are realized by a computer processor, and the posture information, shape information (skin polygons), shape representative values, and feature amounts are stored in the computer memory (storage unit).
  • a computer storage unit stores a skeleton model, a skin polygon model, a function (skinning function, etc.) that associates polygon vertex coordinates with the skeleton model.
  • [A-2] Posture acquisition: the posture of the target is specified by the joint positions of the target's skeletal model.
  • posture information is acquired by adopting video motion capture technology.
  • Video motion capture is a method of synchronously photographing the motion of a target human using a plurality of cameras and performing three-dimensional reconstruction of the motion (Patent Document 4, Non-Patent Document 2).
  • the video motion capture system estimates joint positions by processing the images from multiple RGB cameras arranged so as to surround the target with OpenPose (Non-Patent Document 3), and the estimated 3D joint positions are fitted to a skeletal model.
  • the accuracy of 3D joint estimation is improved by performing inverse dynamics calculations using the skeletal model.
  • Patent Document 4 and Non-Patent Document 2 can be referred to.
  • each joint is labeled (Table 1).
  • the line connecting adjacent joints is treated as a bone (see FIG. 21).
  • the number of bones is 17, and each bone is labeled (Table 1).
  • the skeleton model, the target skeleton information (bone length), and the coordinates of each joint position for each frame acquired by motion capture are stored in the storage unit.
  • although OpenPose (Non-Patent Document 3) is used in the video motion capture according to the present embodiment, another method may be used for estimating two-dimensional joint positions from camera images by deep learning. Further, various methods for acquiring the posture of the target are known to those skilled in the art, and the method is not limited to video motion capture (Patent Document 4, Non-Patent Document 2); for example, a method of analyzing an RGB image from a single viewpoint to acquire motion data may be adopted, or markerless motion capture using a camera and a depth sensor may be adopted.
  • the shape of the target is defined by a polygon model or polygon mesh.
  • a polygon model is composed of vertices, edges, and faces. For example, in a triangular mesh model, each face has three vertices, the three sides of the triangular face are edges, and the start point and end point of each edge are two of the three vertices.
  • each polygon can be represented by the 3D coordinate values of all the vertices that make it up, and the simplest data structure for a polygon model is the IDs and 3D coordinate values of all the vertices of the polygon model.
  • various data structures for polygon models are known; it suffices that the shape of the target is quantified, and the specific data structure is not limited.
  • SMPL (Skinned Multi-Person Linear Model; Patent Documents 1 and 2, Non-Patent Document 1) is a parametric model of the unclothed human body that has 72 pose parameters and 10 shape parameters and returns a triangular mesh.
  • three-dimensional shape information is acquired by optimizing the parameters of the model so that they fit the human silhouette in the image.
  • among such methods, there are those that construct a skin model of the body in an unclothed state (Non-Patent Documents 5 and 6).
  • a target mesh model was obtained using the method according to Non-Patent Document 5.
  • by using video motion capture (Patent Document 4, Non-Patent Document 2) together with feature amounts acquired from skin polygons having posture information, it is possible to reconstruct the motion of the skin in three dimensions with only one or more cameras and a computer, and it is also possible to reconstruct the approximate outer shape of the skin in three dimensions from the images of one or more cameras even when clothes are worn (Non-Patent Documents 5 and 6). When the motion of the target is three-dimensionally reconstructed, not only the motion of the skeleton but also the shape information represented by the skin can be three-dimensionally reconstructed to express the rich motion information of the target.
  • the motion information can be obtained from the temporal change (deformed skin polygon) of the shape of the individual.
  • the feature amount of the motion is extracted by quantifying the skin polygon deformed by the motion using the shape representative value or the shape descriptor.
  • a known shape descriptor can be used when obtaining the shape representative value or shape descriptor. Many candidates can be considered, but in this embodiment the Heat Kernel Signature (Non-Patent Document 8) is used. In this embodiment, a plurality of representative regions (closed-curve vertex groups, i.e., a ring set) on the body surface of the target are identified using HKS. The amount of calculation can be reduced by calculating the shape representative value or shape descriptor using only the information in the representative regions.
  • the coordinate values of ring 1 and ring 2 depend on the posture, and the spatial relationship between ring 1 and ring 2 also changes with the posture.
  • the distance from each vertex of one ring to all the vertices of the other ring is calculated, and a vector storing this value is obtained.
  • a vector using the distances between all the rings constituting the ring set representing the target shape or the distances between specific rings is used as the shape representative value or the shape descriptor.
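As a concrete illustration of the inter-vertex distances described above, the following is a minimal sketch, assuming each ring is given as an array of 3D coordinates whose rows correspond to fixed vertex IDs (so the vector has the same length in every frame); the function names are illustrative.

```python
import numpy as np
from itertools import combinations

def ring_pair_distances(ring_a, ring_b):
    """All pairwise vertex distances between two rings, flattened to a vector.

    ring_a: (Na, 3) coordinates of the vertices of one ring
    ring_b: (Nb, 3) coordinates of the vertices of the other ring
    """
    diff = ring_a[:, None, :] - ring_b[None, :, :]      # (Na, Nb, 3)
    return np.linalg.norm(diff, axis=-1).ravel()         # (Na * Nb,)

def shape_representative_value(rings):
    """Concatenate the distance vectors of every ring pair in the ring set."""
    return np.concatenate([ring_pair_distances(rings[i], rings[j])
                           for i, j in combinations(range(len(rings)), 2)])
```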
  • the shape representative value or the shape descriptor is obtained, and the time series data of the shape representative value or the shape descriptor is acquired.
  • the array of time-series data may be used as a motion feature quantity, or the array of time-series data may be reduced in dimension to form a motion feature quantity.
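A sketch of turning the per-frame shape representative values into a motion feature amount. PCA is used here as one possible dimensionality reduction; the text does not prescribe a particular method, so this is an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA

def motion_feature_from_timeseries(shape_values, n_components=10):
    """shape_values: (T, D) shape representative value for each of T frames.

    Returns both the raw stacked time series and a low-dimensional version;
    PCA is only one possible choice of reduction.
    """
    X = np.asarray(shape_values, dtype=float)
    raw_feature = X.ravel()                       # time-series array as-is
    n = min(n_components, X.shape[0], X.shape[1])
    reduced = PCA(n_components=n).fit_transform(X)   # (T, n)
    return raw_feature, reduced
```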
  • this motion feature amount includes information such as how a plurality of representative parts on the target's skin move apart from or approach each other with the movement of the target (posture 1 → posture 2 → posture 3). It can be said that the motion feature amount obtained from skin polygons having posture information reflects both the motion information and the skeletal information consisting of the time-series data of the target's posture.
  • the motion feature amount acquired from skin polygons having posture information reflects individual differences and individuality in motion, and by using this motion feature amount, the quantification of individuality in motion and a personal authentication technique can be established.
  • this motion feature amount can also be used as an index of change in motion during exercise training and rehabilitation.
  • the degree of similarity between one's own movement and the movement of another person can be obtained, and the distance between the movement of a certain individual and the target movement can be expressed numerically.
  • the motion feature amount acquired from skin polygons having posture information serves as a feature amount of the individual's walking motion, and this feature amount can be used as important information for identifying the individual.
  • the feature quantities of movement can be acquired from the time-series data of posture.
  • the relative position of each joint with NOSE as the origin is calculated, and the RWRIST, LWRIST, RANKLE, and LANKLE elements are extracted from the array containing the relative joint positions.
  • the array of relative joint positions collected over a certain number of frames, with the variance computed for each joint, can be defined as the feature quantity obtained from the motion information. It will be understood by those skilled in the art that there are various methods for acquiring low-dimensional motion features.
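A sketch of the posture-based feature just described; the joint indices are hypothetical placeholders for the labels of Table 1.

```python
import numpy as np

# Hypothetical joint indices; the actual labels come from Table 1 of the text.
NOSE, RWRIST, LWRIST, RANKLE, LANKLE = 0, 4, 7, 10, 13

def posture_motion_feature(joint_positions):
    """joint_positions: (T, J, 3) joint coordinates over T frames.

    Joint positions are taken relative to NOSE, the wrist and ankle joints
    are kept, and the per-joint variance over the frames is the feature.
    """
    rel = joint_positions - joint_positions[:, NOSE:NOSE + 1, :]   # NOSE as origin
    selected = rel[:, [RWRIST, LWRIST, RANKLE, LANKLE], :]          # (T, 4, 3)
    return selected.var(axis=0).ravel()                             # (12,)
```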
  • the length of the labeled bone can be used as a feature. From this three-dimensional skeletal information, skeletal features and motion features are defined. As for the skeletal features, for example, an array containing the bone length values can be created and defined as the skeletal features.
  • the i-th element of the array contains the length of the bone with label i, and the length of the array is 17, equal to the number of bones.
  • the skeletal features may be reduced in dimension.
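A short sketch of the skeletal feature array, assuming the 17 bones are defined by joint index pairs following Table 1 (the pairs themselves are not reproduced here).

```python
import numpy as np

def skeletal_feature(joint_positions, bone_pairs):
    """joint_positions: (J, 3) joint coordinates in one frame.
    bone_pairs: list of 17 (joint_a, joint_b) index pairs, one per labeled bone.

    Returns an array of 17 bone lengths; element i is the length of bone i.
    """
    return np.array([np.linalg.norm(joint_positions[a] - joint_positions[b])
                     for a, b in bone_pairs])
```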
  • the shape feature amount that does not depend on the posture may be acquired.
  • the thickness or volume of a certain part of the human body, or the ratio or volume ratio of the thickness between a plurality of parts may be used as the shape feature amount.
  • the mass distribution may be estimated from the skin polygons of the human body according to the shape, and the estimated value of the force of the joints and muscles generated by the movement may be used for extracting the movement feature amount.
  • the volume of each part of the human body can be calculated from the shape information of the human body.
  • the specific gravity of the body or body part is known, and the volume and specific gravity can be used to roughly calculate the mass distribution.
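A rough sketch of the mass-distribution estimate from part volumes and specific gravities; the specific-gravity values are not given in the text and would have to come from published body-segment data.

```python
WATER_DENSITY = 1000.0  # kg / m^3

def mass_distribution(part_volumes_m3, specific_gravities):
    """Rough per-part mass: mass = volume x specific gravity x density of water.

    part_volumes_m3 / specific_gravities: dicts keyed by body-part name.
    The specific-gravity values are assumed inputs, not derived here.
    """
    return {part: vol * specific_gravities[part] * WATER_DENSITY
            for part, vol in part_volumes_m3.items()}
```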
  • the skin polygon model according to the present embodiment will be described with reference to FIG.
  • the posture of the target is represented by a skeleton model, and time-series data of the posture (posture 1 to 5) is acquired from the moving image data (time-series data of the image).
  • postures 1 to 5 do not necessarily have to be consecutive frames; they are, for example, characteristic time-series frames in a predetermined motion, or time-series frames extracted from consecutive frames at intervals of a predetermined number of frames.
  • the skin polygon model provides a skin polygon having posture information corresponding to each posture.
  • the skin polygons 1 to 5 corresponding to the postures 1 to 5 have vertex coordinates corresponding to the postures 1 to 5, respectively.
  • a shape representative value representing the shape of the target defined by the skin polygon is obtained.
  • time-series data of the shape representative values can be obtained.
  • the time-series data of the shape representative value reflects the motion data of the target, and the feature amount of the motion of the target is acquired using the time-series data of the shape representative value.
  • the target posture information is acquired by motion capture. Based on the image of a certain posture of the target, the initial posture is obtained, and at the same time, the skin polygon corresponding to the initial posture is obtained.
  • Known means can be used to obtain skin polygons from an image. By matching the coordinate systems of the skeleton model and the skin polygon model, the posture (joint position) and the coordinates of the polygon apex are made to correspond.
  • a function (for example, a skinning function) that associates the coordinates of the vertices of the skin polygon with the skeletal model (posture) has been obtained, and the coordinates of each vertex in an arbitrary posture of the target can be obtained from the initial coordinates, the initial posture, and that arbitrary posture using the function. That is, the skin polygons 1 to 6 can be obtained from the postures 1 to 6, respectively.
  • the skin polygon 1 has vertex coordinates corresponding to posture 1; a shape descriptor is calculated using all the vertex coordinates, a plurality of representative regions are extracted using the calculation results, and the shape representative value is calculated using the set consisting of the plurality of representative regions.
  • the representative region is a set of vertices, which is specified by the vertex ID and coordinates.
  • the shape representative value is, for example, a function of the coordinates of all vertices of any two representative regions, and is, for example, the distance between the vertices of any two representative regions.
  • a plurality of representative regions are extracted using the threshold value and the HKS value, and the ring set is used as the shape representative region. More specifically, as shown in FIG. 6, all the vertices are divided into two groups using the HKS value and the threshold value, and the set of vertices located at the boundary between the two groups is detected as a ring. The ring is identified by the vertex ID and coordinates.
  • a threshold value a plurality of rings can be detected, and a ring set composed of a plurality of rings is acquired.
  • Ring sets are extracted by applying HKS to all vertices of skin polygon 1.
  • Each ring in the ring set is identified by a vertex ID.
  • the skin polygons 2 to 5 are each composed of vertices having IDs and coordinates; the rings in the skin polygons 2 to 5 are specified from the IDs of the vertices constituting each ring, so the ring sets specified by vertex IDs and coordinates can be obtained for them as well.
  • the shape representative value is calculated using the ring set.
  • the calculation of the feature amount using the time-series data of the shape representative value will be described.
  • the shape representative value 1 is calculated using the ring set corresponding to posture 1; the ring set corresponding to posture 2 is then acquired, and the shape representative value 2 is calculated in the same manner.
  • the feature amount calculated using the shape representative value 1 and the shape representative value 2 is information on how one ring approaches or separates from another ring when the target posture is displaced from the posture 1 to the posture 2.
  • the shape representative value 1 to the shape representative value 6 are considered to represent the movement of the target's skin during the motion, and the feature amount acquired from the shape representative value 1 to the shape representative value 6 is a motion feature amount.
  • the skin polygon changes depending on the postures 1 to 5. That is, the movement of the skin polygon (skin polygon 1 to skin polygon 5) corresponds to the walking motion, and the time-series data of the shape representative value representing the shape of the skin polygon represents the walking motion.
  • the motion feature amounts obtained from the time-series data of the shape representative values can be important information for identifying an individual.
  • the time-series data of the shape representative value representing the shape of the skin polygon represents the swing motion.
  • the swing motion is acquired from the time-series data of the shape representative values, the motion feature amount is acquired from the swing motion, and differences in performance can be obtained.
  • in the above, the annular vertex group or ring serving as the shape representative region is acquired using HKS, but the setting of the shape representative region is not limited to one using HKS; other existing shape descriptors may be used.
  • alternatively, a predetermined position may be selected to set a shape representative region, and the shape representative region may be determined by the vertex IDs of the vertex set forming it. In the embodiment shown in FIG., two shape representative regions are provided at positions corresponding to the scapulae on the back of the human body, and the movement of the scapulae may be analyzed using the information obtained from the temporal change of the shape representative regions.
  • one shape representative region may be provided over a wide area of the back of the human body, and the movement of the human body may be analyzed using the information obtained from the temporal change of the shape representative region.
  • the shape representative region is a ring-shaped vertex group, but the shape representative region may be a planar vertex group composed of a plurality of vertices.
  • HKS is applied to each of the skin polygons 1 to 5 to acquire the shape representative region 1 (ring set 1) to the shape representative region 5 (ring set 5), respectively.
  • the shape representative value 1 to the shape representative value 5 are calculated from the shape representative region 1 (ring set 1) to the shape representative region 5 (ring set 5).
  • the shape representative value is the perimeter and/or the cross-sectional area of each ring.
  • the feature quantity is an array consisting of the shape representative value 1 to the shape representative value 5 acquired in a plurality of frames.
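A minimal sketch of computing the perimeter and cross-sectional area of one ring, assuming the ring vertices are ordered around the loop; the cross-section is taken in the best-fit (principal component) plane, consistent with the principal component analysis mentioned in the figure descriptions.

```python
import numpy as np

def ring_perimeter_and_area(ring):
    """ring: (N, 3) ring vertices, assumed ordered around the loop.

    Perimeter = sum of consecutive edge lengths (loop closed).
    Area = shoelace formula applied to the projection of the ring onto its
    best-fit plane (the two leading principal directions).
    """
    closed = np.vstack([ring, ring[:1]])
    perimeter = np.linalg.norm(np.diff(closed, axis=0), axis=1).sum()

    centered = ring - ring.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)  # principal directions
    xy = centered @ vt[:2].T                                  # (N, 2) in-plane coords
    x, y = xy[:, 0], xy[:, 1]
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    return perimeter, area
```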
  • HKS (Heat Kernel Signature)
  • HKS is a function defined on the vertex set M of each piece of shape information and depends on the 3D coordinate x and the time t.
  • the time t here is not the time of the model motion, but the elapsed time in HKS.
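The text does not reproduce the defining formula; for reference, the standard definition from Non-Patent Document 8 (Sun et al.) is shown below in LaTeX, where the eigenvalues and eigenfunctions are those of the Laplace-Beltrami operator on the mesh M.

```latex
% Standard Heat Kernel Signature (Sun et al., Non-Patent Document 8):
% \lambda_i, \phi_i are the eigenvalues and eigenfunctions of the
% Laplace--Beltrami operator on the mesh M.
\mathrm{HKS}(x, t) \;=\; k_t(x, x) \;=\; \sum_{i \ge 0} e^{-\lambda_i t}\, \phi_i(x)^2
```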
  • FIG. 12 shows the vertices of the model actually colored according to the HKS value; in reality, FIG. 12 is a color image.
  • the maximum and minimum values of HKS were colored according to the maximum and minimum values of the color map, respectively.
  • the HKS value is the smallest near the torso, and the HKS value increases toward the end of the body (hands and feet).
  • the set M is divided into two sets A and B, above and below a predetermined threshold (Hth), according to the HKS value of each vertex. That is, A = {x ∈ M | HKS(x) ≥ Hth} and B = {x ∈ M | HKS(x) < Hth}.
  • the vertices constituting any triangular polygon whose three vertices simultaneously include a point belonging to group A and a point belonging to group B are taken as the set S (see FIG. 13).
  • the set of vertices that make up the set S can be varied by changing the value of the threshold Hth.
  • FIG. 15 shows the vertex set S when the value of the threshold value H th is changed.
  • the boundary region consisting of the set of vertices S is a closed curved region or ring.
  • FIG. 15 shows the boundary regions corresponding to S3, S8, S38, S70, and S80 in order from the left.
  • FIG. 15 shows a case where the value of H th is increased in order from the left, and it can be seen that when the value of H th is increased, the position of the boundary region moves to the terminal part (hands and feet) of the body.
  • the number of boundary regions may vary depending on the value of the threshold H th. In the two figures on the left side of FIG. 15, the number of boundary areas is 2, in the central view, the number of boundary areas is 5, and in the two figures on the right side, the number of boundary areas is 4.
  • FIG. 17 shows, as an example, skin models having different body shapes and postures, in which all 17 boundaries of the sets S3, S8, S38, S70, and S80 are displayed together in one skin model. From this figure, it can be seen that the positions of the boundary regions are substantially the same even if the body shapes and postures differ.
  • the set of vertices that make up the boundary region may be further classified.
  • a representative region (ring) may be extracted by performing k-means clustering on the vertex set included in each boundary region using the coordinates of the vertices.
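A minimal sketch of the boundary-ring extraction described above, assuming the per-vertex HKS values and the triangle connectivity are already available; the number of rings per threshold must be supplied (it varies with Hth, as noted above), and the function names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def boundary_vertex_set(hks_values, triangles, h_th):
    """Vertices of triangles whose three corners straddle the HKS threshold.

    hks_values: (V,) HKS value per vertex
    triangles:  (F, 3) vertex indices of each triangular polygon
    h_th:       threshold dividing the vertex set M into groups A and B
    returns:    array of vertex indices forming the boundary set S
    """
    in_a = hks_values[triangles] >= h_th                  # (F, 3) group membership
    mixed = np.any(in_a, axis=1) & ~np.all(in_a, axis=1)  # triangles with A and B corners
    return np.unique(triangles[mixed])

def split_into_rings(vertex_ids, coords, n_rings):
    """Separate one boundary set into individual rings by k-means clustering
    on the vertex coordinates (the number of rings depends on the threshold)."""
    labels = KMeans(n_clusters=n_rings, n_init=10).fit_predict(coords[vertex_ids])
    return [vertex_ids[labels == k] for k in range(n_rings)]
```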
  • the shape representative value is calculated using a ring set consisting of all 17 boundary regions (rings) of the set S.
  • shape representative value: cross-sectional area or perimeter of a ring.
  • with HKS, it is possible to match the same body part between people even if their shapes differ. Therefore, the same part (ring) is found using HKS, and the parts are compared to identify individuals by shape.
  • Table 3 shows the five combinations with the highest correct-answer rates among the approximately 3000 combinations, together with their rates. The maximum correct-answer rate was 96.93%, obtained for the combination of S5, S15, S30, and S80.
  • S5 is included in all of the top five combinations, and it can be seen that the area near the torso is an important feature. Except for S35, the top combinations include a boundary below S15, that is, around the torso, and a boundary above S65, that is, in the part from the elbow to the wrist. From this, it is inferred that when the volume is normalized, the area around the torso and the area from the elbow to the wrist contribute most significantly.
  • FIG. 18 shows a ring set consisting of S5, S15, S30, and S80.
  • the speed of the treadmill was set to 4.0 km / h, which is the average walking speed of a person.
  • Four cameras were installed so as to surround the treadmill, and walking was measured at 60 fps for about one minute for three men in their twenties and one in their thirties, for a total of four people.
  • the method of Non-Patent Document 5 reconstructs a skin model from a single image.
  • the skin model may be reconstructed based on a plurality of images.
  • the joint positions acquired by video motion capture (Patent Document 4 and Non-Patent Document 2) were embedded into a skin model acquired in advance in a predetermined posture (for example, an upright posture or a T-pose) to deform it into walking poses,
  • and the deformed model was treated as the shape information acquired from the cameras.
  • FIG. 22 shows the result of deforming the posture by applying the joint positions acquired by video motion capture to the skin model acquired in advance; it shows the walking model every 20 frames (about 0.3 seconds), in order from the left.
  • the shape information may be three-dimensionally reconstructed from the image on the treadmill.
  • features were extracted using S8, S20, S30, and S80 as boundaries, and four people were identified.
  • the shape feature amounts were extracted from a part of the measured walking data, and the average value of the feature amounts for each person was taken and used as that person's base data.
  • each feature amount was then extracted from the walking data other than the frames used in creating the base data, and these were used as test data.
  • as a method of creating the base data, since walking is not stable at the start and end of the measurement, 600 consecutive frames were selected from the frames excluding those parts.
  • a skin model was generated for each of the 60 frames obtained by extracting one frame per 10 frames from the selected 600 frames; for each skin model, an array containing the cross-sectional areas of the representative regions S8, S20, S30, and S80 was created, and the element-wise average over the 60 frames was taken as the base data.
  • as test data, from the walking data excluding the part used for the base data, another 200 frames were selected and one frame was taken every 10 frames from them;
  • the skin model was deformed so that it fit the skeleton, and the total of 20 arrays containing the cross-sectional areas created from those skin models was used as the test data.
  • for each test datum, the L1 norms of the difference from the base data of the four people were compared, the base data with the smallest value was taken as the estimation result, and the ratio of cases in which the estimation result and the test datum belonged to the same person was 88.75%.
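A minimal sketch of the identification step just described, assuming the base data and test arrays have already been computed as above.

```python
import numpy as np

def identify(test_arrays, base_data):
    """test_arrays: list of (K,) feature arrays (one array per test sample).
    base_data:   dict person_id -> (K,) averaged base array.

    For each test array, pick the person whose base data has the smallest
    L1 norm of the difference.
    """
    ids = list(base_data)
    estimates = []
    for t in test_arrays:
        dists = [np.abs(t - base_data[p]).sum() for p in ids]   # L1 norms
        estimates.append(ids[int(np.argmin(dists))])
    return estimates

def accuracy(estimates, true_ids):
    """Fraction of test samples whose estimated person matches the true one."""
    return float(np.mean([e == t for e, t in zip(estimates, true_ids)]))
```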

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Veterinary Medicine (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Dentistry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Image Analysis (AREA)

Abstract

In the present invention, a movement feature amount is extracted from the dynamic state of skin polygons that change in shape during movement. In the present invention, the shape of an object is identified by skin polygons, the vertices of the skin polygons each have coordinates that are dependent on the pose of the object, the shape of the object is represented by a plurality of representative regions selected from the skin polygons, time series data of the skin polygons during the movement of the object is prepared, the plurality of representative regions are used in a plurality of frames to calculate shape representative values that represent the shape of the object in the respective frames, and the time series data of the shape representative values is used to acquire, as a movement feature amount, a value that represents a change in the spatial relation between the representative regions that occurs along with object movement.

Description

Method and device for acquiring motion feature amounts using skin information
The present invention relates to a device and a method for acquiring a motion feature amount using skin information. The motion feature amount acquired by the present invention can be used to quantify individuality in motion and as a personal authentication technique.
The posture (pose) of the target can be represented by a skeletal model, and the posture of the target is determined from the angle and position of each joint. By using motion capture, it is possible to acquire the motion data of the target from the time-series data of the target's posture.
The shape of the target can be represented by a polygon model or polygon mesh. In the polygon model, the body surface of the target is composed of a set of a large number of polygons (typically triangles), and the shape of the target is determined from the coordinates of the vertices of all the polygons. The polygon model can be obtained by acquiring the three-dimensional coordinates of all the vertices of the polygons on the target's body surface using, for example, a 3D body scanner.
The shape of the target (the coordinates of the vertices of the polygons) changes depending on the posture of the target. In the field of animation, skinning is known as the task of associating a 3DCG model with a skeleton. Skinning determines how each vertex of the model's polygons follows the skeleton, and weight adjustments are made to adjust the influence of the skeleton on each vertex. The same applies to a human body model, and the calculation of the three-dimensional coordinates of the surface of the human body (skin polygons) using the posture information of the human body is described in, for example, Patent Documents 1 to 3 and Non-Patent Document 1.
Remarkable progress has been made in motion capture technology that analyzes motion by three-dimensionally reconstructing it from one or more camera images. For example, video motion capture, in which two-dimensional joint positions are estimated from camera images by deep learning and integrated to reconstruct the motion in three dimensions, has been realized (Patent Document 4, Non-Patent Document 2). By using video motion capture, it is possible to acquire three-dimensional information on the target's skeleton from camera images without interfering with the target person. For example, OpenPose (Non-Patent Document 3) can be used to estimate two-dimensional joint positions from camera images by deep learning.
The video motion capture system estimates the 3D information of the skeleton based on the joint positions, but in addition to the skeleton, techniques for estimating 3D body surface information (shape information) from RGB cameras alone have also been developed. As techniques for estimating shape information from RGB cameras alone, a technique that reconstructs the detailed shape of clothing (Non-Patent Document 4) and techniques that construct a skin model of the unclothed body (Non-Patent Documents 5 and 6) are known.
As described above, technologies that make it possible to acquire posture information and shape information more easily from camera images have been developed in recent years. By three-dimensionally reconstructing not only the motion of the skeleton but also the shape information represented by the skin, the rich motion information of a person can be expressed. Based on this motion information, we consider quantifying individuality in general motion and establishing a personal authentication technique that uses it.
Regarding personal authentication technology, biometric authentication is attracting attention as an authentication method that overcomes the drawbacks of conventional password- and PIN-based authentication. Many biometric methods such as face recognition, fingerprint recognition, vein recognition, and iris recognition require the acquisition of biometric information at a short distance after obtaining consent. In contrast, gait authentication, which identifies an individual by the way they walk, is attracting attention as a method for acquiring biometric information from a long distance without the cooperation of the subject. Many gait authentication methods use walking silhouette images and joint angles as feature quantities; the amount of information used is still limited, gait is presupposed as a specific motion, and the accuracy is not yet satisfactory. In contrast, by using the information obtained from skin polygons having posture information, personal authentication that is not necessarily limited to walking becomes possible.
In addition, research on shape retrieval of 3D human models is also being conducted, and Non-Patent Document 7 can be referred to. Further, HKS (Heat Kernel Signature) and WKS (Wave Kernel Signature) have been proposed as shape descriptors. HKS is described in Non-Patent Document 8, and WKS in Patent Document 5 and Non-Patent Document 9. These shape retrieval methods and shape descriptors do not focus on changes in shape during motion.
WO2016/207311A1 (US10,395,511B2), US2020/0058137A1, WO2019/207176A1, JP-A-2020-042476, EP2530623A1
The present invention aims to extract motion feature amounts from the dynamics of skin polygons that deform during motion.
The technical means adopted by the present invention is a method of acquiring a motion feature amount, in which:
the shape of the target is specified by skin polygons,
each vertex of the skin polygons has coordinates depending on the posture of the target,
the shape of the target is represented by one or more representative regions selected from the skin polygons, each representative region being a vertex group consisting of a plurality of vertices,
time-series data of the skin polygons during the motion of the target are prepared,
in a plurality of frames, a shape representative value representing the shape of the target in each frame is calculated using the one or more representative regions, and
using the time-series data of the shape representative values, a value representing the temporal change of the one or more representative regions accompanying the motion of the target is acquired as the motion feature amount.
In one embodiment, the shape of the target is represented by a plurality of representative regions, and
the temporal change of the plurality of representative regions includes a temporal change of the spatial relationship between the representative regions accompanying the motion of the target.
In one embodiment, the spatial relationship between the representative regions is defined by a function (shape representative value or shape descriptor) of the vertex coordinates of any two representative regions.
In one embodiment, the spatial relationship between the representative regions is defined by the inter-vertex distances (shape representative value or shape descriptor) between any two representative regions.
In one embodiment, the representative region is an annular vertex group (ring) in which the vertices are arranged in a loop.
The annular vertex group (ring) is not limited to a circular arrangement and may be, for example, a group of vertices arranged in a roughly rectangular shape.
In one embodiment, the annular vertex group is a group of vertices arranged along the circumference of a part of the human body.
In one embodiment, the annular vertex group is placed on the surface of a part of the human body.
In one embodiment, the annular vertex group is obtained by:
acquiring the HKS (Heat Kernel Signature) value for all vertices,
dividing all vertices into two groups according to a threshold, and
obtaining the annular vertex group consisting of the set of vertices arranged in a ring at the boundary between the two groups;
by changing the threshold value, a plurality of annular vertex groups are acquired.
In one embodiment, the representative region is a planar vertex group consisting of a plurality of vertices arranged so as to form a surface (the annular vertex group is, so to speak, a linear vertex group).
The representative region may also represent the shape of a part of the target.
Alternatively, the entire skin polygon may be selected and used as the representative region.
The posture of the target is specified by a skeletal model,
a function that associates each vertex of the skin polygon model with the skeleton has been obtained,
the coordinates (initial coordinates) of each vertex of the target's skin polygon model are obtained for a specific posture (initial posture), and
the coordinates of each vertex in an arbitrary posture can be obtained from the initial coordinates, the initial posture, and the arbitrary posture by using the function.
In one embodiment, the posture of the target is acquired by markerless motion capture using one or more images.
Another technical means adopted by the present invention is a method of acquiring shape representative information, in which:
the shape of the target is specified by skin polygons,
each vertex of the skin polygon model has coordinates depending on the posture of the target,
the HKS (Heat Kernel Signature) value is obtained for all vertices,
all vertices are divided into two groups according to a threshold, and
an annular vertex group (ring) consisting of the set of vertices arranged in a ring at the boundary between the two groups is acquired as a shape representative region.
In one embodiment, the shape of the target is represented by a plurality of annular vertex groups, and
the plurality of annular vertex groups are determined by changing the threshold value.
In one embodiment, a plurality of annular vertex groups representing the shape of the target are represented by a function (shape representative value or shape descriptor) of the vertex coordinates of two annular vertex groups.
In one embodiment, a plurality of annular vertex groups representing the shape of the target are represented as the inter-vertex distances (shape representative value or shape descriptor) between two annular vertex groups.
In one embodiment, a plurality of annular vertex groups representing the shape of the target are represented by the area and/or the perimeter of the region enclosed by each annular vertex group.
The above method for acquiring the motion feature amount and the above method for acquiring the shape representative information are executed by a computer, and the present invention is also provided as a computer program for causing a computer to execute these methods.
Another technical means adopted by the present invention is a device for acquiring motion feature amounts, comprising a storage unit, a shape representative value calculation unit, and a motion feature amount calculation unit, in which:
the storage unit stores time-series data of skin polygons specifying the shape of the target during motion, and each vertex of the skin polygons has a vertex ID and coordinates depending on the posture of the target,
the shape of the target is represented by one or more representative regions selected from the skin polygons, each representative region being a vertex group specified by the vertex IDs and coordinates of a plurality of vertices,
the shape representative value calculation unit calculates a shape representative value representing the posture-dependent shape of the target using the one or more representative regions, and
the motion feature amount calculation unit uses the time-series data of the shape representative values acquired over a plurality of frames to calculate, as the motion feature amount, a value representing the temporal change of the one or more representative regions accompanying the motion of the target.
In one embodiment, the shape of the target is represented by a plurality of representative regions,
the shape representative value calculation unit calculates a shape representative value representing the posture-dependent shape of the target from the spatial relationship between the representative regions, and
the motion feature amount calculation unit calculates, as a motion feature amount, a value representing the change in the spatial relationship between the representative regions accompanying the motion of the target, using time-series data of the shape representative values acquired in a plurality of frames.
In one embodiment, the shape representative value is defined by a function of the vertex coordinates of any two representative regions.
In one embodiment, the shape representative value is defined by inter-vertex distances between any two representative regions.
According to the present invention, motion feature amounts can be extracted from the dynamics of skin polygons that deform during motion.
A schematic diagram of the feature amount acquisition device using shape information according to the present embodiment.
A diagram showing the flow of acquiring feature amounts of a target using shape information according to the present embodiment.
A diagram explaining the skin polygon model according to the present embodiment.
A diagram explaining the acquisition of skin polygons according to the present embodiment.
A diagram explaining the acquisition of shape representative values according to the present embodiment.
A flow chart showing the acquisition of shape representative values using HKS.
A flow chart showing the acquisition of feature amounts reflecting changes in posture.
A conceptual diagram showing the change in the spatial relationship between two rings as the posture of the target changes.
A conceptual diagram showing that the spatial relationship between two rings is defined by inter-vertex distances.
A diagram explaining the acquisition of shape representative values according to another embodiment.
A flow chart showing the acquisition of feature amounts that are only weakly dependent on posture.
A diagram showing all polygon vertices of the human body model color-coded according to their HKS values (the original is a color image).
A diagram showing the method of detecting the vertex set S located at the boundary between the two groups A and B; the set S consists of the vertices of the triangles whose three vertices simultaneously include vertices belonging to group A and vertices belonging to group B.
The coordinate set forming a ring extracted from the vertex set S3 by clustering, and the plane obtained by principal component analysis.
A diagram showing, from left to right, the vertex sets located at the boundaries detected for S3, S8, S38, S70, and S80.
A diagram in which, for skin models with different body shapes and postures, all 17 boundaries of the sets S3, S8, S38, S70, and S80 are displayed together on one skin model.
A diagram showing the 10 poses of the SHREC model (Non-Patent Document 7).
The set of boundary vertex sets consisting of S5, S15, S30, and S80.
A diagram of applying joint positions acquired by video motion capture to a skin model acquired by HMR.
The skeletal model used in video motion capture, with the numbers representing joints.
The skeletal model used in video motion capture, with the numbers representing bones.
A diagram showing the skin model acquired by the method shown in FIG. 17, with the walking motion of the target on a treadmill shown every 20 frames (0.3 seconds).
A conceptual diagram illustrating a plurality of shape representative regions provided on various parts of the human body.
A conceptual diagram illustrating a plurality of shape representative regions provided on the back of the human body.
A conceptual diagram illustrating one shape representative region provided on the back of the human body.
[A] System for acquiring feature amounts of a target using skin information
[A-1] Overview of the system
As shown in FIG. 1, the feature amount acquisition system using shape information according to the present embodiment comprises one or more video cameras that capture moving images of a target, and one or more computers that receive the image information, execute predetermined calculations, and calculate and output the motion feature amounts of the target. More specifically, the feature amount acquisition system includes a posture information acquisition unit that acquires the posture information of the target from the images constituting the moving images, a shape information acquisition unit that acquires the shape information (skin polygons) of the target using the images, or the images and the posture information, a shape representative value calculation unit that calculates shape representative values using the shape information, and a feature amount calculation unit that calculates feature amounts using the shape representative values. The posture information acquisition unit, the shape information acquisition unit, the shape representative value calculation unit, and the feature amount calculation unit are implemented by the processor of the computer, and the posture information, the shape information (skin polygons), the shape representative values, and the feature amounts are stored in the memory (storage unit) of the computer. Although not shown in FIG. 1, the storage unit of the computer also stores a skeleton model, a skin polygon model, a function that associates polygon vertex coordinates with the skeleton model (such as a skinning function), and the like.
[A-2] Acquisition of the posture (pose)
The posture of the target is specified by the joint positions of the target's skeleton model. In this embodiment, the posture information is acquired using video motion capture. Video motion capture is a method of synchronously capturing the motion of the target person with a plurality of cameras and performing a three-dimensional reconstruction of that motion (Patent Document 4, Non-Patent Document 2). The video motion capture system estimates joint positions by processing the images from a plurality of RGB cameras arranged so as to surround the target with OpenPose (Non-Patent Document 3), and improves the accuracy of the three-dimensional joint estimation by performing inverse dynamics calculations on the estimated three-dimensional joint positions using a skeleton model. For details of the video motion capture system, refer to Patent Document 4 and Non-Patent Document 2.
As shown in FIG. 20, the video motion capture according to this embodiment uses 18 measurement points, and each joint is labeled (Table 1). In addition, using the estimated joint positions, the line connecting adjacent joints is treated as a bone (see FIG. 21). There are 17 bones, and each bone is labeled (Table 1). The skeleton model, the skeleton information of the target (bone lengths), and the joint position coordinates of each frame acquired by motion capture are stored in the storage unit.
[Table 1]
Although OpenPose (Non-Patent Document 3) is used in the video motion capture according to this embodiment, other methods may be used to estimate two-dimensional joint positions from camera images by deep learning. Various methods for acquiring the posture of a target are known to those skilled in the art, and the posture acquisition method is not limited to video motion capture (Patent Document 4, Non-Patent Document 2); for example, a method that acquires motion data by analyzing RGB images from a single viewpoint may be adopted, or markerless motion capture using a camera and a depth sensor may be adopted.
[A-3] Acquisition of the shape
The shape of the target is specified by a polygon model or polygon mesh. A polygon model is composed of vertices, edges, and faces. In the case of a triangle mesh model, for example, each face has three vertices, the three sides of the triangular face are edges, and the start and end points of each edge are two of those three vertices. Each polygon can be represented by the three-dimensional coordinate values of all the vertices constituting it, and the simplest data structure for a polygon model is the IDs and three-dimensional coordinate values of all the vertices of the model. Various data structures for polygon models are known; it suffices that the shape of the target is expressed numerically, and the specific data structure is not limited.
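As an illustration of the data structure just described, the following is a minimal sketch in Python with NumPy (not part of the patented method; the class and field names are hypothetical) that holds a triangle mesh as vertex IDs with 3D coordinates plus triangles referencing those IDs.

import numpy as np
from dataclasses import dataclass

@dataclass
class TriangleMesh:
    # vertices[i] is the 3D coordinate of the vertex whose ID is i
    vertices: np.ndarray   # shape (N, 3), float
    # faces[j] holds the three vertex IDs of triangle j
    faces: np.ndarray      # shape (F, 3), int

    def edges(self) -> np.ndarray:
        # Return the unique undirected edges (pairs of vertex IDs).
        e = np.vstack([self.faces[:, [0, 1]],
                       self.faces[:, [1, 2]],
                       self.faces[:, [2, 0]]])
        return np.unique(np.sort(e, axis=1), axis=0)

# A single triangle as the smallest possible example.
mesh = TriangleMesh(vertices=np.array([[0., 0., 0.],
                                       [1., 0., 0.],
                                       [0., 1., 0.]]),
                    faces=np.array([[0, 1, 2]]))
print(mesh.edges())   # three edges between vertex IDs 0, 1, 2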
Regarding the acquisition of the shape of a human body, methods using a parametric body model, in which the body shape and posture can be specified by parameters, are known. An example of a parametric body model is SMPL (A Skinned Multi-Person Linear Model) (Patent Documents 1 and 2, Non-Patent Document 1). SMPL is a parametric model of a naked human that has 72 pose parameters and 10 body-shape parameters and returns a triangle mesh. In model-based methods, three-dimensional shape information is acquired by optimizing the parameters of the model so that the model fits the human silhouette in the image. Among shape acquisition methods using the SMPL model, there are methods that construct a skin model in a state where no clothes are worn (Non-Patent Documents 5 and 6). In the experiments described later, the mesh model of the target was obtained using the method of Non-Patent Document 5.
[A-4] Feature amounts acquired from skin polygons with posture information
By using video motion capture (Patent Document 4, Non-Patent Document 2) with one or more cameras and a computer, it is possible to reconstruct the motion of a target in three dimensions, and it is also possible to reconstruct in three dimensions the approximate outer shape of the skin from the images of one or more cameras even when clothes are worn (Non-Patent Documents 5 and 6). When the motion of the target is three-dimensionally reconstructed, rich motion information of the target can be expressed by reconstructing not only the motion of the skeleton but also the shape information represented by the skin. That is, by combining the time-series data of the posture and the time-series data of the shape, motion information can be obtained from the temporal change of the individual's shape (the deforming skin polygons). Motion feature amounts are extracted by quantifying, using shape representative values or shape descriptors, the skin polygons that deform with the motion.
A known shape descriptor can be used to obtain the shape representative value or shape descriptor. Many candidates are conceivable; in this embodiment, the Heat Kernel Signature (Non-Patent Document 8) is used. In this embodiment, a plurality of representative regions (a group of closed curves, i.e., a ring set) on the body surface of the target are identified by HKS. Computing the shape representative value or shape descriptor from the information of the representative regions makes it possible to reduce the amount of computation.
As shown in FIG. 8, when the posture of the target changes from posture 1 to posture 2 to posture 3 during motion, the coordinate values of ring 1, the coordinate values of ring 2, and the spatial relationship between ring 1 and ring 2 also change depending on the posture. For example, as shown in FIG. 9, for any two rings, the distance from each vertex of one ring to every vertex of the other ring is calculated, and a vector storing these values is obtained. A vector using the distances between all the rings constituting the ring set representing the shape of the target, or between specific rings, is used as the shape representative value or shape descriptor. The shape representative value or shape descriptor is obtained for each of posture 1, posture 2, and posture 3 to obtain time-series data of the shape representative values or shape descriptors. The array of this time-series data may be used as the motion feature amount, or the array may be reduced in dimension to form the motion feature amount. This motion feature amount contains information such as how the plurality of representative parts on the skin of the target moved apart or closer together with the motion of the target (posture 1 → posture 2 → posture 3); the motion feature amount obtained from skin polygons with posture information can therefore be said to reflect the motion information consisting of the time-series data of the target's posture as well as the skeleton information.
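The following is a minimal sketch (Python with NumPy, illustrative only) of the computation described above: for each frame, the distances from every vertex of one ring to every vertex of another ring are flattened into a vector, and the per-frame vectors are stacked into time-series data; the final reduction to per-distance mean and standard deviation is only one possible way of lowering the dimension, not something prescribed by the text.

import numpy as np

def ring_distance_vector(ring_a: np.ndarray, ring_b: np.ndarray) -> np.ndarray:
    # ring_a: (Na, 3), ring_b: (Nb, 3) vertex coordinates of two rings.
    # Returns the Na*Nb distances from each vertex of ring_a to every vertex of ring_b.
    diff = ring_a[:, None, :] - ring_b[None, :, :]       # (Na, Nb, 3)
    return np.linalg.norm(diff, axis=2).ravel()          # (Na*Nb,)

def motion_feature(frames, ring_ids=(0, 1)) -> np.ndarray:
    # frames[t] maps a ring ID to the (N, 3) coordinates of that ring at frame t.
    # The shape representative value of each frame is the ring-to-ring distance vector;
    # the stacked time series is reduced to per-distance mean and std as a feature.
    series = np.stack([ring_distance_vector(f[ring_ids[0]], f[ring_ids[1]])
                       for f in frames])                 # (T, Na*Nb)
    return np.concatenate([series.mean(axis=0), series.std(axis=0)])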
The motion feature amounts acquired from skin polygons with posture information reflect individual differences and individuality in motion, and by using these motion feature amounts, the individuality of motion can be turned into information and a personal authentication technique can be established. These motion feature amounts can be used as indices of changes in motion in exercise training and rehabilitation. They can also be used to obtain the similarity between one's own motion and another person's motion, and the distance between an individual's motion and a target motion can be expressed numerically. By capturing a person's walking motion as the temporal change of skin polygons with posture information, the motion feature amount acquired from the skin polygons becomes a feature amount of that individual's walking motion, and this feature amount can be used as important information for identifying the individual.
[A-5] Other feature amounts
Motion feature amounts can also be acquired from the time-series data of the posture. For example, the relative position of each joint with NOSE as the origin is calculated, the RWRIST, LWRIST, RANKLE, and LANKLE entries are extracted from the array of relative joint positions, these relative joint positions are collected over a fixed number of frames, and the array of per-joint variances is defined as a feature amount obtained from the motion information. Those skilled in the art will understand that there are various ways to obtain low-dimensional motion feature amounts.
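A sketch of this posture-based feature is given below (illustrative Python; the joint indices used here are hypothetical placeholders, whereas the actual labels follow Table 1).

import numpy as np

# Hypothetical indices into the 18-joint array; the actual labels follow Table 1.
NOSE, RWRIST, LWRIST, RANKLE, LANKLE = 0, 4, 7, 10, 13

def posture_motion_feature(joints: np.ndarray) -> np.ndarray:
    # joints: (T, 18, 3) joint positions over T frames.
    # Returns the variance of the NOSE-relative positions of the four end joints.
    rel = joints - joints[:, NOSE:NOSE + 1, :]            # positions relative to NOSE
    ends = rel[:, [RWRIST, LWRIST, RANKLE, LANKLE], :]    # (T, 4, 3)
    return ends.var(axis=0).ravel()                       # variance over frames, length 12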
The lengths of the labeled bones can be used as feature amounts. From this three-dimensional skeleton information, skeleton feature amounts and motion feature amounts are defined. As a skeleton feature amount, for example, an array containing the bone length values can be created and defined as the skeleton feature amount: the i-th element of the array contains the length of the bone with label i, and the length of the array is 17, equal to the number of bones. The skeleton feature amount may also be reduced in dimension.
Posture-independent shape feature amounts may also be acquired from the shape of the human body. For example, the thickness or volume of a certain part of the human body, or the thickness ratio or volume ratio between a plurality of parts, may be used as a shape feature amount.
Furthermore, a mass distribution may be estimated from the skin polygons of the human body according to its shape, and estimates of the joint and muscle forces generated by the motion may be used to extract motion feature amounts. The volume of each part of the human body can be calculated from the shape information of the human body, the specific gravities of the body and of body parts are known, and an approximate mass distribution can be calculated from the volumes and specific gravities. Using the mass distribution of the human body, the force applied to each joint and each muscle during the motion of an individual with that shape can be estimated. This is also effective information for personal authentication.
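As one way of obtaining the volumes needed for such a mass-distribution estimate, the volume enclosed by a closed triangle mesh can be computed as a sum of signed tetrahedron volumes; the sketch below (Python with NumPy) assumes a closed, consistently oriented mesh and a hypothetical constant density.

import numpy as np

def mesh_volume(vertices: np.ndarray, faces: np.ndarray) -> float:
    # Signed volume of a closed, consistently oriented triangle mesh
    # (divergence theorem: sum of tetrahedra formed with the origin).
    a, b, c = (vertices[faces[:, k]] for k in range(3))
    return float(np.abs(np.einsum('ij,ij->i', a, np.cross(b, c)).sum()) / 6.0)

# Mass of a body part would then be: volume of its sub-mesh x an assumed density.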
[A-6] Skin polygon model
The skin polygon model according to this embodiment is described with reference to FIG. 3. The posture of the target is represented by the skeleton model, and time-series data of the posture (posture 1 to posture 5) are acquired from the moving-image data (time-series image data). Postures 1 to 5 do not necessarily have to be consecutive frames; they are, for example, characteristic time-series frames in a given motion, or time-series frames extracted every predetermined number of frames from consecutive frames. The skin polygon model provides skin polygons with posture information corresponding to each posture. Skin polygon 1 to skin polygon 5, corresponding to posture 1 to posture 5, have vertex coordinates corresponding to posture 1 to posture 5, respectively. Using the vertex coordinates of the skin polygons, a shape representative value representing the shape of the target defined by those skin polygons is obtained. By acquiring shape representative value 1 to shape representative value 5 from skin polygon 1 to skin polygon 5, respectively, time-series data of the shape representative values are obtained. The time-series data of the shape representative values reflect the motion data of the target, and the feature amount of the target's motion is acquired using the time-series data of the shape representative values.
One embodiment of acquiring the skin polygon model is described with reference to FIG. 4. The posture information of the target is acquired from the moving image of the target by motion capture. Based on an image of the target in a certain posture, an initial posture is obtained, and at the same time the skin polygons corresponding to the initial posture are obtained; known means can be used to acquire skin polygons from an image. By aligning the coordinate systems of the skeleton model and the skin polygon model, the posture (joint positions) and the vertex coordinates of the polygons are made to correspond. A function (for example, a skinning function) that associates the vertex coordinates of the skin polygons with the skeleton model (posture) has been obtained, and the coordinates of each vertex in an arbitrary posture of the target can be acquired from the initial coordinates, the initial posture, and the arbitrary posture using that function. That is, skin polygon 1 to skin polygon 6 can be acquired from posture 1 to posture 6, respectively.
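A minimal linear blend skinning (LBS) sketch is given below as one example of such a function mapping a posture to vertex coordinates; the skinning actually used with an SMPL-type model may differ (SMPL additionally applies pose- and shape-dependent blend shapes), so this is only an illustrative assumption.

import numpy as np

def linear_blend_skinning(rest_vertices: np.ndarray,
                          weights: np.ndarray,
                          bone_transforms: np.ndarray) -> np.ndarray:
    # rest_vertices: (N, 3) vertex coordinates in the initial posture.
    # weights: (N, J) skinning weights (each row sums to 1).
    # bone_transforms: (J, 4, 4) transforms from the initial posture to the current posture.
    # Returns the (N, 3) vertex coordinates in the current posture.
    v_h = np.concatenate([rest_vertices, np.ones((len(rest_vertices), 1))], axis=1)  # (N, 4)
    # Blend the per-bone transforms for each vertex, then apply them.
    blended = np.einsum('nj,jab->nab', weights, bone_transforms)                     # (N, 4, 4)
    skinned = np.einsum('nab,nb->na', blended, v_h)                                  # (N, 4)
    return skinned[:, :3]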
One embodiment of acquiring the shape representative value is described with reference to FIG. 5. Skin polygon 1 has vertex coordinates corresponding to posture 1; a shape descriptor is calculated using all the vertex coordinates, a plurality of representative regions are extracted using the calculation results, and the shape representative value is calculated using the set consisting of the plurality of representative regions. A representative region is a vertex set specified by vertex IDs and coordinates. The shape representative value is, for example, a function of all the vertex coordinates of any two representative regions, for example, the distances between the vertices of any two representative regions.
For example, when HKS is used as the shape descriptor, a plurality of representative regions (rings) are extracted using a threshold and the HKS values, and the ring set is used as the shape representative region. More specifically, as shown in FIG. 6, all vertices are divided into two groups using the HKS values and the threshold, and the vertex set located at the boundary between the two groups is detected as a ring. A ring is specified by vertex IDs and coordinates. By changing the threshold, a plurality of rings can be detected, and a ring set consisting of the plurality of rings is acquired.
A ring set is extracted by applying HKS to all the vertices of skin polygon 1. Each ring of the ring set is specified by vertex IDs. Skin polygon 2 to skin polygon 5 are each composed of vertices having IDs and coordinates; the rings in skin polygon 2 to skin polygon 5 are identified from the IDs of the vertices constituting the rings, so that ring sets specified by vertex IDs and coordinates are obtained. The shape representative value is calculated using the ring set.
The calculation of the feature amount using time-series data of the shape representative values is described with reference to FIG. 7. The ring set corresponding to posture 1 is acquired. A function of all vertex coordinates of all rings constituting the ring set, or of selected rings, is defined; this function is, for example, the distance between the vertex coordinates of one ring and those of another ring (see FIG. 9). These distances are used as shape representative value 1. The ring set corresponding to posture 2 is acquired, and shape representative value 2 is calculated in the same way. The feature amount calculated using shape representative value 1 and shape representative value 2 reflects information about how one ring approached or moved away from another ring when the posture of the target changed from posture 1 to posture 2 (see FIG. 8). Therefore, when posture 1 to posture 6 represent a specific motion of the target, shape representative value 1 to shape representative value 6 are considered to represent the movement of the target's skin during that motion, and the feature amount acquired from shape representative value 1 to shape representative value 6 is a motion feature amount.
In FIG. 6, when posture 1 to posture 5 correspond to a walking motion, the skin polygons change depending on posture 1 to posture 5. That is, the movement of the skin polygons (skin polygon 1 to skin polygon 5) corresponds to the walking motion, and the time-series data of the shape representative values representing the shapes of the skin polygons represent the walking motion. The motion feature amount acquired from the time-series data of the shape representative values can be important information for identifying an individual. Similarly, in FIG. 6, when posture 1 to posture 5 correspond to a golf swing, the time-series data of the shape representative values representing the shapes of the skin polygons represent the swing motion. For example, the swing motion when the performance is good can be recorded as a motion feature amount acquired from the time-series data of the shape representative values; when the performance deteriorates, a motion feature amount can be acquired from the current swing motion, and the difference in performance can be quantified as the distance (difference) between the motion feature amounts.
In FIGS. 6 and 7, in one embodiment, the annular vertex groups or rings serving as the shape representative regions are acquired using HKS, but the setting of the shape representative regions is not limited to using HKS, and other existing shape descriptors may be used. Also, as shown in FIG. 23, shape representative regions may be set at selected positions in a plurality of predetermined parts of the human body, for example the thigh, lower leg, forearm, upper arm, chest, abdomen, or waist, and each shape representative region may be determined by the vertex IDs of the vertex set forming that region. In the embodiment shown in FIG. 24, two shape representative regions are provided at positions corresponding to the shoulder blades on the back of the human body, and the movement of the shoulder blades may be analyzed from the information obtained from the temporal change of the shape representative regions. In the embodiment shown in FIG. 25, one shape representative region is provided over a wide area of the back of the human body, and the movement of the human body may be analyzed from the information obtained from the temporal change of the shape representative region. In FIGS. 24 and 25 the shape representative regions are annular vertex groups, but a shape representative region may also be a planar vertex group consisting of a plurality of vertices.
In the embodiment shown in FIG. 10, HKS is applied to each of skin polygon 1 to skin polygon 5 to acquire shape representative region 1 (ring set 1) to shape representative region 5 (ring set 5), respectively. Shape representative value 1 to shape representative value 5 are calculated from shape representative region 1 (ring set 1) to shape representative region 5 (ring set 5). In this embodiment, the shape representative value is the perimeter and/or the cross-sectional area of each ring, and the array consisting of shape representative value 1 to shape representative value 5 acquired over a plurality of frames is used as the feature amount.
The acquisition of shape representative values consisting of cross-sectional areas is described with reference to FIG. 11. The ring set corresponding to posture 1 is acquired, the cross-sectional area of each ring (representative region) is calculated using the vertex coordinates of each ring constituting the ring set, and shape representative value 1 for posture 1 is acquired from the set of cross-sectional areas. The ring set corresponding to posture 2 is acquired, the cross-sectional area of each ring (representative region) is calculated using the vertex coordinates of each ring constituting the ring set, and shape representative value 2 for posture 2 is acquired from the set of cross-sectional areas. Shape representative values based on cross-sectional areas are feature amounts with little dependence on the posture of the target and can be used for personal authentication (see the experiments described later).
[B] Shape descriptors for the body-surface information of the target
[B-1] HKS (Heat Kernel Signature)
Shape descriptors are known as a method for extracting feature amounts from three-dimensional shape information. Shape descriptors are used for matching and retrieving shapes, and various methods have been proposed. HKS (Heat Kernel Signature) and WKS (Wave Kernel Signature) are known as shape descriptors for matching between non-rigid bodies such as the three-dimensional body surface of a human. In this embodiment, HKS, which uses the heat diffusion equation on the body surface, is adopted, and shape representative values representing the shape of the target are acquired using HKS.
HKS (Heat Kernel Signature) as a shape descriptor is explained below. Let x and y be the three-dimensional coordinates of two different points on the body surface, M the set of points of the manifold surface, t the heat diffusion time, and u(x, t) the heat distribution. The heat diffusion equation on the three-dimensional surface is

  Δ_M u(x, t) = -∂u(x, t)/∂t,

and k_t(x, y) in its fundamental solution

  u(x, t) = ∫_M k_t(x, y) u(y, 0) dy

is called the heat kernel. The eigendecomposition of the heat kernel is

  k_t(x, y) = Σ_i e^(-λ_i t) φ_i(x) φ_i(y),

where λ_i and φ_i are the i-th eigenvalue and eigenfunction of Δ, respectively. Setting y = x, H(x, t) = k_t(x, x) is defined as the HKS (Heat Kernel Signature).
From the above equations, the HKS is defined on each set M for each piece of shape information and is a function that depends on the three-dimensional coordinate x and the time t. The time t here is not the time of the model's motion but the elapsed diffusion time within the HKS. FIG. 12 shows the vertices of a model actually colored according to their HKS values (FIG. 12 is in fact a color image); the maximum and minimum HKS values are mapped to the maximum and minimum of the color map, respectively. The HKS value is smallest near the torso and increases toward the extremities of the body (hands and feet).
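A minimal HKS sketch in Python with NumPy/SciPy is given below; for brevity it approximates the Laplace operator by a simple graph Laplacian of the mesh, whereas an implementation following Non-Patent Document 8 would normally use a cotangent Laplace-Beltrami discretization with area weights, and the appropriate scale of t depends on that choice (the experiments described later use t = 1000 with their own discretization).

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def heat_kernel_signature(vertices, faces, t=0.1, n_eig=100):
    # HKS H(x, t) = sum_i exp(-lambda_i * t) * phi_i(x)^2 for every vertex x.
    # A graph Laplacian is used here as a rough stand-in for the Laplace-Beltrami operator.
    n = len(vertices)
    e = np.vstack([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    e = np.unique(np.sort(e, axis=1), axis=0)             # undirected mesh edges
    w = np.ones(len(e))
    adj = sp.coo_matrix((np.r_[w, w],
                         (np.r_[e[:, 0], e[:, 1]], np.r_[e[:, 1], e[:, 0]])),
                        shape=(n, n)).tocsr()
    lap = sp.diags(np.asarray(adj.sum(axis=1)).ravel()) - adj
    # Smallest eigenpairs of the Laplacian (slow but simple; shift-invert is faster in practice).
    lam, phi = eigsh(lap, k=min(n_eig, n - 1), which='SM')
    return (np.exp(-lam * t) * phi ** 2).sum(axis=1)       # (n,) HKS values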
[B-2] Representative regions representing the shape
The extraction of regions representing the shape is described below. Let S be a set of coordinates on the body surface at which the HKS takes similar values, and let M be the set of all coordinates on the body surface of the polygon model. The set M of all coordinates on the body surface and the partial set S on the body surface are sets of mesh vertices.
How the set S is determined is explained next. First, according to the HKS value of each vertex, the set M is divided into two sets A and B, those at or above a predetermined threshold H_th and those below it:

  A = { x ∈ M | H(x, t) ≥ H_th },  B = { x ∈ M | H(x, t) < H_th }.
At this time, the set S consists of the vertices of the triangular polygons whose three vertices simultaneously include points belonging to group A and points belonging to group B (see FIG. 13). The vertex set constituting S can be varied by changing the value of the threshold H_th. The HKS value at the k-percentile point, obtained when the HKS values of all vertices are sorted in ascending order, is used as the threshold H_th,k, and the vertex set detected with this threshold is denoted S_k (k = 1, ..., 99).
The 99 vertex sets S are denoted S_1, S_2, ..., S_99 in ascending order. FIG. 15 shows the vertex sets S obtained as the threshold H_th is varied; a boundary region consisting of a vertex set S is a closed-curve-like region, i.e., a ring. FIG. 15 shows, from left to right, the boundary regions corresponding to S3, S8, S38, S70, and S80, that is, with the value of H_th increasing from left to right; as H_th increases, the positions of the boundary regions move toward the extremities of the body (hands and feet). The number of boundary regions can also differ depending on the value of H_th: in the two figures on the left of FIG. 15 the number of boundary regions is 2, in the center figure it is 5, and in the two figures on the right it is 4.
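A sketch of the boundary detection described above (Python with NumPy, illustrative): for the k-percentile threshold H_th,k, the vertices are split into groups A and B by their HKS values, and the vertex set S_k is collected from the triangles that contain vertices of both groups.

import numpy as np

def boundary_vertex_set(hks: np.ndarray, faces: np.ndarray, k: float) -> np.ndarray:
    # hks: (N,) HKS value of each vertex; faces: (F, 3) vertex IDs per triangle;
    # k: percentile (1..99). Returns the IDs of the vertices forming the set S_k.
    h_th = np.percentile(hks, k)                          # threshold H_th,k
    in_a = hks >= h_th                                    # group A membership per vertex
    face_a = in_a[faces]                                  # (F, 3) booleans
    mixed = face_a.any(axis=1) & (~face_a).any(axis=1)    # triangles straddling the boundary
    return np.unique(faces[mixed])                        # vertex IDs forming S_k

# The individual rings can then be separated, e.g. by clustering these vertices
# by their coordinates (k-means), as described in the text.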
FIG. 17 shows, as an example, skin models with different body shapes and postures in which all 17 boundaries of the sets S3, S8, S38, S70, and S80 are displayed together on one skin model. The figure shows that the positions of the boundary regions are substantially the same even when the body shape and posture differ.
The vertex sets constituting the boundary regions may be further classified. For example, a representative region (ring) may be extracted by applying k-means clustering, using the vertex coordinates, to the vertex set included in each boundary region. The shape representative value is calculated using the ring set consisting of all 17 boundary regions (rings) of the set S.
[B-3] Shape representative values (cross-sectional area or perimeter of a ring)
In this embodiment, personal authentication is performed using feature amounts obtained from shape information via HKS. By using HKS, the same body part can be matched between people even when their shapes differ. Therefore, the same part (ring) is found using HKS, and identification based on shape is performed by comparing those parts.
The array containing the cross-sectional areas at all 17 boundaries of the boundary set S obtained in this way was used as the feature amount of each model. For the vertices of each boundary, principal component analysis is used to find the plane on which the variance of the vertex coordinates is largest, and each vertex is projected onto that plane to reduce it to two dimensions. A convex hull is computed from the projected vertices, and its perimeter and area are used as feature amounts.
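A sketch of this step (Python with NumPy/SciPy): the ring vertices are projected onto the plane of their first two principal components, and the perimeter and area of the 2D convex hull are returned; note that for a two-dimensional ConvexHull in SciPy the volume attribute is the enclosed area and the area attribute is the perimeter.

import numpy as np
from scipy.spatial import ConvexHull

def ring_perimeter_and_area(ring: np.ndarray) -> tuple:
    # ring: (N, 3) coordinates of one boundary (ring). Returns (perimeter, area)
    # of the convex hull of the vertices projected onto their principal plane.
    centered = ring - ring.mean(axis=0)
    # Principal axes via SVD; the first two right singular vectors span the
    # plane with the largest variance of the vertex coordinates.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pts2d = centered @ vt[:2].T                           # (N, 2) projected vertices
    hull = ConvexHull(pts2d)
    return hull.area, hull.volume                         # perimeter, enclosed area (2D case)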
For each boundary set S_i, arrays i_c and i_a are created by sorting, in ascending order, the arrays containing the perimeters and the areas of the convex hulls obtained from S_i. Next, let S = {S_a, S_b, ...} be a set combining a plurality of boundary sets S_i. With m the label of the person and p the label of the pose, the arrays c(m, p) = [a_c, b_c, ...] and a(m, p) = [a_a, b_a, ...], which have the person, the pose, and the set S (that is, the combination of boundary sets S_i) as variables, are created by stacking the arrays i_c and i_a defined for each S_i included in this set S. The c(m, p) and a(m, p) obtained in this way are used as the feature amounts from the shape information. In the example of FIG. 15, the set S is S = {S3, S8, S38, S70, S80}; from the numbers of boundaries, the numbers of elements k_i of the arrays 3_c, 8_c, 38_c, 70_c, 80_c are k_3 = 2, k_8 = 2, k_38 = 5, k_70 = 4, k_80 = 4, so the array length of c(m, p) is 17.
[C] Experiments
[C-1] Data set used
In this chapter, people in a data set were identified using the proposed feature arrays, and the usefulness of the features obtained by the proposed method was confirmed. The SHREC'14 (Shape Retrieval of Non-Rigid 3D Human Models) Human Dataset (Non-Patent Document 7) was used for the experiment. It consists of a total of 400 mesh models in which 40 people take 10 kinds of poses (see FIG. 17), and each mesh has around 15,000 vertices. Since the accuracy of comparison within the same sex should be prioritized in personal authentication, the models of the 20 men among the 40 people were used. Some of these 200 models could not be used because part of the mesh was missing and the HKS could not be computed; excluding them, a total of 163 skin models (20 people, 9 poses) were finally used for identification. For each of these models, the HKS at time t = 1000 was computed and used as the data.
[C-2] Normalization
Although the HKS value is normalized so that it does not depend on the size of the model, the cross-sectional areas of the boundary regions derived from the HKS were normalized by making the model sizes uniform, so that the feature amounts do not depend on model size. Possible normalization methods include unifying the height (femur length), unifying the cross-sectional area of the neck, and unifying the volume; in the experiment, normalization by volume was adopted. Note that, depending on the purpose for which the feature amounts are used, normalization may or may not be necessary, and those skilled in the art will understand that the normalization means is not limited to the above.
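The volume normalization can be sketched as follows, under the assumption (not stated explicitly in the text) that each model is uniformly rescaled so that all models share a reference volume, which multiplies every cross-sectional area by (V_ref / V)^(2/3).

import numpy as np

def normalize_areas_by_volume(areas: np.ndarray, volume: float, ref_volume: float = 1.0) -> np.ndarray:
    # areas: cross-sectional areas of the boundary rings of one model.
    # Rescaling the model so its volume equals ref_volume multiplies lengths by
    # (ref_volume / volume) ** (1/3) and therefore areas by the 2/3 power.
    return areas * (ref_volume / volume) ** (2.0 / 3.0)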
[C-3] Difference in correct answer rate depending on the pose
The relationship between the pose and the correct answer rate was investigated. Each pose is shown in FIG. 17; from the upper left to the lower right they are poses 0 to 9. Since the mesh was missing in pose 8, 9 of the 10 poses were used in the experiment. Table 2 shows the correct answer rates at n = 1, 2, 3 when normalizing by volume; the horizontal axis shows the pose type (p), and the vertical axis shows the correct answer rate at n, where n is the number of body-shape candidates inferred from a given test data (that is, the rate at which the correct person is included among the top n candidates).
[Table 2]
The results suggest that the arm positions and arm joint angles have little effect on the correct answer rate, whereas the knee joint angle can affect it. When the knee joint angle is obtuse, as in p = 3, 4, 6, there is no significant effect on identification, but when it becomes acute the effect can no longer be ignored.
[C-4] Optimal boundary regions
The verification above used a ring set consisting of the sets S3, S8, S38, S70, and S80, but these are merely examples, and the set of boundary regions is not limited to them. There are 99 sets S_h (h = 1, ..., 99). Of the 19 sets whose index h is a multiple of 5 (h = 5, 10, ..., 90, 95), a total of 15 sets S_h were examined, excluding S20, which does not form a ring, and S85, S90, and S95, which correspond to the hands and feet. Four sets were chosen from the 15 sets S_h (about 3,000 combinations, 15C4), and the optimal combination among them was determined. Table 3 shows the five combinations with the highest correct answer rates among the approximately 3,000 combinations.
[Table 3]
The maximum correct answer rate was 96.93%, obtained with the combination of S5, S15, S30, and S80. S5 is included in all of the top five combinations, showing that the area near the torso is an important feature. Except for S35, the top combinations include a boundary of S15 or below, that is, around the torso, and a boundary of S65 or above, that is, between the elbow and the wrist. This suggests that, when the volume is normalized, the regions around the torso and from the elbow to the wrist contribute most. The ring set consisting of S5, S15, S30, and S80 is shown in FIG. 18.
[C-5] Identification experiment from walking motion
In this experiment, the walking motion of the subjects on a treadmill was measured. The speed of the treadmill was set to 4.0 km/h, the average human walking speed. Four cameras were installed so as to surround the treadmill, and about one minute of walking was recorded at 60 fps for four subjects: three men in their twenties and one man in his thirties.
Since we wanted to use skin models without clothing as the input in the experiment, skin models were created by HMR (Human Mesh Recovery) disclosed in Non-Patent Document 5. The method of Non-Patent Document 5 reconstructs a skin model from a single image; the skin model may also be reconstructed from a plurality of images.
In this experiment, a skin model acquired in advance in a predetermined posture (for example, an upright posture or a T-pose) was deformed into the walking poses by embedding the joint positions acquired by video motion capture (Patent Document 4, Non-Patent Document 2), and the result was treated as the shape information acquired from the cameras. FIG. 22 shows the result of deforming the posture by fitting the joint positions acquired by video motion capture to the skin model acquired in advance; from left to right, it shows the walking model every 20 frames (about 0.3 seconds). The shape information may instead be three-dimensionally reconstructed from the video on the treadmill.
For this skin model, feature amounts were extracted using S8, S20, S30, and S80 as the boundaries, and the four subjects were identified. As the evaluation method, shape feature amounts were extracted from part of the measured walking data, and the average feature amount for each person was taken as that person's base data. Next, the feature amounts extracted from walking data in frames other than those used to create the base data were used as test data. To create the base data, since walking is not stable at the start and end, 600 consecutive frames were selected from the frames excluding those parts. From the selected 600 frames, one frame was extracted every 10 frames, for a total of 60 frames; a skin model was generated for each of these frames, an array containing the cross-sectional areas at the representative regions S8, S20, S30, and S80 was created for each skin model, and the element-wise average over the 60 arrays was taken as the base data.
Next, to create the test data, from the walking data excluding the portion used for the base data, a further 200 frames were selected, and one frame was taken every 10 frames, for a total of 20 frames. For each of these frames, the skin model was deformed to fit the skeleton, and the array of cross-sectional areas created from that skin model was used as test data, giving 20 arrays per person. In total, 4 base data (one per person) and 80 test data (20 per person) were created. For each test data, the L1 norms of the differences from the base data of the four people were compared, the base data with the smallest value was taken as the estimation result, and the proportion of cases in which the estimation result and the test data belonged to the same person was 88.75%.
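A sketch of this evaluation (Python with NumPy, illustrative; function names are hypothetical): each person's base data is the element-wise mean of the feature arrays from the sampled frames, and each test array is assigned to the person whose base data minimizes the L1 norm of the difference.

import numpy as np

def build_base_data(features_per_person: dict) -> dict:
    # features_per_person[name]: (n_frames, n_features) arrays of ring cross-sectional
    # areas from the base frames. The base data is the per-element mean.
    return {name: f.mean(axis=0) for name, f in features_per_person.items()}

def identify(test_feature: np.ndarray, base_data: dict) -> str:
    # Return the person whose base data has the smallest L1 distance to the test array.
    return min(base_data, key=lambda name: np.abs(test_feature - base_data[name]).sum())

def accuracy(tests: list, base_data: dict) -> float:
    # tests: list of (true_name, feature_array) pairs.
    hits = sum(identify(f, base_data) == name for name, f in tests)
    return hits / len(tests)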

Claims (17)

1. A method for acquiring a motion feature amount, wherein
the shape of a target is specified by skin polygons,
each vertex of the skin polygons has coordinates that depend on the posture of the target, and
the shape of the target is represented by one or more representative regions selected from the skin polygons, each representative region being a vertex group consisting of a plurality of vertices,
the method comprising:
preparing time-series data of the skin polygons during motion of the target;
calculating, in a plurality of frames, a shape representative value representing the shape of the target in each frame using the one or more representative regions; and
acquiring, as a motion feature amount, a value representing the temporal change of the one or more representative regions accompanying the motion of the target, using the time-series data of the shape representative values.
2. The method for acquiring a motion feature amount according to claim 1, wherein
the shape of the target is represented by a plurality of representative regions, and
the temporal change of the plurality of representative regions includes the temporal change of the spatial relationship between the representative regions accompanying the motion of the target.
3. The method for acquiring a motion feature amount according to claim 2, wherein
the spatial relationship between the representative regions is defined by a function of the vertex coordinates of any two representative regions.
4. The method for acquiring a motion feature amount according to claim 2 or 3, wherein
the spatial relationship between the representative regions is defined by inter-vertex distances between any two representative regions.
  5.  The method for acquiring a motion feature amount according to any one of claims 1 to 3, wherein
      the representative region is a ring-shaped vertex group whose vertices are arranged in a ring.
  6.  The method for acquiring a motion feature amount according to claim 5, wherein the ring-shaped vertex groups are obtained by:
      acquiring an HKS value for every vertex using the Heat Kernel Signature (HKS);
      dividing all vertices into two groups by a threshold;
      acquiring a ring-shaped vertex group consisting of the set of vertices that lie on the boundary between the two groups and are arranged in a ring; and
      acquiring a plurality of ring-shaped vertex groups by varying the threshold.
  7.  The method for acquiring a motion feature amount according to any one of claims 1 to 6, wherein
      the posture of the target is specified by a skeleton model,
      a function associating each vertex of the skin polygon model with the skeleton has been obtained,
      initial coordinates of each vertex of the skin polygon model of the target have been obtained depending on a specific initial posture, and
      the coordinates of each vertex in an arbitrary posture can be acquired from the initial coordinates, the initial posture, and the arbitrary posture by using the function.
  8.  The method for acquiring a motion feature amount according to any one of claims 1 to 7, wherein
      the posture of the target is acquired by markerless motion capture using one or more images.
  9.  A method for acquiring shape representative information, wherein
      the shape of a target is specified by skin polygons, and
      each vertex of the skin polygon model has coordinates that depend on the posture of the target,
      the method comprising:
      acquiring an HKS value for every vertex using the Heat Kernel Signature (HKS);
      dividing all vertices into two groups by a threshold; and
      acquiring, as a shape representative region, a ring-shaped vertex group consisting of the set of vertices that lie on the boundary between the two groups and are arranged in a ring.
  10.  The method for acquiring shape representative information according to claim 9, wherein
      the shape of the target is represented by a plurality of ring-shaped vertex groups, and
      the plurality of ring-shaped vertex groups are determined by varying the threshold.
  11.  The method for acquiring shape representative information according to claim 10, wherein
      the plurality of ring-shaped vertex groups representing the shape of the target are represented by a function of the vertex coordinates of two ring-shaped vertex groups.
  12.  The method for acquiring shape representative information according to claim 10 or 11, wherein
      the plurality of ring-shaped vertex groups representing the shape of the target are represented as an inter-vertex distance between two ring-shaped vertex groups.
  13.  The method for acquiring shape representative information according to claim 9 or 10, wherein
      the plurality of ring-shaped vertex groups representing the shape of the target are represented by the area of the region enclosed by each ring-shaped vertex group and/or by its perimeter.
  14.  A device for acquiring a motion feature amount, comprising a storage unit, a shape representative value calculation unit, and a motion feature amount calculation unit, wherein
      the storage unit stores time-series data of skin polygons specifying the shape of a target during motion, each vertex of the skin polygons having a vertex ID and coordinates that depend on the posture of the target,
      the shape of the target is represented by one or more representative regions selected from the skin polygons, each representative region being a vertex group specified by the vertex IDs and coordinates of a plurality of vertices,
      the shape representative value calculation unit calculates, using the one or more representative regions, a shape representative value representing the posture-dependent shape of the target, and
      the motion feature amount calculation unit calculates, as a motion feature amount, a value representing the temporal change of the one or more representative regions accompanying the motion of the target, using the time-series data of the shape representative values acquired in a plurality of frames.
  15.  The device for acquiring a motion feature amount according to claim 14, wherein
      the shape of the target is represented by a plurality of representative regions,
      the shape representative value calculation unit calculates, from the spatial relationship between the representative regions, a shape representative value representing the posture-dependent shape of the target, and
      the motion feature amount calculation unit calculates, as a motion feature amount, a value representing the change in the spatial relationship between the representative regions accompanying the motion of the target, using the time-series data of the shape representative values acquired in a plurality of frames.
  16.  The device for acquiring a motion feature amount according to claim 15, wherein
      the shape representative value is defined by a function of the vertex coordinates of any two representative regions.
  17.  The device for acquiring a motion feature amount according to claim 15 or 16, wherein
      the shape representative value is defined by an inter-vertex distance between any two representative regions.
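For readers who want to experiment with the ring-shaped vertex groups recited in claims 5, 6 and 9 to 13, the sketch below shows one possible, non-authoritative implementation. It assumes per-vertex HKS values have already been computed (the Laplace-Beltrami eigendecomposition needed for HKS is omitted), and the angular ordering of the boundary vertices is a simplification valid only for roughly convex cross-sections.

```python
import numpy as np

def ring_vertex_group(hks, edges, threshold):
    """Vertices on the boundary between the two threshold-defined groups.

    hks       : (V,) per-vertex Heat Kernel Signature values (precomputed)
    edges     : (E, 2) integer vertex-index pairs of the mesh edges
    threshold : scalar used to split the vertices into two groups
    """
    above = hks >= threshold                              # group label per vertex
    crossing = above[edges[:, 0]] != above[edges[:, 1]]   # edges linking both groups
    return np.unique(edges[crossing])                     # boundary (ring) vertices

def order_ring(ring_xyz):
    """Order ring vertices by angle in their best-fit plane (assumes a roughly
    convex cross-section, e.g. around a limb)."""
    centered = ring_xyz - ring_xyz.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    uv = centered @ vt[:2].T                              # 2D plane coordinates
    return ring_xyz[np.argsort(np.arctan2(uv[:, 1], uv[:, 0]))]

def ring_perimeter_and_area(ring_xyz):
    """Perimeter and approximate enclosed area of an *ordered* ring of 3D points."""
    closed = np.vstack([ring_xyz, ring_xyz[:1]])
    perimeter = np.linalg.norm(np.diff(closed, axis=0), axis=1).sum()
    centered = ring_xyz - ring_xyz.mean(axis=0)
    normal = np.linalg.svd(centered, full_matrices=False)[2][-1]  # best-fit normal
    cross_sum = np.cross(closed[:-1], closed[1:]).sum(axis=0)
    area = 0.5 * abs(np.dot(cross_sum, normal))           # planar shoelace formula
    return perimeter, area

def ring_distance(ring_a_xyz, ring_b_xyz):
    """One simple inter-ring distance: distance between the ring centroids."""
    return np.linalg.norm(ring_a_xyz.mean(axis=0) - ring_b_xyz.mean(axis=0))

# Varying `threshold` yields multiple ring-shaped vertex groups (e.g. the
# boundaries referred to as S8, S20, S30, S80 in the description above).
```

With these pieces, the cross-sectional-area array used in the evaluation above can be obtained by applying `ring_perimeter_and_area` to each ring-shaped vertex group of a skin model fitted to one frame.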
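Claim 7 relies on a function that maps the initial vertex coordinates, the initial posture and an arbitrary posture to posture-dependent vertex coordinates. The application does not prescribe a specific form for this function; as an illustrative assumption, the sketch below uses linear blend skinning (LBS), a common choice for associating skin-polygon vertices with a skeleton.

```python
import numpy as np

def linear_blend_skinning(v0, weights, init_pose, new_pose):
    """Posture-dependent vertex coordinates via linear blend skinning (LBS).

    v0        : (V, 3) vertex coordinates in the initial posture
    weights   : (V, J) per-vertex joint weights, each row summing to 1
    init_pose : (J, 4, 4) world transforms of the J joints in the initial posture
    new_pose  : (J, 4, 4) world transforms of the J joints in the target posture
    """
    v0_h = np.hstack([v0, np.ones((v0.shape[0], 1))])          # homogeneous coords
    # per-joint transform taking initial-posture vertices to the new posture
    joint_tf = np.stack([n @ np.linalg.inv(b) for n, b in zip(new_pose, init_pose)])
    per_joint = np.einsum('jab,vb->vja', joint_tf, v0_h)       # (V, J, 4)
    blended = np.einsum('vj,vja->va', weights, per_joint)      # weighted blend
    return blended[:, :3]
```

Here the weight matrix plays the role of the function in claim 7 that associates each vertex with the skeleton; any other skinning scheme satisfying the claim could be substituted.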
PCT/JP2021/018809 2020-05-22 2021-05-18 Method and device for acquiring movement feature amount using skin information WO2021235440A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020-090047 2020-05-22
JP2020090047A JP2021184215A (en) 2020-05-22 2020-05-22 Method and apparatus for acquiring motion feature quantity using skin information

Publications (1)

Publication Number Publication Date
WO2021235440A1 true WO2021235440A1 (en) 2021-11-25

Family

ID=78708546

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/018809 WO2021235440A1 (en) 2020-05-22 2021-05-18 Method and device for acquiring movement feature amount using skin information

Country Status (2)

Country Link
JP (1) JP2021184215A (en)
WO (1) WO2021235440A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7120532B1 (en) * 2021-12-17 2022-08-17 株式会社ワコール Program, device and method for statistically analyzing body shape based on flesh from skin model

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010508609A (en) * 2006-11-01 2010-03-18 ソニー株式会社 Surface capture in motion pictures

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KANEZAKI, ASAKO ET AL.: "Learning Similarities for Rigid and Non-Rigid Object Detection", IEICE TECHNICAL REPORT, vol. 114, no. 230, 2014, pages 13-18 *
KOGA, HIROTAKA ET AL.: "Extraction and evaluation of gait feature values from video motion capture", PROCEEDINGS OF THE 2018 JSME ANNUAL CONFERENCE ON ROBOTICS AND MECHATRONICS, 2018, paper 1P1-D09 *

Also Published As

Publication number Publication date
JP2021184215A (en) 2021-12-02

Similar Documents

Publication Publication Date Title
Hesse et al. Computer vision for medical infant motion analysis: State of the art and rgb-d data set
Pala et al. Multimodal person reidentification using RGB-D cameras
Loper et al. MoSh: motion and shape capture from sparse markers.
US8023726B2 (en) Method and system for markerless motion capture using multiple cameras
CN108717531B (en) Human body posture estimation method based on Faster R-CNN
JP5873442B2 (en) Object detection apparatus and object detection method
CN107392086B (en) Human body posture assessment device, system and storage device
Uddin et al. Human activity recognition using body joint‐angle features and hidden Markov model
CN105740780B (en) Method and device for detecting living human face
CN105740781B (en) Three-dimensional human face living body detection method and device
Sundaresan et al. Model driven segmentation of articulating humans in Laplacian Eigenspace
JP2010176380A (en) Information processing device and method, program, and recording medium
CN102609683A (en) Automatic labeling method for human joint based on monocular video
CN110263605A (en) Pedestrian's dress ornament color identification method and device based on two-dimension human body guise estimation
CN104794449A (en) Gait energy image acquisition method based on human body HOG (histogram of oriented gradient) features and identity identification method
Iwasawa et al. Real-time, 3D estimation of human body postures from trinocular images
Wang Analysis and evaluation of Kinect-based action recognition algorithms
WO2021235440A1 (en) Method and device for acquiring movement feature amount using skin information
Zhang et al. Local surface geometric feature for 3D human action recognition
Imani et al. Histogram of the node strength and histogram of the edge weight: two new features for RGB-D person re-identification
Yamauchi et al. Recognition of walking humans in 3D: Initial results
Zhang A comprehensive survey on face image analysis
El-Sallam et al. A low cost 3D markerless system for the reconstruction of athletic techniques
Seely et al. View invariant gait recognition
Rafi et al. A parametric approach to gait signature extraction for human motion identification

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21809534

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21809534

Country of ref document: EP

Kind code of ref document: A1