WO2021169924A1 - Behavior prediction method and apparatus, gait recognition method and apparatus, electronic device, and computer-readable storage medium - Google Patents
Behavior prediction method and apparatus, gait recognition method and apparatus, electronic device, and computer-readable storage medium
- Publication number
- WO2021169924A1 (PCT/CN2021/077297)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- key point
- target object
- point information
- dimensional key
- target
- Prior art date
Classifications
- G06V40/23—Recognition of whole body movements, e.g. for sport training
- G06V40/25—Recognition of walking or running movements, e.g. gait recognition
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/60—Analysis of geometric attributes
- G06T7/70—Determining position or orientation of objects or cameras
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/34—Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/64—Three-dimensional objects
- G06V20/647—Three-dimensional objects by matching two-dimensional images to three-dimensional objects
- G06T2207/10016—Video; Image sequence
- G06T2207/10024—Color image
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30196—Human being; Person
Definitions
- the present disclosure relates to the field of computer vision technology, and in particular, to a behavior prediction method and device, a gait recognition method and device, electronic equipment, and a computer-readable storage medium.
- the accuracy of the predicted gait data and target center-of-gravity coordinates can be improved; based on the more accurate gait data and target center-of-gravity coordinates, the accuracy of the predicted behavior characteristic information can be improved, which effectively helps ensure the safety of the target object.
- the combined use of the two determined coordinates of the center of gravity can more accurately determine the final target center of gravity coordinates of the target object.
- the determining of the multiple three-dimensional key point information of the target object in the target image includes: for each frame of image in each target video segment, determining the detection frame of the target object in the frame of image based on the multiple two-dimensional key point information of the target object in the frame of image; normalizing, based on the size information of the detection frame and the coordinates of the center point of the detection frame, the coordinate information corresponding to each piece of two-dimensional key point information in the frame of image, to obtain multiple target two-dimensional key point information of the target object in the frame of image; and determining the multiple three-dimensional key point information of the target object in the target image based on the multiple target two-dimensional key point information of the target object in each frame of image.
- the determining of the multiple three-dimensional key point information of the target object in the target image based on the multiple target two-dimensional key point information of the target object in each frame of image includes: inputting the multiple target two-dimensional key point information of the target object in each frame of image into a trained first neural network, and processing the multiple target two-dimensional key point information through the first neural network to determine the multiple three-dimensional key point information of the target object in the target image.
- the multiple sample two-dimensional key point information is determined by back-projecting multiple standard three-dimensional key point information, which can improve the accuracy of the determined sample two-dimensional key point information.
- the determining of the multiple sample two-dimensional key point information of the first sample object in each frame of image of the first sample video segment includes: acquiring the device parameter information of the shooting device that shot the first sample video segment and the RGB picture of each frame of image in the first sample video segment; and, based on the device parameter information, the RGB picture of each frame of image, and the multiple standard three-dimensional key point information, determining the multiple sample two-dimensional key point information of the first sample object in each frame of image of the first sample video segment.
- in this way, the multiple sample two-dimensional key point information can be determined relatively accurately.
- the adjusting of the network parameters of the first initial neural network based on the error information between the multiple predicted three-dimensional key point information and the multiple standard three-dimensional key point information includes: obtaining the physical size information of the first sample object; based on the physical size information of the first sample object, determining, for each piece of standard three-dimensional key point information, the corresponding target standard three-dimensional key point information in the network-scale space; and adjusting the network parameters of the first initial neural network based on the error information between the multiple predicted three-dimensional key point information and the multiple target standard three-dimensional key point information.
- the determining of the forward direction of the target object based on the multiple three-dimensional key point information includes: determining, based on the multiple three-dimensional key point information, a first line between the left crotch and the right crotch of the target object and a second line between the left shoulder and the right shoulder of the target object; determining a minimum-error plane between the first line and the second line; and determining the forward direction of the target object based on the line of intersection between the minimum-error plane and the horizontal plane; or, based on the multiple three-dimensional key point information, determining a third line between the left crotch and the right crotch of the target object, a fourth line between the left shoulder and the right shoulder of the target object, and a fifth line between the pelvic point and the cervical-vertebra point of the target object; determining, based on the third line and the fourth line, a first torso direction of the target object relative to the horizontal plane; determining, based on the fifth line, a second torso direction of the target object relative to the vertical plane; and determining the forward direction of the target object based on the first torso direction and the second torso direction.
- only the three-dimensional key point information is used to determine the multiple lines; each determined line is then used to determine the first torso direction of the target object relative to the horizontal plane and the second torso direction of the target object relative to the vertical plane, and the torso directions are finally used to determine the forward direction of the target object.
- the forward direction is not determined based on the device parameters of the shooting device; that is, gait analysis and recognition do not depend on the device parameters of the shooting device, which overcomes the shortcomings of strong dependence on other data or equipment and poor generalization ability in gait analysis and recognition.
- the gait data includes step length information of the target object, and the recognizing of the gait data of the target object in the target image based on the multiple three-dimensional key point information and the forward direction includes: determining, based on the multiple three-dimensional key point information, a first projection of the line between the two feet of the target object onto the forward direction, and determining the step length information of the target object based on the length information of the first projection; and/or, the gait data includes step width information of the target object, and the recognizing of the gait data of the target object in the target image based on the multiple three-dimensional key point information and the forward direction includes: determining, based on the multiple three-dimensional key point information, a second projection of the line between the two feet of the target object onto the direction perpendicular to the forward direction, and determining the step width information of the target object based on the length information of the second projection.
- the multiple sample two-dimensional key point information is determined by back-projecting multiple standard three-dimensional key point information, which can improve the accuracy of the determined sample two-dimensional key point information.
- before the forward direction of the target object is determined based on the multiple three-dimensional key point information, the above gait recognition method further includes: acquiring the physical size information of the target object; and, based on the physical size information of the target object, updating the three-dimensional key point information in the network-scale space to three-dimensional key point information in the physical-scale space.
- the determining of the forward direction of the target object based on the multiple three-dimensional key point information includes: determining, based on the multiple three-dimensional key point information, a first line between the left crotch and the right crotch of the target object and a second line between the left shoulder and the right shoulder of the target object; determining a minimum-error plane between the first line and the second line; and determining the forward direction of the target object based on the line of intersection between the minimum-error plane and the horizontal plane; or, based on the multiple three-dimensional key point information, determining a third line between the left crotch and the right crotch of the target object, a fourth line between the left shoulder and the right shoulder of the target object, and a fifth line between the pelvic point and the cervical-vertebra point of the target object; determining, based on the third line and the fourth line, a first torso direction of the target object relative to the horizontal plane; determining, based on the fifth line, a second torso direction of the target object relative to the vertical plane; and determining the forward direction of the target object based on the first torso direction and the second torso direction.
- the gait data includes step length information of the target object, and the recognizing of the gait data of the target object in the target image based on the multiple three-dimensional key point information and the forward direction includes: determining, based on the multiple three-dimensional key point information, a first projection of the line between the two feet of the target object onto the forward direction, and determining the step length information of the target object based on the length information of the first projection; and/or, the gait data includes step width information of the target object, and the recognizing of the gait data of the target object in the target image based on the multiple three-dimensional key point information and the forward direction includes: determining, based on the multiple three-dimensional key point information, a second projection of the line between the two feet of the target object onto the direction perpendicular to the forward direction, and determining the step width information of the target object based on the length information of the second projection.
- the present disclosure provides a gait recognition device, including:
- a key point processing module configured to determine multiple three-dimensional key point information of the target object in the target image based on multiple two-dimensional key point information of the target object in each target video segment;
- the gait recognition module is configured to recognize the gait data of the target object in the target image based on the multiple three-dimensional key point information and the forward direction.
- in another aspect, the present disclosure provides a behavior prediction device, including: an image processing module configured to determine the gait data and the target center-of-gravity coordinates of the target object in the target image based on multiple two-dimensional key point information of the target object in each target video segment;
- the prediction module is configured to predict the behavior characteristic information of the target object in a preset time period based on the gait data and the coordinates of the target center of gravity.
- the present disclosure provides an electronic device, including a processor, a memory, and a bus.
- the memory stores machine-readable instructions executable by the processor; the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the processor executes the above gait recognition method or behavior prediction method.
- the present disclosure also provides a computer-readable storage medium with a computer program stored on the computer-readable storage medium, and when the computer program is run by a processor, the processor executes the above-mentioned gait recognition method or behavior prediction method.
- the above apparatus, electronic device, and computer-readable storage medium of the present disclosure include at least technical features substantially the same as or similar to the technical features of any aspect of the above methods or any embodiment of any aspect of the present disclosure. Therefore, for the description of the effects of the above apparatus, electronic device, and computer-readable storage medium, reference may be made to the description of the effects of the above methods, which is not repeated here.
- FIG. 1 shows a flowchart of a behavior prediction method provided by an embodiment of the present disclosure
- FIG. 3 shows a schematic diagram of a detection frame in still another behavior prediction method provided by an embodiment of the present disclosure
- FIG. 4 shows a schematic diagram of determining a forward direction in yet another behavior prediction method provided by an embodiment of the present disclosure
- FIG. 5 shows a flowchart of a method for predicting a center of gravity provided by an embodiment of the present disclosure
- FIG. 6B shows a schematic structural diagram of a simplified temporal dilated convolutional neural network
- FIG. 7 shows a flowchart of a method for gait recognition provided by an embodiment of the present disclosure
- the present disclosure provides a behavior prediction method and device.
- the present disclosure is based on the two-dimensional key point information of the target object in the video segment, which can improve the accuracy of the predicted gait data and target center-of-gravity coordinates; based on the more accurate gait data and target center-of-gravity coordinates, the accuracy of the predicted behavior characteristic information can be improved, which effectively helps ensure the safety of the target object.
- the embodiments of the present disclosure provide a behavior prediction method, which is applied to a terminal device or a server that performs behavior prediction on a target object. Specifically, as shown in FIG. 1, the behavior prediction method provided by the embodiment of the present disclosure includes the following steps:
- the target image may be the last frame image of the target video segment, which is an image to be subjected to behavior prediction.
- the behavior feature information, security feature information, etc. of the target object in the target image can be determined.
- S120 Determine the gait data and target center of gravity coordinates of the target object in the target image based on multiple two-dimensional key point information of the target object in each target video segment.
- the multiple two-dimensional key point information of the target object in each frame of image of the target video segment may be used to predict multiple three-dimensional key point information of the target object in the target image; the multiple three-dimensional key point information in the target image determines the gait data of the target object in the target image; and the multiple two-dimensional key point information of the target object in each frame of image of the target video segment, together with the multiple three-dimensional key point information of the target object in the target image, determines the target center-of-gravity coordinates of the target object in the target image.
- the aforementioned gait data may include step length information and/or step width information of the target object.
- S130 Predict the behavior characteristic information of the target object in a preset time period based on the gait data and the coordinates of the target center of gravity.
- the target video may include multiple target video segments, and each target video segment includes a target image; from these, the gait data and target center-of-gravity coordinates of the target object at multiple consecutive moments can be obtained.
- the behavior of the target object in a preset time period can be monitored and predicted based on the obtained gait data.
- the trajectory of the target object's movement within a preset time period can be predicted.
- the behavior characteristic information of the target object in a preset time period is determined.
- the above behavior characteristic information includes the trajectory characteristics and behavior characteristics of the target object in the preset time period; specifically, it includes the motion trajectory coordinates of the target object in the preset time period and the step length and step width of the movement within the preset time period.
- S140 Based on the behavior characteristic information, determine the security feature information of the target object within the preset time period and a security handling strategy that matches the security feature information.
- the above security feature information is used to indicate whether the movement of the target object within the preset time period will cause danger and, if so, what kind of danger. For example, the security feature information may indicate that the target object moves too far within the preset time period and may collide with other objects, fall, or encounter other dangerous situations.
- the above security handling strategy is preset and has a mapping relationship with the security feature information; based on the mapping relationship and the determined security feature information, the security handling strategy for the target object can be determined.
- the security handling strategy may be to send a reminder to the target object or the guardian of the target object. For example, when the target object may fall due to an excessive stride, an alert is issued to the target object or the guardian of the target object to prevent a fall; when the target object may collide with something, a reminder is issued to the target object or the guardian of the target object to prevent a collision.
- the accuracy of the predicted gait data and target center-of-gravity coordinates can be improved; based on the more accurate gait data and target center-of-gravity coordinates, the accuracy of the predicted behavior characteristic information can be improved, which effectively helps ensure the safety of the target object.
- the process of determining the gait data of the target object in the target image may include the following steps:
- a two-dimensional key point detection network can be used to detect each frame of image and determine the multiple two-dimensional key point information in each frame of image.
- a temporal dilated convolutional neural network can then be used to process the multiple two-dimensional key point information of the target object and determine the multiple three-dimensional key point information of the target object in the target image.
- the three-dimensional key point information corresponding to the hips, shoulders, pelvis, and cervical spine of the target object can be used to determine the forward direction of the target object, instead of relying on the device parameters of the shooting device that shot the target video segment.
- the above forward direction is the forward direction of the target object in the physical-scale space; the three-dimensional key point information may be information of the target object in the network-scale space or information of the target object in the physical-scale space. If the three-dimensional key point information is information of the target object in the network-scale space, the three-dimensional key point information in the network-scale space needs to be converted to the physical-scale space.
- the above-mentioned physical scale space is the physical scale in the real world, and the unit may be "meter", the standard unit of length in the International System of Units.
- the network-scale space is an artificially defined computational scale whose unit is 1; its purpose is to remove the influence of the object's own size on the related calculations and to simplify computation. The dimensions of the two spaces are different.
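As a concrete illustration of the forward-direction computation described earlier (a minimum-error plane through the hip and shoulder key points, intersected with the horizontal plane), the following sketch fits a least-squares plane via SVD. The axis convention (z vertical) and the handling of the sign ambiguity are assumptions not specified in the text:

```python
import numpy as np

def forward_direction(l_hip, r_hip, l_shoulder, r_shoulder):
    """Estimate the forward direction from four torso key points.

    Fits the minimum-error (least-squares) plane to the hip and shoulder
    points via SVD, intersects it with the horizontal plane, and takes the
    horizontal direction perpendicular to the intersection line.  The
    result is defined up to sign; across frames the sign can be fixed by
    the direction of motion.  Assumes z is the vertical axis.
    """
    pts = np.array([l_hip, r_hip, l_shoulder, r_shoulder], dtype=float)
    centered = pts - pts.mean(axis=0)
    # Plane normal = singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(centered)
    normal = vt[-1]
    up = np.array([0.0, 0.0, 1.0])
    # Direction of the intersection line of the torso plane and the ground.
    intersection = np.cross(normal, up)
    intersection /= np.linalg.norm(intersection)
    # Forward lies in the ground plane, perpendicular to the intersection
    # line (i.e., the horizontal component of the torso-plane normal).
    forward = np.cross(up, intersection)
    return forward / np.linalg.norm(forward)
```

Note the degenerate case: if the torso plane is horizontal (normal parallel to `up`), the intersection is undefined and the cross product vanishes; a real implementation would fall back to the third/fourth/fifth-line variant described above.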
- Step 3 Recognizing the gait data of the target object in the target image based on the multiple three-dimensional key point information and the forward direction.
- the three-dimensional key point information corresponding to the foot of the target object in the three-dimensional key point information can be used to determine the gait data of the target object.
- the aforementioned gait data may include step length information and/or step width information of the target object.
- the following sub-steps can be used to determine the step length information of the target object in the target image: based on the multiple three-dimensional key point information, determine the first projection of the line between the two feet of the target object onto the forward direction; based on the length information of the first projection, determine the step length information of the target object.
- the following sub-steps can be used to determine the step width information of the target object in the target image: based on the multiple three-dimensional key point information, determine that the line between the two feet of the target object is in line with the A second projection in a direction perpendicular to the forward direction; based on the length information of the second projection, the step width information of the target object is determined.
- the above is to project the line between the two feet onto the forward direction of the target object and the direction perpendicular to the forward direction, and then determine the step length information and step width information of the target object based on the length of the projection.
- the length information of the first projection can be directly used as the step length information of the target object, and the length information of the second projection can be used as the step width information of the target object, when the three-dimensional key point information is information in the physical-scale space.
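The projection-based step length and step width described above can be sketched as follows, assuming physical-scale coordinates in meters and a vertical z-axis (both assumptions consistent with, but not spelled out in, the text):

```python
import numpy as np

def step_length_width(l_foot, r_foot, forward):
    """Compute step length and step width from the two foot key points.

    Step length: length of the projection of the foot-to-foot vector onto
    the forward direction.  Step width: length of its projection onto the
    horizontal direction perpendicular to the forward direction.
    """
    feet = np.asarray(r_foot, dtype=float) - np.asarray(l_foot, dtype=float)
    fwd = np.asarray(forward, dtype=float)
    fwd = fwd / np.linalg.norm(fwd)
    up = np.array([0.0, 0.0, 1.0])
    lateral = np.cross(up, fwd)  # horizontal direction perpendicular to forward
    step_length = abs(feet @ fwd)
    step_width = abs(feet @ lateral)
    return step_length, step_width
```

For example, with the subject facing +x, feet at (0, 0.1, 0) and (0.6, -0.1, 0) give a step length of 0.6 m and a step width of 0.2 m.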
- the foregoing determining of the multiple three-dimensional key point information of the target object in the target image based on the multiple two-dimensional key point information of the target object in each frame of image of the target video segment can specifically include the following steps:
- S210 For each frame of image in the target video segment, determine a detection frame of the target object in the frame of image based on multiple two-dimensional key point information of the target object in the frame of image.
- a two-dimensional key point detection network can be used to determine multiple two-dimensional key point information of the target object in each frame of the image.
- a detection frame surrounding the target object can be determined based on the coordinates in the multiple two-dimensional key point information, as shown by the detection frame 331 in FIG. 3, where W_d represents the width of the detection frame and H_d represents the height of the detection frame.
- S220: Based on the size information of the detection frame and the coordinates of the center point of the detection frame, normalize the coordinate information corresponding to each piece of two-dimensional key point information in the frame of image, to obtain the multiple target two-dimensional key point information of the target object in the frame of image.
- K_{x,y} represents the normalized two-dimensional key point information, that is, the coordinates corresponding to the above target two-dimensional key point information; the remaining symbols in the formula (not reproduced here) represent, respectively, the coordinates corresponding to the original two-dimensional key point information and the coordinates of the center point of the detection frame.
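Since the exact normalization formula is not reproduced in the text, the following sketch uses one common convention: subtract the detection-frame center and divide by the larger side of the frame. The choice of divisor is an assumption; the point is that the result depends only on the detection frame, not on image size or camera parameters:

```python
import numpy as np

def normalize_keypoints(keypoints, box_center, box_w, box_h):
    """Normalize 2D key points relative to the detection frame.

    Subtracts the detection-frame center and divides by the larger side of
    the frame, mapping coordinates into roughly [-0.5, 0.5] regardless of
    the original image resolution or any cropping.
    """
    k = np.asarray(keypoints, dtype=float)   # shape (num_points, 2)
    c = np.asarray(box_center, dtype=float)  # detection-frame center (x, y)
    scale = max(box_w, box_h)
    return (k - c) / scale
```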
- S230 Determine multiple three-dimensional key point information of the target object in the target image based on multiple target two-dimensional key point information of the target object in each frame of the image.
- the detection frame of the target object is determined using the two-dimensional key point information of the target object in the image, and the size information and center-point coordinates of the detection frame are then used to normalize the coordinate information corresponding to the two-dimensional key point information. The normalization does not depend on the camera parameters of the camera that shot the video segment or on the size information of the original image; it removes the dependence on camera parameters and still generalizes well to cropped images.
- the normalized two-dimensional key point information can be input into the trained first neural network, for example a trained temporal dilated convolutional neural network, to determine the three-dimensional key point information.
- Using the trained first neural network to determine the three-dimensional key point information can improve the automation of information processing and determination, and improve the accuracy of information processing and determination.
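The temporal dilated convolution underlying such a first neural network can be sketched minimally as below. The kernel size and dilation values are illustrative assumptions; the point is that stacking dilated layers lets the prediction for the target frame aggregate information from many preceding frames without a large parameter count:

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """1D convolution along the time axis with the given dilation.

    x: (channels_in, frames) sequence of per-frame features
       (e.g., flattened normalized 2D key points).
    w: (channels_out, channels_in, kernel) filter weights.
    Returns (channels_out, frames - (kernel - 1) * dilation).
    """
    c_out, c_in, k = w.shape
    span = (k - 1) * dilation
    t_out = x.shape[1] - span
    y = np.zeros((c_out, t_out))
    for tap in range(k):
        # Each tap reads the input offset by tap * dilation frames.
        y += np.einsum('oi,it->ot', w[:, :, tap],
                       x[:, tap * dilation: tap * dilation + t_out])
    return y
```

Stacking such layers with kernel size 3 and dilations 1, 3, 9 gives a receptive field of 1 + 2*(1 + 3 + 9) = 27 frames per predicted target frame, which matches the spirit of using the first N frames before the target image.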
- Step 1: Obtain a first sample video segment including a first sample image, and multiple standard three-dimensional key point information of the first sample object in the first sample image, wherein the first sample video segment also includes the first N frames of images preceding the first sample image.
- the first sample image is an image to be subjected to gait recognition.
- the above-mentioned standard three-dimensional key point information is used as sample labeling information.
- the standard three-dimensional key point information can be back-projected to obtain the two-dimensional key point information of the sample.
- the following steps can be used for back-projection processing:
- obtain the device parameter information of the shooting device that shot the first sample video segment and the RGB picture of each frame of image in the first sample video segment; based on the device parameter information, the RGB picture of each frame of image, and the multiple standard three-dimensional key point information, determine the multiple sample two-dimensional key point information of the first sample object in each frame of image of the first sample video segment.
- Determining the two-dimensional key point information of multiple samples based on the back projection of multiple standard three-dimensional key point information can improve the accuracy of the determined two-dimensional key point information of the sample.
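As a sketch of the back-projection from standard 3D key points to sample 2D key points, assuming the device parameter information takes the form of a pinhole intrinsic matrix K (the text says only that device parameters of the shooting device are used; the pinhole model is an assumption):

```python
import numpy as np

def project_to_2d(points_3d, K):
    """Project standard 3D key points (camera coordinates, meters) onto
    the image plane with a pinhole intrinsic matrix K, yielding sample
    2D key points in pixels.
    """
    pts = np.asarray(points_3d, dtype=float)  # (n, 3)
    proj = (K @ pts.T).T                      # (n, 3) homogeneous pixels
    return proj[:, :2] / proj[:, 2:3]         # divide by depth
```

With focal length 1000 px and principal point (640, 360), the point (0.5, 0.25, 2.0) projects to pixel (890, 485).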
- alternatively, a two-dimensional key point detection network can be used directly to detect the first sample object in each frame of image, to obtain the multiple sample two-dimensional key point information of the first sample object in each frame of image of the first sample video segment.
- the coordinate information corresponding to the sample two-dimensional key point information also needs to be normalized; the normalization method is the same as the method of normalizing the coordinate information corresponding to the two-dimensional key point information in the above embodiment.
- Step 3: Input the multiple sample two-dimensional key point information into the first initial neural network to be trained, and process the input multiple sample two-dimensional key point information through the first initial neural network to determine multiple predicted three-dimensional key point information of the first sample object in the first sample image.
- Step 4: adjust the network parameters of the first initial neural network based on the error information between the multiple pieces of predicted three-dimensional key point information and the multiple pieces of standard three-dimensional key point information; the first neural network is obtained once training of the first initial neural network is complete.
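Steps 3 and 4 form a standard supervised loop: predict, measure the error against the standard (labelled) key points, and adjust the parameters. The toy model below, a single learnable scale mapping a 2D coordinate to a depth, is purely illustrative and far simpler than the patent's network, but it makes the error-driven parameter update explicit.

```python
# Minimal illustration (not the patent's network) of steps 3-4: feed sample 2D
# key points through a model, compare predictions with standard 3D labels, and
# adjust the parameter by gradient descent on the mean squared error.

def train_step(w, samples_2d, standard_z, lr=0.1):
    # prediction: depth = w * |y|; gradient of MSE loss w.r.t. w
    grad, n = 0.0, len(samples_2d)
    for (x, y), z_true in zip(samples_2d, standard_z):
        z_pred = w * abs(y)
        grad += 2 * (z_pred - z_true) * abs(y) / n
    return w - lr * grad

w = 0.0
for _ in range(200):
    w = train_step(w, [(0.1, 0.5), (0.2, 1.0)], standard_z=[1.0, 2.0])
print(round(w, 3))  # converges to 2.0, where predictions match the labels
```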
- in the case where the multiple pieces of standard three-dimensional key point information in step 1 are information in the physical scale space, they can be used directly: the network parameters of the first initial neural network are adjusted using the error information between the standard three-dimensional key point information in the physical scale space and the multiple pieces of predicted three-dimensional key point information.
- the three-dimensional key point information the trained network then predicts is likewise information in the physical scale space, so no physical-scale-space conversion is needed when the predicted three-dimensional key point information is used to determine the gait data of the target object.
- alternatively, the coordinate information in the standard three-dimensional key point information in the physical scale space can be divided by the physical size information to obtain converted standard three-dimensional key point information in the network scale space. The physical size information includes, but is not limited to, the following:
- the aforementioned physical size information may be the height information of the first sample object (for example, a person).
- the above uses the physical size information of the sample object to convert the standard three-dimensional key point information in the physical scale space into information in the network scale space.
- a neural network trained on information in the network scale space determines three-dimensional key point information in the network scale space; this eliminates scale diversity, overcomes the influence of target-object size on the determination of three-dimensional key point information, and helps improve the accuracy of gait recognition.
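The scale-space conversion described above amounts to dividing each coordinate by the object's physical size (e.g. height in metres) and multiplying to convert back. A minimal sketch, with illustrative values:

```python
# Sketch of the scale-space conversion: dividing physical-scale coordinates
# (metres) by the object's height yields network-scale coordinates;
# multiplying by the height converts back to the physical scale space.

def to_network_scale(points_physical, height_m):
    return [(x / height_m, y / height_m, z / height_m) for x, y, z in points_physical]

def to_physical_scale(points_network, height_m):
    return [(x * height_m, y * height_m, z * height_m) for x, y, z in points_network]

pts = [(0.34, 1.53, 2.89)]            # metres, physical scale space
net = to_network_scale(pts, height_m=1.70)
back = to_physical_scale(net, height_m=1.70)
print(net, back)
```

The round trip recovers the physical coordinates, which is why the same height value can serve both training-time conversion and inference-time restoration.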
- since the predicted three-dimensional key point information output by the first neural network is information in the network scale space, before it is used for gait analysis, that is, before the forward direction and gait data are determined, the three-dimensional key point information in the network scale space is converted into information in the physical scale space. The conversion can use the following steps:
- obtain the physical size information of the target object; based on the physical size information of the target object, update the three-dimensional key point information in the network scale space to three-dimensional key point information in the physical scale space.
- the above-mentioned physical size information may be the height information of the target object (for example, a person).
- determining the forward direction of the target object based on the multiple three-dimensional key point information may specifically include the following steps:
- determining a first line between the left hip and the right hip of the target object and a second line between the left shoulder and the right shoulder of the target object; determining the minimum-error plane of the first line and the second line; and determining the forward direction of the target object based on the intersection line between the minimum-error plane and the horizontal plane.
- the above determines the forward direction using only the three-dimensional key point information: the first line, the second line, the minimum-error plane between them, and the intersection line of that plane with the horizontal plane. The forward direction is therefore not derived from the device parameters of the shooting device, i.e., gait analysis and recognition do not depend on those parameters, which overcomes the defects of strong dependence on other data or equipment and poor generalization ability in gait analysis and recognition.
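A hedged geometric sketch of this first method follows. The coordinate convention (y up, ground at y = 0) and the final step of taking the forward direction perpendicular to the plane/ground intersection line within the horizontal plane are assumptions for illustration; the patent states only that the forward direction is determined based on that intersection line.

```python
# Hedged sketch: fit a plane to the hip line and shoulder line, intersect it
# with the horizontal plane to get the body's left-right axis, then take the
# forward direction perpendicular to that axis within the horizontal plane.
# Assumed convention (not from the patent): y is up, the ground is y = 0.

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def normalize(v):
    n = (v[0]**2 + v[1]**2 + v[2]**2) ** 0.5
    return (v[0]/n, v[1]/n, v[2]/n)

def forward_direction(l_hip, r_hip, l_shoulder, r_shoulder):
    hip_line = tuple(a - b for a, b in zip(r_hip, l_hip))
    mid_hip = tuple((a + b) / 2 for a, b in zip(l_hip, r_hip))
    mid_sho = tuple((a + b) / 2 for a, b in zip(l_shoulder, r_shoulder))
    spine = tuple(a - b for a, b in zip(mid_sho, mid_hip))
    plane_normal = normalize(cross(hip_line, spine))           # coronal plane
    lr_axis = normalize(cross(plane_normal, (0.0, 1.0, 0.0)))  # plane ∩ ground
    return normalize(cross((0.0, 1.0, 0.0), lr_axis))          # horizontal, ⊥

# Illustrative key points: a person standing upright in the x-y plane.
fwd = forward_direction(l_hip=(-0.2, 1.0, 0.0), r_hip=(0.2, 1.0, 0.0),
                        l_shoulder=(-0.2, 1.5, 0.0), r_shoulder=(0.2, 1.5, 0.0))
print(fwd)
```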
- in another embodiment, determining the forward direction of the target object based on the multiple three-dimensional key point information may specifically include the following steps: determining a third line between the left hip and the right hip of the target object, a fourth line between the left shoulder and the right shoulder of the target object, and a fifth line between the pelvis point and the cervical-spine point of the target object.
- the direction of the bisector of the angle formed by the third line and the fourth line can be taken as the left-right direction of the target object, that is, the first torso direction; the direction of the fifth line can be taken as the up-down direction of the target object, that is, the above-mentioned second torso direction. After that, the cross product of the first torso direction and the second torso direction is taken as the forward direction of the target object.
- the above uses only the three-dimensional key point information to determine multiple lines; the determined lines then give the first torso direction of the target object relative to the horizontal plane and the second torso direction relative to the vertical plane, and finally the first torso direction and the second torso direction determine the forward direction of the target object.
- again, the forward direction is not determined based on the device parameters of the shooting device, i.e., gait analysis and recognition do not depend on those parameters, which overcomes the defects of strong dependence on other data or equipment and poor generalization ability in gait analysis and recognition.
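The second method can be sketched directly with vector algebra; the key-point values and coordinate layout below are illustrative assumptions, not values from the patent.

```python
# Hedged sketch of the second method: the first torso (left-right) direction
# is the bisector of the hip and shoulder line directions, the second torso
# (up-down) direction follows the pelvis-to-cervical line, and their cross
# product is taken as the forward direction.

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def normalize(v):
    n = (v[0]**2 + v[1]**2 + v[2]**2) ** 0.5
    return (v[0]/n, v[1]/n, v[2]/n)

def bisector(u, v):
    u, v = normalize(u), normalize(v)
    return normalize(tuple(a + b for a, b in zip(u, v)))

hip_dir = (1.0, 0.0, 0.1)       # third line (hips), slightly twisted
shoulder_dir = (1.0, 0.0, -0.1) # fourth line (shoulders), opposite twist
spine_dir = (0.0, 1.0, 0.0)     # fifth line (pelvis -> cervical spine)

first_torso = bisector(hip_dir, shoulder_dir)   # left-right direction
second_torso = normalize(spine_dir)             # up-down direction
forward = normalize(cross(first_torso, second_torso))
print(forward)
```

The bisector averages out the small counter-rotation between hips and shoulders that occurs mid-stride, which is presumably why the patent uses it rather than either line alone.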
- the gait data of the target object at multiple consecutive moments can be identified.
- the behavior of the target object can then be monitored and predicted based on the identified gait data. In specific applications, the recognized gait data can be used to remotely monitor and predict the behavior of children or elderly people with cognitive impairments, so as to protect their personal safety.
- the present disclosure also provides a center of gravity prediction method, which can be applied to a separate terminal device or server that predicts the center of gravity of a target object, and of course, can also be applied to the above-mentioned terminal device or server that performs behavior prediction.
- the center of gravity prediction method provided by the present disclosure may include the following steps:
- S510: based on multiple pieces of two-dimensional key point information of the target object in each frame of image of the target video clip, determine the first center of gravity coordinates of the target object and multiple pieces of three-dimensional key point information of the target object in the target image.
- determining the multiple pieces of three-dimensional key point information of the target object in the target image from the multiple pieces of two-dimensional key point information in each frame of the target video clip is done in the same way as in the foregoing embodiment and is not repeated here.
- a trained temporal dilated (hole) convolutional neural network can be used to determine the first center of gravity coordinates.
- the temporal dilated convolutional neural network here is different from the temporal dilated convolutional neural network used above to determine three-dimensional key point information, and needs to be trained separately.
- using a trained neural network to determine the center of gravity coordinates increases the degree of automation of the processing and improves the accuracy of the determined information.
- since the first center of gravity coordinates determined by the temporal dilated convolutional neural network are more accurate in the depth (Z) direction, when the target center of gravity coordinates are determined from the first center of gravity coordinates, only the depth-direction coordinate of the first center of gravity coordinates may be taken.
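For intuition, the dilated ("hole") temporal convolution underlying such a network can be reduced to a one-dimensional kernel whose taps skip time steps, widening the receptive field over the key-point sequence without adding parameters. This sketch is illustrative, not the patent's architecture.

```python
# Minimal sketch of a dilated 1D (temporal) convolution: the kernel skips
# `dilation - 1` time steps between taps, so a 3-tap kernel with dilation 2
# covers a span of 5 time steps.

def dilated_conv1d(sequence, kernel, dilation):
    k = len(kernel)
    span = (k - 1) * dilation + 1
    return [sum(kernel[j] * sequence[t + j * dilation] for j in range(k))
            for t in range(len(sequence) - span + 1)]

seq = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
out = dilated_conv1d(seq, kernel=[1.0, -2.0, 1.0], dilation=2)
print(out)  # second difference at stride 2: zeros for a linear ramp
```

Stacking such layers with growing dilation lets a network aggregate a whole video clip of per-frame key points into one per-clip estimate, such as a center-of-gravity coordinate.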
- S520: determine second center of gravity coordinates of the target object based on the multiple pieces of two-dimensional key point information and the multiple pieces of three-dimensional key point information of the target object in the target image.
- the second center of gravity coordinates can be determined from the two-dimensional key point information and the three-dimensional key point information using, for example, the SolvePnP algorithm or a similar optimization method. Since the second center of gravity coordinates determined by such an algorithm are more accurate in the horizontal (X) and vertical (Y) directions, when the target center of gravity coordinates are determined from the second center of gravity coordinates, only the horizontal- and vertical-direction coordinates of the second center of gravity coordinates may be taken.
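As a hedged stand-in for the SolvePnP step (a full PnP solver such as OpenCV's `cv2.solvePnP` would be used in practice), a weak-perspective approximation can recover the translation, and hence a centre estimate, in closed form from matched 2D/3D key points. The identity-rotation and fy = fx assumptions below are for illustration only.

```python
# Hedged weak-perspective stand-in for SolvePnP: assuming identity rotation
# and all key points at a similar depth, recover the translation of the
# object's centre in camera coordinates from matched 2D/3D key points.

def weak_perspective_center(pts3d, pts2d, fx, cx, cy):
    n = len(pts3d)
    mu = sum(p[0] for p in pts2d) / n            # 2D centroid (pixels)
    mv = sum(p[1] for p in pts2d) / n
    my = sum(p[1] for p in pts3d) / n            # 3D centroid (metres)
    spread3d = max(abs(p[1] - my) for p in pts3d)  # vertical extent, metres
    spread2d = max(abs(p[1] - mv) for p in pts2d)  # vertical extent, pixels
    z = fx * spread3d / spread2d                 # depth of the centre
    x = (mu - cx) * z / fx                       # horizontal position
    y = (mv - cy) * z / fx                       # vertical (assumes fy == fx)
    return (x, y, z)

# Illustrative data: key points 0.5 m above/below a centre at (0.5, 0.0, 4.0).
pts3d = [(0.0, 0.5, 0.0), (0.0, -0.5, 0.0)]     # centre-relative 3D points
pts2d = [(765.0, 485.0), (765.0, 235.0)]        # their image projections
c = weak_perspective_center(pts3d, pts2d, fx=1000.0, cx=640.0, cy=360.0)
print(c)  # (0.5, 0.0, 4.0): the centre used to generate the projections
```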
- the above-mentioned three-dimensional key point information is information in a physical scale space.
- the coordinate of the first center of gravity coordinates in the depth direction and the coordinates of the second center of gravity coordinates in the horizontal and vertical directions may be taken together as the target center of gravity coordinates of the target object in the target image.
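The fusion rule just described is a per-axis selection and can be written in a few lines (values illustrative):

```python
# Sketch of the fusion rule: take the depth (Z) coordinate from the first
# centre-of-gravity estimate and the horizontal (X) and vertical (Y)
# coordinates from the second, combining each estimator where it is accurate.

def fuse_centers(first_cog, second_cog):
    return (second_cog[0], second_cog[1], first_cog[2])

first = (0.48, 0.02, 4.05)   # network estimate: trusted in Z
second = (0.50, 0.00, 3.80)  # PnP-style estimate: trusted in X and Y
fused = fuse_centers(first, second)
print(fused)  # (0.5, 0.0, 4.05)
```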
- by processing multiple target video clips, the target center of gravity coordinates of the target object at multiple consecutive moments can be obtained; based on these multiple target center of gravity coordinates, a displacement estimation result (motion trajectory) of the target object over those consecutive moments can be determined.
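The displacement estimation over consecutive moments can be sketched by differencing successive target centre-of-gravity coordinates (values illustrative):

```python
# Sketch of displacement estimation: per-frame centre-of-gravity coordinates
# at consecutive moments are differenced into per-step displacement vectors,
# whose norms sum to the total distance travelled (the motion trajectory).

def displacement_estimate(centers):
    steps = [tuple(b - a for a, b in zip(p, q))
             for p, q in zip(centers, centers[1:])]
    total = sum(sum(d * d for d in s) ** 0.5 for s in steps)
    return steps, total

track = [(0.0, 0.0, 4.0), (0.1, 0.0, 4.0), (0.2, 0.0, 3.9)]  # metres
steps, total = displacement_estimate(track)
print(steps, round(total, 4))
```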
- the behavior of the target object can then be comprehensively predicted, using the motion trajectory to predict whether the target object's next behavior will be dangerous. For example, combining the currently predicted behavior and movement trajectory of a child, the behavior characteristic information of the child's next behavior can be predicted, and whether that behavior will be dangerous can be determined from the behavior characteristic information, so that a matching safe handling strategy can be implemented.
- the three-dimensional key point information corresponding to the hips, shoulders, pelvis, and cervical spine of the target object can be used to determine the forward direction of the target object, instead of relying on the device parameters of the shooting device that shot the target video clip.
- the gait recognition method may further include the following steps: acquiring physical size information of the target object; based on the physical size information of the target object, updating the three-dimensional key point information of the network scale space to the three-dimensional key point information of the physical scale space.
- a behavior prediction device may include:
- the image acquisition module 810 is configured to acquire multiple target video clips, each of which includes a target image and the first N frames of images of the target image, where N is a positive integer.
- the image processing module 820 is configured to determine the gait data and the target center of gravity coordinates of the target object in the target image based on multiple two-dimensional key point information of the target object in each target video segment.
- the key point processing module 920 is configured to determine multiple three-dimensional key point information of the target object in the target image based on multiple two-dimensional key point information of the target object in each target video segment.
- the advancing direction determining module 930 is configured to determine the advancing direction of the target object based on the multiple three-dimensional key point information.
- the gait recognition module 940 is configured to recognize the gait data of the target object in the target image based on the multiple three-dimensional key point information and the forward direction.
- the electronic device includes a processor 1001, a memory 1002, and a bus 1003; the memory 1002 stores machine-readable instructions executable by the processor 1001, and when the electronic device is running, the processor 1001 and the memory 1002 communicate through the bus 1003.
- the processor 1001 executes the following behavior prediction method:
- acquiring multiple target video clips, each of which includes a target image and the first N frames of images preceding the target image, where N is a positive integer; determining, based on multiple pieces of two-dimensional key point information of a target object in each target video clip, gait data and target center of gravity coordinates of the target object in the target image; and predicting, based on the gait data and the target center of gravity coordinates, behavior characteristic information of the target object within a preset time period.
- the processor 1001 executes the following gait recognition method:
- acquiring multiple target video clips, each of which includes a target image and the first N frames of images preceding the target image, where N is a positive integer; determining, based on multiple pieces of two-dimensional key point information of a target object in each target video clip, multiple pieces of three-dimensional key point information of the target object in the target image; determining a forward direction of the target object based on the multiple pieces of three-dimensional key point information; and recognizing, based on the multiple pieces of three-dimensional key point information and the forward direction, the gait data of the target object in the target image.
- the processor 1001 can also execute the method content in any of the embodiments described in the above method section, which will not be repeated here.
- the embodiments of the present disclosure also provide a computer program product corresponding to the above method and apparatus, including a computer-readable storage medium storing program code.
- the instructions included in the program code can be used to execute the method in the foregoing method embodiments; for the specific implementation, refer to the method embodiments, which are not repeated here.
- the specific working process of the device described above can refer to the corresponding process in the method embodiment, which will not be repeated in this disclosure.
- the disclosed device and method may be implemented in other ways.
- the device embodiments described above are merely illustrative.
- the division of the modules is only a logical function division, and there may be other divisions in actual implementation.
- multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be through some communication interfaces, indirect coupling or communication connection between devices or modules, and may be in electrical, mechanical or other forms.
- the functional units in the various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- if the function is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a non-volatile computer-readable storage medium executable by a processor.
- the computer software product is stored in a storage medium and includes a number of instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the method embodiments.
- the aforementioned storage media include: USB flash drives, mobile hard disks, ROM (Read-Only Memory), RAM (Random Access Memory), magnetic disks, optical discs, and other media that can store program code.
Claims (21)
- A behavior prediction method, comprising: acquiring multiple target video clips, each target video clip including a target image and the first N frames of images preceding the target image, where N is a positive integer; determining, based on multiple pieces of two-dimensional key point information of a target object in each target video clip, gait data and target center-of-gravity coordinates of the target object in the target image; and predicting, based on the gait data and the target center-of-gravity coordinates, behavior characteristic information of the target object within a preset time period.
- The behavior prediction method according to claim 1, further comprising: determining, based on the behavior characteristic information, safety characteristic information of the target object within the preset time period and a safe handling strategy matching the safety characteristic information.
- The behavior prediction method according to claim 1, wherein determining the gait data of the target object in the target image based on the multiple pieces of two-dimensional key point information of the target object in each target video clip comprises: determining, based on the multiple pieces of two-dimensional key point information of the target object in each frame of image of each target video clip, multiple pieces of three-dimensional key point information of the target object in the target image; determining a forward direction of the target object based on the multiple pieces of three-dimensional key point information; and recognizing the gait data of the target object in the target image based on the multiple pieces of three-dimensional key point information and the forward direction.
- The behavior prediction method according to claim 1, wherein determining the target center-of-gravity coordinates of the target object in the target image based on the multiple pieces of two-dimensional key point information of the target object in each target video clip comprises: determining, based on the multiple pieces of two-dimensional key point information of the target object in each frame of image of each target video clip, first center-of-gravity coordinates of the target object and multiple pieces of three-dimensional key point information of the target object in the target image; determining second center-of-gravity coordinates of the target object based on the multiple pieces of two-dimensional key point information and the multiple pieces of three-dimensional key point information of the target object in the target image; and determining the target center-of-gravity coordinates of the target object in the target image based on the first center-of-gravity coordinates and the second center-of-gravity coordinates.
- The behavior prediction method according to claim 3 or 4, wherein determining the multiple pieces of three-dimensional key point information of the target object in the target image comprises: for each frame of image in each target video clip, determining a detection box of the target object in the frame of image based on the multiple pieces of two-dimensional key point information of the target object in the frame of image; normalizing, based on size information of the detection box and coordinates of a center point of the detection box, the coordinate information corresponding to each piece of two-dimensional key point information in the frame of image, to obtain multiple pieces of target two-dimensional key point information of the target object in the frame of image; and determining the multiple pieces of three-dimensional key point information of the target object in the target image based on the multiple pieces of target two-dimensional key point information of the target object in each frame of image.
- The behavior prediction method according to claim 5, wherein determining the multiple pieces of three-dimensional key point information of the target object in the target image based on the multiple pieces of target two-dimensional key point information of the target object in each frame of image comprises: inputting the multiple pieces of target two-dimensional key point information of the target object in each frame of image into a trained first neural network, and processing the input multiple pieces of target two-dimensional key point information through the first neural network to determine the multiple pieces of three-dimensional key point information of the target object in the target image.
- The behavior prediction method according to claim 6, further comprising the step of training the first neural network: acquiring a first sample video clip including a first sample image, and multiple pieces of standard three-dimensional key point information of a first sample object in the first sample image, wherein the first sample video clip further includes the first N frames of images preceding the first sample image; determining, based on the multiple pieces of standard three-dimensional key point information, multiple pieces of sample two-dimensional key point information of the first sample object in each frame of image of the first sample video clip; inputting the determined multiple pieces of sample two-dimensional key point information into a first initial neural network to be trained, and processing the input multiple pieces of sample two-dimensional key point information through the first initial neural network to determine multiple pieces of predicted three-dimensional key point information of the first sample object in the first sample image; adjusting network parameters of the first initial neural network based on error information between the multiple pieces of predicted three-dimensional key point information and the multiple pieces of standard three-dimensional key point information; and obtaining the first neural network after training of the first initial neural network is completed.
- The behavior prediction method according to claim 7, wherein determining the multiple pieces of sample two-dimensional key point information of the first sample object in each frame of image of the first sample video clip comprises: acquiring device parameter information of a shooting device that shot the first sample video clip, and an RGB picture of each frame of image of the first sample video clip; and determining, based on the device parameter information, the RGB picture of each frame of image, and the multiple pieces of standard three-dimensional key point information, the multiple pieces of sample two-dimensional key point information of the first sample object in each frame of image of the first sample video clip.
- The behavior prediction method according to claim 7, wherein adjusting the network parameters of the first initial neural network based on the error information between the multiple pieces of predicted three-dimensional key point information and the multiple pieces of standard three-dimensional key point information comprises: acquiring physical size information of the first sample object; determining, based on the physical size information of the first sample object, target standard three-dimensional key point information in a network scale space corresponding to each piece of standard three-dimensional key point information; and adjusting the network parameters of the first initial neural network based on error information between the multiple pieces of predicted three-dimensional key point information and the multiple pieces of target standard three-dimensional key point information.
- The behavior prediction method according to claim 3, wherein determining the forward direction of the target object based on the multiple pieces of three-dimensional key point information comprises: determining, based on the multiple pieces of three-dimensional key point information, a first line between the left hip and the right hip of the target object and a second line between the left shoulder and the right shoulder of the target object; determining a minimum-error plane of the first line and the second line; and determining the forward direction of the target object based on an intersection line between the minimum-error plane and the horizontal plane; or, determining, based on the multiple pieces of three-dimensional key point information, a third line between the left hip and the right hip of the target object, a fourth line between the left shoulder and the right shoulder of the target object, and a fifth line between a pelvis point and a cervical-spine point of the target object, wherein the third line is the first line and the fourth line is the second line; determining, based on the third line and the fourth line, a first torso direction of the target object relative to the horizontal plane; determining, based on the fifth line, a second torso direction of the target object relative to the vertical plane; and determining the forward direction of the target object based on the first torso direction and the second torso direction.
- The behavior prediction method according to claim 3 or 10, wherein the gait data includes step length information of the target object, and recognizing the gait data of the target object in the target image based on the multiple pieces of three-dimensional key point information and the forward direction comprises: determining, based on the multiple pieces of three-dimensional key point information, a first projection, in the forward direction, of a line between the two feet of the target object; and determining the step length information of the target object based on length information of the first projection; and/or, the gait data includes step width information of the target object, and recognizing the gait data of the target object in the target image based on the multiple pieces of three-dimensional key point information and the forward direction comprises: determining, based on the multiple pieces of three-dimensional key point information, a second projection, in a direction perpendicular to the forward direction, of the line between the two feet of the target object; and determining the step width information of the target object based on length information of the second projection.
- The behavior prediction method according to claim 4, wherein determining the first center-of-gravity coordinates of the target object based on the multiple pieces of two-dimensional key point information of the target object in each frame of image of each target video clip comprises: inputting the multiple pieces of two-dimensional key point information of the target object in each frame of image of each target video clip into a trained second neural network, and processing the input multiple pieces of two-dimensional key point information through the second neural network to determine the first center-of-gravity coordinates of the target object.
- The behavior prediction method according to claim 12, further comprising the step of training the second neural network: acquiring a second sample video clip including a second sample image, and multiple pieces of standard three-dimensional key point information of a second sample object in the second sample image, wherein the second sample video clip further includes the first N frames of images preceding the second sample image; determining, based on the multiple pieces of standard three-dimensional key point information, multiple pieces of sample two-dimensional key point information of the second sample object in each frame of image of the second sample video clip; determining standard center-of-gravity coordinates of the second sample object based on the multiple pieces of standard three-dimensional key point information; inputting the determined multiple pieces of sample two-dimensional key point information into a second initial neural network to be trained, processing the input multiple pieces of sample two-dimensional key point information through the second initial neural network, and outputting predicted center-of-gravity coordinates of the second sample object in the second sample image; adjusting network parameters of the second initial neural network based on error information between the predicted center-of-gravity coordinates and the standard center-of-gravity coordinates; and obtaining the second neural network after training of the second initial neural network is completed.
- A gait recognition method, comprising: acquiring multiple target video clips, each target video clip including a target image and the first N frames of images preceding the target image, where N is a positive integer; determining, based on multiple pieces of two-dimensional key point information of a target object in each target video clip, multiple pieces of three-dimensional key point information of the target object in the target image; determining a forward direction of the target object based on the multiple pieces of three-dimensional key point information; and recognizing gait data of the target object in the target image based on the multiple pieces of three-dimensional key point information and the forward direction.
- The gait recognition method according to claim 14, wherein, in a case where the multiple pieces of three-dimensional key point information are three-dimensional key point information in a network scale space, before determining the forward direction of the target object based on the multiple pieces of three-dimensional key point information, the method further comprises: acquiring physical size information of the target object; and updating, based on the physical size information of the target object, the three-dimensional key point information in the network scale space to three-dimensional key point information in a physical scale space.
- The gait recognition method according to claim 14 or 15, wherein determining the forward direction of the target object based on the multiple pieces of three-dimensional key point information comprises: determining, based on the multiple pieces of three-dimensional key point information, a first line between the left hip and the right hip of the target object and a second line between the left shoulder and the right shoulder of the target object; determining a minimum-error plane of the first line and the second line; and determining the forward direction of the target object based on an intersection line between the minimum-error plane and the horizontal plane; or, determining, based on the multiple pieces of three-dimensional key point information, a third line between the left hip and the right hip of the target object, a fourth line between the left shoulder and the right shoulder of the target object, and a fifth line between a pelvis point and a cervical-spine point of the target object; determining, based on the third line and the fourth line, a first torso direction of the target object relative to the horizontal plane; determining, based on the fifth line, a second torso direction of the target object relative to the vertical plane; and determining the forward direction of the target object based on the first torso direction and the second torso direction.
- The gait recognition method according to any one of claims 14 to 16, wherein the gait data includes step length information of the target object, and recognizing the gait data of the target object in the target image based on the multiple pieces of three-dimensional key point information and the forward direction comprises: determining, based on the multiple pieces of three-dimensional key point information, a first projection, in the forward direction, of a line between the two feet of the target object; and determining the step length information of the target object based on length information of the first projection; and/or, the gait data includes step width information of the target object, and recognizing the gait data of the target object in the target image based on the multiple pieces of three-dimensional key point information and the forward direction comprises: determining, based on the multiple pieces of three-dimensional key point information, a second projection, in a direction perpendicular to the forward direction, of the line between the two feet of the target object; and determining the step width information of the target object based on length information of the second projection.
- A gait recognition apparatus, comprising: a video acquisition module configured to acquire multiple target video clips, each target video clip including a target image and the first N frames of images preceding the target image, where N is a positive integer; a key point processing module configured to determine, based on multiple pieces of two-dimensional key point information of a target object in each target video clip, multiple pieces of three-dimensional key point information of the target object in the target image; a forward direction determining module configured to determine a forward direction of the target object based on the multiple pieces of three-dimensional key point information; and a gait recognition module configured to recognize gait data of the target object in the target image based on the multiple pieces of three-dimensional key point information and the forward direction.
- A behavior prediction apparatus, comprising: an image acquisition module configured to acquire multiple target video clips, each target video clip including a target image and the first N frames of images preceding the target image, where N is a positive integer; an image processing module configured to determine, based on multiple pieces of two-dimensional key point information of a target object in each target video clip, gait data and target center-of-gravity coordinates of the target object in the target image; and a prediction module configured to predict, based on the gait data and the target center-of-gravity coordinates, behavior characteristic information of the target object within a preset time period.
- An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor and the memory communicate through the bus; and the processor executes the machine-readable instructions to perform the behavior prediction method according to any one of claims 1 to 13 or the gait recognition method according to any one of claims 14 to 17.
- A computer-readable storage medium having a computer program stored thereon, wherein, when the computer program is run by a processor, it causes the processor to perform the behavior prediction method according to any one of claims 1 to 13 or the gait recognition method according to any one of claims 14 to 17.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21760041.0A EP3979122A4 (en) | 2020-02-28 | 2021-02-22 | BEHAVIOR PREDICTION METHOD AND APPARATUS, GAIT RECOGNITION METHOD AND APPARATUS, ELECTRONIC DEVICE AND COMPUTER READABLE STORAGE MEDIUM |
JP2021573491A JP7311640B2 (ja) | 2020-02-28 | 2021-02-22 | 行動予測方法及び装置、歩容認識方法及び装置、電子機器並びにコンピュータ可読記憶媒体 |
KR1020217039317A KR20220008843A (ko) | 2020-02-28 | 2021-02-22 | 행위 예측 방법 및 장치, 걸음걸이 인식 방법 및 장치, 전자 장비 및 컴퓨터 판독 가능 저장 매체 |
US17/559,119 US20220114839A1 (en) | 2020-02-28 | 2021-12-22 | Behavior prediction method and apparatus, gait recognition method and apparatus, electronic device, and computer-readable storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010129936.X | 2020-02-28 | ||
CN202010129936.XA CN111291718B (zh) | 2020-02-28 | 2020-02-28 | 行为预测方法及装置、步态识别方法及装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/559,119 Continuation US20220114839A1 (en) | 2020-02-28 | 2021-12-22 | Behavior prediction method and apparatus, gait recognition method and apparatus, electronic device, and computer-readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021169924A1 (zh) | 2021-09-02 |
Family
ID=71022337
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/077297 WO2021169924A1 (zh) | 2020-02-28 | 2021-02-22 | 行为预测方法及装置、步态识别方法及装置、电子设备和计算机可读存储介质 |
Country Status (7)
Country | Link |
---|---|
US (1) | US20220114839A1 (zh) |
EP (1) | EP3979122A4 (zh) |
JP (1) | JP7311640B2 (zh) |
KR (1) | KR20220008843A (zh) |
CN (1) | CN111291718B (zh) |
TW (1) | TW202133036A (zh) |
WO (1) | WO2021169924A1 (zh) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111291718B (zh) * | 2020-02-28 | 2022-06-03 | 上海商汤智能科技有限公司 | Behavior prediction method and apparatus, gait recognition method and apparatus |
CN112597903B (zh) * | 2020-12-24 | 2021-08-13 | 珠高电气检测有限公司 | Intelligent recognition method and medium for the safety state of electric power personnel based on stride measurement |
US11475628B2 (en) * | 2021-01-12 | 2022-10-18 | Toyota Research Institute, Inc. | Monocular 3D vehicle modeling and auto-labeling using semantic keypoints |
CN112907892A (zh) * | 2021-01-28 | 2021-06-04 | 上海电机学院 | Multi-view-based human fall alarm method |
CN113015022A (zh) * | 2021-02-05 | 2021-06-22 | 深圳市优必选科技股份有限公司 | Behavior recognition method and apparatus, terminal device, and computer-readable storage medium |
CN114821805B (zh) * | 2022-05-18 | 2023-07-18 | 湖北大学 | Dangerous behavior early-warning method, apparatus, and device |
KR102476777B1 (ko) * | 2022-05-25 | 2022-12-12 | 주식회사 유투에스알 | Artificial intelligence-based path prediction system |
CN114818989B (zh) * | 2022-06-21 | 2022-11-08 | 中山大学深圳研究院 | Gait-based behavior recognition method and apparatus, terminal device, and storage medium |
CN116999852A (zh) * | 2022-07-11 | 2023-11-07 | 腾讯科技(深圳)有限公司 | Training method, apparatus, and medium for an AI model for controlling a virtual character |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050267630A1 (en) * | 2002-09-26 | 2005-12-01 | Shuuji Kajita | Walking gait producing device for walking robot |
CN102697508A (zh) * | 2012-04-23 | 2012-10-03 | 中国人民解放军国防科学技术大学 | 采用单目视觉的三维重建来进行步态识别的方法 |
CN111291718A (zh) * | 2020-02-28 | 2020-06-16 | 上海商汤智能科技有限公司 | 行为预测方法及装置、步态识别方法及装置 |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101558996B (zh) * | 2009-05-15 | 2010-11-10 | 天津大学 | Gait recognition method based on orthographic-projection three-dimensional reconstruction of human motion structure |
WO2014112632A1 (ja) | 2013-01-18 | 2014-07-24 | 株式会社東芝 | Motion information processing apparatus and method |
US10096125B1 (en) * | 2017-04-07 | 2018-10-09 | Adobe Systems Incorporated | Forecasting multiple poses based on a graphical image |
CN107392097B (zh) | 2017-06-15 | 2020-07-07 | 中山大学 | Method for locating three-dimensional human joint points in monocular color video |
CN108229355B (zh) * | 2017-12-22 | 2021-03-23 | 北京市商汤科技开发有限公司 | Behavior recognition method and apparatus, electronic device, and computer storage medium |
WO2019216593A1 (en) * | 2018-05-11 | 2019-11-14 | Samsung Electronics Co., Ltd. | Method and apparatus for pose processing |
CN109214282B (zh) * | 2018-08-01 | 2019-04-26 | 中南民族大学 | Neural-network-based three-dimensional hand gesture key point detection method and system |
CN109840500B (zh) * | 2019-01-31 | 2021-07-02 | 深圳市商汤科技有限公司 | Three-dimensional human pose information detection method and apparatus |
- 2020-02-28 CN CN202010129936.XA patent/CN111291718B/zh active Active
- 2021-02-22 JP JP2021573491A patent/JP7311640B2/ja active Active
- 2021-02-22 KR KR1020217039317A patent/KR20220008843A/ko not_active Application Discontinuation
- 2021-02-22 WO PCT/CN2021/077297 patent/WO2021169924A1/zh unknown
- 2021-02-22 EP EP21760041.0A patent/EP3979122A4/en not_active Withdrawn
- 2021-02-24 TW TW110106421A patent/TW202133036A/zh unknown
- 2021-12-22 US US17/559,119 patent/US20220114839A1/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050267630A1 (en) * | 2002-09-26 | 2005-12-01 | Shuuji Kajita | Walking gait producing device for walking robot |
CN102697508A (zh) * | 2012-04-23 | 2012-10-03 | 中国人民解放军国防科学技术大学 | 采用单目视觉的三维重建来进行步态识别的方法 |
CN111291718A (zh) * | 2020-02-28 | 2020-06-16 | 上海商汤智能科技有限公司 | 行为预测方法及装置、步态识别方法及装置 |
Non-Patent Citations (2)
Title |
---|
HUANG, BIN: "Human Action Recognition and Understanding in Intelligent Space", Wanfang Science Periodical Database, 22 April 2010 (2010-04-22), XP055841819, [retrieved on 2021-09-16] *
See also references of EP3979122A4 * |
Also Published As
Publication number | Publication date |
---|---|
TW202133036A (zh) | 2021-09-01 |
US20220114839A1 (en) | 2022-04-14 |
JP7311640B2 (ja) | 2023-07-19 |
EP3979122A1 (en) | 2022-04-06 |
EP3979122A4 (en) | 2022-10-12 |
CN111291718A (zh) | 2020-06-16 |
JP2022536354A (ja) | 2022-08-15 |
CN111291718B (zh) | 2022-06-03 |
KR20220008843A (ko) | 2022-01-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021169924A1 (zh) | 行为预测方法及装置、步态识别方法及装置、电子设备和计算机可读存储介质 | |
CN109670396B (zh) | 一种室内老人跌倒检测方法 | |
US9330470B2 (en) | Method and system for modeling subjects from a depth map | |
CN104599287B (zh) | 对象跟踪方法和装置、对象识别方法和装置 | |
JP5598751B2 (ja) | 動作認識装置 | |
CN111753747B (zh) | 基于单目摄像头和三维姿态估计的剧烈运动检测方法 | |
CN112560741A (zh) | 一种基于人体关键点的安全穿戴检测方法 | |
CN110991274B (zh) | 一种基于混合高斯模型和神经网络的行人摔倒检测方法 | |
CN108875586B (zh) | 一种基于深度图像与骨骼数据多特征融合的功能性肢体康复训练检测方法 | |
US20220366570A1 (en) | Object tracking device and object tracking method | |
JP2017059945A (ja) | 画像解析装置及び画像解析方法 | |
CN110910426A (zh) | 动作过程和动作趋势识别方法、存储介质和电子装置 | |
JP7488674B2 (ja) | 物体認識装置、物体認識方法及び物体認識プログラム | |
JP4670010B2 (ja) | 移動体分布推定装置、移動体分布推定方法及び移動体分布推定プログラム | |
JP2019012497A (ja) | 部位認識方法、装置、プログラム、及び撮像制御システム | |
JP2023129657A (ja) | 情報処理装置、制御方法、及びプログラム | |
CN114639168B (zh) | 一种用于跑步姿态识别的方法和系统 | |
US11527090B2 (en) | Information processing apparatus, control method, and non-transitory storage medium | |
CN110826495A (zh) | 基于面部朝向的身体左右肢体一致性跟踪判别方法及系统 | |
WO2019207721A1 (ja) | 情報処理装置、制御方法、及びプログラム | |
CN115471863A (zh) | 三维姿态的获取方法、模型训练方法和相关设备 | |
CN113963202A (zh) | 一种骨骼点动作识别方法、装置、电子设备及存储介质 | |
JP2022019988A (ja) | 情報処理装置、ディスプレイ装置、及び制御方法 | |
Li et al. | Unsupervised learning of human perspective context using ME-DT for efficient human detection in surveillance | |
JP7152651B2 (ja) | プログラム、情報処理装置、及び情報処理方法 |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21760041; Country of ref document: EP; Kind code of ref document: A1 |
ENP | Entry into the national phase | Ref document number: 20217039317; Country of ref document: KR; Kind code of ref document: A |
ENP | Entry into the national phase | Ref document number: 2021573491; Country of ref document: JP; Kind code of ref document: A |
ENP | Entry into the national phase | Ref document number: 2021760041; Country of ref document: EP; Effective date: 20211228 |
NENP | Non-entry into the national phase | Ref country code: DE |