CN115170817A - Human-object interaction detection method based on three-dimensional human-object mesh topology enhancement - Google Patents

Human-object interaction detection method based on three-dimensional human-object mesh topology enhancement

Info

Publication number
CN115170817A
Authority
CN
China
Prior art keywords
human
dimensional
human body
features
topological
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210862950.XA
Other languages
Chinese (zh)
Other versions
CN115170817B (en)
Inventor
彭伟龙
李聪
陈庆丰
方美娥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou University
Original Assignee
Guangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou University filed Critical Guangzhou University
Priority to CN202210862950.XA priority Critical patent/CN115170817B/en
Publication of CN115170817A publication Critical patent/CN115170817A/en
Application granted granted Critical
Publication of CN115170817B publication Critical patent/CN115170817B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • G06T17/205Re-meshing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a human-object interaction detection method based on three-dimensional human-object mesh topology enhancement, comprising the following steps: acquiring visual features of a human body and an object in a single picture; acquiring the three-dimensional structure of the human body and the three-dimensional structure of the object in the single picture, and fusing them to construct an initial three-dimensional human-object integrated mesh topology model; acquiring topological features of the three-dimensional human body and of the three-dimensional human-object based on the enhanced three-dimensional human-object integrated mesh topology; fusing the visual features and the topological features to obtain an enhanced three-dimensional human-object integrated mesh topology model; and training the enhanced model and acquiring a recognition result based on the trained model. The invention adopts bottom-up topological feature extraction and is highly effective for extracting three-dimensional human body poses. Three-dimensional human body mesh features are fused into the HOI detection problem, effectively improving HOI recognition performance and achieving higher accuracy on the HICO-DET benchmark dataset.

Description

Human-object interaction detection method based on three-dimensional human-object mesh topology enhancement
Technical Field
The invention belongs to the field of human-object interaction detection, relates to the fields of computer vision and deep neural networks, and particularly relates to a human-object interaction detection method based on three-dimensional human-object mesh topology enhancement.
Background
Human-Object Interaction (HOI) detection is a high-level machine vision task, important for machines to understand the world more deeply. HOI detection must not only locate and identify the people and objects in a scene, but also infer the interaction relationships between them. Research on HOI detection is significant in many fields such as security systems and video retrieval.
For interactive behavior recognition, classic methods have demonstrated the necessity of the visual features and spatial position features of the people and objects in a scene. However, as scenes grow more complex, the task becomes harder: for example, screening candidate pairs among multiple people and multiple objects, false detections caused by coarse-grained position features, and the long-tail distribution caused by dataset limitations. To address these problems, more and more effective methods have been proposed: adding human pose information to infer interaction behavior from finer-grained features; using attention mechanisms to obtain richer and more effective human-object features; and using graph neural networks to solve the candidate-pair screening problem.
However, behavior understanding based on two-dimensional vision is always disturbed by viewpoint: for the same interaction, poses captured from different angles look very different in the image. Moreover, because image features lack geometric information, the connectivity information needed to build a topology is missing.
Disclosure of Invention
The invention aims to provide a human-object interaction detection method based on three-dimensional human-object mesh topology enhancement, so as to solve the problems in the prior art.
To achieve the above object, the present invention provides a human-object interaction detection method based on three-dimensional human-object mesh topology enhancement, comprising:
acquiring visual features of a human body and an object in a single picture;
acquiring the three-dimensional structure of the human body and the three-dimensional structure of the object in the single picture, and fusing them to construct an initial three-dimensional human-object integrated mesh topology model;
acquiring topological features of the three-dimensional human body and topological features of the three-dimensional human-object based on the enhanced three-dimensional human-object integrated mesh topology, and fusing the visual features and the topological features to obtain an enhanced three-dimensional human-object integrated mesh topology model;
and training the enhanced three-dimensional human-object integrated mesh topology model, and acquiring a recognition result based on the trained model.
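The four steps above can be sketched as a pipeline skeleton. This is a hypothetical Python sketch, not the patented implementation: the helper names (`extract_visual_features`, `build_hom`, `extract_topological_features`) are placeholders for the CNN backbone, the SMPLify-X reconstruction and the MeshCNN network, and the concatenation fusion operator is an assumption.

```python
# Hypothetical skeleton of the four-step method; the real networks are
# stubbed out with placeholder functions returning toy feature vectors.

def extract_visual_features(image):
    # stub: a CNN backbone would return appearance and spatial features
    return {"f_h": [0.1] * 4, "f_o": [0.2] * 4, "f_sp": [0.3] * 4}

def build_hom(image):
    # stub: fuse the reconstructed human mesh and the object sphere mesh
    # into the human-object integrated mesh (HOM)
    return {"vertices": 162, "faces": 320}

def extract_topological_features(hom):
    # stub: MeshCNN would produce topological (mesh) features bottom-up
    return {"f_h_m": [0.4] * 4, "f_sp_m": [0.5] * 4}

def detect_interaction(image):
    vis = extract_visual_features(image)
    hom = build_hom(image)
    topo = extract_topological_features(hom)
    # fuse visual and topological features; concatenation is assumed here
    fused_h = vis["f_h"] + topo["f_h_m"]
    fused_sp = vis["f_sp"] + topo["f_sp_m"]
    return fused_h, fused_sp
```

The fused features would then be fed to the three classification branches described below.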
Optionally, the visual features are obtained by a convolutional neural network, the visual features comprising human body appearance features, object appearance features and spatial features.
Optionally, acquiring the three-dimensional structure of the human body comprises: acquiring the human bounding box in the single picture and obtaining two-dimensional pose information by a pose estimation method; and acquiring the human body information in the picture and obtaining the three-dimensional structure of the human body from the human body information and the corresponding two-dimensional pose information.
Optionally, a MeshCNN network is adopted to extract, bottom-up, the topological features of the three-dimensional human body and the topological features of the three-dimensional human-object.
Optionally, the model is trained by interactive behavior recognition, wherein a binary cross-entropy loss function is adopted during training and the training is divided into three branches: human body, object and space; the total loss is the sum of the losses of the three branches.
Optionally, the feature fusion comprises: fusing the human body appearance features with the topological features of the three-dimensional human body; and fusing the spatial features with the topological features of the three-dimensional human-object.
Optionally, confidence scores for interaction-type recognition are obtained based on the trained enhanced three-dimensional human-object integrated mesh topology model, and the final recognition result is obtained by weighted calculation at the confidence level, wherein the detection confidences of the human body and the object are weighted with the action scores of the human, object and spatial branches of the interaction detection module.
Optionally, the three-dimensional structure of the human body is obtained by the SMPLify-X method.
The technical effects of the invention are:
1. The method provides a human-object interaction detection method enhanced by three-dimensional human body mesh topology; three-dimensional human body mesh features are fused into HOI detection, the HOI recognition performance is effectively improved, and higher accuracy is obtained on the HICO-DET benchmark dataset.
2. In the method, the relations between different body parts are constructed on the basis of dense local connectivity in three-dimensional space, so bottom-up topological feature extraction is adopted, which is highly effective for extracting the three-dimensional human body pose.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a diagram of the construction process of the human-object integrated mesh model (HOM) in an embodiment of the present invention;
FIG. 2 is a diagram of a human interaction detection framework in an embodiment of the invention;
FIG. 3 is a flow chart of a method in an embodiment of the invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than here.
Example one
As shown in FIGS. 1-2, the present embodiment provides a human-object interaction detection method based on three-dimensional human-object mesh topology enhancement, comprising:
human and object localization. Extracting detection boxes of each person-object interaction pair in the picture by comparing mature target detection algorithms SSD, fast RCNN and the like (b) h ,b o ). And (4) reconstructing a three-dimensional human body. Firstly, a human body boundary frame in a two-dimensional picture is obtained, and then human body two-dimensional posture information (mainly comprising body joint points, facial joint points and palm joint points) is detected by a posture evaluation method openposition. And finally, reconstructing a three-dimensional human body model M by using human body information in the picture and corresponding two-dimensional posture information through a Smplify-X method h
A human-object model is constructed. The main process is as follows: (1) Acquire the three-dimensional structures of the human and the object respectively. The human in the picture is reconstructed in three dimensions to obtain a three-dimensional human body mesh; the three-dimensional structure of the object is represented by a hollow sphere, first parameterized as O(o_1, o_2, o_3) and r, where O is the sphere-center coordinate and r is a radius estimated from the object class. The three-dimensional structure of the object M_o is then represented by a discrete mesh with 162 vertices and 320 faces. (2) Fuse the three-dimensional structures of the human and the object, and construct the topology information of the three-dimensional human-object integrated mesh (HOM) to obtain M_hom = M_h ∪ M_o. The construction of the HOM is shown in FIG. 1.
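The 162-vertex, 320-face sphere mesh used for the object matches an icosahedron subdivided twice, with vertices projected onto the sphere. The patent does not spell out this construction, so the following is a sketch under that assumption:

```python
import math

def icosphere(radius=1.0, subdivisions=2):
    """Build a sphere mesh by repeatedly subdividing an icosahedron.

    With subdivisions=2 this yields 162 vertices and 320 triangular
    faces, matching the object mesh M_o described above (assumption:
    the patent's sphere mesh is built this way)."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    verts = [(-1, phi, 0), (1, phi, 0), (-1, -phi, 0), (1, -phi, 0),
             (0, -1, phi), (0, 1, phi), (0, -1, -phi), (0, 1, -phi),
             (phi, 0, -1), (phi, 0, 1), (-phi, 0, -1), (-phi, 0, 1)]
    faces = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
             (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
             (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
             (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]

    def to_sphere(v):
        # project a vertex onto the sphere of the given radius
        n = math.sqrt(sum(c * c for c in v))
        return tuple(radius * c / n for c in v)

    verts = [to_sphere(v) for v in verts]
    for _ in range(subdivisions):
        cache = {}  # edge -> new midpoint vertex index (shared by faces)

        def midpoint(i, j):
            key = (min(i, j), max(i, j))
            if key not in cache:
                m = tuple((verts[i][k] + verts[j][k]) / 2.0 for k in range(3))
                verts.append(to_sphere(m))
                cache[key] = len(verts) - 1
            return cache[key]

        new_faces = []
        for a, b, c in faces:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            # split each triangle into four smaller triangles
            new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
        faces = new_faces
    return verts, faces
```

Each subdivision multiplies the face count by four (20 → 80 → 320) and adds one vertex per edge (12 → 42 → 162), consistent with the counts stated in the text.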
A fusion model of visual and topological features is constructed. Visual and topological models are built on convolutional neural networks: the appearance features of the human and the object are obtained as visual cues, and the mesh features of the human are extracted as topological cues. A pyramid feature extraction network based on ResNet-50 extracts visual features of the whole picture; ROI pooling is used to obtain the appearance features of the human body and the object from the detection boxes, and the spatial features are extracted from a spatial binary map of the human and the object given as input. The MeshCNN network then extracts, bottom-up, the topological features of the human and the topological features of the human-object from the HOM.
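The spatial binary map mentioned above can be sketched as a two-channel rasterization of the two detection boxes inside their union box. This encoding is a common choice for the spatial branch and is assumed here; the patent does not give its exact construction, and `spatial_binary_map` is a hypothetical helper name.

```python
def spatial_binary_map(b_h, b_o, size=64):
    """Rasterize human and object boxes into a 2-channel binary map.

    b_h, b_o: boxes as (x1, y1, x2, y2). Channel 0 marks the human box,
    channel 1 the object box, both scaled to the union box (assumed
    encoding of the human-object spatial configuration)."""
    # union box enclosing both detections
    ux1, uy1 = min(b_h[0], b_o[0]), min(b_h[1], b_o[1])
    ux2, uy2 = max(b_h[2], b_o[2]), max(b_h[3], b_o[3])
    sx = size / (ux2 - ux1)
    sy = size / (uy2 - uy1)

    def rasterize(box):
        x1 = int((box[0] - ux1) * sx); y1 = int((box[1] - uy1) * sy)
        x2 = int((box[2] - ux1) * sx); y2 = int((box[3] - uy1) * sy)
        chan = [[0] * size for _ in range(size)]
        for y in range(max(0, y1), min(size, y2)):
            for x in range(max(0, x1), min(size, x2)):
                chan[y][x] = 1
        return chan

    return [rasterize(b_h), rasterize(b_o)]
```

The resulting 2×size×size tensor would be the input of the spatial branch's convolutional layers.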
As shown in FIG. 2, the human-object interaction detection framework extracts the appearance features of the human body and the object and the spatial features, denoted f_h, f_o and f_sp respectively; the three-dimensional human topological feature and the human-object topological feature are denoted f_h^m and f_ho^m. Interactive behavior detection is carried out after fusing the obtained visual and topological features, with the feature fusion expressed as:

f_h' = f_h ⊕ f_h^m
f_sp' = f_sp ⊕ f_ho^m

where ⊕ denotes feature concatenation.
and finally, respectively obtaining confidence coefficients of the interactive type recognition based on the two features, and weighting on the level of the confidence coefficients to obtain a final recognition result.
Example two
As shown in FIG. 3, the present embodiment provides a human-object interaction detection method based on three-dimensional human-object mesh topology enhancement, comprising:
A single picture is input; a convolutional neural network extracts its visual features, three-dimensional human body information is reconstructed by the SMPLify-X method, and the three-dimensional information of the object is fused in to construct the HOM structure. The topological information of the HOM is extracted bottom-up by MeshCNN and fused with the visual features, realizing the three-dimensional human-object mesh topology enhancement.
In the training stage, since HOI detection is a multi-label classification task, a binary cross-entropy loss function is selected for training. Let the classification losses of the human, object and spatial branches be L_h, L_o and L_sp respectively. The total loss of the training framework L_total is:

L_total = L_h + L_o + L_sp
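The training loss above can be sketched in plain Python without a framework. The per-branch predictions and the multi-hot labels below are illustrative, not from the patent:

```python
import math

def bce_loss(preds, targets, eps=1e-7):
    """Mean binary cross-entropy over one branch's multi-label scores."""
    total = 0.0
    for p, t in zip(preds, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp for numerical stability
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(preds)

# illustrative per-branch action scores and multi-hot interaction labels
targets = [1.0, 0.0, 1.0]
l_h = bce_loss([0.9, 0.2, 0.8], targets)   # human branch loss L_h
l_o = bce_loss([0.7, 0.1, 0.6], targets)   # object branch loss L_o
l_sp = bce_loss([0.8, 0.3, 0.7], targets)  # spatial branch loss L_sp
l_total = l_h + l_o + l_sp                 # total loss is the branch sum
```

In practice each branch's scores come from its own classifier head, and the summed loss is backpropagated through all three branches jointly.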
In the inference stage, given a single picture, the final score S_HOI of an interaction category is obtained after passing through the interaction detection module. The final score is mainly determined by the detection confidences (S_h, S_o) of the human and the object in each interaction pair and the action scores s_h, s_o, s_sp of the human, object and spatial branches in the interaction detection module:

S_HOI = S_h · S_o · (s_h + s_o + s_sp)
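The score fusion can be sketched as follows. The exact combination appears only as an image in the source, so gating the summed branch action scores by the product of the detection confidences is a plausible reconstruction, not the confirmed formula:

```python
def hoi_score(s_h, s_o, a_h, a_o, a_sp):
    """Combine detection confidences with branch action scores.

    s_h, s_o: detection confidences of the human and the object.
    a_h, a_o, a_sp: action scores of the human, object and spatial
    branches. Multiplicative gating by detection confidence is a common
    HOI scoring scheme and is assumed here."""
    return s_h * s_o * (a_h + a_o + a_sp)
```

A low detection confidence for either the human or the object suppresses the interaction score regardless of how confident the action branches are.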
the above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. A human-object interaction detection method based on three-dimensional human-object mesh topology enhancement, characterized by comprising the following steps:
acquiring visual features of a human body and an object in a single picture;
acquiring the three-dimensional structure of the human body and the three-dimensional structure of the object in the single picture, and fusing them to construct an initial three-dimensional human-object integrated mesh topology model;
acquiring topological features of the three-dimensional human body and topological features of the three-dimensional human-object based on the enhanced three-dimensional human-object integrated mesh topology, and fusing the visual features and the topological features to obtain an enhanced three-dimensional human-object integrated mesh topology model;
and training the enhanced three-dimensional human-object integrated mesh topology model, and acquiring a recognition result based on the trained model.
2. The human-object interaction detection method based on three-dimensional human-object mesh topology enhancement according to claim 1, characterized in that the visual features are obtained by a convolutional neural network, the visual features comprising human body appearance features, object appearance features and spatial features.
3. The human-object interaction detection method based on three-dimensional human-object mesh topology enhancement according to claim 1, characterized in that acquiring the three-dimensional structure of the human body comprises: acquiring the human bounding box in the single picture and obtaining two-dimensional pose information by a pose estimation method; and acquiring the human body information in the picture and obtaining the three-dimensional structure of the human body from the human body information and the corresponding two-dimensional pose information.
4. The human-object interaction detection method based on three-dimensional human-object mesh topology enhancement according to claim 1, characterized in that a MeshCNN network is adopted to extract, bottom-up, the topological features of the three-dimensional human body and the topological features of the three-dimensional human-object.
5. The human-object interaction detection method based on three-dimensional human-object mesh topology enhancement according to claim 2, characterized in that the model is trained by interactive behavior recognition, wherein a binary cross-entropy loss function is adopted during training and the training is divided into three branches: human body, object and space; the total loss is the sum of the losses of the three branches.
6. The human-object interaction detection method based on three-dimensional human-object mesh topology enhancement according to claim 5, characterized in that the feature fusion comprises: fusing the human body appearance features with the topological features of the three-dimensional human body; and fusing the spatial features with the topological features of the three-dimensional human-object.
7. The human-object interaction detection method based on three-dimensional human-object mesh topology enhancement according to claim 6, characterized in that confidence scores for interaction-type recognition are obtained based on the trained enhanced three-dimensional human-object integrated mesh topology model, and the final recognition result is obtained by weighted calculation at the confidence level, wherein the detection confidences of the human body and the object are weighted with the action scores of the human, object and spatial branches of the interaction detection module.
8. The human-object interaction detection method based on three-dimensional human-object mesh topology enhancement according to claim 3, characterized in that the three-dimensional structure of the human body is obtained by the SMPLify-X method.
CN202210862950.XA 2022-07-21 2022-07-21 Character interaction detection method based on three-dimensional human-object grid topology enhancement Active CN115170817B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210862950.XA CN115170817B (en) 2022-07-21 2022-07-21 Character interaction detection method based on three-dimensional human-object grid topology enhancement


Publications (2)

Publication Number Publication Date
CN115170817A true CN115170817A (en) 2022-10-11
CN115170817B CN115170817B (en) 2023-04-28

Family

ID=83494409

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210862950.XA Active CN115170817B (en) 2022-07-21 2022-07-21 Character interaction detection method based on three-dimensional human-object grid topology enhancement

Country Status (1)

Country Link
CN (1) CN115170817B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080024608A1 (en) * 2005-02-11 2008-01-31 Bayerische Motoren Werke Aktiengesellschaft Method and device for visualizing the surroundings of a vehicle by fusing an infrared image and a visual image
CN110827295A (en) * 2019-10-31 2020-02-21 北京航空航天大学青岛研究院 Three-dimensional semantic segmentation method based on coupling of voxel model and color information
CN111401234A (en) * 2020-03-13 2020-07-10 深圳普罗米修斯视觉技术有限公司 Three-dimensional character model construction method and device and storage medium
US20200273190A1 (en) * 2018-03-14 2020-08-27 Dalian University Of Technology Method for 3d scene dense reconstruction based on monocular visual slam
CN113378676A (en) * 2021-06-01 2021-09-10 上海大学 Method for detecting figure interaction in image based on multi-feature fusion
US20210327116A1 (en) * 2018-12-29 2021-10-21 Huawei Technologies Co., Ltd. Method for generating animated expression and electronic device
CN113989854A (en) * 2021-11-22 2022-01-28 上海交通大学 Three-dimensional human body posture estimation method, system, device and medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
RANA HANOCKA et al.: "MeshCNN: A Network with an Edge" *
YU NAIGONG et al.: "Cognitive map construction method imitating the rat hippocampus based on convolutional neural networks" *
ZHANG QUANGUI et al.: "Three-dimensional shape retrieval fusing fuzzy topology and GALIF" *

Also Published As

Publication number Publication date
CN115170817B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN110135375B (en) Multi-person attitude estimation method based on global information integration
WO2021129064A9 (en) Posture acquisition method and device, and key point coordinate positioning model training method and device
Singh et al. Video benchmarks of human action datasets: a review
CN105574510A (en) Gait identification method and device
CN112131908A (en) Action identification method and device based on double-flow network, storage medium and equipment
CN108416266A (en) A kind of video behavior method for quickly identifying extracting moving target using light stream
Zhou et al. Learning to estimate 3d human pose from point cloud
CN105095880B (en) A kind of multi-modal Feature fusion of finger based on LGBP coding
Nishi et al. Generation of human depth images with body part labels for complex human pose recognition
CN106815855A (en) Based on the human body motion tracking method that production and discriminate combine
CN110334607B (en) Video human interaction behavior identification method and system
CN106650617A (en) Pedestrian abnormity identification method based on probabilistic latent semantic analysis
CN112668550B (en) Double interaction behavior recognition method based on joint point-depth joint attention RGB modal data
CN114613013A (en) End-to-end human behavior recognition method and model based on skeleton nodes
CN109657634A (en) A kind of 3D gesture identification method and system based on depth convolutional neural networks
CN112906520A (en) Gesture coding-based action recognition method and device
CN113128424A (en) Attention mechanism-based graph convolution neural network action identification method
CN114821764A (en) Gesture image recognition method and system based on KCF tracking detection
Batool et al. Telemonitoring of daily activities based on multi-sensors data fusion
CN111339888B (en) Double interaction behavior recognition method based on joint point motion diagram
Uddin et al. Human activity recognition using robust spatiotemporal features and convolutional neural network
CN117115911A (en) Hypergraph learning action recognition system based on attention mechanism
CN112069943A (en) Online multi-person posture estimation and tracking method based on top-down framework
CN115170817B (en) Character interaction detection method based on three-dimensional human-object grid topology enhancement
CN116403286A (en) Social grouping method for large-scene video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant