CN115410115A - Action identification method and system based on multi-feature fusion - Google Patents

Action identification method and system based on multi-feature fusion

Info

Publication number
CN115410115A
Authority
CN
China
Prior art keywords
target
continuous video
action
video frame
target continuous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210933601.2A
Other languages
Chinese (zh)
Inventor
蒲诗睿
夏勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Qichuang Funeng Intelligent Technology Co ltd
Original Assignee
Wuhan Qichuang Funeng Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Qichuang Funeng Intelligent Technology Co ltd filed Critical Wuhan Qichuang Funeng Intelligent Technology Co ltd
Priority to CN202210933601.2A priority Critical patent/CN115410115A/en
Publication of CN115410115A publication Critical patent/CN115410115A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761: Proximity, similarity or dissimilarity measures
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20: Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a multi-feature fusion based action recognition method and system. The method comprises the following steps: acquiring target continuous video frames and a corresponding action angle sequence, and preprocessing the target continuous video frames with a first recognition model in a just-noticeable manner to eliminate noise and reflections therein; performing boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames with a second recognition model, and aligning the extracted features with the action angle sequence of the target to obtain a feature vector; and inputting the feature vector and the action angle sequence of the target into a DTW model to obtain the action category of the target continuous video frames. By combining SIRR with multiple recognition models, the invention reduces unclear boundaries in the action angle sequence, and the combination of the feature vector with the DTW model improves adaptability to multiple scenarios and the robustness of discontinuous action recognition.

Description

Action identification method and system based on multi-feature fusion
Technical Field
The invention belongs to the technical field of visual recognition and deep learning, and in particular relates to a multi-feature fusion based action recognition method and system.
Background
Human action recognition has long been one of the popular subjects in the field of computer vision, but with the rapid growth of video information on the network, traditional machine learning methods, such as those based on human body joint points, spatio-temporal interest points, and dense trajectories, can no longer meet the growing application requirements. The focus of action recognition has therefore shifted to deep learning on video data. Convolutional neural networks (CNNs) have achieved remarkable results in image classification research, which provides a wealth of experience for video classification tasks. However, compared with images, video has an additional time dimension to be processed, and how to capture the temporal relationship between adjacent video frames is the focus of research; the main difficulties are scene complexity, the uncertainty of action boundaries, and the continuity or discontinuity of actions.
Object detection is a core task in the field of computer vision and is the basis for target tracking and behavior recognition; current mainstream object detection algorithms based on convolutional neural networks are divided into one-stage and two-stage types. As the Transformer framework has enjoyed great success in the natural language processing (NLP) domain, researchers have attempted to migrate it to the computer vision domain. In recent years, Transformer-based object detection algorithms have made progress in artificial intelligence fields including natural language processing, text recognition and image processing, and have sparked interest in using multimodal neural networks to solve complex intelligent tasks.
Disclosure of Invention
In order to adapt to the complexity of action scenes, determine action boundaries and improve the accuracy of discontinuous action recognition, a first aspect of the invention provides a multi-feature fusion based action recognition method, which comprises the following steps: acquiring target continuous video frames and a corresponding action angle sequence, and preprocessing the target continuous video frames with a first recognition model in a just-noticeable manner to eliminate noise and reflections therein; performing boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames with a second recognition model, and aligning the extracted features with the action angle sequence of the target to obtain a feature vector; and inputting the feature vector and the action angle sequence of the target into a DTW model to obtain the action category of the target continuous video frames.
In some embodiments of the invention, the second recognition model comprises a first convolutional neural network and a second convolutional neural network, wherein the first convolutional neural network is used to perform boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames to obtain the boundary of each target and multi-dimensional image features of the target; and the second convolutional neural network is used to align and fuse the boundary and image features of each target in the target continuous video frames over a preset time length to obtain the feature vector of the target continuous video frames.
Further, the first convolutional neural network is CenterNet, and the second convolutional neural network is a Transformer.
In some embodiments of the present invention, inputting the feature vector and the action angle sequence of the target into the DTW model to obtain the action category of the target continuous video frames comprises: determining one or more template action sequences for the target continuous video frames; calculating the similarity between each template action sequence and the target continuous video frames based on the feature vector and the Euclidean distance; and taking the category of the best-matching template action sequence with the highest similarity as the action category of the target continuous video frames.
Further, calculating the similarity between each template action sequence and the target continuous video frames based on the feature vector and the Euclidean distance comprises: determining and calculating the minimum warping path between the matched target continuous video frame sequence and the template action sequence based on the feature vector and the warping path; and calculating the similarity between each template action sequence and the target continuous video frames according to the minimum path.
In the above embodiments, acquiring the target continuous video frames and the corresponding action angle sequence comprises: extracting a 3D skeleton of each target in the target continuous video frames with a Kinect, and calculating the angle feature of each joint in the 3D skeleton; and matching the angle feature of each joint against a preset template to obtain the action angle sequence corresponding to the target continuous video frames.
In a second aspect of the present invention, a multi-feature fusion based action recognition system is provided, which comprises: an acquisition module for acquiring target continuous video frames and a corresponding action angle sequence, and preprocessing the target continuous video frames with a first recognition model in a just-noticeable manner to eliminate noise and reflections therein; an extraction module for performing boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames with a second recognition model, and aligning the extracted features with the action angle sequence of the target to obtain a feature vector; and a recognition module for inputting the feature vector and the action angle sequence of the target into a DTW model to obtain the action category of the target continuous video frames.
Further, the acquisition module comprises: a calculation unit for extracting a 3D skeleton of each target in the target continuous video frames with a Kinect and calculating the angle feature of each joint in the 3D skeleton; and a matching unit for matching the angle feature of each joint against a preset template to obtain the action angle sequence corresponding to the target continuous video frames.
In a third aspect of the present invention, an electronic apparatus is provided, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the multi-feature fusion based action recognition method provided in the first aspect of the invention.
In a fourth aspect of the present invention, a computer-readable medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the multi-feature fusion based action recognition method provided in the first aspect of the present invention.
The invention has the beneficial effects that:
By combining SIRR with multiple recognition models, the invention reduces unclear boundaries in the action angle sequence, and the combination of the feature vector with the DTW model improves adaptability to multiple scenarios and the robustness of discontinuous action recognition.
Drawings
FIG. 1 is a schematic flow chart of a method for multi-feature fusion based motion recognition in some embodiments of the present invention;
FIG. 2 is a schematic diagram of the structure of a Transformer in some embodiments of the invention;
FIG. 3 is a diagram illustrating the effect of DTW sequence matching in some embodiments of the invention;
FIG. 4 is a schematic diagram of a multi-feature fusion based action recognition system in some embodiments of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device in some embodiments of the invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Referring to FIG. 1, in a first aspect of the present invention, a multi-feature fusion based action recognition method is provided, comprising: S100, acquiring target continuous video frames and the corresponding action angle sequence, and preprocessing the target continuous video frames with a first recognition model in a just-noticeable manner to eliminate noise and reflections therein; S200, performing boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames with a second recognition model, and aligning the extracted features with the action angle sequence of the target to obtain a feature vector; and S300, inputting the feature vector and the action angle sequence of the target into a DTW model to obtain the action category of the target continuous video frames.
It is to be understood that, in step S100 of some embodiments of the present invention, preprocessing the target continuous video frames with the first recognition model in a just-noticeable manner to remove noise and reflections therein comprises: removing shadows, reflections, highlights, artifacts and the like from the target continuous video frames with a dual-stream framework (I-CNN or E-CNN).
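By way of illustration only, the following is a minimal PyTorch sketch of the dual-stream preprocessing idea; the class names, layer sizes and the subtraction-based fusion are assumptions made for exposition and are not the patent's exact I-CNN/E-CNN design. One branch estimates the reflection-free intrinsic image and the other estimates the reflection/edge residual that is removed from each frame.

```python
# Minimal sketch of a dual-stream preprocessing network (assumed architecture,
# not the patent's exact I-CNN/E-CNN): one branch predicts the clean intrinsic
# image, the other predicts the reflection/edge residual that is subtracted.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

class DualStreamPreprocessor(nn.Module):
    def __init__(self):
        super().__init__()
        # Intrinsic branch: estimates the reflection-free image directly.
        self.intrinsic = nn.Sequential(conv_block(3, 32), conv_block(32, 32),
                                       nn.Conv2d(32, 3, kernel_size=3, padding=1))
        # Residual branch: estimates the reflection/edge component to remove.
        self.residual = nn.Sequential(conv_block(3, 32), conv_block(32, 32),
                                      nn.Conv2d(32, 3, kernel_size=3, padding=1))

    def forward(self, frames):                      # frames: (N, 3, H, W) in [0, 1]
        intrinsic = torch.sigmoid(self.intrinsic(frames))
        residual = self.residual(frames)
        # Fuse the two streams: clean frame = intrinsic estimate minus residual.
        return (intrinsic - residual).clamp(0.0, 1.0)

# Usage (untrained weights, shapes only):
# clean = DualStreamPreprocessor()(torch.rand(1, 3, 224, 224))
```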
Referring to FIG. 2, in step S200 of some embodiments of the invention, the second recognition model comprises a first convolutional neural network and a second convolutional neural network, wherein the first convolutional neural network is used to perform boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames to obtain the boundary of each target and multi-dimensional image features of the target; and the second convolutional neural network is used to align and fuse the boundary and image features of each target in the target continuous video frames over a preset time length to obtain the feature vector of the target continuous video frames.
Further, the first convolutional neural network is CenterNet, and the second convolutional neural network is a Transformer.
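As a concrete illustration of this fusion step (a sketch under assumptions rather than the patented network), per-frame detection features, assumed here to be 256-dimensional vectors pooled from a CenterNet-style detector together with box geometry, can be fused over a clip by a standard Transformer encoder and pooled into a single clip-level feature vector:

```python
# Sketch of temporal fusion with a Transformer encoder (illustrative assumption):
# per-frame features are fused across the clip by self-attention and mean-pooled
# into one feature vector for the whole target continuous video clip.
import torch
import torch.nn as nn

class TemporalFusion(nn.Module):
    def __init__(self, feat_dim=256, n_heads=8, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, frame_feats):          # frame_feats: (batch, T, feat_dim)
        fused = self.encoder(frame_feats)    # self-attention across the T frames
        return fused.mean(dim=1)             # clip-level feature vector (batch, feat_dim)

# Usage: clip_vec = TemporalFusion()(torch.rand(2, 16, 256))  # two 16-frame clips
```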
Referring to FIG. 3, in step S300 of some embodiments of the present invention, inputting the feature vector and the action angle sequence of the target into the DTW model to obtain the action category of the target continuous video frames comprises: S301, determining one or more template action sequences for the target continuous video frames; S302, calculating the similarity between each template action sequence and the target continuous video frames based on the feature vector and the Euclidean distance; and S303, taking the category of the best-matching template action sequence with the highest similarity as the action category of the target continuous video frames.
Further, in step S302, calculating the similarity between each template action sequence and the target continuous video frames based on the feature vector and the Euclidean distance comprises: determining and calculating the minimum warping path between the matched target continuous video frame sequence and the template action sequence based on the feature vector and the warping path; and calculating the similarity between each template action sequence and the target continuous video frames according to the minimum path.
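A minimal NumPy sketch of this matching step is given below; the quadratic-time cumulative-cost recursion and the use of the smallest warping cost as the similarity criterion are assumed implementation details rather than the patent's exact formulation:

```python
# Sketch of DTW-based template matching (assumed details): each sequence is an
# array of shape (length, dim) holding per-frame feature or angle vectors.
import numpy as np

def dtw_cost(query, template):
    n, m = len(query), len(template)
    # Pairwise Euclidean distances between query frames and template frames.
    dist = np.linalg.norm(query[:, None, :] - template[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Minimum-cost warping path: diagonal, vertical or horizontal step.
            acc[i, j] = dist[i - 1, j - 1] + min(acc[i - 1, j - 1],
                                                 acc[i - 1, j],
                                                 acc[i, j - 1])
    return acc[n, m]

def classify(query, templates):
    """templates: dict mapping action category -> template sequence."""
    costs = {label: dtw_cost(query, seq) for label, seq in templates.items()}
    return min(costs, key=costs.get)   # smallest warping cost = highest similarity

# Usage:
# templates = {"wave": np.random.rand(40, 8), "kick": np.random.rand(35, 8)}
# print(classify(np.random.rand(50, 8), templates))
```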
In the above embodiments, acquiring the target continuous video frames and the corresponding action angle sequence comprises: extracting a 3D skeleton of each target in the target continuous video frames with a Kinect, and calculating the angle feature of each joint in the 3D skeleton; and matching the angle feature of each joint against a preset template to obtain the action angle sequence corresponding to the target continuous video frames.
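For illustration, one common way to realise the joint-angle feature (the choice of joint triplet and the arccosine formula are assumptions, not necessarily the patent's exact angle definition) is to take, for every frame, the angle between the two bone vectors that meet at each joint of the Kinect skeleton:

```python
# Sketch of joint-angle features from a Kinect 3D skeleton (assumed definition):
# the angle at a joint is the angle between the bones joint->parent and
# joint->child, computed per frame to form the action angle sequence.
import numpy as np

def joint_angle(parent, joint, child):
    """Angle in radians at `joint` between the bones joint->parent and joint->child."""
    v1 = np.asarray(parent, dtype=float) - np.asarray(joint, dtype=float)
    v2 = np.asarray(child, dtype=float) - np.asarray(joint, dtype=float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Example for one frame: right elbow angle from shoulder, elbow and wrist positions.
shoulder, elbow, wrist = [0.2, 0.5, 2.0], [0.3, 0.3, 2.0], [0.5, 0.3, 1.9]
print(np.degrees(joint_angle(shoulder, elbow, wrist)))
```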
It can be understood that a series of action depth images are acquired by a depth camera, covering 20 actions, including: high arm wave, horizontal arm wave, hammer, hand catch, forward shoot, high throw, draw X, draw tick, draw circle, hand clap, two hand wave, side pushing, bend, forward kick, side kick, jogging, tennis swing, golf swing, and pick up and throw. Each action was performed 3 times by each of 10 different people and captured at 15 frames per second, for a total of 402 action samples and 23797 frames of depth images.
Example 2
Referring to FIG. 4, in a second aspect of the present invention, a multi-feature fusion based action recognition system 1 is provided, comprising: an acquisition module 11 for acquiring target continuous video frames and the corresponding action angle sequence, and preprocessing the target continuous video frames with a first recognition model in a just-noticeable manner to eliminate noise and reflections therein; an extraction module 12 for performing boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames with a second recognition model, and aligning the extracted features with the action angle sequence of the target to obtain a feature vector; and a recognition module 13 for inputting the feature vector and the action angle sequence of the target into a DTW model to obtain the action category of the target continuous video frames.
Further, the acquisition module 11 comprises: a calculation unit for extracting a 3D skeleton of each target in the target continuous video frames with a Kinect and calculating the angle feature of each joint in the 3D skeleton; and a matching unit for matching the angle feature of each joint against a preset template to obtain the action angle sequence corresponding to the target continuous video frames.
Example 3
Referring to FIG. 5, in a third aspect of the present invention, there is provided an electronic apparatus comprising: one or more processors; and storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to carry out the method of the first aspect of the invention.
Electronic device 500 may include a processing means (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
In general, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a liquid crystal display (LCD), speakers, vibrators, and the like; a storage device 508 including, for example, a hard disk; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While FIG. 5 illustrates an electronic device 500 having various means, it is to be understood that not all illustrated means are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 5 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or installed from the storage means 508, or installed from the ROM 502. The computer program, when executed by the processing device 501, performs the above-described functions defined in the methods of embodiments of the present disclosure. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more computer programs which, when executed by the electronic device, cause the electronic device to:
computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, smalltalk, C + +, python, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A multi-feature fusion based action recognition method, characterized by comprising the following steps:
acquiring target continuous video frames and a corresponding action angle sequence, and preprocessing the target continuous video frames with a first recognition model in a just-noticeable manner to eliminate noise and reflections therein;
performing boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames with a second recognition model, and aligning the extracted features with the action angle sequence of the target to obtain a feature vector;
and inputting the feature vector and the action angle sequence of the target into a DTW model to obtain the action category of the target continuous video frames.
2. The multi-feature fusion based action recognition method according to claim 1, wherein the second recognition model comprises a first convolutional neural network and a second convolutional neural network, wherein
the first convolutional neural network is used to perform boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames to obtain the boundary of each target and multi-dimensional image features of the target;
and the second convolutional neural network is used to align and fuse the boundary and image features of each target in the target continuous video frames over a preset time length to obtain the feature vector of the target continuous video frames.
3. The multi-feature fusion based action recognition method according to claim 2, wherein the first convolutional neural network is CenterNet and the second convolutional neural network is a Transformer.
4. The multi-feature fusion based action recognition method according to claim 1, wherein inputting the feature vector and the action angle sequence of the target into the DTW model to obtain the action category of the target continuous video frames comprises:
determining one or more template action sequences for the target continuous video frames;
calculating the similarity between each template action sequence and the target continuous video frames based on the feature vector and the Euclidean distance;
and taking the category of the best-matching template action sequence with the highest similarity as the action category of the target continuous video frames.
5. The method according to claim 4, wherein calculating the similarity between each template action sequence and the target continuous video frames based on the feature vector and the Euclidean distance comprises:
determining and calculating the minimum warping path between the matched target continuous video frame sequence and the template action sequence based on the feature vector and the warping path;
and calculating the similarity between each template action sequence and the target continuous video frames according to the minimum path.
6. The method according to any one of claims 1 to 5, wherein acquiring the target continuous video frames and the corresponding action angle sequence comprises:
extracting a 3D skeleton of each target in the target continuous video frames with a Kinect, and calculating the angle feature of each joint in the 3D skeleton;
and matching the angle feature of each joint against a preset template to obtain the action angle sequence corresponding to the target continuous video frames.
7. A multi-feature fusion based action recognition system, comprising:
an acquisition module for acquiring target continuous video frames and a corresponding action angle sequence, and preprocessing the target continuous video frames with a first recognition model in a just-noticeable manner to eliminate noise and reflections therein;
an extraction module for performing boundary evaluation and multi-feature extraction on the preprocessed target continuous video frames with a second recognition model, and aligning the extracted features with the action angle sequence of the target to obtain a feature vector;
and a recognition module for inputting the feature vector and the action angle sequence of the target into a DTW model to obtain the action category of the target continuous video frames.
8. The multi-feature fusion based action recognition system according to claim 7, wherein the acquisition module comprises:
a calculation unit for extracting a 3D skeleton of each target in the target continuous video frames with a Kinect and calculating the angle feature of each joint in the 3D skeleton;
and a matching unit for matching the angle feature of each joint against a preset template to obtain the action angle sequence corresponding to the target continuous video frames.
9. An electronic device, comprising: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the multi-feature fusion based action recognition method of any one of claims 1 to 6.
10. A computer-readable medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, implements the multi-feature fusion based action recognition method according to any one of claims 1 to 6.
CN202210933601.2A 2022-08-04 2022-08-04 Action identification method and system based on multi-feature fusion Pending CN115410115A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210933601.2A CN115410115A (en) 2022-08-04 2022-08-04 Action identification method and system based on multi-feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210933601.2A CN115410115A (en) 2022-08-04 2022-08-04 Action identification method and system based on multi-feature fusion

Publications (1)

Publication Number Publication Date
CN115410115A true CN115410115A (en) 2022-11-29

Family

ID=84158959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210933601.2A Pending CN115410115A (en) 2022-08-04 2022-08-04 Action identification method and system based on multi-feature fusion

Country Status (1)

Country Link
CN (1) CN115410115A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501176A (en) * 2023-06-27 2023-07-28 世优(北京)科技有限公司 User action recognition method and system based on artificial intelligence
CN116501176B (en) * 2023-06-27 2023-09-12 世优(北京)科技有限公司 User action recognition method and system based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination