CN112396018B - Badminton player foul action recognition method combining multi-mode feature analysis and neural network - Google Patents

Badminton player foul action recognition method combining multi-mode feature analysis and neural network

Info

Publication number
CN112396018B
Authority
CN
China
Prior art keywords
features
neural network
network
motion
optical flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011364578.7A
Other languages
Chinese (zh)
Other versions
CN112396018A (en)
Inventor
张刚瀚
黄国恒
程良伦
张煜乾
陈泽炯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202011364578.7A priority Critical patent/CN112396018B/en
Publication of CN112396018A publication Critical patent/CN112396018A/en
Application granted granted Critical
Publication of CN112396018B publication Critical patent/CN112396018B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/23Recognition of whole body movements, e.g. for sport training
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Abstract

The invention discloses a badminton player foul action recognition method combining multi-mode feature analysis and a neural network, which comprises the following steps: extracting character images, motion gesture sequences and optical flow data of athletes in real time; sending the character image into a space flow network of a double-flow network to obtain the space characteristics of the athlete; transmitting the motion gesture sequence as a directed graph into a multi-layer graph convolution neural network to obtain gesture space-time characteristics of the athlete during motion; extracting features of each frame of optical flow data through a convolutional neural network, and then sending the extracted features into a time relation network to obtain optical flow motion information features of athletes; and respectively pairing the obtained three features two by two to obtain three aggregation features, respectively sending the three aggregation features into a convolutional neural network to obtain three fusion features, weighting and fusing the three fusion features to obtain a final overall human body multi-modal fusion motion feature, and sending the final overall human body multi-modal fusion motion feature into a fully connected network to obtain a final action classification recognition result. The invention improves the accuracy of identifying the foul actions of the athlete.

Description

Badminton player foul action recognition method combining multi-mode feature analysis and neural network
Technical Field
The invention relates to the technical field of image processing, in particular to a badminton player foul action recognition method combining multi-mode feature analysis and a neural network.
Background
The badminton competition rules contain clear provisions: the serve must conform to the standard actions, service may not be deliberately delayed, a player's standing position may not obstruct the opponent's view, the opponent's court may not be deliberately invaded during the match, and no action may be made to obstruct the opponent's attack or to distract the opponent's attention. However, these actions are sometimes subtle when they occur, so that the human eye cannot observe and judge them in detail; in addition, athletes' actions change frequently during the competition, so the referee may overlook certain foul actions, thereby affecting the fairness of the competition. With the progress of computer vision technology, machines can perform fine-grained analysis of videos and images, so a machine can be used in place of human eyes to perform behavior recognition on the athlete and judge whether the athlete's actions comply with the rules. Existing behavior recognition methods mainly comprise the dual-stream method and 3D convolution, and some methods also take the human body pose as input for behavior recognition. However, human behavior carries a great deal of uncertainty: the athlete's actions during a competition are complex, hand-swing actions may be inconspicuous, and complex actions may be mixed together, which can cause the system to misjudge the behavior. Moreover, when only data of a single modality is used, interaction between different kinds of information is lacking, making detailed analysis difficult.
In the prior art, the Chinese patent with publication number CN110705463A, published on January 17, 2020, discloses a video human behavior recognition method and system based on a multi-mode dual-stream 3D network, the method comprising: generating a depth dynamic image sequence DDIS from depth video; generating a pose evaluation map sequence PEMS from RGB video; inputting the depth dynamic image sequence and the pose evaluation map sequence into 3D convolutional neural networks respectively, and constructing a DDIS stream and a PEMS stream to obtain respective classification results; and fusing the obtained classification results to obtain a final behavior recognition result. This patent does not combine multi-feature information, has no feature fusion, and its recognition accuracy is low.
Disclosure of Invention
The invention provides a badminton player foul action recognition method combining multi-mode feature analysis and a neural network, in order to overcome the defects of the prior art that player foul action recognition lacks multi-feature information fusion and has low recognition accuracy.
The primary purpose of the invention is to solve the above technical problems, and the technical solution of the invention is as follows:
a badminton player foul action recognition method combining multi-mode feature analysis and neural network comprises the following steps:
s1: extracting character images, motion gesture sequences and optical flow data of athletes in real time;
s2: sending the character image into a space flow network of a double-flow network to obtain the space characteristics of the athlete;
s3: transmitting the motion gesture sequence as a directed graph into a multi-layer graph convolution neural network to obtain gesture space-time characteristics of the athlete during motion;
s4: extracting features of each frame of optical flow data through a convolutional neural network, and then sending the extracted features into a time relation network to obtain optical flow motion information features of athletes;
s5: pairing the three features obtained in the steps S1, S2 and S3 two by two to obtain three polymerization features;
s6: the three aggregation features are respectively sent into a convolutional neural network to obtain three fusion features;
s7: the three fusion features are weighted and fused to obtain the final overall human body multi-mode fusion motion feature;
s8: and sending the overall human body multi-mode fusion motion characteristics into a fully-connected network to obtain a final action classification recognition result.
Further, in step S1, the player character image is acquired through video frame capturing, the player motion gesture is acquired through OpenPose, and the player optical flow data are acquired through DenseFlow.
Further, step S2 is to send the character image into a spatial stream network of the dual stream network, and model the spatial information of the character image to obtain the character spatial characteristics of the athlete.
Further, step S3 is to transfer the motion gesture sequence as a directed graph into a multi-layer graph convolution neural network, and model the motion gesture sequence of the athlete to obtain the motion gesture space-time characteristics of the athlete.
Further, the motion gesture space-time features are obtained through a graph convolution operation, the graph convolution operation formula being:

$$f_{out}(v_{ti})=\sum_{v_{tj}\in B(v_{ti})}\frac{1}{Z_{ti}(v_{tj})}\,f_{in}(v_{tj})\cdot w\big(l_{ti}(v_{tj})\big)$$

wherein $v_{ti}$ and $v_{tj}$ represent joint points of the human body posture, $f_{in}$ and $f_{out}$ represent the input and output feature maps, $w$ and $w'$ represent the weights between the joint points and the weights after reconstruction, $l_{ti}(\cdot)$ means using the joint point $v_{ti}$ as center to assign digital labels to the other joint points, which digital labels depend on the shortest path between two joints, $Z_{ti}(\cdot)$ is a regularization term, and $B(v_{ti})=\{v_{tj}\mid d(v_{tj},v_{ti})\le D\}$, wherein $D$ is set to the constant 1 and $d(v_{tj},v_{ti})$ is the shortest path between the two joint points.
Further, the step S4 is a process of obtaining the athlete optical flow motion information feature:
modeling each frame of optical flow of the optical flow sequence by utilizing a ResNet base network in the convolutional neural network, and then fusing features of each modeled frame of optical flow;
sending the optical flow sequences with the characteristics fused into a time relation network to be grouped according to different frame numbers, and sequencing the serial numbers of the optical flows in each group from small to large;
modeling each group of optical flow sequences to obtain inter-frame time relation features, and then fusing the inter-frame time relation features of the same group to obtain inter-segment time relation features;
and adding all the time relation features to obtain the overall athlete optical flow motion information features containing temporal reasoning information.
Further, in step S6, the three kinds of aggregation features are respectively sent to the convolutional neural network, each aggregation feature contains a feature pair, and the convolutional neural network models and fuses each feature pair to obtain three kinds of fusion features.
Further, the fully-connected network performs action recognition and classification on the input human body multi-mode fusion motion features, and judges whether the athlete has committed a foul action.
Further, the process of acquiring the motion gesture sequence includes:
acquiring a moving image sequence, and passing each picture of the image sequence through a VGG19 network to obtain image characteristics;
respectively acquiring the confidence coefficient of the joint point and the affinity vector between the joint points of each joint point of the body of the athlete according to the image characteristics;
clustering the joint points by using the confidence coefficient of the joint points and the affinity vector among the joint points, and performing skeleton splicing to obtain the motion gesture sequence of the athlete.
Further, the character image is an RGB image.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
according to the invention, different characteristics are acquired through the object image, the motion gesture sequence and the optical flow data, and the characteristics are fused to perform action recognition in the fully connected network, so that the accuracy of foul recognition is improved.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a diagram of a network architecture of the present invention.
FIG. 3 is a diagram of the time relation network of the present invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those described herein, and therefore the scope of the present invention is not limited to the specific embodiments disclosed below.
Example 1
As shown in FIGS. 1-2, a badminton player foul action recognition method combining multi-mode feature analysis and a neural network comprises the following steps (a minimal end-to-end sketch of the pipeline is given after the step list):
s1: extracting character images, motion gesture sequences and optical flow data of athletes in real time;
s2: sending the character image into a space flow network of a double-flow network to obtain the space characteristics of the athlete;
s3: transmitting the motion gesture sequence as a directed graph into a multi-layer graph rolling neural network (GCN) to obtain gesture space-time characteristics of the athlete during the motion;
s4: extracting features of each frame of optical flow data through a convolutional neural network, and then sending the extracted features into a time relation network to obtain optical flow motion information features of athletes;
s5: pairing the three features obtained in the steps S1, S2 and S3 two by two to obtain three polymerization features;
s6: the three aggregation features are respectively sent into a convolutional neural network to obtain three fusion features;
s7: the three fusion features are weighted and fused to obtain the final overall human body multi-mode fusion motion feature;
s8: and sending the overall human body multi-mode fusion motion characteristics into a fully-connected network to obtain a final action classification recognition result.
Further, in step S1, the player character image is acquired through video frame capturing, the player motion gesture is acquired through OpenPose, and the player optical flow data are acquired through DenseFlow.
In a specific embodiment, a plurality of cameras can be arranged to shoot a badminton playing field at multiple angles, and a character image is obtained by intercepting video frames in real time.
The process for acquiring the motion gesture sequence comprises the following steps:
obtaining a moving image sequence, and passing each picture of the image sequence through a 10-layer VGG19 network to obtain the image features F;
acquiring, from the image features, the joint-point confidence maps $S_j$ of each joint point of the athlete's body and the part affinity vectors $L_c$ between the joint points, wherein j refers to the index of the joint point and c refers to the index number of the limb;
clustering the joint points by using the joint-point confidence and the affinity vectors between the joint points, and performing skeleton splicing to obtain the motion gesture sequence of the athlete, which is assembled into a pose tensor as sketched below.
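As a minimal sketch only, the per-frame keypoints produced by the above process can be stacked into a single pose tensor before being passed to the graph network. The detect_pose callable and the joint count of 18 are assumptions standing in for the OpenPose-style pipeline described above.

```python
import numpy as np

def build_pose_sequence(frames, detect_pose, num_joints=18):
    """Stack per-frame keypoints into a (C=3, T, V) pose tensor.

    detect_pose(frame) is a hypothetical OpenPose-style detector returning an
    array of shape (num_joints, 3): x, y and confidence for one athlete's joints.
    """
    poses = []
    for frame in frames:
        kpts = detect_pose(frame)
        if kpts is None:                       # keep the sequence length fixed on missed detections
            kpts = np.zeros((num_joints, 3), dtype=np.float32)
        poses.append(np.asarray(kpts, dtype=np.float32))
    seq = np.stack(poses, axis=0)              # (T, V, 3)
    return seq.transpose(2, 0, 1)              # (3, T, V) as expected by the graph network
```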
And (3) taking the acquired motion gesture sequence as a directed graph to be transmitted into a multi-layer graph convolution neural network, and modeling the motion gesture sequence of the athlete to obtain the motion gesture space-time characteristics of the athlete.
More specifically, the athlete's motion gesture sequence is regarded as a graph structure whose vertices are the joint points in each frame. The vertex set of the graph is expressed as $V=\{v_{ti}\mid t=1,\dots,T,\;i=1,\dots,N\}$. The edges of the graph are the connecting edges between the joint points within each frame and the connecting edges of corresponding joint points between adjacent frames; the set of intra-frame connecting edges is denoted $E_S=\{v_{ti}v_{tj}\mid (i,j)\in H\}$, and the set of inter-frame connecting edges of corresponding joint points is denoted $E_F=\{v_{ti}v_{(t+1)i}\}$. The skeleton sequence is then graph-convolved using the following formula:

$$f_{out}(v_{ti})=\sum_{v_{tj}\in B(v_{ti})}\frac{1}{Z_{ti}(v_{tj})}\,f_{in}(v_{tj})\cdot w\big(l_{ti}(v_{tj})\big)$$

where $v_{ti}$ and $v_{tj}$ represent joint points of the human body posture, $f_{in}$ and $f_{out}$ represent the input and output feature maps, $w$ and $w'$ represent the weights between the joint points and the weights after reconstruction, $l_{ti}(\cdot)$ means using the joint point $v_{ti}$ as center to assign digital labels to the other joint points, which digital labels depend on the shortest path between two joints, $Z_{ti}(\cdot)$ is a regularization term, and $B(v_{ti})=\{v_{tj}\mid d(v_{tj},v_{ti})\le D\}$, wherein $D$ is set to the constant 1 and $d(v_{tj},v_{ti})$ is the shortest path between the two joint points.
The invention is provided with 9 layers of spatio-temporal graph convolution units, and finally outputs the motion gesture space-time features PF of the athlete.
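A minimal sketch of one spatio-temporal graph convolution unit and the 9-layer stack is given below, assuming a fixed normalized adjacency matrix built from the skeleton edges; the channel widths, temporal kernel size and single-partition weighting are simplifying assumptions rather than the exact patented layer configuration.

```python
import torch
import torch.nn as nn

class STGCNBlock(nn.Module):
    """One spatio-temporal graph convolution unit (sketch).

    x: (B, C_in, T, V) pose features; A: (V, V) normalized adjacency matrix of the
    skeleton graph, playing the role of the neighbor set B(v_ti) in the formula above.
    """
    def __init__(self, c_in, c_out, t_kernel=9):
        super().__init__()
        self.gcn = nn.Conv2d(c_in, c_out, kernel_size=1)              # per-joint transform (the weights w)
        self.tcn = nn.Conv2d(c_out, c_out, kernel_size=(t_kernel, 1),
                             padding=(t_kernel // 2, 0))              # temporal convolution along T
        self.relu = nn.ReLU()

    def forward(self, x, A):
        x = self.gcn(x)                                   # (B, C_out, T, V)
        x = torch.einsum('bctv,vw->bctw', x, A)           # spatial aggregation over neighboring joints
        return self.relu(self.tcn(x))

class PoseGCN(nn.Module):
    """Nine stacked units followed by global pooling to produce the PF feature."""
    def __init__(self, adjacency, c_in=3, feat_dim=256):
        super().__init__()
        dims = [c_in, 64, 64, 64, 128, 128, 128, 256, 256, feat_dim]
        self.blocks = nn.ModuleList(STGCNBlock(dims[i], dims[i + 1]) for i in range(9))
        self.register_buffer('A', adjacency)              # (V, V) skeleton adjacency, precomputed

    def forward(self, x):                                  # x: (B, 3, T, V)
        for blk in self.blocks:
            x = blk(x, self.A)
        return x.mean(dim=[2, 3])                          # (B, feat_dim) motion gesture space-time features PF
```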
In the invention, the optical flow data are acquired through the DenseFlow toolkit. The specific flow is as follows: two pictures are input each time, the first, T(x, y), being the reference image and the second, I(x, y), the current image; the following objective function is then designed so that each pair of corresponding points on the two registered images is as similar as possible:

$$E(u,v)=\sum_{x,y}\psi\Big(\big(I(x+u(x,y),\,y+v(x,y))-T(x,y)\big)^{2}\Big)$$

where $u(x,y)$ and $v(x,y)$ are the offsets of each point on the image and $\psi(\cdot)$ is the error function. Minimizing this objective yields the optical flow sequence of the athlete during the competition.
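Since only the objective above is given for DenseFlow, the sketch below uses OpenCV's Farneback dense optical flow as a stand-in to show the per-frame-pair computation; the function choice and parameter values are assumptions, not the DenseFlow toolkit itself.

```python
import cv2
import numpy as np

def compute_flow_sequence(frames):
    """Dense optical flow between consecutive frames (Farneback stand-in for DenseFlow).

    For each frame pair (reference T, current I) it returns the per-pixel offsets
    (u, v), i.e. a flow field of shape (H, W, 2).
    """
    flows = []
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # args: prev, next, flow, pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)
        prev = curr
    return np.stack(flows, axis=0)             # (T-1, H, W, 2) optical flow sequence
```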
After the optical flow sequence data of the athlete are obtained, the invention adopts the spatial stream network of the dual-stream architecture to extract spatial features from the image frames and, as shown in fig. 3, uses a Time Relation Network (TRN) in place of the temporal stream network to obtain the optical flow motion information features. The process for obtaining the athlete optical flow motion information features is as follows:
modeling each frame of optical flow of the optical flow sequence by utilizing a ResNet base network in a Convolutional Neural Network (CNN), and then fusing features of each modeled frame of optical flow;
sending the optical flow sequences with the integrated features into a time relation network to be grouped according to different frame numbers (particularly, grouping can be carried out according to a group of 2 frames, a group of 3 frames and a group of 4 frames), and sequencing the serial numbers of the optical flows in each group from small to large;
modeling each group of optical flow sequences to obtain inter-frame time relation features, and then fusing the inter-frame time relation features of the same group to obtain inter-segment time relation features;
and adding all the time relation features to obtain the optical flow motion information features of the integral athlete containing time reasoning information.
Taking 3 frames as an example, the 3-frame temporal relation can be expressed as:

$$T_3(V)=h_{\phi}\Big(\sum_{i<j<k} g_{\theta}\big(f_i,f_j,f_k\big)\Big)$$

where $h_{\phi}$ and $g_{\theta}$ are implemented using simple multi-layer perceptrons and $f_i$ denotes the feature of the $i$-th optical flow frame. The athlete optical flow motion information feature is then $MF(V)=T_2(V)+T_3(V)+T_4(V)$.
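The following is a minimal sketch of the time relation module following the formula above: each n-frame relation sums g_θ over ascending n-tuples of per-frame features and passes the result through h_φ, and the 2-, 3- and 4-frame relations are added to form MF(V). The feature dimension, hidden size and tuple subsampling are assumptions made for the sketch.

```python
import itertools
import torch
import torch.nn as nn

class TemporalRelation(nn.Module):
    """n-frame relation: T_n(V) = h_phi( sum over ascending n-tuples of g_theta(f_i, ..., f_k) )."""
    def __init__(self, n_frames, feat_dim=256, hidden=256, max_tuples=8):
        super().__init__()
        self.n = n_frames
        self.max_tuples = max_tuples                    # subsample tuples to keep the sum cheap
        self.g = nn.Sequential(nn.Linear(n_frames * feat_dim, hidden), nn.ReLU())
        self.h = nn.Linear(hidden, feat_dim)

    def forward(self, feats):                           # feats: (B, T, feat_dim) per-frame CNN features
        T = feats.size(1)
        tuples = list(itertools.combinations(range(T), self.n))[: self.max_tuples]
        acc = 0
        for idx in tuples:                              # combinations are already in ascending frame order
            acc = acc + self.g(torch.cat([feats[:, i] for i in idx], dim=1))
        return self.h(acc)                              # (B, feat_dim) inter-segment relation feature

class FlowTRN(nn.Module):
    """MF(V) = T_2(V) + T_3(V) + T_4(V), as in the description above."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.relations = nn.ModuleList(TemporalRelation(n, feat_dim) for n in (2, 3, 4))

    def forward(self, feats):                           # feats: (B, T, feat_dim), T >= 4 assumed
        return sum(rel(feats) for rel in self.relations)
```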
Further, in step S6, the three kinds of aggregation features are respectively sent to the convolutional neural network, each aggregation feature contains a feature pair, and the convolutional neural network models and fuses each feature pair to obtain three kinds of fusion features.
In a specific embodiment, the acquired human body spatial features SF, optical flow motion information features MF and motion gesture space-time features PF of the badminton athlete are paired and aggregated two by two to obtain three different aggregation features <SF, MF>, <PF, MF> and <PF, SF>. The three aggregation features are then input into CNN modules for modeling to obtain the fusion features Fusion1, Fusion2 and Fusion3 of the paired modal features. The three fusion features are weighted and fused to obtain the final human multi-modal fusion feature. This feature contains the fusion features of different combinations of the three modalities; information complementation is achieved among the different modalities, so the obtained information is richer and the feature is more robust.
further, the fully-connected network performs action recognition and classification on the input human body multi-mode fusion movement characteristics, and judges whether the mobilizer has a foul action or not.
The same or similar reference numerals correspond to the same or similar components;
the terms describing the positional relationship in the drawings are merely illustrative, and are not to be construed as limiting the present patent;
it is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications of the above teachings will be apparent to those of ordinary skill in the art. It is not necessary here nor is it exhaustive of all embodiments. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the invention are desired to be protected by the following claims.

Claims (8)

1. A badminton player foul action recognition method combining multi-mode feature analysis and neural network is characterized by comprising the following steps:
s1: extracting character images, motion gesture sequences and optical flow data of athletes in real time;
s2: sending the character image into a space flow network of a double-flow network to obtain the space characteristics of the athlete;
s3: transmitting the motion gesture sequence as a directed graph into a multi-layer graph convolution neural network to obtain gesture space-time characteristics of the athlete during motion;
the motion gesture space-time features are obtained through a graph convolution operation, the graph convolution operation formula being:

$$f_{out}(v_{ti})=\sum_{v_{tj}\in B(v_{ti})}\frac{1}{Z_{ti}(v_{tj})}\,f_{in}(v_{tj})\cdot w\big(l_{ti}(v_{tj})\big)$$

wherein $v_{ti}$ and $v_{tj}$ represent joint points of the human body posture, $f_{in}$ and $f_{out}$ represent the input and output feature maps, $w$ and $w'$ represent the weights between the joint points and the weights after reconstruction, $l_{ti}(\cdot)$ means using the joint point $v_{ti}$ as center to assign digital labels to the other joint points, said digital labels depending on the shortest path between two joints, $Z_{ti}(\cdot)$ is a regularization term, and $B(v_{ti})=\{v_{tj}\mid d(v_{tj},v_{ti})\le D\}$, wherein $D$ is set to the constant 1 and $d(v_{tj},v_{ti})$ is the shortest path between the two joint points;
s4: extracting features of each frame of optical flow data through a convolutional neural network, and then sending the extracted features into a time relation network to obtain optical flow motion information features of athletes;
the process for obtaining the athlete optical flow movement information features is as follows:
modeling each frame of optical flow of the optical flow sequence by utilizing a ResNet base network in the convolutional neural network, and then fusing features of each modeled frame of optical flow;
sending the optical flow sequences with the characteristics fused into a time relation network to be grouped according to different frame numbers, and sequencing the serial numbers of the optical flows in each group from small to large;
modeling each group of optical flow sequences to obtain inter-frame time relation features, and then fusing the inter-frame time relation features of the same group to obtain inter-segment time relation features;
adding all the inter-segment time relation features to obtain the overall athlete optical flow movement information features containing time reasoning information;
s5: pairing the three features obtained in the steps S1, S2 and S3 two by two to obtain three polymerization features;
s6: the three aggregation features are respectively sent into a convolutional neural network to obtain three fusion features;
s7: the three fusion features are weighted and fused to obtain the final overall human body multi-mode fusion motion feature;
s8: and sending the overall human body multi-mode fusion motion characteristics into a fully-connected network to obtain a final action classification recognition result.
2. The method for identifying foul actions of badminton players by combining multi-mode feature analysis and neural networks according to claim 1, wherein in step S1, player character images are acquired through video frame capturing, player motion gestures are acquired through OpenPose, and player optical flow data are acquired through DenseFlow.
3. The method for identifying the foul actions of the badminton player by combining the multi-mode characteristic analysis and the neural network according to claim 1, wherein the step S2 is to send the character image into a space flow network of a double-flow network, and model the space information of the character image to obtain the character space characteristics of the player.
4. The method for identifying the foul actions of the badminton player by combining the multi-mode feature analysis and the neural network according to claim 1, wherein the step S3 is to transmit the motion gesture sequence as a directed graph into a multi-layer graph convolution neural network, and model the motion gesture sequence of the player to obtain the motion gesture space-time feature of the player.
5. The badminton player foul action recognition method combining multi-mode feature analysis and neural network according to claim 1, wherein in step S6, three kinds of aggregate features are respectively sent into a convolutional neural network, each aggregate feature contains a feature pair, and the convolutional neural network models and fuses each feature pair to obtain three kinds of fusion features.
6. The method for identifying the foul actions of the badminton player by combining the multi-mode feature analysis and the neural network according to claim 1, wherein the fully-connected network performs action recognition and classification on the input human body multi-mode fusion motion features to judge whether the athlete has committed a foul action.
7. The badminton player foul action recognition method combining multi-mode feature analysis and a neural network according to claim 1, wherein the process of acquiring the motion gesture sequence comprises:
acquiring a moving image sequence, and passing each picture of the image sequence through a VGG19 network to obtain image characteristics;
respectively acquiring the confidence coefficient of the joint point and the affinity vector between the joint points of each joint point of the body of the athlete according to the image characteristics;
clustering the joint points by using the confidence coefficient of the joint points and the affinity vector among the joint points, and performing skeleton splicing to obtain the motion gesture sequence of the athlete.
8. The badminton player foul action recognition method combining multi-mode feature analysis and a neural network according to claim 1, wherein the character image is an RGB image.
CN202011364578.7A 2020-11-27 2020-11-27 Badminton player foul action recognition method combining multi-mode feature analysis and neural network Active CN112396018B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011364578.7A CN112396018B (en) 2020-11-27 2020-11-27 Badminton player foul action recognition method combining multi-mode feature analysis and neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011364578.7A CN112396018B (en) 2020-11-27 2020-11-27 Badminton player foul action recognition method combining multi-mode feature analysis and neural network

Publications (2)

Publication Number Publication Date
CN112396018A CN112396018A (en) 2021-02-23
CN112396018B true CN112396018B (en) 2023-06-06

Family

ID=74605505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011364578.7A Active CN112396018B (en) 2020-11-27 2020-11-27 Badminton player foul action recognition method combining multi-mode feature analysis and neural network

Country Status (1)

Country Link
CN (1) CN112396018B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239766A (en) * 2021-04-30 2021-08-10 复旦大学 Behavior recognition method based on deep neural network and intelligent alarm device
CN113239897B (en) * 2021-06-16 2023-08-18 石家庄铁道大学 Human body action evaluation method based on space-time characteristic combination regression

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109460707A (en) * 2018-10-08 2019-03-12 华南理工大学 A kind of multi-modal action identification method based on deep neural network
CN110096950A (en) * 2019-03-20 2019-08-06 西北大学 A kind of multiple features fusion Activity recognition method based on key frame
CN110705463A (en) * 2019-09-29 2020-01-17 山东大学 Video human behavior recognition method and system based on multi-mode double-flow 3D network
CN110892408A (en) * 2017-02-07 2020-03-17 迈恩德玛泽控股股份有限公司 Systems, methods, and apparatus for stereo vision and tracking
CN111259804A (en) * 2020-01-16 2020-06-09 合肥工业大学 Multi-mode fusion sign language recognition system and method based on graph convolution


Also Published As

Publication number Publication date
CN112396018A (en) 2021-02-23


Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant