CN108288016B - Action identification method and system based on gradient boundary graph and multi-mode convolution fusion - Google Patents

Action identification method and system based on gradient boundary graph and multi-mode convolution fusion

Info

Publication number
CN108288016B
CN108288016B CN201710018537.4A CN201710018537A
Authority
CN
China
Prior art keywords
original video
fusion
gradient boundary
denotes
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710018537.4A
Other languages
Chinese (zh)
Other versions
CN108288016A (en)
Inventor
胡瑞敏
陈军
陈华锋
李红阳
徐增敏
吴华
柴笑宇
柯亨进
马宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201710018537.4A priority Critical patent/CN108288016B/en
Publication of CN108288016A publication Critical patent/CN108288016A/en
Application granted granted Critical
Publication of CN108288016B publication Critical patent/CN108288016B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract



The invention discloses an action recognition method and system based on a gradient boundary map and multi-mode convolution fusion, comprising the steps of: S1, constructing a continuous frame set based on the original video; S2, calculating the gradient boundary values between adjacent frame images in the continuous frame set to obtain a gradient boundary atlas; S3, calculating the inter-frame optical flow between adjacent frame images in the continuous frame set to obtain an optical flow atlas; S4, taking the representative frame of the original video, the gradient boundary atlas, and the optical flow atlas as input and using a convolutional neural network to obtain multi-mode CNN features of the original video; S5, fusing the multi-mode CNN features of the original video to obtain a fusion feature; S6, performing action recognition with an action classification algorithm based on the fusion feature. The invention adds the gradient boundary map, an important carrier of spatio-temporal action information, and proposes a multi-mode data convolution fusion method, which ensures the consistency of multi-mode spatio-temporal feature fusion, improves the accuracy of human action feature description in video, and raises the human action recognition rate.


Description

Action identification method and system based on gradient boundary graph and multi-mode convolution fusion
Technical Field
The invention belongs to the technical field of automatic video analysis, and relates to a motion recognition method and system based on gradient boundary graph and multi-mode convolution fusion.
Background
With the development of computer technology, the need to automatically analyze and understand videos with computers has become increasingly pressing. The human body is the main object of interest in video data; recognizing human behavior in video and generating higher-level, easier-to-understand semantic information allows a computer to analyze and understand the main content of the video. From the application perspective, human behavior recognition, as an important research topic in computer vision, can meet the demand of tasks such as intelligent video surveillance, intelligent monitoring, and content-based video analysis for automatic and intelligent analysis, and thus promotes social development.
Disclosure of Invention
The invention aims to provide an action identification method and system based on gradient boundary graph and multi-mode convolution fusion.
In order to achieve the purpose, the invention adopts the following technical scheme:
An action recognition method based on a gradient boundary graph and multi-mode convolution fusion comprises the following steps:

S1: sample the original video to obtain a representative frame f_p, and take f_p, the s frame images before f_p, and the s frame images after f_p from the original video to form a continuous frame set S_p = [f_{p-s}, …, f_p, …, f_{p+s}]; s is an empirical value with a value range of 5 to 10; the original video is an original video training sample or an original video to be identified.

S2: calculate the gradient boundary values between every two adjacent frame images in S_p to obtain the gradient boundary matrices, and obtain the gradient boundary atlas from them; the gradient boundary matrices are

[P_{p-s}^x, P_{p-s}^y, P_{p-s+1}^x, P_{p-s+1}^y, …, P_{p+s-1}^x, P_{p+s-1}^y]

where P_t^x and P_t^y respectively denote the gradient boundary matrices between f_t and its subsequent adjacent frame image f_{t+1} in the image transverse direction and the image longitudinal direction, t = p-s, p-s+1, …, p+s-1.

P_t^x consists of elements P_t^x(u, v), with P_t^x(u, v) = [f_{t+1}(u+1, v) - f_{t+1}(u, v)] - [f_t(u+1, v) - f_t(u, v)], where (u, v) denotes pixel coordinates; P_t^x(u, v) denotes the gradient boundary value of pixel (u, v) of f_t in the image transverse direction; f_{t+1}(u+1, v) denotes the gray value of pixel (u+1, v) in f_{t+1}; f_{t+1}(u, v) denotes the gray value of pixel (u, v) in f_{t+1}; f_t(u+1, v) denotes the gray value of pixel (u+1, v) in f_t; f_t(u, v) denotes the gray value of pixel (u, v) in f_t.

P_t^y consists of elements P_t^y(u, v), with P_t^y(u, v) = [f_{t+1}(u, v+1) - f_{t+1}(u, v)] - [f_t(u, v+1) - f_t(u, v)], where (u, v) denotes pixel coordinates; P_t^y(u, v) denotes the gradient boundary value of pixel (u, v) of f_t in the image longitudinal direction; f_{t+1}(u, v+1) denotes the gray value of pixel (u, v+1) in f_{t+1}; f_{t+1}(u, v) denotes the gray value of pixel (u, v) in f_{t+1}; f_t(u, v+1) denotes the gray value of pixel (u, v+1) in f_t; f_t(u, v) denotes the gray value of pixel (u, v) in f_t.

S3: calculate the inter-frame optical flow between every two adjacent frame images in the continuous frame set S_p to obtain the optical flow atlas

OF = [of_{p-s}^x, of_{p-s}^y, of_{p-s+1}^x, of_{p-s+1}^y, …, of_{p+s-1}^x, of_{p+s-1}^y]

where of_t^x and of_t^y denote the inter-frame optical flows between f_t and f_{t+1} in the image transverse direction and the image longitudinal direction, t = p-s, p-s+1, …, p+s-1.

S4: train a convolutional neural network with the representative frame, gradient boundary atlas, and optical flow atlas of each original video training sample; then, taking the representative frames, gradient boundary atlases, and optical flow atlases of each original video training sample and of the original video to be identified as input, use the trained convolutional neural network to obtain, for each original video training sample and for the original video to be identified, the representative-frame CNN feature C_rgb, the gradient boundary CNN feature C_gbf, and the optical flow CNN feature C_of.

S5: use C_rgb, C_gbf, and C_of of each original video training sample to train the parameters k and b in the fusion formula C_fusion = y_cat * k + b, where k is a convolution kernel parameter, b is a bias parameter, and y_cat = [C_gbf, C_rgb, C_of]; fuse C_rgb, C_gbf, and C_of of the original video to be identified with the trained fusion formula to obtain the fusion feature C_fusion.

S6: based on the fusion feature C_fusion, perform action recognition with an action classification algorithm.
Secondly, an action recognition system based on a gradient boundary graph and multi-mode convolution fusion comprises:

A continuous frame set forming module, for sampling the original video to obtain a representative frame f_p, and taking f_p, the s frame images before f_p, and the s frame images after f_p from the original video to form a continuous frame set S_p = [f_{p-s}, …, f_p, …, f_{p+s}]; s is an empirical value with a value range of 5 to 10; the original video is an original video training sample or an original video to be identified.

A gradient boundary atlas obtaining module, for calculating the gradient boundary values between every two adjacent frame images in S_p to obtain the gradient boundary matrices, and obtaining the gradient boundary atlas from them; the gradient boundary matrices are

[P_{p-s}^x, P_{p-s}^y, P_{p-s+1}^x, P_{p-s+1}^y, …, P_{p+s-1}^x, P_{p+s-1}^y]

where P_t^x and P_t^y respectively denote the gradient boundary matrices between f_t and its subsequent adjacent frame image f_{t+1} in the image transverse direction and the image longitudinal direction, t = p-s, p-s+1, …, p+s-1; P_t^x consists of elements P_t^x(u, v), with P_t^x(u, v) = [f_{t+1}(u+1, v) - f_{t+1}(u, v)] - [f_t(u+1, v) - f_t(u, v)]; P_t^y consists of elements P_t^y(u, v), with P_t^y(u, v) = [f_{t+1}(u, v+1) - f_{t+1}(u, v)] - [f_t(u, v+1) - f_t(u, v)]; here (u, v) denotes pixel coordinates, P_t^x(u, v) and P_t^y(u, v) denote the gradient boundary values of pixel (u, v) of f_t in the image transverse and longitudinal directions, and f_t(u, v), f_t(u+1, v), f_t(u, v+1), f_{t+1}(u, v), f_{t+1}(u+1, v), and f_{t+1}(u, v+1) denote the gray values of the corresponding pixels in f_t and f_{t+1}.

An optical flow atlas acquisition module, for calculating the inter-frame optical flow between every two adjacent frame images in the continuous frame set S_p to obtain the optical flow atlas

OF = [of_{p-s}^x, of_{p-s}^y, of_{p-s+1}^x, of_{p-s+1}^y, …, of_{p+s-1}^x, of_{p+s-1}^y]

where of_t^x and of_t^y denote the inter-frame optical flows between f_t and f_{t+1} in the image transverse direction and the image longitudinal direction, t = p-s, p-s+1, …, p+s-1.

A CNN feature identification module, for training a convolutional neural network with the representative frame, gradient boundary atlas, and optical flow atlas of each original video training sample, and, taking the representative frames, gradient boundary atlases, and optical flow atlases of each original video training sample and of the original video to be identified as input, using the trained convolutional neural network to obtain, for each original video training sample and for the original video to be identified, the representative-frame CNN feature C_rgb, the gradient boundary CNN feature C_gbf, and the optical flow CNN feature C_of.

A fusion module, for using C_rgb, C_gbf, and C_of of each original video training sample to train the parameters k and b in the fusion formula C_fusion = y_cat * k + b, where k is a convolution kernel parameter, b is a bias parameter, and y_cat = [C_gbf, C_rgb, C_of], and fusing C_rgb, C_gbf, and C_of of the original video to be identified with the trained fusion formula to obtain the fusion feature C_fusion.

An action recognition module, for performing action recognition with an action classification algorithm based on the fusion feature C_fusion.
Compared with the prior art, the invention has the beneficial effects that:
The gradient boundary map is added as an important carrier of spatio-temporal action information, and a multi-mode data convolution fusion method is proposed, which ensures the consistency of multi-mode spatio-temporal feature fusion, improves the accuracy with which human actions in video are described, and raises the human action recognition rate.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the accompanying drawings and embodiments, to facilitate understanding and practice by those of ordinary skill in the art. It should be understood that the embodiments described here are for illustration and explanation only and are not to be construed as limiting the invention.
Referring to Fig. 1, the action identification method based on a gradient boundary graph and multi-mode convolution fusion provided by the embodiment of the present invention specifically includes the following steps:

Step 1: For the original video F = [f_1, …, f_i, …, f_n], sample a frame image f_p as the representative frame of the original video; the representative frame f_p together with its preceding s frame images and its following s frame images forms the continuous frame set S_p = [f_{p-s}, …, f_p, …, f_{p+s}]. Here f_i denotes the i-th frame image of the original video, i = 1, 2, …, n, and n is the total number of frame images in the original video; s is an empirical value, preferably in the range 5 to 10. In this embodiment s = 5, and the resulting continuous frame set is denoted S_p = [f_{p-5}, …, f_p, …, f_{p+5}].
The acquisition of the representative frame may be achieved using techniques customary in the art, for example, but not limited to, random sampling.
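As a concrete illustration of Step 1, the following Python sketch builds the continuous frame set from a video file. Reading frames with OpenCV, converting them to grayscale, and choosing the representative frame by random sampling are assumptions made only for illustration (the invention does not fix these details), and all function and variable names are hypothetical.

```python
# Minimal sketch of Step 1: sample a representative frame f_p and build
# S_p = [f_{p-s}, ..., f_p, ..., f_{p+s}]. OpenCV reading and random sampling
# of p are illustrative assumptions, not requirements of the patent.
import random
import cv2

def build_continuous_frame_set(video_path, s=5):
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Grayscale, since the gradient boundary values are defined on gray values.
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    cap.release()

    n = len(frames)
    if n < 2 * s + 1:
        raise ValueError("video too short for the chosen s")
    # Keep p at least s frames away from both ends of the video.
    p = random.randint(s, n - s - 1)
    S_p = frames[p - s:p + s + 1]   # the continuous frame set (2s + 1 frames)
    f_p = frames[p]                 # the representative frame
    return f_p, S_p
```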
Step 2: Based on the continuous frame set S_p, calculate the gradient boundary values between every two adjacent frame images in S_p to obtain the gradient boundary matrices

[P_{p-5}^x, P_{p-5}^y, P_{p-4}^x, P_{p-4}^y, …, P_{p+4}^x, P_{p+4}^y]

where P_t^x and P_t^y respectively denote the gradient boundary matrices between f_t and its subsequent adjacent frame image f_{t+1} in the continuous frame set S_p in the image transverse direction (X direction) and the image longitudinal direction (Y direction), t = p-5, p-4, …, p+4.

Any element P_t^x(u, v) of P_t^x is calculated as follows:

P_t^x(u, v) = [f_{t+1}(u+1, v) - f_{t+1}(u, v)] - [f_t(u+1, v) - f_t(u, v)]   (1)

where (u, v) denotes pixel coordinates; P_t^x(u, v) denotes the gradient boundary value of pixel (u, v) of frame image f_t in the X direction; f_{t+1}(u+1, v) denotes the gray value of pixel (u+1, v) in frame image f_{t+1}; f_{t+1}(u, v) denotes the gray value of pixel (u, v) in frame image f_{t+1}; f_t(u+1, v) denotes the gray value of pixel (u+1, v) in frame image f_t; f_t(u, v) denotes the gray value of pixel (u, v) in frame image f_t.

Accordingly, any element P_t^y(u, v) of P_t^y is calculated as follows:

P_t^y(u, v) = [f_{t+1}(u, v+1) - f_{t+1}(u, v)] - [f_t(u, v+1) - f_t(u, v)]   (2)

where (u, v) denotes pixel coordinates; P_t^y(u, v) denotes the gradient boundary value of pixel (u, v) of frame image f_t in the Y direction; f_{t+1}(u, v+1) denotes the gray value of pixel (u, v+1) in frame image f_{t+1}; f_{t+1}(u, v) denotes the gray value of pixel (u, v) in frame image f_{t+1}; f_t(u, v+1) denotes the gray value of pixel (u, v+1) in frame image f_t; f_t(u, v) denotes the gray value of pixel (u, v) in frame image f_t.

The values of each P_t^x and P_t^y are then linearly scaled to integers in [0, 255], yielding the gradient boundary atlas

GBF = [P_{p-5}^x, P_{p-5}^y, P_{p-4}^x, P_{p-4}^y, …, P_{p+4}^x, P_{p+4}^y]
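The gradient boundary computation of formulas (1) and (2), together with the linear scaling to [0, 255], can be sketched in Python/NumPy as follows; the array indexing convention (rows correspond to the longitudinal coordinate v, columns to the transverse coordinate u) and all names are illustrative assumptions.

```python
# Minimal sketch of Step 2: gradient boundary maps P_t^x, P_t^y for each adjacent
# pair in S_p, rescaled linearly to [0, 255] to form the gradient boundary atlas.
import numpy as np

def rescale_to_uint8(m):
    lo, hi = float(m.min()), float(m.max())
    if hi == lo:
        return np.zeros_like(m, dtype=np.uint8)
    return np.round(255.0 * (m - lo) / (hi - lo)).astype(np.uint8)

def gradient_boundary_atlas(S_p):
    gbf = []
    for f_t, f_t1 in zip(S_p[:-1], S_p[1:]):
        f_t = f_t.astype(np.float32)
        f_t1 = f_t1.astype(np.float32)
        # Formula (1): P_t^x(u,v) = [f_{t+1}(u+1,v) - f_{t+1}(u,v)] - [f_t(u+1,v) - f_t(u,v)]
        p_x = np.diff(f_t1, axis=1) - np.diff(f_t, axis=1)
        # Formula (2): P_t^y(u,v) = [f_{t+1}(u,v+1) - f_{t+1}(u,v)] - [f_t(u,v+1) - f_t(u,v)]
        p_y = np.diff(f_t1, axis=0) - np.diff(f_t, axis=0)
        # np.diff shortens the differenced axis by one pixel, as the finite
        # differences above imply.
        gbf.append(rescale_to_uint8(p_x))
        gbf.append(rescale_to_uint8(p_y))
    return gbf   # [P_{p-s}^x, P_{p-s}^y, ..., P_{p+s-1}^x, P_{p+s-1}^y]
```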
Step 3: Calculate the inter-frame optical flow between every two adjacent frame images in the continuous frame set S_p and linearly scale it to integer values in [0, 255], obtaining the optical flow atlas

OF = [of_{p-5}^x, of_{p-5}^y, of_{p-4}^x, of_{p-4}^y, …, of_{p+4}^x, of_{p+4}^y]

where of_t^x and of_t^y denote the inter-frame optical flows between f_t and f_{t+1} in the continuous frame set S_p in the X direction and the Y direction respectively, t = p-5, p-4, …, p+4.
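One possible realization of Step 3 is sketched below. The patent does not name a specific optical flow algorithm, so dense Farneback flow from OpenCV is used here purely as an example; the rescaling mirrors that applied to the gradient boundary maps, and all names are illustrative.

```python
# Minimal sketch of Step 3: per-pair dense optical flow, split into X/Y components
# and linearly rescaled to integers in [0, 255]. Farneback flow is an assumption.
import cv2
import numpy as np

def optical_flow_atlas(S_p):
    of_maps = []
    for f_t, f_t1 in zip(S_p[:-1], S_p[1:]):
        flow = cv2.calcOpticalFlowFarneback(
            f_t, f_t1, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        of_x, of_y = flow[..., 0], flow[..., 1]   # transverse and longitudinal components
        for comp in (of_x, of_y):
            lo, hi = float(comp.min()), float(comp.max())
            if hi == lo:
                scaled = np.zeros_like(comp, dtype=np.uint8)
            else:
                scaled = np.round(255.0 * (comp - lo) / (hi - lo)).astype(np.uint8)
            of_maps.append(scaled)
    return of_maps   # [of_{p-s}^x, of_{p-s}^y, ..., of_{p+s-1}^x, of_{p+s-1}^y]
```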
Step 4: Using the multi-mode data as input, learn features with a convolutional neural network (CNN) for each modality to obtain the gradient boundary CNN feature C_gbf, the representative-frame CNN feature C_rgb, and the optical flow CNN feature C_of. The multi-mode data comprise the gradient boundary atlas GBF, the representative frame f_p, and the optical flow atlas OF.
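The patent does not prescribe a particular CNN architecture for Step 4. The following PyTorch sketch therefore uses a deliberately small three-stream network, one stream per modality, only to show how the three inputs yield C_rgb, C_gbf, and C_of; layer sizes, channel counts, input resolution, and names are assumptions, and random tensors stand in for the preprocessed inputs.

```python
# Minimal sketch of Step 4: three CNN streams, one per modality. Each modality is
# stacked along the channel axis: the representative frame has 1 (grayscale)
# channel, while GBF and OF each contribute 4*s maps for s = 5.
import torch
import torch.nn as nn

def make_stream(in_channels, feat_dim=256):
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, feat_dim))

s = 5
rgb_stream = make_stream(in_channels=1)        # representative frame f_p
gbf_stream = make_stream(in_channels=4 * s)    # gradient boundary atlas GBF
of_stream  = make_stream(in_channels=4 * s)    # optical flow atlas OF

# C_rgb, C_gbf, C_of for a batch of size 1; inputs assumed already scaled to [0, 1].
C_rgb = rgb_stream(torch.rand(1, 1, 224, 224))
C_gbf = gbf_stream(torch.rand(1, 4 * s, 224, 224))
C_of  = of_stream(torch.rand(1, 4 * s, 224, 224))
```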
Step 5: Perform multi-mode CNN feature fusion on the gradient boundary CNN feature C_gbf, the representative-frame CNN feature C_rgb, and the optical flow CNN feature C_of to obtain the fusion feature C_fusion.

The fusion formula is:

C_fusion = y_cat * k + b   (3)

where k is a convolution kernel parameter; b is a bias parameter; y_cat = [C_gbf, C_rgb, C_of]. The convolution kernel parameter k and the bias parameter b are obtained during the CNN parameter training process.
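One way to realize fusion formula (3) is sketched below. Treating the concatenated feature y_cat as a one-dimensional signal convolved with a learned kernel k plus a bias b is an interpretation chosen for illustration, since the patent only states that k is a convolution kernel parameter and b a bias parameter; kernel size and names are assumptions.

```python
# Minimal sketch of formula (3): C_fusion = y_cat * k + b, with y_cat the
# concatenation [C_gbf, C_rgb, C_of] and (k, b) held by a 1-D convolution.
import torch
import torch.nn as nn

class ConvFusion(nn.Module):
    def __init__(self, kernel_size=3):
        super().__init__()
        # self.conv holds the convolution kernel parameter k and the bias parameter b;
        # both are trained jointly with the CNN, as described in the training stage.
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, C_gbf, C_rgb, C_of):
        y_cat = torch.cat([C_gbf, C_rgb, C_of], dim=1)    # y_cat = [C_gbf, C_rgb, C_of]
        return self.conv(y_cat.unsqueeze(1)).squeeze(1)   # C_fusion = y_cat * k + b

fusion = ConvFusion()
C_fusion = fusion(C_gbf, C_rgb, C_of)   # feature tensors from the Step 4 sketch above
```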
Step 6: Based on the fusion feature C_fusion, perform action recognition with an action classification algorithm.
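Since the action classification algorithm is left open by the patent, a linear softmax-style classifier over C_fusion is shown here as one possible choice; the number of action classes is a placeholder.

```python
# Minimal sketch of Step 6: classify the fusion feature C_fusion from the sketch above.
import torch.nn as nn

num_actions = 101                                      # placeholder class count (assumption)
classifier = nn.Linear(C_fusion.shape[1], num_actions)
scores = classifier(C_fusion)                          # per-class scores
predicted_action = scores.argmax(dim=1)                # index of the recognized action class
```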
The method of the invention comprises two stages: training and action recognition. In the training stage, the CNN weight parameters, the convolution kernel parameter k, and the bias parameter b are trained with the training samples. In the action recognition stage, the trained CNN and the trained fusion formula are used to extract the fusion feature, and the classification result is given based on the fusion feature.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (2)

1. An action recognition method based on a gradient boundary graph and multi-mode convolution fusion, characterized by comprising the following steps:

S1: sampling the original video to obtain a representative frame f_p, and taking f_p, the s frame images before f_p, and the s frame images after f_p from the original video to form a continuous frame set S_p = [f_{p-s}, …, f_p, …, f_{p+s}]; s is an empirical value with a value range of 5 to 10; the original video is an original video training sample or an original video to be identified;

S2: calculating the gradient boundary values between every two adjacent frame images in S_p to obtain the gradient boundary matrices, and obtaining the gradient boundary atlas from them; the gradient boundary matrices are

[P_{p-s}^x, P_{p-s}^y, P_{p-s+1}^x, P_{p-s+1}^y, …, P_{p+s-1}^x, P_{p+s-1}^y]

where P_t^x and P_t^y respectively denote the gradient boundary matrices between f_t and its subsequent adjacent frame image f_{t+1} in the image transverse direction and the image longitudinal direction, t = p-s, p-s+1, …, p+s-1;

P_t^x consists of elements P_t^x(u, v), with P_t^x(u, v) = [f_{t+1}(u+1, v) - f_{t+1}(u, v)] - [f_t(u+1, v) - f_t(u, v)], where (u, v) denotes pixel coordinates; P_t^x(u, v) denotes the gradient boundary value of pixel (u, v) of f_t in the image transverse direction; f_{t+1}(u+1, v) denotes the gray value of pixel (u+1, v) in f_{t+1}; f_{t+1}(u, v) denotes the gray value of pixel (u, v) in f_{t+1}; f_t(u+1, v) denotes the gray value of pixel (u+1, v) in f_t; f_t(u, v) denotes the gray value of pixel (u, v) in f_t;

P_t^y consists of elements P_t^y(u, v), with P_t^y(u, v) = [f_{t+1}(u, v+1) - f_{t+1}(u, v)] - [f_t(u, v+1) - f_t(u, v)], where (u, v) denotes pixel coordinates; P_t^y(u, v) denotes the gradient boundary value of pixel (u, v) of f_t in the image longitudinal direction; f_{t+1}(u, v+1) denotes the gray value of pixel (u, v+1) in f_{t+1}; f_{t+1}(u, v) denotes the gray value of pixel (u, v) in f_{t+1}; f_t(u, v+1) denotes the gray value of pixel (u, v+1) in f_t; f_t(u, v) denotes the gray value of pixel (u, v) in f_t;

S3: calculating the inter-frame optical flow between every two adjacent frame images in the continuous frame set S_p to obtain the optical flow atlas

OF = [of_{p-s}^x, of_{p-s}^y, of_{p-s+1}^x, of_{p-s+1}^y, …, of_{p+s-1}^x, of_{p+s-1}^y]

where of_t^x and of_t^y denote the inter-frame optical flows between f_t and f_{t+1} in the image transverse direction and the image longitudinal direction, t = p-s, p-s+1, …, p+s-1;

S4: training a convolutional neural network with the representative frame, gradient boundary atlas, and optical flow atlas of each original video training sample; taking the representative frames, gradient boundary atlases, and optical flow atlases of each original video training sample and of the original video to be identified as input, and using the trained convolutional neural network to obtain, for each original video training sample and for the original video to be identified, the representative-frame CNN feature C_rgb, the gradient boundary CNN feature C_gbf, and the optical flow CNN feature C_of;

S5: using C_rgb, C_gbf, and C_of of each original video training sample to train the parameters k and b in the fusion formula C_fusion = y_cat * k + b, where k is a convolution kernel parameter, b is a bias parameter, and y_cat = [C_gbf, C_rgb, C_of]; fusing C_rgb, C_gbf, and C_of of the original video to be identified with the trained fusion formula to obtain the fusion feature C_fusion;

S6: based on the fusion feature C_fusion, performing action recognition with an action classification algorithm.
2. An action recognition system based on a gradient boundary graph and multi-mode convolution fusion, characterized by comprising:

a continuous frame set forming module, for sampling the original video to obtain a representative frame f_p, and taking f_p, the s frame images before f_p, and the s frame images after f_p from the original video to form a continuous frame set S_p = [f_{p-s}, …, f_p, …, f_{p+s}]; s is an empirical value with a value range of 5 to 10; the original video is an original video training sample or an original video to be identified;

a gradient boundary atlas obtaining module, for calculating the gradient boundary values between every two adjacent frame images in S_p to obtain the gradient boundary matrices, and obtaining the gradient boundary atlas from them; the gradient boundary matrices are

[P_{p-s}^x, P_{p-s}^y, P_{p-s+1}^x, P_{p-s+1}^y, …, P_{p+s-1}^x, P_{p+s-1}^y]

where P_t^x and P_t^y respectively denote the gradient boundary matrices between f_t and its subsequent adjacent frame image f_{t+1} in the image transverse direction and the image longitudinal direction, t = p-s, p-s+1, …, p+s-1;

P_t^x consists of elements P_t^x(u, v), with P_t^x(u, v) = [f_{t+1}(u+1, v) - f_{t+1}(u, v)] - [f_t(u+1, v) - f_t(u, v)], where (u, v) denotes pixel coordinates; P_t^x(u, v) denotes the gradient boundary value of pixel (u, v) of f_t in the image transverse direction; f_{t+1}(u+1, v) denotes the gray value of pixel (u+1, v) in f_{t+1}; f_{t+1}(u, v) denotes the gray value of pixel (u, v) in f_{t+1}; f_t(u+1, v) denotes the gray value of pixel (u+1, v) in f_t; f_t(u, v) denotes the gray value of pixel (u, v) in f_t;

P_t^y consists of elements P_t^y(u, v), with P_t^y(u, v) = [f_{t+1}(u, v+1) - f_{t+1}(u, v)] - [f_t(u, v+1) - f_t(u, v)], where (u, v) denotes pixel coordinates; P_t^y(u, v) denotes the gradient boundary value of pixel (u, v) of f_t in the image longitudinal direction; f_{t+1}(u, v+1) denotes the gray value of pixel (u, v+1) in f_{t+1}; f_{t+1}(u, v) denotes the gray value of pixel (u, v) in f_{t+1}; f_t(u, v+1) denotes the gray value of pixel (u, v+1) in f_t; f_t(u, v) denotes the gray value of pixel (u, v) in f_t;

an optical flow atlas acquisition module, for calculating the inter-frame optical flow between every two adjacent frame images in the continuous frame set S_p to obtain the optical flow atlas

OF = [of_{p-s}^x, of_{p-s}^y, of_{p-s+1}^x, of_{p-s+1}^y, …, of_{p+s-1}^x, of_{p+s-1}^y]

where of_t^x and of_t^y denote the inter-frame optical flows between f_t and f_{t+1} in the image transverse direction and the image longitudinal direction, t = p-s, p-s+1, …, p+s-1;

a CNN feature identification module, for training a convolutional neural network with the representative frame, gradient boundary atlas, and optical flow atlas of each original video training sample, and, taking the representative frames, gradient boundary atlases, and optical flow atlases of each original video training sample and of the original video to be identified as input, using the trained convolutional neural network to obtain, for each original video training sample and for the original video to be identified, the representative-frame CNN feature C_rgb, the gradient boundary CNN feature C_gbf, and the optical flow CNN feature C_of;

a fusion module, for using C_rgb, C_gbf, and C_of of each original video training sample to train the parameters k and b in the fusion formula C_fusion = y_cat * k + b, where k is a convolution kernel parameter, b is a bias parameter, and y_cat = [C_gbf, C_rgb, C_of], and fusing C_rgb, C_gbf, and C_of of the original video to be identified with the trained fusion formula to obtain the fusion feature C_fusion;

an action recognition module, for performing action recognition with an action classification algorithm based on the fusion feature C_fusion.
CN201710018537.4A 2017-01-10 2017-01-10 Action identification method and system based on gradient boundary graph and multi-mode convolution fusion Active CN108288016B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710018537.4A CN108288016B (en) 2017-01-10 2017-01-10 Action identification method and system based on gradient boundary graph and multi-mode convolution fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710018537.4A CN108288016B (en) 2017-01-10 2017-01-10 Action identification method and system based on gradient boundary graph and multi-mode convolution fusion

Publications (2)

Publication Number Publication Date
CN108288016A CN108288016A (en) 2018-07-17
CN108288016B true CN108288016B (en) 2021-09-03

Family

ID=62831255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710018537.4A Active CN108288016B (en) 2017-01-10 2017-01-10 Action identification method and system based on gradient boundary graph and multi-mode convolution fusion

Country Status (1)

Country Link
CN (1) CN108288016B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522874B (en) * 2018-12-11 2020-08-21 中国科学院深圳先进技术研究院 Human body motion recognition method, device, terminal device and storage medium
CN109635764A (en) * 2018-12-19 2019-04-16 荆楚理工学院 A kind of Human bodys' response method and system based on multiple features linear temporal coding
CN113610821A (en) * 2021-08-12 2021-11-05 上海明略人工智能(集团)有限公司 Video shot boundary positioning method and device and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6535623B1 (en) * 1999-04-15 2003-03-18 Allen Robert Tannenbaum Curvature based system for the segmentation and analysis of cardiac magnetic resonance images
CN105550678A (en) * 2016-02-03 2016-05-04 武汉大学 Human body motion feature extraction method based on global remarkable edge area


Also Published As

Publication number Publication date
CN108288016A (en) 2018-07-17

Similar Documents

Publication Publication Date Title
Li et al. End-to-end united video dehazing and detection
CN104281853B (en) A kind of Activity recognition method based on 3D convolutional neural networks
CN107624061B (en) Machine vision with dimensional data reduction
CN108830252A (en) A kind of convolutional neural networks human motion recognition method of amalgamation of global space-time characteristic
CN108549841A (en) A kind of recognition methods of the Falls Among Old People behavior based on deep learning
CN107862376A (en) A kind of human body image action identification method based on double-current neutral net
CN110569795A (en) An image recognition method, device and related equipment
CN107862275A (en) Human bodys' response model and its construction method and Human bodys' response method
CN107909005A (en) Personage's gesture recognition method under monitoring scene based on deep learning
CN108416266A (en) A kind of video behavior method for quickly identifying extracting moving target using light stream
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
Dandıl et al. Real-time facial emotion classification using deep learning
CN108288015A (en) Human motion recognition method and system in video based on THE INVARIANCE OF THE SCALE OF TIME
CN108288016B (en) Action identification method and system based on gradient boundary graph and multi-mode convolution fusion
CN109389035A (en) Low latency video actions detection method based on multiple features and frame confidence score
CN116704196B (en) Method for training image semantic segmentation model
CN110852199A (en) A Foreground Extraction Method Based on Double Frame Encoding and Decoding Model
CN112949560B (en) Method for identifying continuous expression change of long video expression interval under two-channel feature fusion
CN110942037A (en) Action recognition method for video analysis
CN113158756A (en) Posture and behavior analysis module and method based on HRNet deep learning
CN118247844A (en) Animal posture estimation method, system and device based on diffusion model and attention mechanism
Li et al. Dilated spatial–temporal convolutional auto-encoders for human fall detection in surveillance videos
CN110348395B (en) Skeleton behavior identification method based on space-time relationship
Zhou et al. A deep learning algorithm for fast motion video sequences based on improved codebook model
CN109635764A (en) A kind of Human bodys' response method and system based on multiple features linear temporal coding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant