CN109753897A - Behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning - Google Patents

Behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning (Download PDF)

Info

Publication number
CN109753897A
CN109753897A (application number CN201811569882.8A)
Authority
CN
China
Prior art keywords
memory unit
video
feature
follows
recurrent neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811569882.8A
Other languages
Chinese (zh)
Other versions
CN109753897B (en)
Inventor
袁媛
王琦
王栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN201811569882.8A priority Critical patent/CN109753897B/en
Publication of CN109753897A publication Critical patent/CN109753897A/en
Application granted granted Critical
Publication of CN109753897B publication Critical patent/CN109753897B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Abstract

The invention discloses a behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning, addressing the poor practicality of existing behavior recognition methods. The technical solution models the long-term temporal structure of a video sequence with a recurrent neural network fused with a memory unit: a discretized memory-unit read/write controller classifies each video frame as either a relevant frame or a noise frame, writes the information of relevant frames into the memory unit, and ignores noise frames. The method filters out the large amount of noise in untrimmed video, and the memory-augmented recurrent network establishes long-span temporal connections. Through data-driven training it models the long-term temporal structure of complex human behavior, overcoming the difficulty prior methods have with long, untrimmed video whose motion patterns are complex and whose background changes frequently. It improves the robustness of human behavior recognition and reaches an average recognition accuracy of 94.8%.

Description

Behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning
Technical field
The present invention relates to a behavior recognition method, and in particular to a behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning.
Background technique
The document "L. Wang, Y. Xiong, Z. Wang, Y. Qiao, D. Lin, X. Tang, and L. V. Gool. Temporal Segment Networks: Towards Good Practices for Deep Action Recognition. In Proceedings of the European Conference on Computer Vision, pp. 20-36, 2016." discloses a human behavior recognition method based on two-stream convolutional neural networks and a temporal segment network. The method solves the behavior recognition task with two independent convolutional neural networks: a spatial stream extracts appearance features of the target from video frames, while a temporal stream extracts motion features of the target from the corresponding optical-flow fields; the outputs of the two networks are fused to produce the recognition result. The method further proposes a temporal segment network to model the long-term temporal structure of the video sequence. Through a sparse temporal sampling strategy and video-level supervised learning, the whole network can be trained efficiently and effectively, and it achieves good results on large public datasets. However, the method models the temporal structure of video only coarsely, so the network tends to ignore temporal correlations between features during learning. When the video sequence is long and untrimmed, irrelevant noise is incorporated into the final recognition result, reducing the accuracy of human behavior recognition, and the added noise also makes training the whole network difficult.
Summary of the invention
To overcome the poor practicality of existing behavior recognition methods, the present invention provides a behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning. The method models the long-term temporal structure of a video sequence with a recurrent neural network fused with a memory unit: a discretized memory-unit read/write controller classifies each video frame as either a relevant frame or a noise frame, writes the information of relevant frames into the memory unit, and ignores noise frames. This filters out the large amount of noise in untrimmed video and improves the accuracy of subsequent behavior recognition. In addition, the memory-augmented recurrent network establishes long-span temporal connections; through data-driven training it models the long-term temporal structure of complex human behavior, overcoming the difficulty existing behavior recognition methods have with long, untrimmed video whose motion patterns are complex and whose background changes frequently. It improves the robustness of human behavior recognition and reaches average recognition accuracies of 94.8% and 71.8%.
The technical solution adopted by the present invention to solve the technical problem is a behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning, characterized by comprising the following steps:
Step 1: Compute the optical flow of video frame I_a, where the flow at each pixel is a two-dimensional vector (Δx, Δy), and save it as an optical-flow map I_m. Extract the corresponding high-dimensional semantic features with two independent convolutional neural networks:

x_a = CNN_a(I_a; w_a)   (1)
x_m = CNN_m(I_m; w_m)   (2)

where CNN_a and CNN_m respectively denote the appearance and motion convolutional neural networks used to extract high-dimensional features from the video frame I_a and the optical-flow map I_m; x_a and x_m are 2048-dimensional vectors representing the extracted appearance and motion features; and w_a, w_m denote the trainable internal parameters of the two networks. In the following, x denotes a high-dimensional feature extracted by a convolutional neural network.
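The two-stream extraction in Step 1 can be sketched as follows. The actual CNN_a and CNN_m are deep convolutional networks producing 2048-dimensional features; as an assumption for illustration only, random linear maps followed by a tanh stand in for them here, and the tiny frame size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_appearance(frame, w):
    # Placeholder for CNN_a: maps an RGB frame to a 2048-d appearance feature.
    return np.tanh(frame.reshape(-1) @ w)

def cnn_motion(flow, w):
    # Placeholder for CNN_m: maps an optical-flow map to a 2048-d motion feature.
    return np.tanh(flow.reshape(-1) @ w)

H, W, D = 8, 8, 2048                         # tiny frame size for illustration
frame = rng.standard_normal((H, W, 3))       # video frame I_a
flow = rng.standard_normal((H, W, 2))        # optical-flow map I_m: (dx, dy) per pixel
w_a = rng.standard_normal((H * W * 3, D)) * 0.01
w_m = rng.standard_normal((H * W * 2, D)) * 0.01

x_a = cnn_appearance(frame, w_a)  # appearance feature, eq. (1)
x_m = cnn_motion(flow, w_m)       # motion feature, eq. (2)
print(x_a.shape, x_m.shape)       # both are 2048-d vectors
```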
Step 2: Initialize the memory unit M as empty, denoted M_0. Suppose that at video frame t the memory unit M_t is non-empty and contains N_t > 0 elements, denoted m_1, m_2, ..., m_{N_t}. The memory read operation at the corresponding moment is then as follows:

where the read-out mh_t represents the historical information of the video before time t.
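The read formula itself (eq. 3) is not reproduced in this text; a minimal sketch, assuming the read-out mh_t is simply the mean of the stored elements, looks like:

```python
import numpy as np

def memory_read(M):
    """Read mh_t from the memory unit M_t = [m_1, ..., m_N].

    The patent's eq. (3) is not reproduced here; the mean of the stored
    elements is assumed as a simple stand-in for the read operation.
    """
    if len(M) == 0:
        return None  # M_0 is empty: no history yet
    return np.mean(np.stack(M), axis=0)

M = [np.ones(4), 3 * np.ones(4)]  # two stored memory elements
mh = memory_read(M)
print(mh)  # [2. 2. 2. 2.]
```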
Step 3: Extract the short-term contextual features of the video content with a segmented recurrent neural network. Taking the high-dimensional semantic feature x from Step 1 as input, denote the feature at video frame t as x_t. Initialize the hidden states h_0, c_0 of the long short-term memory network (LSTM) to zero; the short-term contextual feature at time t is then computed as follows:

where LSTM(·) denotes the long short-term memory network, and h_{t-1}, c_{t-1} denote the hidden states of the recurrent network at the previous moment. The result serves as the short-term contextual feature of the video content for subsequent computation.
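Step 3's short-term context comes from a standard LSTM recurrence; a minimal NumPy implementation of one step is sketched below. The weights are random placeholders, not trained parameters, and the gate layout follows the conventional LSTM cell rather than any formula reproduced in this text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell: (x_t, h_{t-1}, c_{t-1}) -> (h_t, c_t)."""
    z = W @ np.concatenate([x, h_prev]) + b
    d = h_prev.size
    i = sigmoid(z[:d])          # input gate
    f = sigmoid(z[d:2 * d])     # forget gate
    o = sigmoid(z[2 * d:3 * d])  # output gate
    g = np.tanh(z[3 * d:])      # candidate cell state
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
D, Hd = 8, 4
x_t = rng.standard_normal(D)
h0, c0 = np.zeros(Hd), np.zeros(Hd)          # hidden states initialised to zero
W = rng.standard_normal((4 * Hd, D + Hd)) * 0.1
b = np.zeros(4 * Hd)
h1, c1 = lstm_step(x_t, h0, c0, W, b)        # short-term context at t = 1
print(h1.shape)  # (4,)
```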
Step 4: For each video frame, input the high-dimensional semantic feature x_t, the memory-unit history mh_t, and the short-term contextual feature computed in Steps 1-3 into the memory-unit controller, and compute the binarized memory write instruction s_t ∈ {0, 1} as follows:

a_t = σ(q_t)   (6)
s_t = τ(a_t)   (7)

where v^T is a learnable row-vector parameter, W_f, W_c, W_m are learnable weight parameters, and b_s is a bias parameter. The sigmoid function σ(·) normalizes the linearly weighted score q_t into (0, 1), i.e. a_t ∈ (0, 1); a_t is then fed to the thresholded binarization function τ(·) to obtain the binarized memory write instruction s_t.
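Equation (5) defining q_t is not reproduced in this text, so the sketch below assumes a simple linear score over the three controller inputs using the named parameters v, W_f, W_c, W_m, b_s; the sigmoid (eq. 6) and threshold binarization (eq. 7) follow the text directly.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def write_gate(x_t, h_ctx, mh_t, v, W_f, W_c, W_m, b_s, thresh=0.5):
    """Discretized write controller: a_t = sigma(q_t), s_t = tau(a_t).

    q_t is assumed to be a linear score over the frame feature x_t, the
    short-term context h_ctx, and the memory history mh_t (eq. 5 is not
    reproduced in the source text).
    """
    q_t = v @ (W_f @ x_t + W_c @ h_ctx + W_m @ mh_t) + b_s
    a_t = sigmoid(q_t)               # eq. (6): normalise to (0, 1)
    s_t = 1 if a_t > thresh else 0   # eq. (7): threshold binarisation
    return a_t, s_t

rng = np.random.default_rng(0)
D, Hd = 6, 4
v = rng.standard_normal(Hd)
W_f = rng.standard_normal((Hd, D))
W_c = rng.standard_normal((Hd, Hd))
W_m = rng.standard_normal((Hd, D))
a, s = write_gate(rng.standard_normal(D), rng.standard_normal(Hd),
                  rng.standard_normal(D), v, W_f, W_c, W_m, 0.0)
print(s in (0, 1), 0.0 < a < 1.0)
```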
Step 5: Based on the binarized memory write instruction s_t, update the memory unit and the segmented recurrent neural network. For each video frame, the update strategy of the memory unit M_t is as follows:

where W_w is a learnable weight matrix that converts the high-dimensional semantic feature x_t into a memory element by multiplication; writing this element into M_{t-1} forms the new memory unit M_t. In addition, the hidden states h_t, c_t of the segmented recurrent neural network are updated as follows:

where the input is the result computed by formula (4).
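The write policy of Step 5 can be sketched directly: the frame feature is projected by W_w and appended only when the controller marked the frame as relevant (s_t = 1); noise frames (s_t = 0) leave the memory untouched. W_w here is a random placeholder.

```python
import numpy as np

def memory_update(M, x_t, W_w, s_t):
    """If s_t == 1 (relevant frame), project x_t into the memory space with
    W_w and append it to M; if s_t == 0 (noise frame), M is left unchanged."""
    if s_t == 1:
        M = M + [W_w @ x_t]  # write W_w x_t into the memory unit
    return M

rng = np.random.default_rng(0)
W_w = rng.standard_normal((4, 6))  # learnable projection (random placeholder)
M = []                             # M_0 is empty
M = memory_update(M, rng.standard_normal(6), W_w, 1)  # relevant frame: written
M = memory_update(M, rng.standard_normal(6), W_w, 0)  # noise frame: ignored
print(len(M))  # 1
```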
Step 6: Perform behavior classification with the memory unit. Suppose the video length is T, and the memory unit after the whole video has been processed is M_T, containing N_T elements. The feature representation f of the whole video is then as follows:

where f is a D-dimensional vector representing the behavior-class information of the video. This feature is fed to a fully connected classification layer to obtain the behavior class scores y:

y = softmax(Wf)   (12)

where W ∈ R^{C×D} and C is the number of recognizable behavior classes. The computed y gives the system's score for each class; a higher score indicates that the behavior more likely belongs to that class. Let y_a and y_m denote the scores obtained from the appearance and motion networks, respectively; the final score y_f is then

y_f = y_a + y_m   (13)

where y_f gives the final human behavior recognition result.
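The classification step can be sketched end to end. The pooling of M_T into f (eq. 11) is not reproduced in this text, so a mean over the stored elements is assumed; the softmax scoring (eq. 12) and two-stream late fusion (eq. 13) follow the text.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def classify(M, W):
    """Pool the final memory M_T into a video feature f (mean over the N_T
    stored elements is assumed; eq. 11 is not reproduced) and score it."""
    f = np.mean(np.stack(M), axis=0)  # D-dimensional video feature
    return softmax(W @ f)             # eq. (12): class scores y

rng = np.random.default_rng(0)
D, C = 8, 5
M = [rng.standard_normal(D) for _ in range(3)]  # final memory M_T, N_T = 3
W = rng.standard_normal((C, D))                 # classification layer, W in R^{C x D}
y_a = classify(M, W)                            # appearance-stream scores
y_m = classify([m + 0.1 for m in M], W)         # motion-stream scores (toy inputs)
y_f = y_a + y_m                                 # eq. (13): late fusion
print(int(np.argmax(y_f)))                      # predicted behaviour class
```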
The beneficial effects of the present invention are as follows. The method models the long-term temporal structure of a video sequence with a recurrent neural network fused with a memory unit: a discretized memory-unit read/write controller classifies each video frame as either a relevant frame or a noise frame, writes the information of relevant frames into the memory unit, and ignores noise frames. This filters out the large amount of noise in untrimmed video and improves the accuracy of subsequent behavior recognition. In addition, the memory-augmented recurrent network establishes long-span temporal connections; through data-driven training it models the long-term temporal structure of complex human behavior, overcoming the difficulty existing behavior recognition methods have with long, untrimmed video whose motion patterns are complex and whose background changes frequently. It improves the robustness of human behavior recognition and reaches average recognition accuracies of 94.8% and 71.8%.
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Detailed description of the invention
Fig. 1 is a flow chart of the behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning according to the present invention.
Specific embodiment
Referring to Fig. 1, the specific steps of the behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning are as follows:
Step 1: Extract high-dimensional appearance and motion features containing semantic information. First, compute the optical flow of video frame I_a, where the flow at each pixel is a two-dimensional vector (Δx, Δy), and save it as an optical-flow map I_m. Then extract the corresponding high-dimensional semantic features with two independent convolutional neural networks:

x_a = CNN_a(I_a; w_a)   (1)
x_m = CNN_m(I_m; w_m)   (2)

where CNN_a and CNN_m respectively denote the appearance and motion convolutional neural networks used to extract high-dimensional features from the video frame I_a and the optical-flow map I_m; x_a and x_m are 2048-dimensional vectors representing the extracted appearance and motion features; and w_a, w_m denote the trainable internal parameters of the two networks. Since the subsequent operations of the appearance network and the motion network are identical, for notational clarity x denotes a high-dimensional feature extracted by a convolutional neural network.
Step 2: Initialize the memory unit M as empty, denoted M_0. Suppose that at video frame t the memory unit M_t is non-empty and contains N_t > 0 elements. The memory read operation at the corresponding moment is then as follows:

where the read-out mh_t represents the historical information of the video before time t; this historical information influences the analysis and understanding of the video content at the current moment.
Step 3: Extract the short-term contextual features of the video content with a segmented recurrent neural network. Taking the high-dimensional semantic feature x from Step 1 as input, denote the feature at video frame t as x_t. First, initialize the hidden states h_0, c_0 of the long short-term memory network (LSTM) to zero; the short-term contextual feature at time t is then computed as follows:

where LSTM(·) denotes the long short-term memory network, and h_{t-1}, c_{t-1} denote the hidden states of the recurrent network at the previous moment. The result serves as the short-term contextual feature of the video content for subsequent computation.
Step 4: Discretized memory-unit write controller. For each video frame, input the high-dimensional semantic feature x_t, the memory-unit history mh_t, and the short-term contextual feature computed in Steps 1-3 into the memory-unit controller, and compute the binarized memory write instruction s_t ∈ {0, 1} as follows:

a_t = σ(q_t)   (6)
s_t = τ(a_t)   (7)

where v^T is a learnable row-vector parameter, W_f, W_c, W_m are learnable weight parameters, and b_s is a bias parameter. As the formulas show, the sigmoid function σ(·) normalizes the linearly weighted score q_t into (0, 1), i.e. a_t ∈ (0, 1). Then a_t is fed to the thresholded binarization function τ(·) to obtain the binarized memory write instruction s_t.
Step 5: Based on the binarized memory write instruction s_t, update the memory unit and the segmented recurrent neural network. For each video frame, the update strategy of the memory unit M_t is as follows:

where W_w is a learnable weight matrix that converts the high-dimensional semantic feature x_t into a memory element by multiplication; writing this element into M_{t-1} forms the new memory unit M_t. In addition, the hidden states h_t, c_t of the segmented recurrent neural network are updated as follows:

where the input is the result computed by formula (4).
Step 6: Perform behavior classification with the memory unit. Suppose the video length is T, and the memory unit after the whole video has been processed is M_T, containing N_T elements. The feature representation f of the whole video is then as follows:

where f is a D-dimensional vector representing the behavior-class information of the video. This feature is then fed to a fully connected classification layer to obtain the behavior class scores y:

y = softmax(Wf)   (12)

where W ∈ R^{C×D} and C is the number of recognizable behavior classes. The computed y gives the system's score for each class; a higher score indicates that the behavior more likely belongs to that class. Let y_a and y_m denote the scores obtained from the appearance and motion networks, respectively; the final score y_f is then

y_f = y_a + y_m   (13)

where y_f gives the final human behavior recognition result.
The effect of the present invention is further illustrated by the following simulation experiments.
1. Simulation conditions.
The simulation was carried out with the PyTorch software on a machine with an Intel Xeon E5-2697A 2.6 GHz CPU, an NVIDIA K80 GPU, 16 GB of memory, and the CentOS 7 operating system.
The data used in the simulation come from two public benchmark datasets, UCF101 and HMDB51, in which camera motion varies greatly and the backgrounds are complex. The experimental data comprise 13320/6766 videos in total, divided into 101/51 behavior classes, respectively. Most of the videos in the HMDB51 dataset are untrimmed and contain more noise.
2. Simulation content.
To demonstrate the effectiveness of the invention, comparative experiments were carried out on the proposed memory-unit reinforcement and temporal-dynamics learning method. Specifically, as comparison algorithms the experiments selected the two-stream network architecture with the highest accuracy (TSN) and the lattice long short-term memory method (Lattice-LSTM) proposed by L. Sun et al. in "L. Sun, K. Jia, K. Chen, D. Yeung, B. Shi and S. Savarese. Lattice Long Short-Term Memory for Human Action Recognition. In Proceedings of the IEEE International Conference on Computer Vision, pp. 2166-2175, 2017." The three algorithms were run with the same parameters, and their average AUC values on the UCF101/HMDB51 datasets were computed. The comparison results are shown in Table 1.
Table 1
Method TSN Lattice-LSTM OUR
AUC(UCF101) 93.6% 94.0% 94.8%
AUC(HMDB51) 66.2% 68.5% 71.8%
As Table 1 shows, the recognition accuracy of the invention is significantly higher than that of existing behavior recognition methods. Specifically, the accuracy of TSN is lower than that of Lattice-LSTM and Ours, because TSN does not take the temporal variation pattern of the video content into account, whereas Lattice-LSTM and Ours model it with recurrent neural networks; this verifies the effectiveness of the proposed temporal-dynamics learning method based on recurrent neural networks. Moreover, on the HMDB51 dataset the algorithm Ours is substantially better than Lattice-LSTM, because the memory unit proposed by the invention effectively strengthens the recurrent network's ability to handle long, untrimmed video. To further verify the effectiveness of the memory unit in reinforcing recurrent neural networks, comparative experiments were carried out on the UCF101 dataset between various recurrent networks (LSTM, ALSTM and VideoLSTM) and the proposed algorithm; the results are shown in Table 2.
Table 2
Method LSTM ALSTM VideoLSTM Ours
AUC 88.3% 77.0% 89.2% 91.03%
As Table 2 shows, the fused result of the invention is more accurate than all the recurrent-network baselines, because the proposed memory-unit reinforcement effectively extracts the useful information in the video and thereby models its temporal variation pattern. By contrast, plain recurrent-network methods are susceptible to noise, which instead lowers their accuracy. The above simulation experiments therefore verify the effectiveness of the invention.

Claims (1)

1. A behavior recognition method based on memory-unit reinforcement and temporal-dynamics learning, characterized by comprising the following steps:
Step 1: compute the optical flow of video frame I_a, where the flow at each pixel is a two-dimensional vector (Δx, Δy), and save it as an optical-flow map I_m; extract the corresponding high-dimensional semantic features with two independent convolutional neural networks:
x_a = CNN_a(I_a; w_a)   (1)
x_m = CNN_m(I_m; w_m)   (2)
where CNN_a and CNN_m respectively denote the appearance and motion convolutional neural networks used to extract high-dimensional features from the video frame I_a and the optical-flow map I_m; x_a and x_m are 2048-dimensional vectors representing the extracted appearance and motion features; w_a, w_m denote the trainable internal parameters of the two networks; and x denotes a high-dimensional feature extracted by a convolutional neural network;
Step 2: initialize the memory unit M as empty, denoted M_0; suppose that at video frame t the memory unit M_t is non-empty and contains N_t > 0 elements, denoted m_1, m_2, ..., m_{N_t}; the memory read operation at the corresponding moment is then as follows:
where the read-out mh_t represents the historical information of the video before time t;
Step 3: extract the short-term contextual features of the video content with a segmented recurrent neural network; taking the high-dimensional semantic feature x from Step 1 as input, denote the feature at video frame t as x_t; initialize the hidden states h_0, c_0 of the long short-term memory network (LSTM) to zero; the short-term contextual feature at time t is then computed as follows:
where LSTM(·) denotes the long short-term memory network, and h_{t-1}, c_{t-1} denote the hidden states of the recurrent network at the previous moment; the result serves as the short-term contextual feature of the video content for subsequent computation;
Step 4: for each video frame, input the high-dimensional semantic feature x_t, the memory-unit history mh_t, and the short-term contextual feature computed in Steps 1-3 into the memory-unit controller, and compute the binarized memory write instruction s_t ∈ {0, 1} as follows:
a_t = σ(q_t)   (6)
s_t = τ(a_t)   (7)
where v^T is a learnable row-vector parameter, W_f, W_c, W_m are learnable weight parameters, and b_s is a bias parameter; the sigmoid function σ(·) normalizes the linearly weighted score q_t into (0, 1), i.e. a_t ∈ (0, 1); a_t is fed to the thresholded binarization function τ(·) to obtain the binarized memory write instruction s_t;
Step 5: based on the binarized memory write instruction s_t, update the memory unit and the segmented recurrent neural network; for each video frame, the update strategy of the memory unit M_t is as follows:
where W_w is a learnable weight matrix that converts the high-dimensional semantic feature x_t into a memory element by multiplication; writing this element into M_{t-1} forms the new memory unit M_t; in addition, the hidden states h_t, c_t of the segmented recurrent neural network are updated as follows:
where the input is the result computed by formula (4);
Step 6: perform behavior classification with the memory unit; suppose the video length is T, and the memory unit after the whole video has been processed is M_T, containing N_T elements; the feature representation f of the whole video is then as follows:
where f is a D-dimensional vector representing the behavior-class information of the video; this feature is fed to a fully connected classification layer to obtain the behavior class scores y:
y = softmax(Wf)   (12)
where W ∈ R^{C×D} and C is the number of recognizable behavior classes; the computed y gives the system's score for each class, a higher score indicating that the behavior more likely belongs to that class; let y_a and y_m denote the scores obtained from the appearance and motion networks, respectively; the final score y_f is then
y_f = y_a + y_m   (13)
where y_f gives the final human behavior recognition result.
CN201811569882.8A 2018-12-21 2018-12-21 Behavior recognition method based on memory cell reinforcement-time sequence dynamic learning Active CN109753897B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811569882.8A CN109753897B (en) 2018-12-21 2018-12-21 Behavior recognition method based on memory cell reinforcement-time sequence dynamic learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811569882.8A CN109753897B (en) 2018-12-21 2018-12-21 Behavior recognition method based on memory cell reinforcement-time sequence dynamic learning

Publications (2)

Publication Number Publication Date
CN109753897A true CN109753897A (en) 2019-05-14
CN109753897B CN109753897B (en) 2022-05-27

Family

ID=66403877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811569882.8A Active CN109753897B (en) 2018-12-21 2018-12-21 Behavior recognition method based on memory cell reinforcement-time sequence dynamic learning

Country Status (1)

Country Link
CN (1) CN109753897B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407889A (en) * 2016-08-26 2017-02-15 上海交通大学 Video human body interaction motion identification method based on optical flow graph depth learning model
CN106845351A (en) * 2016-05-13 2017-06-13 苏州大学 It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term
CN106934352A (en) * 2017-02-28 2017-07-07 华南理工大学 A kind of video presentation method based on two-way fractal net work and LSTM
US20170255832A1 (en) * 2016-03-02 2017-09-07 Mitsubishi Electric Research Laboratories, Inc. Method and System for Detecting Actions in Videos
CN107330362A (en) * 2017-05-25 2017-11-07 北京大学 A kind of video classification methods based on space-time notice
CN108681712A (en) * 2018-05-17 2018-10-19 北京工业大学 A kind of Basketball Match Context event recognition methods of fusion domain knowledge and multistage depth characteristic
CN108805080A (en) * 2018-06-12 2018-11-13 上海交通大学 Multi-level depth Recursive Networks group behavior recognition methods based on context


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LIMIN WANG ET AL.: "Temporal Segment Networks: Towards Good Practices for Deep Action Recognition", 《ARXIV》 *
LIN SUN ET AL.: "Lattice Long Short-Term Memory for Human Action Recognition", 《2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION》 *
LINCHAO ZHU: "Bidirectional Multirate Reconstruction for Temporal Modeling in Videos", 《2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
QIAO Qingwei: "Human Action Recognition Fusing Dual Spatio-Temporal Network Streams and an Attention Mechanism", China Masters' Theses Full-text Database, Information Science and Technology Series *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110135345A (en) * 2019-05-15 2019-08-16 武汉纵横智慧城市股份有限公司 Activity recognition method, apparatus, equipment and storage medium based on deep learning
CN110348567A (en) * 2019-07-15 2019-10-18 北京大学深圳研究生院 A kind of memory network method integrated based on automatic addressing and recurrence information
CN110348567B (en) * 2019-07-15 2022-10-25 北京大学深圳研究生院 Memory network method based on automatic addressing and recursive information integration
CN110852273A (en) * 2019-11-12 2020-02-28 重庆大学 Behavior identification method based on reinforcement learning attention mechanism
CN111401149A (en) * 2020-02-27 2020-07-10 西北工业大学 Lightweight video behavior identification method based on long-short-term time domain modeling algorithm
CN111401149B (en) * 2020-02-27 2022-05-13 西北工业大学 Lightweight video behavior identification method based on long-short-term time domain modeling algorithm
CN111639548A (en) * 2020-05-11 2020-09-08 华南理工大学 Door-based video context multi-modal perceptual feature optimization method
CN112926453A (en) * 2021-02-26 2021-06-08 电子科技大学 Examination room cheating behavior analysis method based on motion feature enhancement and long-term time sequence modeling
CN112926453B (en) * 2021-02-26 2022-08-05 电子科技大学 Examination room cheating behavior analysis method based on motion feature enhancement and long-term time sequence modeling
CN112633260A (en) * 2021-03-08 2021-04-09 北京世纪好未来教育科技有限公司 Video motion classification method and device, readable storage medium and equipment
CN112633260B (en) * 2021-03-08 2021-06-22 北京世纪好未来教育科技有限公司 Video motion classification method and device, readable storage medium and equipment

Also Published As

Publication number Publication date
CN109753897B (en) 2022-05-27

Similar Documents

Publication Publication Date Title
CN109753897A (en) Based on memory unit reinforcing-time-series dynamics study Activity recognition method
CN107679526B (en) Human face micro-expression recognition method
Qi et al. StagNet: An attentive semantic RNN for group activity and individual action recognition
CN110428428B (en) Image semantic segmentation method, electronic equipment and readable storage medium
CN107506712B (en) Human behavior identification method based on 3D deep convolutional network
Hasani et al. Spatio-temporal facial expression recognition using convolutional neural networks and conditional random fields
Cui et al. Efficient human motion prediction using temporal convolutional generative adversarial network
CN106845499A (en) A kind of image object detection method semantic based on natural language
Hu et al. Learning activity patterns using fuzzy self-organizing neural network
CN110414498B (en) Natural scene text recognition method based on cross attention mechanism
Cui Applying gradient descent in convolutional neural networks
CN112784763B (en) Expression recognition method and system based on local and overall feature adaptive fusion
CN107480726A (en) A kind of Scene Semantics dividing method based on full convolution and shot and long term mnemon
CN111125358B (en) Text classification method based on hypergraph
CN108133188A (en) A kind of Activity recognition method based on motion history image and convolutional neural networks
CN107330362A (en) A kind of video classification methods based on space-time notice
CN108509839A (en) One kind being based on the efficient gestures detection recognition methods of region convolutional neural networks
CN110334589B (en) High-time-sequence 3D neural network action identification method based on hole convolution
CN106599941A (en) Method for identifying handwritten numbers based on convolutional neural network and support vector machine
CN110399821B (en) Customer satisfaction acquisition method based on facial expression recognition
CN110378208B (en) Behavior identification method based on deep residual error network
CN113469356A (en) Improved VGG16 network pig identity recognition method based on transfer learning
CN107657233A (en) Static sign language real-time identification method based on modified single multi-target detection device
CN112464865A (en) Facial expression recognition method based on pixel and geometric mixed features
CN110490136A (en) A kind of human body behavior prediction method of knowledge based distillation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant