CN106909938A - View-angle-independent activity recognition method based on a deep learning network - Google Patents

View-angle-independent activity recognition method based on a deep learning network

Info

Publication number
CN106909938A
Authority
CN
China
Prior art keywords
deep learning
model
viewing angle
learning network
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710082263.5A
Other languages
Chinese (zh)
Other versions
CN106909938B (en)
Inventor
王传旭
胡国锋
刘继超
杨建滨
孙海峰
崔雪红
李辉
刘云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Shengruida Technology Co ltd
Original Assignee
Qingdao University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University of Science and Technology
Priority to CN201710082263.5A
Publication of CN106909938A
Application granted
Publication of CN106909938B
Active legal-status Current
Anticipated expiration legal-status

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present invention proposes a view-angle-independent activity recognition method based on a deep learning network, comprising the following steps: the video frame images under a given viewing angle are input, and low-level features are extracted and processed using deep learning; the obtained low-level features are modeled in chronological order to form a cuboid model; the cuboid models of all viewing angles are converted into a view-invariant cylindrical feature-space mapping, which is then input to a classifier for training, yielding a view-independent classifier of video behavior. The technical scheme of the invention analyzes human behavior under multiple viewing angles with a deep learning network and improves the robustness of the classification model; it is especially suited to training and learning on big data, where its advantages are best exploited.

Description

View-angle-independent activity recognition method based on a deep learning network
Technical field
The present invention relates to the technical field of computer vision, and in particular to a view-angle-independent activity recognition method based on a deep learning network.
Background art
With the rapid development of information technology, computer vision, together with emerging concepts such as VR, AR and artificial intelligence, has entered its best period of development, and video behavior analysis, as one of the most important topics in the computer vision field, has attracted growing attention from scholars at home and abroad. Video behavior analysis plays a large role in fields such as video surveillance, human-computer interaction, medical nursing and video retrieval; in currently popular projects such as driverless cars, it remains highly challenging. Owing to the complexity and diversity of human actions, compounded by factors such as self-occlusion of the human body across multiple viewing angles, multiple scales, and rotation and translation of the viewpoint, video activity recognition is very difficult. How to accurately recognize and analyze human behavior observed from multiple angles in real life has always been an important research topic, and the demands placed on behavior analysis keep rising.
Traditional research methods include the following:
Methods based on spatio-temporal interest points: spatio-temporal interest points are extracted from the video frame images, then modeled and analyzed, and finally classified.
Methods based on the human skeleton: human skeleton information is extracted by an algorithm or a depth camera, the behavior is described and modeled from this skeleton information, and the video behavior is then classified.
Behavior analysis methods based on spatio-temporal interest points and skeleton information achieved notable results under the traditional single-view or single-person settings. However, with the emergence of a series of complex problems in high-traffic areas such as streets, airports and stations, including occlusion of the human body, illumination variation and viewpoint changes, these two kinds of analysis methods alone often fail to meet practical requirements in real life, and the robustness of the algorithms is sometimes poor.
Summary of the invention
In order to overcome the above defects of the prior art, the present invention proposes a view-angle-independent activity recognition method based on a deep learning network, which analyzes human behavior under multiple viewing angles with a deep learning network and improves the robustness of the classification model; deep learning networks are especially suited to training and learning on big data, where their advantages are best exploited.
The technical scheme of the invention is realized as follows:
A view-angle-independent activity recognition method based on a deep learning network comprises a training process, in which a classifier is trained from a training sample set, and an identification process, in which the classifier is used to recognize test samples;
The training process comprises the following steps:
S1) the video frame images Image 1 to Image i under a given viewing angle are input in chronological order;
S2) low-level features are extracted from the images input in step S1) using a CNN (Convolutional Neural Network) and pooled, and the pooled low-level features are enhanced using an STN (Spatial Transformer Network);
S3) the enhanced feature maps from step S2) are pooled and input to an RNN (Recurrent Neural Network) layer for temporal modeling, yielding a temporally associated cuboid model;
S4) steps S1) to S3) are repeated to obtain the spatial cuboid models of the same behavior under multiple viewing angles; the spatial cuboid models of the individual viewing angles are converted into a view-invariant cylindrical feature-space mapping, which is input to the classifier for training as a training sample of that behavior class;
S5) the above steps are repeated to obtain the view-independent classifiers of the various behaviors;
The identification process comprises the following steps:
S6) the video frame images under a given viewing angle are input and subjected to low-level feature extraction and modeling according to steps S1) to S3), yielding the spatial cuboid model under that viewing angle;
S7) the spatial cuboid model obtained in step S6) is converted into a view-invariant cylindrical feature-space mapping, which is input to the classifier for identification to obtain the video behavior category.
In the above technical scheme, step S2) preferably extracts the low-level features with a three-layer convolution operation; steps S2) and S3) preferably reduce the dimensionality of the feature maps by max pooling.
In the above technical scheme, step S3) yields the spatial cuboid model of the same behavior under a single viewing angle; steps S1) to S3) are carried out repeatedly to obtain the spatial cuboid models of the same behavior under multiple viewing angles.
In the technical scheme of the invention, an LSTM (Long Short-Term Memory) network is preferably used for the temporal modeling: because the back-propagation of a deep learning network relies on stochastic gradient descent, the special gate operations of the LSTM can prevent the vanishing-gradient problem in each layer.
In the above technical scheme, step S4) specifically comprises:
S41) steps S1) to S3) are repeated to obtain the spatial cuboid model of each viewing angle of the same behavior, and the models are integrated into a cylindrical space with x, y, z as coordinate axes, the cylindrical space representing the trajectory description of the motion features under each viewing angle;
S42) a polar coordinate transform is applied to the model obtained in step S41), yielding an angle-invariant cylindrical space mapping.
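The formula image of the original filing is not reproduced in this text. A plausible reconstruction, assuming the standard Cartesian-to-cylindrical mapping (x, y, z) to (r, θ, z) implied by the surrounding description, is:

```latex
% Hedged reconstruction; the original formula image is unavailable.
\begin{aligned}
r      &= \sqrt{x^{2} + y^{2}} \\
\theta &= \arctan\!\left(y / x\right) \\
z      &= z
\end{aligned}
```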
The above technical scheme preferably further comprises: S0) building a data set; the present invention preferably employs the IXMAS data set.
Compared with the prior art, the technical scheme of the invention differs in the following respects:
1. Low-level features are extracted with a CNN, yielding global features rather than the key points obtained by conventional methods.
2. The obtained global features are enhanced with the STN method instead of being modeled directly.
3. Temporal modeling of the enhanced, dimension-reduced global features is performed with an LSTM network, adding the important temporal information and making the features temporally associated.
4. A polar coordinate transform is applied to the spatial cuboid models of the individual viewing angles of the same behavior, yielding an angle-invariant cylindrical space mapping; training and classification are then completed by a CNN.
The advantage of the invention is that the CNN yields global high-level features which, enhanced by the STN, are robust to real-life video; temporal information is then established with an RNN network, and finally a polar coordinate transform fuses the features of the different viewing angles. The resulting angle-invariant descriptors are trained and classified with a CNN, without the traditional skeleton and key-point extraction, so the global features are more comprehensive; the RNN network captures the inter-frame temporal information, so the behavior description is more complete and more widely applicable.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical schemes of the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative labor.
Fig. 1 is a schematic flow chart of the training process of the invention;
Fig. 2 is a schematic flow chart of the identification process of the invention;
Fig. 3 is a schematic flow chart of general human behavior recognition;
Fig. 4 is a simplified flow chart of low-level feature extraction and modeling;
Fig. 5 is a processing flow chart of a general CNN;
Fig. 6 is a simplified structural diagram of a general RNN;
Fig. 7 is a block diagram of an LSTM;
Fig. 8 is a flow chart of the integrated classification of the viewing angles;
Fig. 9 is a schematic diagram of the Motion History Volume of Fig. 8 after the polar coordinate transform.
Specific embodiments
The technical schemes in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some, rather than all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative labor fall within the protection scope of the invention.
As shown in Fig. 1 and Fig. 2, the view-angle-independent activity recognition method based on a deep learning network of the invention comprises a training process, in which a classifier is trained from a training sample set, and an identification process, in which the classifier is used to recognize test samples;
The training process, shown in Fig. 1, comprises the following steps:
S1) the video frame images Image 1 to Image i under a given viewing angle are input in chronological order;
S2) low-level features are extracted from the images input in step S1) using a CNN and pooled, and the pooled low-level features are enhanced using an STN;
S3) the enhanced feature maps from step S2) are pooled and input to an RNN for temporal modeling, yielding a temporally associated cuboid model;
S4) steps S1) to S3) are repeated to obtain the spatial cuboid models of the same behavior under multiple viewing angles; the spatial cuboid models of the individual viewing angles are converted into a view-invariant cylindrical feature-space mapping, which is input to the classifier for training as a training sample of that behavior class;
S5) the above steps are repeated to obtain the view-independent classifiers of the various behaviors.
The identification process, shown in Fig. 2, comprises the following steps:
S6) the video frame images under a given viewing angle are input and subjected to low-level feature extraction and modeling according to steps S1) to S3), yielding the spatial cuboid model under that viewing angle;
S7) the spatial cuboid model obtained in step S6) is converted into a view-invariant cylindrical feature-space mapping, which is input to the classifier for identification to obtain the video behavior category.
In the above technical scheme, step S2) preferably extracts the low-level features with a three-layer convolution operation; steps S2) and S3) preferably reduce the dimensionality of the feature maps by max pooling.
In the above technical scheme, step S3) yields the spatial cuboid model of the same behavior under a single viewing angle; steps S1) to S3) are carried out repeatedly to obtain the spatial cuboid models of the same behavior under multiple viewing angles.
In the technical scheme of the invention, an LSTM (Long Short-Term Memory) network is preferably used for the temporal modeling: because the back-propagation of a deep learning network relies on stochastic gradient descent, the special gate operations of the LSTM can prevent the vanishing-gradient problem in each layer.
In the above technical scheme, step S4) specifically comprises:
S41) steps S1) to S3) are repeated to obtain the spatial cuboid model of each viewing angle of the same behavior, and the models are integrated into a cylindrical space with x, y, z as coordinate axes, the cylindrical space representing the trajectory description of the motion features under each viewing angle;
S42) the polar coordinate transform given above is applied to the model obtained in step S41), yielding an angle-invariant cylindrical space mapping.
The above technical scheme further comprises: S0) building a data set.
The present invention preferably employs the IXMAS data set, which contains five different viewing angles and 14 actions performed by each of 12 persons, with each action repeated three times. Eleven of the persons serve as the training data set and the remaining person as the test data set.
Specifically, to recognize the behavior "running", for example, the running videos of the 12 persons under the five viewing angles are collected first; the running videos of 11 persons serve as the training data set and the remaining person as the validation data set. The video frame images of one person's running video under one viewing angle are first processed according to steps S1) to S3), finally yielding the temporally associated cuboid model of the "running" video behavior under that viewing angle, i.e. the spatial cuboid model of the "running" behavior under that viewing angle; steps S1) to S3) are then repeated to obtain, in turn, the spatial cuboid models of the "running" behavior under the other four viewing angles. The spatial cuboid models of the "running" behavior under the five viewing angles are converted into a view-invariant cylindrical feature-space mapping, which is input for classifier training as that person's training sample of the "running" behavior class. After training with the training samples of several different persons, the view-independent classifier of the "running" behavior is obtained. View-independent classifiers of the various video behaviors can be built in the same way. (A sketch of this data split is given below.)
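As a concrete illustration of this leave-one-subject-out protocol, the following is a minimal sketch assuming a hypothetical per-subject sample index; the names Sample and samples_by_subject are illustrative, not from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Sample:
    frames: list      # video frame images, in chronological order
    view: int         # viewing angle index, 0..4 for IXMAS
    action: str       # e.g. "running"

def leave_one_subject_out(samples_by_subject: Dict[str, List[Sample]],
                          test_subject: str):
    """IXMAS-style split: 11 subjects for training, 1 held out for testing."""
    train = [s for subj, samples in samples_by_subject.items()
             if subj != test_subject for s in samples]
    test = samples_by_subject[test_subject]
    return train, test
```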
During identification, the above steps S6) and S7) are performed: the video frame images of one person in the test sample under a certain viewing angle are first processed according to steps S1) to S3) to obtain the spatial cuboid model of the behavior under that viewing angle, which is then converted by the polar coordinate transform into a cylindrical feature-space mapping and input to the classifier to identify the behavior category. The identification process for the other viewing angles is the same.
For a better understanding and illustration of the technical scheme of the invention, the relevant techniques involved in the above technical scheme are explained and analyzed in detail below.
The model of the method of the invention comprises two main stages: the first extracts and models the low-level features, and the second fuses and classifies the viewing angles. The main innovative work is as follows.
The general flow of human behavior recognition is shown in Fig. 3. The feature extraction and feature representation stages in the figure are the emphasis of behavior recognition; the results of these stages ultimately determine the identification accuracy and the robustness of the algorithm. The present invention performs the feature extraction with deep learning.
Fig. 4 shows a simplified flow chart of low-level feature extraction and modeling.
In the technical scheme of the invention, the deep learning framework used is Caffe. The video frames Image 1 to Image i under a given viewing angle in Fig. 4 are input to the network in chronological order. Features are first extracted from the input images with a CNN and then enhanced with an STN so that they acquire a certain robustness to translation, scale variation and angle change; the feature maps are then pooled (max pooling is used here), and the pooled feature maps are input to the RNN layer for temporal modeling, finally yielding feature map sequences with inter-frame temporal relevance.
Concretely, the technical scheme of the invention extracts the low-level features with a three-layer convolution operation and then reduces their dimensionality by max pooling. The pooled feature maps are input to the STN layer for feature enhancement; the function of the STN network is to make the features robust to translation, rotation and scale variation. The feature maps output by the STN are max-pooled for a second dimensionality reduction and then input to the RNN network, which inserts the temporal information; finally, the resulting feature maps are combined into spatial cuboids in chronological order. The RNN network used in the invention is an LSTM network: because the back-propagation of a deep learning network relies on stochastic gradient descent, the special gate operations of the LSTM can prevent the vanishing-gradient problem in each layer. (A sketch of this pipeline is given below.)
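For illustration only, the following is a minimal sketch of the CNN, STN, pooling and LSTM pipeline just described, written in PyTorch rather than the Caffe framework the patent names; all layer sizes, module names and the 64x64 input resolution are assumptions, not values from the filing.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STN(nn.Module):
    """Spatial Transformer: predicts an affine warp and resamples the feature map."""
    def __init__(self, channels):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(channels, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, 6),
        )
        # start from the identity transform
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

class ViewPipeline(nn.Module):
    """Per-view pipeline: 3-layer CNN -> max pool -> STN -> max pool -> LSTM."""
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(                      # three-layer convolution
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                           # first dimensionality reduction
        )
        self.stn = STN(32)
        self.pool = nn.MaxPool2d(2)                    # second dimensionality reduction
        self.lstm = nn.LSTM(input_size=32 * 16 * 16, hidden_size=hidden,
                            batch_first=True)

    def forward(self, frames):                         # frames: (B, T, 3, 64, 64)
        b, t = frames.shape[:2]
        x = self.cnn(frames.flatten(0, 1))             # per-frame low-level features
        x = self.pool(self.stn(x)).flatten(1)          # enhance, pool, vectorize
        out, _ = self.lstm(x.view(b, t, -1))           # temporal modeling
        return out                                     # (B, T, hidden)
```

Under these assumptions, a forward pass on frames of shape (2, 10, 3, 64, 64) returns a (2, 10, 128) feature sequence, which plays the role of the temporally associated cuboid model for one viewing angle.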
In the above technical scheme, the CNN is an efficient recognition method developed in recent years that has attracted wide attention. In the 1960s, while studying the neurons responsible for local sensitivity and direction selection in the cat's visual cortex, Hubel and Wiesel found that a unique network structure could effectively reduce the complexity of a feedback neural network, and the CNN was subsequently proposed. Nowadays the CNN has become one of the research hotspots of numerous scientific fields; particularly in pattern classification, since the network avoids complicated image pre-processing and can take the original image directly as input, it has found wide application.
Generally, the basic structure of a CNN comprises two kinds of layers. The first is the feature extraction layer: the input of each neuron is connected to the local receptive field of the previous layer, and the local feature is extracted; once a local feature is extracted, its positional relationship to the other features is also determined. The second is the feature mapping layer: each computational layer of the network is composed of multiple feature maps, each feature map is a plane, and the weights of all neurons in a plane are equal.
In the technical scheme of the invention, the feature mapping layers are used to extract the global low-level features from the video frame images, after which the low-level features are processed at deeper levels.
The generalized processing flow of a CNN is shown in Fig. 5.
The layers used in the technical scheme of the invention are those producing the feature maps obtained after convolution; the pooling and fully connected layers that follow are ignored. A CNN obtains the feature information of a single image, whereas video information is to be processed here, so temporal information must be introduced; simply using a CNN cannot meet the requirements of processing video behavior.
In the above technical scheme, the RNN, or recurrent neural network, was developed on the basis of feed-forward neural networks (FNNs). Unlike traditional FNNs, the RNN introduces directed cycles and can handle problems in which successive inputs are correlated. The RNN comprises input units, whose input set is denoted {x_0, x_1, ..., x_{t-1}, x_t, x_{t+1}, ...}, and output units, whose output set is denoted {o_0, o_1, ..., o_{t-1}, o_t, o_{t+1}, ...}. The RNN also comprises hidden units, whose output set is denoted {s_0, s_1, ..., s_{t-1}, s_t, s_{t+1}, ...}; these hidden units do the main work.
Fig. 6 shows the simplified structure of a general RNN. In Fig. 6, one one-way information flow runs from the input units to the hidden units, while another one-way information flow runs from the hidden units to the output units. In some cases the RNN breaks the latter limitation and guides information from the output units back to the hidden units; these connections are called "back projections". Moreover, the input of the hidden layer also includes the state of the previous hidden layer, i.e. the nodes within the hidden layer can be self-connected as well as interconnected, so that s_t = f(U x_t + W s_{t-1}). The linking of temporal information is thus achieved inside the hidden layer, and no extra handling of temporal information is needed; this is a great advantage of the RNN when processing video behavior features. Consequently, in deep learning, processing that involves timing information is generally handed to an RNN.
A new model for processing temporal information was in turn developed on the basis of the RNN: Long Short-Term Memory (LSTM). Because of the stochastic gradient descent used in the back-propagation of deep learning networks, the RNN suffers from a vanishing-gradient problem: the perception that later time nodes have of earlier time nodes decreases. The LSTM therefore introduces a core element, the cell. A rough block diagram of the LSTM is shown in Fig. 7.
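For reference, the gate operations the description refers to are the well-known LSTM equations (not reproduced in the patent itself); in standard notation:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The additive update of the cell state c_t is what allows gradients to propagate across many time steps without vanishing.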
Fig. 8 shows the flow chart of the integrated classification of the viewing angles.
According to the method of Fig. 4, the spatial cuboid models of the same action under multiple viewing angles are obtained; the spatial cuboid models of the individual viewing angles are then integrated into a cylindrical space with x, y, z as coordinate axes, the cylindrical space representing the trajectory description of the motion features under each viewing angle; a polar coordinate transform is then applied mathematically, carrying the space into one with r, θ, z as coordinate axes, using the formula reconstructed above.
An angle-invariant cylindrical space mapping (Invariant Cylinder Space Map) is thereby obtained, and finally the resulting cylindrical space mapping is input to the classifier to obtain the behavior category. A CNN rather than an SVM classifier is used for the classification here, because the CNN was originally devised for classification. The Motion History Volume of Fig. 8 and the model after the polar coordinate transform are shown in Fig. 9. (A sketch of the cylindrical fusion step is given below.)
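To make the fusion step concrete, the following is a minimal sketch, assuming NumPy, of accumulating (x, y, z) feature-trajectory points into an (r, θ, z) cylindrical map via the Cartesian-to-cylindrical transform reconstructed above; the function name and binning resolutions are illustrative, not from the patent.

```python
import numpy as np

def cylinder_map(points, r_bins=16, theta_bins=36, z_bins=16):
    """Accumulate (x, y, z) feature points into an (r, theta, z) histogram.

    points: (N, 3) array of feature-trajectory coordinates from one or
    more viewing angles, already registered to a common origin.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2)                 # radial distance
    theta = np.arctan2(y, x)                 # angle in (-pi, pi]
    hist, _ = np.histogramdd(
        np.stack([r, theta, z], axis=1),
        bins=(r_bins, theta_bins, z_bins),
        range=[(0, r.max() + 1e-9), (-np.pi, np.pi), (z.min(), z.max() + 1e-9)],
    )
    return hist

# Fusing several views: concatenate their point sets before binning to
# build the cylindrical space mapping described in the patent.
```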
The low-level information extracted with the deep learning method of the technical scheme of the invention is of a higher level than the spatio-temporal interest points and skeleton information of conventional methods, and its robustness is better.
The above are only the preferred embodiments of the present invention and are not intended to limit the invention; any modification, equivalent substitution, improvement and the like made within the spirit and principle of the invention shall be included within the protection scope of the invention.

Claims (6)

1. A view-angle-independent activity recognition method based on a deep learning network, comprising a training process, in which a classifier is trained from a training sample set, and an identification process, in which the classifier is used to recognize test samples, characterized in that:
The training process comprises the following steps:
S1) the video frame images Image 1 to Image i under a given viewing angle are input in chronological order;
S2) low-level features are extracted from the images input in step S1) using a CNN and pooled, and the pooled low-level features are enhanced using an STN;
S3) the enhanced feature maps from step S2) are pooled and input to an RNN for temporal modeling, yielding a temporally associated cuboid model;
S4) steps S1) to S3) are repeated to obtain the spatial cuboid models of the same behavior under multiple viewing angles; the spatial cuboid models of the individual viewing angles are converted into a view-invariant cylindrical feature-space mapping, which is input to the classifier for training as a training sample of that behavior class;
S5) the above steps are repeated to obtain the view-independent classifiers of the various behaviors;
The identification process comprises the following steps:
S6) the video frame images under a given viewing angle are input and subjected to low-level feature extraction and modeling according to steps S1) to S3), yielding the spatial cuboid model under that viewing angle;
S7) the spatial cuboid model obtained in step S6) is converted into a view-invariant cylindrical feature-space mapping, which is input to the classifier for identification to obtain the video behavior category.
2. The view-angle-independent activity recognition method based on a deep learning network according to claim 1, characterized in that:
step S2) extracts the low-level features with a three-layer convolution operation.
3. The view-angle-independent activity recognition method based on a deep learning network according to claim 2, characterized in that:
steps S2) and S3) reduce the dimensionality of the feature maps by max pooling.
4. The view-angle-independent activity recognition method based on a deep learning network according to claim 1, characterized in that:
step S3) performs the temporal modeling with an LSTM network.
5. The view-angle-independent activity recognition method based on a deep learning network according to claim 1, characterized in that step S4) specifically comprises:
S41) steps S1) to S3) are repeated to obtain the spatial cuboid model of each viewing angle of the same behavior, and the models are integrated into a cylindrical space with x, y, z as coordinate axes, the cylindrical space representing the trajectory description of the motion features under each viewing angle;
S42) a polar coordinate transform is applied to the model obtained in step S41), yielding an angle-invariant cylindrical space mapping.
6. The view-angle-independent activity recognition method based on a deep learning network according to claim 1, characterized in that it further comprises:
S0) building a data set.
CN201710082263.5A 2017-02-16 2017-02-16 View-angle-independent activity recognition method based on a deep learning network Active CN106909938B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710082263.5A CN106909938B (en) 2017-02-16 2017-02-16 View-angle-independent activity recognition method based on a deep learning network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710082263.5A CN106909938B (en) 2017-02-16 2017-02-16 View-angle-independent activity recognition method based on a deep learning network

Publications (2)

Publication Number Publication Date
CN106909938A 2017-06-30
CN106909938B CN106909938B (en) 2020-02-21

Family

ID=59208388

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710082263.5A Active CN106909938B (en) 2017-02-16 2017-02-16 View-angle-independent activity recognition method based on a deep learning network

Country Status (1)

Country Link
CN (1) CN106909938B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463878A (en) * 2017-07-05 2017-12-12 成都数联铭品科技有限公司 Human bodys' response system based on deep learning
CN107609541A (en) * 2017-10-17 2018-01-19 哈尔滨理工大学 A kind of estimation method of human posture based on deformable convolutional neural networks
CN107679522A (en) * 2017-10-31 2018-02-09 内江师范学院 Action identification method based on multithread LSTM
CN108121961A (en) * 2017-12-21 2018-06-05 华自科技股份有限公司 Inspection Activity recognition method, apparatus, computer equipment and storage medium
CN108764050A (en) * 2018-04-28 2018-11-06 中国科学院自动化研究所 Skeleton Activity recognition method, system and equipment based on angle independence
CN112287754A (en) * 2020-09-23 2021-01-29 济南浪潮高新科技投资发展有限公司 Violence detection method, device, equipment and medium based on neural network
CN112686111A (en) * 2020-12-23 2021-04-20 中国矿业大学(北京) Attention mechanism-based multi-view adaptive network traffic police gesture recognition method
CN113111721A (en) * 2021-03-17 2021-07-13 同济大学 Human behavior intelligent identification method based on multi-unmanned aerial vehicle visual angle image data driving
CN113239819A (en) * 2021-05-18 2021-08-10 西安电子科技大学广州研究院 Visual angle normalization-based skeleton behavior identification method, device and equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1218936A (en) * 1997-09-26 1999-06-09 松下电器产业株式会社 Hand gesture identifying device
CN101216896A (en) * 2008-01-14 2008-07-09 浙江大学 An identification method for movement by human bodies irrelevant with the viewpoint based on stencil matching
CN103310233A (en) * 2013-06-28 2013-09-18 青岛科技大学 Similarity mining method of similar behaviors between multiple views and behavior recognition method
CN105956560A (en) * 2016-05-06 2016-09-21 电子科技大学 Vehicle model identification method based on pooling multi-scale depth convolution characteristics
CN106203283A (en) * 2016-06-30 2016-12-07 重庆理工大学 Based on Three dimensional convolution deep neural network and the action identification method of deep video

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1218936A (en) * 1997-09-26 1999-06-09 松下电器产业株式会社 Hand gesture identifying device
CN101216896A (en) * 2008-01-14 2008-07-09 浙江大学 An identification method for movement by human bodies irrelevant with the viewpoint based on stencil matching
CN103310233A (en) * 2013-06-28 2013-09-18 青岛科技大学 Similarity mining method of similar behaviors between multiple views and behavior recognition method
CN105956560A (en) * 2016-05-06 2016-09-21 电子科技大学 Vehicle model identification method based on pooling multi-scale depth convolution characteristics
CN106203283A (en) * 2016-06-30 2016-12-07 重庆理工大学 Based on Three dimensional convolution deep neural network and the action identification method of deep video

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ALEXANDROS: "View-independent human action recognition based on multi-view action images and discriminant learning", 《IVMSP 2013》 *
JEFF DONAHUE: "Long-term Recurrent Convolutional Networks for Visual Recognition and Description", 《IEEE》 *
MYUNG-CHEOL ROH: "View-independent human action recognition with Volume Motion Template", 《PATTERN RECOGNITION LETTERS》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463878A (en) * 2017-07-05 2017-12-12 成都数联铭品科技有限公司 Human bodys' response system based on deep learning
CN107609541A (en) * 2017-10-17 2018-01-19 哈尔滨理工大学 A kind of estimation method of human posture based on deformable convolutional neural networks
CN107609541B (en) * 2017-10-17 2020-11-10 哈尔滨理工大学 Human body posture estimation method based on deformable convolution neural network
CN107679522B (en) * 2017-10-31 2020-10-13 内江师范学院 Multi-stream LSTM-based action identification method
CN107679522A (en) * 2017-10-31 2018-02-09 内江师范学院 Action identification method based on multithread LSTM
CN108121961A (en) * 2017-12-21 2018-06-05 华自科技股份有限公司 Inspection Activity recognition method, apparatus, computer equipment and storage medium
CN108764050A (en) * 2018-04-28 2018-11-06 中国科学院自动化研究所 Skeleton Activity recognition method, system and equipment based on angle independence
CN108764050B (en) * 2018-04-28 2021-02-26 中国科学院自动化研究所 Method, system and equipment for recognizing skeleton behavior based on angle independence
CN112287754A (en) * 2020-09-23 2021-01-29 济南浪潮高新科技投资发展有限公司 Violence detection method, device, equipment and medium based on neural network
CN112686111A (en) * 2020-12-23 2021-04-20 中国矿业大学(北京) Attention mechanism-based multi-view adaptive network traffic police gesture recognition method
CN113111721A (en) * 2021-03-17 2021-07-13 同济大学 Human behavior intelligent identification method based on multi-unmanned aerial vehicle visual angle image data driving
CN113111721B (en) * 2021-03-17 2022-07-05 同济大学 Human behavior intelligent identification method based on multi-unmanned aerial vehicle visual angle image data driving
CN113239819A (en) * 2021-05-18 2021-08-10 西安电子科技大学广州研究院 Visual angle normalization-based skeleton behavior identification method, device and equipment
CN113239819B (en) * 2021-05-18 2022-05-03 西安电子科技大学广州研究院 Visual angle normalization-based skeleton behavior identification method, device and equipment

Also Published As

Publication number Publication date
CN106909938B (en) 2020-02-21

Similar Documents

Publication Publication Date Title
CN106909938A (en) Viewing angle independence Activity recognition method based on deep learning network
Liao et al. Deep facial spatiotemporal network for engagement prediction in online learning
Ahmed The impact of filter size and number of filters on classification accuracy in CNN
CN106709461B (en) Activity recognition method and device based on video
CN107273800B (en) Attention mechanism-based motion recognition method for convolutional recurrent neural network
CN111652066B (en) Medical behavior identification method based on multi-self-attention mechanism deep learning
CN108830157A (en) Human bodys' response method based on attention mechanism and 3D convolutional neural networks
CN107609460A (en) A kind of Human bodys' response method for merging space-time dual-network stream and attention mechanism
CN109902546A (en) Face identification method, device and computer-readable medium
CN109902798A (en) The training method and device of deep neural network
Yan et al. Multi-attributes gait identification by convolutional neural networks
CN107679491A (en) A kind of 3D convolutional neural networks sign Language Recognition Methods for merging multi-modal data
CN110222634A (en) A kind of human posture recognition method based on convolutional neural networks
CN112101241A (en) Lightweight expression recognition method based on deep learning
CN106951858A (en) A kind of recognition methods of personage's affiliation and device based on depth convolutional network
CN110097029B (en) Identity authentication method based on high way network multi-view gait recognition
CN106709482A (en) Method for identifying genetic relationship of figures based on self-encoder
CN106980830A (en) One kind is based on depth convolutional network from affiliation recognition methods and device
CN111950455A (en) Motion imagery electroencephalogram characteristic identification method based on LFFCNN-GRU algorithm model
CN110135244B (en) Expression recognition method based on brain-computer collaborative intelligence
CN106980831A (en) Based on self-encoding encoder from affiliation recognition methods
WO2023108873A1 (en) Brain network and brain addiction connection calculation method and apparatus
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
Zhu et al. Indoor scene segmentation algorithm based on full convolutional neural network
CN112668486A (en) Method, device and carrier for identifying facial expressions of pre-activated residual depth separable convolutional network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220114

Address after: Room 403-2, Building A2, Qingdao National University Science Park, No. 127 Huizhiqiao Road, High-tech Zone, Qingdao, Shandong, 266000

Patentee after: Qingdao shengruida Technology Co.,Ltd.

Address before: Laoshan campus, No. 99 Songling Road, Laoshan District, Qingdao, Shandong, China, 266000

Patentee before: QINGDAO University OF SCIENCE AND TECHNOLOGY