CN104732208A - Video human action recognition method based on sparse subspace clustering


Publication number
CN104732208A
Authority
CN
China
Prior art keywords
video
human body
behavioural characteristic
human
steps
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510114150.XA
Other languages
Chinese (zh)
Other versions
CN104732208B (en)
Inventor
郝宗波
桑楠
陆霖霖
吴杰
杨眷玉
万士宁
赵俊
朱前芳
鄢宇烈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China
Priority to CN201510114150.XA
Publication of CN104732208A
Application granted
Publication of CN104732208B
Legal status: Expired - Fee Related


Abstract

The invention belongs to the field of computer vision pattern recognition and video image processing. The method comprises two stages. In building the video human action recognition model, it constructs three-dimensional spatio-temporal sub-frame cubes, builds a human action feature space, performs clustering, and updates the labels. In recognizing human actions, it extracts three-dimensional spatio-temporal sub-frame cubes from surveillance video, extracts human action features, determines the sub-action category of each video, and merges the videos carrying sub-category labels back into their parent categories. On the Hollywood2 human action database, the method's recognition accuracy is 16.5 percentage points higher than the highest recognition accuracy currently known for that database. The method therefore automatically extracts human action features with greater discriminative power, adaptability, generality, and invariance, reduces overfitting and the gradient diffusion (vanishing gradient) problem in neural networks, and effectively improves the accuracy of human action recognition in complex environments; it can be widely applied to on-site video surveillance and video content retrieval.

Description

Video human action recognition method based on sparse subspace clustering
Technical field
The invention belongs to the field of computer vision pattern recognition and video image processing, and in particular relates to a video action recognition method that uses sparse subspace clustering (SSC) to cluster and subdivide action classes, and that splits a deep-learning neural network with many layers into several shallower deep-learning neural networks with fewer layers.
Background art
Video-based human action recognition has been a hot topic in computer vision in recent years. As a typical video understanding problem, it determines human action patterns by analyzing the human motion features in a video image sequence. More specifically, feature information that describes an action is extracted from the video image sequence, interpreted with techniques such as machine learning, and classified with a classifier in order to recognize the human action.
With the development of modern information technology and rising demand for public security, understanding human actions in daily life is increasingly needed. Human action recognition has broad application prospects in intelligent video surveillance, video content retrieval, novel human-computer interaction, virtual reality, video coding and transmission, game control, and other areas, and has attracted much attention. Video human action recognition methods fall into three categories: methods based on spatio-temporal approaches, methods based on sequences, and methods based on deep learning.
1. Human action recognition based on spatio-temporal methods treats a video as a 3D volume formed by arranging 2D images along the time axis and builds a spatio-temporal representation of it. This category includes recognition based on 3D spatio-temporal volumes, recognition based on 3D spatio-temporal local features, and recognition based on trajectories. The drawbacks of these methods are that the human action features are mostly designed by hand and depend heavily on the designer's experience, and that the computational cost is high or the adaptability is poor.
2. Human action recognition based on sequences extracts a feature vector from each frame of the video, assembles the related feature vectors into a feature sequence that characterizes the human action in the video, and classifies on that basis. A common approach is recognition based on state-model sequences: the video is represented as a sequence of states, each static human pose is defined as a state, different states are linked by probabilities, and a coherent human action is regarded as a series of transitions between these static-pose states. A probabilistic model is generated on this basis and recognition is performed by similarity. Hidden Markov models (HMMs) are the typical representative of this approach.
3. Human action recognition based on deep learning draws on biological neural theory and is an active frontier of machine learning. Its motivation is to build networks that simulate the neural networks of the human brain, that is, to imitate the cerebral cortex in interpreting data hierarchically. In recent years deep learning has been widely used in human action recognition. The method learns features automatically and directly from raw data; unlike traditional feature extraction, these features require no manual design and have greater adaptability, generality, and invariance (such as translation, scale, and rotation invariance). 3D convolutional neural networks (3D CNNs) are the typical representative of this approach: they extend the traditional convolutional neural network from the two-dimensional image space into the time domain and learn spatio-temporal features directly from the raw video sequence, replacing traditional spatio-temporal interest points and descriptors, and they achieve good recognition rates on simple human actions such as clapping and waving. Although this is currently the most popular and effective approach to human action recognition, it is prone to the overfitting common in neural networks. In addition, as the number of layers in a deep-learning neural network grows, gradient diffusion tends to occur during error back-propagation when the parameters are tuned, which impairs training; and recognition performance remains poor in more complex scenes (for example with different backgrounds, camera angles, and contexts).
Patent document CN103955671A, titled "Human action recognition method based on the fast common vector algorithm", discloses a human action recognition method that uses the fast common vector algorithm to increase classification speed and to address the small-sample problem in human action recognition. The input video sequence is first split into frames, converted to grayscale, and denoised; moving human targets are then detected in the frames by temporal differencing and the target foreground is extracted; the target region size is normalized; key frames of the action sequence are obtained by k-means clustering; and finally the actions are classified with the fast common vector algorithm. Although the method improves recognition efficiency to some extent, addresses the small-sample problem, and achieves fairly high accuracy in ideal conditions (a simple background, no obvious noise, and so on), it relies mainly on traditional image processing, so the extracted features are quite limited and easily affected by the environment, and it performs poorly in more complex scenes.
Patent document CN103810496A, titled "3D Gaussian space human action recognition method based on image depth information", discloses a 3D Gaussian space human action recognition method based on image depth information. The method first extracts the 3D coordinates of the human skeleton from the depth information, normalizes them, and filters out joints and joint motions that contribute little to the recognition rate; it then builds a group of joints of interest for each action, performs AP clustering on the spatial features of the body's actions based on Gaussian distance to obtain a list of action feature words, and cleans the data; finally it builds a conditional random field recognition model for human actions and uses it to classify them. Although the method is robust to the orientation of the body, the skeleton size, and the spatial position, generalizes to some degree, and is suitable for recognition in fairly ideal environments, it requires a relatively costly 3D depth camera, its algorithm is relatively complex, and its performance in more complex scenes is still unsatisfactory.
Summary of the invention
The object of the invention is to address the shortcomings of the background art by designing a video human action recognition method based on sparse subspace clustering. The method automatically extracts human action features with greater discriminative power, adaptability, generality, and invariance, and reduces overfitting and gradient diffusion in the neural network, so as to effectively improve the accuracy of human action recognition in complex environments (for example with different backgrounds, camera angles, and contexts) and to be widely applicable to on-site video surveillance and video content retrieval.
The solution of the invention is as follows.
Because of factors such as camera distance, context, and background, the features of a single class of human action in a more complex scene (for example with different backgrounds, camera angles, and contexts) can usually be subdivided. After feature extraction is completed on the input human action video samples, which maps them from sample space to feature space, sparse subspace clustering (SSC) is used to cluster the features of each action class and subdivide the class into several sub-actions; the corresponding action class labels are then updated and the network is trained again. At the same time, a deep-learning neural network with many layers is split into several shallower deep-learning neural networks with fewer layers, to improve network performance and alleviate overfitting and gradient diffusion. At recognition time, the results for the sub-actions are mapped back to the original actions before the recognition rate is computed. In this way the invention further raises the recognition rate of deep-learning action recognition algorithms and achieves high recognition accuracy for human actions in more complex scenes, thereby realizing the object of the invention. The method comprises:
A. Building the video human action recognition model:
A1. Constructing 3D spatio-temporal sub-frame cubes: each frame of the videos of a given action class in the human action database used for learning is divided into sub-frames of equal size; a run of consecutive frames of the video then supplies the temporal length, which serves as the cube's thickness, to form 3D spatio-temporal sub-frame cubes; each resulting cube is given the same class label as its source video;
A2. Building the human action feature space: each 3D spatio-temporal sub-frame cube built in step A1 is fed, together with the class label of its video, into a deep-learning neural network for a first round of training, in order to extract classification features that achieve a recognition rate above 50% on the given action classes of the database; this yields the human action feature space after the first training;
A3. Clustering: in the feature space built in step A2, sparse subspace clustering (SSC) is applied separately to the features of each action class, so that each class's features are subdivided into sub-class features; the number of sub-classes is determined automatically by the SSC method;
A4. Updating the labels: according to the SSC subdivision of step A3, each sub-class of videos is assigned a sub-label under the class label of its source videos, giving the relabeled samples;
A5. Building the recognition model: the relabeled samples from step A4 are fed into the same deep-learning neural network as in step A2 for a second round of training to extract human action features further; the extracted features are then fed into a classifier, which completes the video human action recognition model; the network parameters after the second training are saved for later use;
B. Recognizing human actions:
B1. Extracting 3D spatio-temporal sub-frame cubes from the surveillance video: using the same method as step A1, cubes of the same size and number as in step A1 are extracted from each monitored human action video, and processing continues with step B2;
B2. Extracting human action features: the cubes of each video extracted in step B1 are fed into the network trained and saved in step A5 to extract the sub-action features of each video;
B3. Determining the sub-action class of each video: the sub-action features of each video from step B2 are fed into the classifier, each surveillance video is classified in turn, and videos carrying sub-class labels are obtained;
B4. Merging the videos carrying sub-class labels: the videos from step B3 are merged according to the main classes defined by the Hollywood2 human action database, giving the action class of each video, which is stored for later use.
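Steps A1 and B1 can be illustrated with a short sketch that enumerates where the sub-frame cubes sit inside a video. It is a minimal sketch rather than the patented implementation: the function names `cube_origins` and `label_cubes`, the non-overlapping tiling, and the dropping of partial cubes at the borders are assumptions, since the text only says that frames are divided into equal-sized sub-frames. The default sizes (16 × 16 pixels, 10 frames) and the 200 × 160 resolution are those given in the embodiment; the 25-frame clip length is hypothetical.

```python
def cube_origins(num_frames, height, width, patch=16, depth=10):
    """Return the (t, y, x) origin of every full cube in a video.

    Assumptions (not stated in the patent): cubes do not overlap, and
    partial cubes at the right, bottom, and temporal borders are dropped.
    """
    origins = []
    for t in range(0, num_frames - depth + 1, depth):
        for y in range(0, height - patch + 1, patch):
            for x in range(0, width - patch + 1, patch):
                origins.append((t, y, x))
    return origins


def label_cubes(origins, class_label):
    """Attach the source video's class label to each cube (step A1)."""
    return [(origin, class_label) for origin in origins]


# A hypothetical 25-frame clip at 200 x 160 pixels, labeled as class 9.
cubes = label_cubes(cube_origins(num_frames=25, height=160, width=200), 9)
print(len(cubes))
```

Because step B1 requires cubes of the same size and number as step A1, the same function with the same parameters would be reused at recognition time.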
The human action database used for learning in steps A1 and B4 is the Hollywood2, KTH, HMDB51, UCF101, or Sports-1M human action database.
The deep-learning neural network in step A2 is an independent subspace analysis (ISA) neural network.
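For orientation, a unit of an ISA network is usually described as a two-layer computation: linear filters whose squared responses are pooled within a subspace and passed through a square root. The sketch below follows that common description. The patent does not give the network's equations, so the function name `isa_activations`, the tiny weight matrices, and the pooling layout are illustrative assumptions.

```python
import math


def isa_activations(x, W, V):
    """Two-layer ISA response: p_i = sqrt(sum_k V[i][k] * (W[k] . x)^2).

    x: input vector (for example a flattened cube), W: first-layer
    filters (one row per filter), V: fixed 0/1 pooling matrix that
    groups filters into subspaces.
    """
    squared = [sum(w_kj * x_j for w_kj, x_j in zip(row, x)) ** 2 for row in W]
    return [math.sqrt(sum(v * s for v, s in zip(pool, squared))) for pool in V]


# Hypothetical toy sizes: 2 inputs, 2 filters pooled into 1 subspace.
x = [3.0, 4.0]
W = [[1.0, 0.0], [0.0, 1.0]]
V = [[1, 1]]
print(isa_activations(x, W, V))  # [5.0]
```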
The sparse subspace clustering (SSC) method in step A3 comprises the following steps:
A3-1. The features of each human action video in the feature space from step A2 are arranged as the columns of a dictionary, and a sparse coding method is used to determine the sparse coefficients (C);
A3-2. The sparse coefficients (C) are normalized;
A3-3. Forming the feature graph of each action class: the absolute value of the sparse coefficients from step A3-2 is added to its transpose to obtain the adjacency matrix; this defines a graph for the action class in which each video sample is a node and the adjacency matrix gives the edge weights;
A3-4. Clustering and subdivision: the graph from step A3-3 is clustered by the sparse subspace clustering (SSC) method into the action feature sub-classes.
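Steps A3-2 and A3-3 can be sketched in a few lines. This is a minimal sketch under stated assumptions: the patent only says that C is normalized, so dividing each column by its largest absolute entry (a choice commonly used with SSC) is an assumption, as are the function names and the toy matrix. Computing C itself (step A3-1) and the final graph clustering (step A3-4) are not shown.

```python
def normalize_columns(C):
    """Scale each column of C by its largest absolute entry (assumed rule)."""
    n = len(C)
    out = [row[:] for row in C]
    for j in range(n):
        peak = max(abs(C[i][j]) for i in range(n))
        if peak > 0:
            for i in range(n):
                out[i][j] = C[i][j] / peak
    return out


def adjacency(C):
    """Build W = |C| + |C|^T, the symmetric edge-weight matrix (step A3-3)."""
    n = len(C)
    return [[abs(C[i][j]) + abs(C[j][i]) for j in range(n)] for i in range(n)]


# Hypothetical 3-sample coefficient matrix with a zero diagonal.
C = [[0.0, 0.5, 0.0],
     [-2.0, 0.0, 0.0],
     [0.0, 0.25, 0.0]]
W = adjacency(normalize_columns(C))
print(W)
```

Taking absolute values and adding the transpose makes W symmetric and non-negative, which is what a weighted undirected graph over the video samples requires.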
The classifier in steps A5 and B3 is a Softmax classifier.
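As a reminder of what a Softmax classifier computes, the sketch below turns a vector of class scores into probabilities and picks the most likely label. The function names, the scores, and the label list are hypothetical; how the scores are produced from the ISA features is not detailed in the text.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores, labels):
    """Return the label with the highest Softmax probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best]


# Hypothetical scores for three sub-class labels.
print(predict([1.2, 3.4, 0.3], ["8.1", "8.2", "8.3"]))  # 8.2
```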
After feature extraction is completed on the input human action video samples, the invention uses sparse subspace clustering to cluster the features of each action class and subdivide the class into several sub-actions, then updates the corresponding class labels and trains again. At the same time, a deep-learning neural network with many layers is split into several shallower deep-learning neural networks with fewer layers, to improve network performance and alleviate overfitting and gradient diffusion. At recognition time, the sub-action results are mapped back to the original actions before the recognition rate is computed. As a result, the overall recognition accuracy of the ISA neural network on the Hollywood2 human action database rises to 80.8%, a large improvement over the 53.3% obtained with the ISA neural network alone; and compared with the highest recognition accuracy currently known for Hollywood2, 64.3%, the invention is 16.5 percentage points higher. The invention thus automatically extracts human action features with greater discriminative power, adaptability, generality, and invariance, reduces overfitting and gradient diffusion in the neural network, effectively improves the accuracy of human action recognition in complex environments, and can be widely applied to on-site video surveillance and video content retrieval.
Embodiment
The hardware configuration is a Dell server with an 8-core 2.60 GHz CPU and 128 GB of memory. The software configuration is the Windows Server 2003 operating system, the open-source OpenCV computer vision library, the Microsoft Visual Studio 2010 development environment, the Matlab simulation environment, and so on.
The embodiment comprises a training stage and a recognition stage, with the following steps:
A. Building the video human action recognition model:
A1: Constructing 3D spatio-temporal sub-frame cubes: each frame of the videos of a given action class in the Hollywood2 human action database used for learning is divided into sub-frames of equal size (16 × 16 pixels); a run of consecutive frames (10 frames) of the video supplies the temporal length, which serves as the cube's thickness, to form 3D spatio-temporal sub-frame cubes (16 pixels × 16 pixels × 10 frames); each resulting cube is labeled under the same class label as its source video;
The video library used for learning is the Hollywood2 human action database, which contains 823 training videos covering 12 classes of actions common in real life: answering the phone (class label 1), driving a car (class label 2), eating (class label 3), fighting (class label 4), getting out of a car (class label 5), shaking hands (class label 6), hugging (class label 7), kissing (class label 8), running (class label 9), sitting down (class label 10), sitting up (class label 11), and standing up (class label 12);
In this embodiment the videos of the database are uniformly scaled to a resolution of 200 × 160 pixels; the highest recognition accuracy known for this database is 64.3%; there are 884 test videos in total;
A2: Building the human action feature space: each cube built in step A1 is fed, together with the class label of its video, into the deep-learning neural network ISA (Independent Subspace Analysis) for a first round of training, extracting classification features that reach an average recognition rate of 53.3% over the 12 given action classes of Hollywood2; after the first training this yields a high-dimensional (3000-dimensional) action feature space;
A3: Clustering: in the feature space built in step A2, sparse subspace clustering (SSC) is applied separately to the features of each action class, so that each class's features are subdivided into sub-class features. The SSC procedure is as follows:
A3-1: The features of each human action video in the feature space from step A2 are arranged as the columns of a dictionary, and a sparse coding method is used to determine the sparse coefficients C. For Hollywood2, the sizes of the dictionary matrix and of the C matrix for each action class are: answering the phone, 3000 × 68 and 68 × 68; driving a car, 3000 × 85 and 85 × 85; eating, 3000 × 39 and 39 × 39; fighting, 3000 × 54 and 54 × 54; getting out of a car, 3000 × 48 and 48 × 48; shaking hands, 3000 × 32 and 32 × 32; hugging, 3000 × 58 and 58 × 58; kissing, 3000 × 99 and 99 × 99; running, 3000 × 122 and 122 × 122; sitting down, 3000 × 93 and 93 × 93; sitting up, 3000 × 22 and 22 × 22; standing up, 3000 × 110 and 110 × 110;
A3-2: The sparse coefficients C are normalized;
A3-3: Forming the feature graph of each action class: the absolute value of the sparse coefficients from step A3-2 is added to its transpose to obtain the adjacency matrix, that is, $W = |C| + |C|^{T}$, where W is the adjacency matrix; this defines a graph for the action class in which each video sample is a node and the adjacency matrix gives the edge weights. The answering-the-phone class in Hollywood2 contains 68 video samples, so its graph has 68 nodes; likewise, driving a car has 85 nodes, eating 39, fighting 54, getting out of a car 48, shaking hands 32, hugging 58, kissing 99, running 122, sitting down 93, sitting up 22, and standing up 110;
A3-4: Clustering and subdivision: the graph from step A3-3 is clustered by the sparse subspace clustering (SSC) method into the action feature sub-classes;
A4: Updating the labels: according to the SSC subdivision of step A3, each sub-class of videos is assigned a sub-label under the class label of its source videos, giving the relabeled samples. After step A3 the Hollywood2 human action database comprises 29 sub-classes in total: label 1 is subdivided into labels 1.1 and 1.2; label 2 into 2.1, 2.2, and 2.3; label 4 into 4.1 and 4.2; label 5 into 5.1 and 5.2; label 7 into 7.1 and 7.2; label 8 into 8.1, 8.2, 8.3, and 8.4; label 9 into 9.1, 9.2, 9.3, and 9.4; label 10 into 10.1, 10.2, and 10.3; and label 12 into 12.1, 12.2, 12.3, and 12.4. The actions with labels 3, 6, and 11 have few videos and are therefore not subdivided in this embodiment;
A5: Building the recognition model: the relabeled samples from step A4 are fed into the same deep-learning neural network ISA as in step A2 for a second round of training to extract human action features further; the features are fed into a Softmax classifier, which completes the video human action recognition model; the network parameters after the second training are saved for later use;
B. Recognizing human actions: to verify the effect of the method accurately, this embodiment again uses human action videos from the Hollywood2 human action database as the surveillance video:
B1: Extracting 3D spatio-temporal sub-frame cubes from the surveillance video: using the same method as step A1, cubes of the same size and number as in step A1 are extracted from each monitored human action video, and the cubes extracted from each video are passed to step B2;
B2: Extracting human action features: the cubes of each video extracted in step B1 are fed into the network trained in step A5 to extract the sub-action features of each video;
B3: Determining the sub-action class of each video: the sub-action features of each video from step B2 are fed into the Softmax classifier, each surveillance video is classified in turn, and videos carrying sub-class labels are obtained;
B4: Merging the videos carrying sub-class labels: for the videos from step B3, labels 1.1 and 1.2 are merged into label 1; labels 2.1, 2.2, and 2.3 into label 2; labels 4.1 and 4.2 into label 4; labels 5.1 and 5.2 into label 5; labels 7.1 and 7.2 into label 7; labels 8.1, 8.2, 8.3, and 8.4 into label 8; labels 9.1, 9.2, 9.3, and 9.4 into label 9; labels 10.1, 10.2, and 10.3 into label 10; and labels 12.1, 12.2, 12.3, and 12.4 into label 12. The actions with labels 3, 6, and 11 were not subdivided in this embodiment because they have few videos, so their labels are unchanged. The videos are thus merged according to the main classes defined by the Hollywood2 human action database, giving the action class of each video, which is stored for later use.
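The merge in step B4 amounts to dropping the sub-class suffix from each predicted label before the recognition rate is computed. The sketch below assumes, hypothetically, that sub-class labels are stored as strings such as "8.3" and unsplit classes as "3"; the function names and the example predictions are illustrative only. The sub-class counts are those listed in step A4.

```python
# Sub-class counts per class label as listed in step A4; classes 3, 6,
# and 11 were not subdivided, so each counts as a single sub-class.
SPLITS = {1: 2, 2: 3, 3: 1, 4: 2, 5: 2, 6: 1,
          7: 2, 8: 4, 9: 4, 10: 3, 11: 1, 12: 4}


def parent_label(sub_label):
    """Map a sub-class label such as '8.3' (or '3') to its class label."""
    return int(sub_label.split(".")[0])


def accuracy(predicted_sub_labels, true_labels):
    """Fraction of videos whose merged label matches the true class."""
    merged = [parent_label(p) for p in predicted_sub_labels]
    hits = sum(m == t for m, t in zip(merged, true_labels))
    return hits / len(true_labels)


print(sum(SPLITS.values()))  # 29 sub-classes in total
print(accuracy(["1.2", "8.3", "3", "10.1"], [1, 8, 3, 9]))  # 0.75
```

A prediction of the wrong sub-class within the right class (for example 8.3 instead of 8.1) still counts as correct after merging, which is why the recognition rate is computed on the 12 merged classes rather than on the 29 sub-classes.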
This embodiment compiles the recognition results for the final 12 action classes. The overall recognition accuracy of the ISA neural network on the Hollywood2 human action database reaches 80.8%, a large improvement over the 53.3% obtained with the ISA neural network alone; the highest recognition accuracy currently known for Hollywood2 is 64.3%, so this embodiment is 16.5 percentage points higher.
The per-class recognition rates of this embodiment on the Hollywood2 human action database are given in the following table:
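The improvements quoted above are absolute differences in percentage points, which can be checked directly from the three reported accuracies; the relative gain in the last line is derived here and is not stated in the patent:

```latex
\begin{aligned}
80.8\% - 64.3\% &= 16.5 \text{ percentage points (over the best known result)} \\
80.8\% - 53.3\% &= 27.5 \text{ percentage points (over ISA alone)} \\
\frac{16.5}{64.3} &\approx 0.257 \quad \text{(about a 25.7\% relative gain)}
\end{aligned}
```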

Claims (5)

1. A video human action recognition method based on sparse subspace clustering, comprising:
A. Building the video human action recognition model:
A1. Constructing 3D spatio-temporal sub-frame cubes: each frame of the videos of a given action class in the human action database used for learning is divided into sub-frames of equal size; a run of consecutive frames of the video then supplies the temporal length, which serves as the cube's thickness, to form 3D spatio-temporal sub-frame cubes; each resulting cube is given the same class label as its source video;
A2. Building the human action feature space: each 3D spatio-temporal sub-frame cube built in step A1 is fed, together with the class label of its video, into a deep-learning neural network for a first round of training, in order to extract classification features that achieve a recognition rate above 50% on the given action classes of the database; this yields the human action feature space after the first training;
A3. Clustering: in the feature space built in step A2, sparse subspace clustering is applied separately to the features of each action class, so that each class's features are subdivided into sub-class features; the number of sub-classes is determined automatically by the sparse subspace clustering method;
A4. Updating the labels: according to the sparse subspace clustering subdivision of step A3, each sub-class of videos is assigned a sub-label under the class label of its source videos, giving the relabeled samples;
A5. Building the recognition model: the relabeled samples from step A4 are fed into the same deep-learning neural network as in step A2 for a second round of training to extract human action features further; the extracted features are then fed into a classifier, which completes the video human action recognition model; the network parameters after the second training are saved for later use;
B. Recognizing human behavior:
B1. Extracting three-dimensional spatio-temporal sub-frame cubes from surveillance video: using the same method as in step A1, three-dimensional spatio-temporal sub-frame cubes of the same size and number as in step A1 are extracted from each monitored human behavior video segment, and the method then proceeds to step B2;
B2. Extracting human behavior features: the three-dimensional spatio-temporal sub-frame cubes of each video segment extracted in step B1 are input to the neural network trained and saved in step A5, so as to extract the human sub-behavior features of each video segment;
B3. Determining the human sub-behavior class of each video: the human sub-behavior features of each video segment extracted in step B2 are input to the classifier, and each surveillance video segment is classified in turn, yielding videos that carry sub-class labels;
B4. Merging the videos that carry sub-class labels: the videos with sub-class labels obtained in step B3 are merged according to the major classes defined by the Hollywood2 human behavior database, yielding the behavior class of the human behavior in each video, which is stored for later retrieval.
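Steps A1 and B1 can be illustrated with a minimal Python sketch, assuming a grayscale video held as a NumPy array of shape (frames, height, width) and non-overlapping cubes. The function name `extract_cubes`, its parameters, and the choice to crop any remainder are illustrative assumptions; the claim fixes neither the sub-frame size nor the cube thickness.

```python
import numpy as np

def extract_cubes(video, sub_h, sub_w, depth, label):
    """Split a (T, H, W) video into non-overlapping spatio-temporal cubes.

    Each cube has shape (depth, sub_h, sub_w) and inherits the class label
    of the video it came from, as in step A1. Frames or pixels that do not
    fill a whole cube are cropped (an assumption, not stated in the claim).
    """
    t, h, w = video.shape
    nt, nh, nw = t // depth, h // sub_h, w // sub_w
    v = video[: nt * depth, : nh * sub_h, : nw * sub_w]
    # Separate each axis into (block index, offset inside the block).
    v = v.reshape(nt, depth, nh, sub_h, nw, sub_w)
    # Bring the three block indices to the front, then flatten them.
    cubes = v.transpose(0, 2, 4, 1, 3, 5).reshape(-1, depth, sub_h, sub_w)
    labels = [label] * len(cubes)
    return cubes, labels

# Example: a toy 4-frame, 4x6 video cut into 2x2x3 cubes.
toy_video = np.arange(4 * 4 * 6).reshape(4, 4, 6)
toy_cubes, toy_labels = extract_cubes(toy_video, sub_h=2, sub_w=3, depth=2, label="Run")
```

Because step B1 requires cubes of the same size and number as step A1, the same function and parameters would be reused at recognition time, with the label argument ignored or set to a placeholder.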
2. The video human behavior recognition method based on sparse subspace clustering according to claim 1, characterized in that the human behavior database used for learning in steps A1 and B4 is the Hollywood2, KTH, HMDB51, UCF101, or Sports1M human behavior database.
3. The video human behavior recognition method based on sparse subspace clustering according to claim 1, characterized in that the deep-learning-based neural network in step A2 is an independent subspace analysis (ISA) neural network.
4. The video human behavior recognition method based on sparse subspace clustering according to claim 1, characterized in that the sparse subspace clustering method in step A3 comprises the following steps:
A3-1. The features of each human behavior video in the behavior feature space obtained in step A2 are arranged in column-major order to form a dictionary, and a sparse coding method is then used to determine their sparse coefficients;
A3-2. The sparse coefficients are normalized;
A3-3. Forming the feature graph of one behavior class: the absolute value of the sparse coefficients from step A3-2 is taken and added to its transpose to obtain the adjacency matrix; a graph of the same-class human behavior features is then formed, with each video sample as a node and the adjacency matrix giving the weights;
A3-4. Clustering subdivision: the sparse subspace clustering method is applied to the same-class human behavior feature graph from step A3-3, clustering and subdividing it into the individual behavior feature sub-classes.
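Steps A3-2 and A3-3 can be sketched in Python as follows. The sketch assumes the sparse coefficient matrix from step A3-1 is already available as a square array `C` whose column j holds the coefficients of sample j. The claim does not say how the coefficients are normalized; scaling each column by its largest absolute entry is an assumption made here for illustration, and the function name `ssc_affinity` is hypothetical.

```python
import numpy as np

def ssc_affinity(C):
    """Build the symmetric adjacency matrix W = |C| + |C|^T of step A3-3.

    C is an (n, n) sparse coefficient matrix for the n video samples of one
    behavior class; column j holds the coefficients of sample j.
    """
    C = np.asarray(C, dtype=float)
    # Step A3-2 (assumed form): divide each column by its largest absolute
    # entry, leaving all-zero columns unchanged.
    col_max = np.abs(C).max(axis=0)
    col_max[col_max == 0] = 1.0
    C_norm = C / col_max
    # Step A3-3: absolute value plus its transpose gives symmetric weights.
    A = np.abs(C_norm)
    return A + A.T

# Example with a toy 3-sample coefficient matrix.
W = ssc_affinity([[0, 2, 0], [4, 0, 0], [-2, 1, 0]])
```

The resulting matrix is symmetric and non-negative, so it can serve directly as the edge weights of the graph on which step A3-4 operates. In the sparse subspace clustering literature that graph is usually partitioned with spectral clustering, though the claim itself does not name the partitioning procedure.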
5. The video human behavior recognition method based on sparse subspace clustering according to claim 1, characterized in that the classifier in steps A5 and B3 is a Softmax classifier.
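How a Softmax classifier over sub-classes (claim 5, step B3) combines with the merge back to major classes (step B4) can be sketched as below. This is a simplified illustration, not the patented implementation: the per-sub-class scores are assumed to come from some trained linear layer, each sub-label is represented as a hypothetical `(parent_class, sub_index)` pair, and the names `softmax` and `predict_parent_class` are illustrative.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_parent_class(scores, sub_labels):
    """Pick the most probable sub-class, then merge it into its major class.

    scores     -- one raw score per sub-class (assumed classifier output)
    sub_labels -- (parent_class, sub_index) pair for each score position
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    parent, sub_index = sub_labels[best]
    return parent, sub_index, probs

# Example: two sub-classes of "Run" and one of "Kiss" (labels are illustrative).
sub_labels = [("Run", 0), ("Run", 1), ("Kiss", 0)]
parent, sub_index, probs = predict_parent_class([1.0, 3.0, 2.0], sub_labels)
```

Keeping the parent class inside each sub-label makes the merge in step B4 a simple lookup, so videos assigned to different sub-classes of one behavior end up under the same major class.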
CN201510114150.XA 2015-03-16 2015-03-16 Video human behavior recognition method based on sparse subspace clustering Expired - Fee Related CN104732208B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510114150.XA CN104732208B (en) 2015-03-16 2015-03-16 Video human behavior recognition method based on sparse subspace clustering

Publications (2)

Publication Number Publication Date
CN104732208A true CN104732208A (en) 2015-06-24
CN104732208B CN104732208B (en) 2018-05-18

Family

ID=53456081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510114150.XA Expired - Fee Related CN104732208B (en) 2015-03-16 2015-03-16 Video human behavior recognition method based on sparse subspace clustering

Country Status (1)

Country Link
CN (1) CN104732208B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102163290A (en) * 2011-05-16 2011-08-24 天津大学 Method for modeling abnormal events in multi-visual angle video monitoring based on temporal-spatial correlation information
CN103164694A (en) * 2013-02-20 2013-06-19 上海交通大学 Method for recognizing human motion
CN103177265A (en) * 2013-03-25 2013-06-26 中山大学 High-definition image classification method based on kernel function and sparse coding
CN103778240A (en) * 2014-02-10 2014-05-07 中国人民解放军信息工程大学 Image retrieval method based on functional magnetic resonance imaging and image dictionary sparse decomposition

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QUOC V. LE et al.: "Learning hierarchical invariant spatio-temporal features for action recognition", Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on *
ZONGBO HAO et al.: "Human Action Recognition by Fast Dense Trajectories", MM '13: Proceedings of the 21st ACM International Conference on Multimedia *

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105678216A (en) * 2015-12-21 2016-06-15 中国石油大学(华东) Spatio-temporal data stream video behavior recognition method based on deep learning
CN105550713A (en) * 2015-12-21 2016-05-04 中国石油大学(华东) Video event detection method of continuous learning
CN105938544A (en) * 2016-04-05 2016-09-14 大连理工大学 Behavior identification method based on integrated linear classifier and analytic dictionary
CN105938544B (en) * 2016-04-05 2020-05-19 大连理工大学 Behavior recognition method based on comprehensive linear classifier and analytic dictionary
CN106127108B (en) * 2016-06-14 2019-07-16 中国科学院软件研究所 A kind of manpower image region detection method based on convolutional neural networks
CN106127108A (en) * 2016-06-14 2016-11-16 中国科学院软件研究所 A kind of staff image region detection method based on convolutional neural networks
CN106682599A (en) * 2016-12-15 2017-05-17 浙江科技学院 Stereo image visual saliency extraction method based on sparse representation
WO2018157383A1 (en) * 2017-03-03 2018-09-07 深圳大学 Video event human-like concept learning method and device
CN107506740A (en) * 2017-09-04 2017-12-22 北京航空航天大学 A kind of Human bodys' response method based on Three dimensional convolution neutral net and transfer learning model
CN107506740B (en) * 2017-09-04 2020-03-17 北京航空航天大学 Human body behavior identification method based on three-dimensional convolutional neural network and transfer learning model
CN108898042B (en) * 2017-12-27 2021-10-22 浩云科技股份有限公司 Method for detecting abnormal user behavior in ATM cabin
CN108898042A (en) * 2017-12-27 2018-11-27 浩云科技股份有限公司 A kind of detection method applied to user's abnormal behaviour in ATM machine cabin
CN108256449A (en) * 2018-01-02 2018-07-06 重庆邮电大学 A kind of Human bodys' response method based on subspace grader
CN108256449B (en) * 2018-01-02 2021-11-16 重庆邮电大学 Human behavior identification method based on subspace classifier
CN110046631A (en) * 2018-01-15 2019-07-23 塔塔咨询服务有限公司 System and method for inferring the variation of time-space image automatically
CN110046631B (en) * 2018-01-15 2023-04-28 塔塔咨询服务有限公司 System and method for automatically inferring changes in spatiotemporal images
CN108446605A (en) * 2018-03-01 2018-08-24 南京邮电大学 Double interbehavior recognition methods under complex background
CN108446605B (en) * 2018-03-01 2019-09-20 南京邮电大学 Double interbehavior recognition methods under complex background
CN108875597B (en) * 2018-05-30 2021-03-30 浙江大学城市学院 Large-scale data set-oriented two-layer activity cluster identification method
CN108875597A (en) * 2018-05-30 2018-11-23 浙江大学城市学院 A kind of two layers of movable clustering recognition method towards large-scale dataset
CN108830252A (en) * 2018-06-26 2018-11-16 哈尔滨工业大学 A kind of convolutional neural networks human motion recognition method of amalgamation of global space-time characteristic
CN108830252B (en) * 2018-06-26 2021-09-10 哈尔滨工业大学 Convolutional neural network human body action recognition method fusing global space-time characteristics
CN108988968A (en) * 2018-07-27 2018-12-11 河北工程大学 Human behavior detection method, device and terminal device
CN109523652A (en) * 2018-09-29 2019-03-26 百度在线网络技术(北京)有限公司 Processing method, device, equipment and the storage medium of insurance based on driving behavior
CN110046568B (en) * 2019-04-11 2022-12-06 中山大学 Video action recognition method based on time perception structure
CN110046568A (en) * 2019-04-11 2019-07-23 中山大学 A kind of video actions recognition methods based on Time Perception structure
CN110298264A (en) * 2019-06-10 2019-10-01 上海师范大学 Based on the human body daily behavior activity recognition optimization method for stacking noise reduction self-encoding encoder
CN110298264B (en) * 2019-06-10 2023-05-30 上海师范大学 Human body daily behavior activity recognition optimization method based on stacked noise reduction self-encoder
CN110309732B (en) * 2019-06-13 2021-04-06 浙江大学 Behavior identification method based on skeleton video
CN110309732A (en) * 2019-06-13 2019-10-08 浙江大学 Activity recognition method based on skeleton video
CN112446256A (en) * 2019-09-02 2021-03-05 中国林业科学研究院资源信息研究所 Vegetation type identification method based on deep ISA data fusion
CN110796014A (en) * 2019-09-29 2020-02-14 深圳市深网视界科技有限公司 Garbage throwing habit analysis method, system and device and storage medium
CN110796180B (en) * 2019-10-12 2022-06-07 吉林大学 Model training system and method based on artificial intelligence
CN110796180A (en) * 2019-10-12 2020-02-14 吉林大学 Model training system and method based on artificial intelligence
CN110942009A (en) * 2019-11-22 2020-03-31 南京甄视智能科技有限公司 Fall detection method and system based on space-time hybrid convolutional network
CN113344097A (en) * 2021-06-21 2021-09-03 特赞(上海)信息科技有限公司 Image processing method and device based on multiple models
CN113344097B (en) * 2021-06-21 2024-03-19 特赞(上海)信息科技有限公司 Image processing method and device based on multiple models

Also Published As

Publication number Publication date
CN104732208B (en) 2018-05-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20180518; termination date: 20210316)