CN108664904A - Kinect-based human sitting posture behavior recognition method and system - Google Patents
Kinect-based human sitting posture behavior recognition method and system
- Publication number
- CN108664904A (application CN201810369535.4A)
- Authority
- CN
- China
- Prior art keywords
- image
- frame
- frame image
- local feature
- skeleton
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a Kinect-based method for recognizing human sitting posture behavior, comprising: obtaining a skeleton image sequence from a human posture behavior dataset; obtaining a feature sequence for each frame image from that frame image and its preceding and following frame images in the skeleton image sequence; clustering the feature sequences of the frame images with K-means to obtain multiple clustering results; processing the clustering results with PCA to obtain multiple local features ordered from high to low importance of their principal component information; processing the local features with a feature coding algorithm to obtain image feature descriptors; constructing a global loss function from the image feature descriptors of a frame image and of the other frame images in the skeleton image sequence; and solving the global loss function with a stochastic gradient descent algorithm. The invention addresses the technical problem that conventional healthy sitting posture detection methods have low recognition accuracy when detecting sitting behavior in "non-upright sitting" states.
Description
Technical field
The invention belongs to the technical field of computer vision and, more particularly, relates to a Kinect-based human sitting posture behavior recognition method and system.
Background
In recent years, human posture and behavior analysis has been applied ever more widely in the medical field, including disease diagnosis, rehabilitation assessment, and daily monitoring of the elderly. In particular, most workers today work in offices, and for them sitting is the posture they hold longest at work. Medical research shows that prolonged sitting and poor sitting posture lead to a variety of occupational musculoskeletal disorders, such as lumbar disc herniation and cervical spondylosis. Many methods for healthy sitting posture detection already exist, but when monitoring for poor posture they usually only analyze angles derived from human skeleton information and then judge whether the sitting behavior is healthy from that angle criterion alone.
In real-life scenes, however, a seated person does more than the two conventional activities of typing on a keyboard or writing. Other activities also occur while seated, such as waving, drinking water, answering the phone, or clapping (these can be called "non-upright sitting" states). In these states the skeleton angles of the subject may change and may not fit a detection framework based on simple skeleton angle information at all. If a conventional healthy sitting posture detection method is nevertheless applied to these "non-upright sitting" states, the detection results deviate considerably, which limits its usefulness in real scenes.
Summary of the invention
In view of the above defects or improvement needs of the prior art, the invention provides a Kinect-based human sitting posture behavior recognition method and system. Its purpose is to solve the technical problem that conventional healthy sitting posture detection methods, when applied to "non-upright sitting" states, produce large deviations in the detection results and low recognition accuracy.
To achieve the above object, according to one aspect of the invention, a Kinect-based human sitting posture behavior recognition method is provided, comprising the following steps:
(1) obtaining a skeleton image sequence from a human posture behavior dataset; obtaining the temporal local features of each frame image from that frame image and its preceding and following frame images in the skeleton image sequence; and obtaining the spatial local features of each frame image from that frame image itself, the temporal and spatial local features together forming the feature sequence $f_{n,s}$, where $n \in [1,7]$ and $s$ is an image frame in the skeleton image sequence;
(2) clustering the feature sequences of the frame images obtained in step (1) with K-means to obtain multiple clustering results;
(3) processing the clustering results obtained in step (2) with PCA to obtain multiple local features ordered from high to low importance of their principal component information;
(4) processing the local features obtained in step (3) with a feature coding algorithm to obtain image feature descriptors;
(5) constructing a global loss function from the image feature descriptors of a frame image and of the other frame images in the skeleton image sequence, and solving it with a stochastic gradient descent algorithm to obtain the optimal linear transformation that minimizes the global loss function;
(6) processing the image feature descriptors of the frame image and of the other frame images in the skeleton image sequence with the optimal linear transformation obtained, to obtain the similarity between the frame image and the other frame images;
(7) processing the similarities obtained in step (6) with the nonparametric K-nearest-neighbor algorithm to classify the behavior in all images.
Preferably, the temporal local features of a frame image are obtained from that frame image and its preceding and following frame images in the skeleton image sequence as follows:
First, the joints are divided into three joint groups according to body part: the displacement vectors over time of the head, left hand, right hand, left foot, and right foot form the first joint group; those of the neck, left elbow, right elbow, left knee, and right knee form the second joint group; and those of the spine, left shoulder, right shoulder, left hip, and right hip form the third joint group;
Then, the temporal displacement vector of the i-th joint of the s-th frame image is computed from that joint's coordinates in the coordinate system (X, Y, Z), for every joint of every frame image; here $1 < s < \tau$, where $\tau$ is the number of image frames in the skeleton image sequence, so every frame considered has both a preceding and a following frame;
Finally, the temporal displacement vectors obtained above that belong to the same joint group are combined, establishing the temporal local features $f_1$ to $f_3$ of the first, second, and third joint groups respectively.
Preferably, the spatial local features of a frame image are obtained from that frame image in the skeleton image sequence as follows:
First, the joints are divided into four joint groups according to body part: the relative position vectors of the head, left hand, and right hand with respect to the spine form the fourth joint group; the relative position vectors of the head, left hand, and left foot with respect to the right hip form the fifth joint group; the relative position vectors of the head, right hand, and right foot with respect to the left hip form the sixth joint group; and the relative position vectors of the left hand and right hand with respect to the head form the seventh joint group;
Then, the spatial displacement vector of the i-th joint relative to the j-th joint in the s-th frame image is obtained for the joints of each frame image, with $i \neq j$;
Finally, the spatial displacement vectors obtained above that belong to the same joint group are combined, establishing the spatial local features $f_4$ to $f_7$ of the fourth, fifth, sixth, and seventh joint groups respectively.
Preferably, the feature coding algorithm is the vector of locally aggregated descriptors (VLAD) algorithm;
Preferably, the local features are processed to obtain the image feature descriptors as follows:
First, the following is computed:
$F_{n,c} = [\upsilon_{n,c,1}, \ldots, \upsilon_{n,c,m}, \ldots, \upsilon_{n,c,k}]$
where $F_{n,c}$ is the vector obtained by concatenating the residual vectors $\upsilon_{n,c,m}$ of the c-th clustering result for the n-th joint group; $\upsilon_{n,c,k}$ refers to the local feature set of subgroup n that falls in cluster k under the c-th initialization of the clustering; $c \in \{1, 2, \ldots, C\}$, where C is the number of clustering results; $m \in \{1, 2, \ldots, k\}$, where k is the number of clusters in a clustering result; $\mu_{n,c,m}$ is the cluster center of the m-th cluster in the c-th clustering result for the n-th joint group; and
$S_{n,c,m} = \{ f_{n,s} \mid m = \arg\min_{p} \| f_{n,s} - \mu_{n,c,p} \| \}$
where $p \in \{1, 2, \ldots, k\}$ and $p \neq m$;
Then, the vectors $F_{n,c}$ for all joint groups and all clustering results of the frame image are summed to give the image feature descriptor.
Preferably, step (5) is specifically:
First, the global loss function is constructed as follows:
$\varepsilon(L) = (1-\mu)\,\varepsilon_{pull}(L) + \mu\,\varepsilon_{push}(L) + \gamma \| L^{T} L - I \|^{2}$
where $\mu$ sets the balance between the components $\varepsilon_{push}(L)$ and $\varepsilon_{pull}(L)$, $\gamma$ is the regularization coefficient, $I$ is the identity matrix, and $L$ denotes the linear transformation. Both components are defined over the image feature descriptors of the frame images: $\varepsilon_{pull}(L)$ measures how closely the descriptor of a genuine target sample is pulled toward the descriptors of its own class; $\varepsilon_{push}(L)$ measures how far impostor samples $l$, which interfere but do not actually belong to the same class, are pushed away from a descriptor while same-class descriptors are pulled together; $\xi$ is the desired separation margin between a descriptor and an impostor sample $l$; $j \to i$ indicates that the descriptors with indices $j$ and $i$ are target neighbors of the same class; and $l$ is an impostor sample for sample index $i$, meaning that the descriptors with indices $l$ and $i$ are not of the same class;
Then, the optimal linear transformation $L^{*}$ that minimizes the global loss function is found with the SGD algorithm:
$L^{*} = \arg\min_{L} \varepsilon(L)$.
According to another aspect of the invention, a Kinect-based human sitting posture behavior recognition system is provided, comprising:
a first module for obtaining a skeleton image sequence from a human posture behavior dataset, obtaining the temporal local features of each frame image from that frame image and its preceding and following frame images in the skeleton image sequence, and obtaining the spatial local features of each frame image from that frame image itself, the temporal and spatial local features together forming the feature sequence $f_{n,s}$, where $n \in [1,7]$ and $s$ is an image frame in the skeleton image sequence;
a second module for clustering the feature sequences of the frame images obtained by the first module with K-means to obtain multiple clustering results;
a third module for processing the clustering results obtained by the second module with PCA to obtain multiple local features ordered from high to low importance of their principal component information;
a fourth module for processing the local features obtained by the third module with a feature coding algorithm to obtain image feature descriptors;
a fifth module for constructing a global loss function from the image feature descriptors of a frame image and of the other frame images in the skeleton image sequence, and solving it with a stochastic gradient descent algorithm to obtain the optimal linear transformation that minimizes the global loss function;
a sixth module for processing the image feature descriptors of the frame image and of the other frame images in the skeleton image sequence with the optimal linear transformation obtained, to obtain the similarity between the frame image and the other frame images;
a seventh module for processing the similarities obtained by the sixth module with the nonparametric K-nearest-neighbor algorithm to classify the behavior in all images.
In general, compared with the prior art, the above technical solutions conceived by the invention achieve the following beneficial effects:
(1) The method has a high recognition rate: the invention introduces a procedure for solving the optimal linear transformation and applies that transformation in the K-NN classification algorithm, which further improves recognition accuracy;
(2) When applied to sitting posture health detection, the invention takes more practical application scenarios into account and therefore has broad applicability and practicality.
Description of the drawings
Fig. 1 is a flow chart of the Kinect-based human sitting posture behavior recognition method of the invention.
Fig. 2 is the confusion matrix obtained by recognizing the 20 posture actions of the MSR-Action3D dataset with the method of the invention.
Fig. 3 is the confusion matrix obtained by recognition on the UTKinect-Action3D dataset with the method of the invention.
Fig. 4 is the confusion matrix obtained by recognition on the Florence 3D Actions dataset with the method of the invention.
Fig. 5 shows the performance comparison curves obtained with the method of the invention for human posture recognition and human sitting posture behavior recognition.
Detailed description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only illustrate the invention and do not limit it. In addition, the technical features involved in the embodiments described below may be combined with one another as long as they do not conflict.
The invention provides a Kinect-based human sitting posture behavior recognition method. It first simplifies the definition of skeletal joints on the basis of Kinect skeleton data and designs a new human posture behavior recognition framework on that basis; the framework is then further refined for the specific research target of sitting posture so that it is better suited to sitting posture behavior recognition. In the framework, spatio-temporal local features based on skeleton information are defined first. The K-means clustering algorithm and the principal component analysis (PCA) algorithm are then used to obtain the vector of locally aggregated descriptors for the final feature clustering results. Next, a custom loss function is combined with a global stochastic gradient optimization algorithm to perform metric learning of discriminative features, yielding the learned transformation of the feature information. Finally, a K-NN classifier classifies posture behavior from the learned feature information. Experiments demonstrate the effectiveness and accuracy of the method and its good performance on human sitting posture behavior recognition.
As shown in Fig. 1, the Kinect-based human sitting posture behavior recognition method of the invention comprises the following steps:
(1) A skeleton image sequence is obtained from a human posture behavior dataset. The temporal local features of each frame image are obtained from that frame image and its preceding and following frame images in the skeleton image sequence, and the spatial local features of each frame image are obtained from that frame image itself; the temporal and spatial local features together form the feature sequence;
Specifically, the human posture behavior dataset may be the MSR-Action3D dataset published by Microsoft Research, the UTKinect-Action3D dataset, or the Florence 3D Actions dataset.
The temporal local features of a frame image are obtained from that frame image and its preceding and following frame images in the skeleton image sequence as follows: First, the joints are divided into three joint groups according to body part: the displacement vectors over time of the head, left hand, right hand, left foot, and right foot form the first joint group; those of the neck, left elbow, right elbow, left knee, and right knee form the second joint group; and those of the spine, left shoulder, right shoulder, left hip, and right hip form the third joint group;
Then, the temporal displacement vector of the i-th joint of the s-th frame image is computed from that joint's coordinates in the coordinate system (X, Y, Z), for every joint of every frame image; here $s$ is an image frame in the skeleton image sequence and $1 < s < \tau$, where $\tau$ is the number of image frames in the skeleton image sequence, so every frame considered has both a preceding and a following frame;
Finally, the temporal displacement vectors obtained above that belong to the same joint group are combined, establishing the temporal local features $f_1$ to $f_3$ of the first, second, and third joint groups respectively;
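The grouping of temporal displacement vectors described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the displacement formula itself is not reproduced in this text, so the sketch assumes a simple difference between the joint position in the following frame and in the preceding frame. The joint names, the `Frame` type, and the function `temporal_features` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Frame = Dict[str, Point]  # joint name -> (X, Y, Z) coordinates

# Joint groups 1-3 as listed in the text; the string names are illustrative.
TEMPORAL_GROUPS: Dict[int, List[str]] = {
    1: ["head", "left_hand", "right_hand", "left_foot", "right_foot"],
    2: ["neck", "left_elbow", "right_elbow", "left_knee", "right_knee"],
    3: ["spine", "left_shoulder", "right_shoulder", "left_hip", "right_hip"],
}


def temporal_features(frames: List[Frame], s: int) -> Dict[int, List[float]]:
    """Return the temporal local features f_1..f_3 of frame s (0-based index)."""
    if not 0 < s < len(frames) - 1:
        raise ValueError("frame s needs both a previous and a next frame")
    prev_frame, next_frame = frames[s - 1], frames[s + 1]
    features: Dict[int, List[float]] = {}
    for group, joints in TEMPORAL_GROUPS.items():
        vector: List[float] = []
        for joint in joints:
            # ASSUMPTION: displacement = position in next frame - position in previous frame.
            vector.extend(b - a for a, b in zip(prev_frame[joint], next_frame[joint]))
        features[group] = vector
    return features
```

Each group holds five joints, so each temporal feature in this sketch is a 15-dimensional vector (five joints times three coordinates).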
The spatial local features of a frame image are obtained from that frame image in the skeleton image sequence as follows: First, the joints are divided into four joint groups according to body part: the relative position vectors of the head, left hand, and right hand with respect to the spine form the fourth joint group; the relative position vectors of the head, left hand, and left foot with respect to the right hip form the fifth joint group; the relative position vectors of the head, right hand, and right foot with respect to the left hip form the sixth joint group; and the relative position vectors of the left hand and right hand with respect to the head form the seventh joint group.
Then, the spatial displacement vector of the i-th joint relative to the j-th joint in the s-th frame image is obtained for the joints of each frame image, with $i \neq j$;
Finally, the spatial displacement vectors obtained above that belong to the same joint group are combined, establishing the spatial local features $f_4$ to $f_7$ of the fourth, fifth, sixth, and seventh joint groups respectively;
The feature sequence finally obtained in this step is denoted $f_{n,s}$, where $n \in [1,7]$.
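The four spatial joint groups can be illustrated in the same way. The sketch below assumes that the relative position vector is the coordinate difference between a joint and its reference joint within a single frame; the formula is not reproduced in this text, and the joint names and the function `spatial_features` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]

# Joint groups 4-7 as (joints, reference joint), following the text.
SPATIAL_GROUPS: Dict[int, Tuple[List[str], str]] = {
    4: (["head", "left_hand", "right_hand"], "spine"),
    5: (["head", "left_hand", "left_foot"], "right_hip"),
    6: (["head", "right_hand", "right_foot"], "left_hip"),
    7: (["left_hand", "right_hand"], "head"),
}


def spatial_features(frame: Dict[str, Point]) -> Dict[int, List[float]]:
    """Return the spatial local features f_4..f_7 of a single frame."""
    features: Dict[int, List[float]] = {}
    for group, (joints, reference) in SPATIAL_GROUPS.items():
        ref = frame[reference]
        vector: List[float] = []
        for joint in joints:
            # ASSUMPTION: relative position = joint coordinates - reference coordinates.
            vector.extend(a - b for a, b in zip(frame[joint], ref))
        features[group] = vector
    return features
```

Together with the three temporal groups, this gives seven feature vectors per frame, which matches the index range n ∈ [1,7] of the feature sequence.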
(2) The feature sequences of the frame images obtained in step (1) are clustered with K-means to obtain multiple clustering results;
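Step (2) can be sketched with a plain Lloyd-style K-means. The text does not say how the multiple clustering results arise; because a later passage refers to the "c-th initialization", the sketch assumes they come from runs with different random initializations. The function names `kmeans` and `clustering_runs` are hypothetical, and in practice a numerical library would normally be used instead of pure Python.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, seed: int, iters: int = 100
           ) -> Tuple[List[List[float]], List[int]]:
    """Plain Lloyd's algorithm; returns (cluster centers, label per point)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda m: sq_dist(p, centers[m])) for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for m in range(k):
            members = [p for p, lab in zip(points, labels) if lab == m]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[m])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def clustering_runs(points: List[Vector], k: int, num_runs: int):
    # ASSUMPTION: the "multiple clustering results" are runs with different seeds.
    return [kmeans(points, k, seed=c) for c in range(num_runs)]
```

Each run returns the cluster centers and the assignment of every feature vector to a cluster, which is the information the later coding step needs.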
(3) The clustering results obtained in step (2) are processed with principal component analysis (PCA) to obtain multiple local features ordered from high to low importance of their principal component information (here importance refers to the relevance of a local feature to the final human sitting posture behavior recognition result).
Dimensionality reduction inevitably loses some information, but real data are usually correlated, so the loss can be kept as small as possible. One benefit of PCA is that, when reducing the dimensionality of data, the newly computed principal component vectors can be ranked by importance: the leading, most important components are kept as needed and the trailing dimensions are dropped, so that the model is simplified or the data compressed while the information in the original data is preserved as far as possible.
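The ranking property of PCA described above can be illustrated with a small pure-Python sketch that extracts principal components one at a time by power iteration and deflation, so that they come out ordered by explained variance. This is only one of several ways to compute PCA and is not taken from the patent; the function name `pca_ranked` is hypothetical, and a linear algebra library would normally be used.

```python
import math
from typing import List, Sequence, Tuple


def pca_ranked(rows: List[Sequence[float]], n_components: int, iters: int = 500
               ) -> Tuple[List[List[float]], List[float]]:
    """Return (components, variances), ordered from most to least variance."""
    n, d = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, mean)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components: List[List[float]] = []
    variances: List[float] = []
    for _ in range(n_components):
        v = [1.0 / math.sqrt(d)] * d  # fixed start vector; a random one is more robust
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Variance captured along v (Rayleigh quotient of the current matrix).
        variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        variances.append(variance)
        # Deflation: remove the component just found so the next pass finds the next one.
        cov = [[cov[i][j] - variance * v[i] * v[j] for j in range(d)] for i in range(d)]
    return components, variances
```

Keeping only the first few returned components corresponds to retaining the most important part of the data and dropping the trailing dimensions.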
(4) The local features obtained in step (3) are processed with a feature coding algorithm to obtain image feature descriptors;
Specifically, the feature coding algorithm used in this step is the vector of locally aggregated descriptors (VLAD) algorithm.
This step uses the following formula:
$F_{n,c} = [\upsilon_{n,c,1}, \ldots, \upsilon_{n,c,m}, \ldots, \upsilon_{n,c,k}]$
where $F_{n,c}$ is the vector obtained by concatenating the residual vectors $\upsilon_{n,c,m}$ of the c-th clustering result for the n-th joint group; $\upsilon_{n,c,k}$ refers to the local feature set of subgroup n that falls in cluster k under the c-th initialization of the clustering in step (2); $c \in \{1, 2, \ldots, C\}$, where C is the number of clustering results; and $m \in \{1, 2, \ldots, k\}$, where k is the number of clusters in a clustering result.
In the above formula, $\mu_{n,c,m}$ denotes the cluster center of the m-th cluster in the c-th clustering result for the n-th joint group, and
$S_{n,c,m} = \{ f_{n,s} \mid m = \arg\min_{p} \| f_{n,s} - \mu_{n,c,p} \| \}$
where $p \in \{1, 2, \ldots, k\}$ and $p \neq m$.
Summing the vectors $F_{n,c}$ for all joint groups and all clustering results of the frame image gives the image feature descriptor of this step.
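The residual formula itself is not reproduced in this text. A minimal sketch, assuming the standard VLAD definition in which each residual vector is the sum of the differences between the features assigned to a cluster and that cluster's center, could look as follows. The function name `vlad_block` is hypothetical, and the sketch covers one joint group and one clustering result only.

```python
from typing import List, Sequence

Vector = Sequence[float]


def vlad_block(features: List[Vector], centers: List[Vector]) -> List[float]:
    """Concatenate per-cluster residual sums for one joint group and one clustering."""
    k, dim = len(centers), len(centers[0])
    residuals = [[0.0] * dim for _ in range(k)]
    for f in features:
        # Assign the feature to its nearest cluster center.
        m = min(range(k), key=lambda p: sum((a - b) ** 2 for a, b in zip(f, centers[p])))
        # ASSUMPTION: standard VLAD residual, i.e. the sum of (feature - center).
        for i in range(dim):
            residuals[m][i] += f[i] - centers[m][i]
    # Concatenate the k residual vectors into one block.
    return [x for residual in residuals for x in residual]
```

For example, `vlad_block([(1.0, 0.0), (3.0, 0.0), (10.0, 10.0)], [(1.0, 0.0), (10.0, 9.0)])` assigns the first two features to the first center and the third to the second, giving `[2.0, 0.0, 0.0, 1.0]`.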
(5) A global loss function is constructed from the image feature descriptors of a frame image and of the other frame images in the skeleton image sequence, and is solved with a stochastic gradient descent (SGD) algorithm to obtain the optimal linear transformation that minimizes the global loss function;
Specifically, the global loss function is first constructed as follows:
$\varepsilon(L) = (1-\mu)\,\varepsilon_{pull}(L) + \mu\,\varepsilon_{push}(L) + \gamma \| L^{T} L - I \|^{2}$
where $\mu$ sets the balance between the components $\varepsilon_{push}(L)$ and $\varepsilon_{pull}(L)$, $\gamma$ is the regularization coefficient, $I$ is the identity matrix, and $L$ denotes the linear transformation. Both components are defined over the image feature descriptors of the frame images: $\varepsilon_{pull}(L)$ measures how closely the descriptor of a genuine target sample is pulled toward the descriptors of its own class; $\varepsilon_{push}(L)$ measures how far impostor samples $l$, which interfere but do not actually belong to the same class, are pushed away from a descriptor while same-class descriptors are pulled together; $\xi$ is the desired separation margin between a descriptor and an impostor sample $l$; $j \to i$ indicates that the descriptors with indices $j$ and $i$ are target neighbors of the same class; and $l$ is an impostor sample for sample index $i$, meaning that the descriptors with indices $l$ and $i$ are not of the same class.
Then, the optimal linear transformation $L^{*}$ that minimizes the global loss function is found with the SGD algorithm:
$L^{*} = \arg\min_{L} \varepsilon(L)$
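The definitions of the pull and push components are not reproduced in this text. The sketch below evaluates a loss of the stated overall form under the assumption that the two components follow the usual large-margin nearest-neighbor pattern: the pull term sums squared transformed distances between each sample and its target neighbors, and the push term is a hinge penalty on differently labeled samples that come within the margin ξ. All function names and the `targets` structure are hypothetical.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def transformed_sq_dist(L: Matrix, a: Sequence[float], b: Sequence[float]) -> float:
    """Squared distance ||L(a - b)||^2 under the linear transformation L."""
    diff = [p - q for p, q in zip(a, b)]
    return sum(sum(w * x for w, x in zip(row, diff)) ** 2 for row in L)


def global_loss(L: Matrix, X: List[Sequence[float]], y: List[int],
                targets: Dict[int, List[int]],
                mu: float = 0.5, gamma: float = 0.1, xi: float = 1.0) -> float:
    """(1 - mu) * pull + mu * push + gamma * ||L^T L - I||^2 (Frobenius norm)."""
    pull = push = 0.0
    for i, neighbors in targets.items():       # targets[i] lists the j with j -> i
        for j in neighbors:
            d_ij = transformed_sq_dist(L, X[i], X[j])
            pull += d_ij                        # ASSUMPTION: pull = sum of target distances
            for l in range(len(X)):
                if y[l] != y[i]:                # l is a potential impostor for i
                    # ASSUMPTION: hinge penalty when l is not farther than d_ij + xi.
                    push += max(0.0, xi + d_ij - transformed_sq_dist(L, X[i], X[l]))
    d = len(L[0])
    reg = 0.0
    for a in range(d):
        for b in range(d):
            ltl = sum(row[a] * row[b] for row in L)   # entry (a, b) of L^T L
            reg += (ltl - (1.0 if a == b else 0.0)) ** 2
    return (1 - mu) * pull + mu * push + gamma * reg
```

An SGD solver would repeatedly estimate the gradient of this quantity with respect to the entries of L on sampled triples (i, j, l) and step against it; that loop is omitted here.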
(6) The image feature descriptors of the frame image and of the other frame images in the skeleton image sequence are processed with the optimal linear transformation obtained, to obtain the similarity between the frame image and the other frame images in the skeleton image sequence.
(7) The similarities obtained in step (6) are processed with the nonparametric K-nearest-neighbor (K-NN) algorithm to classify the behavior in all images.
In the K-NN algorithm, the input consists of the k nearest training examples in the dataset and the output is a class membership. A new object is classified by a majority vote of its nearest neighbors: it is assigned to the class most common among its k nearest neighbors. Neighbors are usually given weights that express their contribution to the classification; for example, the distance from the object to be classified to each neighbor can be used to weight that neighbor. In the invention, the two-stage metric learning yields the final transformation of the feature samples, and this transformation determines both the feature representation with which each sample takes part in classification and the definition of distance between samples.
For the final classification of these samples, K-NN is one of the most direct algorithms for classifying unknown data: every training sample has a definite label, so the label of new data can be determined explicitly. The procedure is:
(1) define how the distance between data points is computed, and compute the distance from the new data point to the data points of known class;
(2) sort the computed distances in ascending order and select the k points nearest to the data point to be classified;
(3) for discrete classification, return the class that occurs most frequently among the k points as the predicted class; for regression, return the weighted value of the k points as the predicted value.
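The three steps of the procedure above can be sketched as an unweighted majority vote. The sketch uses plain Euclidean distance and assumes that the descriptors passed in have already been mapped by the learned linear transformation; the function name `knn_classify` and the example labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def knn_classify(train: List[Sequence[float]], labels: List[Hashable],
                 query: Sequence[float], k: int = 3) -> Hashable:
    """Classify `query` by majority vote among its k nearest labelled samples."""
    # Step (1): distance from the query to every labelled sample.
    distances = [sum((a - b) ** 2 for a, b in zip(x, query)) ** 0.5 for x in train]
    # Step (2): sort ascending and keep the indices of the k nearest samples.
    nearest = sorted(range(len(train)), key=lambda i: distances[i])[:k]
    # Step (3): return the most frequent label among those k samples.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

With training points `[(0, 0), (0, 1), (5, 5), (5, 6)]` labelled `["drink", "drink", "clap", "clap"]`, the query `(0.2, 0.3)` with k = 3 has two "drink" neighbors and one "clap" neighbor and is classified as "drink".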
Put simply, given a set of data whose classes are already known, when a new data point arrives its distance to every point in the training data is computed, the k training points nearest to the new point are selected and their classes inspected, and the new data point is then assigned by majority rule.
Experimental results
MSR-Action3D is currently the most commonly used dataset, so it is chosen first to test the framework and to compare it with methods that have been evaluated on the same dataset.
For training and evaluation, the cross-subject split proposed by Wang et al. is used: of the ten subjects in MSR-Action3D, subjects 1, 3, 5, 7, and 9 are used for training and subjects 2, 4, 6, 8, and 10 for testing.
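The cross-subject protocol described above amounts to partitioning samples by subject number. A minimal sketch follows; the sample record layout (a dictionary with a `subject` key) and the function name `cross_subject_split` are hypothetical and not part of any dataset's tooling.

```python
from typing import Dict, List, Tuple

TRAIN_SUBJECTS = {1, 3, 5, 7, 9}    # odd-numbered subjects: training
TEST_SUBJECTS = {2, 4, 6, 8, 10}    # even-numbered subjects: testing


def cross_subject_split(samples: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split samples into (train, test) by the subject who performed them."""
    train = [s for s in samples if s["subject"] in TRAIN_SUBJECTS]
    test = [s for s in samples if s["subject"] in TEST_SUBJECTS]
    return train, test
```

Splitting by subject rather than by sample means that no person appears in both sets, so the test measures generalization to people not seen during training.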
In this setting, the confusion matrix obtained by the framework of the invention for the 20 posture actions of MSR-Action3D is shown in Fig. 2.
The confusion matrix in Fig. 2 shows that the framework recognizes 14 of the 20 posture behaviors in the dataset accurately; of the remainder, 4 have recognition rates above 90% and only 2 fall below 90%. This evaluation result is also compared with prior-art methods that perform posture recognition on the same dataset, as shown in Table 1 below.
Table 1
Can be seen that the average recognition rate of the frame from the comparing result of upper table 1 can reach 95.86%, and currently make
It is compared with the Activity recognition technical method of same behavior data set, still there is higher accuracy of identification.
By the method for the present invention on UTKinect-Action3D data sets and Florence 3D Actions data sets
Test carries out the finally obtained confusion matrix of human body attitude Activity recognition and distinguishes shown in following Fig. 3 and Fig. 4.
The evaluation results on these three data sets serve as reference values for the evaluation accuracy of the framework. To target sitting posture
behavior as the specific recognition object, the feature extraction of the framework is refined and adjusted, the adjusted framework is used to detect
and recognize behaviors in the sitting state, and the results are compared with the evaluation results of the original framework. Before detection,
all human posture actions in the three data sets are first pooled and screened, and those that can be regarded as human behaviors in the sitting state
are selected for testing. The sitting posture behaviors finally tested are: sitting down, high arm wave, horizontal arm wave, hand grab,
hand clamp, two-hand wave, drinking water, answering a phone call, clapping, looking at a watch, and standing up. The recognition results of the refined
sitting posture detection framework for each sitting posture behavior are compared with the detection results of the original detection framework in Fig. 5.
It can be seen from the line chart in Fig. 5 that the detection accuracy of the adjusted sitting posture behavior framework on the specified sitting posture
behaviors is close to that of the human posture behavior detection framework originally used by the invention. Because sitting posture behavior is a
special case of whole-body posture behavior, the corresponding initial input features can be reduced, which lowers the feature dimensionality
during training. After this refinement, the evaluation results of the framework are not significantly affected, and the framework retains considerable validity and
accuracy.
Those skilled in the art will readily appreciate that the foregoing is merely a preferred embodiment of the present invention and is not intended to
limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall
within the scope of protection of the present invention.
Claims (7)
1. A Kinect-based human sitting posture behavior recognition method, characterized by comprising the following steps:
(1) obtaining a skeleton image sequence from a human posture behavior data set; obtaining the temporal local feature of each frame image in the skeleton image sequence from that frame image and its previous and next frame images; and obtaining the spatial local feature of each frame image in the skeleton image sequence from that frame image; the temporal local features and spatial local features together forming the feature sequence $f_{n,s}$, where $n \in [1, 7]$ and $s$ denotes an image frame in the skeleton image sequence;
(2) clustering the feature sequence of the frame images obtained in step (1) using K-means, to obtain multiple clustering results;
(3) processing the multiple clustering results obtained in step (2) using the PCA method, to obtain multiple local features ordered from high to low by the importance of their principal component information;
(4) processing the local features obtained in step (3) using a feature encoding algorithm, to obtain image feature descriptors;
(5) constructing a global loss function from the image feature descriptors of the frame image and of the other frame images in the skeleton image sequence, and solving the global loss function using a stochastic gradient descent algorithm, to obtain the optimal linear transformation that minimizes the global loss function;
(6) processing the image feature descriptors of the frame image and of the other frame images in the skeleton image sequence using the obtained optimal linear transformation, to obtain the similarity between the frame image and the other frame images in the skeleton image sequence;
(7) processing the similarities between the frame image and the other frame images in the skeleton image sequence obtained in step (6) using the non-parametric K-nearest-neighbor algorithm, to classify the behavior of all images.
2. The human sitting posture behavior recognition method according to claim 1, characterized in that obtaining the temporal local feature of each frame image in the skeleton image sequence from that frame image and its previous and next frame images is specifically:
First, the joints are divided into three joint groups according to body part, wherein the displacement vectors over time of the head, left hand, right hand, left foot and right foot form the first joint group; the displacement vectors over time of the neck, left elbow, right elbow, left knee and right knee form the second joint group; and the displacement vectors over time of the spine, left shoulder, right shoulder, left hip and right hip form the third joint group;
Then, the temporal displacement vector of each joint in each frame image is obtained:
where $1 < s < \tau$, $\tau$ is the number of image frames in the skeleton image sequence, the joint position is the coordinate of the $i$-th joint of the $s$-th frame image of the skeleton image sequence in the coordinate system $(X, Y, Z)$, and the result is the temporal displacement vector of the $i$-th joint of the $s$-th frame image of the skeleton image sequence;
Finally, the temporal displacement vectors obtained above for the joints of each frame image that belong to the same joint group are combined, to establish the temporal local features $f_1$ to $f_3$ of the first, second and third joint groups respectively.
3. The human sitting posture behavior recognition method according to claim 2, characterized in that obtaining the spatial local feature of each frame image in the skeleton image sequence from that frame image is specifically:
First, the joints are divided into four joint groups according to body part, wherein the relative position vectors of the head, left hand and right hand with respect to the spine form the fourth joint group; the relative position vectors of the head, left hand and left foot with respect to the right hip form the fifth joint group; the relative position vectors of the head, right hand and right foot with respect to the left hip form the sixth joint group; and the relative position vectors of the left hand and right hand with respect to the head form the seventh joint group;
Then, the spatial displacement vector of each joint in each frame image is obtained:
where the result is the spatial displacement vector of the $i$-th joint relative to the $j$-th joint in the $s$-th frame image, with $i \neq j$;
Finally, the spatial displacement vectors obtained above for the joints of each frame image that belong to the same joint group are combined, to establish the spatial local features $f_4$ to $f_7$ of the fourth, fifth, sixth and seventh joint groups respectively.
4. The human sitting posture behavior recognition method according to claim 3, characterized in that the feature encoding algorithm is the Vector of Locally Aggregated Descriptors (VLAD) algorithm.
5. The human sitting posture behavior recognition method according to claim 4, characterized in that processing the local features to obtain the image feature descriptor is specifically:
First, the following is computed:
$F_{n,c} = [\upsilon_{n,c,1}, \ldots, \upsilon_{n,c,m}, \ldots, \upsilon_{n,c,k}]$
where $F_{n,c}$ denotes the vector obtained by concatenating the residual vectors $\upsilon_{n,c,m}$ of the $c$-th clustering result of the $n$-th joint group; $S_{n,c,m}$ denotes the set of local features of joint group $n$ that fall in cluster $m$ under the $c$-th clustering initialization; $c \in \{1, 2, \ldots, C\}$, where $C$ is the number of clustering results; $m \in \{1, 2, \ldots, k\}$, where $k$ is the number of clusters in a clustering result; $\mu_{n,c,m}$ denotes the cluster center of the $m$-th cluster in the $c$-th clustering result of the $n$-th joint group; and
$S_{n,c,m} = \{\, f_{n,s} \mid m = \arg\min_{p} \lVert f_{n,s} - \mu_{n,c,p} \rVert \,\}$
where $p \in \{1, 2, \ldots, k\}$ and $p \neq m$;
Then, the vectors $F_{n,c}$ of all joint groups and all clustering results in the frame image are summed to obtain the image feature descriptor.
6. The human sitting posture behavior recognition method according to claim 5, characterized in that step (5) is specifically:
First, the global loss function is constructed as follows:
$\varepsilon(L) = (1-\mu)\,\varepsilon_{pull}(L) + \mu\,\varepsilon_{push}(L) + \gamma \lVert L^{T}L - I \rVert^{2}$
where $\mu$ sets the ratio between the components $\varepsilon_{push}(L)$ and $\varepsilon_{pull}(L)$, $\gamma$ is the regularization coefficient, $I$ is the identity matrix, and $L$ denotes the linear transformation;
the samples are the image feature descriptors of the frame images; the component $\varepsilon_{pull}(L)$ measures how closely the image feature descriptors of true target samples of the same class are pulled together; the component $\varepsilon_{push}(L)$ measures how far, while same-class image feature descriptors are pulled together, interfering impostor samples $l$ that are not actually of the same class are pushed away from an image feature descriptor; $\xi$ is the desired separation margin between an image feature descriptor and an impostor sample $l$; $j \to i$ indicates that the image feature descriptors of samples $j$ and $i$ are target neighbor samples of the same class; and $l$ is the index of a sample that is an impostor for sample $i$, that is, one whose image feature descriptor is not of the same class as that of sample $i$;
Then, the optimal linear transformation $L^{*}$ that minimizes the global loss function is obtained using the SGD algorithm:
$L^{*} = \arg\min_{L} \varepsilon(L)$.
7. A Kinect-based human sitting posture behavior recognition system, characterized by comprising:
a first module, configured to obtain a skeleton image sequence from a human posture behavior data set, obtain the temporal local feature of each frame image in the skeleton image sequence from that frame image and its previous and next frame images, and obtain the spatial local feature of each frame image in the skeleton image sequence from that frame image, the temporal local features and spatial local features together forming the feature sequence $f_{n,s}$, where $n \in [1, 7]$ and $s$ denotes an image frame in the skeleton image sequence;
a second module, configured to cluster, using K-means, the feature sequence of the frame images obtained by the first module, to obtain multiple clustering results;
a third module, configured to process, using the PCA method, the multiple clustering results obtained by the second module, to obtain multiple local features ordered from high to low by the importance of their principal component information;
a fourth module, configured to process, using a feature encoding algorithm, the local features obtained by the third module, to obtain image feature descriptors;
a fifth module, configured to construct a global loss function from the image feature descriptors of the frame image and of the other frame images in the skeleton image sequence, and to solve the global loss function using a stochastic gradient descent algorithm, to obtain the optimal linear transformation that minimizes the global loss function;
a sixth module, configured to process, using the obtained optimal linear transformation, the image feature descriptors of the frame image and of the other frame images in the skeleton image sequence, to obtain the similarity between the frame image and the other frame images in the skeleton image sequence;
a seventh module, configured to process, using the non-parametric K-nearest-neighbor algorithm, the similarities between the frame image and the other frame images in the skeleton image sequence obtained by the sixth module, to classify the behavior of all images.
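The feature encoding of claims 4 and 5 can be illustrated for a single joint group and a single clustering result. The text as extracted does not reproduce the residual formula, so this sketch assumes the standard VLAD residual: for each cluster, the sum of (feature minus cluster center) over the features assigned to that cluster. The function names, the Euclidean distance and the toy centers and features are all hypothetical; in the method the centers would come from the K-means step.

```python
import math


def nearest_center(feature, centers):
    """Index of the cluster center closest to `feature` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda m: math.dist(feature, centers[m]))


def vlad_encode(features, centers):
    """Concatenate per-cluster residual sums into one vector (assumed VLAD form)."""
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for feature in features:
        # Assign the feature to its nearest center, as in the set definition of claim 5.
        m = nearest_center(feature, centers)
        # Accumulate the residual (feature minus center) for that cluster.
        for d in range(dim):
            residuals[m][d] += feature[d] - centers[m][d]
    # Concatenate the k residual vectors in cluster order.
    return [value for residual in residuals for value in residual]


# Toy, made-up centers and local features for illustration only.
centers = [(0.0, 0.0), (10.0, 10.0)]
features = [(1.0, 0.0), (0.0, 2.0), (9.0, 10.0)]
descriptor = vlad_encode(features, centers)
```

With k clusters and d-dimensional features the encoded vector has k times d entries regardless of how many frames contribute, which is what lets sequences of different length be compared.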
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810369535.4A CN108664904A (en) | 2018-04-24 | 2018-04-24 | A kind of human body sitting posture Activity recognition method and system based on Kinect |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108664904A true CN108664904A (en) | 2018-10-16 |
Family
ID=63781050
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810369535.4A Withdrawn CN108664904A (en) | 2018-04-24 | 2018-04-24 | A kind of human body sitting posture Activity recognition method and system based on Kinect |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108664904A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103530892A (en) * | 2013-10-21 | 2014-01-22 | 清华大学深圳研究生院 | Kinect sensor based two-hand tracking method and device |
CN103970883A (en) * | 2014-05-20 | 2014-08-06 | 西安工业大学 | Motion sequence search method based on alignment clustering analysis |
Non-Patent Citations (2)
Title |
---|
DIOGO CARBONERA LUVIZON ET AL.: ""Learning features combination for human action recognition from skeleton sequences"", 《PATTERN RECOGNITION LETTERS》 * |
于成龙: ""基于视频的人体行为识别关键技术研究"", 《中国博士学位论文全文数据库 信息科技辑》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109902614A (en) * | 2019-02-25 | 2019-06-18 | 重庆邮电大学 | A kind of Human bodys' response method based on local space time's feature |
CN111582154A (en) * | 2020-05-07 | 2020-08-25 | 浙江工商大学 | Pedestrian re-identification method based on multitask skeleton posture division component |
CN113288122A (en) * | 2021-05-21 | 2021-08-24 | 河南理工大学 | Wearable sitting posture monitoring device and sitting posture monitoring method |
CN113288122B (en) * | 2021-05-21 | 2023-12-19 | 河南理工大学 | Wearable sitting posture monitoring device and sitting posture monitoring method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105894047B (en) | A kind of face classification system based on three-dimensional data | |
CN110287825B (en) | Tumble action detection method based on key skeleton point trajectory analysis | |
Zhang et al. | RGB-D camera-based daily living activity recognition | |
EP1864246B1 (en) | Spatio-temporal self organising map | |
Gurovich et al. | DeepGestalt-identifying rare genetic syndromes using deep learning | |
CN109711254A (en) | The image processing method and device of network are generated based on confrontation | |
WO2016106383A2 (en) | First-person camera based visual context aware system | |
CN108664904A (en) | A kind of human body sitting posture Activity recognition method and system based on Kinect | |
CN101167087A (en) | Using time in recognizing persons in images | |
JP2005149506A (en) | Method and apparatus for automatic object recognition/collation | |
CN106529504B (en) | A kind of bimodal video feeling recognition methods of compound space-time characteristic | |
CN110084211B (en) | Action recognition method | |
KR101687217B1 (en) | Robust face recognition pattern classifying method using interval type-2 rbf neural networks based on cencus transform method and system for executing the same | |
CN106960185B (en) | The Pose-varied face recognition method of linear discriminant deepness belief network | |
Reid et al. | Imputing human descriptions in semantic biometrics | |
Iosifidis et al. | Neural representation and learning for multi-view human action recognition | |
CN111160119B (en) | Multi-task depth discrimination measurement learning model construction method for face verification | |
Mukherjee et al. | Recognizing interaction between human performers using'key pose doublet' | |
CN116311497A (en) | Tunnel worker abnormal behavior detection method and system based on machine vision | |
CN113627236A (en) | Sitting posture identification method, device, equipment and storage medium | |
CN110070070B (en) | Action recognition method | |
CN107862246A (en) | A kind of eye gaze direction detection method based on various visual angles study | |
Gehrig et al. | Draft: evaluation guidelines for gender classification and age estimation | |
CN106228163B (en) | A kind of poor ternary sequential image feature in part based on feature selecting describes method | |
CN114913585A (en) | Household old man falling detection method integrating facial expressions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20181016 |