CN106446847A - Human body movement analysis method based on video data - Google Patents
- Publication number
- CN106446847A CN106446847A CN201610867148.4A CN201610867148A CN106446847A CN 106446847 A CN106446847 A CN 106446847A CN 201610867148 A CN201610867148 A CN 201610867148A CN 106446847 A CN106446847 A CN 106446847A
- Authority
- CN
- China
- Prior art keywords
- path
- action
- motion
- bounding box
- evaluation
- Prior art date
- Legal status
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
Abstract
The invention provides a human body movement analysis method based on video data. The main content of the method comprises data input, spatial action evaluation, temporal action path extraction, and action proposal generation. The method first trains on the UCF-Sports dataset and tests on the Olympic Sports dataset; the input data undergo spatial action evaluation, comprising human evaluation and motion evaluation, to obtain action scores; action paths are then completed by temporal path generation and linking; and finally action proposal results are obtained. The method can handle human actions in different postures and generate action proposals, provides a greedy search algorithm for action path generation, and improves both the precision and the efficiency of proposal generation.
Description
Technical field
The present invention relates to the field of human motion analysis, and in particular to a human action analysis method based on video data.
Background technology
Video action analysis is an important topic for understanding human activity and has received extensive attention in recent years. A common task in video action analysis is action recognition, whose purpose is to determine which type of action occurs in a video. Compared with action recognition, action detection is a far more difficult task: it requires not only determining the action type but also analyzing semantic information such as where and when the action occurs.
Video action analysis remains a challenging problem today. Because of the complex spatio-temporal relationships involved in the task, the problem can be viewed as two basic steps, namely spatial (frame-level) action evaluation and temporal (video-level) action path generation. On the one hand, the diversity of action categories and the variability of human behavior make it difficult to obtain meaningful and discriminative frame-level action proposals. On the other hand, the search space grows exponentially with the number of candidate action regions per frame and the duration of the video, which makes the analysis difficult.
The present invention proposes a new framework based on spatial action evaluation and temporal action path extraction. Training uses the UCF-Sports dataset and testing uses the Olympic Sports dataset. The input data undergo spatial action evaluation, comprising human evaluation and motion evaluation, to obtain action scores; action paths are then completed by temporal path generation and linking, and finally action proposal results are obtained. The present invention can handle human actions in different postures and generate action proposals, provides a greedy search algorithm for action path generation, and simultaneously improves the precision and efficiency of proposal generation.
Content of the invention
To solve the problem of searching for action proposals in unconstrained video clips, it is an object of the present invention to provide a human action analysis method based on video data, proposing a new framework based on spatial action evaluation and temporal action path extraction.
To solve the above problems, the present invention provides a human action analysis method based on video data, whose main content includes:
(1) data input;
(2) spatial action evaluation;
(3) temporal action path extraction;
(4) action proposal generation.
The data input comprises two parts, training and testing, wherein training uses the UCF-Sports dataset and testing uses the Olympic Sports dataset;
(1) the UCF-Sports dataset contains 10 action classes and 150 short videos, and has been widely used for action localization;
(2) the Olympic Sports dataset contains 16 action classes and 783 videos.
The spatial action evaluation comprises human evaluation, motion evaluation, and action score calculation.
Further, the evaluation metric: a proposal is evaluated by the average IoU between the action proposal Ĝ and the ground truth G, defined as

IoU(Ĝ, G) = (1/|C|) · Σ_{t∈C} o(G_t, Ĝ_t)    (1)

where G_t and Ĝ_t are the ground-truth and detected bounding boxes in frame t respectively, o(·,·) is the IoU value, and C is the set of frames in which neither the detection nor the ground truth is empty. When IoU(Ĝ, G) ≥ η, the action proposal is counted as positive; η is a specified threshold, set to 0.5.
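As an illustration, the average-IoU evaluation above can be sketched in Python; the [x, y, w, h] centre-format boxes follow the text, while the per-frame dict layout is an assumption:

```python
def iou(a, b):
    """IoU of two boxes given as [x, y, w, h] with (x, y) the box centre."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def average_iou(proposal, ground_truth):
    """Mean IoU over the frame set C where both proposal and ground truth exist."""
    frames = [t for t in proposal if t in ground_truth]
    if not frames:
        return 0.0
    return sum(iou(proposal[t], ground_truth[t]) for t in frames) / len(frames)

def is_positive(proposal, ground_truth, eta=0.5):
    """A proposal is positive when its average IoU reaches the threshold eta."""
    return average_iou(proposal, ground_truth) >= eta
```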
Further, the human evaluation: to augment the training data, each training sample is rotated to seven different angles at equal intervals. Let b_t^i denote the bounding box of the i-th action in frame t, expressed as [x, y, w, h], where w and h are the width and height respectively and (x, y) is the centre. After training, the probability that each bounding box in a test video contains a human is assessed by a CNN; by setting a probability threshold, the human proposals with higher probability are kept for subsequent processing.
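The rotation augmentation can be illustrated as follows; the patent does not state the angle range, so the seven angles below are purely hypothetical placeholders:

```python
import math

def rotate_point(x, y, theta, cx=0.0, cy=0.0):
    """Rotate (x, y) about (cx, cy) by angle theta (radians)."""
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(theta) - dy * math.sin(theta),
            cy + dx * math.sin(theta) + dy * math.cos(theta))

def augment_box(box, angles):
    """Return axis-aligned boxes enclosing the rotated corners of `box`.

    `box` is [x, y, w, h] with (x, y) the centre, as in the text.
    """
    x, y, w, h = box
    corners = [(x - w / 2, y - h / 2), (x + w / 2, y - h / 2),
               (x - w / 2, y + h / 2), (x + w / 2, y + h / 2)]
    out = []
    for th in angles:
        pts = [rotate_point(px, py, th, x, y) for px, py in corners]
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        out.append([x, y, max(xs) - min(xs), max(ys) - min(ys)])
    return out

# Hypothetical: seven equally spaced angles (the exact range is not given).
SEVEN_ANGLES = [math.radians(a) for a in (-45, -30, -15, 0, 15, 30, 45)]
```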
Further, the motion evaluation uses motion cues to exclude negative action proposals. A histogram of optical flow (HOF) descriptor describes the motion of each human proposal. Two Gaussian mixture models (GMMs), G_p(·) and G_n(·), are constructed from the HOFs of positive and negative proposals respectively, and predict the probability that a motion pattern belongs to an action. HOFs are computed for bounding boxes whose IoU with the ground truth is above 0.5 (positive samples) and below 0.1 (negative samples). Given a detection b_t^i with HOF h_i, the probability that it is positive is defined as its motion score, predicted with the mixture of the two Gaussian models:

s_m(h_i) = σ(log G_p(h_i) − log G_n(h_i))    (2)

where σ(x) = 1/(1 + e^{−x}) maps the score to the range [0, 1].
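A minimal version of the GMM-based motion score might look like this; the log-likelihood-ratio form inside the sigmoid is a reconstruction, since the text only says the score combines the two mixtures and is mapped to [0, 1] by σ. The diagonal-covariance mixtures are hand-rolled here for self-containment:

```python
import math

def gaussian_pdf(x, mean, var):
    """Product of independent 1-D Gaussians (diagonal covariance)."""
    p = 1.0
    for xi, mi, vi in zip(x, mean, var):
        p *= math.exp(-(xi - mi) ** 2 / (2 * vi)) / math.sqrt(2 * math.pi * vi)
    return p

def mixture_pdf(x, weights, means, variances):
    """Density of a Gaussian mixture at x."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

def motion_score(hof, pos_gmm, neg_gmm, eps=1e-300):
    """sigma(log Gp(h) - log Gn(h)) as a probability of being positive.

    pos_gmm and neg_gmm are (weights, means, variances) triples; eps guards
    against log(0) for HOFs far from both mixtures.
    """
    gp = max(mixture_pdf(hof, *pos_gmm), eps)
    gn = max(mixture_pdf(hof, *neg_gmm), eps)
    x = math.log(gp) - math.log(gn)
    return 1.0 / (1.0 + math.exp(-x))
```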
Further, the action score calculation: the action score S(b_t^i) of a bounding box consists of two parts, the human detection score s_h and the motion score s_m, defined as

S(b_t^i) = s_h(b_t^i) + λ_p · s_m(b_t^i)    (3)

where λ_p is a parameter balancing the human-evaluation and motion-evaluation scores.
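The score combination is then a one-liner; the additive form with weight λ_p is one reading of equation (3), not a confirmed formula:

```python
def action_score(human_score, motion_score, lambda_p=0.5):
    """S(b) = s_h(b) + lambda_p * s_m(b); lambda_p balances the two parts.

    The additive combination is an assumption; the default lambda_p is a
    placeholder, not a value from the patent.
    """
    return human_score + lambda_p * motion_score
```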
The temporal action path extraction comprises action path generation, action path linking, and action path completion. The steps are as follows:
(1) Action path generation
Given the action proposals on each frame, find a set of action paths P = {p_1, p_2, …, p_i}, where p_i is a path starting at its s-th frame and ending at its e-th frame. Finding the action path set P is formulated as a maximum set coverage problem (MSCP); the improved MSCP optimization objective simultaneously maximizes the action scores and limits the similarity between members of the path set P. Formally, the optimization objective is

max_P Σ_{p_i∈P} Σ_{b_t∈p_i} S(b_t)
s.t. |P| ≤ N,  O(p_i, p_j) ≤ η_P, i ≠ j    (4)

where S(b_t) is the action score of bounding box b_t, Φ is the action path candidate set from which P is drawn, W(p_i, p_j) is the similarity between action paths p_i and p_j (defined under action path linking), and η_P is a threshold.
The first constraint in equation (4) bounds the maximum number of paths in P; the second constraint keeps P free of overlapping, redundant action paths. The overlap of two paths is evaluated by O(p_i, p_j), defined as

O(p_i, p_j) = (1/|T_{ij}|) · Σ_{t∈T_{ij}} o(b_t^i, b_t^j)    (5)

where T_{ij} is the set of frames shared by p_i and p_j, and o(b_t^i, b_t^j) is the IoU of the two bounding boxes b_t^i and b_t^j.
To solve the MSCP in equation (4), the action path candidate set Φ must first be obtained. Φ consists of spatio-temporally smooth paths p_i whose consecutive elements b_t and b_{t+1} satisfy the following two requirements:

o(b_t, b_{t+1}) ≥ η_o
‖H_C(b_t) − H_C(b_{t+1})‖ + λ_a · ‖H_G(b_t) − H_G(b_{t+1})‖ ≤ η_f    (6)

where o(·,·) is the IoU, H_C(·) and H_G(·) are the colour histogram (HOC) and histogram of oriented gradients (HOG) of a bounding box, λ_a balances the two appearance terms, and η_o and η_f are thresholds. The first requirement in equation (6) ensures that consecutive bounding boxes b_t and b_{t+1} are spatially continuous; the second ensures that they have similar appearance. A path p_i is therefore likely to follow the same actor.
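A minimal check of these two linking requirements might look like this in Python; the threshold values and histogram layout are placeholders, not values from the patent:

```python
def iou(a, b):
    """IoU of two [x, y, w, h] centre-format boxes."""
    ax1, ay1, ax2, ay2 = a[0] - a[2]/2, a[1] - a[3]/2, a[0] + a[2]/2, a[1] + a[3]/2
    bx1, by1, bx2, by2 = b[0] - b[2]/2, b[1] - b[3]/2, b[0] + b[2]/2, b[1] + b[3]/2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def histogram_distance(h1, h2):
    """Euclidean distance between two feature histograms."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2)) ** 0.5

def can_link(box_t, box_t1, feat_t, feat_t1, eta_o=0.5, eta_f=1.0, lambda_a=0.5):
    """Check the two requirements of (6) for consecutive boxes of a path.

    feat_* are (hoc, hog) pairs of feature vectors.
    """
    if iou(box_t, box_t1) < eta_o:           # spatially continuous
        return False
    hoc_t, hog_t = feat_t
    hoc_t1, hog_t1 = feat_t1
    d = (histogram_distance(hoc_t, hoc_t1)
         + lambda_a * histogram_distance(hog_t, hog_t1))
    return d <= eta_f                         # similar appearance
```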
The algorithm for obtaining Φ has two stages: forward search and backward tracking. The purpose of the former is to locate the end of each path; the purpose of the latter is to recover the whole path. The central idea is to maintain an updated pool of the top-N path candidates, expressed as Φ = {(τ_k, b_k)}, k = 1, 2, …, N, where τ_k is the score of path k, obtained by accumulating the action scores of its boxes, and b_k is the bounding box at the end of path k. During the forward search, the accumulated action score of each box is also recorded. When a box b_t^i in frame t satisfies the two requirements of equation (6) with respect to some b_k, the candidate pool is updated in two steps: for each candidate (τ_k, b_k), k = 1, 2, …, N, if any b_t^i can be linked to b_k, then b_k is replaced by the b_t^i with the maximum accumulated score; and if the accumulated score of a b_t^i is larger than that of the N-th candidate (τ_N, b_N), the N-th candidate is replaced by it. After the forward search, backward tracking recovers each candidate path (τ_k, b_k); more specifically, for a path p_k, the boxes are recovered by tracing back through the accumulated scores.
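The forward-search stage described above can be sketched as follows. This simplified version stores whole paths rather than only end boxes, so the backward-tracking recovery is implicit; the frame/score layout and `linkable` predicate are assumptions:

```python
def forward_search(frames, scores, linkable, N=10):
    """Keep the top-N candidate paths while scanning frames forward.

    frames: list of lists of box ids per frame; scores: box id -> S(b);
    linkable(a, b): whether box a (frame t) can be linked to box b (frame t+1),
    i.e. the two requirements of (6). Returns (accumulated score, path) pairs.
    """
    pool = [(scores[b], [b]) for b in frames[0]]
    pool.sort(key=lambda c: c[0], reverse=True)
    pool = pool[:N]
    for t in range(1, len(frames)):
        candidates = list(pool)  # a path may also end before frame t
        for tau, path in pool:
            for b in frames[t]:
                if linkable(path[-1], b):
                    candidates.append((tau + scores[b], path + [b]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        pool = candidates[:N]
    return pool
```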
(2) Action path linking
After Φ is obtained, the MSCP in equation (4) can be solved; a greedy search algorithm for the maximum set coverage problem achieves a 1 − 1/e approximation ratio. First, the candidate with the maximum action score τ_k in Φ is found and added to the path set P. Assuming P already contains k action paths, the remaining paths in Φ are enumerated to find the one maximizing the objective.
The similarity W(p_i, p_j) between action paths p_i and p_j is defined as

W(p_i, p_j) = 1/(‖C(p_i) − C(p_j)‖ + λ_a‖H(p_i) − H(p_j)‖)    (10)

where C(p) and H(p) denote the cluster centres of the HOC and HOG features of the bounding boxes of path p. The higher W(p_i, p_j) is, the more likely p_i and p_j belong to the same actor. To reduce redundant paths in the set P, a newly added path p_i must satisfy the overlap constraint O(p_i, p_j) ≤ η_P.
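The greedy linking step can be illustrated as follows; N and η_P are placeholders, and `path_score` and `overlap` stand in for the accumulated action score and O(p_i, p_j):

```python
def greedy_link(candidates, path_score, overlap, N=5, eta_p=0.5):
    """Greedy selection for the maximum-set-coverage formulation (4).

    Repeatedly takes the highest-scoring remaining path whose overlap with
    every already-selected path stays below eta_p; greedy selection achieves
    the 1 - 1/e approximation ratio for maximum set coverage.
    """
    remaining = sorted(candidates, key=path_score, reverse=True)
    selected = []
    for p in remaining:
        if len(selected) >= N:
            break
        if all(overlap(p, q) <= eta_p for q in selected):
            selected.append(p)
    return selected
```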
(3) Action path completion
A linear SVM is trained as a frame-level detector. The initial positive set consists of the bounding boxes of the path set P, and the negative set consists of randomly chosen bounding boxes whose IoU with the positive boxes is below 0.3. Given a detected region b_t in frame t and a missed detection in frame t+1, the most likely position is found as follows. First, using the optical flow inside region b_t, b_t is mapped to b′_{t+1}. Second, a search region is built by extending the height and width of b′_{t+1} by half of their original lengths. Third, b′_{t+1} is scanned by a set of windows whose aspect ratio varies in the range [0.8, 1.2] to adapt to possible size changes of the actor. The best region b_{t+1} is selected as the window maximizing

b_{t+1} = argmax_{b∈N(b′_{t+1})} S_f(b)

where N(b′_{t+1}) is the set of windows produced by scanning b′_{t+1} and S_f(·) is the SVM classifier, whose input feature is the combination of HOC and HOG. After b_{t+1} is obtained, the SVM detector is updated by adding b_{t+1} as a positive sample and the bounding boxes whose IoU with b_{t+1} is below 0.3 as negatives.
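The sliding-window completion step might be sketched like this; the search span, step, and scale set are placeholders, and `classifier` stands in for the SVM score S_f on HOC+HOG features (the optical-flow shift is assumed already applied by the caller):

```python
def complete_box(prev_box, classifier, scales=(0.8, 1.0, 1.2), step=2, span=6):
    """Recover a missed detection around the flow-propagated box b'_{t+1}.

    prev_box: [x, y, w, h] centre-format box; scans shifted/rescaled windows
    in a search region around it and returns the highest-scoring window.
    """
    x, y, w, h = prev_box
    best, best_score = prev_box, float("-inf")
    for dx in range(-span, span + 1, step):
        for dy in range(-span, span + 1, step):
            for s in scales:
                cand = [x + dx, y + dy, w, h * s]  # aspect-ratio change
                sc = classifier(cand)
                if sc > best_score:
                    best, best_score = cand, sc
    return best
```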
The action proposal generation: a spatio-temporally continuous track can be regarded as one action, following one actor from appearance to disappearance. For each action, if its duration exceeds a specified threshold, it is emitted as an action proposal, expressed as T.
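The final duration filter amounts to the following; the threshold value is a placeholder, not one specified in the patent:

```python
def generate_proposals(paths, min_duration=15):
    """Keep only linked paths whose duration (frame count) exceeds the threshold."""
    return [p for p in paths if len(p) > min_duration]
```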
Brief description of the drawings
Fig. 1 is the system flow chart of the human action analysis method based on video data of the present invention.
Fig. 2 is a comparison of human detection results for the human action analysis method based on video data of the present invention.
Fig. 3 is an example of action path generation by the human action analysis method based on video data of the present invention.
Fig. 4 shows the action proposal generation results of the human action analysis method based on video data of the present invention on UCF-Sports.
Specific embodiment
It should be noted that, where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another. The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is the system flow chart of the human action analysis method based on video data of the present invention. The method mainly includes data input, spatial action evaluation, temporal action path extraction, and action proposal generation.
Fig. 2 is a comparison of human detection results for the human action analysis method based on video data of the present invention. As shown in the figure, the detection results of the model are more accurate and complete. Boxes 1 and 2 are the ground truth and the detection result respectively. The first and third images are obtained by Fast R-CNN (there is a missed detection in the third); the second and fourth images are the results of the method of the present invention, in which no human action detection is missed.
Fig. 3 is an example of action path generation by the human action analysis method based on video data of the present invention. As shown in the figure, in the first row the first few boxes contain irrelevant actors, while in the second row, using the method of the present invention, the action path of the actor is accurately recorded, showing that the method brings an improvement.
Fig. 4 shows the action proposal generation results of the human action analysis method based on video data of the present invention on UCF-Sports. Boxes 1 and 2 are the ground truth and the action proposal respectively.
For those skilled in the art, the present invention is not restricted to the details of the above embodiments, and may be realized in other specific forms without departing from the spirit or scope of the present invention. Furthermore, those skilled in the art may make various changes and modifications to the present invention without departing from its spirit and scope, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention. Therefore, the appended claims are intended to cover the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Claims (10)
1. A human action analysis method based on video data, characterised by mainly including data input (1); spatial action evaluation (2); temporal action path extraction (3); and action proposal generation (4).
2. The data input (1) according to claim 1, characterised by comprising two parts, training and testing, wherein training uses the UCF-Sports dataset and testing uses the Olympic Sports dataset;
(1) the UCF-Sports dataset contains 10 action classes and 150 short videos, and has been widely used for action localization;
(2) the Olympic Sports dataset contains 16 action classes and 783 videos.
3. The spatial action evaluation (2) according to claim 1, characterised by comprising human evaluation, motion evaluation, and action score calculation.
4. The evaluation according to claim 3, characterised in that a proposal is evaluated by the average IoU between the action proposal Ĝ and the ground truth G, defined as

IoU(Ĝ, G) = (1/|C|) · Σ_{t∈C} o(G_t, Ĝ_t)

where G_t and Ĝ_t are the ground-truth and detected bounding boxes in frame t respectively, o(·,·) is the IoU value, and C is the set of frames in which neither the detection nor the ground truth is empty; when IoU(Ĝ, G) ≥ η, the action proposal is positive; η is a specified threshold, set to 0.5.
5. The human evaluation according to claim 3, characterised in that, to augment the training data, each training sample is rotated to seven different angles at equal intervals; b_t^i denotes the bounding box of the i-th action in frame t, expressed as [x, y, w, h], where w and h are the width and height respectively and (x, y) is the centre; after training, the probability that each bounding box in a test video contains a human is assessed by a CNN; by setting a probability threshold, the human proposals with higher probability are kept for subsequent processing.
6. The motion evaluation according to claim 3, characterised by using motion cues to exclude negative action proposals; a histogram of optical flow (HOF) descriptor describes the motion of each human proposal; two Gaussian mixture models (GMMs), G_p(·) and G_n(·), are constructed from the HOFs of positive and negative proposals respectively, and predict the probability that a motion pattern belongs to an action; HOFs are computed for bounding boxes whose IoU with the ground truth is above 0.5 (positive samples) and below 0.1 (negative samples); given a detection b_t^i with HOF h_i, the probability that it is positive is defined as its motion score, predicted with the mixture of the two Gaussian models, where σ(x) = 1/(1 + e^{−x}) maps the score to the range [0, 1].
7. The action score calculation according to claim 3, characterised in that the action score S(b_t^i) of a bounding box consists of two parts, the human detection score s_h and the motion score s_m, defined as S(b_t^i) = s_h(b_t^i) + λ_p · s_m(b_t^i), where λ_p is a parameter balancing the human-evaluation and motion-evaluation scores.
8. The temporal action path extraction (3) according to claim 1, characterised by comprising action path generation, action path linking, and action path completion, with the following steps:
(1) action path generation
given the action proposals on each frame, find a set of action paths P = {p_1, p_2, …, p_i}, where p_i is a path starting at its s-th frame and ending at its e-th frame; finding the action path set P is formulated as a maximum set coverage problem (MSCP), whose improved optimization objective simultaneously maximizes the action scores and limits the similarity between members of the path set P:

max_P Σ_{p_i∈P} Σ_{b_t∈p_i} S(b_t),  s.t. |P| ≤ N,  O(p_i, p_j) ≤ η_P, i ≠ j    (4)

where W(p_i, p_j) is the similarity between action paths p_i and p_j, defined under action path linking; S(b_t) is the action score of bounding box b_t; Φ is the action path candidate set; and η_P is a threshold;
the first constraint in equation (4) bounds the maximum number of paths in P; the second constraint keeps P free of overlapping, redundant action paths; the overlap of two paths O(p_i, p_j) is the average IoU of their bounding boxes over the frames shared by the two paths;
to solve the MSCP in equation (4), the action path candidate set Φ must first be obtained; Φ consists of spatio-temporally smooth paths p_i whose consecutive elements b_t and b_{t+1} satisfy two requirements: o(b_t, b_{t+1}) ≥ η_o, and ‖H_C(b_t) − H_C(b_{t+1})‖ + λ_a‖H_G(b_t) − H_G(b_{t+1})‖ ≤ η_f, where o(·,·) is the IoU, H_C(·) and H_G(·) are the colour histogram (HOC) and histogram of oriented gradients (HOG), λ_a balances the two appearance terms, and η_o and η_f are thresholds; the first requirement ensures that consecutive bounding boxes are spatially continuous, and the second that they have similar appearance, so that a path is likely to follow the same actor;
the algorithm for obtaining Φ has two stages: forward search and backward tracking; the former locates the end of each path, the latter recovers the whole path; the central idea is to maintain an updated pool of the top-N path candidates Φ = {(τ_k, b_k)}, k = 1, 2, …, N, where τ_k is the score of path k, obtained by accumulating the action scores of its boxes, and b_k is the bounding box at the end of path k; during the forward search the accumulated action score of each box is also recorded; when a box b_t^i in frame t satisfies the two requirements with respect to some b_k, the candidate pool is updated in two steps: if any b_t^i can be linked to b_k, then b_k is replaced by the b_t^i with the maximum accumulated score; and if the accumulated score of a b_t^i is larger than that of the N-th candidate (τ_N, b_N), the N-th candidate is replaced by it; after the forward search, backward tracking recovers each candidate path (τ_k, b_k);
(2) action path linking
after Φ is obtained, the MSCP in equation (4) can be solved; a greedy search algorithm for the maximum set coverage problem achieves a 1 − 1/e approximation ratio; first, the candidate with the maximum action score in Φ is found and added to the path set P; assuming P already contains k action paths, the remaining paths in Φ are enumerated to find the one maximizing the objective;
the similarity W(p_i, p_j) between action paths p_i and p_j is defined as

W(p_i, p_j) = 1/(‖C(p_i) − C(p_j)‖ + λ_a‖H(p_i) − H(p_j)‖)    (10)

where C(p) and H(p) denote the cluster centres of the HOC and HOG features of the bounding boxes of path p; the higher W(p_i, p_j) is, the more likely p_i and p_j belong to the same actor; to reduce redundant paths in the set P, a newly added path p_i must satisfy the overlap constraint O(p_i, p_j) ≤ η_P.
9. The motion path tracking according to claim 8, characterized in that a trained linear SVM serves as the frame-level detector. The initial positive set consists of the bounding boxes of the path set P, and the negative set consists of bounding boxes randomly chosen around the positive ones whose IoU with them is less than 0.3. Given a detected region bt in frame t, the detection may be missed in frame t+1, so the most likely position there is searched as follows. First, bt is mapped to b′t+1 using the mean optical flow inside region bt. Second, the search region is built by extending the height and width of b′t+1 by half of their original lengths. Third, b′t+1 is scanned by a set of windows whose aspect ratio varies within [0.8, 1.2] to adapt to possible size changes of the actor. The best region bt+1 is selected by maximizing the following equation:
bt+1 = argmax{b ∈ N(b′t+1)} Sf(b)
where N(b′t+1) denotes the set of windows produced by scanning b′t+1, and Sf(·) is the SVM classifier, whose input features are the combination of HOC and HOG. After bt+1 is obtained, the SVM detector is updated by adding bt+1 as a positive sample; bounding boxes whose IoU with bt+1 is less than 0.3 are added as negatives.
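The three-step search of claim 9 can be sketched as below. The mean optical flow and the trained SVM scoring function are assumed to be given; the sliding-window step size, the integer box geometry, and all names are illustrative, not the patent's API:

```python
def track_next_frame(b_t, mean_flow, score_fn, step=4):
    """b_t = (x, y, w, h); mean_flow = (dx, dy) averaged inside b_t;
    score_fn scores a window, standing in for the SVM on HOC+HOG features."""
    x, y, w, h = b_t
    # Step 1: shift b_t by the mean optical flow -> b'_{t+1}.
    xs, ys = x + mean_flow[0], y + mean_flow[1]
    # Step 2: search region = b'_{t+1} grown by half its width and height.
    sx, sy, sw, sh = xs - w // 2, ys - h // 2, 2 * w, 2 * h
    # Step 3: sliding-window scan with aspect ratio scaled in [0.8, 1.2];
    # keep the window maximizing the classifier score.
    best, best_score = None, float('-inf')
    for ratio in (0.8, 1.0, 1.2):
        ww, wh = round(w * ratio), h
        for wy in range(sy, sy + sh - wh + 1, step):
            for wx in range(sx, sx + sw - ww + 1, step):
                s = score_fn((wx, wy, ww, wh))
                if s > best_score:
                    best, best_score = (wx, wy, ww, wh), s
    return best
```

In the full method the returned box would then be fed back as a positive training sample, with low-IoU boxes as negatives, to update the SVM online.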
10. The action proposal generation (4) according to claim 1, characterized in that a spatio-temporally continuous track can be regarded as an action, following one actor from its appearance until its disappearance. For each action, if its duration is greater than a specified threshold, the action is retained as an action proposal, expressed as
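The duration filter described above amounts to a simple threshold on track length. A minimal sketch, assuming tracks are given as hypothetical (start_frame, end_frame, boxes) tuples:

```python
def generate_proposals(tracks, min_duration):
    """Keep only tracks whose frame span exceeds the duration threshold."""
    return [t for t in tracks if t[1] - t[0] + 1 > min_duration]
```

The threshold suppresses short, spurious tracks so that only sustained actor trajectories become action proposals.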
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610867148.4A CN106446847A (en) | 2016-09-30 | 2016-09-30 | Human body movement analysis method based on video data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106446847A true CN106446847A (en) | 2017-02-22 |
Family
ID=58173185
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610867148.4A Pending CN106446847A (en) | 2016-09-30 | 2016-09-30 | Human body movement analysis method based on video data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106446847A (en) |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105844258A (en) * | 2016-04-13 | 2016-08-10 | 中国农业大学 | Action identifying method and apparatus |
Non-Patent Citations (2)
Title |
---|
DU TRAN et al.: "Video Event Detection: From Subvolume Localization to Spatiotemporal Path Search", Pattern Analysis and Machine Intelligence * |
NANNAN LI et al.: "Searching Action Proposals via Spatial Actionness Estimation and Temporal Path Inference and Tracking", Computer Vision and Pattern Recognition * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108519823A (en) * | 2018-03-29 | 2018-09-11 | 北京微播视界科技有限公司 | The treating method and apparatus of human-machine interaction data based on terminal |
CN108519823B (en) * | 2018-03-29 | 2021-10-26 | 北京微播视界科技有限公司 | Terminal-based human-computer interaction data processing method and device |
CN109344692A (en) * | 2018-08-10 | 2019-02-15 | 华侨大学 | A kind of motion quality evaluation method and system |
CN109344692B (en) * | 2018-08-10 | 2020-10-30 | 华侨大学 | Motion quality evaluation method and system |
CN111222737A (en) * | 2018-11-27 | 2020-06-02 | 富士施乐株式会社 | Method and system for real-time skill assessment and computer readable medium |
CN111222737B (en) * | 2018-11-27 | 2024-04-05 | 富士胶片商业创新有限公司 | Method and system for real-time skill assessment and computer readable medium |
CN110362715A (en) * | 2019-06-28 | 2019-10-22 | 西安交通大学 | A kind of non-editing video actions timing localization method based on figure convolutional network |
CN110362715B (en) * | 2019-06-28 | 2021-11-19 | 西安交通大学 | Non-clipped video action time sequence positioning method based on graph convolution network |
CN111222476A (en) * | 2020-01-10 | 2020-06-02 | 北京百度网讯科技有限公司 | Video time sequence action detection method and device, electronic equipment and storage medium |
US11600069B2 (en) | 2020-01-10 | 2023-03-07 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for detecting temporal action of video, electronic device and storage medium |
CN112936342A (en) * | 2021-02-02 | 2021-06-11 | 福建天晴数码有限公司 | System and method for evaluating actions of entity robot based on human body posture recognition algorithm |
CN112936342B (en) * | 2021-02-02 | 2023-04-28 | 福建天晴数码有限公司 | Physical robot action evaluation system and method based on human body gesture recognition algorithm |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106446847A (en) | Human body movement analysis method based on video data | |
Punnakkal et al. | BABEL: Bodies, action and behavior with english labels | |
Andriluka et al. | Posetrack: A benchmark for human pose estimation and tracking | |
Yang et al. | SiamAtt: Siamese attention network for visual tracking | |
CN109919122A (en) | Temporal action detection method based on 3D human body key points | |
US20150142716A1 (en) | Tracking player role using non-rigid formation priors | |
Chen et al. | LSTM with bio inspired algorithm for action recognition in sports videos | |
Qin et al. | Semantic loop closure detection based on graph matching in multi-objects scenes | |
Ren et al. | Learning with weak supervision from physics and data-driven constraints | |
CN103336967B (en) | Hand motion trajectory detection method and device | |
Suzuki et al. | Enhancement of gross-motor action recognition for children by CNN with OpenPose | |
Wang et al. | Visual object tracking with multi-scale superpixels and color-feature guided kernelized correlation filters | |
Ren et al. | Adversarial constraint learning for structured prediction | |
Dewi et al. | YOLOv7 for face mask identification based on deep learning | |
Wang et al. | Will you ever become popular? Learning to predict virality of dance clips | |
Akhter | Automated posture analysis of gait event detection via a hierarchical optimization algorithm and pseudo 2D stick-model | |
Xu et al. | Spatio-temporal action detection with multi-object interaction | |
Wang et al. | A fine-grained unsupervised domain adaptation framework for semantic segmentation of remote sensing images | |
Wang et al. | A Dense-aware Cross-splitNet for Object Detection and Recognition | |
CN105224952B (en) | Two-person interaction behavior recognition method based on a max-margin Markov model | |
CN110659576A (en) | Pedestrian search method and device based on joint discrimination and generative learning | |
Zuo et al. | Three-dimensional action recognition for basketball teaching coupled with deep neural network | |
Li et al. | Siamese visual tracking with deep features and robust feature fusion | |
Chen | Image recognition method for pitching fingers of basketball players based on symmetry algorithm | |
Vo et al. | VQASTO: Visual question answering system for action surveillance based on task ontology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication ||
Application publication date: 2017-02-22 |