CN115050055A - Human body skeleton sequence construction method based on Kalman filtering - Google Patents
Publication number: CN115050055A
Application number: CN202210788077.4A
Authority: CN (China)
Prior art keywords: frame, skeleton, Kalman filtering, processing queue, human body
Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
- G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
The invention discloses a human body skeleton sequence construction method based on Kalman filtering, comprising the following steps: S1, performing pose estimation and normalizing the joint point features; S2, numbering all skeletons in the first video frame whose features are not all invalid values, and adding them to a processing queue; S3, inputting all numbered skeleton sequences in the processing queue, together with the skeleton set, into a Kalman filtering module for constructing human skeleton sequences, and updating the processing queue according to the processing result; and S4, repeating S3 for each video frame until all frames are processed, at which point each numbered skeleton sequence in the processing queue is a constructed human skeleton sequence. Compared with conventional Kalman filtering, the newly defined decision module further processes the observations produced by the pose estimation method, which gives the Kalman filtering module the ability to track human motion; the Kalman filtering algorithm also corrects the feature extraction errors caused by false detections and missed detections of the pose estimation method.
Description
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a human skeleton sequence construction method based on Kalman filtering.
Background
Video-based behavior recognition, the task of recognizing human actions in video, is one of the representative tasks in video understanding. Current deep-learning-based video behavior recognition methods include two-stream networks, three-dimensional convolutional neural networks (3D CNNs), and other non-end-to-end methods.
The spatial-temporal graph convolutional network (STGCN) and P-CNN (Pose-based CNN) introduced a new non-end-to-end approach to behavior recognition. These methods extract joint points frame by frame with an advanced pose estimation method, group the joint points into skeletons that serve as the network input, and extract and fuse the joint features to recognize the behavior in the video.
Because pose information is closely related to human behavior, these methods perform well on behavior recognition that does not depend on background information. However, mainstream human pose estimation methods such as OpenPose do not associate skeletons across frames, and some methods can only analyze the behavior of a single person. These methods also depend heavily on the joint point information extracted by the pose estimation method and are quite sensitive to the false and missed detections caused by disturbances in the video frames. Moreover, when several people appear in a video at the same time, the pose estimation method numbers the skeletons randomly in each frame, which leads to incorrect behavior recognition results.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a human body skeleton sequence construction method based on Kalman filtering. The Kalman filtering module predicts features, compares them with the output of the pose estimation method for the current frame to decide on the observed value, and uses the decision result to update a processing queue, thereby obtaining the human skeleton sequences. This corrects the feature extraction errors caused by false and missed detections of the pose estimation method.
The purpose of the invention is achieved by the following technical scheme: a human body skeleton sequence construction method based on Kalman filtering, comprising the following steps:
S1, performing pose estimation frame by frame on a video containing human behavior information, and normalizing all joint point features to obtain a skeleton set containing the joint point feature information;
S2, numbering all skeletons in the first video frame whose features are not all invalid values, and adding them to a processing queue;
S3, inputting all numbered skeleton sequences in the processing queue, together with the skeleton set, into a Kalman filtering module for constructing human skeleton sequences, and updating the processing queue according to the processing result, the unit of updating being one frame;
S4, repeating S3 for each video frame until all frames are processed, at which point each numbered skeleton sequence in the processing queue is a constructed human skeleton sequence.
Further, in step S3, the Kalman filtering module for constructing the human skeleton sequence comprises a prediction module, a decision module, and an update module connected in series.
The prediction module is the prediction part of the Kalman filtering algorithm. Its input is all numbered skeletons in the processing queue, and its output is:
$$\hat{x}^{-}_{t,v,m} = A\,\hat{x}_{t-1,v,m}$$
$$P^{-}_{t,v,m} = A\,P_{t-1,v,m}\,A^{\mathrm{T}} + Q$$
where $\hat{x}^{-}_{t,v,m}$ is the prior state estimate at time t, predicted by the motion model from the posterior state estimate $\hat{x}_{t-1,v,m}$ at time t-1, with $t \in T$ and T the total number of frames; A is the state transition matrix; $P^{-}_{t,v,m}$ is the prior state estimate covariance at time t, calculated from the posterior state estimate covariance $P_{t-1,v,m}$ at time t-1; Q is the process covariance matrix; and the superscript $\mathrm{T}$ denotes transposition.
The state is $\hat{x}_{t,v,m} = [x, y, v_x, v_y]^{\mathrm{T}}$. When t = 0, x and y are defined as $Z_{x,t,v,n}$ and $Z_{y,t,v,n}$, the normalized joint point features of joint v of the skeleton with serial number n at frame t, where n = m at initialization, and $v_x$, $v_y$ are 0. When t ≠ 0, $\hat{x}_{t,v,m}$ is the iteration result of the Kalman filtering module and represents the posterior state estimate of joint v of skeleton number m at frame t.
When t = 0, $P_{t,v,m}$ is defined as the identity matrix; when t ≠ 0, $P_{t,v,m}$ is the iteration result of the Kalman filtering module and represents the posterior state estimate covariance of joint v of skeleton number m at frame t.
The state transition matrix A is defined as:
$$A = \begin{bmatrix} 1 & 0 & dt & 0 \\ 0 & 1 & 0 & dt \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
where dt is the video frame time interval.
The process covariance matrix Q is defined in terms of two parameters: the coordinate variance of the human joints and the variance parameter of the coordinate movement velocity.
The decision module comprises the following steps:
S31, computing the matching degree between the processing queue and all skeletons detected in the current frame to obtain the Mahalanobis distance matrix $D_t$. Specifically, a candidate set is computed for each skeleton in the processing queue, whose elements are the successfully matched skeletons of the current frame. Using the prediction module outputs $\hat{x}^{-}_{t,v,m}$ and $P^{-}_{t,v,m}$, the Mahalanobis distance $D_{t,v,nm}$ is computed between the feature $Z_{t,v,n}$ of each joint point v of every skeleton n in the current frame t and the prior estimate $\hat{x}^{-}_{t,v,m}$ of the corresponding joint of each skeleton m in the processing queue.
The Mahalanobis distance matrix used to measure the matching degree is:
$$D_{t,nm} = \min_{v \in V} D_{t,v,nm}$$
where H, the observation matrix that maps the state to the observed coordinates, is:
$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$
S32, obtaining the optimal matching of the matrix $D_t$ with the Hungarian algorithm and applying a matching threshold α. If $D_{t,nm} \le \alpha$, the matching succeeds and the current-frame observed value $C^{-}_{t,v,m}$ is set to the joint point feature $Z_{t,v,n}$; if $D_{t,nm} > \alpha$, the matching fails and the observed value is set to the predicted value $H\hat{x}^{-}_{t,v,m}$.
S33, adding the skeletons left unmatched in S32, either because $D_{t,nm} > \alpha$ or because N > M, to the processing queue with new numbers.
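The update module calculates the Kalman gain $K_{t,v,m}$, the posterior estimate $\hat{x}_{t,v,m}$ of the joint point features, and the posterior estimate covariance $P_{t,v,m}$:
$$K_{t,v,m} = P^{-}_{t,v,m} H^{\mathrm{T}} \left( H P^{-}_{t,v,m} H^{\mathrm{T}} + R \right)^{-1}$$
$$\hat{x}_{t,v,m} = \hat{x}^{-}_{t,v,m} + K_{t,v,m} \left( C^{-}_{t,v,m} - H \hat{x}^{-}_{t,v,m} \right)$$
$$P_{t,v,m} = \left( I - K_{t,v,m} H \right) P^{-}_{t,v,m}$$
where $\hat{x}^{-}_{t,v,m}$, $P^{-}_{t,v,m}$, and $C^{-}_{t,v,m}$ are the outputs of the prediction module and the decision module, and R is the observation noise covariance matrix.
Finally, $C_{t,v,m}$ is updated:
$$C_{t,v,m} = H \hat{x}_{t,v,m}$$
$C_{t,v,m}$ is an element of the set $C_{v,m}$ and represents the features of joint v in skeleton number m at frame t.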
The beneficial effects of the invention are as follows. The invention initializes the processing queue from the output of the pose estimation method and then, frame by frame, predicts features with the Kalman filtering module, compares them with the output of the pose estimation method for the current frame to decide on the observed value, and updates the processing queue with the decision result, thereby obtaining the human skeleton sequences. The method has the following advantages:
1. compared with conventional Kalman filtering, the newly defined decision module further processes the observations of the pose estimation method, which gives the Kalman filtering module the ability to track human motion;
2. the Kalman filtering algorithm corrects the feature extraction errors caused by false and missed detections of the pose estimation method;
3. for video behavior recognition networks such as STGCN, which need the motion information of one and the same skeleton, the method solves the problem that the network cannot be used directly in multi-person scenes because the skeleton serial numbers assigned by the pose estimation method are random.
Drawings
FIG. 1 is a block diagram of the algorithm flow of the present invention;
FIG. 2 is a block diagram of a Kalman filtering module for constructing a human skeleton sequence according to the present invention;
FIG. 3 is a comparison of the construction result with the original data when frames are missed in skeleton detection;
FIG. 4 is a comparison of the construction result with the original data when false detections occur in skeleton detection;
FIG. 5 is a comparison of the construction result with the original data when the skeleton serial numbers in a multi-person scene are frequently swapped;
FIG. 6 is a comparison of the construction result with the original data when simulated multi-person trajectories partially overlap;
FIG. 7 is a comparison of the original data with the recognition result of the skeleton sequence constructed according to the present invention on STGCN.
Detailed Description
The invention provides a human body skeleton sequence construction method based on Kalman filtering. Kalman filtering is used to predict the probability distribution of the target joint points over time, and a probability threshold is set to match the best-fitting skeleton so that outliers and missing values are handled, instead of feeding the pose estimation skeletons directly into the recognition network; this provides a more stable skeleton sequence. The technical scheme of the invention is further explained below with reference to the drawings.
As shown in FIG. 1, the human skeleton sequence construction method based on Kalman filtering of the invention comprises the following steps:
S1, performing pose estimation frame by frame on a video containing human behavior information, and normalizing all joint point features to obtain a skeleton set containing the joint point feature information;
The skeleton set is defined as $Z_{F,T,V,N}$, where $F = \{f = x, f = y\}$ is the set of joint point features, i.e., the two-dimensional image coordinates; T is the total number of frames; V is the total number of joints (with the human pose estimation method OpenPose, the valid joint numbers are $V = [v_0, v_1, \ldots, v_{24}]$); and N is the total number of skeleton serial numbers. Serial numbers are assigned arbitrarily within each frame according to the number of skeletons identified, and the numbering in different frames is unrelated.
The normalization is:
$$Z_{x,t,v,n} = \frac{X_{t,v,n} - V_{x\min}}{V_{x\max} - V_{x\min}}, \qquad Z_{y,t,v,n} = \frac{Y_{t,v,n} - V_{y\min}}{V_{y\max} - V_{y\min}}$$
where $Z_{x,t,v,n}$ and $Z_{y,t,v,n}$ are the normalized joint point features of joint v of skeleton n at frame t; $X_{t,v,n}$ and $Y_{t,v,n}$ are the pose estimation results for joint v of skeleton n at frame t; $V_{x\max}$ and $V_{y\max}$ are the maximum valid values of the corresponding features, typically the image width and height; and $V_{x\min}$ and $V_{y\min}$ are the minimum valid values of the corresponding features, typically zero.
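As a minimal sketch of the normalization in S1, assuming plain min-max scaling with the image width and height as the maximum valid values and zero as the minimum (the function name and arguments are illustrative, not taken from the patent):

```python
def normalize_joint(x, y, width, height, x_min=0.0, y_min=0.0):
    """Min-max normalize one joint's pixel coordinates into [0, 1].

    width and height play the role of V_xmax and V_ymax;
    x_min and y_min play the role of V_xmin and V_ymin.
    """
    z_x = (x - x_min) / (width - x_min)
    z_y = (y - y_min) / (height - y_min)
    return z_x, z_y


# Example: a joint at pixel (320, 180) in a 640x360 frame.
print(normalize_joint(320, 180, 640, 360))  # (0.5, 0.5)
```

Normalizing to the unit square keeps the filter parameters independent of the video resolution.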
S2, numbering all skeletons in the first video frame whose features are not all invalid values, and adding them to the processing queue. The skeletons in the processing queue are defined as $C_{V,M}$, where M is the total number of skeletons and V is the total number of joints; $C_{v,m}$ denotes the set of all features, over the video, of joint v in skeleton number m, with m = 1, 2, ..., M.
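The initialization in S2 can be sketched as follows. This is an illustration only: it assumes that an invalid feature is the value 0 (the value reported for undetected joints later in the description), that numbering starts at 1, and that the queue is a dictionary keyed by skeleton number; none of these representation choices is stated in the patent.

```python
def init_processing_queue(first_frame_skeletons, invalid=0.0):
    """Number every first-frame skeleton that has at least one valid joint.

    first_frame_skeletons: list of skeletons, each a list of (x, y) features.
    Returns a dict mapping skeleton number m to its list of per-frame skeletons.
    """
    queue = {}
    for skeleton in first_frame_skeletons:
        all_invalid = all(x == invalid and y == invalid for x, y in skeleton)
        if not all_invalid:
            # Assumed convention: numbers run 1, 2, ..., M in detection order.
            queue[len(queue) + 1] = [skeleton]
    return queue


detections = [
    [(0.5, 0.5), (0.0, 0.0)],   # one valid joint: kept
    [(0.0, 0.0), (0.0, 0.0)],   # all joints invalid: dropped
    [(0.1, 0.2), (0.3, 0.4)],   # kept
]
print(sorted(init_processing_queue(detections)))  # [1, 2]
```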
S3, inputting all numbered skeleton sequences and skeleton sets in the processing queue into a Kalman filtering module for constructing human skeleton sequences, and updating the processing queue according to the processing result, wherein the updating unit is a frame;
The Kalman filtering module for constructing the human skeleton sequence comprises a prediction module, a decision module, and an update module connected in series, as shown in Fig. 2.
The prediction module is the prediction part of the Kalman filtering algorithm. Its input is all numbered skeletons in the processing queue, and its output is:
$$\hat{x}^{-}_{t,v,m} = A\,\hat{x}_{t-1,v,m}$$
$$P^{-}_{t,v,m} = A\,P_{t-1,v,m}\,A^{\mathrm{T}} + Q$$
where $\hat{x}^{-}_{t,v,m}$ is the prior state estimate at time t, predicted by the motion model from the posterior state estimate $\hat{x}_{t-1,v,m}$ at time t-1, with $t \in T$ and T the total number of frames; A is the state transition matrix; $P^{-}_{t,v,m}$ is the prior state estimate covariance at time t, calculated from the posterior state estimate covariance $P_{t-1,v,m}$ at time t-1; Q is the process covariance matrix; and the superscript $\mathrm{T}$ denotes transposition.
The state is $\hat{x}_{t,v,m} = [x, y, v_x, v_y]^{\mathrm{T}}$. When t = 0, x and y are defined as $Z_{x,t,v,n}$ and $Z_{y,t,v,n}$, where n is the skeleton serial number and n = m at initialization, and $v_x$, $v_y$ are 0. When t ≠ 0, $\hat{x}_{t,v,m}$ is the iteration result of the Kalman filtering module and represents the posterior state estimate of joint v of skeleton number m at frame t.
When t = 0, $P_{t,v,m}$ is defined as the identity matrix; when t ≠ 0, $P_{t,v,m}$ is the iteration result of the Kalman filtering module and represents the posterior state estimate covariance of joint v of skeleton number m at frame t.
The state transition matrix A is defined as:
$$A = \begin{bmatrix} 1 & 0 & dt & 0 \\ 0 & 1 & 0 & dt \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
where dt is the video frame time interval.
The process covariance matrix Q is defined in terms of two parameters: the coordinate variance of the human joints and the variance parameter of the coordinate movement velocity.
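The prediction step can be illustrated with a short pure-Python sketch for a single joint. It assumes a constant-velocity transition matrix for the state [x, y, v_x, v_y] and a diagonal Q built from a position variance and a velocity variance; the diagonal form of Q and the names `var_pos` and `var_vel` are assumptions made for illustration, not values taken from the patent.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def predict(x_post, p_post, dt, var_pos, var_vel):
    """Kalman prediction for one joint with state [x, y, v_x, v_y]."""
    a = [[1, 0, dt, 0],
         [0, 1, 0, dt],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]
    # Assumed diagonal process covariance.
    q = [[var_pos, 0, 0, 0],
         [0, var_pos, 0, 0],
         [0, 0, var_vel, 0],
         [0, 0, 0, var_vel]]
    # Prior state: x^- = A x
    x_prior = [row[0] for row in matmul(a, [[v] for v in x_post])]
    # Prior covariance: P^- = A P A^T + Q
    apa = matmul(matmul(a, p_post), transpose(a))
    p_prior = [[apa[i][j] + q[i][j] for j in range(4)] for i in range(4)]
    return x_prior, p_prior


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
x_prior, p_prior = predict([0.5, 0.5, 0.1, -0.2], identity, 1.0, 0.01, 0.02)
print(x_prior)
```

With the identity matrix as the initial covariance, as the text specifies for t = 0, the first prediction simply moves each joint along its current velocity and inflates the uncertainty.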
The decision module comprises the following steps:
S31, computing the matching degree between the processing queue (the numbered skeleton sequences) and all skeletons detected in the current frame to obtain the Mahalanobis distance matrix $D_t$. Specifically, a candidate set is computed for each skeleton in the processing queue, whose elements are the successfully matched skeletons of the current frame. Using the prediction module outputs $\hat{x}^{-}_{t,v,m}$ and $P^{-}_{t,v,m}$, the Mahalanobis distance $D_{t,v,nm}$ is computed between the feature $Z_{t,v,n}$ of each joint point v of every skeleton n in the current frame t and the prior estimate $\hat{x}^{-}_{t,v,m}$ of the corresponding joint of each skeleton m in the processing queue.
The Mahalanobis distance matrix used to measure the matching degree is:
$$D_{t,nm} = \min_{v \in V} D_{t,v,nm}$$
where H, the observation matrix that maps the state to the observed coordinates, is:
$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$
S32, matching the numbered skeleton sequences to the skeletons of the current frame, which is essentially an assignment problem. The Hungarian algorithm is used to obtain the optimal matching of the matrix $D_t$, and a matching threshold α is applied. If $D_{t,nm} \le \alpha$, the matching succeeds and the current-frame observed value $C^{-}_{t,v,m}$ is set to the joint point feature $Z_{t,v,n}$; if $D_{t,nm} > \alpha$, the matching fails and the observed value is set to the predicted value $H\hat{x}^{-}_{t,v,m}$.
S33, adding the skeletons left unmatched in S32, either because $D_{t,nm} > \alpha$ or because N > M, to the processing queue with new numbers.
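The decision steps S31 to S33 can be sketched as below. Several points are assumptions made for illustration: the per-joint distance is taken as the Mahalanobis distance under the projected prior covariance $H P^{-} H^{\mathrm{T}}$ (the upper-left 2x2 block of $P^{-}$), without an observation noise term; and the optimal assignment is found by exhaustive search over permutations, a simple stand-in for the Hungarian algorithm that gives the same result but is practical only for a handful of skeletons.

```python
import itertools
import math


def mahalanobis(z, x_prior, p_prior):
    """Distance between observation z = (z_x, z_y) and a prior joint state."""
    dx, dy = z[0] - x_prior[0], z[1] - x_prior[1]
    a, b = p_prior[0][0], p_prior[0][1]
    c, d = p_prior[1][0], p_prior[1][1]
    det = a * d - b * c
    # (dx, dy) S^{-1} (dx, dy)^T with S^{-1} = [[d, -b], [-c, a]] / det
    return math.sqrt((dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det)


def skeleton_distance(detected, priors):
    """D_{t,nm}: minimum per-joint distance between a detection and a queue entry."""
    return min(mahalanobis(z, x, p) for z, (x, p) in zip(detected, priors))


def match(dist, alpha):
    """dist[n][m] -> ({n: m} for matches within alpha, list of unmatched n)."""
    n_det, n_trk = len(dist), len(dist[0])
    best, best_cost = {}, math.inf
    if n_det <= n_trk:
        for perm in itertools.permutations(range(n_trk), n_det):
            cost = sum(dist[n][m] for n, m in enumerate(perm))
            if cost < best_cost:
                best, best_cost = dict(enumerate(perm)), cost
    else:
        for perm in itertools.permutations(range(n_det), n_trk):
            cost = sum(dist[n][m] for m, n in enumerate(perm))
            if cost < best_cost:
                best, best_cost = {n: m for m, n in enumerate(perm)}, cost
    # S32: keep only assignments within the threshold.
    matches = {n: m for n, m in best.items() if dist[n][m] <= alpha}
    # S33: everything else would receive a new number in the queue.
    unmatched = [n for n in range(n_det) if n not in matches]
    return matches, unmatched


dist = [[0.2, 3.0], [2.5, 0.4], [5.0, 6.0]]   # 3 detections, 2 queue entries
print(match(dist, alpha=1.0))  # ({0: 0, 1: 1}, [2])
```

For queue entries that receive no match, the observed value falls back to the prediction, so the subsequent update leaves the predicted state unchanged.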
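The update module calculates the Kalman gain $K_{t,v,m}$, the posterior estimate $\hat{x}_{t,v,m}$ of the joint point features, and the posterior estimate covariance $P_{t,v,m}$:
$$K_{t,v,m} = P^{-}_{t,v,m} H^{\mathrm{T}} \left( H P^{-}_{t,v,m} H^{\mathrm{T}} + R \right)^{-1}$$
$$\hat{x}_{t,v,m} = \hat{x}^{-}_{t,v,m} + K_{t,v,m} \left( C^{-}_{t,v,m} - H \hat{x}^{-}_{t,v,m} \right)$$
$$P_{t,v,m} = \left( I - K_{t,v,m} H \right) P^{-}_{t,v,m}$$
where $\hat{x}^{-}_{t,v,m}$, $P^{-}_{t,v,m}$, and $C^{-}_{t,v,m}$ are the outputs of the prediction module and the decision module, and R is the observation noise covariance matrix.
Finally, $C_{t,v,m}$ is updated:
$$C_{t,v,m} = H \hat{x}_{t,v,m}$$
$C_{t,v,m}$ is an element of the set $C_{v,m}$ and represents the features of joint v in skeleton number m at frame t.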
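A sketch of the update step for one joint follows, using the standard Kalman update equations. It assumes that the observation matrix selects the position components of the state and that R is a scalar variance times the 2x2 identity; the parameter name `r_var` and that form of R are illustrative assumptions.

```python
def update(x_prior, p_prior, c_obs, r_var):
    """Kalman update for one joint; returns (x_post, p_post, c_new)."""
    # S = H P^- H^T + R, a 2x2 matrix (R assumed to be r_var * I).
    s = [[p_prior[0][0] + r_var, p_prior[0][1]],
         [p_prior[1][0], p_prior[1][1] + r_var]]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    s_inv = [[s[1][1] / det, -s[0][1] / det],
             [-s[1][0] / det, s[0][0] / det]]
    # K = P^- H^T S^{-1}; P^- H^T is the first two columns of P^-.
    k = [[p_prior[i][0] * s_inv[0][j] + p_prior[i][1] * s_inv[1][j]
          for j in range(2)] for i in range(4)]
    # Innovation: observed value minus predicted observation.
    innov = [c_obs[0] - x_prior[0], c_obs[1] - x_prior[1]]
    x_post = [x_prior[i] + k[i][0] * innov[0] + k[i][1] * innov[1]
              for i in range(4)]
    # P = (I - K H) P^-; K H only touches the first two rows of P^-.
    p_post = [[p_prior[i][j] - k[i][0] * p_prior[0][j] - k[i][1] * p_prior[1][j]
               for j in range(4)] for i in range(4)]
    c_new = (x_post[0], x_post[1])  # C = H x_post
    return x_post, p_post, c_new


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
x_post, p_post, c_new = update([0.0, 0.0, 0.0, 0.0], identity, (1.0, 2.0), 1.0)
print(c_new)  # (0.5, 1.0)
```

When the decision module substitutes the predicted value for a failed match, the innovation is zero and the posterior state equals the prior state, which is how missed or false detections are bridged.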
And S4, repeatedly executing S3 on each frame of video until all frames are processed, wherein each numbered skeleton sequence in the processing queue is the constructed human skeleton sequence.
This embodiment was verified experimentally with Python 3.8 on the Windows operating system, using the Kinetics-700 data set. Representative videos exhibiting detection errors, missed frames, and multi-person scenes were selected as material for constructing skeleton sequences. The common parameters used in the verification are listed in Table 1.
TABLE 1 Common parameters
Each video is input to the OpenPose human pose estimation method; the output is parsed, normalized, and then input to the Kalman filtering module for constructing human skeleton sequences to obtain the constructed skeleton sequence.
Fig. 3 compares the neck joint in the pose estimation result and in the constructed skeleton sequence when pose estimation misses a frame. Fig. 3(a) is the image drawn from the neck joint point features in the OpenPose output for frame 2 of the data, in which no human skeleton is detected; Fig. 3(b) compares, per frame, the normalized pose estimation result (marked with asterisks) with the constructed skeleton sequence (dotted line) on the x axis; and Fig. 3(c) is the corresponding comparison on the y axis. In frame 2, because the scene is complex, OpenPose does not detect the human skeleton, i.e., the x and y feature coordinates are 0. The Kalman filtering module filters out this outlier and updates the skeleton sequence with the prior estimate in its place.
Fig. 4 compares the neck joint in the pose estimation result and in the constructed skeleton sequence when pose estimation produces a false detection. Fig. 4(a) is the image drawn from the neck joint point features in the OpenPose output for frame 4 of the data; Fig. 4(b) compares, per frame, the normalized pose estimation result (marked with asterisks) with the constructed skeleton sequence (dotted line) on the x axis; and Fig. 4(c) is the corresponding comparison on the y axis. In frames 4 and 5, background interference causes OpenPose to detect the joint incorrectly, so the x and y coordinates are offset substantially; the Kalman filtering module corrects the x and y values.
Fig. 5 compares the constructed skeleton sequence with the pose estimation result when the skeleton numbers produced by pose estimation switch repeatedly. Fig. 5(a) is an image drawn from the OpenPose output. Fig. 5(b) shows the neck joint point features of the two skeletons in the OpenPose output; the horizontal axis is the normalized x feature and the vertical axis is the normalized y feature. The asterisks mark the features $Z_{f,t,1,0}$ of the neck joint point of the skeleton numbered 0 in the pose estimation output, and the plus signs mark the features $Z_{f,t,1,1}$ of the skeleton numbered 1. The solid and dotted lines represent the neck joint points $C_{t,1,0}$ and $C_{t,1,1}$ of the skeleton sequences numbered 0 and 1 after construction; the subscript v = 1 denotes the neck joint point. Fig. 5(c) and 5(d) compare, by frame number, the construction results for the x feature and the y feature of Fig. 5(b), respectively.
Fig. 5(b) shows that the two detected human skeletons lie in the upper-left and lower-right regions. Because OpenPose processes each frame separately and the detection results of different frames are independent, the skeleton numbers are not fixed, and the numbers of the skeletons in the upper-left and lower-right regions are frequently swapped. The results show that, after passing through the Kalman filtering module for constructing skeleton sequences, the feature changes of both skeletons within and across frames are tracked effectively.
Fig. 6 compares the data before and after skeleton sequence construction when simulated multi-person trajectories partially overlap. A program generates two paths, from (0.35, 0.47) to (0.39, 0.32) and from (0.5, 0.37) to (0.31, 0.46), samples them, and adds Gaussian noise distributed as N(0, 0.0001). The resulting discrete points simulate joint coordinates and hence joint point motion, and each point is randomly assigned skeleton number 0 or 1. Fig. 6(a) shows the generated discrete point coordinates. Fig. 6(b) and 6(c) compare, by frame number, the construction results for the x feature and the y feature of Fig. 6(a), respectively.
It can be seen that in a multi-person scene, provided that correct joint point features are detected, the method still captures the inter-frame joint information and can distinguish different skeletons even when the joint points are close together and their trajectories overlap.
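The simulated data for Fig. 6 could be generated along the following lines. This is a sketch under stated assumptions: the paths are taken to be straight lines sampled uniformly, 0.0001 is read as the noise variance (standard deviation 0.01), and the frame count and seed are arbitrary; the patent does not specify these details.

```python
import random


def simulate_paths(num_frames=50, noise_var=0.0001, seed=0):
    """Return, per frame, two noisy (x, y) points in random skeleton order."""
    rng = random.Random(seed)
    paths = [((0.35, 0.47), (0.39, 0.32)),
             ((0.5, 0.37), (0.31, 0.46))]
    sigma = noise_var ** 0.5
    frames = []
    for t in range(num_frames):
        s = t / (num_frames - 1)          # position along the path in [0, 1]
        points = []
        for (x0, y0), (x1, y1) in paths:
            x = x0 + s * (x1 - x0) + rng.gauss(0.0, sigma)
            y = y0 + s * (y1 - y0) + rng.gauss(0.0, sigma)
            points.append((x, y))
        if rng.random() < 0.5:
            points.reverse()              # randomly swap skeleton numbers 0 and 1
        frames.append(points)
    return frames


frames = simulate_paths()
print(len(frames), frames[0])
```

In each frame, index 0 and index 1 of the returned list play the role of the randomly assigned skeleton numbers that the construction method must untangle.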
All constructed skeleton sequences and the original data are input into STGCN, using the officially provided model weights st_gcn.kinetics.pt. The inference results of some of the videos on STGCN are shown in Table 2.
Sequence number: the video sequence number. All videos come from the abseiling label in Kinetics-700. In video 1 the camera is stable and the person's behavior is clear. Video 2 is the video shown in Fig. 3; its pose estimation result is unstable, with missed and false detections. Video 3 is the video shown in Fig. 5 and exhibits the multi-person skeleton number switching problem. 1', 2', 3' denote the STGCN inference results for the skeleton sequences constructed by the method of the invention from videos 1, 2, 3;
Frame number: the total number of recognized video frames;
Output frame number: the number of frames of the STGCN output features, which the convolutions of the multi-layer TCN modules compress to a fraction of the original frame number T;
abseiling: the number of output frames recognized as the abseiling behavior;
water skiing: the number of output frames recognized as the water skiing behavior, which is easily confused with abseiling;
Number of other behaviors: the number of distinct other behaviors recognized among the output frames;
Other behavior frame number: the number of output frames recognized as other behaviors;
Voting result: the behavior with the largest number of output frames, taken as the main behavior in the video.
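The voting rule can be sketched in a few lines. The per-frame label list below is made up for illustration and does not reproduce any value from Table 2.

```python
from collections import Counter


def vote(frame_labels):
    """Return the most frequent per-frame label and the full label counts."""
    counts = Counter(frame_labels)
    label, _ = counts.most_common(1)[0]
    return label, counts


# Hypothetical per-frame predictions for one video.
labels = ["abseiling"] * 5 + ["water skiing"] * 3 + ["other"]
winner, counts = vote(labels)
print(winner)  # abseiling
```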
TABLE 2 comparison of inference results of partial videos on STGCN
When the background is ignored, the features of abseiling and water skiing are similar, so both recognition results can be considered correct in this case. The comparison between the recognition results for the original data and for the human skeletons constructed by the invention, derived from Table 2, is shown in Fig. 7; in each pair of adjacent bars, the left bar is the data before processing and the right bar is the data after processing. The comparison shows that the skeleton sequences constructed by the method improve recognition accuracy to varying degrees compared with recognizing the OpenPose pose estimation results directly. The constructed skeleton sequences also reduce the adverse effect of false and missed detections by the pose estimation method on the recognition result, i.e., the number of misrecognized behaviors decreases. For STGCN, which needs to extract temporal information, the problem of random skeleton numbering in the pose estimation output is effectively solved.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.
Claims (5)
1. A human body skeleton sequence construction method based on Kalman filtering is characterized by comprising the following steps:
S1, performing pose estimation frame by frame on a video containing human behavior information, and normalizing all joint point features to obtain a skeleton set containing the joint point feature information;
S2, numbering all skeletons in the first video frame whose features are not all invalid values, and adding them to a processing queue;
S3, inputting all numbered skeleton sequences in the processing queue, together with the skeleton set, into a Kalman filtering module for constructing human skeleton sequences, and updating the processing queue according to the processing result, the unit of updating being one frame;
S4, repeating S3 for each video frame until all frames are processed, at which point each numbered skeleton sequence in the processing queue is a constructed human skeleton sequence.
2. The method for constructing the human body skeleton sequence based on the kalman filter according to claim 1, wherein in the step S3, the kalman filter module for constructing the human body skeleton sequence includes a prediction module, a decision module and an update module connected in series.
3. The Kalman filtering-based human body skeleton sequence construction method according to claim 2, characterized in that the prediction module is the prediction part of the Kalman filtering algorithm, its input is all numbered skeletons in the processing queue, and its output is:
$$\hat{x}^{-}_{t,v,m} = A\,\hat{x}_{t-1,v,m}$$
$$P^{-}_{t,v,m} = A\,P_{t-1,v,m}\,A^{\mathrm{T}} + Q$$
wherein $\hat{x}^{-}_{t,v,m}$ is the prior state estimate at time t, predicted by the motion model from the posterior state estimate $\hat{x}_{t-1,v,m}$ at time t-1, with $t \in T$ and T the total number of frames; A is the state transition matrix; $P^{-}_{t,v,m}$ is the prior state estimate covariance at time t, calculated from the posterior state estimate covariance $P_{t-1,v,m}$ at time t-1; Q is the process covariance matrix; and the superscript $\mathrm{T}$ denotes transposition;
the state is $\hat{x}_{t,v,m} = [x, y, v_x, v_y]^{\mathrm{T}}$; when t = 0, x and y are defined as $Z_{x,t,v,n}$ and $Z_{y,t,v,n}$, the normalized joint point features of joint v of the skeleton with serial number n at frame t, where n = m at initialization, and $v_x$, $v_y$ are 0; when t ≠ 0, $\hat{x}_{t,v,m}$ is the iteration result of the Kalman filtering module and represents the posterior state estimate of joint v of skeleton number m at frame t;
when t = 0, $P_{t,v,m}$ is defined as the identity matrix; when t ≠ 0, $P_{t,v,m}$ is the iteration result of the Kalman filtering module and represents the posterior state estimate covariance of joint v of skeleton number m at frame t;
the state transition matrix A is defined as:
$$A = \begin{bmatrix} 1 & 0 & dt & 0 \\ 0 & 1 & 0 & dt \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
wherein dt is the video frame time interval;
the process covariance matrix Q is defined as:
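The prediction step of claim 3 can be sketched for a single joint as follows. Several details are assumptions, because the matrices are not reproduced in the text above: the state is taken to be `[x, y, v_x, v_y]`, A is taken to be the usual constant-velocity matrix built from `dt`, and Q is taken to be `q` times the identity with an arbitrary `q`. The helper names are hypothetical.

```python
# Prediction step for one joint: prior = A * posterior, P_prior = A P A^T + Q.
# ASSUMPTIONS: constant-velocity A and Q = q * I; neither is given in the text.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n, scale=1.0):
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]

def make_transition(dt):
    # position advances by velocity * dt, velocity stays constant
    return [[1.0, 0.0, dt, 0.0],
            [0.0, 1.0, 0.0, dt],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def predict(state, cov, dt, q=0.01):
    """Return (prior_state, prior_cov) from the previous posterior estimate."""
    a = make_transition(dt)
    prior_state = [row[0] for row in matmul(a, [[s] for s in state])]
    prior_cov = add(matmul(matmul(a, cov), transpose(a)), identity(4, q))
    return prior_state, prior_cov


state0 = [0.5, 0.5, 0.25, -0.5]  # x, y from a normalized joint; velocities invented
cov0 = identity(4)               # identity covariance at t = 0, as stated in claim 3
prior_state, prior_cov = predict(state0, cov0, dt=1.0)
print(prior_state)
```

Running one filter per joint keeps every matrix at 4 by 4, which is why plain lists are enough here; a real implementation would more likely use a numerical library.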
4. The Kalman filtering-based human body skeleton sequence construction method according to claim 2, characterized in that the decision module specifically comprises the following steps:
S31, calculating the degree of matching between the processing queue and all skeletons detected in the current frame to obtain a Mahalanobis distance matrix $D_t$, specifically: calculating a candidate set for each skeleton in the processing queue, the elements of which are the successfully matched skeletons of the current frame; and, from the prior state estimate and prior covariance output by the prediction module, calculating the Mahalanobis distance $D_{t,v,nm}$ between the feature $Z_{t,v,n}$ of each joint point v of every skeleton n in the current frame t and the prior estimate of the corresponding skeleton m in the processing queue;
the Mahalanobis distance matrix used to measure the degree of matching is:
$D_{t,nm} = \min_{v \in V} D_{t,v,nm}$
H is the observation matrix:
S32, obtaining the optimal matching of the matrix $D_t$ using the Hungarian algorithm and taking a matching threshold α: as long as $D_{t,nm}$ is within the threshold α, the matching succeeds and the current-frame observed value $C^{-}_{t,v,m}$ is set to the joint point feature $Z_{t,v,n}$; if $D_{t,nm} > α$, the matching fails and the observed value is set to the predicted value;
S33, adding the skeletons left unmatched in S32, because $D_{t,nm} > α$ or N > M, to the processing queue with new numbers.
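Steps S31 to S33 can be illustrated with the sketch below. It makes several assumptions that the text does not confirm: H is taken to select the position components, so the innovation covariance reduces to the top-left 2 by 2 block of the prior covariance; the squared Mahalanobis distance is used; and a match counts as successful when the distance is at most α. For brevity, an exhaustive search over permutations stands in for the Hungarian algorithm. It finds the same minimum-cost assignment but is only practical for a handful of skeletons. All function names are hypothetical.

```python
from itertools import permutations

def mahalanobis_sq(z, prior_state, prior_cov):
    """Squared Mahalanobis distance of observation z = (x, y) from the prediction.
    ASSUMPTION: H picks out x and y, so S is the top-left 2x2 block of prior_cov."""
    dx, dy = z[0] - prior_state[0], z[1] - prior_state[1]
    a, b = prior_cov[0][0], prior_cov[0][1]
    c, d = prior_cov[1][0], prior_cov[1][1]
    det = a * d - b * c
    # [dx dy] S^-1 [dx dy]^T with S^-1 = (1/det) [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def skeleton_distance(detected, tracked):
    """D_{t,nm}: minimum over joints of the per-joint distance D_{t,v,nm}."""
    return min(mahalanobis_sq(z, s, p) for z, (s, p) in zip(detected, tracked))

def best_assignment(dist):
    """Minimum-total-cost pairing of detections n with queue skeletons m.
    Brute force replaces the Hungarian algorithm; dist must be non-empty."""
    n_det, n_trk = len(dist), len(dist[0])
    if n_det <= n_trk:
        candidates = (list(zip(range(n_det), p))
                      for p in permutations(range(n_trk), n_det))
    else:
        candidates = (list(zip(p, range(n_trk)))
                      for p in permutations(range(n_det), n_trk))
    return min(candidates, key=lambda pairs: sum(dist[n][m] for n, m in pairs))

def decide(dist, alpha):
    """Return (matched pairs, detections that need a new number)."""
    pairs = best_assignment(dist)
    matched = [(n, m) for n, m in pairs if dist[n][m] <= alpha]
    matched_dets = {n for n, _ in matched}
    unmatched = [n for n in range(len(dist)) if n not in matched_dets]
    return matched, unmatched


# dist[n][m]: 3 detected skeletons (N = 3) against 2 queued skeletons (M = 2)
dist = [[0.2, 9.0],
        [8.0, 0.5],
        [7.0, 6.0]]
matched, unmatched = decide(dist, alpha=3.0)
print(matched, unmatched)
```

In this example the third detection has no partner because N > M, so it is the one that would receive a new number under S33.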
5. The Kalman filtering-based human body skeleton sequence construction method according to claim 2, characterized in that the update module specifically calculates the Kalman gain $K_{t,v,m}$, the posterior estimate of the joint point features, and the posterior estimate covariance:
wherein the prior state estimate, its covariance and $C^{-}_{t,v,m}$ are the results output by the prediction module and the decision module, and R is the observation noise covariance matrix:
finally, $C^{-}_{t,v,m}$ is updated:
$C_{t,v,m}$ is an element of the set $C_{v,m}$ and represents the features of joint v of skeleton number m at frame t.
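The update step of claim 5 follows the standard Kalman correction, which can be sketched as below. The formulas in the code are the textbook ones and are not copied from the patent. As before, H is assumed to select the position components, R is assumed to be `r` times the 2 by 2 identity with an arbitrary `r`, and the function name `update` is hypothetical.

```python
def update(prior_state, prior_cov, z, r=0.1):
    """Standard Kalman update for one joint.
    ASSUMPTIONS: H selects x and y from [x, y, v_x, v_y]; R = r * I (2x2)."""
    # S = H P H^T + R: top-left 2x2 block of P plus r on the diagonal
    s00, s01 = prior_cov[0][0] + r, prior_cov[0][1]
    s10, s11 = prior_cov[1][0], prior_cov[1][1] + r
    det = s00 * s11 - s01 * s10
    s_inv = [[s11 / det, -s01 / det], [-s10 / det, s00 / det]]
    # P H^T is the first two columns of P; gain K = P H^T S^-1 (4x2)
    pht = [[row[0], row[1]] for row in prior_cov]
    gain = [[sum(pht[i][k] * s_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(4)]
    # posterior state = prior + K (z - H prior)
    residual = [z[0] - prior_state[0], z[1] - prior_state[1]]
    post_state = [prior_state[i] + gain[i][0] * residual[0] + gain[i][1] * residual[1]
                  for i in range(4)]
    # posterior covariance = (I - K H) P = P - K (H P); H P is the first two rows of P
    post_cov = [[prior_cov[i][j] - gain[i][0] * prior_cov[0][j]
                 - gain[i][1] * prior_cov[1][j] for j in range(4)]
                for i in range(4)]
    return gain, post_state, post_cov


eye4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
gain, post_state, post_cov = update([0.0, 0.0, 0.0, 0.0], eye4, z=(1.0, 2.0), r=1.0)
print(post_state)
```

When matching fails and the observed value is set to the predicted position, the residual is zero, so under these assumptions the state estimate is left unchanged while the covariance is still corrected.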
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210788077.4A CN115050055B (en) | 2022-07-06 | 2022-07-06 | Human skeleton sequence construction method based on Kalman filtering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115050055A | 2022-09-13 |
CN115050055B | 2024-04-30 |
Family
ID=83164491
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202210788077.4A | Active | CN115050055B (en) | 2022-07-06 | 2022-07-06 | Human skeleton sequence construction method based on Kalman filtering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115050055B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109522793A (en) * | 2018-10-10 | 2019-03-26 | 华南理工大学 | More people's unusual checkings and recognition methods based on machine vision |
CN110458944A (en) * | 2019-08-08 | 2019-11-15 | 西安工业大学 | A kind of human skeleton method for reconstructing based on the fusion of double-visual angle Kinect artis |
CN110530365A (en) * | 2019-08-05 | 2019-12-03 | 浙江工业大学 | A kind of estimation method of human posture based on adaptive Kalman filter |
CN111932580A (en) * | 2020-07-03 | 2020-11-13 | 江苏大学 | Road 3D vehicle tracking method and system based on Kalman filtering and Hungary algorithm |
US20210000404A1 (en) * | 2019-07-05 | 2021-01-07 | The Penn State Research Foundation | Systems and methods for automated recognition of bodily expression of emotion |
CN112633205A (en) * | 2020-12-28 | 2021-04-09 | 北京眼神智能科技有限公司 | Pedestrian tracking method and device based on head and shoulder detection, electronic equipment and storage medium |
CN114038056A (en) * | 2021-10-29 | 2022-02-11 | 同济大学 | Skip and squat type ticket evasion behavior identification method |
CN114609912A (en) * | 2022-03-18 | 2022-06-10 | 电子科技大学 | Angle-only target tracking method based on pseudo-linear maximum correlation entropy Kalman filtering |
Non-Patent Citations (3)
Title |
---|
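Sungphill Moon et al.: "Multiple kinect sensor fusion for human skeleton tracking using Kalman filtering", SAGE Journals, 15 May 2017 (2017-05-15) * |
Diao Hongjian: "Research on fighting behavior recognition technology based on deep learning", Wanfang Data, 2 October 2023 (2023-10-02) * |
Li Yang: "Moving target tracking algorithm based on video sequences", Electronic Science and Technology, no. 08, 15 August 2012 (2012-08-15) * |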
Also Published As
Publication number | Publication date |
---|---|
CN115050055B (en) | 2024-04-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||