CN106683111A - Human motion video segmentation method based on temporal clustering - Google Patents
Human motion video segmentation method based on temporal clustering Download PDFInfo
- Publication number
- CN106683111A CN106683111A CN201611040136.0A CN201611040136A CN106683111A CN 106683111 A CN106683111 A CN 106683111A CN 201611040136 A CN201611040136 A CN 201611040136A CN 106683111 A CN106683111 A CN 106683111A
- Authority
- CN
- China
- Prior art keywords
- video
- frame
- matrix
- feature
- human motion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a human motion video segmentation method based on temporal clustering. The method specifically comprises the following steps: extracting the features of video frames, calculating a distance transform diagram, and carrying out treatment such as k-means clustering to get class tag vectors output as the features of the video frames; modeling the relationship between the features of the video frames: building a correlation matrix M for the feature matrix of the video frames to describe the relationship between the features of the video frames; and after getting the correlation matrix M, performing a graph cut algorithm on the correlation matrix M to get a clustering result of the features of the video frames, wherein the clustering result is used as the segmentation result of the video frames, and each class in the clustering result represents a video segment containing an independent action. The fusion problem concerning similarity and temporality of the features of human motion in video frames in human motion video segmentation is solved. The segmentation accuracy is improved. As no iterative calculation is needed in calculation of the relationship between video frames, the calculation efficiency is improved.
Description
Technical field
The invention belongs to image/video process field, more particularly to a kind of human motion Video segmentation side of timing cluster
Method.
Background technology
《Temporal Subspace Clustering for Human Motion Segmentation》This text
In offering, author proposes the two-value template that human body image is extracted in each frame video, and by range conversion range conversion figure is formed,
Preliminary cluster result is formed by K mean cluster, using the cluster result of bi-level fashion as frame of video feature.Subsequently, author
A Laplce canonical related to encoder matrix is with the addition of on the basis of least square returns Subspace clustering method about
Beam, for forming the model of frame of video characteristic time relation, by alternating direction multiplier method dictionary and encoder matrix is solved.Most
Afterwards continuous videos are cut into the video-frequency band comprising self contained function by the figure blanking method on encoder matrix.Experiment shows that the method exists
There is preferable performance in the accuracy rate and normalized mutual information index of the cutting of human motion video-frequency band.
But the algorithm due to the process employs this kind of iterative of alternating direction multiplier method, compares on time consumption
Greatly, thus video-frequency band cutting speed it is slow.In addition the method needs the time Laplce's item by encoder matrix to retouch
Frame of video relation in time is stated, for the dependency on video frame time describes complex.
The content of the invention
For above-mentioned technical problem, the present invention proposes a kind of human motion methods of video segmentation of timing cluster, more
The relation between frame of video is comprehensively described, and improves computational efficiency.
To reach above-mentioned purpose, the technical solution used in the present invention is:
A kind of human motion methods of video segmentation of timing cluster, specifically includes following components:The spy of frame of video
Extraction is levied, the modeling of relation between frame of video feature, the solution of correlation matrix, the figure on correlation matrix is cut.
Frame of video feature extraction:Input t frame videos, to each frame of video background reducing is carried out, and extracts human body image
With background image, binary map is formed, wherein human body image region represents that background image region is represented with black image with white;
To frame of video computed range Transformation Graphs, range conversion figure is expanded into into a column vector by row:To on t frame videos it is acquired away from
The column vector launched from Transformation Graphs carries out K mean cluster, the class label vector of the bi-level fashion of each frame video is obtained, by category
The feature that vector is signed as these frame of video is exported.The class quantity value of K mean cluster is as follows:Work as t<When=50, class quantity is just
Value t, works as t>50, the general value of quantity of class is 50.
The modeling of relation between frame of video feature:Characteristic set { the x of input video frame1,x2,…,xt, wherein xiRepresent
The class label vector of resulting bi-level fashion in i-th frame frame of video, namely the feature of the frame video, these features constitute one
Eigenmatrix X=x1,x2,…,xt].In order to describe the relation between frame of video feature, build a correlation matrix M and merge phase
Like property tolerance and feature sequential propinquity.Matrix M can carry out minimum and obtain by following function pair M:
Meet M >=0 (1) simultaneously
Tr (AM) is the mark of calculating matrix AM, and λ is the regular parameter that is positive number.By carrying out derivation to formula (1), can
To obtain the constraint equation with regard to M, and solve with this to obtain correlation matrix M
A then serves as a weight matrix related to sequential.The line number of its matrix, columns and matrix XTX is consistent, and in addition 0
Represent line number, columns and matrix XTNull matrix of the X consistent element all for 0;
Max computings represent that the value of each element in M takesMaximum between corresponding element value and 0 value.
Figure on correlation matrix is cut:After correlation matrix M is obtained, figure is performed to it and cuts algorithm, it is possible to obtain video
The cluster result of frame feature, and in this, as the segmentation result of frame of video, namely each class in cluster result includes one
The all of frame of video of individual self contained function.
Wherein, on the one hand correlation matrix M contains the similarity measurement between frame of video feature, on the other hand measures
Propinquity of the frame of video feature in sequential, to describe relation of the frame of video feature on similarity and sequential propinquity.
A serves as a weight matrix related to sequential.The value of A each element is as represented by following formula:
In above formula, line index, column index in i, k difference representing matrix.ε values are 10-6.τ is time window length, and τ takes
It is worth for 5~17.In above formula, the setting of weights A allows adjacent element in sequential to give more when similarity is calculated
Big weights.
The invention has the advantages that:The present invention solves human motion in frame of video in human motion Video segmentation
Characteristic similarity and the fusion problem of timing, improve segmentation precision, while when frame of video relation is calculated due to not needing
Calculating is iterated, therefore improves computational efficiency.
Description of the drawings
Fig. 1 is the human motion methods of video segmentation flow chart of the embodiment of the present invention.
Specific embodiment
For the ease of the understanding of those skilled in the art, the present invention is made further with reference to embodiment and accompanying drawing
It is bright.
Human motion video-frequency band cutting method depends on the dependency description of frame of video, and great majority are cut based on cluster
Segmentation method only considers frame of video similarity characteristically when structure video frame correlation is described, often, and seldom considers to regard
Frequency frame dependency in time.The present embodiment is also added into frame of video while similarity measurement between frame of video is retained
The tolerance of propinquity in sequential, therefore, it is possible to the relation being described more fully hereinafter between frame of video.In addition, lifting video cutting
Speed be also crucial.
Human motion methods of video segmentation flow process as shown in Figure 1:One section of video comprising human motion of input, will be each
Frame color video frequency image does subtraction with corresponding static background image, completes background subtraction and processes operation, obtains human body image area
Domain, and be white by the human body image region labeling for being extracted, background area is demarcated as black, obtains a bianry image, right
Bianry image carries out range conversion and obtains range conversion figure.By the range conversion figure of all frame of video by row launch, formed row to
Duration set, obtains on its basis the class label vector of bi-level fashion using K mean cluster algorithm, and it is represented with 1 in the vector
Affiliated class, 0 represents that it is not belonging to corresponding class.
Class label vector [0,0,1,0]TRepresent, corresponding feature can be classified as the 3rd class, and be not belonging to other classes.
Using class label vector as the feature of frame of video, correlation matrix M is carried out in the feature base of these frame of video
Solution.According to formula 2:
The solution with regard to correlation matrix M is obtained, in above formula, the value of A each element is as represented by following formula:
In above formula, line index, column index in i, k difference representing matrix.ε values are 10-6.In the setting of weight matrix τ,
In the present embodiment its value is τ=9.
After correlation matrix M is obtained, figure is performed to it and cuts algorithm, so as to obtain the cluster result of individual features, enter one
Corresponding video slicing is the video-frequency band comprising human body self contained function according to cluster result by step.
Embodiment above technological thought only to illustrate the invention, it is impossible to which protection scope of the present invention is limited with this, it is all
It is any change done on the basis of technical scheme according to technological thought proposed by the present invention, each falls within present invention protection model
Within enclosing.
Claims (6)
1. the human motion methods of video segmentation that timing is clustered, it is characterised in that comprise the steps:The feature of frame of video is carried
Take, relationship modeling between frame of video feature, the solution of correlation matrix, the figure on correlation matrix is cut.
2. human motion methods of video segmentation according to claim 1, it is characterised in that:
Frame of video feature extraction:Input t frame videos, to each frame of video background reducing is carried out, and extracts human body image with the back of the body
Scape image, forms binary map;To frame of video computed range Transformation Graphs, range conversion figure is expanded into into a column vector by row;To t
The column vector that range conversion figure acquired on frame video launches carries out K mean cluster, obtains the bi-level fashion of each frame video
Class label vector, using class label vector as these frame of video feature export;
Relationship modeling between frame of video feature:Characteristic set { the x of input video frame1,x2,…,xt, wherein xiRepresent that the i-th frame is regarded
The feature of frequency frame, these features constitute an eigenmatrix X=[x1,x2,…,xt];
The solution of correlation matrix:Build a correlation matrix M fusion similarity measurement and feature sequential propinquity;
Correlation matrix M
A is a weight matrix related to sequential, the line number of its matrix, columns and matrix XTX is consistent, in addition 0 represent line number,
Columns and matrix XTNull matrix of the X consistent element all for 0;
Max computings represent that the value of each element in M takesMaximum between corresponding element value and 0 value;
Figure on correlation matrix is cut:After correlation matrix M is obtained, figure is performed to it and cuts algorithm, obtain frame of video feature
Cluster result, and in this, as the segmentation result of frame of video, namely each class in cluster result is moved comprising an independence
Make all of frame of video.
3. human motion methods of video segmentation according to claim 2, it is characterised in that:Represented with 1 in class label vector
It belongs to such, and 0 represents that it is not belonging to corresponding class.
4. human motion methods of video segmentation according to claim 2, it is characterised in that:Correlation matrix M includes video
Similarity measurement between frame feature, and measure propinquity of the frame of video feature in sequential.
5. the human motion methods of video segmentation according to one of claim 2 to 4, it is characterised in that:The value of element A is such as
Represented by following formula:
In above formula, line index, column index in i, k difference representing matrix, ε values are 10-6, τ is time window length;Weights A's
Setting allows adjacent element in sequential to give bigger weights when similarity is calculated.
6. human motion methods of video segmentation according to claim 5, it is characterised in that:τ values are 5~17.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611040136.0A CN106683111B (en) | 2016-11-24 | 2016-11-24 | Human motion video segmentation method based on time-sequence clustering |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611040136.0A CN106683111B (en) | 2016-11-24 | 2016-11-24 | Human motion video segmentation method based on time-sequence clustering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106683111A true CN106683111A (en) | 2017-05-17 |
CN106683111B CN106683111B (en) | 2020-01-31 |
Family
ID=58867341
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611040136.0A Active CN106683111B (en) | 2016-11-24 | 2016-11-24 | Human motion video segmentation method based on time-sequence clustering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106683111B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866936A (en) * | 2018-08-07 | 2020-03-06 | 阿里巴巴集团控股有限公司 | Video labeling method, tracking method, device, computer equipment and storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102360494A (en) * | 2011-10-18 | 2012-02-22 | 中国科学院自动化研究所 | Interactive image segmentation method for multiple foreground targets |
-
2016
- 2016-11-24 CN CN201611040136.0A patent/CN106683111B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102360494A (en) * | 2011-10-18 | 2012-02-22 | 中国科学院自动化研究所 | Interactive image segmentation method for multiple foreground targets |
Non-Patent Citations (3)
Title |
---|
ELHAMIFAR E ET AL.: "Sparse subspace clustering: algorithm, theory, and applications", 《IEEE》 * |
钱诚: "基于对比度直方图特征子空间聚类的区域级背景减除方法", 《软件导刊》 * |
钱诚: "增量型目标跟踪关键技术压就", 《中国博士学位论文全文数据库 信息科技辑》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110866936A (en) * | 2018-08-07 | 2020-03-06 | 阿里巴巴集团控股有限公司 | Video labeling method, tracking method, device, computer equipment and storage medium |
CN110866936B (en) * | 2018-08-07 | 2023-05-23 | 创新先进技术有限公司 | Video labeling method, tracking device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106683111B (en) | 2020-01-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105205475B (en) | A kind of dynamic gesture identification method | |
CN108280397B (en) | Human body image hair detection method based on deep convolutional neural network | |
CN110033007B (en) | Pedestrian clothing attribute identification method based on depth attitude estimation and multi-feature fusion | |
EP3171297A1 (en) | Joint boundary detection image segmentation and object recognition using deep learning | |
CN109241995B (en) | Image identification method based on improved ArcFace loss function | |
CN102254328B (en) | Video motion characteristic extracting method based on local sparse constraint non-negative matrix factorization | |
CN105528794A (en) | Moving object detection method based on Gaussian mixture model and superpixel segmentation | |
CN108154157B (en) | Fast spectral clustering method based on integration | |
CN108986101B (en) | Human body image segmentation method based on cyclic cutout-segmentation optimization | |
CN110675422B (en) | Video foreground and background separation method based on generalized non-convex robust principal component analysis | |
Gaus et al. | Hidden Markov Model-Based gesture recognition with overlapping hand-head/hand-hand estimated using Kalman Filter | |
Rao et al. | Sign Language Recognition System Simulated for Video Captured with Smart Phone Front Camera. | |
CN112364791B (en) | Pedestrian re-identification method and system based on generation of confrontation network | |
CN113255557B (en) | Deep learning-based video crowd emotion analysis method and system | |
Wang et al. | MCF3D: Multi-stage complementary fusion for multi-sensor 3D object detection | |
Sethi et al. | Signpro-An application suite for deaf and dumb | |
CN103578107A (en) | Method for interactive image segmentation | |
de Arruda et al. | Counting and locating high-density objects using convolutional neural network | |
CN107424174B (en) | Motion salient region extraction method based on local constraint non-negative matrix factorization | |
Wei et al. | A new semantic segmentation model for remote sensing images | |
Wang et al. | Semantic annotation for complex video street views based on 2D–3D multi-feature fusion and aggregated boosting decision forests | |
CN103440651B (en) | A kind of multi-tag image labeling result fusion method minimized based on order | |
Vafadar et al. | A vision based system for communicating in virtual reality environments by recognizing human hand gestures | |
US20120053944A1 (en) | Method for Determining Compressed State Sequences | |
CN104504715A (en) | Image segmentation method based on local quaternion-moment characteristic |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |