CN106228111A - Method for extracting key frames based on a skeleton sequence - Google Patents
Method for extracting key frames based on a skeleton sequence
- Publication number
- CN106228111A CN106228111A CN201610539455.XA CN201610539455A CN106228111A CN 106228111 A CN106228111 A CN 106228111A CN 201610539455 A CN201610539455 A CN 201610539455A CN 106228111 A CN106228111 A CN 106228111A
- Authority
- CN
- China
- Prior art keywords
- frame
- skeleton
- motion vector
- entropy
- information entropy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a method for extracting key frames based on skeleton sequences, comprising: capturing human motion with a Kinect camera to obtain a three-dimensional skeleton sequence containing multiple skeleton joints; subtracting the joint coordinates of adjacent frames to obtain the three-dimensional motion vectors of all joints; projecting the motion vectors of all joints onto the three planes of the Cartesian rectangular coordinate system and, on each projection plane, carrying out probability statistics of the motion vectors by direction and magnitude to obtain a histogram; computing the information entropy of each adjacent-frame motion-vector histogram according to the entropy formula and defining frames whose entropy has a local maximum as primitive frames; for each primitive frame of the whole three-dimensional skeleton sequence, computing an interleaving coefficient and weighting the primitive frame's information entropy by it; and thereby obtaining the key frames of the human action in the skeleton sequence. The invention extracts human-action key frames accurately, reliably and efficiently.
Description
Technical field
The invention belongs to the field of multimedia signal processing and relates to a method for extracting key frames.
Background art
With the arrival of the network era and the rapid development of the computer industry, the market for intelligent computing is flourishing. Fields such as machine learning, pattern recognition and data mining have broad room for development in today's society. Computer-based detection and recognition of human actions, a branch of pattern recognition, has many present-day applications, such as motion-sensing games for human-computer interaction, intelligent surveillance and video retrieval. From the viewpoint of computer video processing, however, the amount of video information is enormous. To raise video processing speed and make video-based machine-learning algorithms more widely applicable, filtering a video down to the key frames that are richest in action information and processing only those frames has become very popular in recent years. The present invention proposes a method for extracting action key frames from video sequences of human actions, based on three-dimensional skeleton sequences.
In recent years the camera industry has developed rapidly, and cameras that can capture depth information are ever more widely applied. Since Microsoft released the Kinect camera in 2010, depth cameras have entered countless homes, and many scholars working on video and image processing have increasingly turned their research toward RGB-D-based information processing. With the continual improvement of algorithms for tracking the human skeleton in depth-video sequences, skeleton information, as a more abstract and high-level human-body feature, has come into wide use, because it is insensitive to lighting and has a more complete three-dimensional character. However, until now there has been no key-frame extraction technique based on skeleton sequences.
Summary of the invention
To process video sequences more conveniently and let a computer recognize human actions quickly and effectively, the present invention proposes a method for extracting human-action key frames based on skeleton sequences. The method has spatial characteristics that are more robust than two-dimensional information and, benefiting from the compactness of skeleton information, has very high computational efficiency. The summary of the invention is as follows:
A method for extracting key frames based on skeleton sequences, comprising the following steps:
1) Capture human motion with a Kinect camera and perform skeleton tracking on the captured data stream to obtain a three-dimensional skeleton sequence containing multiple skeleton joints;
2) For each skeleton joint, subtract the joint coordinates of adjacent frames to obtain that joint's motion vector between the adjacent frames, and thereby the three-dimensional motion vectors of all joints;
3) Project the three-dimensional motion vectors of all joints onto the three planes of the Cartesian rectangular coordinate system; on each projection plane, carry out probability statistics of the motion vectors by direction and magnitude to obtain a histogram, defined as the skeleton motion-vector histogram;
4) Compute the information entropy of each adjacent-frame skeleton motion-vector histogram according to the entropy formula; arrange all histogram entropy values of the whole video sequence in video order and plot them as a curve, defined as the entropy curve; find the local maxima of the entropy curve and define the frames whose entropy has a local maximum as primitive frames;
5) For each primitive frame i of the whole three-dimensional skeleton sequence, compute the interleaving coefficient HI from its own information entropy and the entropies of its neighbouring frames by the interleaving formula, in which H(i) is the information entropy of the primitive frame and H(i ± x) denotes the entropy of the frame x positions from frame i, with + denoting following frames and - preceding frames; this coefficient is multiplied with the primitive frame's entropy, thereby weighting the primitive frame's information entropy;
6) Multiply the information entropy H(i) of each primitive frame by its interleaving coefficient HI to obtain the weighted primitive-frame entropy;
7) From the weighted primitive-frame entropies, draw a new information-entropy curve and take the frames corresponding to its local maxima as the key frames of the human action in the skeleton sequence.
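The seven steps above can be sketched end-to-end on synthetic data. This is a minimal illustration, not the patent's implementation: the random array stands in for a real Kinect capture, and since the patent's interleaving formula appears only as an image not reproduced in this text, a mean absolute entropy difference between a primitive frame and its neighbours stands in for HI; the final selection among weighted primitive frames is likewise an assumed reading of step 7.

```python
import numpy as np

rng = np.random.default_rng(0)
skel = rng.random((58, 20, 3))      # step 1: 58 frames, 20 joints (synthetic)
motion = np.diff(skel, axis=0)      # step 2: adjacent-frame joint differences

PLANES = [(0, 1), (1, 2), (0, 2)]   # xy, yz, xz projection planes
max_mag = [np.linalg.norm(motion[:, :, [a, b]], axis=2).max() for a, b in PLANES]

def hist120(vecs):
    """Step 3: 3 planes x 8 directions (45-degree sectors) x 5 magnitude ranges."""
    h = np.zeros(120)
    for p, (a, b) in enumerate(PLANES):
        v = vecs[:, [a, b]]
        mag = np.linalg.norm(v, axis=1)
        d = (np.mod(np.arctan2(v[:, 1], v[:, 0]), 2 * np.pi) // (np.pi / 4)).astype(int) % 8
        m = np.minimum((mag / max_mag[p] * 5).astype(int), 4)
        np.add.at(h, p * 40 + d * 5 + m, 1)
    return h

def entropy(h):
    """Step 4: Shannon entropy of one histogram (empty bins contribute 0)."""
    p = h / h.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum()

H = np.array([entropy(hist120(v)) for v in motion])
# Primitive frames: local maxima of the entropy curve.
prim = [i for i in range(1, len(H) - 1) if H[i] > H[i - 1] and H[i] > H[i + 1]]

def HI(i, xr=2):
    """Step 5, ASSUMED form: mean |H(i) - H(i +/- x)| over in-range neighbours."""
    d = [abs(H[i] - H[i + s * x]) for x in range(1, xr + 1) for s in (-1, 1)
         if 0 <= i + s * x < len(H)]
    return sum(d) / len(d)

W = {i: H[i] * HI(i) for i in prim}  # step 6: weighted primitive-frame entropies
# Step 7 (assumed reading): keep primitives that dominate nearby primitives.
keys = [i for i in prim
        if all(W[i] >= W[j] for j in prim if j != i and abs(j - i) <= 2)]
print(len(prim), len(keys))
```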
The present invention extracts human-action key frames accurately, reliably and efficiently.
Brief description of the drawings
Fig. 1 shows the overall key-frame extraction framework.
Fig. 2 shows the key frames extracted by the present invention for the waving action of the MSRAction-3D dataset, visualized as gray-scale images.
Detailed description of the invention
1) The invention uses the 32-bit Windows 8 operating system; the development IDE is VS2010, with Kinect for Windows SDK v1.6 and OpenCV 2.3.0 or later configured. NUI skeleton tracking is used to perform skeleton tracking on the data stream captured by the Kinect, and the human skeleton action sequence is output.
2) Each frame of the skeleton sequence contains the three-dimensional coordinates of 20 human skeleton joints. For each joint, the difference of its three-dimensional coordinates between two adjacent frames is that joint's motion vector between those two frames; the three-dimensional motion vectors of all 20 skeleton joints are thereby obtained.
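A minimal sketch of this step, assuming NumPy and a synthetic stand-in for a Kinect capture (58 frames as in the patent's waving example, 20 joints, 3 coordinates):

```python
import numpy as np

# Hypothetical skeleton sequence: T frames x 20 joints x 3 coordinates (x, y, z).
rng = np.random.default_rng(0)
skeleton = rng.random((58, 20, 3))

# Per-joint coordinate differences between consecutive frames give one set of
# 20 motion vectors per adjacent frame pair: shape (T-1, 20, 3).
motion = np.diff(skeleton, axis=0)
print(motion.shape)  # (57, 20, 3)
```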
3) The motion vectors of all skeleton joints between two frames are projected in the three directions of the Cartesian rectangular coordinate system; the projection of each joint's motion vector on a two-dimensional plane has its own direction and magnitude. On each two-dimensional plane, taking the positive x-axis as reference and proceeding counterclockwise, every rotation of 45° defines one direction, so the plane is divided into 8 directions. Based on experimental results, the invention takes the maximum magnitude of all skeleton motion vectors in each video sequence as the standard and divides the motion vectors on all two-dimensional planes into 5 magnitude ranges. Thus, according to the magnitude and direction of a motion vector, 40 classes are defined in turn (class order and numbering do not affect the result), and every motion vector on each projection plane can be assigned to one class by its direction and magnitude.
On each projection plane, counting the number of skeleton joints falling in each class yields a vector (i.e. a histogram) of dimension 40; concatenating the vectors obtained on the three projection planes gives a vector (histogram) of dimension 120, defined as the skeleton motion-vector histogram.
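The binning just described can be sketched as follows. The bin layout (plane-major, then direction, then magnitude) is an assumption; the text states that class ordering does not affect the result.

```python
import numpy as np

PLANES = [(0, 1), (1, 2), (0, 2)]  # xy, yz, xz projections

def frame_histogram(vecs, max_mag, n_dir=8, n_mag=5):
    """120-bin histogram for one frame pair.
    vecs: (J, 3) joint motion vectors; max_mag: per-plane magnitude normalizers
    (the sequence-wide maximum, as the text prescribes)."""
    hist = np.zeros(len(PLANES) * n_dir * n_mag)
    for p, (a, b) in enumerate(PLANES):
        v = vecs[:, [a, b]]
        mag = np.linalg.norm(v, axis=1)
        # Direction: 45-degree sectors counterclockwise from the positive x-axis.
        ang = np.mod(np.arctan2(v[:, 1], v[:, 0]), 2 * np.pi)
        d = (ang // (2 * np.pi / n_dir)).astype(int) % n_dir
        # Magnitude: 5 equal ranges up to the sequence-wide maximum.
        m = np.minimum((mag / max_mag[p] * n_mag).astype(int), n_mag - 1)
        np.add.at(hist, p * n_dir * n_mag + d * n_mag + m, 1)
    return hist

rng = np.random.default_rng(1)
motion = rng.standard_normal((57, 20, 3))
max_mag = [np.linalg.norm(motion[:, :, [a, b]], axis=2).max() for a, b in PLANES]
h = frame_histogram(motion[0], max_mag)
print(h.shape, h.sum())  # (120,) 60.0 -- each of 20 joints counted once per plane
```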
4) For each skeleton motion-vector histogram, the corresponding entropy is obtained according to the entropy formula

H = -Σ_{i=1}^{n} p_i log p_i

where H is the information entropy, p_i is the proportion that the i-th class of the 120-dimensional vector occupies in the whole histogram, and n is the histogram length; the invention takes n = 120.
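A direct sketch of the entropy formula, using the natural logarithm (the text does not specify the base):

```python
import numpy as np

def histogram_entropy(hist):
    """H = -sum(p_i * log p_i) over the n = 120 bins, with p_i the fraction
    of vectors falling in bin i; empty bins contribute 0 by convention."""
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# A uniform 120-bin histogram attains the maximum entropy log(120):
uniform = np.ones(120)
print(round(histogram_entropy(uniform), 4))  # 4.7875
```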
For the whole skeleton sequence, connecting all the entropy values yields a curve composed of information entropies, defined as the entropy curve. The local maxima of the entropy curve, i.e. the points whose entropy exceeds that of both the preceding and the following frame, are extracted, and the frames with local-maximum entropy in the sequence's entropy curve are defined as primitive frames.
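The local-maximum criterion stated above (a value strictly exceeding both immediate neighbours) can be sketched as:

```python
def primitive_frames(entropy):
    """Indices where the entropy curve has a local maximum, i.e. the value
    exceeds both the preceding and the following frame's entropy."""
    e = list(entropy)
    return [i for i in range(1, len(e) - 1)
            if e[i] > e[i - 1] and e[i] > e[i + 1]]

print(primitive_frames([1.0, 3.0, 2.0, 2.5, 4.0, 1.0]))  # [1, 4]
```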
5) In the entropy curve, for each primitive frame, suppose the frame is the i-th frame of the whole video; its entropy is then H(i), and H(i ± x) denotes the entropy of the frame x positions before or after it. The interleaving coefficient HI of the frame is computed according to the interleaving formula. This coefficient reflects how much the skeleton motion of the primitive frame differs from that of its neighbouring frames. The interleaving coefficient is multiplied with the entropy of the primitive frame, thereby weighting the primitive-frame entropy.
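The patent's interleaving formula appears only as an image that is not reproduced in this text, so the exact form of HI cannot be recovered here. As a stand-in consistent with the stated intent (large when frame i's motion differs most from its surroundings), the sketch below uses the mean absolute entropy difference between frame i and its neighbours within x_range frames; both the formula and the x_range parameter are assumptions.

```python
def interleave_coefficient(entropy, i, x_range=2):
    """ASSUMED form of HI(i): mean of |H(i) - H(i +/- x)| over the in-range
    neighbours x = 1 .. x_range on each side of primitive frame i."""
    e = list(entropy)
    diffs = [abs(e[i] - e[i + s * x])
             for x in range(1, x_range + 1) for s in (-1, 1)
             if 0 <= i + s * x < len(e)]
    return sum(diffs) / len(diffs)

e = [1.0, 3.0, 2.0, 2.5, 4.0, 1.0]
hi = interleave_coefficient(e, 4)  # neighbours e[3], e[5], e[2] (e[6] is out of range)
weighted = e[4] * hi               # step 6: weighted entropy of primitive frame 4
```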
6) The weighted primitive-frame entropies are connected in video order, yielding a new entropy curve. The frames corresponding to the local maxima on the new entropy curve are taken out as the key frames of the human action sequence in the video.
The results of testing the present invention on the MSRAction-3D dataset are described below:
MSRAction-3D is currently the most influential human action detection and recognition dataset; it contains 20 action classes and provides both depth data and skeleton data. Following the key-frame extraction method described above, the invention performed key-frame extraction on the waving action; this action comprises 58 frames in total, from which the inventive method extracted 8 key frames. Fig. 2 shows the depth images corresponding to the key frames, visualized and arranged in order. The results show that from an action sequence of many frames the inventive method extracts the few frames that characterize the whole action. With the present invention, processing a whole video sequence can be converted to processing its key-frame sequence, greatly reducing data redundancy, cutting the running time and space cost of algorithms, and improving the practicality of complex algorithms for video processing.
Claims (1)
1. A method for extracting key frames based on skeleton sequences, comprising the following steps:
1) capturing human motion with a Kinect camera and performing skeleton tracking on the captured data stream to obtain a three-dimensional skeleton sequence containing multiple skeleton joints;
2) for each skeleton joint, subtracting the joint coordinates of adjacent frames to obtain that joint's motion vector between the adjacent frames, and thereby the three-dimensional motion vectors of all joints;
3) projecting the three-dimensional motion vectors of all joints onto the three planes of the Cartesian rectangular coordinate system; on each projection plane, carrying out probability statistics of the motion vectors by direction and magnitude to obtain a histogram, defined as the skeleton motion-vector histogram;
4) computing the information entropy of each adjacent-frame skeleton motion-vector histogram according to the entropy formula; arranging all histogram entropy values of the whole video sequence in video order and plotting them as a curve, defined as the entropy curve; finding the local maxima of the entropy curve and defining the frames whose entropy has a local maximum as primitive frames;
5) for each primitive frame i of the whole three-dimensional skeleton sequence, computing the interleaving coefficient HI from its own information entropy and the entropies of its neighbouring frames by the interleaving formula, in which H(i) is the information entropy of the primitive frame and H(i ± x) denotes the entropy of the frame x positions from frame i, with + denoting following frames and - preceding frames; multiplying this coefficient with the primitive frame's entropy, thereby weighting the primitive frame's information entropy;
6) multiplying the information entropy H(i) of each primitive frame by its interleaving coefficient HI to obtain the weighted primitive-frame entropy;
7) drawing a new information-entropy curve from the weighted primitive-frame entropies and taking the frames corresponding to its local maxima as the key frames of the human action in the skeleton sequence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610539455.XA CN106228111A (en) | 2016-07-08 | 2016-07-08 | Method for extracting key frames based on a skeleton sequence |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106228111A true CN106228111A (en) | 2016-12-14 |
Family
ID=57520339
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610539455.XA Pending CN106228111A (en) | 2016-07-08 | 2016-07-08 | Method for extracting key frames based on a skeleton sequence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106228111A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102395984A (en) * | 2009-04-14 | 2012-03-28 | 皇家飞利浦电子股份有限公司 | Key frames extraction for video content analysis |
WO2012078702A1 (en) * | 2010-12-10 | 2012-06-14 | Eastman Kodak Company | Video key frame extraction using sparse representation |
CN102749993A (en) * | 2012-05-30 | 2012-10-24 | 无锡掌游天下科技有限公司 | Motion recognition method based on skeleton node data |
CN103020648A (en) * | 2013-01-09 | 2013-04-03 | 北京东方艾迪普科技发展有限公司 | Method and device for identifying action types, and method and device for broadcasting programs |
US20150125045A1 (en) * | 2013-11-04 | 2015-05-07 | Steffen Gauglitz | Environment Mapping with Automatic Motion Model Selection |
Non-Patent Citations (2)
Title |
---|
LING SHAO et al.: "Motion Histogram Analysis Based Key Frame Extraction for Human Action/Activity Representation", 2009 Canadian Conference on Computer and Robot Vision * |
ZOU Weijia: "Research on Action Recognition and Saliency Detection Based on Multi-Instance Learning", China Master's Theses Full-text Database, Information Science and Technology * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109190474A (en) * | 2018-08-01 | 2019-01-11 | 南昌大学 | Human body animation extraction method of key frame based on posture conspicuousness |
CN109190474B (en) * | 2018-08-01 | 2021-07-20 | 南昌大学 | Human body animation key frame extraction method based on gesture significance |
CN109934183A (en) * | 2019-03-18 | 2019-06-25 | 北京市商汤科技开发有限公司 | Image processing method and device, detection device and storage medium |
CN111402290A (en) * | 2020-02-29 | 2020-07-10 | 华为技术有限公司 | Action restoration method and device based on skeleton key points |
CN111402290B (en) * | 2020-02-29 | 2023-09-12 | 华为技术有限公司 | Action restoration method and device based on skeleton key points |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Song et al. | Richly activated graph convolutional network for action recognition with incomplete skeletons | |
Liu et al. | Skepxels: Spatio-temporal image representation of human skeleton joints for action recognition. | |
Gao et al. | Infar dataset: Infrared action recognition at different times | |
Hassner | A critical review of action recognition benchmarks | |
Malgireddy et al. | A temporal Bayesian model for classifying, detecting and localizing activities in video sequences | |
CN105528794A (en) | Moving object detection method based on Gaussian mixture model and superpixel segmentation | |
CN103020647A (en) | Image classification method based on hierarchical SIFT (scale-invariant feature transform) features and sparse coding | |
CN103902989B (en) | Human action video frequency identifying method based on Non-negative Matrix Factorization | |
Gao et al. | Human action recognition via multi-modality information | |
Liu et al. | 3D action recognition using multiscale energy-based global ternary image | |
Kihl et al. | A unified framework for local visual descriptors evaluation | |
CN106228111A (en) | Method for extracting key frames based on a skeleton sequence | |
CN105469050A (en) | Video behavior identification method based on local space-time characteristic description and pyramid vocabulary tree | |
Zhou et al. | Human action recognition toward massive-scale sport sceneries based on deep multi-model feature fusion | |
CN103577804A (en) | Abnormal human behavior identification method based on SIFT flow and hidden conditional random fields | |
Hou et al. | Enhancing and dissecting crowd counting by synthetic data | |
Roy et al. | Sparsity-inducing dictionaries for effective action classification | |
CN103218829A (en) | Foreground extracting method suitable for dynamic background | |
Sun et al. | Learning spatio-temporal co-occurrence correlograms for efficient human action classification | |
CN103778439A (en) | Body contour reconstruction method based on dynamic time-space information digging | |
Yin et al. | Small human group detection and event representation based on cognitive semantics | |
Li et al. | Trajectory-pooled spatial-temporal architecture of deep convolutional neural networks for video event detection | |
CN116403286A (en) | Social grouping method for large-scene video | |
Ma et al. | Video event classification and image segmentation based on noncausal multidimensional hidden markov models | |
Choi et al. | A view-based real-time human action recognition system as an interface for human computer interaction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20161214 |