CN105550678A - Human body motion feature extraction method based on global remarkable edge area

Info

Publication number: CN105550678A
Application number: CN201610075788.1A
Authority: CN
Inventors: 胡瑞敏; 徐增敏; 陈军; 陈华锋; 李红阳; 王中元; 郑淇; 吴华; 王晓; 周立国
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2016-02-03
Filing date: 2016-02-03
Publication date: 2016-05-04
Anticipated expiration: 2036-02-03
Also published as: CN105550678B

Abstract

The invention discloses a human action feature extraction method based on a global salient edge region. The method computes saliency from the contrast between each region and the whole image: it reduces the number of colors in the color space, smooths saliency in the color space, and computes salient regions from the spatial relationship of neighboring regions. It then applies a morphological gradient transform to the foreground region segmented by a binarization threshold to generate a global salient edge region, traverses the strong corners of all grids of the video frame at multiple scales, collects within the salient edge region the key feature points whose optical flow magnitude is non-zero, computes the displacement of each strong corner from the corrected optical flow field, and forms the local spatio-temporal feature of human action from the multi-frame displacement trajectory of each strong corner and its neighborhood gradient vectors. By extracting action features from the global salient edge region, the invention rejects background noise points unrelated to human motion, removes the effect of camera motion on optical flow computation, improves the accuracy of the local spatio-temporal description of human action, and improves the human action recognition rate.

Description

Human action feature extraction method based on a global salient edge region

Technical field

The invention belongs to the field of video analysis and relates to a method for automatic recognition of human behavior, specifically a human action feature extraction method based on a global salient edge region.

Background technology

With the development of the Internet and the growing adoption of video surveillance systems, the volume of video data has increased sharply. Faced with this mass of video data, how to analyze human behavior in video has become an urgent problem. Video data is easily affected by indistinct foreground motion regions, large camera shake and complex scenes, so human motion in video is accompanied by a large number of noise corners. As a result, key feature points of the video frame are extracted inaccurately and the precision of human behavior recognition is limited.

Human action feature extraction is an important component of human behavior recognition and an important research topic in video analysis. Its goal is to let a computer automatically extract human action features and automatically judge and predict human behavior. An effective action feature extraction method therefore helps improve the precision of action recognition.

Current human action feature extraction methods, whether based on single-frame images or multi-frame video streams, fall into 3 classes: methods that extract low-level local spatio-temporal interest points, methods that describe action attributes through mid-level semantic learning, and methods based on high-level semantic feature point tracking and deformable limb templates.

Methods based on low-level local spatio-temporal interest points extract local spatio-temporal interest points of the target object, model the object's motion in combination with some form of optical flow estimation, and express limb actions with various descriptors. The drawback of this class of methods is that they are easily affected by background noise, camera shake and target occlusion, and that they lack analysis and understanding of the global characteristics of human behavior and of the behavior model as a whole.

Methods based on mid-level semantic learning usually start from extracted low-level motion features and apply techniques such as foreground salient regions, moving object detection, object contour segmentation, discriminative dictionary learning, multi-channel feature fusion and convolutional neural networks to model higher-level semantic features on top of the basic motion features, obtaining a global or local spatio-temporal representation of the target object's motion in a multi-frame video stream. The problem with these methods is their heavy dependence on the expressive power of the input features and on the performance of the mid-level semantic learning framework.

Methods based on high-level semantic feature points rely on manual annotation or motion-sensing cameras to locate and track human skeleton joints in real time, build a tree-structured limb model or deformable template, and characterize human action by combining joint motion history with conventional descriptors. The drawback of these methods is that they either require a great deal of time and human experience to annotate video samples, or depend on intelligent motion-sensing devices to locate the skeleton joints.

Patents related to action feature extraction methods are listed below.

Human-computer interaction field: in 2015 the Institute of Automation, Chinese Academy of Sciences published the invention patent "Human action acquisition and action recognition system and control method thereof", which uses a wireless transceiver and a 3-axis acceleration sensor circuit to capture human actions, with the aim of improving stage performances and speeches; in 2015 Xidian University published the invention patent "An intelligent watch based on action recognition and an action recognition method", which controls a smart watch through preset forearm gesture actions; in 2015 Beijing Zhigu Rui Tuo Tech Co., Ltd. published the invention patent "Head action determination method and device", which acquires electroencephalography detection information of the human body and determines the head action corresponding to that information; in 2015 Lenovo (Beijing) Co., Ltd. announced the invention patent "An action recognition method, device and electronic equipment", which adds a trigger condition for action acquisition, so that action recognition is triggered only when the physical distance to the electronic device within the monitored area satisfies the condition.

Video analysis field: in 2015 Zhejiang University of Technology published the invention patent "An action recognition method based on a temporal pyramid local matching window", which extracts 3D joints from human depth maps captured by a stereo camera and uses the 3D displacement differences between poses as the feature representation of each depth frame; in 2015 Beijing Zhongke Pangu Technology announced the invention patent "Human limb posture and action recognition method based on space partition learning", which matches human joint data against a pre-built database of specific posture sequences; in 2015 Southwest University of Science and Technology published the invention patent "A posture sequence finite state machine action recognition method", which applies a coordinate transform to the limb node data obtained from a Kinect sensor and measures the transformed data with a unified spatial grid model; in 2015 the Institute of Computing Technology, Chinese Academy of Sciences announced the invention patent "A cross-view action recognition method and system based on temporal information", which uses interest point motion intensity as the feature description and combines it with the coarse-grained annotation information of the source-view video to obtain coarse-grained information for the target view.

Video analysis field based on saliency analysis: in 2015 Southwest University of Science and Technology announced the invention patent "A human behavior recognition algorithm based on STDF features", which uses the depth information of the video image to determine the salient regions of human motion, computes the energy function of the optical flow within each region as a measure of the region's activity, and applies Gaussian sampling to the motion-salient regions so that the sample points are distributed over regions of intense motion, taking the collected sample points as low-level action features; Soochow University announced the invention patent "Person behavior recognition method based on threshold matrix and feature-fusion visual words", which obtains the position of the person region from the saliency of the video frame and then applies different thresholds inside and outside that region to detect interest points as action features; in 2015 Nanjing University of Posts and Telecommunications announced the invention patent "A human behavior recognition method based on RGB-D video", which extracts dense MovingPose features, SHOPC features and HOG3D features from RGB-D video and fuses the three kinds of features with an edge-constrained multiple kernel learning method; in 2014 Tianjin University announced the invention patent "A human action recognition method based on local features", which extracts spatio-temporal interest point features and coordinates from the action image sequence and trains separate bag-of-words dictionary models per human body region to encode the local features.

Summary of the invention

To solve the above technical problems, the object of the invention is to provide a human action feature extraction method based on a global salient edge region.

The technical solution adopted by the invention is a human action feature extraction method based on a global salient edge region, characterized by comprising the following steps:

Step 1: reduce the number of colors in the RGB color space and smooth the saliency in the color space;

Step 2: compute the salient regions from the spatial relationship of neighboring regions;

Step 3: segment the foreground salient region with a binarization threshold;

Step 4: apply a morphological gradient transform to the segmented foreground region to generate the global salient edge region;

Step 5: correct the optical flow field using feature point pairs and random sample consensus (RANSAC);

Step 6: traverse all grids of the video frame at different scales and extract the strong corners;

Step 7: within the salient edge region, collect the strong corners whose corrected optical flow magnitude is non-zero as key-feature strong corners;

Step 8: check the number of key-feature strong corners obtained in step 7; if the number is zero, take the strong corners of step 6 as the key-feature strong corners;

Step 9: compute the displacement of the key-feature strong corners from the corrected optical flow;

Step 10: form the local spatio-temporal feature of human action from the coordinate displacement trajectory of each strong corner over consecutive frames together with the corner's neighborhood gradient vectors.

In step 1, the number of colors in the RGB color space is reduced and the saliency in the color space is smoothed. The specific implementation is as follows.

The saliency S(·) of the k-th pixel $I_k$ of image $I$ is defined as:

$$S(I_k) = \sum_{\forall I_i \in I} D(I_k, I_i) = \sum_{\forall I_i \in I} \lVert I_k - I_i \rVert, \tag{1}$$

where $D(I_k, I_i)$ is the distance between pixel $I_k$ and pixel $I_i$ in the color space.

First, each of the 3 channels of the RGB color space is quantized to 12 distinct values, reducing the number of pixel colors to 12³ = 1728. Next, only the most frequently occurring colors are kept, which reduces the number of colors to n = 85 while ensuring that these colors cover more than 95% of the pixels. Finally, the saliency of each quantized color c is smoothed: it is refined by the weighted average of the saliency of its m nearest-neighbor colors, as follows:

$$S'(c) = \frac{1}{(m - 1)\,T} \sum_{i=1}^{m} \bigl(T - D(c, c_i)\bigr)\, S(c_i), \tag{2}$$

where $T = \sum_{i=1}^{m} D(c, c_i)$ is the sum of the distances between color c and its m nearest-neighbor colors $c_i$.

In step 2, the salient regions are computed from the spatial relationship of neighboring regions. The implementation is as follows.

First, an image segmentation algorithm divides the input video frame into multiple regions, and a color histogram is built for each region. For each region $r_k$, saliency is computed from its color contrast with the other regions:

$$S(r_k) = \sum_{r_k \neq r_i} w(r_i)\, D_r(r_k, r_i), \tag{3}$$

where $w(r_i)$ is the total number of pixels in the i-th region of the image and serves as the weight of region $r_i$, so that color contrast with large regions is emphasized, and $D_r(\cdot,\cdot)$ is the color distance between two regions. The color distance between two regions $r_1$ and $r_2$ is:

$$D_r(r_1, r_2) = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} f(c_{1,i})\, f(c_{2,j})\, D(c_{1,i}, c_{2,j}), \tag{4}$$

where $c_{1,i}$ is the color value of the i-th pixel in region $r_1$ and $f(c_{1,i})$ is the probability that $c_{1,i}$ occurs in image $I$; $c_{2,j}$ is the color value of the j-th pixel in region $r_2$ and $f(c_{2,j})$ is the probability that $c_{2,j}$ occurs in image $I$; $D(c_{1,i}, c_{2,j})$ is the color distance between the two pixels $c_{1,i}$ and $c_{2,j}$.

Then spatial neighborhood information is added to formula (3) to increase the influence of spatially close regions:

$$S(r_k) = \sum_{r_k \neq r_i} \exp\!\bigl(-D_s(r_k, r_i) / \sigma_s^2\bigr)\, w(r_i)\, D_r(r_k, r_i), \tag{5}$$

where $D_s(r_k, r_i)$ is the spatial distance between regions $r_k$ and $r_i$ (the Euclidean distance between the two region centroids) and $\sigma_s$ controls the strength of the spatial weighting.

In step 3, the foreground salient region is segmented with a binarization threshold. The implementation is as follows: the salient region of the video frame computed by formula (5) is converted from real-valued data to an 8-bit unsigned grayscale image; a threshold in [0, 255] is set and used to binarize it, and the resulting binary image is taken as the foreground region RCmap of the input video frame.

In step 4, a morphological gradient transform is applied to the segmented foreground region to generate the new global salient edge region RCBmap. The implementation applies the following formula to RCmap:

$$\mathrm{RCBmap} = \mathrm{morph\_grad}(\mathrm{RCmap}) = \mathrm{dilate}(\mathrm{RCmap}) - \mathrm{erode}(\mathrm{RCmap}), \tag{6}$$

where morph_grad(·) denotes the morphological gradient operation, and dilate(·) and erode(·) denote the dilation and erosion operations respectively.

In step 5, the optical flow field is corrected using feature point pairs and random sample consensus. The implementation is as follows: first, a dense optical flow algorithm computes the dense optical flow field vector $\omega_t$ of the current video frame; feature point pairs are formed from the SURF feature points and the key-feature strong corners of the two consecutive frames; the RANSAC algorithm is then applied to these feature point pairs to obtain the corrected optical flow field vector $\omega'_t$.

In step 6, all grids of the video frame at different scales are traversed and the strong corners are extracted. The implementation is as follows.

For the video frame at each scale obtained by down-sampling, the frame is first divided into grid cells of n×n pixels, and the strong corners of the current video frame $I$ are then extracted with the following formula:

$$T = 0.001 \times \max_{i \in I} \min(\lambda_i^1, \lambda_i^2), \tag{7}$$

where $(\lambda_i^1, \lambda_i^2)$ are the eigenvalues of the 2×2 gradient covariance matrix computed from the image derivatives over the neighborhood of each pixel i in video frame $I$. For every pixel whose corresponding eigenvalue exceeds the threshold T, its coordinate position in video frame $I$ is recorded; if that coordinate falls within the n×n-pixel range of a grid cell of the frame, the center pixel of that cell is taken as a strong corner P.

In step 7, the strong corners within the salient edge region whose corrected optical flow magnitude is non-zero are collected as key-feature strong corners. The implementation is as follows: using the salient edge region RCBmap obtained by formula (6) for the video frame at each scale, the strong corners whose coordinates fall inside the salient edge region are selected from all strong corners of all grid cells of video frame t; if the normalized magnitude of the corrected optical flow vector of such a corner in the next frame exceeds the minimum optical flow threshold, the corner is taken as a key-feature strong corner $P_t$. The normalized magnitude of the optical flow motion vector at the i-th pixel is $\mathrm{mag}(I_i)$, computed as:

$$\mathrm{mag}(I_i) = \frac{\sqrt{I_i^u I_i^u + I_i^v I_i^v}}{\max_{\forall i \in I} \sqrt{I_i^u I_i^u + I_i^v I_i^v}}, \tag{8}$$

where $I_i$ is the motion vector of the current optical flow field $I$ at the i-th pixel, and $I_i^u$ and $I_i^v$ are its components in the horizontal and vertical directions respectively.

In step 8, the number of key-feature strong corners obtained in step 7 is checked; if the number is zero, the strong corners of step 6 are taken as the key-feature strong corners. The implementation is as follows: the number of key-feature strong corners collected by step 7 in the current video frame is checked; if it is zero, the restrictions of the salient edge region and the minimum optical flow threshold are lifted, and all strong corners of video frame t at the current scale, obtained by the method of step 6, are taken directly as the key-feature strong corners $P_t$.

In step 9, the displacement of the key-feature strong corners is computed from the corrected optical flow. The implementation is as follows: using the corrected optical flow field vector $\omega'_t$ computed in step 5, the displaced coordinate $P_{t+1}$ of the key-feature strong corner $P_t$ in frame t+1 is recorded as:

$$P_{t+1} = (x_{t+1}, y_{t+1}) = (x_t, y_t) + (M * \omega'_t)\big|_{(x_t, y_t)}, \tag{9}$$

where M is the kernel of the median filter and $(x_t, y_t)$ is the coordinate position of corner $P_t$ in the video frame.

In step 10, the local spatio-temporal feature of human action is formed from the coordinate displacement trajectory of each strong corner over consecutive frames together with the corner's neighborhood gradient vectors. The implementation is as follows: the coordinates $P_t$ to $P_{t+L}$ of each key-feature strong corner over L consecutive frames are recorded; the neighborhood gradient vectors of the corner over consecutive frames are described with the HOG, HOF and MBH descriptors, and a local-feature space-time volume of 16 pixels × 16 pixels × 5 frames is formed. Over L = 15 consecutive frames the corner yields 3 such local-feature space-time volumes; the HOG, HOF and MBH descriptors compute the neighborhood gradient vectors of the corner, which together make up the local spatio-temporal feature of human action.

Compared with the prior art, the beneficial effects of the invention are as follows. The salient region of foreground motion is segmented by a global-contrast saliency algorithm. Because gradient changes in motion edge regions are a highly discriminative visual cue, a global salient edge region is generated frame by frame through the morphological gradient transform; within it, and in combination with the corrected optical flow motion vectors, the key-feature strong corners with non-zero corrected optical flow magnitude are extracted, and the displacement trajectories of these strong corners over consecutive frames are estimated and combined with descriptors to form local-feature space-time volumes, yielding an action feature extraction method at the mid-level semantic level. The invention rejects background noise unrelated to human motion, eliminates the influence of camera shake on the HOF and MBH descriptors of the action features, improves the accuracy of the local spatio-temporal description of human action, and improves the human behavior recognition rate.

Brief description of the drawings

Fig. 1 is the flow chart of the embodiment of the invention.

Fig. 2 is an example of a local-feature space-time volume of an action in the embodiment of the invention.

Embodiment

To help those of ordinary skill in the art understand and implement the invention, the invention is described in further detail below with reference to the drawings and an embodiment. It should be understood that the embodiment described here only illustrates and explains the invention and is not intended to limit it.

Referring to Fig. 1, the human action feature extraction method based on a global salient edge region provided by the embodiment of the invention comprises the following steps:

Step 1: reduce the number of colors in the RGB color space and smooth the saliency in the color space. The specific implementation is as follows. The saliency S(·) of the k-th pixel $I_k$ of image $I$ is defined as:

$$S(I_k) = \sum_{\forall I_i \in I} D(I_k, I_i) = \sum_{\forall I_i \in I} \lVert I_k - I_i \rVert, \tag{1}$$

where $D(I_k, I_i)$ is the distance between pixel $I_k$ and pixel $I_i$ in the color space. Throughout this application S(·) denotes saliency (S abbreviates Saliency); for example, $S(I_k)$ is the saliency of the k-th pixel of image $I$. The index i in formula (1) denotes the i-th pixel of the image.

First, each of the 3 channels of the RGB color space is quantized to 12 distinct values, reducing the number of pixel colors to 12³ = 1728. Next, only the most frequently occurring colors are kept, which reduces the number of colors to n = 85 while ensuring that these colors cover more than 95% of the pixels. Finally, the saliency of each quantized color c is smoothed: it is refined by the weighted average of the saliency of its m nearest-neighbor colors. The formula is:

$$S'(c) = \frac{1}{(m - 1)\,T} \sum_{j=1}^{m} \bigl(T - D(c, c_j)\bigr)\, S(c_j), \tag{2}$$

where $T = \sum_{j=1}^{m} D(c, c_j)$ is the sum of the distances between color c and its m nearest-neighbor colors $c_j$. The subscript j denotes the j-th neighboring color, and c abbreviates color. After quantization, color c takes only 1728 values; reducing the colors further to n = 85 lowers the time complexity of the distance computation. $S(c_j)$ is the saliency of the j-th nearest-neighbor color of the quantized color c.
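The quantization and smoothing of step 1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, D is taken to be the Euclidean distance in RGB, and each color is assumed to count itself (at distance 0) among its m nearest neighbors, which makes the weights of formula (2) sum to 1. The palette colors are assumed to be distinct and m ≥ 2, so that T is non-zero.

```python
import numpy as np

def quantize_colors(img, levels=12):
    """Map each 8-bit RGB channel to `levels` bins; return one color index per pixel."""
    q = (img.astype(np.int64) * levels) // 256          # per-channel bin in [0, levels - 1]
    return (q[..., 0] * levels + q[..., 1]) * levels + q[..., 2]

def smooth_saliency(colors, saliency, m):
    """Formula (2): distance-weighted mean of saliency over the m nearest colors.

    Assumption: the color itself is one of its m nearest neighbors.
    """
    colors = np.asarray(colors, dtype=np.float64)
    saliency = np.asarray(saliency, dtype=np.float64)
    # Pairwise Euclidean color distances D(c, c_j)
    dist = np.linalg.norm(colors[:, None, :] - colors[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :m]           # indices of the m nearest colors
    d = np.take_along_axis(dist, nearest, axis=1)       # their distances
    T = d.sum(axis=1, keepdims=True)                    # T = sum_j D(c, c_j)
    weights = (T - d) / ((m - 1) * T)                   # each row sums to 1
    return (weights * saliency[nearest]).sum(axis=1)
```

Because the weights sum to 1, a palette with uniform saliency is left unchanged, while an isolated saliency value is pulled toward the values of nearby colors.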

Step 2: compute the salient regions from the spatial relationship of neighboring regions. The specific implementation is as follows.

First, an image segmentation algorithm divides the input video frame into multiple regions, and a color histogram is built for each region. For each region $r_k$, saliency is computed from its color contrast with the other regions:

$$S(r_k) = \sum_{r_k \neq r_i} w(r_i)\, D_r(r_k, r_i), \tag{3}$$

where $w(r_i)$ is the weight of region $r_i$ and $D_r(\cdot,\cdot)$ is the color distance between two regions. Every function written in the form D(·) in this application may use the distance metric of formula (1); $D(I_k, I_i)$ and $D(c, c_j)$ both denote the color distance between their two arguments. D abbreviates Distance, r abbreviates region and w abbreviates weight; $w(r_i)$ is the total number of pixels in the i-th region of the image, which emphasizes color contrast with large regions. The color distance between two regions $r_1$ and $r_2$ is:

$$D_r(r_1, r_2) = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} f(c_{1,i})\, f(c_{2,j})\, D(c_{1,i}, c_{2,j}), \tag{4}$$

where $c_{1,i}$ is the color value of the i-th pixel in region $r_1$ and $f(c_{1,i})$ is the probability that $c_{1,i}$ occurs in image $I$; $c_{2,j}$ is the color value of the j-th pixel in region $r_2$ and $f(c_{2,j})$ is the probability that $c_{2,j}$ occurs in image $I$; $D(c_{1,i}, c_{2,j})$ is the color distance between the two pixels $c_{1,i}$ and $c_{2,j}$.

$$S(r_k) = \sum_{r_k \neq r_i} \exp\!\bigl(-D_s(r_k, r_i) / \sigma_s^2\bigr)\, w(r_i)\, D_r(r_k, r_i), \tag{5}$$
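Formulas (3) to (5) can be evaluated for all regions at once with a few matrix operations. The sketch below is illustrative only: `region_saliency` and its arguments are hypothetical names, the segmentation and the histograms are assumed to be computed elsewhere, D is taken as the Euclidean RGB distance, and the default value of `sigma_s2` is an arbitrary placeholder that does not come from the patent.

```python
import numpy as np

def region_saliency(hists, palette, sizes, centroids, sigma_s2=0.4):
    """Spatially weighted region contrast, formula (5).

    hists:     (R, n) color frequencies f(c) of each region over n palette colors
    palette:   (n, 3) palette colors
    sizes:     (R,)   pixel counts w(r_i)
    centroids: (R, 2) region centroids (assumed normalized to [0, 1])
    sigma_s2:  spatial weighting strength sigma_s^2 (placeholder value)
    """
    hists = np.asarray(hists, dtype=np.float64)
    palette = np.asarray(palette, dtype=np.float64)
    sizes = np.asarray(sizes, dtype=np.float64)
    centroids = np.asarray(centroids, dtype=np.float64)

    # D(c_i, c_j): pairwise color distances between palette entries
    color_dist = np.linalg.norm(palette[:, None] - palette[None], axis=2)
    # D_r(r_a, r_b): formula (4) written as a bilinear form
    d_r = hists @ color_dist @ hists.T
    # D_s(r_a, r_b): Euclidean distance between region centroids
    d_s = np.linalg.norm(centroids[:, None] - centroids[None], axis=2)

    weight = np.exp(-d_s / sigma_s2) * sizes[None, :]
    np.fill_diagonal(weight, 0.0)        # the sum runs over r_i != r_k
    return (weight * d_r).sum(axis=1)
```

Setting `sigma_s2` to a very large value makes the exponential factor approach 1, which recovers the purely color-based contrast of formula (3).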

Step 3: segment the foreground salient region with a binarization threshold. The specific implementation is as follows: the salient region of the video frame computed by formula (5) is converted from real-valued data to an 8-bit unsigned grayscale image; the mean of this grayscale image is used as the threshold for binarization, and the resulting binary image is the foreground salient region RCmap of the input video frame.

Step 4: apply a morphological gradient transform to the segmented foreground region to generate the new global salient edge region RCBmap. The specific implementation applies the morphological gradient transform of the following formula to RCmap 2 times, to widen the salient edge region:

$$\mathrm{RCBmap} = \mathrm{morph\_grad}(\mathrm{RCmap}) = \mathrm{dilate}(\mathrm{RCmap}) - \mathrm{erode}(\mathrm{RCmap}), \tag{6}$$

where morph_grad(·) denotes the morphological gradient operation, and dilate(·) and erode(·) denote the dilation and erosion operations respectively. The subscript i in formulas (3) and (5) denotes the i-th region.
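Steps 3 and 4 can be sketched with NumPy alone. The helper names are hypothetical, the 3×3 square structuring element is an assumption (the patent does not state the kernel), and `iterations` is read here as repeating the dilation and the erosion before subtracting, which is only one possible interpretation of applying the transform 2 times.

```python
import numpy as np

def binarize_by_mean(saliency):
    """Scale a real-valued saliency map to uint8 and threshold at its mean gray level."""
    s = np.asarray(saliency, dtype=np.float64)
    span = s.max() - s.min()
    if span == 0:
        gray = np.zeros(s.shape, dtype=np.uint8)
    else:
        gray = np.round(255.0 * (s - s.min()) / span).astype(np.uint8)
    return (gray > gray.mean()).astype(np.uint8)

def _windows3x3(mask, pad_value):
    """Stack the nine 3x3-shifted copies of mask (border padded with pad_value)."""
    p = np.pad(mask, 1, constant_values=pad_value)
    h, w = mask.shape
    return np.stack([p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)])

def morph_gradient(mask, iterations=1):
    """Formula (6): dilate(mask) - erode(mask) with a 3x3 square element (assumed)."""
    dil, ero = mask, mask
    for _ in range(iterations):
        dil = _windows3x3(dil, 0).max(axis=0)   # dilation: neighborhood maximum
        ero = _windows3x3(ero, 1).min(axis=0)   # erosion: neighborhood minimum
    return dil - ero                            # dil >= ero, so no uint8 underflow
```

On a filled foreground blob the result is a band that straddles the blob's boundary; more iterations widen the band, which matches the stated purpose of enlarging the salient edge region.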

Step 5: correct the optical flow field using feature point pairs and random sample consensus. The implementation is as follows: first, a dense optical flow algorithm computes the dense optical flow field vector $\omega_t$ of the current video frame; feature point pairs are formed from the SURF feature points and the key-feature strong corners of the two consecutive frames; the RANSAC algorithm is then applied to these feature point pairs to obtain the corrected optical flow field vector $\omega'_t$.

Step 6: traverse all grids of the video frame at different scales and extract the strong corners. The implementation is as follows: for the video frame at each scale obtained by down-sampling, the frame is first divided into grid cells of n×n pixels, and the Harris strong corners of the current video frame $I$ are then extracted with the following formula:

$$T = 0.001 \times \max_{i \in I} \min(\lambda_i^1, \lambda_i^2), \tag{7}$$

where $(\lambda_i^1, \lambda_i^2)$ are the eigenvalues of the 2×2 gradient covariance matrix computed from the image derivatives over the 3×3 pixel neighborhood around each pixel i in video frame $I$. For every pixel whose corresponding eigenvalue exceeds the threshold T, its coordinate position in video frame $I$ is recorded; if that coordinate falls within the n×n-pixel range of a grid cell of the frame, the center pixel of that cell is taken as a strong corner P.
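The thresholding of formula (7) and the snapping of detections to grid-cell centers can be sketched as follows. This is a hedged illustration: the function names are hypothetical, image derivatives are taken with central differences via `np.gradient`, the 3×3 neighborhood is an unweighted box sum, and the grid size `n=5` is an arbitrary example value.

```python
import numpy as np

def min_eigenvalue_map(gray):
    """Smaller eigenvalue of the 2x2 gradient covariance matrix over a 3x3 window."""
    g = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(g)                     # derivatives along rows and columns

    def box3(a):
        p = np.pad(a, 1, mode="edge")
        h, w = a.shape
        return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))

    a, b, c = box3(gx * gx), box3(gx * gy), box3(gy * gy)
    # Closed-form smaller eigenvalue of [[a, b], [b, c]]
    return 0.5 * ((a + c) - np.sqrt((a - c) ** 2 + 4.0 * b * b))

def grid_corners(gray, n=5, quality=0.001):
    """Return sorted (x, y) centers of the n x n cells that contain a pixel above T."""
    lam = min_eigenvalue_map(gray)
    T = quality * lam.max()                     # formula (7)
    ys, xs = np.nonzero(lam > T)
    cells = {(int(y) // n, int(x) // n) for y, x in zip(ys, xs)}
    return sorted((cx * n + n // 2, cy * n + n // 2) for cy, cx in cells)
```

Reporting the cell center rather than the raw pixel yields at most one strong corner per cell, which keeps the sampled points evenly spread over the frame.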

Step 7: within the salient edge region, collect the strong corners whose corrected optical flow magnitude is non-zero as key-feature strong corners. The implementation is as follows: using the salient edge region RCBmap obtained by formula (6) for the video frame at each scale, the strong corners whose coordinates fall inside the salient edge region are selected from all strong corners of all grid cells of video frame t; if the normalized magnitude of the corrected optical flow vector of such a corner in the next frame exceeds the minimum optical flow threshold (which may be set to 0.001), the corner is taken as a key-feature strong corner $P_t$. The normalized magnitude of the optical flow motion vector at the i-th pixel is $\mathrm{mag}(I_i)$, computed as:

$$\mathrm{mag}(I_i) = \frac{\sqrt{I_i^u I_i^u + I_i^v I_i^v}}{\max_{\forall i \in I} \sqrt{I_i^u I_i^u + I_i^v I_i^v}}, \tag{8}$$

where $I_i$ is the motion vector of the current optical flow field $I$ at the i-th pixel, and $I_i^u$ and $I_i^v$ are its components in the horizontal and vertical directions respectively.

Step 8: check the number of key-feature strong corners obtained in step 7; if the number is zero, take the strong corners of step 6 as the key-feature strong corners. The implementation is as follows: the number of key-feature strong corners collected by step 7 in the current video frame is checked; if it is zero, the restrictions of the salient edge region and the minimum optical flow threshold are lifted, and all strong corners of video frame t at the current scale, obtained by the method of step 6, are taken directly as the key-feature strong corners $P_t$.
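The selection rule of step 7 and the fallback of step 8 fit in a few lines. In this sketch the names are hypothetical, `flow` is assumed to be an H×W×2 array holding the horizontal and vertical components of the corrected optical flow, `edge_mask` is the binary RCBmap, and corners are integer (x, y) pairs.

```python
import numpy as np

def normalized_magnitude(flow):
    """Formula (8): per-pixel flow magnitude divided by the frame's maximum magnitude."""
    mag = np.hypot(flow[..., 0], flow[..., 1])
    peak = mag.max()
    return mag / peak if peak > 0 else mag

def select_key_corners(corners, edge_mask, flow, min_flow=0.001):
    """Step 7 filter with the step 8 fallback to all strong corners."""
    mag = normalized_magnitude(flow)
    kept = [(x, y) for x, y in corners
            if edge_mask[y, x] and mag[y, x] > min_flow]
    # Step 8: if nothing survives, lift both restrictions and keep every strong corner
    return kept if kept else list(corners)
```

The fallback guarantees that every frame contributes some key-feature corners, even when the foreground is static or the saliency mask is empty.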

Step 9: compute the displacement of the key-feature strong corners from the corrected optical flow. The implementation is as follows: using the corrected optical flow field vector $\omega'_t$ computed in step 5, the displaced coordinate $P_{t+1}$ of the key-feature strong corner $P_t$ in frame t+1 is recorded as:

$$P_{t+1} = (x_{t+1}, y_{t+1}) = (x_t, y_t) + (M * \omega'_t)\big|_{(x_t, y_t)}, \tag{9}$$
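Formula (9) amounts to median-filtering the corrected flow field and sampling it at each corner position. The sketch below uses hypothetical names and assumes a 3×3 median kernel (the patent does not give the size of M) and nearest-pixel sampling of the filtered field.

```python
import numpy as np

def median_filter3(channel):
    """3x3 median filter with edge replication (the kernel size is an assumption)."""
    p = np.pad(channel, 1, mode="edge")
    h, w = channel.shape
    stack = np.stack([p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)])
    return np.median(stack, axis=0)

def advance_points(points, flow):
    """Formula (9): P_{t+1} = P_t + (M * w'_t) sampled at P_t."""
    u = median_filter3(flow[..., 0])            # horizontal component
    v = median_filter3(flow[..., 1])            # vertical component
    h, w = u.shape
    moved = []
    for x, y in points:
        xi = min(max(int(round(x)), 0), w - 1)  # clamp the sample position to the frame
        yi = min(max(int(round(y)), 0), h - 1)
        moved.append((float(x + u[yi, xi]), float(y + v[yi, xi])))
    return moved
```

The median filter suppresses isolated flow outliers before sampling, so a single bad flow vector at a corner does not derail its track.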

Step 10: form the local spatio-temporal feature of human action from the coordinate displacement trajectory of each strong corner over consecutive frames together with the corner's neighborhood gradient vectors. The implementation is as follows: the coordinates $P_t$ to $P_{t+L}$ of each key-feature strong corner over L consecutive frames are recorded; the neighborhood gradient vectors of the corner over consecutive frames are described with descriptors such as HOG, HOF and MBH, and a local-feature space-time volume of 16 pixels × 16 pixels × 5 frames is formed. Over L = 15 consecutive frames the corner yields 3 such local-feature space-time volumes; the HOG, HOF and MBH descriptors compute the neighborhood gradient vectors of the corner, and the results are concatenated to form the local spatio-temporal feature of human action.
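The bookkeeping of step 10 can be illustrated without the descriptors themselves. The sketch below, with hypothetical function names, computes the frame-to-frame displacements of one track and the frame ranges of the 3 space-time volumes obtained for L = 15 with 5 frames per volume; the HOG, HOF and MBH computations are left out.

```python
import numpy as np

def trajectory_displacements(track):
    """Frame-to-frame displacements of one corner track of L + 1 points, shape (L, 2)."""
    pts = np.asarray(track, dtype=np.float64)
    return pts[1:] - pts[:-1]

def split_volumes(track_length=15, frames_per_volume=5):
    """Frame index ranges [start, stop) of the consecutive space-time volumes."""
    return [(start, start + frames_per_volume)
            for start in range(0, track_length, frames_per_volume)]
```

Each returned range selects the frames from which a 16 × 16 pixel patch around the tracked corner would be cropped before the descriptors are computed.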

It should be understood that the parts not elaborated in this specification belong to the prior art.

It should be understood that the above description of the preferred embodiment is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the invention. Under the teaching of the invention, and without departing from the scope protected by the claims, those of ordinary skill in the art may make substitutions or variations, all of which fall within the scope of protection of the invention. The scope of protection claimed shall be determined by the appended claims.

Claims

1., based on the human action feature extracting method in overall prominent edge region, it is characterized in that, comprise the following steps:

Step 3: adopt binary-state threshold segmentation prospect marking area;

2. the human action feature extracting method based on overall prominent edge region according to claim 1, is characterized in that: in described step 1, reduces the number of colors of rgb color space, the significance in smooth color space; Specific implementation process is:

A kth pixel I in definition image I _ksignificance S () be:

Wherein, for color c and m neighbour color c _jbetween distance; Subscript j represents a jth contiguous color.

3. the human action feature extracting method based on overall prominent edge region according to claim 2, is characterized in that: in described step 2, and the spatial relationship according to adjacent area calculates salient region; Implementation procedure is:

4. The human action feature extraction method based on a global salient edge region according to claim 3, characterized in that: in said step 3, the foreground salient region is segmented using a binarization threshold; the implementation is as follows: the salient region of the video frame computed by formula (5) is converted from real-valued data to an 8-bit unsigned grayscale image, a threshold in the range [0, 255] is set to binarize it, and the resulting binary image is taken as the foreground region RCmap of the input video frame.

5. The human action feature extraction method based on a global salient edge region according to claim 4, characterized in that: in said step 4, a morphological gradient is applied to the segmented foreground region to generate a new global salient edge region RCBmap; the implementation is as follows: the morphological gradient of RCmap is computed with the following formula:

RCBmap = morph_grad(RCmap) = dilate(RCmap) - erode(RCmap)    (6).
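The binarization of claim 4 and the morphological gradient of formula (6) can be sketched in pure Python as below. Several points are assumptions made only for illustration: the saliency map is taken to lie in [0, 1], the structuring element is a 3 × 3 square, windows are clipped at the image border, and the function names are hypothetical. In practice an image-processing library would normally perform these operations.

```python
def to_uint8(saliency):
    """Scale a real-valued map (assumed in [0, 1]) to 8-bit unsigned values."""
    return [[min(255, max(0, int(round(v * 255)))) for v in row]
            for row in saliency]


def binarize(gray, thresh):
    """Threshold an 8-bit image: 255 above thresh, 0 otherwise (RCmap)."""
    return [[255 if v > thresh else 0 for v in row] for row in gray]


def _window_reduce(img, reducer):
    """Apply reducer (max or min) over a 3 x 3 window clipped at the border."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - 1), min(h, y + 2))
                    for xx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = reducer(vals)
    return out


def morph_gradient(rcmap):
    """Formula (6): dilate(RCmap) - erode(RCmap), giving RCBmap."""
    dilated = _window_reduce(rcmap, max)
    eroded = _window_reduce(rcmap, min)
    return [[d - e for d, e in zip(row_d, row_e)]
            for row_d, row_e in zip(dilated, eroded)]
```

For a solid foreground blob, the gradient is non-zero only in a band around the blob boundary, which is why RCBmap acts as an edge region rather than a filled region.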

6. The human action feature extraction method based on a global salient edge region according to claim 5, characterized in that: in said step 5, the optical flow field is corrected using feature point pairs and random sample consensus; the implementation is as follows: first, an optical flow algorithm is used to obtain the dense optical flow field vector ω_t of the current video frame; feature point pairs are then formed from the SURF feature points and the key-feature strong corner points of two consecutive frames; finally, the corrected optical flow field vector is obtained using the RANSAC algorithm and these feature point pairs.

7. The human action feature extraction method based on a global salient edge region according to claim 6, characterized in that: in said step 6, strong corner points are extracted from all grids of the video frame at different scales; the implementation is as follows:

8. The human action feature extraction method based on a global salient edge region according to claim 7, characterized in that: in said step 7, the strong corner points within the salient edge region whose corrected optical flow magnitude is non-zero are collected as key-feature strong corner points; the implementation is as follows: using the salient edge region RCBmap of the video frame at each scale obtained by formula (6), the strong corner points whose coordinates fall within the salient edge region are selected from all strong corner points in all grids of video frame t; if the normalized magnitude of the corrected optical flow vector of such a corner point in the next frame is greater than the minimum optical flow threshold, the corner point is taken as a key-feature strong corner point P_t. The normalized magnitude of the motion vector of the optical flow field at the i-th pixel is mag(I_i), computed as follows:

Here the motion vector of the current optical flow field I at the i-th pixel is decomposed into its components in the horizontal direction and in the vertical direction;

In said step 8, the number of key-feature strong corner points obtained in step 7 is checked; if the number is zero, the strong corner points of step 6 are taken as the key-feature strong corner points; the implementation is as follows: check the number of key-feature strong corner points collected by step 7 in the current video frame; if the number is zero, the restrictions of the salient edge region and the minimum optical flow threshold are lifted, and all strong corner points of video frame t at the current scale, obtained by the method of step 6, are taken directly as the key-feature strong corner points P_t.
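The selection rule of steps 7 and 8 can be sketched as follows. The magnitude formula is not reproduced in the text above, so the Euclidean norm of the two flow components is assumed here, and the flow values are assumed to be normalized already. The function and parameter names are hypothetical.

```python
import math


def select_key_corners(corners, edge_mask, flow, min_flow):
    """Keep corners inside the salient edge region that actually move.

    corners:   list of integer (x, y) coordinates of strong corner points
    edge_mask: 2-D list, non-zero where the pixel lies in RCBmap
    flow:      2-D list of (u, v) corrected flow vectors, assumed normalized
    min_flow:  minimum optical flow threshold

    Assumption: the magnitude is the Euclidean norm of (u, v).
    """
    kept = [(x, y) for (x, y) in corners
            if edge_mask[y][x] and math.hypot(*flow[y][x]) > min_flow]
    # Step 8 fallback: if nothing survives, lift both restrictions and
    # use every strong corner point found in step 6.
    return kept if kept else list(corners)
```

The fallback keeps the method from producing no trajectories for frames in which the salient edge region is empty or the scene is nearly static.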

9. The human action feature extraction method based on a global salient edge region according to claim 8, characterized in that: in said step 9, the displacement of each key-feature strong corner point is computed from the corrected optical flow; the implementation is as follows: using the corrected optical flow field vector computed in step 5, record the displaced coordinate P_{t+1} in frame t+1 of the key-feature strong corner point P_t, with the following formula:

P_{t+1} = (x_{t+1}, y_{t+1}) = (x_t, y_t) + (M * ω_t′)|_{(x_t, y_t)}    (9)
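The update of formula (9) can be sketched as below. The kernel M is not defined in the text above; this sketch assumes it is a 3 × 3 median filter applied to the corrected flow field, as is common in dense-trajectory methods, and that the tracked point lies inside the frame. The function names are hypothetical.

```python
from statistics import median


def median_flow_at(flow, x, y, radius=1):
    """Median of the corrected flow in a window around (x, y).

    Assumption: M in formula (9) is a median filter of size
    (2 * radius + 1), with the window clipped at the frame border.
    """
    h, w = len(flow), len(flow[0])
    xi, yi = int(round(x)), int(round(y))
    us, vs = [], []
    for yy in range(max(0, yi - radius), min(h, yi + radius + 1)):
        for xx in range(max(0, xi - radius), min(w, xi + radius + 1)):
            u, v = flow[yy][xx]
            us.append(u)
            vs.append(v)
    return median(us), median(vs)


def advance(point, flow):
    """Formula (9): P_{t+1} = P_t + (M * flow) evaluated at P_t."""
    x, y = point
    u, v = median_flow_at(flow, x, y)
    return (x + u, y + v)
```

Under this assumption a single outlying flow vector at the tracked pixel does not pull the trajectory away, because the median of the window ignores it.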

10. The human action feature extraction method based on a global salient edge region according to claim 9, characterized in that: in said step 10, the local spatio-temporal feature of the human action is formed from the coordinate displacement trajectory of each strong corner point over consecutive frames, together with the gradient vectors of the corner point's neighborhood; the implementation is as follows: record the coordinates P_t to P_{t+L} of each key-feature strong corner point over L consecutive frames; around this corner point, the neighborhood gradient descriptors HOG, HOF and MBH are computed over consecutive frames, with each local-feature spatio-temporal volume measuring 16 pixels × 16 pixels × 5 frames; for the 3 local-feature spatio-temporal volumes formed by this corner point over L = 15 consecutive frames, the HOG, HOF and MBH descriptors compute the gradient vectors of the corner point's neighborhood, which together form the local spatio-temporal feature of the human action.