CN104299007A - Classifier training method for behavior recognition - Google Patents

Classifier training method for behavior recognition

Info

Publication number
CN104299007A
CN104299007A
Authority
CN
China
Prior art keywords
space-time interest points
sub-block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410472263.2A
Other languages
Chinese (zh)
Inventor
解梅
许茂鹏
张碧武
卜英家
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201410472263.2A
Publication of CN104299007A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/19 - Recognition using electronic means
    • G06V30/192 - Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194 - References adjustable by an adaptive method, e.g. learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/23 - Recognition of whole body movements, e.g. for sport training
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 - Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a classifier training method for behavior recognition, and belongs to the technical field of image processing. The method comprises the following steps: the space-time interest points (STIPs) of an input motion video image stream are extracted with the Dollar detection operator; the partial holes between every two adjacent STIPs are filled by setting as new STIPs all pixels whose perpendicular distance to the line segment formed by the two STIPs is smaller than a preset threshold; all current STIPs are represented with the LDPD descriptor; a statistical histogram of the current motion video image stream is formed from the LDPD description vector of each STIP; and, with the histogram as a training sample, a behavior recognition classifier is output based on a support vector machine. The trained classifier is used for behavior recognition, is insensitive to initial parameters, and shows good robustness in behavior recognition.

Description

Classifier training method for behavior recognition
Technical field
The invention belongs to the technical field of image processing, and specifically relates to a classifier training method for behavior recognition.
Background art
Human behavior recognition in video has become a research field attracting wide attention, and behavior recognition is applied in many areas, including video indexing and browsing, video surveillance, gesture recognition, sports event analysis, and so on. Behavior recognition mainly divides into behavior analysis and recognition: only with good behavior analysis can behavior be recognized well. Although research institutions continue to study human action analysis, many problems remain open. This is because, in the real world, similar actions can be performed by objects of different shapes, appearances, speeds, and postures. In addition, occlusion by static or moving objects, illumination changes, and shadows can all have a large negative effect on human action analysis.
Early motion analysis schemes were based on templates and tracking. Such schemes require very detailed contour descriptions, which are hard to obtain in the real world. To address this problem, methods based on space-time interest points (STIPs, space-time features at locations of marked change along the time axis of a video) have been widely used in the behavior analysis stage of behavior recognition; their basic idea is to treat a continuous video as a space-time volume. Compared with template-based and tracking-based methods, they handle the negative effects of noise and camera motion better. STIP-based methods treat a human action as a container of space-time interest points: the STIPs are extracted from the continuous video and described with appearance descriptors (where each descriptor is defined as a visual word, and the word histogram is used for behavior recognition). STIP-based methods therefore rely only on the local appearance descriptors of individual points, and the spatio-temporal distribution information of the space-time interest points is ignored.
STIPs are extracted from a motion video by a detector (such as the Dollar detector), and a sparse action stream is built from the extracted STIPs. However, because the gaps between consecutive points in the sparse stream are too large and the stream contains too many holes, the space-time characteristics of the action cannot be effectively captured in the sparse motion stream. Especially in fast motion, the gaps between pairs of STIPs become very large, the space-time characteristics of the action essentially cannot be described, and recognition fails.
Summary of the invention
The object of the present invention is to propose a classifier training method that describes the space-time characteristics of an action on the basis of densified space-time feature points, so that behavior recognition can be accomplished with the obtained classifier.
The classifier training method for behavior recognition of the present invention comprises the following steps:
Step 1: Input a motion video image stream;
Step 2: Extract all space-time interest points of the motion video image stream with the Dollar detection operator to form the space-time interest point set G;
Step 3: For each pair of adjacent space-time interest points A and B, establish the line segment AB; set as new space-time interest points all pixels whose perpendicular distance to the segment AB is smaller than a predetermined threshold, and add them to the space-time interest point set G;
Step 4: Represent each space-time interest point in the current set G with the LDPD descriptor:
denote any space-time interest point as P(x, y, z), where x is the horizontal coordinate in its image, y is the vertical coordinate, and z is the frame number;
establish a three-dimensional cuboid V centered on the space-time interest point P, with its three side lengths along the x, y, and z axes of P respectively;
divide the cuboid V into N elementary sub-blocks along the y-axis, where N is greater than or equal to 2; within each elementary sub-block, further establish M intermediate sub-blocks smaller than the elementary sub-block, where M is greater than or equal to 2; take the number of space-time interest points contained in each elementary or intermediate sub-block as the feature of that sub-block; and represent the space-time interest point P by the M+N sub-block features;
Step 5: Based on the LDPD description vector of each space-time interest point, form the statistical histogram of the current motion video image stream; with the histogram as a training sample, output a behavior recognition classifier based on a support vector machine.
The technical effect of the present invention is as follows: a densified stream is built (between the adjacent space-time interest points of the existing sparse stream, points satisfying the condition are added as new space-time interest points, so as to fill the partial holes between pairs of space-time interest points), and a new descriptor, LDPD, is proposed, so that various actions can be described more accurately and clearly; a classifier for behavior recognition is trained on this basis, and behavior recognition is thereby realized. In addition, the present invention is insensitive to initial parameters and shows good robustness in behavior recognition.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with an embodiment.
This patent proposes a method that converts sparse local features into a dense motion stream. Specifically, the Dollar detector is first adopted to extract space-time interest points (STIPs) from the video, and a sparse motion stream is built from the extracted STIPs. The Dollar detector is a motion-analysis detection operator proposed by Dollar et al., composed of a Gabor filter together with sine and cosine functions; its main function is to extract the positions of local motion information from a motion video. Because the gaps between adjacent STIPs in the sparse motion stream are too large, the motion information of an action cannot be effectively captured by the sparse motion stream; especially in fast motion, the gaps between adjacent STIPs become very large, the motion information of the action essentially cannot be described, the action recognition rate is low, and recognition fails. To remove this limitation, the present invention proposes MFE (motion flow expansion) to convert the sparse motion stream into a dense motion stream. Its basic idea is to fill the partial holes between adjacent STIPs so that the feature-space surface becomes smoother and denser; the proposed spatio-temporal distribution descriptor LDPD (local distribution descriptor) is then used to measure the spatio-temporal distribution information in the dense motion stream; finally, the discrete local distribution descriptors are aggregated statistically into a corresponding histogram, which can be used for subsequent behavior classification. The method specifically comprises the following steps:
Step 1: Capture images containing moving objects with a camera to produce a continuous video stream;
Step 2: Extract all space-time interest points of the input video stream with the Dollar detection operator to form the space-time interest point set G;
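The patent does not spell out the detector formulas beyond naming its components (a Gabor filter with sine and cosine functions). Purely as an illustrative sketch under assumptions, the following Python code implements the published form of the Dollar detector: a 2D Gaussian spatial filter combined with a quadrature pair of 1D temporal Gabor filters, whose squared responses are summed; the parameter values sigma, tau, and the relative threshold are assumed here, not fixed by the patent.

import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, convolve1d

def dollar_response(video, sigma=2.0, tau=3.0):
    # video: (T, H, W) grayscale float array.
    # Spatial smoothing with a 2D Gaussian, applied frame by frame.
    smoothed = gaussian_filter(video, sigma=(0, sigma, sigma))
    # Quadrature pair of 1D temporal Gabor filters h_ev / h_od.
    omega = 4.0 / tau
    half = 2 * int(np.ceil(tau))
    t = np.arange(-half, half + 1)
    h_ev = -np.cos(2 * np.pi * t * omega) * np.exp(-t**2 / tau**2)
    h_od = -np.sin(2 * np.pi * t * omega) * np.exp(-t**2 / tau**2)
    even = convolve1d(smoothed, h_ev, axis=0)
    odd = convolve1d(smoothed, h_od, axis=0)
    return even**2 + odd**2        # response R

def detect_stips(video, sigma=2.0, tau=3.0, rel_thresh=0.3):
    # STIPs are local maxima of R above a fraction of the global maximum.
    R = dollar_response(video, sigma, tau)
    peaks = (R == maximum_filter(R, size=5)) & (R > rel_thresh * R.max())
    return np.argwhere(peaks)      # rows of (frame z, row y, col x)

A call such as detect_stips(video) yields the initial sparse point set G of this step (reorder the columns to (x, y, z) if that convention is preferred downstream).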
Step 3: For each pair of adjacent space-time interest points A and B, establish the line segment AB; set as new space-time interest points all pixels whose perpendicular distance to the segment AB is smaller than a predetermined threshold, and add them to the space-time interest point set G;
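A minimal sketch of this hole filling, assuming STIPs are integer (x, y, z) triples, that adjacent pairs have already been selected, and that the threshold value (here 1.0) is a free parameter the patent does not fix numerically:

import numpy as np

def fill_between(A, B, thresh=1.0):
    # Return voxels whose perpendicular distance to segment AB is < thresh.
    A, B = np.asarray(A, float), np.asarray(B, float)
    d = B - A
    length2 = np.dot(d, d)
    if length2 == 0:
        return []
    # Scan the bounding box of A and B, padded by the threshold.
    lo = np.floor(np.minimum(A, B) - thresh).astype(int)
    hi = np.ceil(np.maximum(A, B) + thresh).astype(int)
    new_points = []
    for idx in np.ndindex(*(hi - lo + 1)):
        q = np.asarray(idx) + lo
        t = np.dot(q - A, d) / length2             # projection onto AB
        if 0.0 <= t <= 1.0:                        # only between A and B
            dist = np.linalg.norm(q - (A + t * d)) # perpendicular distance
            if dist < thresh:
                new_points.append(tuple(int(v) for v in q))
    return new_points

The returned voxels are added to G as new space-time interest points; A and B themselves fall within the threshold and can simply be deduplicated against the existing set.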
Step 4: Represent each space-time interest point in the current set G with the LDPD descriptor:
denote any space-time interest point as P(x, y, z), where x is the horizontal coordinate in its image, y is the vertical coordinate, and z is the frame number;
establish a three-dimensional cuboid V centered on the space-time interest point P, with its three side lengths along the x, y, and z axes of P respectively;
divide the cuboid V into N elementary sub-blocks along the y-axis, where N is greater than or equal to 2; within each elementary sub-block, further establish M intermediate sub-blocks smaller than the elementary sub-block, where M is greater than or equal to 2; take the number of space-time interest points contained in each elementary or intermediate sub-block as the feature of that sub-block; and represent the space-time interest point P by the M+N sub-block features;
Step 5: Based on the LDPD description vector of each space-time interest point, form the statistical histogram of the current motion video image stream; with the histogram as a training sample, output a behavior recognition classifier based on a support vector machine. In this embodiment, the concrete processing procedure is as follows:
Based on the LDPD description vectors of all space-time interest points in all training samples (including running videos, walking videos, boxing videos, and so on), a clustering algorithm (for example, K-means clustering) is used to cluster the LDPD description vectors of all space-time interest points into K (K ≥ 5) center vectors, and these K center vectors serve as the codebook (the code words used for encoding). According to the positional relationship between the LDPD description vectors in each training sample and the codebook, all training samples are encoded with the nearest-neighbor rule, and the number of LDPD vectors nearest to each code word in each training sample is counted to form the histogram of that sample. Finally, the normalized histogram, used as the feature vector of each training sample, is fed into the support vector machine, which outputs the behavior recognition classifier.
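A compact sketch of this codebook-plus-SVM stage, assuming each training video has already been reduced to an array of per-STIP LDPD vectors; K-means stands in for "a clustering algorithm", and the choice K = 50 and the RBF kernel are assumptions rather than values fixed by the patent:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def train_classifier(videos_ldpd, labels, K=50):
    # videos_ldpd: list of (n_i, D) arrays; labels: one class id per video.
    # Codebook: cluster all LDPD vectors into K code words.
    codebook = KMeans(n_clusters=K, n_init=10).fit(np.vstack(videos_ldpd))
    # Encode each video: nearest code word per STIP, then a normalized histogram.
    hists = []
    for vecs in videos_ldpd:
        words = codebook.predict(vecs)
        h = np.bincount(words, minlength=K).astype(float)
        hists.append(h / h.sum())
    clf = SVC(kernel="rbf").fit(np.vstack(hists), labels)
    return codebook, clf

The returned codebook and classifier are exactly the two artifacts the recognition stage below consumes.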
When behavior recognition of an input video image stream is performed with the behavior recognition classifier obtained by the present invention, the LDPD description vectors of the space-time interest points of the current video image stream are first obtained according to steps 1-4 above (taking N = 3 and M = 2 as an example, each space-time interest point is a 6-dimensional feature vector, each dimension being the number of space-time interest points contained in one of the sub-blocks (an elementary or intermediate sub-block)); then, according to the positional relationship between the LDPD description vectors and the codebook obtained in training, the current video stream is encoded with the nearest-neighbor rule, and the number of LDPD vectors nearest to each code word is counted to form the histogram of the current motion video stream; finally, the normalized histogram, used as the feature vector of the current motion video stream, is classified by the behavior recognition classifier to identify the behavior category.
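Under the same assumptions as the training sketch, recognition simply mirrors the training-side encoding:

import numpy as np

def recognise(video_ldpd, codebook, clf):
    # Encode the new video against the trained codebook, then classify.
    words = codebook.predict(video_ldpd)
    h = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return clf.predict((h / h.sum()).reshape(1, -1))[0]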
The present invention is not limited to the aforementioned embodiment. The present invention extends to any new feature or any new combination disclosed in this specification, and to the steps of any new method or process disclosed or any new combination thereof.

Claims (1)

1. A classifier training method for behavior recognition, characterized by comprising the following steps:
Step 1: Input a motion video image stream;
Step 2: Extract all space-time interest points of the motion video image stream with the Dollar detection operator to form the space-time interest point set G;
Step 3: For each pair of adjacent space-time interest points A and B, establish the line segment AB; set as new space-time interest points all pixels whose perpendicular distance to the segment AB is smaller than a predetermined threshold, and add them to the space-time interest point set G;
Step 4: Represent each space-time interest point in the current set G with the LDPD descriptor:
denote any space-time interest point as P(x, y, z), where x is the horizontal coordinate in its image, y is the vertical coordinate, and z is the frame number;
establish a three-dimensional cuboid V centered on the space-time interest point P, with its three side lengths along the x, y, and z axes of P respectively;
divide the cuboid V into N elementary sub-blocks along the y-axis, where N is greater than or equal to 2; within each elementary sub-block, further establish M intermediate sub-blocks smaller than the elementary sub-block, where M is greater than or equal to 2; take the number of space-time interest points contained in each elementary or intermediate sub-block as the feature of that sub-block; and represent the space-time interest point P by the M+N sub-block features;
Step 5: Based on the LDPD description vector of each space-time interest point, form the statistical histogram of the current motion video image stream; with the histogram as a training sample, output a behavior recognition classifier based on a support vector machine.
CN201410472263.2A 2014-09-17 2014-09-17 Classifier training method for behavior recognition Pending CN104299007A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410472263.2A CN104299007A (en) 2014-09-17 2014-09-17 Classifier training method for behavior recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410472263.2A CN104299007A (en) 2014-09-17 2014-09-17 Classifier training method for behavior recognition

Publications (1)

Publication Number Publication Date
CN104299007A (en) 2015-01-21

Family

ID=52318728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410472263.2A Pending CN104299007A (en) 2014-09-17 2014-09-17 Classifier training method for behavior recognition

Country Status (1)

Country Link
CN (1) CN104299007A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598889A (en) * 2015-01-30 2015-05-06 北京信息科技大学 Human action recognition method and device
CN105930789A (en) * 2016-04-18 2016-09-07 电子科技大学 Human body behavior recognition based on logarithmic Euclidean space BOW (bag of words) model
CN109583335A (en) * 2018-11-16 2019-04-05 中山大学 A kind of video human Activity recognition method based on Spatial-temporal Information Fusion
CN110059662A (en) * 2019-04-26 2019-07-26 山东大学 A kind of deep video Activity recognition method and system
CN110954893A (en) * 2019-12-23 2020-04-03 山东师范大学 Method and system for identifying wall rear action based on wireless router
CN111860598A (en) * 2020-06-18 2020-10-30 中国地质大学(武汉) Data analysis method and electronic equipment for identifying sports behaviors and relationships

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090220155A1 (en) * 2008-02-29 2009-09-03 Canon Kabushiki Kaisha Image processing method, pattern detection method, pattern recognition method, and image processing device
CN101894276A (en) * 2010-06-01 2010-11-24 中国科学院计算技术研究所 Training method of human action recognition and recognition method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090220155A1 (en) * 2008-02-29 2009-09-03 Canon Kabushiki Kaisha Image processing method, pattern detection method, pattern recognition method, and image processing device
CN101894276A (en) * 2010-06-01 2010-11-24 中国科学院计算技术研究所 Training method of human action recognition and recognition method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
孔祥斌 (Kong Xiangbin): "Research on Human Behavior Recognition Algorithms in Video" (视频中的人体行为识别算法研究), China Master's Theses Full-text Database, Information Science and Technology series *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598889A (en) * 2015-01-30 2015-05-06 北京信息科技大学 Human action recognition method and device
CN104598889B (en) * 2015-01-30 2018-02-09 北京信息科技大学 The method and apparatus of Human bodys' response
CN105930789A (en) * 2016-04-18 2016-09-07 电子科技大学 Human body behavior recognition based on logarithmic Euclidean space BOW (bag of words) model
CN105930789B (en) * 2016-04-18 2019-08-13 电子科技大学 Human bodys' response based on logarithm theorem in Euclid space bag of words
CN109583335A (en) * 2018-11-16 2019-04-05 中山大学 A kind of video human Activity recognition method based on Spatial-temporal Information Fusion
CN110059662A (en) * 2019-04-26 2019-07-26 山东大学 A kind of deep video Activity recognition method and system
CN110059662B (en) * 2019-04-26 2021-03-23 山东大学 Deep video behavior identification method and system
CN110954893A (en) * 2019-12-23 2020-04-03 山东师范大学 Method and system for identifying wall rear action based on wireless router
CN111860598A (en) * 2020-06-18 2020-10-30 中国地质大学(武汉) Data analysis method and electronic equipment for identifying sports behaviors and relationships

Similar Documents

Publication Publication Date Title
CN104299007A (en) Classifier training method for behavior recognition
CN106845415B (en) Pedestrian fine identification method and device based on deep learning
CN106709453B (en) Sports video key posture extraction method based on deep learning
CN106203513B (en) A kind of statistical method based on pedestrian's head and shoulder multi-target detection and tracking
Asmaa et al. Road traffic density estimation using microscopic and macroscopic parameters
CN103246896B (en) A kind of real-time detection and tracking method of robustness vehicle
CN105528794A (en) Moving object detection method based on Gaussian mixture model and superpixel segmentation
CN105488812A (en) Motion-feature-fused space-time significance detection method
Ren et al. A novel squeeze YOLO-based real-time people counting approach
CN104268900A (en) Motion object detection method and device
CN104732236B (en) A kind of crowd's abnormal behaviour intelligent detecting method based on layered shaping
CN105184812A (en) Target tracking-based pedestrian loitering detection algorithm
CN105809206A (en) Pedestrian tracking method
CN108647599A (en) In conjunction with the Human bodys' response method of 3D spring layers connection and Recognition with Recurrent Neural Network
Karpagavalli et al. Estimating the density of the people and counting the number of people in a crowd environment for human safety
CN105069816B (en) A kind of method and system of inlet and outlet people flow rate statistical
CN105096342A (en) Intrusion detection algorithm based on Fourier descriptor and histogram of oriented gradient
CN106023249A (en) Moving object detection method based on local binary similarity pattern
CN104778472B (en) Human face expression feature extracting method
CN106295532A (en) A kind of human motion recognition method in video image
CN103955671A (en) Human behavior recognition method based on rapid discriminant common vector algorithm
CN102194270B (en) Statistical method for pedestrian flow based on heuristic information
CN103577804A (en) Abnormal human behavior identification method based on SIFT flow and hidden conditional random fields
CN106951827A (en) A kind of global abnormal behavioral value method based on target kinetic characteristic
CN110688969A (en) Video frame human behavior identification method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150121

WD01 Invention patent application deemed withdrawn after publication