KR20160124948A - Tensor Divergence Feature Extraction System based on HoG and HOF for video object action classification - Google Patents

Tensor Divergence Feature Extraction System based on HoG and HOF for video object action classification Download PDF

Info

Publication number
KR20160124948A
Authority
KR
South Korea
Prior art keywords
vector
video
tensor
feature
calculating
Prior art date
Application number
KR1020150055044A
Other languages
Korean (ko)
Other versions
KR101713189B1 (en)
Inventor
김진영
뷔넉남
민소희
김정기
Original Assignee
전남대학교산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 전남대학교산학협력단 filed Critical 전남대학교산학협력단
Priority to KR1020150055044A priority Critical patent/KR101713189B1/en
Publication of KR20160124948A publication Critical patent/KR20160124948A/en
Application granted granted Critical
Publication of KR101713189B1 publication Critical patent/KR101713189B1/en

Links

Images

Classifications

    • G06K9/00744
    • G06K9/00751
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a method of extracting feature information for classifying object behavior in video, comprising the steps of: calculating a gradient vector and an optical flow vector at selected key points in the video frames of a video to be processed; obtaining a tensor product of the gradient vector and the optical flow vector; calculating a tensor divergence of the tensor product so as to lower its dimension; and determining the calculated tensor divergence as a feature vector for motion classification. According to this feature information extraction method and feature extractor for video object behavior classification, fusing the HOG and HOF features into a single feature reflects changes in both space and time without increasing the dimension of the feature vector, so the performance of behavior classification in video can be improved without increasing the computational complexity of the classifier.

Description

TECHNICAL FIELD The present invention relates to a method and an apparatus for extracting HOG/HOF-based feature information for classification of video object behavior.

The present invention relates to an HOG/HOF-based feature information extraction method and feature extractor for video object behavior classification, and more particularly, to a feature information extraction method and feature extractor that improve the performance of classifying the motion of an object in video.

A variety of feature extraction methods have been proposed for classifying the behavior of an object such as a human action in a video.

In the case of human behavior, there are various behaviors such as walking, running, jogging, and stretching out an arm, and techniques have been developed to recognize these behaviors.

In particular, excellent features for expressing object behavior have recently been developed: HOG (histogram of oriented gradients), based on the spatial gradient of the image; HOF (histogram of optical flow), based on the temporal optical flow of the image; MBH (motion boundary histogram), which uses optical flow derivatives and is robust to camera motion; and SIFT (scale-invariant feature transform).

MBH is obtained separately for the x-axis and the y-axis, giving what are called MBHx and MBHy.

A classifier must be used to classify the behavior of objects from the extracted features. Common classifiers include the HMM (hidden Markov model), the SVM (support vector machine), the Fisher-vector-based GMM (Gaussian mixture model), and the histogram-feature-based BOF (bag of features).

For methods of extracting object behavior features, see H. Wang, A. Kläser, C. Schmid, and C.-L. Liu, "Dense Trajectories and Motion Boundary Descriptors for Action Recognition," International Journal of Computer Vision, vol. 103, no. 1, pp. 60-79, May 2013.

On the other hand, there is a demand for a feature extraction method in video that can improve the classification performance while reducing the amount of computation of the classifier applied for classifying the behavior of the object.

The present invention has been made to solve the above-mentioned problems, and it is an object of the present invention to provide a feature information extraction method and feature extractor for video object behavior classification that combine the gradient information and the optical flow information of an image into feature information for classifying the behavior of an object in video, while reducing the amount of calculation.

According to an aspect of the present invention, a feature information extraction method for video object behavior classification comprises: calculating a gradient vector and an optical flow vector at selected key points in the video frames of a video to be processed; obtaining a tensor product of the two vectors; and calculating the tensor divergence of the tensor product and determining it as a feature vector for motion classification.

According to the feature information extraction method and feature extractor for video object behavior classification of the present invention, fusing the HOG and HOF features into a single feature reflects changes in both space and time without increasing the dimension of the feature vector, so the performance of behavior classification in video can be improved without increasing the amount of calculation of the classifier used as the recognizer.

FIG. 1 is a flowchart illustrating a feature information extraction process for video object behavior classification according to the present invention;
FIG. 2 is a flowchart showing the feature information extraction process of FIG. 1 in detail;
FIG. 3 is a diagram explaining an example of applying the feature information extracted according to the present invention to behavior recognition;
FIGS. 4 and 5 are diagrams illustrating the process of calculating a histogram from the feature information extracted according to the present invention; and
FIG. 6 is a block diagram illustrating a feature information extractor for video object behavior classification according to the present invention.

Hereinafter, a feature information extracting method and a feature extractor for classifying a video object behavior according to a preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a flowchart showing a feature information extraction process for classifying a video object behavior according to the present invention.

First, key points are selected in the image frames of an input video to be processed (step 10), and a gradient vector (g) and an optical flow vector (f) are calculated at each selected key point (step 20).

Next, the tensor product of the calculated gradient vector (g) and optical flow vector (f) is computed (step 30), and a tensor divergence is calculated for the tensor product so as to lower its dimension (step 40).

Then, the calculated tensor divergence is determined as the feature vector for motion classification, and the feature vector data is provided to a BOF or SVM classifier, which classifies the behavior (step 50).

Hereinafter, this processing is described in more detail with reference to FIGS. 2 to 5.

First, when key points, which are the principal pixel points in a video frame, are selected, let the feature of each key point consist of a gradient vector $\mathbf{g}$ and an optical flow vector $\mathbf{f}$, written component-wise as in equation (1):

$$\mathbf{g} = (g_x, g_y)^T, \qquad \mathbf{f} = (f_x, f_y)^T \qquad (1)$$
The tensor product of the gradient vector g and the optical flow vector f can be expressed by the following equation (2).

$$\mathbf{T} = \mathbf{g} \otimes \mathbf{f} \qquad (2)$$

where $\otimes$ denotes the tensor product. Equation (2) expands to equation (3):

$$\mathbf{T} = g_x f_x\,\mathbf{e}_x\mathbf{e}_x + g_x f_y\,\mathbf{e}_x\mathbf{e}_y + g_y f_x\,\mathbf{e}_y\mathbf{e}_x + g_y f_y\,\mathbf{e}_y\mathbf{e}_y = \begin{pmatrix} g_x f_x & g_x f_y \\ g_y f_x & g_y f_y \end{pmatrix} \qquad (3)$$

where $\mathbf{e}_x$ is the unit vector of the $x$ (horizontal) axis and $\mathbf{e}_y$ is the unit vector of the $y$ (vertical) axis. As a result, the tensor product of the two vectors is a 2x2 matrix.

The gradient vector (g) and the optical flow vector (f) are thus expanded into a matrix, and each entry of the matrix expresses the correlation between one element of each of the two vectors.
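For a concrete worked example (with illustrative numbers, not taken from the patent): if $\mathbf{g} = (1, 2)^T$ and $\mathbf{f} = (3, 4)^T$, then $\mathbf{g} \otimes \mathbf{f} = \begin{pmatrix} 3 & 4 \\ 6 & 8 \end{pmatrix}$.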

In this patent, tensor divergence is used to convert a feature extended by a tensor product into a low-dimensional vector.

That is, writing the tensor product as the 2x2 tensor $\mathbf{T} = \mathbf{g} \otimes \mathbf{f}$ with components $T_{ij}$, the tensor divergence is defined by equation (4):

$$\operatorname{div}\mathbf{T} = \nabla \cdot \mathbf{T}, \qquad (\operatorname{div}\mathbf{T})_i = \frac{\partial T_{ix}}{\partial x} + \frac{\partial T_{iy}}{\partial y} \qquad (4)$$

Therefore, the gradient-flow tensor divergence (TDGF), which is the feature information generated in step 40, is defined by equation (5):

$$\mathbf{d} = \operatorname{div}(\mathbf{g} \otimes \mathbf{f}) \qquad (5)$$

The TDGF $\mathbf{d}$ is a two-dimensional vector that lowers the dimension of the feature and compactly encodes both the gradient vector (g) information and the optical flow vector (f) information.
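For illustration, equations (2) through (5) can be evaluated densely with numpy. This is a minimal sketch, assuming per-pixel gradient and flow fields and finite-difference derivatives; the function and variable names are illustrative, not taken from the patent. The feature at a key point can then be sampled from the returned field.

```python
import numpy as np

def tdgf(g, f):
    """Gradient-flow tensor divergence (TDGF), Eqs. (2)-(5).

    g, f: arrays of shape (H, W, 2) holding per-pixel gradient and
    optical-flow vectors. Returns an (H, W, 2) array: the divergence
    of the rank-2 tensor field T = g (outer) f, a 2-D vector per pixel.
    """
    # Eq. (2)/(3): per-pixel tensor (outer) product, T[..., i, j] = g_i * f_j.
    T = g[..., :, None] * f[..., None, :]          # shape (H, W, 2, 2)

    # Eq. (4)/(5): (div T)_i = dT_ix/dx + dT_iy/dy. np.gradient's axis 0
    # runs along rows (y) and axis 1 along columns (x).
    dT_dy, dT_dx = np.gradient(T, axis=(0, 1))
    return dT_dx[..., :, 0] + dT_dy[..., :, 1]     # shape (H, W, 2)
```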

The feature information thus extracted can be used directly in the subsequent learning and recognition steps, or converted into a histogram-type feature and applied to conventional SVM or BOF recognition methods.

An example of a key point selection process is described below.

Let the video be denoted $I_t(x, y)$, where $t$ is the image frame index and $(x, y)$ are the pixel coordinates of image $I_t$.

Various methods known in the art may be used to select the keypoints to be traced for the gradient and optical flow for such image frames.

Hereinafter, a method using the eigenvalues of the second-gradient matrix is described as an example of selecting key points.

First, key points are obtained from each input image of the video to be processed by computing the second gradient at every pixel using equation (6) below, calculating the eigenvalues of the second-gradient matrix, and then determining as key points the pixels whose calculated eigenvalues are greater than a set threshold.

$$\mathbf{G}(x, y) = \begin{pmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{pmatrix} \qquad (6)$$

Here $I_x$ and $I_y$ denote the gradients in the $x$-axis and $y$-axis directions, and $I_{xx}$, $I_{xy}$, and $I_{yy}$ are the second gradients. For real digital images these values are obtained by filtering with derivative kernels along the $x$-axis and $y$-axis directions.

Given the matrix $\mathbf{G}$ at each pixel, eigen-decomposition yields two eigenvalues, and the pixels whose two eigenvalues both exceed a specific threshold $\tau$ are defined as key points. The specific threshold $\tau$ is determined empirically. In the above procedure a gradient vector is also obtained, and this gradient vector $\mathbf{g}$ is retained for each key point.
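A minimal sketch of this selection step follows, assuming Sobel derivative kernels (the patent does not fix the kernels) and a closed-form eigenvalue computation for the symmetric 2x2 matrix of equation (6); the function name and threshold handling are illustrative.

```python
import cv2
import numpy as np

def select_key_points(frame_gray, tau):
    """Pick pixels whose second-gradient matrix (Eq. (6)) has both
    eigenvalue magnitudes above the empirical threshold tau."""
    I = frame_gray.astype(np.float32)
    Ix = cv2.Sobel(I, cv2.CV_32F, 1, 0, ksize=3)    # first gradients
    Iy = cv2.Sobel(I, cv2.CV_32F, 0, 1, ksize=3)
    Ixx = cv2.Sobel(Ix, cv2.CV_32F, 1, 0, ksize=3)  # second gradients
    Iyy = cv2.Sobel(Iy, cv2.CV_32F, 0, 1, ksize=3)
    Ixy = cv2.Sobel(Ix, cv2.CV_32F, 0, 1, ksize=3)

    # Closed-form eigenvalues of [[Ixx, Ixy], [Ixy, Iyy]] at every pixel:
    # lambda = tr/2 +/- sqrt(tr^2/4 - det).
    tr = Ixx + Iyy
    det = Ixx * Iyy - Ixy ** 2
    disc = np.sqrt(np.maximum(tr ** 2 / 4.0 - det, 0.0))
    lam_min = np.minimum(np.abs(tr / 2.0 + disc), np.abs(tr / 2.0 - disc))

    ys, xs = np.nonzero(lam_min > tau)
    return np.stack([xs, ys], axis=1)               # (x, y) key points
```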

Next, the key point $P_t = (x_t, y_t)$ at time $t$ in frame $I_t$ is tracked into the next frame $I_{t+1}$, as expressed by equation (7) below.

$$P_{t+1} = (x_{t+1}, y_{t+1}) = (x_t, y_t) + (M * \omega)\big|_{(x_t, y_t)} \qquad (7)$$

Here $P_t$ is the key point position at time $t$, $M$ is a median filter kernel, and $\omega$ is the optical flow field. The optical flow vector is then expressed by equation (8) below.

$$\mathbf{f} = P_{t+1} - P_t \qquad (8)$$

Methods of obtaining optical flow are widely known, and a detailed description thereof is omitted.
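A sketch of this tracking step under stated assumptions: OpenCV's Farneback routine stands in for the unspecified dense-flow method, and a 3x3 median filter stands in for the kernel M of equation (7).

```python
import cv2
import numpy as np
from scipy.ndimage import median_filter

def track_key_points(prev_gray, next_gray, points):
    """Advance integer key-point positions from frame t to t+1 using a
    median-filtered dense flow field, in the spirit of Eqs. (7)-(8)."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    # The median-filter kernel M of Eq. (7), applied per flow component.
    flow[..., 0] = median_filter(flow[..., 0], size=3)
    flow[..., 1] = median_filter(flow[..., 1], size=3)

    f = flow[points[:, 1], points[:, 0]]   # flow vectors f at the key points
    return points + f, f                   # P_{t+1}, and Eq. (8)'s f
```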

The gradient vector (g) and the optical flow vector (f) are defined over both time and space, since they are computed on all frames and at all key points.

In other words, they are written $\mathbf{g}_{t,k}$ and $\mathbf{f}_{t,k}$, where $t$ is the frame index and $k$ is the index of the key point.

Next, the tensor product of the gradient vector and the optical flow vector can be rewritten as Equation (9) below, reflecting the frame index and the keypoint index as described above.

$$\mathbf{T}_{t,k} = \mathbf{g}_{t,k} \otimes \mathbf{f}_{t,k} \qquad (9)$$

In addition, the feature vector to which the tensor divergence is applied, $\mathbf{d}_{t,k}$, is obtained by equation (10):

$$\mathbf{d}_{t,k} = \operatorname{div}(\mathbf{g}_{t,k} \otimes \mathbf{f}_{t,k}) \qquad (10)$$

Once the feature vectors for behavior classification are obtained as described above, behavior recognition can be performed according to general learning and recognition procedures.

In the present invention, to perform behavior recognition from the TDGF feature $\mathbf{d}_{t,k}$, a method using BOF and an SVM (support vector machine) is described with reference to FIGS. 3 to 5.

Referring first to FIG. 3, the method of action classification using the TDGF feature $\mathbf{d}_{t,k}$ is outlined.

As shown in FIG. 3, a descriptor is calculated for each TDGF feature $\mathbf{d}_{t,k}$; call it $c_{t,k}$. Since the TDGF feature is a vector in a two-dimensional space, it is quantized. A vector quantizer can be used for this: $K_1$ representative vectors are obtained through k-means clustering from a pool of TDGF feature vectors; call this codebook $\{\mathbf{u}_1, \ldots, \mathbf{u}_{K_1}\}$. Each feature $\mathbf{d}_{t,k}$ is then assigned to the representative vector having the smallest distance to it, where $j^*$ denotes the index of that nearest representative vector.

The vector coding of $\mathbf{d}_{t,k}$ is thus the index $c_{t,k} = j^*$.
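As a sketch of these codebook and coding steps (assuming scipy's k-means routine; the function names, the '++' initialization, and the fixed seed are illustrative choices, not from the patent):

```python
import numpy as np
from scipy.cluster.vq import kmeans2, vq

def build_codebook(tdgf_pool, k):
    """k-means codebook over a pool of 2-D TDGF vectors."""
    codebook, _ = kmeans2(tdgf_pool.astype(np.float64), k,
                          minit='++', seed=0)
    return codebook

def encode(tdgf_vectors, codebook):
    """Code each TDGF vector by the index of its nearest codeword."""
    codes, _ = vq(tdgf_vectors.astype(np.float64), codebook)
    return codes
```

The same pair of helpers can be reused for the second-stage codebook over the subblock codes described next.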

Next, a space-time block is taken around each key point and, as shown in FIG. 4, divided into subblocks. The codes $c$ are then calculated for each pixel in a subblock and concatenated; call the concatenated code vector $\mathbf{s}$, whose dimension equals the number of pixels in the subblock.

Next, as shown in FIGS. 4 and 5, the subblock codes $\mathbf{s}$ are calculated for all key points and subblocks within a duration of $T$ frames, and a histogram is used to summarize them into a feature vector $\mathbf{h}$. First, $K_2$ representative vectors for $\mathbf{s}$ are obtained from the feature vector pool by k-means clustering; call this codebook $\{\mathbf{v}_1, \ldots, \mathbf{v}_{K_2}\}$. Then, for all key points and subblocks in the input frames, the index of the nearest representative vector is obtained, and from this information a single histogram vector $\mathbf{h}$ is built: for each $\mathbf{s}$, the bin of the nearest codeword $\mathbf{v}_j$ is incremented by 1. A $K_2$-dimensional histogram $\mathbf{h}$ is thus obtained for the $T$ frames. Of course, normalization is performed on the histogram thus obtained. Then, for a single action video clip, the histograms can be collected into one feature vector; call it $\mathbf{z}$. The feature vectors $\mathbf{z}_i$ together with their action labels $y_i$, where the total number of actions is $A$ and the number of sample video clips per action is $N$, form the training set. Finally, classification of $\mathbf{z}$ is performed using a support vector machine (SVM) employing a non-linear kernel.
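The histogram construction and the final non-linear-kernel SVM can be sketched as follows; scikit-learn's SVC with an RBF kernel is one possible choice of non-linear kernel, and the commented training loop is hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

def clip_histogram(codes, k):
    """Normalized bag-of-features histogram over a clip's codeword indices."""
    h = np.bincount(codes, minlength=k).astype(np.float64)
    return h / max(h.sum(), 1.0)

# Hypothetical training loop: one histogram z_i and action label y_i per clip.
# Z = np.stack([clip_histogram(c, K) for c in codes_per_clip])
# clf = SVC(kernel='rbf').fit(Z, y)   # non-linear-kernel SVM, as in the text
```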

The feature information extractor of the present invention, to which this extraction method is applied, is shown in FIG. 6.

As shown in FIG. 6, the feature information extractor 100 includes an input unit 110, a keypoint extraction unit 120, a feature information extracting unit 130, and a classifier 140.

The input unit 110 receives the image data to be processed.

The keypoint extraction unit 120 extracts keypoints from the image data input to the input unit 110 in the manner described above.

The feature information extracting unit 130 obtains the tensor product of the gradient vector (g) and the optical flow vector (f) at each key point extracted by the keypoint extraction unit 120, calculates the tensor divergence of the tensor product so as to lower its dimension, determines the calculated tensor divergence as the feature vector for motion classification, and provides it to the classifier 140.

The classifier 140 classifies the motion of the object from the feature vector information received from the feature information extractor 130.
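Tying FIG. 6 together, a hypothetical end-to-end driver might look as follows, reusing the illustrative helpers sketched earlier (tdgf, select_key_points, encode, clip_histogram); it is a sketch under the same assumptions, not the patent's implementation.

```python
import cv2
import numpy as np

def classify_clip(frames, tau, codebook, clf):
    """End-to-end sketch of FIGS. 1 and 6: key points -> TDGF features
    -> codeword histogram -> SVM, reusing the helpers sketched above."""
    all_codes = []
    for prev, nxt in zip(frames[:-1], frames[1:]):
        I = prev.astype(np.float32)
        g = np.stack([cv2.Sobel(I, cv2.CV_32F, 1, 0, ksize=3),   # g_x
                      cv2.Sobel(I, cv2.CV_32F, 0, 1, ksize=3)],  # g_y
                     axis=-1)
        f = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                         0.5, 3, 15, 3, 5, 1.2, 0)
        d = tdgf(g, f)                         # (H, W, 2) TDGF field
        pts = select_key_points(prev, tau)     # key points of frame t
        if len(pts):
            all_codes.append(encode(d[pts[:, 1], pts[:, 0]], codebook))
    codes = np.concatenate(all_codes) if all_codes else np.empty(0, int)
    h = clip_histogram(codes, len(codebook))   # normalized BOF histogram
    return clf.predict(h[None, :])[0]
```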

To verify this extraction method, recognition experiments were performed on the KTH database, comparing HOG, HOF, MBHx, MBHy, and the proposed TDGF feature. As shown in Table 1 below, the best performance was obtained with the feature of the proposed technique.

Table 1. Recognition accuracy (%) on the KTH dataset

     HOG     HOF     MBHx    MBHy    TDGF (proposed)
KTH  93.98   94.44   95.83   94.44   96.30

For reference, sample images of the KTH data set used in the recognition experiments are shown in the accompanying figure.

As described above, according to the feature information extraction method and feature extractor for video object behavior classification of the present invention, integrating the HOG and HOF features into a single feature reflects changes in both space and time, so the performance of in-video motion classification can be improved without increasing the amount of calculation of the recognizer or classifier.

110: Input unit 120: Keypoint extraction unit
130: Feature information extracting unit 140: Classifier

Claims (4)

A method for extracting feature information for object behavior classification from video, comprising the steps of:
(a) calculating a gradient vector and an optical flow vector at selected key points in the video frames of a video to be processed;
(b) obtaining a tensor product of the gradient vector and the optical flow vector; and
(c) calculating a tensor divergence of the tensor product so as to reduce the dimension of the tensor product, and determining the calculated tensor divergence as a feature vector for motion classification.
The method of claim 1, wherein each key point is obtained by computing a second gradient for each pixel of an input image of the video to be processed, calculating the eigenvalues of the second gradient, and determining as key points the pixels whose calculated eigenvalues are greater than a set threshold.
The method of claim 1, wherein the feature vector data for motion classification is provided to a BoF or SVM classifier, which classifies the behavior.
A feature information extractor for video object behavior classification, comprising:
an input unit for receiving image data;
a key point extracting unit for extracting key points from the image data input to the input unit;
a feature information extracting unit for obtaining a tensor product of the gradient vector and the optical flow vector at each key point extracted by the key point extracting unit, calculating a tensor divergence of the tensor product so as to lower its dimension, and determining the calculated tensor divergence as a feature vector for motion classification; and
a classifier for classifying the behavior of the object from the feature vectors provided by the feature information extracting unit.
KR1020150055044A 2015-04-20 2015-04-20 Tensor Divergence Feature Extraction System based on HoG and HOF for video object action classification KR101713189B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150055044A KR101713189B1 (en) 2015-04-20 2015-04-20 Tensor Divergence Feature Extraction System based on HoG and HOF for video object action classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020150055044A KR101713189B1 (en) 2015-04-20 2015-04-20 Tensor Divergence Feature Extraction System based on HoG and HOF for video object action classification

Publications (2)

Publication Number Publication Date
KR20160124948A true KR20160124948A (en) 2016-10-31
KR101713189B1 KR101713189B1 (en) 2017-03-08

Family

ID=57445804

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150055044A KR101713189B1 (en) 2015-04-20 2015-04-20 Tensor Divergence Feature Extraction System based on HoG and HOF for video object action classification

Country Status (1)

Country Link
KR (1) KR101713189B1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102054211B1 (en) 2017-12-11 2019-12-10 경희대학교 산학협력단 Method and system for video retrieval based on image queries


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008310796A (en) * 2007-06-15 2008-12-25 Mitsubishi Electric Research Laboratories Inc Computer implemented method for constructing classifier from training data detecting moving object in test data using classifier
KR20100077307A (en) * 2008-12-29 2010-07-08 포항공과대학교 산학협력단 Image texture filtering method, storage medium of storing program for executing the same and apparatus performing the same
JP2014072620A (en) * 2012-09-28 2014-04-21 Nikon Corp Image processing program, image processing method, image processing apparatus, and imaging apparatus

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110852331A (en) * 2019-10-25 2020-02-28 中电科大数据研究院有限公司 Image description generation method combined with BERT model
CN110852331B (en) * 2019-10-25 2023-09-08 中电科大数据研究院有限公司 Image description generation method combined with BERT model
CN112101091A (en) * 2020-07-30 2020-12-18 咪咕文化科技有限公司 Video classification method, electronic device and storage medium
CN112101091B (en) * 2020-07-30 2024-05-07 咪咕文化科技有限公司 Video classification method, electronic device and storage medium
WO2022227292A1 (en) * 2021-04-29 2022-11-03 苏州大学 Action recognition method
CN117671357A (en) * 2023-12-01 2024-03-08 广东技术师范大学 Pyramid algorithm-based prostate cancer ultrasonic video classification method and system

Also Published As

Publication number Publication date
KR101713189B1 (en) 2017-03-08

Similar Documents

Publication Publication Date Title
CN106897675B (en) Face living body detection method combining binocular vision depth characteristic and apparent characteristic
KR101713189B1 (en) Tensor Divergence Feature Extraction System based on HoG and HOF for video object action classification
Karaman et al. Fast saliency based pooling of fisher encoded dense trajectories
CN107563345B (en) Human body behavior analysis method based on space-time significance region detection
Sun et al. Combining feature-level and decision-level fusion in a hierarchical classifier for emotion recognition in the wild
CN108062543A (en) A kind of face recognition method and device
CN106980825B (en) Human face posture classification method based on normalized pixel difference features
CN104036296B (en) A kind of expression of image and processing method and processing device
CN111178195A (en) Facial expression recognition method and device and computer readable storage medium
JP2010108494A (en) Method and system for determining characteristic of face within image
CN104794446B (en) Human motion recognition method and system based on synthesis description
Xue et al. Automatic 4D facial expression recognition using DCT features
HN et al. Human Facial Expression Recognition from static images using shape and appearance feature
De Souza et al. Detection of violent events in video sequences based on census transform histogram
Sarhan et al. HLR-net: a hybrid lip-reading model based on deep convolutional neural networks
CN108062559A (en) A kind of image classification method based on multiple receptive field, system and device
CN111428590A (en) Video clustering segmentation method and system
CN104504161A (en) Image retrieval method based on robot vision platform
Jiang et al. An isolated sign language recognition system using RGB-D sensor with sparse coding
KR20210011707A (en) A CNN-based Scene classifier with attention model for scene recognition in video
Lan et al. The best of both worlds: Combining data-independent and data-driven approaches for action recognition
Zhang et al. Recognizing human action and identity based on affine-SIFT
Li et al. Facial expression recognition using facial-component-based bag of words and PHOG descriptors
Shukla et al. Deep Learning Model to Identify Hide Images using CNN Algorithm
Al-agha et al. Geometric-based feature extraction and classification for emotion expressions of 3D video film

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right