KR20150025508A

KR20150025508A - Multi-view object detection method using shared local features

Info

Publication number: KR20150025508A
Application number: KR20130103540A
Authority: KR
Inventors: 고병철; 정지훈; 남재열; 주영도; 강정석
Original assignee: 계명대학교 산학협력단
Priority date: 2013-08-30
Filing date: 2013-08-30
Publication date: 2015-03-10
Also published as: KR101584091B1

Abstract

The present invention relates to a multi-view object detection method using shared local features. More specifically, the method includes the steps of: randomly generating N regions with random positions and sizes in multi-view training images containing an object; extracting orientation center symmetric local binary pattern (OCS-LBP) features from each region; learning the feature vectors extracted from each of the N regions on a per-region basis through a random forest; and selecting the regions of local features.

Description

MULTI-VIEW OBJECT DETECTION METHOD USING SHARED LOCAL FEATURES

The present invention relates to a multi-view object detection method using shared local features. More specifically, it relates to a method that uses local OCS-LBP (Orientation Center Symmetric Local Binary Patterns) features and a shared-region-based random forest classifier to enable fast and accurate detection of objects seen from various viewpoints in a still image.

In general, object detection is one of the important research topics in computer vision, with applications such as surveillance systems, behavior recognition, and augmented reality. Detecting objects in general images is difficult because of variations in object size, complex backgrounds, occlusion, and the like, and active research on object detection is under way to solve these problems.

Prior art for object detection includes Prior Art 1 (Korean Patent Laid-Open Publication No. 10-2011-0131727, titled "Object Recognition Method in Image Processing System") and Prior Art 2 (C. H. Chang, C. C. Wang, and J. J. Lien, "Multi-view vehicle detection using gentle boost with sharing HOG feature," in Proceedings of 22nd IPPR Conference on Computer Vision, Graphics and Image Processing (IPPR, 2009), pp. 1688-1694). Prior Art 1 discloses a method of recognizing an object in a still image using a global-feature-based method and a local-feature-based method, with features extracted by a Sobel filter and the GIST algorithm. Prior Art 2 discloses an algorithm that generates patches for an image containing an object, extracts HOG (histogram of oriented gradients) features from each patch, and then performs learning and object detection using Gentle Boost. Because the algorithm of Prior Art 2 uses HOG for feature extraction, and HOG has a large number of feature dimensions, it suffers from very slow computation.

The present invention has been proposed to solve the above problems of previously proposed methods. Its object is to provide a multi-view object detection method using shared local features that, by extracting local OCS-LBP features and learning and classifying with a shared-region-based random forest classifier, enables fast and accurate detection of objects seen from various viewpoints in a still image.

Another object of the present invention is to provide a multi-view object detection method using shared local features that achieves very fast performance by using OCS-LBP feature extraction, which carries orientation information in a small number of feature dimensions. Rather than using every generated region for object classification, the method learns and classifies using only the optimal shared local features selected with a random forest classifier, so that fast training and detection speeds are obtained for objects with various viewpoints even with little training data.

Still another object of the present invention is to provide a multi-view object detection method using shared local features that can detect multi-view objects even in complex backgrounds with partial occlusion. In particular, the shared local features are selected jointly across classes rather than independently for each class, which minimizes the number of classifiers and reduces computational complexity.

To achieve the above objects, a multi-view object detection method using shared local features according to a feature of the present invention comprises:

(1) generating candidate regions (random N regions) having arbitrary positions and sizes in multi-view training images containing an object;

(2) extracting OCS-LBP (Orientation Center Symmetric Local Binary Patterns) features for each candidate region;

(3) learning the feature vectors extracted for each candidate region on a per-region basis through a random forest;

(4) applying test images to the learned random forest to estimate probability values, and then selecting the regions of local features having the highest values;

(5) finding and selecting the regions shared among the local-feature regions selected for each view, and training a random forest on the shared local-feature regions;

(6) after training, when a test image is input for object detection, estimating a probability value for each view and generating a probability histogram;

(7) finding the maximum among the probability values in the generated probability histogram, weighting the five probability values surrounding the maximum, and linearly combining them to estimate a final probability value; and

(8) detecting an object when the estimated final probability value is greater than a predetermined threshold.

Preferably, the candidate regions in step (1) may consist of N training samples collected from window regions that are cropped from the original training images so as to contain the object and resized to 80×40.

More preferably, the candidate regions may be local features generated at random within the 80×40 window region, with sizes ranging from a minimum of 10×10 to a maximum of 40×20 (half the size of the window region).

To obtain the local features of rotated objects needed for multi-view object detection, 12 training sets may be defined at 30° intervals over the range 0° to 360°.

More preferably, each training set may be divided equally into a learning group, used to select the specific positions and sizes of local features, and a test group.

Preferably, the negative set may be extracted from background regions.

Preferably, in step (2), each candidate region may be divided into four (2×2) sub-regions, and an 8-dimensional OCS-LBP feature may be extracted from each sub-region.

The OCS-LBP features extracted from the sub-regions may be concatenated to yield a 32-dimensional (8×4) feature vector per candidate region.

Preferably, the random forest in step (3) may be configured as an ensemble classifier based on decision trees.

More preferably, by using decision trees, the random forest may train quickly and handle large amounts of data.

More preferably, each tree of the random forest may be built as a binary tree in a top-down manner.

Preferably, the random forest in step (5) may be trained as a shared classifier using training data collected from each view class that shares the same local feature, thereby producing a multi-class classifier for the shared local features.

More preferably, in step (6), the probability histogram may be generated from the histogram probability distributions of the multi-class classifiers shared among the shared classifiers.

Even more preferably, the maximum in step (7) may be obtained with a max operation that finds the principal view in the generated probability histogram.

According to the multi-view object detection method using shared local features proposed in the present invention, extracting local OCS-LBP features and learning and classifying with a shared-region-based random forest classifier enables fast and accurate detection of objects seen from various viewpoints in a still image.

In addition, according to the present invention, OCS-LBP feature extraction carries orientation information in a small number of feature dimensions and therefore yields very fast performance. Rather than using every generated region for object classification, the method learns and classifies using only the optimal shared local features selected with a random forest classifier, so that fast training and detection speeds are obtained for objects with various viewpoints even with little training data.

Furthermore, according to the present invention, multi-view objects can be detected even in complex backgrounds with partial occlusion by using shared local features. In particular, because the shared local features are selected jointly across classes rather than independently for each class, the number of classifiers is minimized and computational complexity is reduced.

FIG. 1 is a flowchart of a multi-view object detection method using shared local features according to an embodiment of the present invention.
FIG. 2 is a block diagram of the multi-view object detection method using shared local features according to an embodiment of the present invention.
FIG. 3 is a block diagram showing how each candidate region is divided into four sub-regions and OCS-LBP features are extracted, in the method according to an embodiment of the present invention.
FIG. 4 is a block diagram of the process used to select shared local features from 12 view classes in a still image, in the method according to an embodiment of the present invention.
FIG. 5 shows examples of shared local features and their positions for four viewpoints, obtained with the method according to an embodiment of the present invention.
FIG. 6 shows an example of the algorithm applied to the method according to an embodiment of the present invention.
FIGS. 7 to 13 show experimental results of the method according to an embodiment of the present invention.

Hereinafter, preferred embodiments are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice it. In describing the preferred embodiments, detailed descriptions of related well-known functions or configurations are omitted where they could unnecessarily obscure the gist of the present invention. The same reference numerals are used throughout the drawings for parts having similar functions and operations.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where the parts are "directly connected" but also the case where they are "indirectly connected" with another element in between. Likewise, saying that a part "includes" a component means that it may further include other components, not that other components are excluded, unless specifically stated otherwise.

FIG. 1 is a flowchart of a multi-view object detection method using shared local features according to an embodiment of the present invention, and FIG. 2 is a block diagram of the same method. As shown in FIG. 1, the method may be implemented to include: generating candidate regions having arbitrary positions and sizes in training images (S110); extracting OCS-LBP features for each candidate region (S120); learning the feature vectors extracted for each candidate region on a per-region basis through a random forest (S130); applying test images to the learned random forest to estimate probability values and then selecting the regions of local features having the highest values (S140); finding the regions shared across views and training a random forest on the shared local-feature regions (S150); when a test image is input, estimating a probability value for each view and generating a probability histogram (S160); weighting the five probability values surrounding the class with the maximum in the generated probability histogram and linearly combining them to estimate a final probability value (S170); and detecting an object when the estimated final probability value is greater than a preset threshold (S180).

In step S110, candidate regions (random N regions) having arbitrary positions and sizes are generated in multi-view training images containing an object. The candidate regions consist of N training samples collected from window regions that are cropped from the original training images so as to contain the object and resized to 80×40. Within the 80×40 window region, the candidate regions are local features generated at random, with sizes ranging from a minimum of 10×10 to a maximum of 40×20 (half the size of the window region). As shown in FIG. 4, 12 training sets are defined at 30° intervals over the range 0° to 360° to obtain the local features of rotated objects needed for multi-view object detection, because the local features of a single view can detect objects rotated within -15° to +15°. Each training set is divided equally into a learning group, used to select the specific positions and sizes of local features, and a test group. The negative set may be extracted from background regions.
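The region sampling and view binning of step S110 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the uniform sampling, and the handling of the ±15° bin boundaries are assumptions, and the two window axes are kept in the order given in the text (80, then 40) without deciding which is width and which is height.

```python
import random

WINDOW = (80, 40)    # normalized object window, as given in the text
MIN_SIZE = (10, 10)  # smallest candidate region
MAX_SIZE = (40, 20)  # largest candidate region


def random_regions(n, seed=None):
    """Return n regions as (offset0, offset1, size0, size1) that fit inside WINDOW."""
    rng = random.Random(seed)
    regions = []
    for _ in range(n):
        # Assumption: sizes and offsets are drawn uniformly and independently.
        size0 = rng.randint(MIN_SIZE[0], MAX_SIZE[0])
        size1 = rng.randint(MIN_SIZE[1], MAX_SIZE[1])
        off0 = rng.randint(0, WINDOW[0] - size0)
        off1 = rng.randint(0, WINDOW[1] - size1)
        regions.append((off0, off1, size0, size1))
    return regions


def view_class(angle_deg):
    """Map a rotation angle to one of 12 view classes centred on 0, 30, ..., 330 degrees."""
    # Each class covers its centre angle -15 (inclusive) to +15 (exclusive) degrees.
    return int(((angle_deg % 360) + 15) // 30) % 12
```

For example, `view_class(-10)` and `view_class(14)` both fall in class 0, while `view_class(15)` falls in class 1.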

In step S120, OCS-LBP (Orientation Center Symmetric Local Binary Patterns) features are extracted for each candidate region generated in step S110. As shown in FIG. 3, each candidate region is divided into four (2×2) sub-regions, and an 8-dimensional OCS-LBP feature is extracted from each sub-region. The OCS-LBP features extracted from the sub-regions are then concatenated to yield a 32-dimensional (8×4) feature vector per candidate region. In the present invention, the OCS-LBP features are computed separately for each region, and an integral histogram is used to reduce computation time while preserving local variation.

FIG. 3 is a block diagram showing how each candidate region is divided into four sub-regions and OCS-LBP features are extracted, in the multi-view object detection method using shared local features according to an embodiment of the present invention. FIG. 3(a) shows the division of a candidate region into four sub-regions, FIG. 3(b) shows the extraction of the 8-dimensional OCS-LBP feature from each sub-region, and FIG. 3(c) shows OCS-LBP features with gradient orientation and magnitude.

OCS-LBP features are generated according to Equation 1.

Here, n_i and n_{i+(N/2)} are the intensity values of a center-symmetric pair of pixels among N equally spaced pixels on a circle of radius R, and K is the bin number of the gradient orientation.
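Equation 1 is not reproduced in this text, so the following is only one plausible reading of the definitions above, not the patent's formula. It assumes N = 8 neighbours at radius R = 1: each of the four center-symmetric pairs contributes its absolute intensity difference to one of two opposite orientation bins, chosen by the sign of the difference, giving 8 bins. The threshold parameter and both function names are hypothetical. The second function concatenates the four sub-region histograms into the 32-dimensional vector described for step S120.

```python
# Offsets (row, col) of neighbours 0..3; neighbour i + 4 is the mirrored offset.
HALF_OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def ocs_lbp_histogram(patch, threshold=0):
    """8-bin orientation histogram over a 2-D list of pixel intensities."""
    hist = [0.0] * 8
    rows, cols = len(patch), len(patch[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            for i, (dr, dc) in enumerate(HALF_OFFSETS):
                a = patch[r + dr][c + dc]  # n_i
                b = patch[r - dr][c - dc]  # n_{i + N/2}
                diff = a - b
                if abs(diff) > threshold:
                    # The sign selects one of the two opposite orientations.
                    k = i if diff > 0 else i + 4
                    hist[k] += abs(diff)
    return hist


def region_descriptor(image, region, threshold=0):
    """Concatenate the histograms of the 2x2 sub-regions into a 32-D vector."""
    top, left, height, width = region
    half_h, half_w = height // 2, width // 2  # odd sizes drop the last row/column
    feature = []
    for sr in (0, 1):
        for sc in (0, 1):
            r0 = top + sr * half_h
            c0 = left + sc * half_w
            sub = [row[c0:c0 + half_w] for row in image[r0:r0 + half_h]]
            feature.extend(ocs_lbp_histogram(sub, threshold))
    return feature
```

This direct version recomputes every histogram from pixels. The integral histogram mentioned for step S120 would replace the per-region loops with lookups into precomputed cumulative tables.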

In step S130, the feature vectors extracted for each candidate region are learned on a per-region basis through a random forest. As shown in FIGS. 2 and 4, the random forest in step S130 may be configured as an ensemble classifier based on decision trees, and each tree of the random forest is built as a binary tree in a top-down manner. By using decision trees, the random forest trains quickly and can handle large amounts of data. In particular, the random forest in the present invention is used to select shared local features based on the importance of each region. That is, to avoid heuristic selection of regions, the present invention generates a random set of rectangular local features within the normalized object window and uses the random forest as a shared classifier to determine the meaningful sizes and positions of local features.

In step S140, test images are applied to the learned random forest to estimate probability values, and the regions of local features having the highest values are then selected.

In step S150, the regions shared among the local-feature regions selected for each view are found and selected, and a random forest is trained on the shared local-feature regions. The random forest in step S150 is trained as a shared classifier using training data collected from each view class that shares the same local feature, producing a multi-class classifier for the shared local features. Some features are selected multiple times, from different views. FIG. 4 is a block diagram of the process used to select shared local features from 12 view classes in a still image.
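The selection in steps S140 and S150 can be sketched as a counting problem. The text does not state the exact sharing criterion, so this example assumes a region counts as shared when it ranks among the top `k` regions for at least `min_views` view classes. Both parameters and both function names are hypothetical.

```python
from collections import Counter


def top_regions(probabilities, k):
    """Indices of the k regions with the highest estimated probability."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    return order[:k]


def shared_regions(per_view_probabilities, k, min_views=2):
    """Regions that rank in the top k for at least min_views view classes."""
    counts = Counter()
    for probs in per_view_probabilities:
        counts.update(top_regions(probs, k))
    return sorted(region for region, n in counts.items() if n >= min_views)
```

Each inner list of `per_view_probabilities` holds one probability per candidate region for a single view class, so `shared_regions` returns the region indices on which the shared classifiers would then be trained.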

In step S160, after training, when a test image is input for object detection, a probability value is estimated for each view and a probability histogram is generated. The probability histogram is generated from the histogram probability distributions of the multi-class classifiers shared among the shared classifiers.

In step S170, the maximum among the probability values in the generated probability histogram is found, the five probability values surrounding the class with the maximum are weighted, and the weighted values are linearly combined to estimate the final probability value. Specifically, for each of the largest and second-largest values, the five surrounding probability values are weighted and linearly combined, and the larger of the two results is used as the final probability value. The maximum in step S170 is obtained with a max operation that finds the principal view in the generated probability histogram.
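Steps S170 and S180 can be sketched as below. Several details are assumptions because the text does not specify them: the "five surrounding values" are read as a window of five bins centred on the peak, the window wraps around because the 12 views cover a full circle, and the weights and threshold are placeholder values.

```python
def weighted_peak(hist, peak, weights):
    """Weighted sum of the 5 bins centred on `peak`, wrapping around the circle."""
    n = len(hist)
    return sum(w * hist[(peak + offset) % n]
               for offset, w in zip(range(-2, 3), weights))


def final_probability(hist, weights=(0.1, 0.2, 0.4, 0.2, 0.1)):
    """Larger of the weighted sums around the two highest bins."""
    ranked = sorted(range(len(hist)), key=lambda i: hist[i], reverse=True)
    return max(weighted_peak(hist, peak, weights) for peak in ranked[:2])


def is_object(hist, threshold=0.5):
    """Step S180: accept the window when the final probability exceeds the threshold."""
    return final_probability(hist) > threshold
```

Evaluating both the largest and the second-largest bin guards against a single isolated spike beating a slightly lower peak that is supported by its neighbouring views.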

In step S180, an object is detected when the estimated final probability value is greater than a predetermined threshold. FIG. 5 shows examples of shared local features and their positions for four viewpoints, obtained with the multi-view object detection method using shared local features according to an embodiment of the present invention. FIG. 5(a) shows an airplane, FIG. 5(b) a car, and FIG. 5(c) a motorcycle. That is, to produce a multi-class classifier for the shared local features, the present invention uses a random forest to train the shared classifiers on training data collected from each view class that shares the same local feature.

FIG. 6 shows an example of the algorithm applied to the multi-view detection method using shared local features according to an embodiment of the present invention, and FIGS. 7 to 13 show experimental results obtained with the method. The algorithm of FIG. 6 was used for the method, and the results were verified through the experiments of FIGS. 7 to 13. To evaluate multi-view detection performance, 1200 object samples covering multiple views were used for training, and 700 negative samples were randomly selected from the background. The PASCAL VOC 2012 dataset consists of 6475 images containing a total of 9415 objects at different scales and viewpoints. The test set contains objects at different scales and viewpoints, as shown in the table of FIG. 8 and in FIG. 13.

FIG. 7 shows the results of an experiment that used six values to determine the appropriate number of shared local features in the multi-view detection method according to an embodiment of the present invention. That is, it shows precision and recall against processing time for six threshold values. To determine the appropriate number of shared local features, four object categories with relatively little distortion and regular shape patterns (airplane, bicycle, car, and motorcycle) were selected from the test categories.

Equations 2 and 3 give the formulas for precision and recall.
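The equations themselves are not reproduced in this text. Assuming they follow the standard definitions, with TP the number of correctly detected objects, FP the number of false detections, and FN the number of missed objects, precision and recall are:

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}
```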

FIG. 8 compares the average accuracy of the multi-view detection method using shared local features according to an embodiment of the present invention with that of four conventional methods that do not consider viewpoint classification. Performance was evaluated by the average precision of each class over all classes, using the same PASCAL VOC 2012 challenge dataset. The present invention shows better object detection performance than the four conventional algorithms: its precision was 10.3% higher than Algorithm 1, 8.7% higher than Algorithm 2, 4.8% higher than Algorithm 3, and 12.6% higher than Algorithm 4.

FIG. 9 shows the precision by viewpoint class on the PASCAL VOC 2012 dataset for the multi-view detection method using shared local features according to an embodiment of the present invention, and FIG. 10 shows the recall by viewpoint class on the same dataset. FIG. 11 compares recall results for 10 object categories on the PASCAL VOC 2012 dataset without considering viewpoint classification.

FIG. 12 shows object detection results on sample images from PASCAL VOC 2012, obtained with the multi-view detection method using shared local features according to an embodiment of the present invention, where (a) is an airplane, (b) a bicycle, (c) a boat, (d) a bus, (e) a car, (f) a dog, (g) a horse, (h) a motorcycle, (i) a sofa, and (j) a train. FIG. 13 shows sample detection results on various object samples in which severe object overlap occurs because of front or rear viewpoints or the presence of similar structures in the background. Blue rectangles mark false detections, and red rectangles mark correct detections.

The multi-view object detection method using shared local features according to an embodiment of the present invention can be implemented as computer-readable code on a computer-readable recording medium. A computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, as well as media implemented in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium may also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

The present invention described above may be modified or applied in various ways by a person of ordinary skill in the art to which it pertains, and the scope of its technical idea should be determined by the claims below.

S110: Generating candidate regions with random positions and sizes in the training image
S120: Extracting OCS-LBP features from each candidate region
S130: Training a random forest on a region basis with the feature vectors extracted from each candidate region
S140: Applying a test image to the trained random forest to estimate probability values, then selecting the local-feature regions with the highest values
S150: Finding the regions shared across viewpoints and training a random forest on the shared local-feature regions
S160: When a test image is input, estimating a probability value for each viewpoint and generating a probability value histogram
S170: Weighting the five probability values surrounding the maximum of the generated probability value histogram and linearly combining them to estimate the final probability value
S180: Detecting an object when the estimated final probability value is greater than a predetermined threshold
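Steps S160 to S180 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the weight values, the threshold, and the choice to treat the viewpoint histogram as circular (so that the bins next to 0° wrap around to 330°) are all assumptions. It also assumes that the "five surrounding values" are the maximum bin together with its two neighbours on each side.

```python
from typing import Sequence

# Hypothetical weights for offsets -2..+2 around the peak (not from the source).
DEFAULT_WEIGHTS = (0.1, 0.2, 0.4, 0.2, 0.1)


def final_probability(hist: Sequence[float],
                      weights: Sequence[float] = DEFAULT_WEIGHTS) -> float:
    """Linearly combine the five histogram bins centred on the maximum bin."""
    if len(weights) != 5:
        raise ValueError("exactly five weights are expected")
    n = len(hist)
    # Max operation: index of the viewpoint with the highest probability.
    peak = max(range(n), key=lambda i: hist[i])
    # Assumed circular neighbourhood: indices wrap around the view classes.
    return sum(w * hist[(peak + off) % n]
               for w, off in zip(weights, range(-2, 3)))


def is_object(hist: Sequence[float], threshold: float = 0.5,
              weights: Sequence[float] = DEFAULT_WEIGHTS) -> bool:
    """Report a detection when the final probability exceeds the threshold."""
    return final_probability(hist, weights) > threshold


# Example: 12 viewpoint bins with a peak at the first view class.
example_hist = [0.0] * 12
example_hist[0], example_hist[1], example_hist[11] = 1.0, 0.5, 0.5
example_score = final_probability(example_hist)
```

Combining the neighbouring bins, rather than using the maximum alone, lets adjacent viewpoints that also respond to the object contribute to the final score.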

Claims

1. A multi-view object detection method using shared local features, the method comprising:
(1) generating N candidate regions having random positions and sizes in a multi-view training image that includes an object;
(2) extracting OCS-LBP (Orientation Center Symmetric Local Binary Patterns) features from each candidate region;
(3) training a random forest on a region basis with the feature vectors extracted from each candidate region;
(4) applying a test image to the trained random forest to estimate probability values, and then selecting the regions of local features having the highest values;
(5) for each viewpoint, finding and selecting the regions shared among the selected local-feature regions, and training a random forest on the shared local-feature regions;
(6) after the training process, when a test image is input for object detection, estimating a probability value for each viewpoint and generating a probability value histogram;
(7) finding the maximum among the probability values in the generated probability value histogram, weighting the five probability values surrounding the maximum, and linearly combining them to estimate a final probability value; and
(8) detecting an object when the estimated final probability value is greater than a predetermined threshold.

2. The method of claim 1, wherein the candidate regions in step (1) are generated from N training samples collected from window regions that contain an object in the original training image and are resized to 80 × 40.

3. The method of claim 2, wherein local features are randomly generated within the 80 × 40 window region with sizes ranging from a minimum of 10 × 10 to a maximum of 40 × 20 (half of the window region).
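The random region generation described in claims 1 to 3 can be illustrated with a short sketch. The function name, the (x, y, width, height) tuple layout, and the reading of 80 × 40 as width × height are assumptions made for illustration only.

```python
import random
from typing import List, Optional, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height), an assumed layout


def random_regions(n: int,
                   rng: Optional[random.Random] = None,
                   window: Tuple[int, int] = (80, 40),
                   min_size: Tuple[int, int] = (10, 10),
                   max_size: Tuple[int, int] = (40, 20)) -> List[Region]:
    """Draw n regions of random size and position that fit inside the window."""
    rng = rng or random.Random()
    regions: List[Region] = []
    for _ in range(n):
        # Size is drawn between the minimum (10 x 10) and maximum (40 x 20).
        w = rng.randint(min_size[0], max_size[0])
        h = rng.randint(min_size[1], max_size[1])
        # Position is drawn so that the whole region stays inside the window.
        x = rng.randint(0, window[0] - w)
        y = rng.randint(0, window[1] - h)
        regions.append((x, y, w, h))
    return regions


sample_regions = random_regions(100, random.Random(0))
```

Each of the N regions would then be described by its own feature vector, as in step (2).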

4. The method of claim 1, wherein the candidate regions in step (1) are generated for 12 training sets covering the range 0° to 360° at 30° intervals, so as to obtain the local features of a rotated object that are needed for multi-view object detection.

5. The method of claim 4, wherein each training set is divided into a learning group, used to select the specific positions and sizes of the local features, and a test group.

6. The method of claim 1, wherein a negative set is extracted from background regions.

7. The method of claim 1, wherein step (2) divides each candidate region into 2 × 2 sub-regions and extracts an 8-dimensional OCS-LBP feature from each sub-region.

8. The method of claim 7, wherein a 32 (8 × 4)-dimensional feature vector is extracted for each candidate region by concatenating the OCS-LBP features extracted from the sub-regions.
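The layout in claims 7 and 8 (2 × 2 sub-regions, 8 dimensions each, concatenated into 32 dimensions) can be sketched as below. The per-pixel descriptor used here is a simplified center-symmetric stand-in and should not be read as the patent's exact OCS-LBP definition. The function names, the difference threshold, and the normalization are hypothetical.

```python
from typing import List, Sequence

Image = Sequence[Sequence[int]]  # grayscale image indexed as img[row][col]

# (dy, dx) offsets of 4 neighbours; each is paired with its opposite neighbour.
HALF_NEIGHBOURS = [(0, 1), (1, 1), (1, 0), (1, -1)]


def sub_region_hist(img: Image, top: int, left: int, height: int, width: int,
                    threshold: int = 6) -> List[float]:
    """8-bin histogram of center-symmetric differences (assumed descriptor)."""
    hist = [0.0] * 8
    for y in range(max(top, 1), min(top + height, len(img) - 1)):
        for x in range(max(left, 1), min(left + width, len(img[0]) - 1)):
            for k, (dy, dx) in enumerate(HALF_NEIGHBOURS):
                diff = img[y + dy][x + dx] - img[y - dy][x - dx]
                if abs(diff) > threshold:
                    # The sign of the difference selects bin k or its opposite.
                    hist[k if diff > 0 else k + 4] += abs(diff)
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def region_feature(img: Image, top: int, left: int, height: int, width: int,
                   threshold: int = 6) -> List[float]:
    """Concatenate the 8-bin histograms of the 2 x 2 sub-regions (32 values)."""
    half_h, half_w = height // 2, width // 2
    feature: List[float] = []
    for r in range(2):
        for c in range(2):
            feature += sub_region_hist(img, top + r * half_h,
                                       left + c * half_w,
                                       half_h, half_w, threshold)
    return feature


# Example: a 20 x 40 image whose intensity increases from left to right.
gradient = [[5 * x for x in range(40)] for _ in range(20)]
example_feature = region_feature(gradient, 0, 0, 20, 40)
```

Splitting the region into sub-regions keeps some spatial layout in the descriptor, which a single 8-bin histogram over the whole region would lose.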

9. The method of claim 1, wherein the random forest in step (3) is an ensemble classifier based on decision trees.

10. The method of claim 9, wherein the decision trees provide a high training speed and can process large amounts of data.

11. The method of claim 9, wherein each tree has a binary structure built in a top-down manner.

12. The method of claim 1, wherein the random forest in step (5) is trained as a shared classifier on the training data collected from each view class sharing the same local feature, thereby generating a multi-class classifier for the shared local feature.

13. The method of claim 12, wherein step (6) generates the probability value histogram from the probability distributions of the multi-class classifiers of the respective shared classifiers.

14. The method of claim 13, wherein the maximum value in step (7) is found using a maximum function (max operation) that finds the most basic point in the generated probability value histogram.