KR20180024232A

KR20180024232A - Method of detecting object in image using selective sliding window and apparatus thereof

Info

Publication number: KR20180024232A
Application number: KR1020160110039A
Authority: KR
Inventors: 임영철; 강민성
Original assignee: 재단법인대구경북과학기술원
Priority date: 2016-08-29
Filing date: 2016-08-29
Publication date: 2018-03-08

Abstract

The present invention relates to a method and an apparatus for detecting an object in an image by using a selective sliding window. The method for detecting an object according to the present invention comprises: receiving a captured image from a camera; sliding a unit window over the captured image by an integer multiple of a preset number of movement pixels and calculating first classification result values for a plurality of first image areas corresponding to the slid unit window; selecting a first image area whose first classification result value is larger than a reference object detection probability value; selecting, as a second image area, an image area corresponding to the unit window slid by the preset number of movement pixels in at least one of the upward, downward, leftward, and rightward directions from the selected first image area; calculating second classification result values for the plurality of second image areas; and detecting an object in the captured image by using the first and second classification result values. According to the present invention, classification is performed only on an object search area selected from the whole image area, so the time required to detect an object can be reduced while maintaining the same detection performance.

Description

Method and apparatus for detecting an object in an image using a selective sliding window

The present invention relates to a method and an apparatus for detecting an object in an image using a selective sliding window, and more particularly, to a method and an apparatus that use a selective sliding window to improve the computation speed of object detection in an image.

Recently, methods of detecting specific objects such as people, vehicles, and two-wheeled vehicles in images have been actively researched in fields such as intelligent vehicles, robotics, and video security. Object detection technology refers to techniques for estimating the position and size of a specific object, such as a person or a vehicle, in a given image.

FIG. 1 is a view for explaining a conventional object detection technique. As shown in FIG. 1, a general object detection technique uses a pre-trained classifier to search the entire area of an image and determine whether an object is present. Because the model learned by the classifier generally has a fixed window size, detecting objects of various sizes requires repeating the search while changing the size of the test image, and estimating the position of an object requires moving the window horizontally and vertically across the image.

To detect an object precisely with this multi-scale sliding window method, the window must be moved across the search area in fine steps. However, this increases the number of windows to be searched, and the computation time increases accordingly.
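The cost argument above can be made concrete with a small counting sketch. The image size, window size, strides, and scale list below are hypothetical values chosen for illustration; none of them come from the patent.

```python
def count_windows(img_w, img_h, win_w, win_h, stride):
    """Count window positions for a fixed-size window slid with `stride`."""
    if img_w < win_w or img_h < win_h:
        return 0  # the window does not fit at this image size
    nx = (img_w - win_w) // stride + 1  # horizontal positions
    ny = (img_h - win_h) // stride + 1  # vertical positions
    return nx * ny


def count_multiscale(img_w, img_h, win_w, win_h, stride, scales):
    """Sum the window counts over several resized copies of the image."""
    return sum(
        count_windows(int(img_w * s), int(img_h * s), win_w, win_h, stride)
        for s in scales
    )


# Hypothetical numbers: a 640x480 image and a 64x128 window.
fine = count_windows(640, 480, 64, 128, stride=4)
coarse = count_windows(640, 480, 64, 128, stride=8)
print(fine, coarse)
```

Halving the stride roughly quadruples the number of windows at each scale, which is why a finer search is proportionally more expensive.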

Accordingly, to reduce computation time by minimizing the search area, many methods have been proposed that extract a region of interest by separating the background from the foreground, such as background models, motion models, and 3D modeling techniques.

However, region-of-interest-based object detection methods require camera information or prior information about the image, and they can extract a region of interest only under limited conditions, which restricts their fields of application. In addition, if an error in the region-of-interest estimation step causes a region of interest to be missed or estimated inaccurately, the object cannot be detected.

Therefore, there is a need for an object detection technique that can precisely extract objects from an image without such restrictions while also improving computation speed.

The background art of the present invention is disclosed in Korean Patent Laid-Open Publication No. 10-2016-0046210 (published on April 28, 2016).

An object of the present invention is to provide a method and an apparatus for detecting an object in an image using a selective sliding window in order to improve computation speed during object detection.

According to an embodiment of the present invention for achieving this object, a method for detecting an object in an image using an object detecting apparatus comprises: receiving a captured image from a camera; sliding a unit window over the captured image by an integer multiple of a preset number of movement pixels and calculating first classification result values for a plurality of first image areas corresponding to the slid unit window; selecting a first image area whose first classification result value is larger than a reference object detection probability value; selecting, as a second image area, an image area corresponding to the unit window slid by the preset number of movement pixels in at least one of the upward, downward, leftward, and rightward directions from the selected first image area; calculating second classification result values for the plurality of second image areas; and detecting an object in the captured image using the first and second classification result values.

The integer multiple may be an integer multiple of 2 or more.

The calculating of the first classification result values may include: sliding the unit window over the captured image by the integer multiple of the preset number of movement pixels and extracting feature vectors for the plurality of first image areas corresponding to the unit window slid by the integer multiple; and inputting the feature vectors for the plurality of first image areas into a pre-trained classifier to calculate the first classification result values corresponding to the plurality of first image areas.

The calculating of the second classification result values may include: extracting feature vectors for the plurality of second image areas; and inputting the feature vectors for the plurality of second image areas into the pre-trained classifier to calculate the second classification result values corresponding to the plurality of second image areas.

The method may further include converting the received captured image into each of a plurality of preset image sizes in order to detect objects of various sizes.

An object detecting apparatus according to another embodiment of the present invention comprises: an input unit that receives a captured image from a camera; a first calculation unit that slides a unit window over the captured image by an integer multiple of a preset number of movement pixels and calculates first classification result values for a plurality of first image areas corresponding to the slid unit window; a selection unit that selects a first image area whose first classification result value is larger than a reference object detection probability value and selects, as a second image area, an image area corresponding to the unit window slid by the preset number of movement pixels in at least one of the upward, downward, leftward, and rightward directions from the selected first image area; a second calculation unit that calculates second classification result values for the plurality of second image areas; and a detection unit that detects an object in the captured image using the first and second classification result values.

As described above, according to the present invention, classification is performed only on an object search area selected from the whole image area, so the time required for object detection can be reduced while maintaining the same detection performance.

FIG. 1 is a view for explaining a conventional object detection technique.
FIG. 2 is a configuration diagram of an object detecting apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart of an object detection method according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining step S320 of FIG. 3 in detail.
FIG. 5 is a diagram for explaining step S340 of FIG. 3 in detail.
FIG. 6 is a view for explaining an image search area selection process according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice them. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described here. To describe the present invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are denoted by similar reference numerals throughout the specification.

Throughout the specification, when a part is said to "include" an element, this means that it may further include other elements rather than excluding them, unless specifically stated otherwise.

Embodiments of the present invention are now described in detail with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice them.

First, an apparatus for detecting an object in an image using a selective sliding window according to an embodiment of the present invention is described with reference to FIG. 2. FIG. 2 is a configuration diagram of an object detecting apparatus according to an embodiment of the present invention.

As shown in FIG. 2, the object detecting apparatus 100 according to the embodiment of the present invention includes an input unit 110, a first calculation unit 120, a selection unit 130, a second calculation unit 140, and a detection unit 150, and may further include a conversion unit 160.

First, the input unit 110 receives a captured image from a camera.

Next, the first calculation unit 120 slides a unit window over the captured image by an integer multiple of a preset number of movement pixels and calculates first classification result values for a plurality of first image areas corresponding to the slid unit window.

Specifically, the first calculation unit 120 slides the unit window over the captured image by an integer multiple of the preset number of movement pixels. The first calculation unit 120 then extracts feature vectors for the plurality of first image areas corresponding to the unit window slid by the integer multiple. Here, the integer multiple means an integer multiple of 2 or more.

Then, the first calculation unit 120 inputs the feature vectors for the plurality of first image areas into a pre-trained classifier and calculates the first classification result values corresponding to the plurality of first image areas.

Next, the selection unit 130 selects a first image area whose first classification result value is larger than a reference object detection probability value.

The selection unit 130 then selects, as a second image area, an image area corresponding to the unit window slid by the preset number of movement pixels in at least one of the upward, downward, leftward, and rightward directions from the selected first image area.

Next, the second calculation unit 140 calculates second classification result values for the plurality of second image areas.

Specifically, the second calculation unit 140 extracts feature vectors for the plurality of second image areas. The second calculation unit 140 then inputs the feature vectors for the plurality of second image areas into the pre-trained classifier and calculates the second classification result values corresponding to the plurality of second image areas.

Meanwhile, the pre-trained classifier used by the first calculation unit 120 or the second calculation unit 140 is an object detection classifier that can determine whether the object to be detected is present in a given image area. Such classifiers include a support vector machine (SVM) classifier and may further include other classifiers for object detection.

Next, the detection unit 150 detects an object in the captured image using the first and second classification result values.

Finally, the conversion unit 160 converts the received captured image into each of a plurality of preset image sizes in order to detect objects of various sizes.

Hereinafter, a method of detecting an object in an image using a selective sliding window according to an embodiment of the present invention is described with reference to FIGS. 3 to 6. FIG. 3 is a flowchart of an object detection method according to an embodiment of the present invention.

First, the input unit 110 receives a captured image from a camera (S310).

Next, the first calculation unit 120 slides a unit window over the captured image by an integer multiple of a preset number of movement pixels and calculates first classification result values for a plurality of first image areas corresponding to the slid unit window (S320).

FIG. 4 is a diagram for explaining step S320 of FIG. 3 in detail. FIG. 4(a) shows the process of sliding the unit window, and FIG. 4(b) marks the left-top of each first image area corresponding to the slid unit window.

Specifically, the first calculation unit 120 slides the unit window over the captured image by an integer multiple of the preset number of movement pixels. Here, the integer multiple means an integer multiple of 2 or more.

For example, assume that the unit window is slid by twice the preset number of movement pixels. Then, as shown in FIG. 4(a), the first calculation unit 120 moves the unit window rightward and downward in steps of twice the preset number of movement pixels, starting from the upper-left corner of the image. Here, the grid drawn over the image in FIG. 4 marks intervals of the preset number of movement pixels.

The first calculation unit 120 then extracts feature vectors for the plurality of first image areas corresponding to the unit window slid by the integer multiple. For example, when the unit window is slid as in FIG. 4(a), the first calculation unit 120 extracts feature vectors for 36 first image areas, as shown in FIG. 4(b).
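As a rough illustration of the coarse pass in step S320, the sketch below lists the left-top coordinates visited when the stride is an integer multiple of the base step. The function name and all dimensions are hypothetical. The 200x176 image, 64x128 window, and 8-pixel step were chosen only because they yield 126 base-step positions and 36 coarse positions, matching the counts quoted in the text; the patent does not state the actual sizes used in the figures.

```python
def window_positions(img_w, img_h, win_w, win_h, step, multiple=1):
    """Left-top coordinates of a unit window slid by `step * multiple` pixels.

    `step` plays the role of the preset number of movement pixels and
    `multiple` is the integer multiple (2 or more for the coarse pass).
    """
    stride = step * multiple
    xs = range(0, img_w - win_w + 1, stride)
    ys = range(0, img_h - win_h + 1, stride)
    return [(x, y) for y in ys for x in xs]


# Hypothetical sizes, see the note above.
all_positions = window_positions(200, 176, 64, 128, step=8)
first_areas = window_positions(200, 176, 64, 128, step=8, multiple=2)
print(len(all_positions), len(first_areas))
```

Each returned coordinate identifies one first image area whose feature vector would then be extracted and classified.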

Here, a feature vector is a set of parameters representing the image features of the corresponding image area, and includes feature vectors such as HOG (Histogram of Oriented Gradients).

For example, the first calculation unit 120 may input a feature vector such as the HOG of a first image area into a classifier such as an SVM to calculate the first classification result value of that first image area.
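To show the shape of this feature-then-classify step, here is a deliberately simplified sketch in plain Python. It is not a full HOG implementation (no cells, blocks, or interpolation) and not a trained SVM: `orientation_histogram` is a toy single-histogram descriptor, and `linear_score` applies hypothetical weights and squashes the margin with a sigmoid so the result can be compared against a probability threshold. That sigmoid mapping is an assumption, since the text does not say how the classifier output becomes a probability-like value.

```python
import math


def orientation_histogram(patch, bins=9):
    """Toy HOG-like descriptor for a grayscale patch (list of pixel rows).

    Builds one histogram of unsigned gradient orientations (0 to 180 degrees),
    weighted by gradient magnitude and L2-normalised.
    """
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # horizontal gradient
            gy = patch[y + 1][x] - patch[y - 1][x]  # vertical gradient
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / (180.0 / bins)), bins - 1)] += mag
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]


def linear_score(features, weights, bias):
    """Linear decision value mapped to (0, 1) with a sigmoid (assumption)."""
    margin = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-margin))


# A 4x4 patch containing a vertical edge; the weights are made up.
patch = [[0, 0, 10, 10] for _ in range(4)]
features = orientation_histogram(patch)
score = linear_score(features, weights=[2.0] + [0.0] * 8, bias=-1.0)
print(features[0], score)
```

In practice one would substitute a real HOG extractor and a trained classifier; the sketch only fixes the interface of "image area in, classification result value out".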

Then, the selection unit 130 selects a first image area whose first classification result value is larger than the reference object detection probability value (S330). The reference object detection probability value may be set differently depending on the object to be detected in the image, and may be changed by a person skilled in the art as a matter of design.

The selection unit 130 then selects, as a plurality of second image areas, image areas corresponding to the unit window slid by the preset number of movement pixels in at least one of the upward, downward, leftward, and rightward directions from the first image area selected in step S330 (S340). FIG. 5 is a diagram for explaining step S340 of FIG. 3 in detail.

For example, the cells marked O in FIG. 5 indicate the left-top of the first image area selected in step S330, that is, a first image area whose first classification result value is larger than the reference object detection probability value. As shown in FIG. 5, the selection unit 130 selects, as second image areas, the image areas corresponding to the unit window slid by the preset number of movement pixels upward, downward, leftward, and rightward from the selected first image area. In FIG. 5, nine second image areas are selected.
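Steps S330 and S340 can be sketched as a threshold filter followed by a neighbour expansion. The function names, the coordinates, and the decision to skip positions that were already evaluated are illustrative assumptions. The example does not attempt to reproduce the nine areas of FIG. 5, which depend on the figure's contents.

```python
def select_first_areas(scores, threshold):
    """S330: keep positions whose classification result exceeds the threshold."""
    return [pos for pos, value in scores.items() if value > threshold]


def select_second_areas(selected, step, max_x, max_y, already_evaluated):
    """S340: positions one base step up, down, left, or right of each
    selected area, kept inside the image and never listed twice."""
    offsets = [(0, -step), (0, step), (-step, 0), (step, 0)]
    seen = set(already_evaluated)
    second = []
    for x, y in selected:
        for dx, dy in offsets:
            p = (x + dx, y + dy)
            if 0 <= p[0] <= max_x and 0 <= p[1] <= max_y and p not in seen:
                seen.add(p)
                second.append(p)
    return second


# Hypothetical first classification results on a stride-16 coarse grid.
first_scores = {(0, 0): 0.9, (16, 0): 0.2, (0, 16): 0.8}
selected = select_first_areas(first_scores, threshold=0.5)
second = select_second_areas(selected, step=8, max_x=32, max_y=32,
                             already_evaluated=first_scores)
print(selected, second)
```

Tracking the already evaluated positions keeps overlapping neighbourhoods from being classified more than once.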

Next, the second calculation unit 140 calculates second classification result values for the plurality of second image areas (S350).

Then, the detection unit 150 detects an object in the captured image using the first and second classification result values (S360). For example, the detection unit 150 may detect an object in the captured image by using a non-maximum suppression (NMS) technique to remove low-confidence results from among the first and second classification result values.
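One common form of NMS is the greedy, overlap-based variant sketched below. The text names NMS without specifying a variant, so the intersection-over-union (IoU) criterion, the 0.5 overlap threshold, and the box coordinates are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS: visit boxes from highest to lowest score and drop any
    box that overlaps an already kept box by more than the threshold."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections: two overlapping windows and one separate window.
detections = [
    ((0, 0, 64, 128), 0.9),
    ((8, 0, 72, 128), 0.8),
    ((200, 0, 264, 128), 0.7),
]
final = non_max_suppression(detections)
print(final)
```

Here the second window is suppressed because it overlaps the higher-scoring first window, while the distant third window survives.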

Meanwhile, the conversion unit 160 may convert the received captured image into each of a plurality of preset image sizes (not shown). Specifically, the conversion unit 160 may perform this conversion after step S310. The object detecting apparatus 100 according to the embodiment of the present invention then performs steps S320 to S360 on each of the resized captured images in order to detect objects of various sizes.
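A minimal sketch of how such a list of image sizes might be generated is shown below. The text only says the sizes are preset, so the geometric scale factor, the stopping rule, and the example dimensions are assumptions.

```python
def pyramid_sizes(img_w, img_h, win_w, win_h, scale_factor=0.5):
    """Image sizes to search, shrinking until the unit window no longer fits."""
    if not 0.0 < scale_factor < 1.0:
        raise ValueError("scale_factor must be between 0 and 1")
    sizes = []
    w, h = img_w, img_h
    while w >= win_w and h >= win_h:
        sizes.append((w, h))
        w, h = int(w * scale_factor), int(h * scale_factor)
    return sizes


# Hypothetical 640x480 image and 64x128 unit window.
sizes = pyramid_sizes(640, 480, 64, 128)
print(sizes)
```

Steps S320 to S360 would then run once per entry, with detections mapped back to the original image coordinates by dividing by the corresponding scale.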

In addition, the object detecting apparatus 100 according to the embodiment of the present invention may use the second classification result values to select further image search areas for object detection (not shown). FIG. 6 is a view for explaining an image search area selection process according to an embodiment of the present invention.

As shown in FIG. 6, the object detecting apparatus 100 according to the embodiment of the present invention may select, from among the second image areas, those whose classification result value is larger than the reference object detection probability value.

Then, the object detecting apparatus 100 selects, as third image areas, the image areas corresponding to the unit window slid by the preset number of movement pixels upward, downward, leftward, and rightward from each second image area whose classification result value is larger than the reference object detection probability value. The object detecting apparatus 100 then calculates the classification result values of the third image areas.

Next, using the classification result values of the third image areas, the object detecting apparatus 100 repeats the above image search area selection process until no selectable area remains.

As a result, as shown in FIG. 6, the object detecting apparatus 100 detects the object in the image by searching only 54 of the 126 image areas in total.
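The repeated selection process can be sketched as a queue-driven expansion over a grid of window positions. This is a simplified illustration, not the patented implementation: positions are expressed in base-step units, the coarse pass and the refinement share one loop, and `toy_score` is a made-up stand-in for the feature extraction and classifier. The 18x7 grid is a hypothetical layout with 126 positions in total. The sketch does not reproduce the 54 searched areas of FIG. 6, which depend on the actual image.

```python
from collections import deque


def selective_search(cols, rows, score_fn, threshold, multiple=2):
    """Evaluate a coarse grid, then keep expanding around positions whose
    score exceeds the threshold until no selectable position remains."""
    scores = {}
    queue = deque()
    # Coarse pass: every `multiple`-th position in both directions.
    for gy in range(0, rows, multiple):
        for gx in range(0, cols, multiple):
            scores[(gx, gy)] = score_fn(gx, gy)
            if scores[(gx, gy)] > threshold:
                queue.append((gx, gy))
    # Refinement: one base step up, down, left, and right of each hit.
    while queue:
        gx, gy = queue.popleft()
        for dx, dy in ((0, -1), (0, 1), (-1, 0), (1, 0)):
            p = (gx + dx, gy + dy)
            if 0 <= p[0] < cols and 0 <= p[1] < rows and p not in scores:
                scores[p] = score_fn(*p)
                if scores[p] > threshold:
                    queue.append(p)
    return scores


def toy_score(gx, gy):
    """Hypothetical classifier: high only near grid position (9, 3)."""
    return 1.0 if abs(gx - 9) <= 2 and abs(gy - 3) <= 1 else 0.0


scores = selective_search(18, 7, toy_score, threshold=0.5)
hits = [p for p, s in scores.items() if s > 0.5]
print(len(scores), "of", 18 * 7, "positions evaluated;", len(hits), "hits")
```

Because every position is classified at most once and expansion stops where scores fall below the threshold, the number of classifier calls stays well under the full grid size whenever objects occupy a small part of the image.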

According to the embodiment of the present invention, classification is performed only on an object search area selected from the whole image area, so the time required for object detection can be reduced while maintaining the same detection performance.

Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical idea of the appended claims.

100: object detecting apparatus 110: input unit
120: first calculation unit 130: selection unit
140: second calculation unit 150: detection unit
160: conversion unit

Claims

A method for detecting an object in an image using an object detecting apparatus, the method comprising:
receiving a captured image from a camera;
sliding a unit window over the captured image by an integer multiple of a preset number of movement pixels and calculating first classification result values for a plurality of first image areas corresponding to the slid unit window;
selecting a first image area whose first classification result value is larger than a reference object detection probability value;
selecting, as a second image area, an image area corresponding to the unit window slid by the preset number of movement pixels in at least one of the upward, downward, leftward, and rightward directions from the selected first image area;
calculating second classification result values for the plurality of second image areas; and
detecting an object in the captured image using the first and second classification result values.

The method according to claim 1,
wherein the integer multiple is an integer multiple of 2 or more.

The method according to claim 1,
wherein the calculating of the first classification result values comprises:
sliding the unit window over the captured image by the integer multiple of the preset number of movement pixels and extracting feature vectors for the plurality of first image areas corresponding to the unit window slid by the integer multiple; and
inputting the feature vectors for the plurality of first image areas into a pre-trained classifier to calculate the first classification result values corresponding to the plurality of first image areas.

The method according to claim 1,
wherein the calculating of the second classification result values comprises:
extracting feature vectors for the plurality of second image areas; and
inputting the feature vectors for the plurality of second image areas into a pre-trained classifier to calculate the second classification result values corresponding to the plurality of second image areas.

The method according to claim 1,
further comprising converting the received captured image into each of a plurality of preset image sizes in order to detect objects of various sizes.

An object detecting apparatus comprising:
an input unit that receives a captured image from a camera;
a first calculation unit that slides a unit window over the captured image by an integer multiple of a preset number of movement pixels and calculates first classification result values for a plurality of first image areas corresponding to the slid unit window;
a selection unit that selects a first image area whose first classification result value is larger than a reference object detection probability value and selects, as a second image area, an image area corresponding to the unit window slid by the preset number of movement pixels in at least one of the upward, downward, leftward, and rightward directions from the selected first image area;
a second calculation unit that calculates second classification result values for the plurality of second image areas; and
a detection unit that detects an object in the captured image using the first and second classification result values.

The apparatus according to claim 6,
wherein the integer multiple is an integer multiple of 2 or more.

The apparatus according to claim 6,
wherein the first calculation unit
slides the unit window over the captured image by the integer multiple of the preset number of movement pixels, extracts feature vectors for the plurality of first image areas corresponding to the unit window slid by the integer multiple, and inputs the feature vectors for the plurality of first image areas into a pre-trained classifier to calculate the first classification result values corresponding to the plurality of first image areas.

The apparatus according to claim 6,
wherein the second calculation unit
extracts feature vectors for the plurality of second image areas and inputs the feature vectors for the plurality of second image areas into a pre-trained classifier to calculate the second classification result values corresponding to the plurality of second image areas.

The apparatus according to claim 6,
further comprising a conversion unit that converts the received captured image into each of a plurality of preset image sizes in order to detect objects of various sizes.