KR101899993B1

KR101899993B1 - Object recognition method based on restricted region on image using disparity map

Info

Publication number: KR101899993B1
Application number: KR1020160069238A
Authority: KR
Inventors: 전재욱; 이상준; 딘 빈 뉴엔
Original assignee: 성균관대학교산학협력단
Priority date: 2016-06-03
Filing date: 2016-06-03
Publication date: 2018-09-18
Also published as: KR20170137301A

Abstract

An object recognition method based on a region restricted by a disparity map includes: generating, by an image processing apparatus, a disparity map for a source image; generating, by the image processing apparatus, a V-disparity map and a U-disparity map using the disparity map; determining, by the image processing apparatus, a candidate region using the information shown in the V-disparity map and the U-disparity map; and performing, by the image processing apparatus, deep learning on the candidate region of the source image.

Description

OBJECT RECOGNITION METHOD BASED ON RESTRICTED REGION ON IMAGE USING DISPARITY MAP

The technique described below relates to recognizing objects in an image using deep learning.

Deep learning is a method in which a machine uses machine learning to learn from the provided information by itself and then automatically recognizes the learned objects in subsequently input image information.

Korean Patent Publication No. 10-2008-0069601

The technique described below aims to provide a method that uses a disparity map to determine a candidate region of an input image on which deep learning is to be performed, and then performs deep learning on that candidate region.

The technique described below performs deep learning on a restricted region rather than on the entire input image, which increases the speed of recognizing objects in the image.

FIG. 1 is an example of an image processing system that performs deep learning.
FIG. 2 is an example of a process of performing deep learning.
FIG. 3 is an example of a process of generating a V-disparity map and a U-disparity map.
FIG. 4 is an example of determining a candidate region for object detection in a source image using a depth map.
FIG. 5 is an example of the result of performing deep learning on a candidate region of a source image.

The technique described below may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail. This is not intended to limit the technique to specific embodiments, however, and it should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the technique described below.

Terms such as first, second, A, and B may be used to describe various components, but the components are not limited by these terms; the terms are used only to distinguish one component from another. For example, without departing from the scope of the technique described below, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of the plurality of related listed items.

As used in this specification, singular expressions should be understood to include plural expressions unless the context clearly indicates otherwise. Terms such as "comprises" mean that the stated features, numbers, steps, operations, components, parts, or combinations thereof are present, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Before the drawings are described in detail, it should be made clear that the division of components in this specification is merely a division according to the main function each component is responsible for. That is, two or more of the components described below may be combined into a single component, or a single component may be divided into two or more components by more finely subdivided functions. Each component described below may, in addition to its own main function, perform some or all of the functions of other components, and some of the main functions of a component may of course be performed exclusively by another component.

In addition, when a method or an operating method is performed, the steps of the method may occur in an order different from the stated order unless a specific order is clearly stated in context. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in the reverse order.

The technique described below relates to a deep learning technique for recognizing or detecting objects contained in an image. FIG. 1 is an example of an image processing system 100 that performs deep learning. The image processing system 100 includes client devices 110a and 110b, an image database 120, and an image processing apparatus 130. FIG. 1 shows, as an example, a system connected through a network.

The client devices 110a and 110b are devices that transmit images. They transmit images captured by a camera or images stored on a storage medium. The image database 120 stores the images transmitted by the client devices 110a and 110b. The image processing apparatus 130 is an apparatus that learns and analyzes the images stored in the image database 120; it may be a server or a computer on which an image processing program runs. The image processing apparatus 130 accesses the images stored in the image database 120 and learns them. It can classify the learned images according to given criteria, for example whether an image contains a specific object, a specific background, or a specific color, or the size of the image. To do so, the image processing apparatus 130 must recognize or detect the objects contained in an image. The image processing apparatus 130 can recognize or detect objects in an image using a deep learning technique. As for the specific deep learning technique, various algorithms well known in the field may be used.

Unlike the system 100 shown in FIG. 1, deep learning may also be performed in a closed system such as an automobile. For example, an automobile system can acquire images with a camera and perform deep learning to determine whether a collision target such as a person or another automobile is present.

FIG. 2 is an example of a process of performing deep learning, specifically a conventional process of performing deep learning on an image. As shown on the left of FIG. 2, a conventional deep learning technique applies a window 5 of a fixed size to the input image 10 and scans the entire image 10. In this way it finds the objects to be recognized or detected, as in the result image on the right of FIG. 2. However, the conventional deep learning method repeats the computation over every area of the image 10, including areas that contain no object. These unnecessary computations degrade system performance.

The deep learning technique described below does not scan the entire image; it scans only the regions in which an object is likely to be present. That is, the technique first sets the region of the image to be scanned with the window. The region of the whole image that the image processing apparatus 130 scans is called the candidate region. The technique described below sets the candidate region using a disparity map. The image processing apparatus scans the image within the candidate region and performs deep learning.
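The saving from restricting the scan can be illustrated with a minimal sketch. The function name `window_positions`, the half-open rectangle format `(x0, y0, x1, y1)`, and the window size and stride are all assumptions made for illustration; the specification does not fix any of them.

```python
def window_positions(region, window, stride):
    """Yield the top-left (x, y) of every square window that fits inside region.

    region is a hypothetical half-open rectangle (x0, y0, x1, y1) in pixels.
    """
    x0, y0, x1, y1 = region
    for y in range(y0, y1 - window + 1, stride):
        for x in range(x0, x1 - window + 1, stride):
            yield (x, y)


# Compare scanning a whole 640x480 image with scanning only a candidate region.
full_image = (0, 0, 640, 480)
candidate = (200, 150, 400, 350)   # illustrative values, not from the patent
n_full = sum(1 for _ in window_positions(full_image, window=64, stride=32))
n_cand = sum(1 for _ in window_positions(candidate, window=64, stride=32))
print(n_full, n_cand)
```

Each yielded position corresponds to one evaluation of the recognizer, so the ratio of the two counts gives a rough idea of the computation avoided.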

The image processing apparatus generates a disparity map and, from the disparity map, generates a V-disparity map and a U-disparity map. The image processing apparatus determines the candidate region based on the V-disparity map and the U-disparity map.

First, the process of generating the V-disparity map and the U-disparity map is briefly described. FIG. 3 is an example of that process. The image processing apparatus generates a disparity map from the binocular images acquired by a stereo camera. FIG. 3(a) shows stereo images acquired by a stereo camera or by a plurality of cameras. Various techniques exist for generating a disparity map from stereo images, and the image processing apparatus may use any one of them. For example, the image processing apparatus may classify the pixels of the stereo images, which are the source images, and perform stereo matching based on the classified pixels. FIG. 3(b) shows the result of classifying the pixels of the stereo images using a pixel classification technique such as LBP (Local Binary Pattern). The image processing apparatus performs stereo matching on images such as those in FIG. 3(b) to generate the disparity map. FIG. 3(c) is an example of the generated disparity map.
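As an illustration of the pixel classification step, the following is a minimal sketch of the basic 8-neighbour Local Binary Pattern code for a single pixel. The specification names LBP only as one possible technique and gives no details, so the neighbour ordering, the `>=` comparison, and the function name `lbp_code` are assumptions.

```python
def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the interior pixel img[r][c].

    img is a list of rows of grayscale values. A neighbour that is at least
    as bright as the centre sets its bit; bits are assigned clockwise
    starting from the top-left neighbour (an assumed ordering).
    """
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


patch = [[9, 1, 1],
         [1, 5, 1],
         [1, 1, 1]]
print(lbp_code(patch, 1, 1))  # only the top-left neighbour is >= 5, so -> 1
```

Because the code depends only on brightness ordering within the neighbourhood, it is a common choice when the two stereo views differ in exposure.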

The image processing apparatus generates the V-disparity map and the U-disparity map from the disparity map. FIG. 3(d) is an example of the V-disparity map generated from FIG. 3(c). FIG. 3(e) is an example of the U-disparity map generated from FIG. 3(c).

The V-disparity map is obtained by accumulating, at the same vertical-axis position, the pixels of the depth map that lie at the same vertical-axis position (the same line) and have the same disparity value. In the V-disparity map, the horizontal axis represents the disparity value and the vertical axis is the same as the vertical axis of the disparity map. The algorithm for generating the V-disparity map is as follows.

[V-disparity map generation algorithm]

for each i-th column on disp do
    for each j-th line on disp do
        if disp(i,j) > 0 then
            vdisp(disp(i,j), j)++
        end
    end
end

In the code above, disp is the disparity map, and disp(x,y) denotes the disparity value at position (x,y) in the depth map. vdisp is the V-disparity map, and vdisp(x,y) denotes the value at position (x,y) in the V-disparity map.
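The algorithm above can be sketched in runnable form as follows. This sketch assumes integer disparities stored as a list of rows, so that `disp[j][i]` corresponds to disp(i,j) and `vdisp[j][d]` corresponds to vdisp(d,j); the function name `v_disparity` and the `max_disp` parameter are introduced here for illustration.

```python
def v_disparity(disp, max_disp):
    """Build a V-disparity map from an integer disparity map.

    disp is a list of rows (lines). The result has one row per image line
    and one column per disparity value 0..max_disp; each cell counts how
    many pixels on that line have that disparity. Zero disparities are
    skipped, as in the pseudocode.
    """
    vdisp = [[0] * (max_disp + 1) for _ in disp]
    for j, line in enumerate(disp):
        for d in line:
            if d > 0:
                vdisp[j][d] += 1
    return vdisp


disp = [[0, 1, 1],
        [2, 2, 2]]
print(v_disparity(disp, max_disp=2))  # -> [[0, 2, 0], [0, 0, 3]]
```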

The U-disparity map is generated in the same way as the V-disparity map, except that accumulation is performed along the horizontal axis. The U-disparity map is obtained by accumulating, at the same horizontal-axis position, the pixels of the depth map that lie at the same horizontal-axis position (the same column) and have the same disparity value. In the U-disparity map, the vertical axis represents the disparity value and the horizontal axis is the same as the horizontal axis of the disparity map.
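By analogy with the V-disparity algorithm, a U-disparity map might be built as in the sketch below. The specification gives no pseudocode for this case, so the layout (one row per disparity value, one column per image column) and the name `u_disparity` are assumptions that follow the axis description above.

```python
def u_disparity(disp, max_disp):
    """Build a U-disparity map from an integer disparity map.

    disp is a list of rows. The result has one row per disparity value
    0..max_disp and one column per image column; each cell counts how many
    pixels in that column have that disparity. Zero disparities are skipped.
    """
    width = len(disp[0])
    udisp = [[0] * width for _ in range(max_disp + 1)]
    for line in disp:
        for i, d in enumerate(line):
            if d > 0:
                udisp[d][i] += 1
    return udisp


disp_example = [[0, 1, 1],
                [2, 1, 2]]
print(u_disparity(disp_example, max_disp=2))  # -> [[0, 0, 0], [0, 2, 1], [1, 0, 1]]
```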

The image processing apparatus determines, as the candidate region, a region that shows a linear change in the V-disparity map and the U-disparity map. The V-disparity map and the U-disparity map are expressed as numeric values; FIG. 3 renders these values as grayscale images, in the same way as the disparity map. The image processing apparatus may determine the candidate region based on either one of the V-disparity map and the U-disparity map. For example, it may determine as the candidate region a region in which either map shows a large change along the horizontal or vertical axis. Furthermore, it may determine as the candidate region a region that shows a large change in both the V-disparity map and the U-disparity map.

Here, a linear change means that the values making up the V-disparity map or the U-disparity map change sharply along the horizontal or vertical direction. For example, in data such as '3, 5, 4, 6, 15, 16, 20, 12, 7, 5, 5', the run "15, 16, 20" can be regarded as showing a linear change. The image processing apparatus may set in advance the amount of change used to determine the candidate region; for example, a reference value such as 5 or 10 may be used.
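The specification does not state how such a run is delimited. One possible reading, sketched below purely as an assumption, is that a run begins where the value rises by at least the reference value and ends where it falls by at least the reference value. The function name `find_change_runs` is hypothetical.

```python
def find_change_runs(values, threshold):
    """Return half-open index ranges (start, end) of runs bounded by sharp changes.

    Assumed rule (not from the specification): a run starts at the first
    element after a rise of at least threshold and ends just before a drop
    of at least threshold.
    """
    runs = []
    start = None
    for k in range(1, len(values)):
        step = values[k] - values[k - 1]
        if start is None and step >= threshold:
            start = k
        elif start is not None and step <= -threshold:
            runs.append((start, k))
            start = None
    if start is not None:          # run still open at the end of the data
        runs.append((start, len(values)))
    return runs


data = [3, 5, 4, 6, 15, 16, 20, 12, 7, 5, 5]
print(find_change_runs(data, threshold=5))  # -> [(4, 7)], i.e. the values 15, 16, 20
```

With the reference value 5 this rule recovers the run from the example in the text; other rules, such as thresholding against a local baseline, would be equally consistent with the description.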

FIG. 4 is an example of determining a candidate region for object detection in a source image using a depth map. The image processing apparatus generates a V-disparity map or a U-disparity map for the input image 10 and, based on that map, determines a candidate region showing a linear change. Image 20 in FIG. 4 is an example in which the candidate region has been determined. The image processing apparatus may remove the region of the input image 10 other than the candidate region. Image 150 in FIG. 4 is an example in which pixel information is kept only in the candidate region of the input image 10. Alternatively, instead of changing the pixel information as in FIG. 4, the image processing apparatus may simply restrict the window scan to the candidate region.
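The removal of everything outside the candidate region can be sketched as a simple masking step. The rectangular region format `(x0, y0, x1, y1)`, the fill value, and the name `keep_candidate_region` are assumptions; the specification does not say what shape the candidate region has or what value replaces the removed pixels.

```python
def keep_candidate_region(image, region, fill=0):
    """Return a copy of image with every pixel outside region set to fill.

    image is a list of rows; region is a hypothetical half-open rectangle
    (x0, y0, x1, y1).
    """
    x0, y0, x1, y1 = region
    return [
        [pix if (y0 <= y < y1 and x0 <= x < x1) else fill
         for x, pix in enumerate(row)]
        for y, row in enumerate(image)
    ]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(keep_candidate_region(image, (1, 1, 3, 3)))  # -> [[0, 0, 0], [0, 5, 6], [0, 8, 9]]
```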

FIG. 5 is an example of the result of performing deep learning on the candidate region of the source image. It shows the result of the image processing apparatus recognizing or detecting objects by performing deep learning on the candidate region determined in FIG. 4.

The embodiment and the drawings attached to this specification merely illustrate a part of the technical idea included in the technique described above. It is evident that all modifications and specific embodiments that a person skilled in the art can readily infer within the scope of the technical idea contained in the specification and drawings fall within the scope of rights of the technique described above.

100: Image processing system
110a, 110b: Client device
120: Image database
130: Image processing apparatus

Claims

An object recognition method based on a region restricted by a disparity map, the method comprising:
generating, by an image processing apparatus, a disparity map for a source image;
generating, by the image processing apparatus, a V-disparity map and a U-disparity map using the disparity map;
determining, by the image processing apparatus, using the information shown in the V-disparity map and the U-disparity map, as a candidate region a region showing a linear change, namely a region in which each of the V-disparity map and the U-disparity map shows a change in disparity at or above a reference range along the horizontal axis or the vertical axis;
removing the region other than the candidate region from the source image; and
performing, by the image processing apparatus, deep learning on the candidate region of the source image,
wherein the candidate region includes a region in which the change in disparity value in the V-disparity map is equal to or greater than a first reference value and a region in which the change in disparity value in the U-disparity map is equal to or greater than a second reference value.

The method according to claim 1,
wherein the disparity map is generated by performing stereo matching based on an image obtained by classifying the pixels of the source image using a pixel classification technique.

The method according to claim 1,
wherein the V-disparity map is formed by accumulating, in the vertical direction, the pixels of the disparity map that have the same disparity value, on a map whose horizontal axis is the disparity value and whose vertical axis is the same as the vertical axis of the disparity map, and
wherein the U-disparity map is formed by accumulating, in the horizontal direction, the pixels of the disparity map that have the same disparity value, on a map whose vertical axis is the disparity value and whose horizontal axis is the same as the horizontal axis of the disparity map.

delete