WO2011136405A1 - Image recognition device and method using 3D camera - Google Patents

Image recognition device and method using 3D camera

Info

Publication number
WO2011136405A1
Authority
WO
WIPO (PCT)
Prior art keywords
area
image
digital images
unit
extracted
Prior art date
Application number
PCT/KR2010/002669
Other languages
French (fr)
Korean (ko)
Inventor
강인배
Original Assignee
(주)아이티엑스시큐리티
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by (주)아이티엑스시큐리티 filed Critical (주)아이티엑스시큐리티
Publication of WO2011136405A1 publication Critical patent/WO2011136405A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30236Traffic on road, railway or crossing

Abstract

Disclosed are an image recognition device and method using a 3D camera. The image recognition device of the present invention calculates the area of a moving object using 3D depth map data obtained with two cameras, and recognizes whether the moving object is a particular object according to whether the calculated area falls within a preset range. To this end, the image recognition device generates the 3D depth map data, extracts the moving object and the outline of the extracted object, and then calculates the extracted object's area.

Description

Image recognition device and method using a 3D camera
The present invention relates to an image recognition apparatus and method using a 3D camera that can recognize objects based on 3D depth map data acquired with two cameras.
Known methods for obtaining a depth map from an image, that is, the distance to a subject in three-dimensional space, include using a stereo camera, using a laser scan, and using time of flight (TOF).
Among these, stereo matching with a stereo camera is a hardware implementation of the way a person perceives depth with two eyes: it extracts depth (or distance) information about a space by interpreting a pair of images obtained by photographing the same subject with two cameras. To this end, the binocular disparity along the same epipolar line of the images obtained from the two cameras is calculated. The disparity carries distance information, and the geometric quantity computed from it is the depth. By computing disparity values from the input images in real time, three-dimensional distance information about the observed space can be measured.
Known stereo matching algorithms include, for example, the "image matching method using multiple image lines" of Korean Patent No. 0517876 and the "binocular disparity estimation method for three-dimensional object recognition" of Korean Patent No. 0601958.
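As a concrete illustration of stereo matching in general (not of the specific algorithms in the patents cited above), a rectified left/right image pair can be turned into a disparity map with a standard block matcher. The sketch below uses OpenCV's semi-global matcher as a stand-in; the file names and matcher parameters are assumptions:

```python
# Illustrative stereo-matching sketch using OpenCV's semi-global block matcher.
# This is a stand-in for the cited patents' own algorithms, showing how a
# disparity (binocular difference) map is obtained along epipolar lines.
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)     # hypothetical file names
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# numDisparities must be a multiple of 16; blockSize is the matching window.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)

# compute() returns fixed-point disparities scaled by 16.
disparity = matcher.compute(left, right).astype("float32") / 16.0
```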
An object of the present invention is to provide an image recognition apparatus and method using a 3D camera that can recognize a subject based on 3D depth map data acquired with two cameras.
To achieve this object, the image recognition method using a 3D camera of the present invention includes: converting a pair of analog images, captured by two cameras photographing the same area, into digital images; calculating 3D depth map data using the converted pair of digital images; extracting the region of a moving object by comparing one of the digital images with a reference background image; calculating the area of the object based on the distance to the object obtained from the depth map data; and recognizing the object by determining whether the calculated area falls within a range preset for the area of a particular object.
Here, extracting the region of the object may include performing a subtraction operation on one of the digital images and the reference background image, and detecting the outline of the object extracted from the subtracted image.
Further, calculating the area of the object may obtain the total area of the object by finding the unit area of a pixel located at the distance of the extracted object and multiplying it by the number of pixels enclosed by the outline.
An image recognition apparatus according to another embodiment of the present invention includes a stereo camera unit, a distance information calculator, an object extractor, and an object recognizer. The stereo camera unit has two cameras photographing the same area and converts the captured pair of analog images into digital images. The distance information calculator computes 3D depth map data from the pair of digital images generated by the stereo camera unit, and the object extractor extracts the region of a moving object by comparing one of those digital images with a reference background image.
The object recognizer calculates the area of the object extracted by the object extractor, based on the distance to the object obtained from the depth map data computed by the distance information calculator, and recognizes the object by determining whether the calculated area falls within a range preset for the area of a particular object.
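The division of labor among these units can be sketched as a minimal skeleton; all class and method names below are illustrative and do not appear in the patent:

```python
# Minimal structural sketch of the apparatus: stereo camera unit feeding a
# distance calculator, object extractor, and area-based recognizer.
class StereoCameraUnit:
    def capture_pair(self):
        """Return a frame-synchronized (left, right) pair of digital images."""
        raise NotImplementedError

class ImageProcessor:
    def __init__(self, distance_calculator, object_extractor, object_recognizer):
        self.distance_calculator = distance_calculator
        self.object_extractor = object_extractor
        self.object_recognizer = object_recognizer

    def process(self, left, right):
        depth_map = self.distance_calculator.compute(left, right)   # 3D depth map data
        region = self.object_extractor.extract(left)                # moving-object region
        return self.object_recognizer.recognize(region, depth_map)  # area-based decision
```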
The image recognition apparatus of the present invention can recognize a moving object in the photographed area in a simpler way. Although it processes images generated with two cameras, its recognition algorithm is relatively simple compared with two-dimensional image processing, so recognition speed and efficiency improve and, above all, the recognition rate is excellent.
FIG. 1 is a block diagram of a 3D image recognition apparatus according to an embodiment of the present invention;
FIG. 2 is a flowchart provided to explain the 3D image recognition process of the present invention;
FIG. 3 shows the image processing results of the object extraction step; and
FIG. 4 is a diagram provided to explain the method of calculating an object's area.
The present invention is described in more detail below with reference to the drawings.
Referring to FIG. 1, the image recognition apparatus 100 of the present invention includes a stereo camera unit 110 and an image processor 130, and recognizes subjects in three-dimensional space.
The stereo camera unit 110 includes a first camera 111, a second camera 113, and an image receiver 115.
The first camera 111 and the second camera 113 are a pair of cameras installed apart from each other so as to photograph the same area, commonly called a stereo camera. They output analog video signals of the photographed area to the image receiver 115.
The image receiver 115 converts the continuous frames of video (or images) input from the first camera 111 and the second camera 113 into digital images, synchronizes their frames, and provides them to the image processor 130.
Depending on the embodiment, the first camera 111 and the second camera 113 of the stereo camera unit 110 may output digital rather than analog video signals; in that case the image receiver 115 provides the interface to the image processor 130 without any conversion and only matches the frame synchronization of the image pair.
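A hedged sketch of this frame-pairing role for free-running digital cameras follows; the grab-then-retrieve pattern only approximates frame synchronization, and the device indices are assumptions:

```python
# Sketch: grab frames from two cameras and pair them. grab() latches both
# sensors back-to-back before decoding with retrieve(), which approximates
# frame synchronization for free-running USB cameras.
import cv2

cam_left, cam_right = cv2.VideoCapture(0), cv2.VideoCapture(1)  # assumed indices

def capture_synchronized_pair():
    cam_left.grab()                      # latch both frames first...
    cam_right.grab()
    ok_l, left = cam_left.retrieve()     # ...then decode them
    ok_r, right = cam_right.retrieve()
    if not (ok_l and ok_r):
        raise RuntimeError("camera read failed")
    return left, right
```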
The stereo camera unit 110 may further include a wired or wireless interface for connecting to the image processor 130 over an IP (Internet Protocol) network.
The image processor 130 extracts the region of a moving object from the pair of digital image frames output by the stereo camera unit 110, determines whether that object is an object of interest, and can perform this determination in real time on every frame of the video continuously input from the stereo camera unit 110.
For this processing, the image processor 130 includes a distance information calculator 131, an object extractor 133, and an object recognizer 135. Their operation is described below with reference to FIG. 2.
First, when the first camera 111 and the second camera 113 generate analog video signals, the image receiver 115 converts them into digital video signals, synchronizes the frames, and provides them to the image processor 130 (steps S201 and S203).
<Depth map data calculation: step S205>
The distance information calculator 131 computes 3D depth map data, containing distance information for each pixel, from the pair of digital images received in real time from the image receiver 115.
Here, the distance information of each pixel is the binocular disparity obtained by the stereo matching methods described in the background art; it can be computed using, for example, the "image matching method using multiple image lines" of Korean Patent No. 0517876 or the graph cut algorithm presented in the "binocular disparity estimation method for three-dimensional object recognition" of Korean Patent No. 0601958. The depth map data computed by the distance information calculator 131 therefore contains distance information for every pixel.
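Once a disparity map is available, per-pixel distance follows from the standard relation depth = f·B/d for a calibrated, rectified stereo pair; the focal length and baseline below are assumed example values:

```python
# Sketch: convert a disparity map to per-pixel distance. depth = f * B / d,
# with f the focal length in pixels, B the camera baseline in meters, and
# d the disparity. f and B here are assumptions.
import numpy as np

def depth_from_disparity(disparity, focal_px=700.0, baseline_m=0.12):
    d = np.asarray(disparity, dtype=np.float32)
    depth = np.full(d.shape, np.inf, dtype=np.float32)  # no match -> treat as far away
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth                                        # distance in meters per pixel
```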
<Extraction of the moving object's region: step S207>
The object extractor 133 extracts the region of a moving object from one image of the pair of digital images input through the image receiver 115. Here, a moving object means an object within the camera's field of view whose position or motion has changed, or an object that has newly entered the field of view.
The region of a moving object can be extracted in various ways. For example, the object extractor 133 of the present invention extracts it by background subtraction, subtracting a previously stored background image from the input image frame. The subtraction operates pixel by pixel on the two corresponding image frames. The reference background image is an image assumed to contain no moving object; the object extractor 133 may store it on a storage medium (not shown) for later use.
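A minimal sketch of this background-subtraction step, with an assumed threshold value:

```python
# Subtract a stored reference background from the input frame pixel by pixel,
# then threshold the difference to obtain the moving-object region.
import cv2

def extract_moving_region(frame_gray, background_gray, thresh=30):
    diff = cv2.absdiff(frame_gray, background_gray)            # per-pixel subtraction
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask                                                # white where the object moved
```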
Furthermore, since difference images can also arise in the background, outside the object region, from camera noise or changes in scene illumination, the object extractor 133 can cope with noise and lighting changes through background modeling, for example by applying Gaussian-distribution processing to the result of the subtraction.
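One common way to realize such Gaussian-distribution background modeling is a per-pixel Gaussian mixture; the sketch below uses OpenCV's MOG2 subtractor as an illustrative stand-in, with assumed parameters:

```python
# Per-pixel Gaussian-mixture background model: adapts to gradual lighting
# changes and suppresses camera noise while flagging moving foreground.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

def extract_with_background_model(frame):
    return subtractor.apply(frame)   # updates the model, returns the foreground mask
```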
Referring to FIG. 3, (a) is the image input from the image receiver 115, (b) is the reference background image, and (c) is the result of the subtraction. FIG. 3(c) shows that the region of the moving object has been extracted from the input image.
<Outline detection of the moving object: step S209>
The object extractor 133 detects the outline of the moving object by performing edge detection on the subtraction result of step S207. Edge detection is handled with different edge types depending on the width and shape of the object's boundary.
For outline detection, the object extractor 133 can apply morphology operations to the subtraction image to remove noise and simplify the outline or skeleton lines. The basic morphology operations are erosion, which removes noise, and dilation, which fills small holes inside the object.
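A sketch of this cleanup-and-outline step; the kernel size, iteration counts, and the choice of the largest contour are assumptions:

```python
# Erosion drops isolated noise pixels, dilation fills small holes, and
# findContours() recovers the object outline from the cleaned mask.
import cv2
import numpy as np

def detect_outline(mask):
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.erode(mask, kernel, iterations=1)     # remove speckle noise
    mask = cv2.dilate(mask, kernel, iterations=2)    # fill small holes in the object
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea) if contours else None
```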
<Area calculation of the moving object: step S211>
The object recognizer 135 calculates the area of the object extracted in step S209, using the outline information extracted by the object extractor 133 and the 3D depth map data computed by the distance information calculator 131.
The area is computed by finding the actual area per pixel (hereinafter the pixel's "unit area") at the distance l at which the object extracted in step S209 is located, and then multiplying by the number of pixels enclosed by the object's outline.
Referring to FIG. 4, M denotes the actual area covered by the whole frame at the maximum depth L, measured against the existing background image, and m(l) the actual area covered by the whole frame at the extracted object's position l. First, m(l), the actual area corresponding to the whole frame at the object's distance l, can be obtained as in Equation 1 below.
Equation 1

m(l) = M × (l / L)²

(The equation itself appears in the source only as an image placeholder; this square-law form is reconstructed from the pinhole-camera geometry of FIG. 4, in which the frame's footprint grows linearly in each dimension with distance.)
Here, M is the actual area corresponding to the whole frame (e.g., 720 × 640 pixels) at the maximum distance L, measured against the existing background image.
Next, dividing m(l), the actual area of the whole frame at the object's distance l, by the total number of pixels in the frame (P, e.g., 460,800 = 720 × 640) gives the unit area m_p(l) of a pixel within the object region, as in Equation 2 below.
Equation 2

m_p(l) = m(l) / P
Here, P is the total number of pixels. Equation 2 shows that m_p(l) depends on the distance l to the object, obtained from the distance information in the 3D depth map data.
Finally, as described above, the area of the object is obtained by multiplying the pixel unit area m_p(l) by the number of pixels p_c enclosed by the outline, as in Equation 3 below.
Equation 3

object area = m_p(l) × p_c
Here, p_c is the number of pixels contained in the object.
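Equations 1 through 3 reduce to a few lines of arithmetic; the sketch below uses the 720 × 640 frame from the text, while M, L, l, and the object pixel count are invented example values:

```python
# Numeric sketch of Equations 1-3 under the square-law form assumed above.
def object_area(M, L, l, frame_pixels, object_pixels):
    m_l = M * (l / L) ** 2          # Eq. 1: frame footprint at the object's distance
    m_p = m_l / frame_pixels        # Eq. 2: unit area of one pixel at distance l
    return m_p * object_pixels      # Eq. 3: object area = unit area x pixel count

# Example: frame covers 100 m^2 at L = 20 m; object at l = 5 m fills 12,000 pixels.
area = object_area(M=100.0, L=20.0, l=5.0, frame_pixels=720 * 640, object_pixels=12000)
```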
<Object recognition based on the moving object's area: step S213>
The object recognizer 135 recognizes the object by comparing the area obtained in step S211 with preset values. For example, if the object's area is at most a first size, it may be recognized as a four-legged animal; if its size falls within a first range, it may be recognized as an automobile.
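A sketch of this final thresholding decision; the labels and area ranges are invented placeholders, not values from the patent:

```python
# Compare the computed physical area against preset ranges per object class.
AREA_RANGES = {
    "animal": (0.05, 0.5),   # four-legged animal, in m^2 (assumed values)
    "person": (0.5, 1.2),
    "car":    (1.2, 8.0),
}

def recognize(area_m2):
    for label, (lo, hi) in AREA_RANGES.items():
        if lo <= area_m2 < hi:
            return label
    return "unknown"
```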
Through the above process, the image recognition apparatus of the present invention obtains 3D depth map data with a stereo camera and recognizes the objects captured in the image.
Although described here as if it precedes them in time, the depth map calculation of step S205 may run in parallel with the moving-object extraction of steps S207 and S209, as shown in FIG. 2, or even after steps S207 and S209.
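A sketch of that parallel arrangement using a thread pool; the worker functions are placeholders standing in for the steps sketched earlier:

```python
# Run the depth-map computation (S205) concurrently with object extraction
# (S207, S209); their results feed the area-based recognition (S211, S213).
from concurrent.futures import ThreadPoolExecutor

def compute_depth_map(left, right):        # placeholder for step S205
    return "depth map"

def extract_outline(frame, background):    # placeholder for steps S207-S209
    return "outline"

def process_frame(left, right, background):
    with ThreadPoolExecutor(max_workers=2) as pool:
        depth_f = pool.submit(compute_depth_map, left, right)
        outline_f = pool.submit(extract_outline, left, background)
        return depth_f.result(), outline_f.result()
```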
While preferred embodiments of the present invention have been shown and described above, the invention is not limited to these specific embodiments. Various modifications may be made by those of ordinary skill in the art without departing from the gist of the invention as claimed, and such modifications should not be understood separately from the technical idea or outlook of the invention.

Claims (5)

  1. An image recognition method using a 3D camera, comprising:
    generating a pair of digital images using two cameras photographing the same area;
    calculating 3D depth map data using the converted pair of digital images;
    extracting the region of a moving object by comparing one of the digital images with a reference background image;
    calculating the area of the object based on the distance to the object obtained from the depth map data; and
    recognizing the object by determining whether the calculated area falls within a range preset for the area of a particular object.
  2. The method of claim 1, wherein extracting the region of the object comprises:
    performing a subtraction operation on one of the digital images and the reference background image; and
    detecting the outline of the object extracted from the subtracted image.
  3. The method of claim 2, wherein calculating the area of the object obtains the total area of the object by finding the unit area of a pixel located at the distance of the extracted object and multiplying it by the number of pixels enclosed by the outline.
  4. An image recognition apparatus using a 3D camera, comprising:
    a stereo camera unit having two cameras photographing the same area and generating a pair of digital images;
    a distance information calculator calculating 3D depth map data using the pair of digital images generated by the stereo camera unit;
    an object extractor extracting the region of a moving object by comparing one of the digital images generated by the stereo camera unit with a reference background image; and
    an object recognizer calculating the area of the object extracted by the object extractor, based on the distance to the object obtained from the depth map data calculated by the distance information calculator, and recognizing the object by determining whether the calculated area falls within a range preset for the area of a particular object.
  5. The apparatus of claim 4, wherein the object extractor extracts the object by performing a subtraction operation on one of the digital images generated by the stereo camera unit and the reference background image, and then detects the outline of the extracted object for the area calculation.
PCT/KR2010/002669 2010-04-28 2010-04-28 Image recognition device and method using 3d camera WO2011136405A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020100039302A KR101148029B1 (en) 2010-04-28 2010-04-28 Video Analysing Apparatus and Method Using 3D Camera
KR10-2010-0039302 2010-04-28

Publications (1)

Publication Number Publication Date
WO2011136405A1 true WO2011136405A1 (en) 2011-11-03

Family

ID=44861681

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2010/002669 WO2011136405A1 (en) 2010-04-28 2010-04-28 Image recognition device and method using 3d camera

Country Status (2)

Country Link
KR (1) KR101148029B1 (en)
WO (1) WO2011136405A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563461A (en) * 2017-08-25 2018-01-09 北京中骏博研科技有限公司 The automatic fees-collecting method and system of catering industry based on image recognition

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101203121B1 (en) * 2012-04-20 2012-11-21 주식회사 아이티엑스시큐리티 3 Dimensional Motion Recognition System and Method Using Stereo Camera
US9454816B2 (en) 2013-10-23 2016-09-27 International Electronic Machines Corp. Enhanced stereo imaging-based metrology
KR102259509B1 (en) 2019-08-22 2021-06-01 동의대학교 산학협력단 3d modeling process based on photo scanning technology
KR102339339B1 (en) * 2020-12-30 2021-12-15 (주)해양정보기술 Method for calculate volume of wave overtopping

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09145368A (en) * 1995-11-29 1997-06-06 Ikegami Tsushinki Co Ltd Moving and tracing method for object by stereoscopic image
KR20050066400A (en) * 2003-12-26 2005-06-30 한국전자통신연구원 Apparatus and method for the 3d object tracking using multi-view and depth cameras
KR20070065480A (en) * 2005-12-20 2007-06-25 한국철도기술연구원 A monitoring system for subway platform using stereoscopic video camera
KR20090027410A (en) * 2007-09-12 2009-03-17 한국철도기술연구원 Stereo vision based monitoring system in railway station and method thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09145368A (en) * 1995-11-29 1997-06-06 Ikegami Tsushinki Co Ltd Moving and tracing method for object by stereoscopic image
KR20050066400A (en) * 2003-12-26 2005-06-30 한국전자통신연구원 Apparatus and method for the 3d object tracking using multi-view and depth cameras
KR20070065480A (en) * 2005-12-20 2007-06-25 한국철도기술연구원 A monitoring system for subway platform using stereoscopic video camera
KR20090027410A (en) * 2007-09-12 2009-03-17 한국철도기술연구원 Stereo vision based monitoring system in railway station and method thereof

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563461A (en) * 2017-08-25 2018-01-09 北京中骏博研科技有限公司 The automatic fees-collecting method and system of catering industry based on image recognition

Also Published As

Publication number Publication date
KR20110119893A (en) 2011-11-03
KR101148029B1 (en) 2012-05-24

Similar Documents

Publication Publication Date Title
WO2011136407A1 (en) Apparatus and method for image recognition using a stereo camera
WO2016122069A1 (en) Method for measuring tire wear and device therefor
WO2011136405A1 (en) Image recognition device and method using 3d camera
WO2012124852A1 (en) Stereo camera device capable of tracking path of object in monitored area, and monitoring system and method using same
WO2013151270A1 (en) Apparatus and method for reconstructing high density three-dimensional image
CN110243390B (en) Pose determination method and device and odometer
WO2014035103A1 (en) Apparatus and method for monitoring object from captured image
CN112045676A (en) Method for grabbing transparent object by robot based on deep learning
WO2008111550A1 (en) Image analysis system and image analysis program
WO2015069063A1 (en) Method and system for creating a camera refocus effect
KR20110129158A (en) Method and system for detecting a candidate area of an object in an image processing system
WO2012133962A1 (en) Apparatus and method for recognizing 3d movement using stereo camera
WO2019098421A1 (en) Object reconstruction device using motion information and object reconstruction method using same
KR101281003B1 (en) Image processing system and method using multi view image
WO2018021657A1 (en) Method and apparatus for measuring confidence of deep value through stereo matching
WO2014185691A1 (en) Apparatus and method for extracting high watermark image from continuously photographed images
WO2017086522A1 (en) Method for synthesizing chroma key image without requiring background screen
KR20170001448A (en) Apparatus for measuring position of camera using stereo camera and method using the same
WO2014204126A2 (en) Apparatus for capturing 3d ultrasound images and method for operating same
WO2013077508A1 (en) Device and method for depth map generation and device and method using same for 3d image conversion
WO2021256640A1 (en) Device and method for reconstructing human posture and shape model on basis of multi-view image by using information on relative distance between joints
CN109447087A (en) A kind of oil smoke image dynamic area extracting method, identifying system and kitchen ventilator
CN111382607A (en) Living body detection method and device and face authentication system
CN111696143A (en) Event data registration method and system
CN111311615A (en) ToF-based scene segmentation method and system, storage medium and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10850773

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10850773

Country of ref document: EP

Kind code of ref document: A1