KR20180111150A

KR20180111150A - Apparatus for extracting foreground from video and method for the same

Info

Publication number: KR20180111150A
Application number: KR1020170041664A
Authority: KR
Inventors: 최성록; 김동현; 이상재; 김지완; 이재영; 임재호; 조재일
Original assignee: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)
Priority date: 2017-03-31
Filing date: 2017-03-31
Publication date: 2018-10-11

Abstract

An embodiment of the present invention provides an apparatus for extracting a foreground from an image, comprising: a database that stores input images captured by a photographing apparatus installed at a fixed position; a reference background model management unit that generates, using one or more of the input images, a reference background model corresponding to a target image from which a foreground is to be extracted; a learning background model management unit that generates or updates a learning background model corresponding to the target image through automatic background learning; and a foreground extraction unit that extracts a foreground corresponding to the target image by comparing the target image with the reference background model and the learning background model.

Description

APPARATUS FOR EXTRACTING FOREGROUND FROM VIDEO AND METHOD FOR THE SAME

The present invention relates to an apparatus and method for separating a foreground from an image acquired through a photographing apparatus such as a fixed camera or CCTV.

In recent years, many CCTV cameras have been installed to monitor incidents and accidents and to secure evidence, and even more are expected to be installed in the future. At present, however, most CCTV cameras perform only simple video monitoring and storage. Finding a desired event in stored video is therefore inefficient: one must either search using accurate time information or spend a long time looking through the footage manually.

Accordingly, intelligent visual surveillance (IVS) and intelligent video analysis technologies, which recognize spaces, objects, and people in CCTV footage and automatically detect and report incidents and accidents, are being researched and developed, and products with basic functionality are already on the market. A growing number of CCTV cameras are expected to include such functions.

However, the main reason existing IVS systems are not widely used is their limited performance, in particular a high false alarm rate and a low detection rate. That is, normal situations are frequently misidentified as abnormal, and abnormal situations are frequently missed. Although many factors contribute to this limitation, the biggest problem is false detection and missed detection of the foreground by background learning and foreground separation (background subtraction), a core component technology of intelligent video surveillance.

Background learning and foreground separation learns the unchanging parts of consecutive images as background and separates the parts that differ greatly from the background as foreground. Because the separated foreground can be recognized as a person or an object, it serves as the basic data for judging abnormal situations.

That is, conventional foreground extraction computes the difference between an input image and a learning background model to distinguish background from foreground, and then reflects the identified background in the learning background model according to a predetermined learning rate.
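
The conventional scheme described above can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists, a fixed per-pixel threshold, and a running-average update; the function name `subtract_and_update` and the values of `threshold` and `learning_rate` are illustrative assumptions, not taken from the specification.

```python
def subtract_and_update(frame, background, threshold=30.0, learning_rate=0.05):
    """Return (foreground_mask, updated_background) for one grayscale frame.

    frame and background are lists of rows of pixel intensities with the
    same shape. threshold and learning_rate are illustrative assumptions.
    """
    mask = []
    updated = []
    for frame_row, bg_row in zip(frame, background):
        mask_row = []
        new_row = []
        for pixel, bg in zip(frame_row, bg_row):
            # A pixel that differs strongly from the model is foreground.
            is_foreground = abs(pixel - bg) > threshold
            mask_row.append(1 if is_foreground else 0)
            if is_foreground:
                # Foreground pixels leave the background model unchanged.
                new_row.append(bg)
            else:
                # Background pixels are blended in at the learning rate.
                new_row.append((1.0 - learning_rate) * bg + learning_rate * pixel)
        mask.append(mask_row)
        updated.append(new_row)
    return mask, updated
```

In this sketch only pixels classified as background are blended into the model. Real systems often blend every pixel at a low rate, which is one way a long-stationary object ends up absorbed into the background.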

However, automatic background learning has a critical drawback: over time, a person or object that remains stationary for a long period is absorbed into the background and can no longer be separated as foreground. Conversely, if automatic background learning is not used, gradual illumination changes are not reflected in the background model, and after some time the entire image may be extracted as foreground.

In addition, background learning and foreground separation cannot distinguish a foreground whose color is similar to the background, as shown in FIGS. 1 and 2. FIG. 1 is a still frame of video captured on a train station platform, and FIG. 2 is a still frame of the foreground extracted from the video of FIG. 1. With existing background learning and foreground separation, the person on the right side of FIG. 1 wears a white top whose color is similar to the surrounding floor, so the foreground is extracted inaccurately, as shown in FIG. 2.

The background art described above is technical information that the inventors possessed for, or acquired in the course of, deriving the present invention, and is not necessarily known art disclosed to the general public before the filing of the present application.

Korean Patent Laid-Open Publication No. 10-2011-0109595

An object of the present invention is to provide an apparatus and method for extracting a foreground from an image with improved foreground separation performance by using a reference background model generated from images captured over a long period by a fixed photographing apparatus.

Another object of the present invention is to provide an apparatus and method for extracting a foreground from an image with improved foreground separation performance by using information such as the focal length, position, and orientation of the camera.

Another object of the present invention is to provide an apparatus and method for extracting a foreground from an image with improved foreground separation performance by using a mask whose size is corrected according to each position in the image.

According to the present invention, the apparatus and method for extracting a foreground from an image provide foreground separation with improved performance by using a reference background model generated from images captured over a long period by a fixed photographing apparatus, thereby addressing false detection and missed detection of the foreground.

In addition, by using information such as the focal length, position, and orientation of the camera, the apparatus and method for extracting a foreground from an image make it possible to build a high-performance intelligent video surveillance system based on foreground separation with further improved performance.

In addition, by using a mask whose size is corrected according to each position in the image, the apparatus and method for extracting a foreground from an image can realize foreground separation with further improved performance.

FIG. 1 is a view showing an example of an image captured on a train station platform.
FIG. 2 is a view showing an example of the result of extracting a foreground from the image shown in FIG. 1 using a conventional technique.
FIG. 3 is a view showing an intelligent video surveillance system 1 according to an embodiment of the present invention.
FIG. 4 is a flowchart showing a method of operating the intelligent video surveillance system 1 according to an embodiment of the present invention.
FIG. 5 is a block diagram showing the apparatus 100 for extracting a foreground from an image, shown in FIG. 3, according to an embodiment of the present invention.
FIG. 6 is a view showing an example of the operational relationships within the apparatus 100 for extracting a foreground from an image shown in FIG. 5.
FIG. 7 is a view showing examples of object masks corresponding to each position in an image.
FIG. 8 is a view for explaining an example of a method of estimating the size of an object in an image using GIS information.
FIG. 9 is a view showing an example of an image captured in the situation shown in FIG. 8.
FIG. 10 is a flowchart showing a method of extracting a foreground from an image according to an embodiment of the present invention.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail. The effects and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the drawings. Repeated descriptions, and detailed descriptions of known functions and configurations that could unnecessarily obscure the gist of the present invention, are omitted. The embodiments are provided to describe the present invention more completely to those of ordinary skill in the art. Accordingly, the shapes and sizes of elements in the drawings may be exaggerated for clarity.

However, the present invention is not limited to the embodiments disclosed below and may be implemented in various forms, with all or some of the embodiments selectively combined. In the following embodiments, terms such as "first" and "second" are not used in a limiting sense but to distinguish one component from another. A singular expression includes the plural unless the context clearly indicates otherwise. Terms such as "include" or "have" mean that the features or components described in the specification are present, and do not preclude the possibility that one or more other features or components may be added.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. In the description, identical or corresponding components are given the same reference numerals, and redundant descriptions of them are omitted.

FIG. 1 is a view showing an example of an image captured on a train station platform.

FIG. 2 is a view showing an example of the result of extracting a foreground from the image shown in FIG. 1 using a conventional technique.

In detail, FIG. 1 is a still frame of video captured by a photographing apparatus fixedly installed on a train station platform, and FIG. 2 is a still frame of the foreground extracted from the video of FIG. 1 using conventional background learning and foreground extraction.

Referring to FIGS. 1 and 2, the person on the right side of FIG. 1 wears a white top, and the surrounding background is a whitish floor. That is, the person's top and its surroundings are similar in color. As a result, when the foreground is extracted with conventional background learning and foreground extraction, the person is not accurately extracted as foreground, as shown in FIG. 2. In other words, the conventional technique extracts a foreground whose color is similar to the background with low accuracy.

In addition, because of perspective, the same object appears at a different position and size in the image depending on where it is located, and the conventional technique does not account for this. For example, a distant object may be excluded from the foreground, or a single nearby object may be split into several foregrounds.

FIG. 3 is a view showing an intelligent video surveillance system 1 according to an embodiment of the present invention.

Referring to FIG. 3, in the intelligent video surveillance system 1 according to an embodiment of the present invention, the apparatus 100 for extracting a foreground from an image is interconnected with the photographing apparatus 400 and the blob detection apparatus 200. The blob detection apparatus 200 is in turn interconnected with the apparatus 100 for extracting a foreground from an image and the event detection apparatus 300.

The apparatus 100 for extracting a foreground from an image according to an embodiment of the present invention receives images captured at a fixed location, generates from the received images a reference background model corresponding to the target image from which a foreground is to be extracted, and uses a learning background model generated for the target image through automatic background learning. It then extracts the foreground corresponding to the target image using the difference images obtained by comparing the target image with the reference background model and with the learning background model, and transmits the result to the blob detection apparatus 200. The background image that remains after the foreground is extracted from the target image is used to train and update the learning background model.

The learning background model is a background model generated by analyzing consecutive input images, learning the unchanging parts as background and the parts that differ greatly from the background as foreground.

The reference background model is a background model that serves as the reference for the target image. It prevents a stationary object from being gradually absorbed into the background over time. The input images stored in the database may be used to generate the reference background model.

The difference images are the difference image between the target image and the learning background model and the difference image between the target image and the reference background model. Depending on the background model and the difference calculation method, a difference image may be a single-channel image (e.g., 640x480x1) or a multi-channel image (e.g., 640x480x3).
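
As a rough illustration of how two difference images could be used together, the sketch below computes single-channel absolute differences against both models and marks a pixel as foreground if either difference exceeds a threshold. The OR combination rule, the threshold value, and the function names are assumptions made for illustration; the text above does not fix how the two difference images are combined.

```python
def difference_image(image, model):
    """Single-channel absolute difference between an image and a background model."""
    return [[abs(p - m) for p, m in zip(image_row, model_row)]
            for image_row, model_row in zip(image, model)]


def extract_foreground(target, reference_model, learning_model, threshold=30):
    """Binary foreground mask from two background models.

    Assumption: a pixel is foreground when it differs from EITHER model by
    more than `threshold`. Other combination rules are possible.
    """
    diff_ref = difference_image(target, reference_model)
    diff_learn = difference_image(target, learning_model)
    return [[1 if (d_ref > threshold or d_learn > threshold) else 0
             for d_ref, d_learn in zip(ref_row, learn_row)]
            for ref_row, learn_row in zip(diff_ref, diff_learn)]


# A stationary object (value 200) already absorbed by the learning model
# is still detected because it differs from the reference model.
target = [[10, 200]]
reference_model = [[10, 10]]
learning_model = [[10, 200]]
mask = extract_foreground(target, reference_model, learning_model)
```

Under these assumptions `mask` is `[[0, 1]]`: the second pixel matches the learning model but not the reference model, so it is still reported as foreground.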

In an optional embodiment, the apparatus 100 for extracting a foreground from an image may receive user input when generating the reference background model. This allows the user to specify clearly the region to be treated as background, so that user feedback is reflected.

In an optional embodiment, the apparatus 100 for extracting a foreground from an image may receive user input when extracting the foreground. This allows the user to specify clearly the object to be treated as foreground, so that user feedback is reflected.

In an optional embodiment, when extracting the foreground, the apparatus 100 for extracting a foreground from an image may remove incorrectly classified background and foreground by means of a mask that takes the size of the object into account. In particular, the size of the object may be calculated using GIS (Geographic Information System) information such as the focal length, position, and orientation of the camera.

For example, an object located close to the camera appears relatively large in the image, so a large mask is set for it, whereas an object located far from the camera appears relatively small, so a small mask is set for it.
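
The relationship between distance and mask size can be sketched with a simple pinhole camera model, in which the projected height in pixels is the focal length (in pixels) times the real height divided by the distance. This is only an assumed model for illustration; the function names, the 2.0 m object height, and the 0.5 aspect ratio are hypothetical and are not taken from the specification.

```python
def expected_pixel_height(focal_length_px, object_height_m, distance_m):
    """Projected height in pixels under a pinhole camera model (an assumption)."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return focal_length_px * object_height_m / distance_m


def mask_size(focal_length_px, distance_m, object_height_m=2.0, aspect_ratio=0.5):
    """Hypothetical (height, width) of an object mask at a given distance.

    object_height_m and aspect_ratio (width / height) are illustrative defaults.
    """
    height = expected_pixel_height(focal_length_px, object_height_m, distance_m)
    return height, height * aspect_ratio


near = mask_size(focal_length_px=1000.0, distance_m=10.0)  # larger mask
far = mask_size(focal_length_px=1000.0, distance_m=40.0)   # smaller mask
```

With these illustrative numbers the mask at 10 m is four times as tall as the mask at 40 m, matching the near-large, far-small behavior described above.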

In an optional embodiment, the apparatus 100 for extracting a foreground from an image may preprocess the target image before extracting the foreground. It may also post-process the foreground-extracted image before transmitting it to the blob detection apparatus 200.

The blob detection apparatus 200 is a device that detects blobs in the foreground-extracted image. Here, a blob is a region formed by a group of connected pixels. The blob detection apparatus 200 may transmit the detected blob information to the event detection apparatus 300.
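
Since a blob is a region of connected pixels, blob detection can be sketched as connected-component labeling on the binary foreground mask. The sketch below uses breadth-first search with 4-connectivity; the choice of connectivity and the function name `find_blobs` are assumptions for illustration.

```python
from collections import deque


def find_blobs(mask):
    """Group 4-connected foreground pixels of a binary mask into blobs.

    Returns a list of blobs, each a list of (row, col) coordinates.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Start a new blob and flood outward from this pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs
```

With 4-connectivity, diagonally adjacent pixels form separate blobs; an 8-connected variant would merge them.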

The event detection apparatus 300 is a device that detects whether an event stored in an event database has occurred, using information about the objects extracted and recognized in the image. Examples of such events include entry into a restricted area and the occurrence of an abnormal situation.

The photographing apparatus 400 is a fixedly installed device capable of capturing images. For example, the photographing apparatus 400 may be a CCTV camera, a camcorder, or a digital camera.

FIG. 4 is a flowchart showing a method of operating the intelligent video surveillance system 1 according to an embodiment of the present invention.

Referring to FIG. 4, in the intelligent video surveillance system 1 according to an embodiment of the present invention, the photographing apparatus 400 installed at a fixed position captures an image (S401).

The photographing apparatus 400 transmits the captured image to the foreground extraction apparatus 100 (S403).

The foreground extraction apparatus 100 preprocesses the input image received from the photographing apparatus 400 so that the foreground can be extracted easily (S405). The preprocessing may include reducing image noise, compensating for motion, or converting the image into feature vectors other than RGB or into a codebook.

The foreground extraction apparatus 100 extracts the foreground from the preprocessed input image (S407).

The foreground extraction apparatus 100 post-processes the foreground-extracted image so that blobs can be detected easily (S409).

The foreground extraction apparatus 100 transmits the foreground-extracted image to the blob detection apparatus 200 (S411).

The blob detection apparatus 200 detects blobs in the received foreground-extracted image (S413).

The blob detection apparatus 200 filters the detected blobs (S415).

The blob detection apparatus 200 transmits the generated blob information to the event detection apparatus 300 (S417).

The event detection apparatus 300 detects events in the received blob information using the event information stored in the event database (S419).

Because the foreground is extracted more accurately, the objects in the target image and their movements are identified accurately, which improves event detection accuracy.

FIG. 5 is a block diagram showing the apparatus 100 for extracting a foreground from an image, shown in FIG. 3, according to an embodiment of the present invention.

Referring to FIG. 5, the apparatus 100 for extracting a foreground from an image according to an embodiment of the present invention includes a control unit 110, a communication unit 120, a memory 130, a database 150, a reference background model management unit 160, a learning background model management unit 170, a foreground extraction unit 180, and the like.

The apparatus 100 for extracting a foreground from an image according to an embodiment of the present invention may further include a preprocessing unit 140 or a post-processing unit 190.

In detail, the control unit 110 is a kind of central processing unit that controls the entire process of extracting a foreground from an image. That is, the control unit 110 may provide various functions by controlling the preprocessing unit 140, the reference background model management unit 160, the learning background model management unit 170, the foreground extraction unit 180, the post-processing unit 190, and the like.

Here, the control unit 110 may include any kind of device capable of processing data, such as a processor. A "processor" may refer to a data processing device embedded in hardware that has circuitry physically structured to perform functions expressed as code or instructions contained in a program. Examples of such a hardware-embedded data processing device include a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), and a field programmable gate array (FPGA), but the scope of the present invention is not limited thereto.

The communication unit 120 provides the communication interface needed to transmit and receive signals between the apparatus 100 for extracting a foreground from an image and the photographing apparatus 400 and the blob detection apparatus 200.

Here, the communication unit 120 may be a device that includes the hardware and software needed to transmit and receive signals, such as control signals or data signals, over a wired or wireless connection with other network devices.

The memory 130 temporarily or permanently stores the data processed by the control unit 110. The memory 130 may include magnetic storage media or flash storage media, but the scope of the present invention is not limited thereto.

The preprocessing unit 140 processes the input image so that the foreground can be extracted easily. This processing may include reducing image noise, compensating for motion, or converting the image into feature vectors other than RGB or into a codebook.

The database 150 contains input images, reference background models, and learning background models.

The reference background model management unit 160 generates, from the input images stored in the database 150, a reference background model corresponding to the target image from which a foreground is to be extracted.

Here, the reference background model is a background model that serves as the reference for the target image. It prevents a stationary object from being gradually absorbed into the background over time. The input images stored in the database may be used to generate the reference background model.

As a first method of generating the reference background model, the user may directly designate a reference background model from among the input images stored in the database, or an image additionally supplied by the user, taken at a specific time when no object is present, may be designated as the reference background model.

As a second method of generating the reference background model, among the input images stored in the database, the images from the day before the target image was captured may be used: the average image over a preset time T₁ before and after the time of day at which the target image was captured may be designated as the reference background model. The average image may be computed as the arithmetic mean or weighted mean of the images, or from the principal component after arranging the images into a matrix. Because the images are averaged, objects that were present in them are canceled out by the more frequently occurring background.
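
A minimal sketch of the arithmetic and weighted mean options follows, assuming that the frames within the T₁ window have already been selected and are stored as equally sized nested lists. The function name `mean_image` and the sample values are illustrative; the principal-component option is not shown.

```python
def mean_image(images, weights=None):
    """Pixel-wise weighted mean of equally sized grayscale images.

    With weights=None this is the plain arithmetic mean.
    """
    if not images:
        raise ValueError("at least one image is required")
    if weights is None:
        weights = [1.0] * len(images)
    if len(weights) != len(images):
        raise ValueError("weights and images must have the same length")
    total = float(sum(weights))
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(w * img[r][c] for w, img in zip(weights, images)) / total
             for c in range(cols)]
            for r in range(rows)]


# Four frames of a one-row image; an object (value 250) passes through
# the second pixel in one frame only.
frames = [[[10, 10]], [[10, 250]], [[10, 10]], [[10, 10]]]
plain = mean_image(frames)
weighted = mean_image(frames, weights=[1, 0, 1, 1])
```

In this toy example the transient object raises the plain mean of the second pixel from 10 to 70; averaging over many more frames, or down-weighting frames that contain objects, brings the result closer to the true background.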

As a third method of generating the reference background model, average images are computed from the input images stored in the database for the days up to D days before the date on which the target image was captured, each over a preset period T₂ before and after the time of day at which the target image was captured, and the average image most similar to the target image may be used as the reference background model. The similarity between images may be measured by the sum of a simple difference image between them, or by the likelihood under a probability model such as a Gaussian mixture model. In this way, conditions such as illuminance, weather, and overall brightness are taken into account.
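The selection step of the third method can be sketched as follows, using the sum of absolute pixel differences as the similarity measure (smaller means more similar). The function names and sample values are hypothetical, and the likelihood-based measure is not shown.

```python
from typing import List, Sequence

Image = List[List[float]]


def sum_abs_diff(a: Image, b: Image) -> float:
    """Sum of the simple (absolute) difference image between a and b."""
    return sum(abs(pa - pb)
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def most_similar(target: Image, candidates: Sequence[Image]) -> int:
    """Index of the candidate average image closest to the target image."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(range(len(candidates)),
               key=lambda i: sum_abs_diff(target, candidates[i]))


# Candidate average images from two earlier days: a dark day and a bright day.
target = [[100.0, 100.0]]
daily_averages = [[[50.0, 50.0]], [[95.0, 105.0]]]
best = most_similar(target, daily_averages)
```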

Furthermore, the first to third methods above may be combined when generating the reference background model.

The learning background model management unit (170) generates a learning background model from the input images through automatic background learning, and updates the existing learning background model by incorporating, at a preset learning rate, the background images that remain after the foreground extraction unit (180) has extracted the foreground.
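One common way to incorporate background at a learning rate is an exponential running average applied only to pixels classified as background. The source does not state the update formula, so the blend below is an assumption, and `update_background` is a hypothetical name.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[bool]]


def update_background(model: Image, frame: Image,
                      is_foreground: Mask, learning_rate: float) -> Image:
    """Blend background pixels of frame into model; keep foreground pixels out."""
    if not 0.0 <= learning_rate <= 1.0:
        raise ValueError("learning_rate must be in [0, 1]")
    return [
        [m if fg else (1.0 - learning_rate) * m + learning_rate * p
         for m, p, fg in zip(model_row, frame_row, mask_row)]
        for model_row, frame_row, mask_row in zip(model, frame, is_foreground)
    ]


model = [[10.0, 10.0]]
frame = [[20.0, 200.0]]          # second pixel belongs to a foreground object
mask = [[False, True]]
updated = update_background(model, frame, mask, learning_rate=0.1)
```

A small learning rate makes the model adapt slowly to lighting changes; a large one adapts quickly but absorbs misclassified objects sooner.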

Here, the learning background model is a background model generated by analyzing consecutive portions of the input images, learning the parts that do not change as background and the parts that differ greatly from the background as foreground.

As a first method of generating the initial learning background model, the first input image may be used as the learning background model. This is most useful when the first input image contains no foreground object; even if it does, the object is canceled out through learning over time. Another advantage is that the model is available as soon as the first image is input.

As a second method of generating the initial learning background model, the average image of a preset number N of input images, starting from the first input image, may be used as the initial learning background model. The advantage is that, even if the initial input images contain foreground objects, their effect is greatly reduced by the averaging.
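The second initialization method can be sketched with an incremental mean, so that the first N frames need not all be kept in memory. The incremental formulation and the class name `InitialBackground` are assumptions made for illustration; the source only specifies the average of N input images.

```python
from typing import List, Optional

Image = List[List[float]]


class InitialBackground:
    """Accumulates the mean of the first n frames as the initial model."""

    def __init__(self, n: int) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.count = 0
        self.model: Optional[Image] = None

    def add(self, frame: Image) -> bool:
        """Feed one frame; returns True once n frames have been averaged."""
        if self.count >= self.n:
            return True  # model is complete; later frames are ignored here
        self.count += 1
        if self.model is None:
            self.model = [[float(p) for p in row] for row in frame]
        else:
            k = self.count
            # mean_k = mean_(k-1) + (x_k - mean_(k-1)) / k
            self.model = [[m + (p - m) / k for m, p in zip(mr, fr)]
                          for mr, fr in zip(self.model, frame)]
        return self.count >= self.n


init = InitialBackground(n=3)
ready_flags = [init.add([[v]]) for v in (0.0, 30.0, 60.0)]
```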

The foreground extraction unit (180) compares the target image with each of the reference background model and the learning background model, and uses the resulting difference images to extract the foreground corresponding to the target image. This improves the accuracy with which background and foreground are distinguished.

Here, the difference images are the difference image between the target image and the learning background model and the difference image between the target image and the reference background model. Depending on the background model and the difference calculation method, a difference image may be a single-channel image (e.g., 640x480x1) or a multi-channel image (e.g., 640x480x3).

In addition to RGB or intensity, the background model and the preprocessed input image may have attributes of the forms (1) to (3) below. These attributes may be represented as a Gaussian probability distribution (mean and standard deviation), a histogram, or the like.

(1) Image components: RGB or a converted color space, intensity

(2) Spatial gradient of the image: the edges of the image and their direction and strength

(3) Temporal difference of the image: the motion in the image (optical flow)

The difference for single-channel or multi-channel attributes may be calculated by simply subtracting the values channel by channel (difference), or by using the Euclidean distance, the Mahalanobis distance, the L1-norm, the L2-norm, or the like.
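The difference measures named above can be sketched for a single pixel as follows. The function names are hypothetical, and the Mahalanobis variant assumes independent channels (a diagonal covariance given as per-channel standard deviations), which is a simplification not stated in the source.

```python
import math
from typing import List, Sequence


def channel_difference(pixel: Sequence[float],
                       background: Sequence[float]) -> List[float]:
    """Per-channel absolute difference; yields a multi-channel difference."""
    return [abs(p - b) for p, b in zip(pixel, background)]


def l1_distance(pixel: Sequence[float], background: Sequence[float]) -> float:
    """L1-norm of the channel differences; yields a single-channel difference."""
    return sum(channel_difference(pixel, background))


def euclidean_distance(pixel: Sequence[float],
                       background: Sequence[float]) -> float:
    """L2-norm of the channel differences."""
    return math.sqrt(sum((p - b) ** 2 for p, b in zip(pixel, background)))


def mahalanobis_diagonal(pixel: Sequence[float], mean: Sequence[float],
                         std: Sequence[float]) -> float:
    """Mahalanobis distance under the assumed diagonal covariance."""
    return math.sqrt(sum(((p - m) / s) ** 2
                         for p, m, s in zip(pixel, mean, std)))


pixel, background = (10.0, 20.0, 30.0), (13.0, 16.0, 30.0)
per_channel = channel_difference(pixel, background)
```

The Mahalanobis form fits background models stored as a mean and standard deviation per pixel, since it scales each channel's deviation by how much that channel normally varies.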

As a first method of distinguishing background from foreground, a pixel may be classified as foreground if the minimum (min) of its values in the two difference images exceeds a preset threshold Th₁, and as background otherwise.

As a second method of distinguishing background from foreground, a pixel may be classified as foreground if the maximum (max) of its values in the two difference images exceeds a preset threshold Th₂, and as background otherwise.

As a third method of distinguishing background from foreground, a weighted sum of the pixel's values in the two difference images is computed using preset weights w and (1-w); the pixel is classified as foreground if the sum exceeds a preset threshold Th₃, and as background otherwise.

Furthermore, as a fourth method of distinguishing background from foreground, a variable weight is calculated for each pixel of the two difference images, taking into account i) the value of the pixel, ii) the values of adjacent pixels, and iii) whether the pixel was classified as foreground or background immediately before; the pixel is classified as foreground if the resulting weighted sum exceeds a preset threshold Th₄, and as background otherwise.

Here, the input image, the background model, and the difference images may be multi-channel, and a different variable weight may be used for each channel (or attribute). The preset thresholds (Th₁, Th₂, Th₃, Th₄) and the weight w may be set by the user, or optimal values derived with conventional machine learning methods from a previously collected dataset may be used.
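The first three decision rules can be sketched per pixel as follows. The function names are hypothetical, and assigning w to the reference difference (and 1-w to the learning difference) is an assumption, since the source does not say which difference image receives which weight. The fourth method is omitted because the source does not define how the variable weight is computed.

```python
def foreground_by_min(d_ref: float, d_learn: float, th1: float) -> bool:
    """Method 1: foreground only if BOTH differences exceed the threshold."""
    return min(d_ref, d_learn) > th1


def foreground_by_max(d_ref: float, d_learn: float, th2: float) -> bool:
    """Method 2: foreground if EITHER difference exceeds the threshold."""
    return max(d_ref, d_learn) > th2


def foreground_by_weighted_sum(d_ref: float, d_learn: float,
                               w: float, th3: float) -> bool:
    """Method 3: weighted sum with w on d_ref and (1 - w) on d_learn (assumed)."""
    return w * d_ref + (1.0 - w) * d_learn > th3


# A parked object already absorbed by the learning model: it still differs
# strongly from the reference model but barely from the learning model.
d_ref, d_learn = 80.0, 5.0
results = (
    foreground_by_min(d_ref, d_learn, 30.0),
    foreground_by_max(d_ref, d_learn, 30.0),
    foreground_by_weighted_sum(d_ref, d_learn, 0.5, 30.0),
)
```

In this example the min rule misses the parked object while the max and weighted rules keep it, which illustrates how the choice of rule trades sensitivity against false detections.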

In addition, the foreground extraction unit (180) may distinguish background from foreground using a mask whose size is corrected according to each position in the image. This is because even the same object is captured at different sizes depending on its position in the image.

As a first method of determining the mask size, user input may be received to set the object size S_c at the center of the photographing apparatus (see 400 in FIG. 3) and the rate of change S_d as the position moves up or down, and the mask size may be determined from these values.
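The source does not give the formula that combines S_c and S_d. The sketch below assumes a linear model in the image row, with S_d expressed in pixels of mask size per pixel row and the y-axis pointing downward; the function name `mask_size_at_row` and the sample values are hypothetical.

```python
def mask_size_at_row(y: int, center_row: int, s_c: float, s_d: float,
                     min_size: int = 1) -> int:
    """Mask size at image row y under an assumed linear model.

    s_c is the object size at the center row; s_d is the change in size per
    row moved downward (rows below the center are closer to the camera).
    """
    size = s_c + s_d * (y - center_row)
    return max(min_size, round(size))


# 480-row image, 40-pixel objects at the center row, growing 0.1 px per row.
sizes = [mask_size_at_row(y, 240, 40.0, 0.1) for y in (40, 240, 440)]
```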

As a second method of determining the mask size, the mask size may be determined using GIS information corresponding to the photographing apparatus (see 400 in FIG. 3). Specifically, a relation between the height (H) of an object and the size of the object in the image can be established from the focal length (f), center coordinates (c_x, c_y), installation height (L), and tilt angle of the photographing apparatus (see 400 in FIG. 3).

The post-processing unit (190) applies image processing to the foreground-extracted image produced by the foreground extraction unit (180) so that blobs can be detected easily.

FIG. 6 is a diagram showing the operational relationships in an example of the apparatus (100) for extracting a foreground from an image shown in FIG. 5.

Referring to FIG. 6, the apparatus (100) for extracting a foreground from an image shown in FIG. 5 may store input images in the database and generate a reference background model from the stored images. For the target image from which the foreground is to be extracted, it may calculate the differences using the difference images obtained by comparison with each of the learning background model and the reference background model. It may then distinguish background from foreground based on the calculated differences and extract the foreground.

Once background and foreground have been distinguished, the resulting background information may be incorporated into the existing learning background model at a preset learning rate.

Here, the reference background model may be generated from the images stored in the database. In particular, it may be generated automatically by judging the similarity of images, or it may be selected or generated manually based on user input. The generated reference background model may be used when calculating the differences for subsequent input images.

Background and foreground may be distinguished automatically using the difference images, or manually based on user input.

Furthermore, when distinguishing background from foreground, a mask whose size is corrected according to the position in the image may be used, based on GIS information. This makes it possible to correct background and foreground that were misclassified.

In this way, the limitations that a learning background model has in the conventional art can be compensated for and overcome by using a reference background model generated from the many input images stored in the database.

FIG. 7 is a diagram showing examples of object masks corresponding to each position in an image.

Referring to FIG. 7, the size at which an object is captured differs from position to position in the image. This is essentially a matter of perspective: the farther an object is from the photographing apparatus (see 400 in FIG. 3), the smaller it appears in the image relative to its actual size. If a mask whose size is corrected for each position in the image is not used, a distant object that should be classified as foreground may be removed, and a single nearby object may be split into several foreground regions.

Therefore, as shown in FIG. 7, a mask whose size is corrected for each position in the image may be used, based on GIS information. This enables more accurate foreground extraction.

FIG. 8 is a diagram for explaining an example of a method of estimating the size of an object in an image using GIS information.

FIG. 9 is a diagram showing an example of an image captured in the situation shown in FIG. 8.

Specifically, FIG. 9 shows an object in the image. The origin of the image is the top-left corner, the x-axis points to the right, and the y-axis points downward. The '+' mark at the center corresponds to the center coordinates of the image.

Referring to FIGS. 8 and 9, the photographing apparatus (see 400 in FIG. 3) is located at point C, has a focal length f, is installed at a height of L meters, and is tilted by a given angle. The center coordinates in the image are (c_x, c_y). The lowest point of the object is located at P'(x, y) in the image, the highest point of the object is located at Q'(x, y') in the image, and the size of the object is H meters. In this case, the size of the object in the image is (y-y') pixels, because the downward direction is the positive y-axis and y is therefore greater than y'.

Here, let C' be the point on the ground closest to point C, P the point where the object touches the ground, Q the highest point of the object, and R the point where the straight line through points C and Q meets the ground. Let the distance between points C' and P be Z meters, and the distance between points C' and R be Z' meters. Using the GIS information given in advance, the angle between the focal direction of the photographing apparatus (see 400 in FIG. 3) and point P can be obtained according to Equation 1 below.

[Equation 1]

A relation between Z and L can then be obtained as in Equation 2 below.

[Equation 2]

The highest point Q'(x, y') of the object in the image can then be obtained through Equation 3 below.

[Equation 3]

Alternatively, Equation 4 below may be used.

[Equation 4]

Using the above equations, the size (y-y') of the object in the image, or the coordinates Q'(x, y') of the highest point of the object in the image, can be predicted, and a mask size corresponding to the size of the object can be used accordingly.
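The bodies of Equations 1 to 4 are not reproduced in this text, so the following is not the patent's formulation. It is a sketch of the same kind of size prediction under standard pinhole-camera assumptions: flat ground, an optical axis tilted downward from the horizontal by `tilt` radians, the image y-axis pointing downward, and the focal length f given in pixels. The function name and all parameter names are hypothetical.

```python
import math


def object_top_row(y: float, f: float, c_y: float, cam_height: float,
                   tilt: float, obj_height: float) -> float:
    """Predict y' (image row of the object's top) from y (row of its foot)."""
    # Angle of the ray through row y, measured below the optical axis.
    phi = math.atan2(y - c_y, f)
    depression = tilt + phi  # angle of that ray below the horizontal
    if depression <= 0.0:
        raise ValueError("the ray through this row does not reach the ground")
    # Horizontal distance Z from C' to the foot of the object, P.
    z = cam_height / math.tan(depression)
    # Angle below the horizontal of the ray from C to the object's top, Q.
    alpha = math.atan2(cam_height - obj_height, z)
    return c_y + f * math.tan(alpha - tilt)


# Camera 5 m high, tilted 30 degrees down, f = 800 px, center row 240.
params = dict(f=800.0, c_y=240.0, cam_height=5.0, tilt=math.radians(30.0))
near = 400.0 - object_top_row(400.0, obj_height=1.7, **params)  # pixel height
far = 300.0 - object_top_row(300.0, obj_height=1.7, **params)
```

The pixel height (y-y') obtained this way could then be used as the mask size at row y. A foot point lower in the image is closer to the camera, so `near` comes out larger than `far`.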

Even when the camera is tilted in three directions (three degrees of freedom), relations can be derived in the same manner as the above equations to predict the size.

FIG. 10 is a flowchart showing a method of extracting a foreground from an image according to an embodiment of the present invention.

Referring to FIG. 10, in the method of extracting a foreground from an image according to an embodiment of the present invention, the apparatus for extracting a foreground from an image (see 100 in FIG. 3) receives an input image captured from a fixed position (S1001).

In the method, the apparatus (see 100 in FIG. 3) then preprocesses the input image so that the foreground can be extracted easily (S1003).

The preprocessing may include reducing image noise, compensating for motion, or converting the image into feature vectors other than RGB or into a codebook.

In the method, the apparatus (see 100 in FIG. 3) stores the input image in the database (see 150 in FIG. 3) (S1005).

In the method, the apparatus (see 100 in FIG. 3) generates a reference background model using the input images stored in the database (see 150 in FIG. 3) (S1007).

Here, the reference background model is the background model that serves as the reference for the target image from which the foreground is to be extracted. It prevents a stationary object from being gradually absorbed into the background over time.

In the method, the apparatus (see 100 in FIG. 3) calculates the difference between the target image and the reference background model (S1009).

Here, a difference image between the target image and the reference background model may be generated and used to calculate the difference.

In the method, the apparatus (see 100 in FIG. 3) calculates the difference between the target image and the learning background model (S1011).

Here, a difference image between the target image and the learning background model may be generated and used to calculate the difference.

In the method, the apparatus (see 100 in FIG. 3) extracts the foreground using the differences calculated for the target image with respect to the reference background model and the learning background model (S1013).

Here, as a first method of distinguishing background from foreground, a pixel may be classified as foreground if the minimum (min) of its values in the two difference images exceeds a preset threshold Th₁, and as background otherwise.

In particular, when distinguishing background from foreground in the target image, a mask whose size is corrected according to each position in the image may be used. This is because even the same object is captured at different sizes depending on its position in the image.

In the method, the apparatus (see 100 in FIG. 3) updates the learning background model through learning, using the background and foreground information distinguished from the target image (S1015).

In the method, the apparatus (see 100 in FIG. 3) post-processes the foreground-extracted image generated by extracting the foreground from the target image (S1017).

The post-processing is image processing performed so that blobs can subsequently be detected easily from the foreground-extracted image.

In this way, unlike conventional foreground extraction techniques, the target image is compared not only with the learning background model but also with the reference background model, so that the foreground can be extracted with higher accuracy.

In an optional embodiment, among the above steps (S1001, S1003, S1005, S1007, S1009, S1011, S1013, S1015, and S1017), the step of calculating the difference between the target image and the reference background model (S1009) and the step of calculating the difference between the target image and the learning background model (S1011) may be performed in parallel.
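Because the two difference calculations do not depend on each other, they can be submitted as independent tasks. The sketch below shows only the structure, using Python's standard `concurrent.futures`; the function names are hypothetical. In CPython, threads do not speed up pure-Python arithmetic because of the global interpreter lock, so a real implementation would rely on native image routines or separate processes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Image = List[List[float]]


def abs_diff(image: Image, model: Image) -> Image:
    """Per-pixel absolute difference between an image and a background model."""
    return [[abs(p - m) for p, m in zip(image_row, model_row)]
            for image_row, model_row in zip(image, model)]


def both_differences(target: Image, reference_model: Image,
                     learning_model: Image) -> Tuple[Image, Image]:
    """Run the S1009 and S1011 difference calculations as two parallel tasks."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        ref_future = pool.submit(abs_diff, target, reference_model)      # S1009
        learn_future = pool.submit(abs_diff, target, learning_model)     # S1011
        return ref_future.result(), learn_future.result()


diff_ref, diff_learn = both_differences(
    [[10.0, 50.0]], [[10.0, 10.0]], [[12.0, 48.0]])
```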

In an optional embodiment, among the above steps (S1001, S1003, S1005, S1007, S1009, S1011, S1013, S1015, and S1017), the step of calculating the difference between the target image and the learning background model (S1011) may be performed first, followed by the step of calculating the difference between the target image and the reference background model (S1009).

The embodiments of the present invention described above may be implemented in the form of program instructions executable through various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the field of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. A hardware device may be replaced by one or more software modules to perform the processing according to the present invention, and vice versa.

The specific implementations described in the present invention are examples and do not limit the scope of the present invention in any way. For brevity of the specification, descriptions of conventional electronic components, control systems, software, and other functional aspects of such systems may be omitted. In addition, the connections of lines or connecting members between the components shown in the drawings illustrate functional connections and/or physical or circuit connections by way of example; in an actual apparatus they may be represented as various replaceable or additional functional connections, physical connections, or circuit connections. Moreover, unless a component is specifically described with terms such as "essential" or "important", it may not be a component necessary for the application of the present invention.

Accordingly, the spirit of the present invention should not be limited to the embodiments described above, and not only the claims set forth below but also all scopes equal to or equivalently modified from the claims fall within the scope of the spirit of the present invention.

1: Intelligent video surveillance system
100: Apparatus for extracting a foreground from an image
110: Control unit 120: Communication unit
130: Memory 140: Preprocessing unit
150: Database 160: Reference background model management unit
170: Learning background model management unit 180: Foreground extraction unit
190: Post-processing unit
200: Blob detection device 300: Event detection device

Claims

An apparatus for extracting a foreground from an image, comprising:
a database for storing input images captured by a photographing apparatus installed at a fixed position;
a reference background model management unit for generating, using one or more of the input images, a reference background model corresponding to a target image from which a foreground is to be extracted;
a learning background model management unit for generating or updating a learning background model corresponding to the target image through automatic background learning; and
a foreground extraction unit for extracting a foreground corresponding to the target image by comparing the target image with the reference background model and the learning background model.