KR102141646B1

KR102141646B1 - Method and apparatus for detecting moving object from image recorded by unfixed camera

Info

Publication number: KR102141646B1
Application number: KR1020190002990A
Authority: KR
Inventors: 이준희; 양동원; 하태길; 임종인; 최진영; 김도경
Original assignee: 국방과학연구소
Priority date: 2019-01-09
Filing date: 2019-01-09
Publication date: 2020-08-05
Also published as: KR20200086585A

Abstract

Disclosed is a method of detecting a moving object from an image captured by a non-fixed camera, the method comprising: dividing each of the frames constituting the image into grid regions for applying a background model; setting an attention mask on the grid regions to determine the resolution at which each of the grid regions is processed; compensating for movement of the background model caused by movement of the non-fixed camera; and selecting the moving object from each of the frames by updating model variables of the motion-compensated background model.

Description

Method and apparatus for detecting moving object from image recorded by unfixed camera

The present disclosure relates to a method and apparatus for detecting a moving object from an image captured by a non-fixed camera. More particularly, the present disclosure relates to such a method and apparatus in which detection efficiency is improved through the use of an attention mask.

Techniques for detecting a moving object from camera images are being actively researched. When the portion of an image that represents a moving object can be distinguished from the stationary background, various applications can be derived from it. Accordingly, moving object detection is becoming increasingly important in fields such as computer vision and video surveillance.

Moving object detection may also be required in environments where the camera photographing the moving object is not fixed. With a non-fixed camera, which itself moves while photographing the moving object, not only the moving object to be detected but also the background moves within the image, unlike with a fixed camera. The process of detecting the moving object against a moving background may therefore be more complicated.

Detecting a moving object from the image of a non-fixed camera may involve identifying the motion of the background and compensating for it. Compensating for the background motion may greatly increase the amount of computation required for moving object detection, so it may be difficult to detect a moving object in real time from the image captured by a non-fixed camera. Accordingly, a technique for reducing the computation of moving object detection may be required so that real-time detection can be secured even with a non-fixed camera.

Korean Registered Patent Publication No. 10-1438451

Various embodiments provide a method and apparatus for detecting a moving object from an image captured by a non-fixed camera. The technical problems to be solved by the present disclosure are not limited to those described above, and other technical problems may be inferred from the following embodiments.

As a means for solving the above technical problem, a method of detecting a moving object from an image captured by a non-fixed camera according to one aspect of the present disclosure may include: dividing each of the frames constituting the image into grid regions for applying a background model; setting an attention mask on the grid regions to determine the resolution at which each of the grid regions is processed; compensating for movement of the background model caused by movement of the non-fixed camera; and selecting the moving object from each of the frames by updating model variables of the motion-compensated background model.

An apparatus for detecting a moving object from an image captured by a non-fixed camera according to another aspect of the present disclosure includes: a memory storing at least one program; and a processor that detects the moving object by executing the at least one program. The processor may divide each of the frames constituting the image into grid regions for applying a background model, set an attention mask on the grid regions to determine the resolution at which each of the grid regions is processed, compensate for movement of the background model caused by movement of the non-fixed camera, and select the moving object from each of the frames by updating model variables of the motion-compensated background model.

With the method and apparatus for detecting a moving object according to the present disclosure, the resolution may be set differently for each part of the image even when the image subject to moving object detection is captured by a non-fixed camera, so the amount of computation required to detect the moving object may be reduced. As the computation is reduced, detection of the moving object from the captured image may proceed without delay, so real-time moving object detection can be secured even with a non-fixed camera.

FIG. 1 is a diagram for explaining a process of detecting a moving object from an image captured by a non-fixed camera according to some embodiments.
FIG. 2 is a flowchart illustrating the steps of a method of detecting a moving object from an image captured by a non-fixed camera according to some embodiments.
FIG. 3 is a diagram for explaining a manner of setting an attention mask according to some embodiments.
FIG. 4 is a diagram for explaining a manner of determining the resolution at which each of the grid regions is processed according to some embodiments.
FIG. 5 is a diagram for explaining the performance of detecting a moving object from an image captured by a non-fixed camera according to some embodiments.
FIG. 6 is a block diagram illustrating the elements of an apparatus for detecting a moving object from an image captured by a non-fixed camera according to some embodiments.

Hereinafter, embodiments, which are provided solely for illustration, will be described in detail with reference to the accompanying drawings. The following description is intended only to embody the embodiments and does not restrict or limit the scope of the invention. Anything that a person skilled in the art can readily infer from the detailed description and the embodiments is to be construed as falling within the scope of rights.

Terms such as 'consists of' or 'includes' used in this specification should not be construed as necessarily including all of the components or steps described in the specification; some of the components or steps may be omitted, or additional components or steps may be further included.

Terms including ordinal numbers such as 'first' or 'second' used in this specification may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

The terms used in this specification have been selected, as far as possible, from general terms currently in wide use in consideration of their functions in the present invention, but they may vary according to the intention of those skilled in the art, precedent, or the emergence of new technologies. In certain cases, terms have been arbitrarily selected by the applicant, and in such cases their meanings will be described in detail in the corresponding part of the description. Therefore, the terms used in the present invention should be defined based on their meanings and the overall content of the present invention, not simply on their names.

The present embodiments relate to a method and apparatus for detecting a moving object from an image captured by a non-fixed camera, and detailed descriptions of matters well known to those of ordinary skill in the art to which the following embodiments belong are omitted.

FIG. 1 is a diagram for explaining a process of detecting a moving object from an image captured by a non-fixed camera according to some embodiments.

Referring to FIG. 1, the process of detecting a moving object from an image captured by a non-fixed camera may consist of video input (110), a detection loop (120), and foreground estimation (130).

Video input (110) may refer to the step of receiving an image captured by a non-fixed camera. A non-fixed camera may mean that not only the moving object being photographed but also the camera photographing it is in motion. For example, the non-fixed camera may be a vehicle dashboard camera (black box). Alternatively, the non-fixed camera may be used in a robot that receives images of its surroundings while moving.

The detection loop (120) may refer to the step of processing the image received through video input (110). The image of the non-fixed camera may be a video having a plurality of frames per second, and the image processing of the detection loop (120) may be performed repeatedly for each of the frames. Each frame may be composed of a specific number of pixels, and the image processing may be performed based on the values that the pixels represent.

The detection loop (120) may consist of homography estimation (121), attention mask estimation (122), background model movement (123), and background model update (124). This is only an example, however, and other general-purpose processes or steps may be further included in the detection loop (120).

Homography estimation (121) may refer to the process of deriving a homography matrix that represents the movement of the non-fixed camera. Once the homography matrix is derived, it can be used to cancel out the movement of the non-fixed camera. When the camera movement is cancelled, the movement of the background is cancelled as well, so that only the movement of the actual moving object remains.

Attention mask estimation (122) may refer to the process of setting an attention mask on the input image received through video input (110). When the attention mask is set, the input image may be separated into parts with different resolutions, so that complex parts are processed at high resolution and simple parts at low resolution, which may greatly reduce the computation of moving object detection. With the computation reduced through the attention mask, moving object detection may be performed in real time on the incoming image.

Background model movement (123) may refer to the process of correcting the movement of the background caused by the movement of the non-fixed camera. As described above, the movement of the camera can be obtained once the homography matrix is derived, and since the background moves in the direction opposite to the camera's movement, the movement of the background can be corrected through the homography matrix. Once the background motion is corrected, only the moving object has motion in the image, so moving object detection can be performed.
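As a rough illustration of how a homography matrix relocates a point of the background model, the sketch below applies a 3x3 matrix to a grid centre in homogeneous coordinates. The function name `apply_homography` and the example matrix (a pure translation) are assumptions made for illustration; the source does not specify how the matrix is estimated or applied.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through a 3x3 homography H (list of rows).

    The point is lifted to homogeneous coordinates (x, y, 1), multiplied
    by H, and divided by the resulting third component.
    """
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w


# Hypothetical homography: the background shifts 3 px right and 2 px up.
H = [
    [1, 0, 3],
    [0, 1, -2],
    [0, 0, 1],
]

# Centre of the first 16x16 grid before and after the shift.
moved_centre = apply_homography(H, 8, 8)
```

A real homography also encodes rotation and perspective terms; the translation case is used here only because its result is easy to check by hand.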

Background model update (124) may refer to the process of updating the model variables of the motion-corrected background model. The background model may be a model of the background in the input image and of the moving object that moves against that background. Once the background model is set, the portion representing the moving object may be determined from the background model through operations on its model variables. In particular, the operations on the model variables may be performed to different degrees according to the resolution given by the attention mask set in attention mask estimation (122), which may increase the efficiency of the operations and reduce the overall computation.

In the detection loop (120), the processes described above may be repeated for every frame. In each frame, the model variables of the background model may be updated using the pixels constituting the frame, and it may thereby be determined which of the pixels represent the moving object.

Foreground estimation (130) may refer to the process of estimating the foreground from the model variables of the background model updated through the detection loop (120). The foreground, as opposed to the background, may represent the moving object. That is, estimating the foreground against the background may mean detecting the moving object among stationary objects.

The background model is a model of the background used to detect a moving object that moves relative to the background, and it may have various model variables. The model variables may be determined by the pixels constituting each frame of the input image. For example, the values of the model variables may be determined by the pixel intensity of each pixel. The pixel intensity may be an indicator of the brightness of the pixel.

The model variables of the background model may include the mean, variance, and age of the pixel intensities. For example, the background model may be a single Gaussian model (SGM) having a particular mean and variance, and the single Gaussian model may have an age indicating how far it has been trained. The single Gaussian model may update its mean and variance by learning from the pixels of every frame, and the age may increase as learning proceeds.

The background model may be a dual-mode single Gaussian model (dual-mode SGM). More specifically, the background model may be a dual-mode SGM including an apparent background model and a candidate background model. When the background model is implemented as a dual-mode SGM, the background is modeled by two SGMs, which may prevent the background from being corrupted by noise or foreground objects.

The background model may be applied in units of grids. A grid may refer to the unit into which the frames of the input image are divided in a lattice pattern. For example, when the frames of the input image consist of 320*240 pixels, 16*16 pixels may form one grid, and the input image may be represented by 20*15 grids. The background model may be applied to each grid of the predetermined size.
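The grid arithmetic in the example above can be checked with a short sketch. It assumes a frame stored as a list of pixel rows whose dimensions are exact multiples of the grid size; `split_into_grids` is a hypothetical helper name, not something defined in the source.

```python
def split_into_grids(frame, grid_size=16):
    """Split a 2-D frame (list of rows) into grid_size x grid_size blocks.

    Returns a dict mapping (grid_row, grid_col) to the flat list of pixel
    intensities in that block. Assumes the frame height and width are
    multiples of grid_size, as in the 320*240 example.
    """
    height, width = len(frame), len(frame[0])
    grids = {}
    for gr in range(height // grid_size):
        for gc in range(width // grid_size):
            block = []
            for y in range(gr * grid_size, (gr + 1) * grid_size):
                block.extend(frame[y][gc * grid_size:(gc + 1) * grid_size])
            grids[(gr, gc)] = block
    return grids


# Synthetic 320*240 frame: 240 rows of 320 pixels, all zero.
frame = [[0] * 320 for _ in range(240)]
grids = split_into_grids(frame)
grid_rows = 1 + max(r for r, _ in grids)
grid_cols = 1 + max(c for _, c in grids)
```

With these inputs the frame yields 20 grid columns and 15 grid rows, each grid holding 256 pixels.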

At a specific frame or time $t$, let $G_i^{(t)}$ denote the group of pixels constituting the $i$-th grid, $|G_i^{(t)}|$ the total number of pixels in that group, and $I_j^{(t)}$ the pixel intensity of the $j$-th pixel at $t$. The mean $\mu_i^{(t)}$, variance $\sigma_i^{(t)}$, and age $\alpha_i^{(t)}$ of the SGM background model applied to the $i$-th grid may then be expressed as in Equation 1.

Under Equation 1, the background model is trained as the frame or time advances from $t-1$ to $t$, and the mean, variance, and age are updated accordingly. The compensated mean, variance, and age that appear in Equation 1 are the model variables of the background model at $t-1$ after the background motion has been compensated. The details of background motion compensation are described in connection with step 230 of FIG. 2. Meanwhile, $M_i^{(t)}$ and $V_i^{(t)}$ may be expressed as in Equation 2.

Equation 2 gives the mean and variance of the pixel intensity computed per grid; $M_i^{(t)}$ and $V_i^{(t)}$ may thus be regarded as model variables of the background model specialized to the grid unit. Because the model variables are applied per grid rather than per pixel, as in Equation 2, errors in learning the background model caused by a few outlier pixels inside a grid may be prevented.

With the model variables of Equations 1 and 2, the input image may be modeled by the background model. Specifically, the pixels constituting the frames of the input image are divided into grids, and the background model may be applied to each grid. In addition, the background model may be implemented as a dual-mode SGM, in which the model defined by the model variables of Equations 1 and 2 is provided in duplicate.
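The per-grid update described for Equations 1 and 2 can be sketched as follows. This is only one plausible reading: the age-weighted running average in `update_sgm` is an assumed form of the update, the population variance in `grid_statistics` is an assumed definition of the per-grid variance, and both function names are hypothetical.

```python
def grid_statistics(pixels):
    """Mean and population variance of the pixel intensities in one grid."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    return mean, var


def update_sgm(prev_mean, prev_var, prev_age, pixels):
    """Update one grid's single Gaussian model from time t-1 to t.

    prev_mean, prev_var and prev_age are the motion-compensated model
    variables at t-1. The age-weighted blend below is an assumption:
    an older model trusts its history more than the new observation.
    """
    m, v = grid_statistics(pixels)
    w_old = prev_age / (prev_age + 1.0)
    w_new = 1.0 / (prev_age + 1.0)
    mean = w_old * prev_mean + w_new * m
    var = w_old * prev_var + w_new * v
    age = prev_age + 1
    return mean, var, age


# A grid whose compensated model has mean 10, variance 4 and age 3
# observes four pixels of intensity 14 in the current frame.
new_mean, new_var, new_age = update_sgm(10.0, 4.0, 3, [14, 14, 14, 14])
```

Under this assumed rule the mean moves a quarter of the way from 10 toward 14, and the age increases by one with each frame learned.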

FIG. 2 is a flowchart illustrating the steps of a method of detecting a moving object from an image captured by a non-fixed camera according to some embodiments.

Referring to FIG. 2, the method of detecting a moving object from an image captured by a non-fixed camera may include steps 210 to 240. The method is not limited thereto, however, and general-purpose steps other than those shown in FIG. 2 may be further included in the method.

The method of FIG. 2 may be performed by an apparatus for detecting a moving object from an image captured by a non-fixed camera. The apparatus performing the method of FIG. 2 may be the apparatus (600) of FIG. 6, described later.

In step 210, the apparatus (600) may divide each of the frames constituting the image captured by the non-fixed camera into grid regions for applying a background model.

In step 220, the apparatus (600) may set an attention mask on the grid regions to determine the resolution at which each of the grid regions is processed.

In setting the attention mask, the apparatus (600) may merge or split at least some of the grid regions based on the variance of the pixels included in each grid region.

In setting the attention mask, when at least some of the grid regions are merged, the apparatus (600) may set the model variables applied to the merged grid region to the average of the model variables of the individual grid regions before merging.

In setting the attention mask, when at least some of the grid regions are split, the apparatus (600) may set the model variables of each of the resulting grid regions to the model variables of the grid region before splitting.
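The merge and split rules just described can be sketched in a few lines. The dictionary layout with keys `mean`, `var`, and `age` and the helper names are assumptions made for illustration.

```python
MODEL_KEYS = ("mean", "var", "age")


def merge_models(models):
    """Merged grid: each model variable is the average over the inputs."""
    n = len(models)
    return {key: sum(m[key] for m in models) / n for key in MODEL_KEYS}


def split_model(model, parts):
    """Split grid: every resulting sub-grid inherits the parent's variables."""
    return [dict(model) for _ in range(parts)]


# Two neighbouring grids are merged into one low-resolution grid.
merged = merge_models([
    {"mean": 10.0, "var": 2.0, "age": 4},
    {"mean": 20.0, "var": 4.0, "age": 6},
])

# One grid is split into four high-resolution sub-grids.
children = split_model({"mean": 12.0, "var": 3.0, "age": 5}, parts=4)
```

Copying on split keeps each sub-grid independent, so later updates to one sub-grid do not alter its siblings.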

In setting the attention mask, the apparatus (600) may set the attention mask along the row direction or the column direction of the grid regions, which are arranged in a matrix.

The step of setting the attention mask is described in more detail with reference to FIGS. 3 and 4.

In step 230, the apparatus (600) may compensate for the movement of the background model caused by the movement of the non-fixed camera.

In compensating for the movement, the apparatus (600) may perform model mixing based on the movement of the grid regions according to the homography matrix and on the areas in which the moved grid regions overlap the grid regions before the movement. Model mixing may be described through Equation 3.

Referring to Equation 3, the compensated mean, variance, and age are the variables used in the model update of Equation 1; they denote the mean, variance, and age, respectively, of the background model at $t-1$ after the background motion has been compensated. The weight $w_k$ in Equation 3 may be described through Equation 4.

Referring to Equation 4, the weight $w_k$ may be proportional to the area in which the grid region $R_i$, moved according to the homography matrix, overlaps the grid region $G_k^{(t)}$ before the movement, and the sum of all the weights may be 1. That is, a region with a large overlap is strongly reflected in the motion-compensated model variables through a high weight, and a region with a small overlap is reflected only weakly. Motion compensation by model mixing may be performed by blending the overlapping parts in this manner.
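The weighting described for Equation 4 can be illustrated with axis-aligned rectangles. The sketch assumes the moved grid region stays an axis-aligned rectangle (as it would under a pure translation) and mixes only the mean; the helper names and the `(x0, y0, x1, y1)` rectangle layout are assumptions.

```python
def overlap_area(a, b):
    """Intersection area of two axis-aligned rectangles (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def mixing_weights(moved_rect, grid_rects):
    """Weights proportional to overlap area, normalised to sum to 1.

    Assumes the moved region overlaps at least one grid.
    """
    areas = [overlap_area(moved_rect, g) for g in grid_rects]
    total = sum(areas)
    return [a / total for a in areas]


def mix(values, weights):
    """Weighted blend of one model variable across overlapping grids."""
    return sum(w * v for w, v in zip(weights, values))


# Two adjacent 16x16 grids and a region shifted 4 px to the right.
grid_rects = [(0, 0, 16, 16), (16, 0, 32, 16)]
weights = mixing_weights((4, 0, 20, 16), grid_rects)
mixed_mean = mix([100.0, 200.0], weights)
```

The shifted region covers three quarters of the first grid and one quarter of the second, so the blended mean sits a quarter of the way from 100 toward 200.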

In step 240, the apparatus (600) may select the moving object from each of the frames by updating the model variables of the motion-compensated background model.

In selecting the moving object, the apparatus (600) may select the pixels representing the moving object based on the pixel intensity of the pixels included in each grid region and on the updated model variables. The specific criterion for selecting the moving object may be described through Equation 5.

Referring to Equation 5, for each grid, pixels whose intensity deviates from the mean by more than a certain level correspond to the moving object and may be regarded as foreground pixels.
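A minimal sketch of such a selection rule follows. It assumes the deviation is measured as a squared difference compared against a multiple of the grid variance; the threshold value 4.0 and the function name are hypothetical and not taken from the source.

```python
def foreground_pixels(pixels, mean, var, threshold=4.0):
    """Flag pixels whose squared deviation from the grid mean is large.

    A pixel is marked as foreground when (intensity - mean)^2 exceeds
    threshold * var. Both the squared-deviation form and the default
    threshold are illustrative assumptions.
    """
    return [(p - mean) ** 2 > threshold * var for p in pixels]


# Grid model with mean 100 and variance 4: only the pixel at 130 stands out.
mask = foreground_pixels([100, 102, 130], mean=100.0, var=4.0)
```

Scaling the cut-off by the variance lets a noisy grid tolerate larger deviations than a flat one before a pixel is treated as part of a moving object.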

Further details of steps 210 to 240 of FIG. 2 and of Equations 1 to 5 may be found in Korean Registered Patent Publication No. 10-1438451.

FIG. 3 is a diagram for explaining a manner of setting an attention mask according to some embodiments.

Referring to FIG. 3, an attention mask (312) for an input image (311), an attention mask (322) for an input image (321), and an attention mask (332) for an input image (331) are shown.

The attention mask may be set based on the variance of the pixels included in each grid region. In setting the attention mask, at least some of the grid regions may be merged or split. Merging or splitting grid regions changes the resolution at which the input image is processed. Complex parts of the input image may be processed at high resolution by splitting grid regions, and simple parts may be processed at low resolution by merging grid regions.

Whether to process at high or low resolution may be decided based on the standard deviation or variance of the pixels. For example, the standard deviation or variance may be calculated for each of the grids constituting a frame of the input image; the higher the value, the higher the processing resolution, and the lower the value, the lower the processing resolution.

The attention mask may be set along the row direction or the column direction of the grid regions, which are arranged in a matrix. The attention masks (312), (322), and (332) of FIG. 3 are all set along the row direction. Specifically, the standard deviation or variance is calculated for each grid in the matrix and summed along each row, which yields a column vector with a single column whose entries are the row-wise sums of the standard deviation or variance. The attention mask may then be set along the row direction based on the values of this column vector.
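The row-direction procedure can be sketched as below. The per-grid standard deviations are summed across each row to form the column vector, and each row is then labelled by comparing its sum with a cut-off. The two-level labelling and the threshold value are assumptions for illustration; the source does not state how the column vector is converted into resolutions.

```python
import math


def grid_std(pixels):
    """Population standard deviation of the pixel intensities in one grid."""
    n = len(pixels)
    mean = sum(pixels) / n
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)


def row_attention_mask(std_grid, threshold):
    """Label each grid row 'high' or 'low' resolution.

    std_grid is a matrix of per-grid standard deviations. Summing each
    row gives one value per row (the column vector); rows whose sum
    reaches the hypothetical threshold are processed at high resolution.
    """
    row_sums = [sum(row) for row in std_grid]
    return ["high" if s >= threshold else "low" for s in row_sums]


# Flat upper rows (for example sky) and a busy bottom row.
std_grid = [
    [1.0, 1.0, 1.0],
    [0.0, 2.0, 1.0],
    [5.0, 6.0, 7.0],
]
mask = row_attention_mask(std_grid, threshold=10.0)
```

Here the two upper rows sum to 3 and are labelled low resolution, while the bottom row sums to 18 and is labelled high resolution.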

Referring to the attention mask 312 for the input image 311, the upper part of the input image 311 is relatively simple, so the standard deviation or variance of its grids is small and it is shown in a dark color indicating low resolution. The lower part of the input image 311 is more complex, so the standard deviation or variance of its grids is large and it is shown in a bright color indicating high resolution. The attention mask 322 for the input image 321 and the attention mask 332 for the input image 331 are likewise set along the row direction in the same manner.

Although FIG. 3 illustrates the attention mask as being set along the row direction, the disclosure is not limited thereto. The attention mask may be set along the column direction according to the corresponding principle, or according to any other suitable scheme.

With the attention mask set, the resolution at which the input image is processed is not fixed uniformly but can vary with local complexity. Portions that do not need high-resolution processing are set to low resolution and processed more quickly, so the overall amount of computation for moving object detection can be significantly reduced.

FIG. 4 is a diagram for describing a method of determining the resolution at which each of the grid regions is processed, according to some embodiments.

FIG. 4 shows examples in which, depending on the attention mask setting, a grid region is split to increase resolution or grid regions are merged to decrease resolution.

When the standard deviation or variance of the grid region 411 is relatively large, the grid region 411 may be split into the grid regions 412 or the grid regions 413. In the opposite case, the grid regions 412 or the grid regions 413 may be merged into the grid region 411.

When the grid region 411 is split into the four grid regions 412, four independent background models may be applied to the four grid regions 412, one each. In this case, the values of the model variables of the background model that was applied to the grid region 411 before the split may be copied as-is into the model variables of the four background models. The same approach applies when the grid region 411 is split into the 16 grid regions 413. After the split, the background models applied to the separate grid regions operate independently, so the values of their model variables may diverge.
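A minimal sketch of this split step is shown below. The `GridModel` class and `split_grid` function are hypothetical names introduced for illustration; the three fields follow the model variables named in the claims (mean, variance, and age of pixel intensity).

```python
from dataclasses import dataclass, replace


@dataclass
class GridModel:
    """Hypothetical container for the model variables of one grid region."""
    mean: float
    variance: float
    age: float


def split_grid(parent, parts=4):
    """Give each child region an independent copy of the parent's model variables."""
    return [replace(parent) for _ in range(parts)]


parent = GridModel(mean=100.0, variance=15.0, age=8.0)
children = split_grid(parent, parts=4)

# After the split, each child is updated independently.
children[0].variance = 20.0
```

Because every child is a separate copy, updating one child leaves the parent and the other children unchanged, which matches the independent operation described after the split.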

When the four grid regions 412 are merged into the grid region 411, the average of the model variables of the background models that were applied to the four grid regions 412 may be applied to the merged grid region 411. For example, if the variances σ₁^(t), σ₂^(t), σ₃^(t), and σ₄^(t), which are model variables of the background models applied to the four grid regions 412, are 10, 10, 10, and 30, respectively, the variance of the background model applied to the merged grid region 411 may be (10 + 10 + 10 + 30) / 4 = 15. The same approach applies when the 16 grid regions 413 are merged into the grid region 411.
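The merge rule can be sketched as a plain arithmetic mean over each model variable. The `GridModel` class and `merge_grids` function are hypothetical names used only for this example, and the mean and age values are made up; only the variances 10, 10, 10, and 30 come from the text.

```python
from dataclasses import dataclass


@dataclass
class GridModel:
    """Hypothetical container for the model variables of one grid region."""
    mean: float
    variance: float
    age: float


def merge_grids(children):
    """Average each model variable over the regions being merged."""
    n = len(children)
    return GridModel(
        mean=sum(c.mean for c in children) / n,
        variance=sum(c.variance for c in children) / n,
        age=sum(c.age for c in children) / n,
    )


children = [GridModel(mean=100.0, variance=v, age=8.0)
            for v in (10.0, 10.0, 10.0, 30.0)]
merged = merge_grids(children)
```

With these inputs the merged variance is 15, which reproduces the worked example in the text.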

Meanwhile, as in the case where the grid region 421 is split into the grid regions 422 or the grid regions 423, or merged in the reverse direction, the pixels of some grid regions may be undefined (NaN). In that case, merging and splitting may be performed with the undefined grid regions excluded.
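One way to exclude undefined regions during a merge is sketched below. The text only states that undefined regions are excluded; averaging over the remaining regions and returning NaN when nothing is defined are assumptions of this example, and `merge_defined` is a hypothetical name.

```python
import math


def merge_defined(values):
    """Average one model variable over child regions, skipping undefined (NaN) ones."""
    defined = [v for v in values if not math.isnan(v)]
    if not defined:
        # Assumption: a merge of only undefined regions stays undefined.
        return math.nan
    return sum(defined) / len(defined)


partial = merge_defined([10.0, math.nan, 20.0, math.nan])
empty = merge_defined([math.nan, math.nan])
```

Skipping NaN entries before averaging keeps a single undefined child from turning the whole merged value into NaN.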

FIG. 5 is a diagram for describing the performance of detecting a moving object from an image captured by a non-fixed camera, according to some embodiments.

FIG. 5 shows, for an input image (a), a detection result (b) according to the prior art and a detection result (c) according to the present disclosure.

Referring to (b) and (c) of FIG. 5, the method according to the present disclosure can detect the moving object in the input image (a) with the same level of accuracy as the prior art. At the same time, the detection result (c) according to the present disclosure was obtained 6 fps (frames per second) faster than the detection result (b) according to the prior art, which corresponds to a speedup of about 2x over the prior art.

The method of detecting a moving object from an image captured by a non-fixed camera according to the present disclosure sets an attention mask so that the resolution can differ from one grid region to another. By lowering the resolution of simple portions, it can significantly reduce the amount of computation required for moving object detection. As the amount of computation decreases, the number of frames per second that can be processed increases, which helps secure real-time moving object detection.

FIG. 6 is a block diagram illustrating the elements constituting an apparatus for detecting a moving object from an image captured by a non-fixed camera, according to some embodiments.

Referring to FIG. 6, an apparatus 600 for detecting a moving object from an image captured by a non-fixed camera may include a memory 610 and a processor 620. However, the apparatus is not limited thereto, and other general-purpose elements may be included in the apparatus 600 in addition to the elements illustrated in FIG. 6.

The steps constituting the method of detecting a moving object from an image captured by a non-fixed camera, described with reference to FIGS. 1 to 5, may be processed in time series in the apparatus 600 of FIG. 6. Accordingly, even where details are omitted below for the apparatus 600 of FIG. 6, the description given above for the method of FIGS. 1 to 5 also applies to the apparatus 600 of FIG. 6.

The memory 610 may store data processed and to be processed by the apparatus 600, such as the input image or the model variables of the background model. The memory 610 may store at least one program, and the at least one program stored in the memory 610 may be executed by the processor 620.

The memory 610 may be implemented as dynamic random access memory (DRAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), CD-ROM, Blu-ray or other optical disc storage, a hard disk drive (HDD), a solid state drive (SSD), or flash memory, but is not limited thereto.

The processor 620 may be implemented as an array of multiple logic gates, or as a combination of a general-purpose microprocessor and a memory storing a program executable by the microprocessor. The processor 620 may also be composed of a plurality of processing elements.

By executing the at least one program stored in the memory 610, the processor 620 may perform the steps constituting the method of detecting a moving object from an image captured by a non-fixed camera.

The processor 620 may divide each of the frames constituting the image captured by the non-fixed camera into grid regions for applying a background model.

The processor 620 may set an attention mask on the grid regions to determine the resolution at which each of the grid regions is processed.

The processor 620 may compensate for the movement of the background model according to the movement of the non-fixed camera.

The processor 620 may select the moving object from each of the frames by updating the model variables of the motion-compensated background model.

Meanwhile, the method of detecting a moving object from an image captured by a non-fixed camera according to the present disclosure may be recorded on a computer-readable recording medium on which one or more programs including instructions for executing the method are recorded.

Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described in detail above, the scope of the present invention is not limited thereto; various modifications and improvements made by those skilled in the art using the basic concept of the present invention defined in the following claims also fall within the scope of the present invention.

600: apparatus for detecting a moving object from an image captured by a non-fixed camera
610: memory
620: processor

Claims

A method for detecting a moving object from an image captured by a non-fixed camera, the method comprising:
dividing each of the frames constituting the image into grid regions for applying a background model;
setting an attention mask on the grid regions to determine the resolution at which each of the grid regions is processed;
compensating for the movement of the background model according to the movement of the non-fixed camera; and
selecting the moving object from each of the frames by updating model variables of the motion-compensated background model,
wherein the setting of the attention mask comprises
merging or splitting at least some of the grid regions based on the variance of pixels included in each of the grid regions.

(Deleted)

The method of claim 1,
wherein the setting of the attention mask comprises,
when the at least some grid regions are merged, setting the model variables applied to the merged grid region to the average of the model variables of the grid regions before merging.

The method of claim 1,
wherein the setting of the attention mask comprises,
when the at least some grid regions are split, setting the model variables of each of the split grid regions to the model variables of the grid region before splitting.

The method of claim 1,
wherein the setting of the attention mask comprises
setting the attention mask along a row direction or a column direction of the grid regions arranged in a matrix.

The method of claim 1,
wherein the compensating for the movement comprises
performing model mixing based on the movement of the grid regions according to a homography matrix and on the regions in which the moved grid regions overlap the grid regions before the movement.

The method of claim 1,
wherein the selecting of the moving object comprises
selecting pixels representing the moving object based on the pixel intensity of the pixels included in each of the grid regions and on the updated model variables.

The method of claim 1,
wherein the model variables include a mean, a variance, and an age for the pixel intensity of the pixels included in each of the grid regions.

The method of claim 1,
wherein the background model is implemented as a dual-mode single Gaussian model (SGM) comprising an apparent background model and a candidate background model.

A computer-readable recording medium on which a program for implementing a method for detecting a moving object from an image captured by a non-fixed camera is recorded,
the method comprising:
dividing each of the frames constituting the image into grid regions for applying a background model;
setting an attention mask on the grid regions to determine the resolution at which each of the grid regions is processed;
compensating for the movement of the background model according to the movement of the non-fixed camera; and
selecting the moving object from each of the frames by updating model variables of the motion-compensated background model,
wherein the setting of the attention mask comprises
merging or splitting at least some of the grid regions based on the variance of pixels included in each of the grid regions.

An apparatus for detecting a moving object from an image captured by a non-fixed camera, the apparatus comprising:
a memory storing at least one program; and
a processor configured to detect the moving object by executing the at least one program,
wherein the processor is configured to:
divide each of the frames constituting the image into grid regions for applying a background model,
set an attention mask on the grid regions to determine the resolution at which each of the grid regions is processed,
compensate for the movement of the background model according to the movement of the non-fixed camera, and
select the moving object from each of the frames by updating model variables of the motion-compensated background model,
and wherein the processor is further configured to
set the attention mask by merging or splitting at least some of the grid regions based on the variance of pixels included in each of the grid regions.