KR100220838B1

KR100220838B1 - Apparatus and method of target tracking in image telephone

Info

Publication number: KR100220838B1
Application number: KR1019960054440A
Authority: KR
Inventors: 정성학
Original assignee: 전주범; 대우전자주식회사
Priority date: 1996-11-15
Filing date: 1996-11-15
Publication date: 1999-09-15
Also published as: KR19980035972A

Abstract

The present invention relates to a target tracking method and apparatus for a videophone. To track a target, it uses quadtree splitting and merging to set an effective correlation region of arbitrary shape, computes the correlation, and moves the camera accordingly.

To track the talker, the invention sets a correlation region of arbitrary shape using quadtree splitting and merging, computes the correlation over that region, and moves the camera accordingly.

Because the correlation is estimated over a correlation region formed by quadtree splitting and merging rather than over the conventional square window, the invention improves target-tracking performance while reducing the amount of computation.

Description

Target tracking method and apparatus for a videophone

Techniques for tracking an object, that is, a target, generally fall into two groups: centroid (center-point) tracking and correlation tracking.

As shown in FIG. 1A, centroid tracking separates the moving object from the background and then tracks the centroid (A) of the extracted object. A threshold is used to separate the moving object, that is, the target, from the background: the image is binarized into object and background using the threshold.
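The thresholding and centroid steps described above can be sketched as follows. This is a minimal illustration, not the prior-art implementation: the function name `centroid`, the assumption that the object is brighter than the background, and the strict `>` comparison are all choices made here for the example.

```python
def centroid(image, threshold):
    """Binarize `image` (a list of rows of brightness values) and return the
    (x, y) centroid of the pixels brighter than `threshold`.

    Returns None when no pixel exceeds the threshold.
    Assumption: the object is brighter than the background.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > threshold:  # pixel classified as "object"
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


frame = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
print(centroid(frame, 128))  # (1.5, 1.5)
```

A single noisy pixel above the threshold shifts the centroid directly, which is one way to see why this technique is sensitive to noise.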

However, centroid tracking has the disadvantage of being highly sensitive to noise.

When the image is relatively simple, so that it is easy to segment and places few constraints on the speed of objects that can be tracked, tracking is stable and noise has little effect. Conversely, when the image is relatively complex, so that it is hard to segment and places many constraints on trackable object speed, tracking becomes unstable and noise increases.

Correlation tracking, shown in FIG. 1B, defines a region (B) of suitable size at the position of the moving object, that is, the target, in the previous image. It computes the correlation between region (B) and the search region of the current image and assumes that the object has moved to the region (B') with the highest correlation.

In other words, given the position of the moving object in the n-th image, correlation tracking defines a fixed-size window containing the object, called the correlation region. It then computes the correlation at every position in the search region of the (n+1)-th image and takes the position with the highest correlation as the object's position in the (n+1)-th image.

The initial window, that is, the region used when computing the correlation between the current and previous frames, is usually a square window.

Because correlation tracking computes the correlation directly from the current frame without segmenting the image, it maintains tracking performance even on relatively complex images, at the cost of more computation.

Unlike centroid tracking, correlation tracking does not binarize the image but uses its brightness information, so a reasonable level of tracking performance can be expected even when background clutter makes segmentation impossible. However, estimating the object's motion requires computing the correlation for every candidate placement of the correlation region within the search region, which makes the computation expensive.

To remedy these disadvantages, an object of the present invention is to provide a target tracking method and apparatus for a videophone that tracks the talker by using quadtree splitting and merging to set an effective correlation region of arbitrary shape, computing the correlation, and moving the camera, thereby improving tracking performance while reducing the amount of computation.

FIG. 1A illustrates the conventional centroid tracking technique.

FIG. 1B illustrates the conventional correlation tracking technique.

FIG. 2 is a flowchart of the target tracking method according to the present invention.

FIG. 3 illustrates the quadtree splitting step of FIG. 2.

FIG. 4 illustrates the merging step of FIG. 2.

FIG. 5 illustrates the correlation calculation step of FIG. 2.

FIG. 6 is a block diagram of the target tracking apparatus according to the present invention.

* Description of reference numerals for the main parts of the drawings

300: camera
400: motor unit
410: motor driver
420: motor
500: control unit
600: image storage unit
610: A/D converter
620, 630: memories
700: correlation region determination unit
710: quadtree splitting unit
720: merging unit
730: correlation region extraction unit
800: correlation calculation unit
900: motion estimation unit

To achieve the above object, the target tracking method for a videophone according to the present invention comprises: a correlation region setting step of quadtree-splitting and merging the pixel brightness values within an initial window centered on the target in the initial image input in tracking mode, thereby determining a correlation region; a search region extraction step of extracting a search region from the input current image; a motion estimation and motor driving step of estimating motion from the correlation between the determined correlation region and the extracted search region and moving the camera; and a call termination step of repeating the search region extraction step and the motion estimation and motor driving step on incoming images until the call ends.

The target tracking apparatus for a videophone according to the present invention comprises: image storage means for A/D-converting and storing images input in tracking mode; correlation region determination means for quadtree-splitting and merging the pixel brightness values within an initial window centered on the target in the initial image stored in the image storage means, thereby determining a correlation region; correlation calculation means for extracting a search region from the current image output by the image storage means according to the correlation region determined by the correlation region determination means, and calculating the correlation; motion estimation means for estimating motion from the correlation calculated by the correlation calculation means; and control means for controlling the operation of the image storage means and moving the camera according to the motion estimated by the motion estimation means.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

As shown in FIG. 2, the target tracking method for a videophone according to the present invention consists of a correlation region setting step (100, 101, 102, 103, 104, 105), a search region extraction step (106, 107), a motion estimation and motor driving step (108, 109, 110), and a call termination step (111, 112).

The correlation region setting step (100, 101, 102, 103, 104, 105) determines the correlation region by quadtree-splitting and merging the pixel brightness values within an initial window centered on the target in the initial image input in tracking mode. It consists of an initial image storage step (100, 101) of A/D (analog-to-digital) converting and storing the initial image input in tracking mode; an initial window setting step (102) of setting an initial window at the target's position in the stored initial image; a quadtree splitting step (103) of quadtree-splitting the initial window according to pixel brightness using a homogeneity measure; a merging step (104) of merging the quadtree square containing the target with adjacent squares according to their variance; and a correlation region determination step (105) of taking the area covered by the merged squares as the correlation region.

Here, the initial image is the image captured by the camera when the target is at a preset position.

The search region extraction step (106, 107) extracts a search region from the input current image.

The search region consists of the correlation region extracted from the previous image, extended by 8 pixels upward, downward, leftward, and rightward in the current image.

The motion estimation and motor driving step (108, 109, 110) estimates motion from the correlation between the determined correlation region and the extracted search region and moves the camera.

The correlation function used to calculate the correlation is the MAE (Mean Absolute Error).

The call termination step (111, 112) repeats the search region extraction step and the motion estimation and motor driving step on incoming images until the call ends.

The target tracking method performed in this way is described in detail below with reference to the accompanying drawings.

First, the correlation region setting step (100, 101, 102, 103, 104, 105) is performed to determine the correlation region used for tracking, as follows.

When a call starts, the user may not want to use the tracking function in some situations, so the user first decides whether to use it (100). Tracking and non-tracking modes can be distinguished simply with a switch: tracking is performed when the switch is ON and not performed when it is OFF.

When the switch is turned on and tracking mode is entered, the initial image storage step (100, 101) A/D-converts and stores the initial image input from the camera.

The NTSC signal from the camera is output by the A/D converter as an 8-bit digital signal; an A/D converter with more bits may also be used. Because this digital signal is a video signal, it is stored in memory as a digital image I(x, y), a two-dimensional matrix whose values lie within a specified range.

This digital image is 64 x 64 pixels in size and is stored in memory for use in target tracking.

Correlation tracking assumes that the initial position is known from the initial target locking process, that is, the position from which the caller (the target) can press the dial to place a call. The initial image can therefore be captured by the camera while the target is at this preset position for the call.

From the initial image captured in this way, the initial window setting step (102) sets a square window of 16 x 16 pixels containing the caller's face as the initial window.

That is, a 16 x 16 pixel initial window containing the caller's face is set within the 64 x 64 pixel initial image. The size of the initial window may be chosen differently.
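Setting the initial window can be illustrated with a small sketch. The helper name `crop_window`, the idea of specifying the window by the target's center coordinates, and the clamping at the image border are assumptions made for this example; the source only states that a 16 x 16 window containing the caller's face is taken from the 64 x 64 image.

```python
def crop_window(image, center_x, center_y, size=16):
    """Return (left, top, window) for a size x size window around a center.

    The window is shifted as needed so that it stays inside the image
    (an assumption; the source does not discuss border handling).
    """
    height, width = len(image), len(image[0])
    left = min(max(center_x - size // 2, 0), width - size)
    top = min(max(center_y - size // 2, 0), height - size)
    window = [row[left:left + size] for row in image[top:top + size]]
    return left, top, window


# A blank 64 x 64 image standing in for the digitized initial frame.
image = [[0] * 64 for _ in range(64)]
left, top, window = crop_window(image, 32, 32)
print(left, top, len(window), len(window[0]))  # 24 24 16 16
```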

Once the initial window is set, the quadtree splitting step (103) quadtree-splits the initial window according to pixel brightness using a homogeneity measure.

That is, the initial window is split by the quadtree technique according to the homogeneity measure, which uses the variance shown in (Equation 1) below.

$$\upsilon = \frac{1}{N} \sum_{x} \sum_{y} \left| I(x, y) - M \right| \qquad \text{(Equation 1)}$$

In (Equation 1), N is the number of pixels in the window, M is the mean brightness of the pixels inside the window, and I(x, y) is the brightness of each pixel.

The smaller the variance value υ computed by (Equation 1), the closer the pixels in the window are to the mean brightness, so the brightness distribution can be regarded as uniform.
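As a rough illustration, the measure in Equation 1 can be computed as below. The text calls this quantity a variance, but the formula as printed averages absolute deviations from the mean, and the sketch follows the printed formula. The function name `homogeneity` and the flat-list input are assumptions for the example.

```python
def homogeneity(pixels):
    """Equation 1 as printed: mean absolute deviation from the mean brightness.

    `pixels` is a flat, non-empty list of brightness values from one window.
    """
    n = len(pixels)                  # N: number of pixels in the window
    mean = sum(pixels) / n           # M: mean brightness
    return sum(abs(p - mean) for p in pixels) / n


print(homogeneity([5, 5, 5, 5]))     # 0.0 for a perfectly uniform window
print(homogeneity([0, 0, 10, 10]))   # 5.0 for a two-level window
```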

Therefore, the variance value υ computed by the homogeneity measure is compared with a preset threshold: the window is split if υ is greater than the threshold and left unsplit otherwise, and this process is repeated.

As shown in FIG. 3, υ is computed by (Equation 1) for the window under consideration (C). If υ is greater than the threshold, window C is quadtree-split (D). If υ is less than the threshold, the brightness within window C is judged to be uniform and the window is not split (E).

The process is then repeated with each of the squares produced by the split (D) taken in turn as the window under consideration, splitting it according to the υ computed by (Equation 1).
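The recursive splitting described above might be sketched as follows. This is an illustrative sketch under stated assumptions: the names `measure` and `quadtree_split`, the `(x, y, size)` representation of a square, the `min_size` stopping rule, and the decision not to split when υ equals the threshold are all choices made here rather than details given in the source.

```python
def measure(image, x, y, size):
    """Equation 1 over the size x size square whose top-left corner is (x, y)."""
    values = [image[y + j][x + i] for j in range(size) for i in range(size)]
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def quadtree_split(image, x, y, size, threshold, min_size=1):
    """Return the leaf squares (x, y, size) of the quadtree split.

    A square is split into four equal quadrants while its measure exceeds
    `threshold` and it is larger than `min_size` (hypothetical parameter).
    """
    if size <= min_size or measure(image, x, y, size) <= threshold:
        return [(x, y, size)]        # uniform enough: keep as a leaf
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_split(image, x + dx, y + dy, half,
                                     threshold, min_size)
    return leaves


# 4 x 4 test window: dark left half, bright right half.
window = [[0, 0, 100, 100] for _ in range(4)]
print(quadtree_split(window, 0, 0, 4, threshold=10))
# [(0, 0, 2), (2, 0, 2), (0, 2, 2), (2, 2, 2)]
```

The window side is assumed to be a power of two (as with 16 x 16 or 8 x 8), so halving never produces uneven quadrants.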

After the quadtree splitting step (103), the merging step (104) merges the quadtree square containing the target with adjacent squares according to their variance.

Suppose the window has been quadtree-split as shown in FIG. 4 and the talker is currently located in square F. The squares adjacent to F are then the seven squares Q1, Q2, Q3, Q4, Q5, Q6, and Q7, and merging is attempted between F and each of these seven squares.

For merging, the variance of square F together with each adjacent square is computed by (Equation 1), and the two squares are joined if the computed υ is less than or equal to the threshold. The summation Σ is taken over the combined area of the two squares.

For example, if computing υ for square F paired with each of the adjacent squares Q1 through Q7 in FIG. 4 gives values at or below the threshold for the four squares Q1, Q3, Q5, and Q7, those four are merged with F.
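The merging rule can be sketched as below, pooling the pixels of the target square with each neighbor and applying Equation 1 to the pooled set. The helper names, the `(x, y, size)` square representation, and the choice to count squares that touch only at a corner as adjacent are assumptions for this example; the source does not define adjacency precisely.

```python
def pixels_of(image, square):
    """Flat list of brightness values inside a square (x, y, size)."""
    x, y, size = square
    return [image[y + j][x + i] for j in range(size) for i in range(size)]


def mean_abs_dev(values):
    """Equation 1 applied to a flat list of brightness values."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def touches(a, b):
    """True if two distinct, non-overlapping squares share an edge or corner."""
    ax, ay, asz = a
    bx, by, bsz = b
    return (a != b and ax <= bx + bsz and bx <= ax + asz
            and ay <= by + bsz and by <= ay + asz)


def merge_region(image, leaves, target, threshold):
    """Return the target square plus every adjacent leaf whose pooled
    measure with the target is at or below `threshold`."""
    region = [target]
    for square in leaves:
        if touches(target, square):
            pooled = pixels_of(image, target) + pixels_of(image, square)
            if mean_abs_dev(pooled) <= threshold:
                region.append(square)
    return region


window = [[0, 0, 100, 100] for _ in range(4)]
leaves = [(0, 0, 2), (2, 0, 2), (0, 2, 2), (2, 2, 2)]
print(merge_region(window, leaves, target=(0, 0, 2), threshold=10))
# [(0, 0, 2), (0, 2, 2)]
```

Each neighbor is tested against the target square alone, as in the description; the region is not grown iteratively from already-merged squares.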

Once the squares are merged, the correlation region determination step (105) takes the area covered by the merged squares as the correlation region.

That is, the merged area of FIG. 4, consisting of squares F, Q1, Q3, Q5, and Q7, is taken as the correlation region, as shown in FIG. 5(a) (105).

After the correlation region is determined, the search region extraction step (106, 107) extracts a search region from the input current image.

The input current image is A/D-converted and stored (106). Then, as shown in FIG. 5(b), the search region is extracted as the correlation region from the previous image extended by 8 pixels upward, downward, leftward, and rightward in the current image, that is, a 32 x 32 pixel area centered on the 16 x 16 pixel initial window.
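The arithmetic behind the 32 x 32 search region (16 + 8 + 8 in each dimension) can be made concrete with a short sketch. The function name `search_bounds` and the clipping to the 64 x 64 image are assumptions for illustration; the source does not say how the search region is handled at the image border.

```python
def search_bounds(left, top, size=16, margin=8, width=64, height=64):
    """Return (x0, y0, x1, y1) of the search region, with exclusive upper
    bounds, for a window at (left, top) extended by `margin` on every side
    and clipped to the image (clipping is an assumption)."""
    x0 = max(left - margin, 0)
    y0 = max(top - margin, 0)
    x1 = min(left + size + margin, width)
    y1 = min(top + size + margin, height)
    return x0, y0, x1, y1


x0, y0, x1, y1 = search_bounds(24, 24)
print(x1 - x0, y1 - y0)  # 32 32
```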

After the search region is extracted as in FIG. 5(b), the motion estimation and motor driving step (108, 109, 110) is performed.

First, the correlation between the determined correlation region and the extracted search region is calculated (108).

Calculating the correlation between the correlation region of the initial image and the search region of the current image requires choosing a correlation function.

Candidate correlation functions are the NCCF (Normalized Cross Correlation Function), the MSE (Mean Square Error), and the MAE (Mean Absolute Error), given in (Equation 2), (Equation 3), and (Equation 4) below.

$$\mathrm{NCCF}(p, q) = \frac{\sum I_n(i, j)\, I_{n+1}(i+p,\, j+q)}{\left(\sum I_n^2(i, j)\right)^{1/2} \left(\sum I_{n+1}^2(i+p,\, j+q)\right)^{1/2}} \qquad \text{(Equation 2)}$$

$$\mathrm{MSE}(p, q) = E\left(\left[I_n(i, j) - I_{n+1}(i+p,\, j+q)\right]^2\right) \qquad \text{(Equation 3)}$$

$$\mathrm{MAE}(p, q) = E\left(\left|I_n(i, j) - I_{n+1}(i+p,\, j+q)\right|\right) \qquad \text{(Equation 4)}$$

Here E(·) denotes the mean.
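For comparison, the three correlation functions can be written for two equally sized lists of brightness values, where `a` holds the correlation-region pixels of image n and `b` the correspondingly displaced pixels of image n+1. The function names and the flat-list interface are assumptions for this sketch, and `nccf` assumes neither list is all zeros.

```python
import math


def nccf(a, b):
    """Normalized cross correlation (Equation 2); higher means more similar."""
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = (math.sqrt(sum(x * x for x in a))
                   * math.sqrt(sum(y * y for y in b)))
    return numerator / denominator


def mse(a, b):
    """Mean square error (Equation 3); lower means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mae(a, b):
    """Mean absolute error (Equation 4); lower means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


a = [1, 2, 3]
b = [2, 4, 6]            # same pattern at twice the brightness
print(mae(a, b))         # 2.0
print(round(nccf(a, b), 6))
```

The NCCF needs multiplications and two square roots per candidate position, whereas the MAE needs only subtractions and absolute values, which illustrates the computational trade-off between them.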

The NCCF is the most suitable correlation function, but the MAE is used to limit the amount of computation.

The MAE is smaller the more strongly the images are correlated. When the MAE is used as the correlation function, the position with the lowest value is therefore the estimated position of the moving object, that is, the target, in the next image.
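A minimal sketch of the MAE search over displacements (p, q) follows. It represents the arbitrarily shaped correlation region as a list of (i, j) pixel coordinates, treats i as the column and j as the row, and skips displacements that would leave the image; these representation choices and the function names are assumptions for the example, not details from the source.

```python
def mae_at(prev, curr, region_pixels, p, q):
    """Equation 4 for one displacement (p, q), evaluated only over the
    pixels (i, j) that belong to the correlation region."""
    total = 0
    for i, j in region_pixels:
        total += abs(prev[j][i] - curr[j + q][i + p])
    return total / len(region_pixels)


def best_displacement(prev, curr, region_pixels, margin=8):
    """Scan displacements within +/- margin and return (mae, p, q) for the
    lowest MAE. Ties keep the first candidate in scan order."""
    height, width = len(curr), len(curr[0])
    best = None
    for q in range(-margin, margin + 1):
        for p in range(-margin, margin + 1):
            inside = all(0 <= i + p < width and 0 <= j + q < height
                         for i, j in region_pixels)
            if not inside:
                continue
            score = mae_at(prev, curr, region_pixels, p, q)
            if best is None or score < best[0]:
                best = (score, p, q)
    return best


def make_frame(blob_x, blob_y, size=12):
    """Test frame: black background with a bright 2 x 2 blob."""
    frame = [[0] * size for _ in range(size)]
    for j in range(blob_y, blob_y + 2):
        for i in range(blob_x, blob_x + 2):
            frame[j][i] = 100
    return frame


prev = make_frame(4, 4)
curr = make_frame(6, 5)                       # blob moved by (+2, +1)
region = [(4, 4), (5, 4), (4, 5), (5, 5)]     # pixels of the blob in prev
print(best_displacement(prev, curr, region, margin=3))  # (0.0, 2, 1)
```

Because only the pixels in the merged region are summed, a region smaller than the full square window reduces the work per candidate displacement, which is the saving the method aims for.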

Using this correlation function, the correlation between the search region of the current image and the correlation region of the previous (initial) image is calculated. The part with the highest correlation can be regarded as the talker, that is, the target, within the search region, so the motion is computed as the difference between the previous and current positions.

That is, the movement of the talker's face is estimated as the difference between the position of the correlation region in the previous (initial) image and the position of the best-correlated part within the search region (109).

The motor is then driven according to the estimated motion, that is, the talker's movement, to move the camera (110).

Next, a check is made for a new image from the camera. If one has arrived, the image storage step (106) A/D-converts and stores it.

The video signal of one frame is stored in one memory and then shifted into the next memory, where it serves as the previous image. The two memories thus hold the current image and the previous image, respectively.
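The two-memory arrangement behaves like a two-stage shift register for frames. The sketch below models it in software; the class name `FrameStore` and its attribute names are hypothetical and stand in for the hardware memories.

```python
class FrameStore:
    """Software stand-in for two frame memories: `current` holds the newest
    frame and `previous` holds the one shifted out of `current`."""

    def __init__(self):
        self.current = None
        self.previous = None

    def push(self, frame):
        """Store a newly digitized frame, shifting the old one along."""
        self.previous = self.current
        self.current = frame


store = FrameStore()
store.push("frame 0")
store.push("frame 1")
print(store.previous, "|", store.current)  # frame 0 | frame 1
```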

Once the previous and current images are held in the memories, the search region is extracted (107), and the correlation is calculated and the motion estimated as above (108, 109).

At this point, the best-correlated part found in the motion estimation step (109) may be extracted and set as the new correlation region.

If no new image has been input from the camera, the call termination step checks whether the call has ended and, if so, ends the tracking function.

That is, when a new image is input, the process returns to the image storage step (106) and repeats the above; when no new image is input and the call has ended, tracking finishes (111, 112).

If the call has not ended but no new image has been input, the process waits until a new image arrives.
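The overall control flow of FIG. 2 can be summarized in a short sketch. Everything about the interface is an assumption made for illustration: frames arrive from an iterable, a `None` entry stands for "no new frame yet, call still active", the end of the iterable stands for the end of the call, and `estimate_motion` and `move_camera` are hypothetical callables standing in for steps 107 to 109 and step 110. Setting up the correlation region (steps 102 to 105) is left inside `estimate_motion` for brevity.

```python
def run_tracking(frames, tracking_on, estimate_motion, move_camera):
    """Drive the tracking loop and return the number of camera moves."""
    if not tracking_on:                  # step 100: tracking switch is OFF
        return 0
    frames = iter(frames)
    previous = next(frames, None)        # step 101: store the initial image
    if previous is None:
        return 0
    moves = 0
    for current in frames:               # loop ends when the call ends
        if current is None:              # no new frame yet: keep waiting
            continue
        dx, dy = estimate_motion(previous, current)   # steps 107 to 109
        move_camera(dx, dy)                           # step 110
        previous = current               # current frame becomes previous
        moves += 1
    return moves


# Toy run: "frames" are just numbers and motion is their difference.
log = []
count = run_tracking(
    frames=[1, 3, None, 6],
    tracking_on=True,
    estimate_motion=lambda prev, curr: (curr - prev, 0),
    move_camera=lambda dx, dy: log.append((dx, dy)),
)
print(count, log)  # 2 [(2, 0), (3, 0)]
```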

Next, as shown in FIG. 6, the target tracking apparatus for a videophone according to the present invention comprises an image storage unit (600), a correlation region determination unit (700), a correlation calculation unit (800), a motion estimation unit (900), a motor unit (400), and a control unit (500).

The image storage unit (600) A/D-converts and stores images input in tracking mode. It comprises an A/D converter (610) that A/D-converts the input image, a first memory (620) that stores the current video signal output by the A/D converter (610), and a second memory (630) that stores the video signal output by the first memory (620).

The correlation region determination unit (700) determines the correlation region by quadtree-splitting and merging the pixel brightness values within an initial window centered on the target in the initial image stored in the image storage unit (600). It comprises a quadtree splitting unit (710) that sets an initial window at the target's position in the initial image stored in the second memory (630) of the image storage unit (600) and quadtree-splits it according to pixel brightness using a homogeneity measure; a merging unit (720) that merges, according to variance, the square containing the target with the adjacent squares received from the quadtree splitting unit (710); and a correlation region extraction unit (730) that extracts the area covered by the merged squares output by the merging unit (720) as the correlation region.

The correlation calculation unit (800) extracts a search region from the current image output by the image storage unit (600) according to the correlation region determined by the correlation region determination unit (700), and calculates the correlation.

Here, the search region consists of the correlation region extracted from the previous image extended by 8 pixels upward, downward, leftward, and rightward in the current image, and the correlation function is the MAE (Mean Absolute Error).

The motion estimation unit (900) estimates motion from the correlation calculated by the correlation calculation unit (800).

The control unit (500) controls the operation of the image storage unit (600) and moves the camera according to the motion estimated by the motion estimation unit (900).

The motor unit (400) comprises a motor driver (410) that drives the motor under the control of the control unit (500), and a motor (420) that, driven by the motor driver (410), moves the camera (300).

이와 같이 구성되는 본 발명에 의한 화상 전화기의 목표물 추적 장치의 동작을 설명한다.The operation of the target tracking device of the image phone according to the present invention constructed as above will be described.

먼저, 제어부(500)의 제어에 따라 A/D 변환기(610)가 온되어 카메라(300)로 부터 들어오는 NTSC 신호를 A/D 변환하여 8비트의 디지탈 신호로 출력하게 된다.First, the A / D converter 610 is turned on under the control of the controller 500 to A / D convert the NTSC signal coming from the camera 300 and output it as an 8-bit digital signal.

이때, 추적 모드가 온되어 전화 통화를 시작하는 경우에는 카메라(300)로부터 들어오는 최초 영상을 A/D 변환기(610)에서 A/D 변환하여 제1 메모리(620)에 저장하게 된다.At this time, when the tracking mode is turned on and a telephone conversation is started, the original image coming from the camera 300 is A / D converted by the A / D converter 610 and stored in the first memory 620.

제1 메모리(620)에 저장된 초기 영상은 다시 제2 메모리(630)로 쉬프트되어 쿼드 트리 분할부(710)로 입력되게 된다. 이때, 초기 영상은 제1 메모리(620)에서 바로 쿼드 트리 분할부(710)로 입력될 수도 있다.The initial image stored in the first memory 620 is shifted back to the second memory 630 and input to the quad tree partitioning unit 710. [ At this time, the initial image may be directly input to the quad tree partition unit 710 in the first memory 620.

제1 메모리(620)에서 출력되는 초기 영상은 쿼드 트리 분할부(710)에 입력되어 목표물인 통화자의 얼굴이 위치하고 8 x 8 화소로 이루어진 영역으로 초기창이 설정되게 된다.The initial image output from the first memory 620 is input to the quad tree partitioning unit 710, and an initial window is set to an area of 8 x 8 pixels in which the target face of the caller is located.

Once the initial window has been set, it is quadtree-partitioned as shown in (Equation 1) above and in FIG. 3.
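The recursive partitioning of the 8 x 8 initial window can be sketched as follows. This is only an illustration: (Equation 1) is not reproduced in this passage, so the uniformity test used here (the brightness range of a block compared against a threshold) is an assumption, and the names `is_uniform`, `quadtree_split`, `threshold`, and `min_size` are hypothetical.

```python
def is_uniform(img, x, y, size, threshold):
    """Assumed uniformity check: the brightness range inside the block
    must not exceed the threshold (a stand-in for Equation 1)."""
    values = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    return max(values) - min(values) <= threshold


def quadtree_split(img, x=0, y=0, size=8, threshold=16, min_size=1):
    """Return the leaf blocks of the quadtree as (x, y, size) tuples."""
    # Stop when the block is uniform or cannot be divided any further.
    if size <= min_size or is_uniform(img, x, y, size, threshold):
        return [(x, y, size)]
    half = size // 2
    blocks = []
    # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
    for dy in (0, half):
        for dx in (0, half):
            blocks += quadtree_split(img, x + dx, y + dy, half, threshold, min_size)
    return blocks


# Example: an 8 x 8 window whose left half is dark and right half is bright.
window = [[10] * 4 + [200] * 4 for _ in range(8)]
leaves = quadtree_split(window)
```

In this example the window is not uniform as a whole, so it is divided once into four 4 x 4 blocks, each of which is uniform and is not divided further.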

The quadtree-partitioned image output from the quadtree partitioning unit (710) is merged in the merging unit (720) as shown in FIG. 4, after which the correlation area extraction unit (730) extracts the correlation area as shown in FIG. 5. The detailed processing in the merging unit (720) and the correlation area extraction unit (730) is as described above.

Meanwhile, the initial image stored in the second memory (630) becomes the previous image, and the NTSC signal arriving from the camera (300) is A/D-converted by the A/D converter (610) under the control of the control unit (500) and stored in the first memory (620).
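The roles of the two memories can be illustrated with a small sketch. The class name `FrameStore` and its `store` method are hypothetical. The sketch only models the behavior described here, in which a newly converted frame enters the first memory and the frame it replaces is shifted to the second memory.

```python
class FrameStore:
    """Toy model of the first memory (620) and second memory (630)."""

    def __init__(self):
        self.first = None   # current image (first memory 620)
        self.second = None  # previous image (second memory 630)

    def store(self, frame):
        # Shift the old current image into the second memory,
        # then keep the newly converted frame in the first memory.
        self.second = self.first
        self.first = frame


store = FrameStore()
store.store("initial image")
store.store("next image")
```

After the second call, `store.second` holds the initial image (now the previous image) and `store.first` holds the current image.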

The window of the previous image stored in the second memory (630) is then moved. That is, to calculate the correlation between the search area in the current image and the correlation area in the previous image, the window of the previous image must be moved so that it is aligned with the correlation area.

In the window of the previous image, the correlation area is extracted in the shape determined by the correlation area extraction unit (730) so that the correlation calculation unit (800) can calculate the correlation, and the search area is extracted from the current image output from the first memory (620) for the same calculation.

The correlation calculation unit (800) then calculates the correlation between the extracted correlation area and the search area using the MAE (Mean Absolute Error).
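A minimal sketch of the MAE over an arbitrarily shaped correlation area is given below. It assumes that the correlation area is represented as a list of (x, y) pixel coordinates and that images are lists of rows of brightness values. The function name `mae` and the displacement parameters `dx` and `dy` are hypothetical.

```python
def mae(prev, cur, region, dx=0, dy=0):
    """Mean absolute error between the correlation area in the previous
    image and the same shape displaced by (dx, dy) in the current image."""
    total = 0
    for x, y in region:
        total += abs(prev[y][x] - cur[y + dy][x + dx])
    return total / len(region)


# Example: an L-shaped correlation area of three pixels.
prev = [[1, 2], [3, 4]]
cur = [[1, 4], [3, 8]]
region = [(0, 0), (1, 0), (1, 1)]
error = mae(prev, cur, region)
```

Because the sum runs only over the pixels of the correlation area, pixels outside it do not contribute to the error.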

The part of the search area with the highest correlation, that is, the smallest MAE, can therefore be regarded as the target, that is, the talker, so the motion estimation unit (900) calculates the motion from the difference between the previous position and the current position.
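The search for the smallest MAE can be sketched as an exhaustive scan over displacements. The sketch assumes a search range of 8 pixels in each direction, as stated in the claims. The function name `estimate_motion`, the coordinate-list representation of the correlation area, and the exhaustive scan order are illustrative assumptions, not details given in the description.

```python
def estimate_motion(prev, cur, region, search=8):
    """Return (mae, dx, dy) for the displacement with the smallest MAE,
    or None if no displacement keeps the region inside the image."""
    height, width = len(cur), len(cur[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that push the region outside the image.
            if any(not (0 <= x + dx < width and 0 <= y + dy < height)
                   for x, y in region):
                continue
            err = sum(abs(prev[y][x] - cur[y + dy][x + dx])
                      for x, y in region) / len(region)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return best


# Example: a three-pixel pattern that moves 3 pixels right and 2 pixels up.
prev = [[0] * 20 for _ in range(20)]
cur = [[0] * 20 for _ in range(20)]
region = [(8, 8), (9, 8), (8, 9)]
for (x, y), value in zip(region, (100, 150, 200)):
    prev[y][x] = value
    cur[y - 2][x + 3] = value
motion = estimate_motion(prev, cur, region)
```

The returned displacement `(dx, dy)` is the difference between the previous and current positions of the target, which is the motion information passed on to the control unit.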

Using this motion information, the control unit (500) controls the motor driver (410) to drive the motor (420). The camera (300) moves accordingly and a new image is input to the A/D converter (610), at which point the control unit (500) turns on the A/D converter (610).

Meanwhile, after turning on the A/D converter (610) and the first memory (620), the control unit (500) turns them off until the motion estimation unit (900) has estimated the motion and output the motion information, so that no NTSC signal from the camera (300) is accepted during that time.

Through these operations, the camera (300) is moved according to the position of the target so that the camera (300) can track the target.

As described above, the target tracking method and apparatus for a videophone according to the present invention estimate the correlation using a correlation area formed by quadtree partitioning and merging instead of the conventional square window, and therefore reduce the amount of computation while improving tracking performance.

Claims

A target tracking method for a videophone, comprising:

a correlation area setting step (100, 101, 102, 103, 104, 105) of determining a correlation area by partitioning and merging, according to the brightness value of each pixel, an initial window centered on the target in an initial image input in the tracking mode;

a search area extraction step (106, 107) of extracting, from the input current image, a search area to be searched;

a motion estimation and motor driving step (108, 109, 110) of estimating motion according to the correlation between the determined correlation area and the extracted search area and moving the camera accordingly; and

a call termination step (111, 112) of repeating the search area extraction step and the motion estimation and motor driving step on the input images until the call is terminated.

The method according to claim 1, wherein the initial image is an image taken by a camera when the target is at a predetermined position.

The method according to claim 1 or 2, wherein the correlation area setting step comprises: an initial image storage step (100, 101) of A/D (analog/digital) converting the initial image input in the tracking mode and storing the converted image; an initial window setting step (102) of setting an initial window at the position of the target in the stored initial image; a quadtree partitioning step (103) of quadtree-partitioning the set initial window using a uniformity check function according to the brightness value of each pixel; a merging step (104) of merging, according to variance, the quadtree-partitioned rectangle in which the target is located with the rectangles adjacent to it; and a correlation area determination step (105) of determining the region formed by the merged rectangles as the correlation area.

The method according to claim 1, wherein the search area is formed by extending the correlation area extracted from the previous image by 8 pixels in each of the up, down, left, and right directions within the current image.

The method according to claim 1, wherein the correlation function used to calculate the correlation is the Mean Absolute Error (MAE).

A target tracking apparatus for a videophone, comprising:

image storage means (600) for A/D-converting an image input in the tracking mode;

correlation area determination means (700) for determining a correlation area by partitioning and merging, according to the brightness value of each pixel, an initial window centered on the target in the initial image stored in the image storage means (600);

correlation calculation means (800) for extracting a search area from the current image output from the image storage means (600) according to the correlation area determined by the correlation area determination means (700) and calculating the correlation;

motion estimation means (900) for estimating motion according to the correlation calculated by the correlation calculation means (800); and

control means (500) for controlling the operation of the image storage means (600) and controlling the camera to move according to the motion estimated by the motion estimation means (900).

The apparatus according to claim 6, wherein the image storage means (600) comprises: an A/D converter (610) for A/D-converting an image input in the tracking mode; a first memory (620) for storing the current image signal output from the A/D converter (610); and a second memory (630) for storing the image signal output from the first memory (620).

The apparatus according to claim 6, wherein the correlation area determination means (700) comprises: a quadtree partitioning unit (710) for setting an initial window at the position of the target in the initial image stored in the image storage means (600) and quadtree-partitioning it using a homogeneity measure; a merging unit (720) for merging, according to variance, the rectangle in which the target is located with the rectangles adjacent to it among those partitioned by the quadtree partitioning unit (710); and a correlation area extraction unit (730) for extracting the region formed by the merged rectangles output from the merging unit (720) as the correlation area.

The apparatus according to claim 6, wherein the initial image is an image taken by the camera when the target is at a predetermined position.

The apparatus according to claim 8, wherein the search area is formed by extending the correlation area extracted from the previous image by 8 pixels in each of the up, down, left, and right directions within the current image.

The apparatus according to claim 6, wherein the correlation function for calculating the correlation is a Mean Absolute Error (MAE).