KR100226583B1 - Target tracking method and device for video phone

Info

Publication number: KR100226583B1
Application number: KR1019960070818A
Authority: KR
Inventors: 정성학
Original assignee: 전주범; 대우전자주식회사
Priority date: 1996-12-24
Filing date: 1996-12-24
Publication date: 1999-10-15
Also published as: KR19980051898A

Abstract

The present invention relates to target tracking in a video telephone. A correlation region is set in the previous image and a search region is set in the current image; the correlation region is divided into sub-blocks of a predetermined size, and sub-blocks with similar brightness values are merged. The motion of the target is then estimated from the correlation between the merged sub-blocks and the search region, which reduces the amount of computation and improves tracking performance. The method comprises: a region setting process of setting, once tracking mode is entered, a correlation region where the target is located in the previous input image and a search region in the current input image; a block division/merging process of dividing the correlation region set by the region setting process into sub-blocks of a predetermined size and merging sub-blocks with similar brightness values into one region; a correlation calculation process of calculating a correlation by mapping the merged sub-blocks of the correlation region onto the search region; a motion estimation and motor driving process of estimating the motion from the calculated correlation and moving the camera accordingly; and a call termination process of repeating the region setting process through the motion estimation and motor driving process on the input images until the call ends.

Description

Target tracking device and method of video telephone

The present invention relates to a target tracking method and apparatus for a video telephone which, in order to track a target, merge similar sub-blocks of a correlation region, calculate their correlation with the sub-blocks of a search region, and change the direction of the camera accordingly.

In general, techniques for tracking an object, that is, a target, include the centroid (center-point) tracking technique and the correlation tracking technique.

As shown in FIG. 1A, the centroid tracking technique separates the moving object from the background and then tracks the center point (A) of the extracted moving object.

A threshold is used to separate the moving object, that is, the target, from the background: the background and the object are binarized using the threshold.
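The thresholding and center-point steps described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the object is brighter than the background, and the function names and the threshold value of 128 are hypothetical.

```python
def binarize(image, threshold):
    """Return a 0/1 mask: 1 where the pixel is brighter than the threshold."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def centroid(mask):
    """Return the (x, y) center point of all 1-pixels, or None if there are none."""
    count = sum_x = sum_y = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# A 5x5 frame with a bright 2x2 object on a dark background (made-up values).
frame = [
    [10, 10, 10, 10, 10],
    [10, 200, 210, 10, 10],
    [10, 205, 215, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
center = centroid(binarize(frame, threshold=128))
```

Tracking then amounts to recomputing `center` for each new frame and following its displacement.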

However, this centroid tracking technique has the drawback of being strongly affected by noise.

That is, when the image is relatively simple, so that segmentation is easy, and there are relatively few constraints on the speed of the trackable object, tracking is stable and the influence of noise is small.

Conversely, when the image is relatively complex, so that segmentation is difficult, and there are relatively many constraints on the speed of the trackable object, tracking stability is poor and noise increases.

In the correlation tracking technique, as shown in FIG. 1B, a region (B) of suitable size is defined at the position of the moving object, that is, the target, in the previous frame; the correlation between the defined region (B) and the search region in the current frame is calculated; and the object is estimated to have moved to the region (B') with the highest correlation.

That is, given the position of the moving object in the n-th image, the correlation tracking technique defines a window of fixed size containing the moving object (the correlation region), calculates the correlation at each position in the search region of the (n+1)-th image, and takes the position of the region with the highest correlation as the position of the moving object in the (n+1)-th image.

Here, the initial window used in the correlation calculation, that is, the region used to compute the correlation between the current frame and the previous frame, is usually a square window.

Because the correlation tracking technique calculates the correlation directly from the current input frame without performing image segmentation, tracking performance is maintained even for relatively complex images, but the amount of computation is large.

In other words, unlike the centroid tracking technique, the correlation tracking technique does not binarize the image but uses its intensity information, so a certain level of tracking performance can be expected even when background clutter makes image segmentation impossible.

However, to estimate the motion of the moving object, the correlation tracking technique must calculate the correlation for every candidate position of the correlation region within the search region, which makes the amount of computation large.

To remedy these drawbacks, an object of the present invention is to provide a target tracking method and apparatus for a video telephone in which a correlation region of the previous image and a search region of the current image are set from the input images, the correlation region is divided into sub-blocks of a predetermined size, sub-blocks with similar brightness values are merged, and the motion of the target is estimated by calculating the correlation between the merged sub-blocks and the search region, thereby reducing the amount of computation and improving tracking performance.

FIG. 1A is a diagram for explaining the conventional centroid tracking technique.

FIG. 1B is a diagram for explaining the conventional correlation tracking technique.

FIG. 2 is a flowchart of the target tracking method according to the present invention.

FIG. 3A is a diagram for explaining the search region setting step of FIG. 2.

FIG. 3B is a diagram for explaining the correlation region setting step of FIG. 2.

FIG. 3C is a diagram for explaining the sub-block extraction step in the correlation region.

FIG. 3D is a diagram for explaining the merging of similar sub-blocks in the correlation region.

FIG. 3E is a diagram showing the correlation region mapped within the search region in order to calculate the correlation.

FIG. 4 is a block diagram of the target tracking apparatus according to the present invention.

<Explanation of reference numerals for the main parts of the drawings>

100: camera
200: image storage unit
210: analog/digital converter
220: current image storage unit
230: previous image storage unit
300: block division/merge unit
310: search region setting unit
320: correlation region setting unit
330: sub-block division unit
340: similar sub-block merge unit
400: correlation calculation unit
500: motion estimation unit
600: control unit
700: motor unit
710: motor driver
720: motor

To achieve the above object, the target tracking method of a video telephone according to the present invention comprises: a region setting process of setting, once tracking mode is entered, a correlation region where the target is located in the previous input image and a search region in the current input image; a block division/merging process of dividing the correlation region set by the region setting process into sub-blocks of a predetermined size and then merging sub-blocks with similar brightness values into one region; a correlation calculation process of calculating a correlation by mapping the sub-blocks of the correlation region merged by the block division/merging process onto the search region; a motion estimation and motor driving process of estimating the motion from the correlation calculated by the correlation calculation process and moving the camera; and a call termination process of repeating the region setting process through the motion estimation and motor driving process on the input images until the call ends.

Likewise, the target tracking apparatus of a video telephone according to the present invention comprises: image storage means for A/D converting and storing the images input in tracking mode; block division/merging means for setting a search region and a correlation region from the current and previous images stored in the image storage means, dividing only the correlation region into sub-blocks, and merging those sub-blocks that have similar brightness values; correlation calculation means for calculating a correlation by mapping the sub-blocks of the correlation region merged by the block division/merging means onto the search region; motion estimation means for estimating the motion from the correlation calculated by the correlation calculation means; and control means for controlling the operation of the image storage means and moving the camera according to the motion estimated by the motion estimation means.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

As shown in FIG. 2, the target tracking method of the video telephone according to the present invention comprises a region setting process (ST1, ST2, ST3, ST4), a sub-block division/merging process (ST5, ST6), a correlation calculation process (ST7), a motion estimation and motor driving process (ST8, ST9), and a call termination process (ST10, ST11).

The region setting process (ST1, ST2, ST3, ST4) extracts the correlation region where the target is located from the previous image input in tracking mode and extracts the search region from the current image. It consists of a first step (ST1, ST2, ST3) of A/D (analog/digital) converting and storing the previous and current images input in tracking mode, and a second step (ST4) of setting the correlation region from the stored previous image and the search region from the current image.

Here, assuming an input image of 64×64 pixels, the correlation region, which is set from the previous image (that is, the image the camera captures when the talker is at the preset position for the call), is 16×16, and the search region set from the current image is the correlation region extended by 8 pixels on each of the top, bottom, left, and right sides, giving a 32×32 region.
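The region geometry above can be checked with a short sketch. The function names are hypothetical, and clamping the search region to the image border is an assumption: the text does not say how a correlation region near the edge of the 64×64 image is handled.

```python
def search_region(corr_x, corr_y, corr_size=16, margin=8, image_size=64):
    """Return (x, y, size) of the search region around a correlation region.

    (corr_x, corr_y) is the top-left corner of the correlation region.
    Clamping to the image border is an assumption, not taken from the source.
    """
    size = corr_size + 2 * margin
    x = min(max(corr_x - margin, 0), image_size - size)
    y = min(max(corr_y - margin, 0), image_size - size)
    return x, y, size


def candidate_positions(corr_size=16, search_size=32):
    """Number of whole-pixel placements of the correlation region in the search region."""
    per_axis = search_size - corr_size + 1
    return per_axis * per_axis


centered = search_region(24, 24)   # correlation region in the middle of the image
near_edge = search_region(2, 48)   # correlation region close to the left/bottom border
positions = candidate_positions()
```

With an 8-pixel margin on every side there are 17 offsets per axis, so an exhaustive whole-pixel search evaluates 289 positions.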

The block division/merging process (ST5, ST6) is a preprocessing step for the correlation calculation: the preset 16×16 correlation region is divided into sub-blocks of a fixed 4×4 size, and sub-blocks with similar brightness values are then merged into one region.

The correlation calculation process (ST7) calculates the correlation between the correlation region and the search region by mapping the regions merged in the block division/merging process onto the search region.

The motion estimation and motor driving process (ST8, ST9) estimates the motion from the correlation calculated in the correlation calculation process and moves the camera. It consists of a first step (ST8) of estimating the motion by finding the position with the highest calculated correlation, and a second step (ST9) of moving the camera according to the estimated motion.

Here, the correlation function used to calculate the correlation is the MAE (Mean Absolute Error).

The call termination process (ST10, ST11) repeats the region setting process (ST1, ST2, ST3, ST4) through the motion estimation and motor driving process (ST8, ST9) on the input images until the call ends.

The target tracking method of the video telephone according to the present invention, performed in this way, will now be described in detail with reference to the accompanying drawings.

First, when a call begins, the user may not want to use the tracking function depending on the situation, so the user first decides whether to use it (ST1).

The tracking and non-tracking modes can be distinguished simply with a switch: if the switch is ON, the tracking function is performed; if it is OFF, it is not.

When the switch is turned on and tracking mode is entered, the image storing step (ST1, ST2, ST3) is performed as the first step: the initial image input from the camera is A/D (analog/digital) converted, and the previous and current images are stored.

That is, the NTSC signal from the camera is output as an 8-bit digital signal by the A/D converter. An A/D converter with a higher bit depth may also be used.

Since this digital signal is an image signal, it is stored in memory as a digital image I(x, y), a two-dimensional matrix whose values lie within a specified range.

This digital image, 64×64 pixels in size, is stored in memory and used for target tracking.

In correlation tracking, it is assumed that the initial position is known through an initial target locking process; that is, the target (the caller) is assumed to be at the position from which the dial can be pressed to make a call.

Accordingly, the initial image and the image that follows it, that is, the previous image and the current image, are stored in memory.

From the previous image thus obtained, a 16×16 correlation region that can contain the face of the caller (the target) is set, and a 32×32 search region is set from the current image (ST4).

That is, as shown in FIG. 3A, a 32×32 search region for calculating the correlation with the previous image is set in the 64×64-pixel current image, and, as shown in FIG. 3B, a 16×16 correlation region containing the caller's face is set in the previous image.

Once the correlation region and the search region are set, the correlation region is divided into 4×4 sub-blocks, as shown in FIG. 3C. This is a preprocessing step (ST5) that reduces computation by calculating the correlation between the correlation region and the search region per sub-block rather than per pixel, which is computationally expensive.
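The division into 4×4 sub-blocks can be sketched as below. The function name and the test data are hypothetical; the sketch also computes each sub-block's mean brightness, since that is the quantity the subsequent similarity test operates on.

```python
def block_means(region, block=4):
    """Split a square region into block x block sub-blocks and return
    a grid holding the mean brightness of each sub-block."""
    n = len(region)
    if n % block != 0:
        raise ValueError("region size must be a multiple of the block size")
    means = []
    for by in range(0, n, block):
        row = []
        for bx in range(0, n, block):
            total = sum(region[y][x]
                        for y in range(by, by + block)
                        for x in range(bx, bx + block))
            row.append(total / (block * block))
        means.append(row)
    return means


# A 16x16 correlation region: left half dark (50), right half bright (200).
region = [[50] * 8 + [200] * 8 for _ in range(16)]
means = block_means(region)
```

A 16×16 region yields a 4×4 grid of sub-block means, so later steps work on 16 values instead of 256 pixels.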

After division into sub-blocks, sub-blocks with similar brightness values within the 16×16 correlation region are merged into one region (ST6), as shown in FIG. 3D. This is described in more detail below.

Similar sub-blocks are extracted because their combined shape approximates the shape of the target to some extent, giving an effect similar to segmenting the image.

Therefore, whether sub-blocks of the correlation region have similar brightness values is determined using the following equation [1].

Same region: |M1 - M2| < T
Different region: |M1 - M2| ≥ T    [1]

Here, M is the mean brightness value of the corresponding sub-block (M1 and M2 for the two sub-blocks being compared), and T is a fixed threshold.

In other words, if the difference between the mean brightness values M1 and M2 of any two sub-blocks is below the threshold, the sub-blocks are judged to have similar brightness and are merged into the same region; if the difference is at or above the threshold, they are not placed in the same region.
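One possible way to apply equation [1] is sketched below. The text does not specify which pairs of sub-blocks are compared or in what order, so comparing only 4-connected neighbours and growing regions by flood fill is an assumption, as are the function name and the sample values.

```python
from collections import deque


def merge_blocks(means, threshold):
    """Label sub-blocks so that 4-connected neighbours whose mean brightness
    differs by less than `threshold` (equation [1]) share a label.

    Returns (labels, region_count). Neighbour-only comparison is an assumption.
    """
    rows, cols = len(means), len(means[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] is not None:
                continue
            # Start a new region and grow it breadth-first.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] is None
                            and abs(means[cr][cc] - means[nr][nc]) < threshold):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels, next_label


# Mean brightness of the 4x4 grid of sub-blocks (made-up values).
means = [
    [50, 52, 200, 198],
    [51, 53, 201, 199],
    [120, 121, 30, 31],
    [119, 122, 29, 32],
]
labels, count = merge_blocks(means, threshold=10)
```

Because neighbours are compared pairwise, a slow brightness gradient can chain many sub-blocks into one region; comparing against a region's running mean instead would be a stricter alternative.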

Through this process, a certain number of regions (four merged regions in FIG. 3D) are extracted from the correlation region.

The merged regions extracted from the correlation region are then mapped onto the search region, as shown in FIG. 3E, and the correlation is calculated (ST7) while the merged regions of the correlation region are matched against the search region.

Thus, the correlation with the search region is calculated only for the merged regions rather than for all sub-blocks of the correlation region, which reduces the amount of computation accordingly.

To calculate the correlation between the merged regions of the previous image's correlation region and the search region of the current image, the correlation function to be used must be chosen.

Correlation functions available for this calculation are the NCCF (Normalized Cross Correlation Function), the MSE (Mean Square Error), and the MAE (Mean Absolute Error), expressed by equations [2], [3], and [4], respectively.

Here, E(·) denotes the mean.
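The bodies of equations [2] to [4] do not appear in this text. For orientation only, commonly used textbook forms of the three measures are given below; they are an assumption and may differ from the patent's own equations, for example in normalization. The symbols R (a pixel of the correlation region), S (a pixel of the search region), and (u, v) (the candidate displacement) are notation introduced here.

```latex
\begin{align}
\mathrm{NCCF}(u,v) &= \frac{E\left[R(x,y)\,S(x+u,\,y+v)\right]}
                          {\sqrt{E\left[R(x,y)^{2}\right]\,E\left[S(x+u,\,y+v)^{2}\right]}} \\
\mathrm{MSE}(u,v)  &= E\left[\left(R(x,y) - S(x+u,\,y+v)\right)^{2}\right] \\
\mathrm{MAE}(u,v)  &= E\left[\left|R(x,y) - S(x+u,\,y+v)\right|\right]
\end{align}
```

In these forms the NCCF is largest for the best match, whereas the MSE and MAE are smallest; the MAE needs only subtractions and absolute values, with no multiplications or square roots.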

The NCCF is the most suitable correlation function, but the MAE is used in consideration of the amount of computation.

The higher the correlation between images, the smaller the MAE value.

Therefore, when the MAE is used as the correlation function, the position with the lowest value is the estimated position of the moving object, that is, the target, in the next image.
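The minimum-MAE search can be sketched as follows. For simplicity the reference is a plain rectangular window rather than a merged region of arbitrary shape; that simplification, the function names, and the sample data are assumptions.

```python
def mae(ref, image, top, left):
    """Mean absolute error between `ref` and the same-sized window of `image`
    whose top-left corner is at row `top`, column `left`."""
    h, w = len(ref), len(ref[0])
    total = sum(abs(ref[y][x] - image[top + y][left + x])
                for y in range(h) for x in range(w))
    return total / (h * w)


def best_match(ref, search):
    """Return ((top, left), error) for the lowest-MAE placement of `ref` in `search`."""
    h, w = len(ref), len(ref[0])
    best_pos, best_err = None, float("inf")
    for top in range(len(search) - h + 1):
        for left in range(len(search[0]) - w + 1):
            err = mae(ref, search, top, left)
            if err < best_err:
                best_pos, best_err = (top, left), err
    return best_pos, best_err


# A 2x2 reference patch hidden at row 3, column 1 of a flat 6x6 search area.
ref = [[200, 210], [220, 230]]
search = [[10] * 6 for _ in range(6)]
for dy in range(2):
    for dx in range(2):
        search[3 + dy][1 + dx] = ref[dy][dx]
position, error = best_match(ref, search)
```

Handling a merged region would restrict the sum inside `mae` to the pixels that belong to that region.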

Using this correlation function, the correlation between the search region of the current image and the correlation region of the previous image is calculated. The position with the highest correlation can be regarded as the talker, that is, the target, within the search region, so the motion is calculated as the difference between the previous position and the current position.

That is, for each merged region, the difference between its position in the correlation region of the previous image and the position with the highest correlation in the search region of the current image gives that region's motion vector. The motion vectors of merged regions ①, ②, ③, and ④ are extracted and then averaged to obtain the final motion estimate (ST8).
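The averaging of per-region motion vectors can be sketched as below. The function name and the coordinates are hypothetical; positions are given as (x, y) pairs in a shared image coordinate system.

```python
def average_motion(previous_positions, matched_positions):
    """Average the per-region displacement (dx, dy) over all merged regions."""
    if not previous_positions or len(previous_positions) != len(matched_positions):
        raise ValueError("need one matched position per merged region")
    vectors = [(mx - px, my - py)
               for (px, py), (mx, my) in zip(previous_positions, matched_positions)]
    n = len(vectors)
    return (sum(dx for dx, _ in vectors) / n, sum(dy for _, dy in vectors) / n)


# Positions of four merged regions in the previous image and their best matches
# in the current image (made-up values).
previous = [(0, 0), (8, 0), (0, 8), (8, 8)]
matched = [(3, 1), (11, 2), (2, 9), (12, 10)]
motion = average_motion(previous, matched)
```

Here the four displacement vectors are (3, 1), (3, 2), (2, 1), and (4, 2), so the averaged motion is 3 pixels horizontally and 1.5 pixels vertically.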

The motor is then driven according to the estimated motion, that is, the movement of the talker, to move the camera (ST9).

At this point, it is checked whether a new image has been input from the camera; if so, the image storing step (ST3) is performed again to A/D convert and store it.

The image signal of one frame is stored in one memory and then shifted into the next memory, where it is used as the previous image.

That is, the image signals of the current image and the previous image are stored in two separate memories.
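The two-memory arrangement can be modeled with a small class. The class and method names are hypothetical, and the strings stand in for image data.

```python
class FrameStore:
    """Two frame memories: `current` holds the newest frame, `previous` the one before."""

    def __init__(self):
        self.current = None
        self.previous = None

    def push(self, frame):
        """Shift the current frame into `previous`, then store the new frame."""
        self.previous = self.current
        self.current = frame

    def ready(self):
        """True once both memories hold a frame, so tracking can start."""
        return self.previous is not None and self.current is not None


store = FrameStore()
store.push("frame 0")          # initial image: only `current` is filled
first_ready = store.ready()
store.push("frame 1")          # "frame 0" becomes the previous image
```

Each later `push` discards the oldest frame, so the store never holds more than two images.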

Thus, the image stored in the region setting process (ST1, ST2, ST3, ST4) is the current input image, while the previous image, including the initial image, is shifted into the other memory and stored for the next motion estimation.

Once the image signals of the previous and current images are stored in their memories, the correlation region and the search region are extracted (ST4).

That is, the portion of the previous image found to have the highest correlation in the motion estimation step (ST8) is extracted in the shape of the determined correlation region and set as the new correlation region, and a 32×32-pixel search region centered on that correlation region is set in the current image.

Thereafter, the sub-block division and merging steps (ST5, ST6), the correlation calculation step (ST7), the motion estimation step (ST8), and the motor driving step (ST9) are performed as described above.

If no new image is input from the camera, the call termination step is performed: it is checked whether the call has ended, and if so the tracking function is terminated.

That is, when a new image is input, the process returns to the image storing step (ST3) and repeats the above procedure; when no new image is input and the call has ended, the tracking function ends (ST11).

If the call has not ended but no new image is input, the process waits until a new image is input.

Next, as shown in FIG. 4, the target tracking apparatus of the video telephone according to the present invention comprises an image storage unit (200), a block division/merge unit (300), a correlation calculation unit (400), a motion estimation unit (500), a control unit (600), and a motor unit (700).

Under the control of the control unit (600), the image storage unit (200) A/D converts and stores the images input from the camera (100) in tracking mode. It consists of an A/D converter (210) that A/D converts the images input in tracking mode, a current image storage unit (220) that stores the current image signal output from the A/D converter (210), and a previous image storage unit (230) that stores the image signal output from the current image storage unit (220).

The block division/merge unit (300) extracts the correlation region where the target is located from the previous image stored in the image storage unit (200) and sets the search region in the current image. It consists of a search region setting unit (310) that sets a search region of a predetermined size in the current image for the correlation search, a correlation region setting unit (320) that sets a correlation region of a predetermined size where the target is located in the previous image, a sub-block division unit (330) that divides the correlation region set by the correlation region setting unit (320) into sub-blocks of a predetermined size, and a similar sub-block merge unit (340) that examines the sub-blocks produced by the sub-block division unit (330) and merges those with similar brightness values.

The correlation region is set to a 16×16 size that can contain the face of the caller (the target) in the previous image, and the search region is set to a 32×32 size in the current image.

The correlation region thus set is divided into 4×4 sub-blocks, and the sub-blocks of the correlation region with similar brightness values are then extracted and merged into one region.

The correlation calculation unit (400) maps the regions merged by the block division/merge unit (300) onto the search region and calculates the correlation while matching them against it.

Here, the correlation function used to calculate the correlation is the MAE (Mean Absolute Error).

The motion estimation unit (500) estimates the motion from the correlation calculated by the correlation calculation unit (400).

The control unit (600) controls the operation of the image storage unit (200) and controls the camera (100) to move according to the motion estimated by the motion estimation unit (500).

The motor unit (700) consists of a motor driver (710) that drives the motor under the control of the control unit (600) and a motor (720) that is driven by the motor driver (710) to move the camera (100).

이와 같이 구성되는 본 발명에 의한 화상 전화기의 목표물 추적 장치의 동작을 설명한다.The operation of the target tracking device of the video telephone according to the present invention configured as described above will be described.

먼저, 제어부(600)의 제어에 따라 A/D 변환부(210)가 온되어 카메라(100)로 부터 들어오는 NTSC 신호를 A/D 변환하여 8비트의 디지탈 신호로 출력하게 된다.First, the A / D converter 210 is turned on under the control of the controller 600 to A / D convert NTSC signals from the camera 100 and output them as 8-bit digital signals.

When the tracking mode is turned on and a call begins, the first image from the camera 100 is A/D converted by the A/D converter 210 and stored in the current image storage unit 220.

The initial image stored in the current image storage unit 220 is input to the search region setting unit 310 of the block divider 300, which sets a 32×32 search region for the correlation search.

This initial image is also shifted into the previous image storage unit 230, and the correlation region setting unit 330 of the block divider 300 sets a 16×16-pixel correlation region at the part of the image where the target, the caller's face, is located.
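The region setting step can be sketched in Python as below. This is a minimal illustration, not the patented implementation: the function names, the `face_top`/`face_left` parameters, and the choice to centre the search region on the correlation region are all assumptions, since the source only gives the two region sizes.

```python
CORR_SIZE = 16    # correlation region side in pixels (from the text)
SEARCH_SIZE = 32  # search region side in pixels (from the text)


def crop(image, top, left, size):
    """Return a size x size window of a 2-D list of brightness values."""
    return [row[left:left + size] for row in image[top:top + size]]


def set_regions(prev_image, cur_image, face_top, face_left):
    """Cut the correlation region from the previous frame and the search
    region from the current frame.

    Assumption: the search region is centred on the correlation region
    and clamped so that it stays inside the frame.
    """
    margin = (SEARCH_SIZE - CORR_SIZE) // 2
    height, width = len(cur_image), len(cur_image[0])
    s_top = min(max(face_top - margin, 0), height - SEARCH_SIZE)
    s_left = min(max(face_left - margin, 0), width - SEARCH_SIZE)
    corr = crop(prev_image, face_top, face_left, CORR_SIZE)
    search = crop(cur_image, s_top, s_left, SEARCH_SIZE)
    return corr, search, (s_top, s_left)
```

The returned origin of the search region lets a later step convert a match position inside the search region back into frame coordinates.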

Once the search region and the correlation region are set, the sub-block dividing unit 330 receives the correlation region and divides it into 4×4 sub-blocks.

This is a preprocessing step that reduces the amount of computation: the correlation between the correlation region and the search region is calculated per sub-block rather than per pixel, which is computationally expensive.
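One way to read this preprocessing step is that each sub-block is summarized by a single brightness value. The sketch below uses the mean of the 16 pixels in each 4×4 sub-block; the use of the mean and the name `subblock_means` are assumptions, as the source does not say how a sub-block's brightness value is obtained.

```python
SUB = 4  # sub-block side in pixels (from the text)


def subblock_means(region):
    """Split a square region into SUB x SUB sub-blocks and return the
    mean brightness of each sub-block as a 2-D grid."""
    n = len(region) // SUB
    means = []
    for by in range(n):
        row = []
        for bx in range(n):
            total = sum(
                region[by * SUB + y][bx * SUB + x]
                for y in range(SUB)
                for x in range(SUB)
            )
            row.append(total / (SUB * SUB))
        means.append(row)
    return means
```

For a 16×16 correlation region this yields a 4×4 grid, so later steps work with 16 values instead of 256 pixels.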

The similar sub-block merging unit 340 receives the sub-blocks of the correlation region, extracts only the 4×4 sub-blocks with similar brightness values, and merges them into a single region. Because the sub-blocks with similar brightness roughly trace the shape of the target, this achieves an effect similar to segmenting the image.
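The merging step might be sketched as a simple region-growing pass over the grid of sub-block brightness values. The threshold test follows the rule stated in the claims (two sub-blocks merge when their brightness difference is at or below a predetermined threshold). The seed position, the 4-connected neighbourhood, and the function name are assumptions made for this illustration.

```python
from collections import deque


def merge_similar(means, seed, threshold):
    """Grow a merged region over a grid of sub-block brightness values.

    Starting from `seed` (row, col), a 4-connected neighbour joins the
    region when its brightness differs from the adjacent, already merged
    sub-block by at most `threshold`. Returns a boolean mask with the
    same shape as `means`.
    """
    rows, cols = len(means), len(means[0])
    mask = [[False] * cols for _ in range(rows)]
    mask[seed[0]][seed[1]] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr][nc]:
                if abs(means[nr][nc] - means[r][c]) <= threshold:
                    mask[nr][nc] = True
                    queue.append((nr, nc))
    return mask
```

The resulting mask marks which sub-blocks belong to the merged region; the unmarked sub-blocks are left out of the correlation calculation.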

Once the sub-blocks of the correlation region with similar brightness values have been merged, the correlation calculator 400 maps the merged region of the correlation region, output from the block divider 300, onto the search region as shown in FIG. 3E and calculates the correlation. The correlation region is shifted across the search region, and at each position the correlation is calculated only for the merged region.

Here, the minimum unit of motion in the present invention is the size of the merged region; motion smaller than the merged region is ignored.

Because the correlation with the search region is calculated only for the merged region of the correlation region, the amount of computation is reduced accordingly.

This correlation is calculated as the mean absolute error (MAE).
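The restriction of the correlation to the merged region can be illustrated with a masked MAE at a single candidate position. This is a sketch under assumptions: the mask is taken to be a 4×4 boolean grid over the sub-blocks, the error is accumulated over the pixels of the merged sub-blocks, and the name `masked_mae` is hypothetical.

```python
SUB = 4  # sub-block side in pixels (from the text)


def masked_mae(corr, search, mask, top, left):
    """Mean absolute error between `corr` and the window of `search`
    whose top-left corner is (top, left), counting only pixels that lie
    in merged sub-blocks (mask[by][bx] is True)."""
    total = 0
    count = 0
    for y, row in enumerate(corr):
        for x, value in enumerate(row):
            if mask[y // SUB][x // SUB]:
                total += abs(value - search[top + y][left + x])
                count += 1
    # An empty mask has nothing to compare, so report the worst score.
    return total / count if count else float("inf")
```

With fewer sub-blocks in the mask, fewer pixel differences are accumulated per candidate position, which is where the saving in computation comes from.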

The part of the search region with the highest correlation, that is, the smallest MAE, can be regarded as the target, the caller, so the motion estimator 500 calculates the motion from the difference between the previous position and the current position.
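The search for the smallest MAE and the resulting displacement can be sketched as an exhaustive scan. For brevity this illustration compares the whole correlation region rather than only the merged sub-blocks, and the `step` parameter, the `prev_pos` convention (the target's previous top-left position in search-region coordinates), and the function name are assumptions.

```python
def estimate_motion(corr, search, prev_pos, step=1):
    """Slide `corr` over `search`, find the window with the smallest
    MAE, and return ((dy, dx), mae), where (dy, dx) is the displacement
    from `prev_pos` to the best-matching window."""
    size = len(corr)
    limit = len(search) - size
    best = None
    for top in range(0, limit + 1, step):
        for left in range(0, limit + 1, step):
            err = sum(
                abs(corr[y][x] - search[top + y][left + x])
                for y in range(size)
                for x in range(size)
            ) / (size * size)
            # Keep the first window with the strictly smallest error.
            if best is None or err < best[0]:
                best = (err, top, left)
    err, top, left = best
    return (top - prev_pos[0], left - prev_pos[1]), err
```

The displacement would then be handed to whatever drives the camera; a larger `step` trades accuracy for fewer candidate positions.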

Using this motion information, the controller 600 controls the motor driver 710 to drive the motor 720.

The camera 100 moves accordingly and a new image is input to the A/D converter 210; at this point the controller 600 turns on the A/D converter 210.

Meanwhile, after turning on the A/D converter 210 and the current image storage unit 220, the controller 600 turns them off again until the motion estimator 500 has estimated the motion and output the motion information, so that no NTSC signal from the camera 100 is accepted during that time.

Through these operations, the camera 100 is moved according to the position of the target so that it can track the target.

As described above, the target tracking method and device for a video phone according to the present invention divide the correlation region into sub-blocks of a predetermined size, merge the sub-blocks with similar brightness values into a single region, and map that region onto the search region to calculate the correlation, which reduces the amount of computation and improves tracking performance.

Claims

A region setting process of setting, in the tracking mode, a correlation region where the target is located in the input previous image, and setting a search region for searching in the input current image;

A block dividing/merging process of dividing the correlation region set by the region setting process into sub-blocks of a predetermined size and then merging sub-blocks having similar brightness values into one region;

A correlation calculation process of calculating a correlation by mapping the sub-blocks of the correlation region merged by the block dividing/merging process onto the search region;

A motion estimation and motor driving process of estimating motion according to the correlation calculated by the correlation calculation process and moving the camera; and

A call termination process of repeating the region setting process through the motion estimation and motor driving process for each input image until the call ends.

The method of claim 1, wherein the region setting process comprises: a first step of A/D converting and storing a previous image and a current image that are input in the tracking mode; and

a second step of setting a correlation region from the stored previous image and setting a search region from the current image.

The method of claim 1 or 2, wherein the correlation region consists of 16×16 pixels.

The method of claim 1 or 2, wherein the search region consists of 32×32 pixels.

The method of claim 1, wherein each sub-block of the correlation region consists of 4×4 pixels.

The method of claim 1, wherein, in the block dividing/merging process, two sub-blocks are merged when the difference between their brightness values is equal to or less than a predetermined threshold, and are otherwise not merged.

The method of claim 1, wherein the correlation function for calculating the correlation is a mean absolute error (MAE).

The method of claim 1, wherein the motion estimation and motor driving process comprises: a first step of estimating motion by extracting the portion having the highest calculated correlation; and

a second step of moving the camera according to the estimated motion.

Image storage means for A/D converting and storing the image input in the tracking mode;

Block dividing/merging means for setting a search region and a correlation region from the current image and the previous image stored in the image storage means, then dividing only the correlation region into sub-blocks and merging the sub-blocks having similar brightness values;

Correlation calculation means for calculating a correlation by mapping the sub-blocks of the correlation region merged by the block dividing/merging means onto the search region;

Motion estimation means for estimating motion according to the correlation calculated by the correlation calculation means; and

Control means for controlling the operation of the image storage means and controlling the camera to move according to the motion estimated by the motion estimation means.

The apparatus of claim 9, wherein the image storage means comprises: an A/D converter for A/D converting an image input in the tracking mode;

a current image storage unit for storing the current image signal output from the A/D converter; and

a previous image storage unit for storing the image signal output from the current image storage unit.

The apparatus of claim 9, wherein the block dividing/merging means comprises: a search region setting unit for setting, from the current image, a search region of a predetermined size for the correlation search;

a correlation region setting unit for setting, from the previous image, a correlation region of a predetermined size in which the target is located;

a sub-block dividing unit for dividing the correlation region set by the correlation region setting unit into sub-blocks of a predetermined size; and

a similar sub-block merging unit for merging into one region only the sub-blocks having similar brightness values among the sub-blocks of the correlation region divided by the sub-block dividing unit.

The target tracking device of claim 11, wherein the similar sub-block merging unit merges two sub-blocks when the difference between their brightness values is equal to or less than a predetermined threshold, and otherwise does not merge them.

The target tracking device of claim 9 or 11, wherein the correlation region consists of 16×16 pixels.

The target tracking device of claim 9 or 11, wherein the search region consists of 32×32 pixels.

The target tracking device of claim 9 or 11, wherein each sub-block consists of 4×4 pixels.

The apparatus of claim 9, wherein the correlation function for calculating the correlation is the mean absolute error (MAE).