KR19980023002A

KR19980023002A - Talker tracking method and apparatus for video phone

Info

Publication number: KR19980023002A
Application number: KR1019960042345A
Authority: KR
Inventors: 정성학
Original assignee: 배순훈; 대우전자 주식회사
Priority date: 1996-09-25
Filing date: 1996-09-25
Publication date: 1998-07-06
Also published as: KR100215204B1

Abstract

The present invention relates to a talker tracking method and apparatus for a video phone that uses an infrared camera. The background and the talker are separated according to the magnitude and direction of the boundary pixels extracted from the input image, and the camera is moved by center-point tracking so that it follows the talker.

The invention assigns a specific value to the boundary direction bit that corresponds to each boundary pixel and boundary line direction extracted from the input infrared image, extracts the talker image, estimates motion by center-point tracking, and moves the camera accordingly.

Because register values are assigned for each input image from the magnitude and direction of the boundary pixels extracted from the infrared input image, and the background and the talker are separated according to those register values before the talker is tracked, tracking performance is improved.

Description

TARGET TRACKING METHOD AND DEVICE FOR VIDEO PHONE

The present invention relates to a talker tracking method and apparatus for a video phone that track the talker and move the camera accordingly.

An object of the present invention is to provide a talker tracking method and apparatus for a video phone that track the talker and move the camera automatically, so that the talker can move around during a call.

In general, there are two techniques for tracking an object, that is, a talker (target): correlation tracking and center-point tracking.

As shown in Fig. 1a, correlation tracking defines a region (A) of suitable size at the position of the moving object, that is, the talker, in the previous frame, computes the correlation between the defined region (A) and the search region in the current frame, and estimates that the object has moved to the region (A') with the highest correlation.

That is, given the position of the moving object in the n-th image, correlation tracking defines a fixed-size window containing the moving object (the correlation region), computes the correlation at each position of the search region in the (n+1)-th image, and takes the position with the highest correlation as the position of the moving object in the (n+1)-th image.

Here, the initial window, that is, the region used to compute the correlation between the current frame and the previous frame, is usually square.

Because correlation tracking computes the correlation directly from the incoming current frame without an image segmentation step, it maintains tracking performance even on relatively complex images, but it requires a large amount of computation.

That is, unlike center-point tracking, correlation tracking does not binarize the image but uses its intensity information, so some tracking performance can be expected even when background clutter makes image segmentation impossible. However, the correlation must be computed in order to estimate the motion of the moving object, which makes the computational load heavy.
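The window search described for correlation tracking can be sketched as follows. This is a minimal illustration, not the method of any particular product: the function names, the list-of-lists image format, and the use of the sum of absolute differences as a stand-in for the (unspecified) correlation measure are all assumptions.

```python
def track_by_correlation(prev, curr, top, left, size, search):
    """Return the (row, col) in curr whose size x size window best matches
    the window of prev anchored at (top, left).

    Assumption: a lower sum of absolute differences means a higher
    correlation; the source does not name the measure it uses.
    """
    rows, cols = len(curr), len(curr[0])
    template = [row[left:left + size] for row in prev[top:top + size]]
    best_pos, best_cost = (top, left), None
    # Visit every candidate position inside the search region.
    for r in range(max(0, top - search), min(rows - size, top + search) + 1):
        for c in range(max(0, left - search), min(cols - size, left + search) + 1):
            cost = sum(
                abs(curr[r + i][c + j] - template[i][j])
                for i in range(size)
                for j in range(size)
            )
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (r, c), cost
    return best_pos


def frame_with_block(top, left, n=6, size=2, value=9):
    """Hypothetical test frame: an n x n dark image with one bright block."""
    frame = [[0] * n for _ in range(n)]
    for r in range(top, top + size):
        for c in range(left, left + size):
            frame[r][c] = value
    return frame
```

Every candidate position costs size × size pixel comparisons, which illustrates why the text calls the computational load of this approach heavy.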

Center-point tracking, as shown in Fig. 1b, separates the moving object from the background and then tracks the center point (B) of the extracted moving object.

Therefore, when the image is relatively simple and easy to segment, tracking is stable, and because there are relatively few constraints on the speed of the object that can be tracked, tracking performance is good.

However, center-point tracking uses a threshold to separate the moving object, that is, the talker, from the background, which makes it difficult to extract the talker accurately from the image. Because the background and the object are binarized with a threshold, the method is also strongly affected by any noise present in the input image.
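The conventional threshold-based center-point computation, and its sensitivity to noise, can be illustrated with a short sketch. The function name and the list-of-lists image format are assumptions made for this example only.

```python
def centroid_by_threshold(image, threshold):
    """Binarize image with a fixed threshold and return the centroid (x, y)
    of the pixels above it, or None if no pixel passes."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > threshold:  # pixel is treated as part of the object
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# A 6 x 6 frame with a bright 2 x 2 object.
clean = [[0] * 6 for _ in range(6)]
for y in (1, 2):
    for x in (1, 2):
        clean[y][x] = 200

# The same frame with a single bright noise pixel in the corner.
noisy = [row[:] for row in clean]
noisy[5][5] = 200

print(centroid_by_threshold(clean, 100))  # (1.5, 1.5)
print(centroid_by_threshold(noisy, 100))  # (2.2, 2.2)
```

A single noise pixel above the threshold shifts the computed center, which is the weakness the text attributes to this technique.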

To overcome these drawbacks, an object of the present invention is to provide a talker tracking method and apparatus for a video phone that improve tracking performance by combining an infrared image, in which the face is easily distinguished from the background, with center-point tracking.

Fig. 1a is a diagram for explaining the conventional correlation tracking technique.

Fig. 1b is a diagram for explaining the conventional center-point tracking technique.

Fig. 2 is a flowchart of the talker tracking method according to the present invention.

Fig. 3 is a diagram for explaining the boundary pixel extraction step of Fig. 2.

Fig. 4 is a diagram for explaining the talker extraction step of Fig. 2.

Fig. 5a is a diagram for explaining the boundary direction bit allocation step of Fig. 4.

Fig. 5b is a diagram for explaining the talker center point determination step of Fig. 4.

Fig. 6 is a block diagram of the talker tracking apparatus according to the present invention.

* Explanation of reference numerals for the main parts of the drawings

300: camera, 400: motor unit, 410: motor, 420: motor driver, 500: control unit, 600: image storage unit, 610: A/D converter, 620: memory, 700: boundary direction bit allocation unit, 710: boundary pixel extraction unit, 720: boundary line direction extraction unit, 730: shifter, 740: boundary direction bit setting unit, 800: talker generation unit, 810: counter, 820: result image generation unit, 900: center point calculation unit

To achieve the above object, the talker tracking method for a video phone according to the present invention comprises: a boundary extraction step of extracting boundary pixels and boundary line directions from the infrared image signal input in tracking mode; a talker extraction step of assigning, for every pixel located in the direction 90 degrees from the boundary line direction of each extracted boundary pixel, a specific value to the boundary direction bit corresponding to the extracted boundary line direction, and extracting and storing the talker image according to the number of assigned specific values; a talker image center point calculation step of calculating the center point of the extracted talker image; and a motion estimation and motor driving step of estimating motion from the calculated center point of the talker image and moving the camera.

Likewise, the talker tracking apparatus for a video phone according to the present invention comprises: image storage means for A/D-converting and storing the infrared image input in tracking mode; boundary direction bit allocation means for extracting boundary pixels and boundary line directions from the infrared image stored in the image storage means and assigning a specific value to the corresponding boundary direction bit of each pixel; talker generation means for generating a talker image according to the number of specific values assigned to the boundary direction bits of each pixel, as output by the boundary direction bit allocation means, and storing it in the image storage means; center point calculation means for searching the talker image stored in the image storage means and calculating the center point of the talker; and control means for controlling the operation of the image storage means and, by estimating the talker's motion from the center point calculated by the center point calculation means, controlling the camera so that it moves.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

As shown in Fig. 2, the talker tracking method for a video phone according to the present invention is carried out by a boundary extraction step (100, 101, 102, 103), a talker extraction step (104, 105, 106, 107), a center point calculation step (108), and a motion estimation and motor driving step (109).

The boundary extraction step (100, 101, 102, 103) extracts boundary pixels and boundary line directions from the infrared image signal input in tracking mode. It consists of an infrared image storage step (100, 101) of A/D (analog/digital) converting and storing the infrared image signal input in tracking mode, a boundary pixel extraction step (102) of extracting boundary pixels from the stored infrared image signal, and a boundary line direction extraction step (103) of extracting the boundary line direction as the arc tangent, atan(h, v), of the vertical and horizontal derivatives of each extracted boundary pixel.

Here, a boundary pixel is determined by comparing the magnitude of the horizontal and vertical derivatives of the brightness of each pixel in the set initial window with a preset boundary determination threshold.

The talker extraction step (104, 105, 106, 107) assigns, for every pixel located in the direction 90 degrees from the boundary line direction of each extracted boundary pixel, a specific value to the boundary direction bit corresponding to the extracted boundary line direction, and extracts and stores the talker image according to the number of assigned specific values. It consists of a boundary direction bit allocation step (104) of assigning the specific value, as the boundary direction bit selected by the extracted boundary line direction, to each pixel located 90 degrees from the boundary line direction of the extracted boundary pixel; a talker center point determination step (105) of counting the number of specific values assigned to the boundary direction bits to determine the talker center point; and a talker image generation step (106) of extracting and generating the talker image formed by the boundary pixels that surround the talker center point.

As shown in Fig. 4, the boundary direction bit allocation step consists of a search pixel count setting step (200) of setting the number of search pixels to which boundary direction bits are allocated; a pixel selection step (201) of selecting, according to the set count, a pixel located 90 degrees from the boundary line direction of the extracted boundary pixel; an allocation step (202) of allocating 8 bits to the selected pixel as boundary direction bits; and a boundary direction bit setting step (203) of setting, among the 8 allocated bits, the bit corresponding to the extracted boundary line direction to the specific value.

In the boundary direction bit setting step (203), the specific value is set in one of the 8 allocated boundary direction bits according to the extracted boundary line direction, expressed as (x direction, y direction): the first bit for (-1, -1), the second bit for (0, -1), the third bit for (1, -1), the fourth bit for (1, 0), the fifth bit for (1, 1), the sixth bit for (0, 1), the seventh bit for (-1, 1), and the eighth bit for (-1, 0).

The number of search pixels is set to at least half the number of pixels corresponding to the expected length of the talker's face.

The center point calculation step (108) calculates the center point of the extracted talker image.

The motion estimation and motor driving step (109) estimates motion from the calculated center point of the talker image and moves the camera.

The talker tracking method for a video phone according to the present invention, configured as above, is now described in detail with reference to the accompanying drawings.

First, the boundary extraction step (100, 101, 102, 103) is performed to extract the boundaries of the input infrared image, as detailed below.

Because the user may not want to use the tracking function in some situations once a call has started, the user first decides whether to use it (100). Tracking mode and non-tracking mode can be distinguished simply with a switch: tracking is performed when the switch is ON and is not performed when it is OFF.

When the switch is turned on and tracking mode is entered, the infrared image storage step (101) is performed, in which the infrared image input from the camera is A/D (analog/digital) converted and stored.

That is, the infrared image signal from the camera passes through the A/D converter and becomes a digital image I(x, y) whose values lie within a specified range on a two-dimensional matrix. This digital image is stored in memory and used as the input image.

Boundary pixels are extracted from the stored infrared image signal by a differentiator. This boundary pixel extraction step (102) is described below with reference to Fig. 3.

The stored infrared image signal takes the form of a two-dimensional matrix I(x, y) to which brightness values are assigned.

This image signal is differentiated in the horizontal and vertical directions as in Equations (1) and (2) below.

$$h = I(x-1, y-1) + I(x-1, y) + I(x-1, y+1) - I(x+1, y-1) - I(x+1, y) - I(x+1, y+1) \tag{1}$$

$$v = I(x-1, y-1) + I(x, y-1) + I(x+1, y-1) - I(x-1, y+1) - I(x, y+1) - I(x+1, y+1) \tag{2}$$

That is, as shown in Fig. 3, the horizontal derivative is the difference between the brightness values of the pixels to the left of the reference pixel (x, y), namely A4, A5, and A6, and those of the pixels to its right, namely A1, A2, and A3. The vertical derivative is the difference between the brightness values of the pixels above the reference pixel, namely A4, A7, and A1, and those of the pixels below it, namely A6, A8, and A3.

The horizontal derivative h is computed as in Equation (1) and the vertical derivative v as in Equation (2), and then the magnitude of the derivatives, $\sqrt{h^2 + v^2}$, is computed: h and v are each squared, the squares are summed, and the square root of the sum is taken.

The computed magnitude $\sqrt{h^2 + v^2}$ is compared with the preset boundary determination threshold; if it is greater than the threshold, the reference pixel (x, y) is a boundary pixel.

This differentiation is performed for the entire image signal of one frame.
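The boundary pixel test can be sketched in a few lines. The sketch assumes the image is a list of rows indexed as `img[y][x]`, that the one-pixel border of the frame is skipped (the text does not say how the border is handled), and that the function names and threshold value are illustrative only.

```python
import math


def derivatives(img, x, y):
    """Horizontal (h) and vertical (v) derivatives at (x, y): left column
    minus right column, and upper row minus lower row, of the 3 x 3
    neighbourhood."""
    h = (img[y - 1][x - 1] + img[y][x - 1] + img[y + 1][x - 1]
         - img[y - 1][x + 1] - img[y][x + 1] - img[y + 1][x + 1])
    v = (img[y - 1][x - 1] + img[y - 1][x] + img[y - 1][x + 1]
         - img[y + 1][x - 1] - img[y + 1][x] - img[y + 1][x + 1])
    return h, v


def boundary_pixels(img, threshold):
    """Return the (x, y) positions whose derivative magnitude
    sqrt(h^2 + v^2) exceeds the boundary determination threshold."""
    found = []
    for y in range(1, len(img) - 1):          # skip the border (assumption)
        for x in range(1, len(img[0]) - 1):
            h, v = derivatives(img, x, y)
            if math.hypot(h, v) > threshold:
                found.append((x, y))
    return found


# A 5 x 5 frame with a vertical brightness step between columns 1 and 2.
step = [[0, 0, 10, 10, 10] for _ in range(5)]
print(derivatives(step, 1, 2))    # (-30, 0)
print(boundary_pixels(step, 20))  # pixels in columns 1 and 2
```

Both columns adjacent to the step are reported because the 3 x 3 operator compares the columns on either side of the reference pixel rather than the reference pixel itself.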

After the boundary pixels are extracted, the boundary line direction extraction step (103) is performed, in which the boundary line direction is extracted as the arc tangent, atan(h, v), of the vertical and horizontal derivatives of each extracted boundary pixel.

That is, the arc tangent atan(h, v) of the vertical and horizontal derivatives (h, v) of the extracted boundary pixel is computed to obtain the boundary line direction. The computed arc tangent can take values from -90 degrees to 90 degrees.
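An arc tangent restricted to the range from -90 to 90 degrees can be computed as in the sketch below. The text writes atan(h, v) without stating which derivative is the numerator; this sketch assumes the ratio v / h and treats h = 0 as a vertical direction. The function name is hypothetical.

```python
import math


def boundary_angle_deg(h, v):
    """Angle in degrees, in [-90, 90], from the derivatives (h, v).

    Assumption: the angle is atan(v / h); the source does not state the
    order of the arguments.
    """
    if h == 0:
        if v == 0:
            return 0.0  # no gradient; direction is undefined, return 0
        return 90.0 if v > 0 else -90.0
    return math.degrees(math.atan(v / h))
```

With this single-argument form, (h, v) and (-h, -v) yield the same angle, which is why the result stays within a 180-degree range.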

Using the boundary pixels and boundary line directions extracted in this way, the talker extraction step (104, 105, 106, 107) assigns a specific value to the corresponding boundary direction bits and extracts and stores the talker image. It begins with the boundary direction bit allocation step (104) and the talker center point determination step (105), which are described with reference to Figs. 4, 5a, and 5b.

First, the search pixel count setting step (200) is performed to set the number of search pixels to which boundary direction bits are allocated.

The number of search pixels is set to at least half the number of pixels corresponding to the expected length of the talker's face, because the center point of the talker's face must be extracted in order to detect the talker's movement.

After the number of search pixels is set, the pixel whose boundary direction bits are to be set is selected (201). That is, among the pixels located 90 degrees from the boundary line direction of the extracted boundary pixel, specifically 90 degrees to the left of the boundary line direction, the nearest adjacent pixel is selected first (201). Eight bits are allocated to the selected pixel as boundary direction bits (202).

That is, a register initialized to '00000000' is allocated to each pixel of the input image, using one byte of memory per pixel of the input image I(x, y) as the register.

The boundary line direction, which takes values from -90 degrees to 90 degrees, is divided into eight directions at 45-degree intervals. That is, as shown in Fig. 5a, the boundary line direction is expressed as (x direction, y direction) and classified as (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), or (-1, 0), giving eight directions.

To this end, the horizontal and vertical derivatives (h, v) of the boundary pixels and the corresponding (x direction, y direction) are stored in a lookup table, so that the boundary line direction can be obtained simply from the derivatives (h, v) of a boundary pixel.
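The mapping from derivatives to one of the eight (x direction, y direction) pairs can be sketched as follows. Eight directions at 45-degree intervals span a full circle, so this sketch assumes the two-argument `math.atan2(v, h)` with h taken as the x component and v as the y component; that convention, the function name, and the table order are assumptions, not details given in the text. A lookup table as described could be precomputed by calling this function for each (h, v) pair.

```python
import math

# Directions every 45 degrees, counter-clockwise starting from +x.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]


def quantize_direction(h, v):
    """Return the (x direction, y direction) pair nearest to the angle of
    the derivative vector (h, v)."""
    angle = math.degrees(math.atan2(v, h))   # -180 .. 180 degrees
    index = round(angle / 45.0) % 8          # nearest 45-degree sector
    return DIRECTIONS[index]


print(quantize_direction(10, 0))   # (1, 0)
print(quantize_direction(3, 3))    # (1, 1)
print(quantize_direction(0, -7))   # (0, -1)
```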

Once the boundary line direction has been computed as (x direction, y direction), it is used to set the specific value in the boundary direction bits allocated to each pixel (203), as described with reference to Fig. 5a.

For pixel X1, the boundary line direction is (-1, 1), so '1' is assigned to the seventh bit and the boundary direction bits are set to '00000010'. For pixel X2, the boundary line direction is (0, 1), so '1' is assigned to the sixth bit and the boundary direction bits are set to '00000100' (203). For pixel X3, the boundary line direction is (1, 1), so '1' is assigned to the fifth bit and the boundary direction bits are set to '00001000'. For pixel X4, the boundary line direction is (1, 0), so '1' is assigned to the fourth bit and the boundary direction bits are set to '00010000'. For pixel X5, the boundary line direction is (1, -1), so '1' is assigned to the third bit and the boundary direction bits are set to '00100000'. For pixel X6, the boundary line direction is (0, -1), so '1' is assigned to the second bit and the boundary direction bits are set to '01000000'. For pixel X7, the boundary line direction is (-1, -1), so '1' is assigned to the first bit and the boundary direction bits are set to '10000000'.

When the boundary direction bits are set in this way, pixel X0 receives all eight boundary line directions, so '1' is assigned to all eight bits and its boundary direction bits are set to '11111111'. That is, for pixel X0 every boundary line direction applies, namely (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), and (-1, 0), so its boundary direction bits become '11111111'.
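The bit assignment can be expressed compactly as a mask lookup. In this sketch the "first bit" is taken to be the most significant bit, which matches the patterns quoted in the text ('10000000' for the first bit, '00000010' for the seventh); the names `BIT_ORDER`, `direction_mask`, and `set_direction_bit` are illustrative.

```python
# Direction assigned to the first, second, ..., eighth bit.
BIT_ORDER = [(-1, -1), (0, -1), (1, -1), (1, 0),
             (1, 1), (0, 1), (-1, 1), (-1, 0)]


def direction_mask(direction):
    """Return the 8-bit mask whose single set bit stands for direction."""
    position = BIT_ORDER.index(direction)  # 0 for the first (leftmost) bit
    return 0x80 >> position


def set_direction_bit(register, direction):
    """OR the bit for direction into a pixel's 1-byte register."""
    return register | direction_mask(direction)


print(format(direction_mask((-1, 1)), "08b"))   # 00000010
print(format(direction_mask((-1, -1)), "08b"))  # 10000000

# A pixel that receives all eight directions ends up with every bit set.
register = 0
for d in BIT_ORDER:
    register = set_direction_bit(register, d)
print(format(register, "08b"))                  # 11111111
```

Using OR rather than assignment lets one register accumulate bits from several boundary pixels, which is what allows a pixel such as X0 to collect all eight directions.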

After the boundary direction bits of the selected pixel are set (203), the number of bits holding the specific value '1' among them is counted (204). If the count is greater than 6, the pixel is judged to be the center point, the central part of the talker's face (205, 206).

For example, each of the pixels X1, X2, X3, X4, X5, X6, X7, and X8 in Fig. 5a has only one boundary direction bit set to '1' and therefore cannot be the center point of the talker's face. Pixel X0 in Fig. 5a, however, has all eight boundary direction bits set to '1' and is therefore the talker's center point.

Likewise, as shown in Fig. 5b, seven of the eight direction bits of pixel M can be set to '1', so pixel M is a talker center point.

That is, as shown in Fig. 5b, the boundary direction bits of pixel M are set to '11111011'; the number of '1' bits is 7, which is greater than 6, so pixel M is judged to be the talker's center point (206).
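The counting rule is a population count on the 8-bit register. A minimal sketch follows; the function name and the default parameter are illustrative, while the limit of 6 comes from the text.

```python
def is_center_point(register, limit=6):
    """Return True when more than `limit` of the 8 boundary direction bits
    in the register are set to 1."""
    return bin(register & 0xFF).count("1") > limit


print(is_center_point(0b11111111))  # True: pixel X0, 8 bits set
print(is_center_point(0b11111011))  # True: pixel M, 7 bits set
print(is_center_point(0b00000010))  # False: pixel X1, 1 bit set
```

Accepting seven bits as well as eight tolerates one missing direction, for example where part of the closed boundary around the face was not detected.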

After the boundary direction bits of the selected pixel have been examined, it is checked whether the number of selected pixels equals the number of search pixels (207).

That is, starting from a given boundary pixel, boundary direction bits must be set for as many pixels in the direction 90 degrees from the boundary line direction as the set number of search pixels, so the selected pixel count check (207) is required.

For example, if the talker's face is 20 pixels long, the number of search pixels is 10 or more. If the number of search pixels is set to 10, boundary direction bits must be set for up to 10 pixels located 90 degrees from the boundary line direction, starting from the boundary pixel.

If the boundary direction bits of all ten pixels have not yet been processed, the pixel adjacent to the currently selected pixel, on the side opposite the boundary pixel, is selected (208), and the boundary direction bit allocation step (202) and the boundary direction bit set step (204) are repeated.

If all ten pixels have been processed, the procedure advances along the boundary line and selects the neighboring boundary pixel (209). It then checks whether that boundary pixel has already gone through the boundary direction bit search described above (210). If it has not, the procedure returns to the pixel selection step (201).

If the boundary pixel has already been selected and searched, that is, if every boundary pixel forming one closed curve has been searched, the region enclosed by the boundary pixels surrounding the talker center point determined above is judged to be the talker image (211). The talker image determined in this way is extracted from the one-frame infrared image, generated, and stored (107).

Next, the center point calculation step (108) computes the center point of the extracted and stored talker image, and the motion estimation and motor driving step (109) estimates the talker's motion from the position of that center point and moves the camera accordingly.

As shown in FIG. 6, the talker tracking apparatus for a video phone according to the present invention comprises a motor unit (400), a control unit (500), an image storage unit (600), a boundary direction bit allocation unit (700), a talker generation unit (800), and a center point calculation unit (900).

The image storage unit (600) A/D-converts and stores the infrared image input in tracking mode. It consists of an A/D converter (610), which converts the infrared image signal input in tracking mode under the control of the control unit (500), and a memory (620), which, under the control of the control unit (500), stores the infrared image signal output from the A/D converter (610) and the talker image signal output from the talker generation unit (800).

The memory (620) is preferably of the VRAM type so that it can store the input image signal.

The boundary direction bit allocation unit (700) extracts boundary pixels and boundary line directions from the infrared image stored in the memory (620) of the image storage unit (600), and allocates and sets a specific value in the corresponding boundary direction bit of each pixel. It consists of a boundary pixel extraction unit (710) that extracts boundary pixels from the infrared image signal stored in the memory (620); a boundary line direction extraction unit (720) that extracts the boundary line direction from the boundary pixels extracted by the boundary pixel extraction unit (710); a shifter (730) that shifts and outputs the pixels lying 90 degrees to the boundary line direction of the boundary pixel obtained from the boundary pixel extraction unit (710) and the boundary line direction extraction unit (720); and a boundary direction bit set unit (740) that sets, in each pixel output from the shifter (730), the boundary direction bit corresponding to the extracted boundary line direction.

The talker generation unit (800) generates a talker image according to the number of boundary direction bits set to the specific value in each pixel output from the boundary direction bit set unit (740) of the boundary direction bit allocation unit (700), and stores it in the memory (620) of the image storage unit (600). It consists of a counter (810), which counts the number of bits set to the specific value in each pixel's boundary direction bits, and a result image generation unit (820), which, when the count of the counter (810) reaches a predetermined value or more, judges the pixel to be the talker's center point, generates the talker image, and stores it in the memory (620).

The center point calculation unit (900) scans the talker image stored in the memory (620) of the image storage unit (600), calculates the talker's center point, and outputs it to the control unit (500).

The control unit (500) controls the operation of the A/D converter (610) and the memory (620) of the image storage unit (600), estimates the talker's motion from the center point calculated by the center point calculation unit (900), and controls the motor unit (400) so that the camera moves accordingly.

The motor unit (400) consists of a motor (410) that moves the camera (300) and a motor driver (420) that drives the motor (410) under the control of the control unit (500).

The operation of the talker tracking apparatus for a video phone configured in this way is described below.

First, under the control of the control unit (500), the A/D converter (610) is turned on, converts the NTSC signal coming from the camera (300), and outputs it as a digital signal.

That is, when tracking mode is on and a call begins, the image signal coming from the camera (300) is A/D-converted by the A/D converter (610), and the resulting digital image signal is stored in the memory (620).

The image signal stored in the memory (620) is input to the boundary pixel extraction unit (710). There, the magnitude of the horizontal and vertical differential values (h, v) of each pixel is calculated according to Equations (1) and (2) above and compared with the set threshold for boundary pixel determination, and the boundary pixels are extracted.
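
The thresholding step can be sketched as follows. Equations (1) and (2) are not reproduced in this passage, so the central differences and the Euclidean magnitude used here are assumptions chosen only to make the example concrete; `boundary_pixels` is a hypothetical name.

```python
import math


def boundary_pixels(image, threshold):
    """Return (x, y, h, v) for interior pixels whose gradient magnitude
    exceeds `threshold`.

    `image` is a list of rows of brightness values. h and v are central
    differences (an assumption standing in for Equations (1) and (2)).
    """
    rows, cols = len(image), len(image[0])
    edges = []
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            h = image[y][x + 1] - image[y][x - 1]  # horizontal differential
            v = image[y + 1][x] - image[y - 1][x]  # vertical differential
            if math.hypot(h, v) > threshold:
                edges.append((x, y, h, v))
    return edges


# A vertical brightness step between columns 1 and 2.
frame = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]
edges = boundary_pixels(frame, threshold=50)
```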

The boundary line direction extraction unit (720) takes the boundary pixels extracted by the boundary pixel extraction unit (710) as input and extracts the direction of the boundary line.

That is, the boundary line direction extraction unit (720) takes the arc tangent, atan(h, v), of the vertical and horizontal differential values of each boundary pixel extracted by the boundary pixel extraction unit (710) as the boundary line direction.

As explained above, the extracted boundary line direction can take values from 90 degrees to -90 degrees, and this range is divided into eight classes. The arc tangent value is quantized into one of the (x direction, y direction) pairs (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), and (-1, 0), so that the boundary line direction falls into one of eight classes.

In practice, the arc tangent function is stored as a lookup table in ROM inside the talker tracking apparatus, and the table entry for the given horizontal and vertical differential values (h, v) is used as the result.
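
The quantization and the lookup table can be illustrated with the sketch below. It is not the patent's exact mapping: the text gives the angle range as 90 to -90 degrees, whereas this example uses a full-circle `atan2` so that all eight (x, y) steps are reachable, and the sector boundaries are an assumption. A dictionary stands in for the ROM table.

```python
import math

# Unit steps indexed by 45-degree sector, counter-clockwise from +x.
STEPS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def quantize_direction(h, v):
    """Quantize the direction of (h, v) to the nearest of eight unit steps."""
    angle = math.atan2(v, h)
    sector = round(angle / (math.pi / 4)) % 8
    return STEPS[sector]


def build_lookup_table(max_abs):
    """Precompute the quantized direction for every integer (h, v) pair
    with |h|, |v| <= max_abs, mimicking a ROM lookup table."""
    return {
        (h, v): quantize_direction(h, v)
        for h in range(-max_abs, max_abs + 1)
        for v in range(-max_abs, max_abs + 1)
    }


table = build_lookup_table(3)
```

At run time only `table[(h, v)]` would be consulted, which avoids evaluating the arc tangent per pixel.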

The boundary direction bit set unit (740) represents each of these eight boundary line directions as one bit of an 8-bit register. That is, as explained above, the 8-bit boundary direction bits are '10000000', '01000000', '00100000', '00010000', '00001000', '00000100', '00000010', and '00000001', one pattern for each boundary line direction.
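
A minimal sketch of the direction-to-bit mapping follows. The ordering of directions is taken from claim 6 (with the seventh direction read as (-1, 1)); treating the "first bit" as the most significant bit, so that it prints as '10000000', is an assumption, as are the names used.

```python
# Direction order as enumerated in claim 6: first bit through eighth bit.
DIRECTIONS = [(-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0)]

# Assumption: the "first bit" is the most significant bit of the register.
DIRECTION_BIT = {d: 0x80 >> i for i, d in enumerate(DIRECTIONS)}


def set_direction_bit(register: int, direction) -> int:
    """Return `register` with the bit for `direction` set to '1'."""
    return register | DIRECTION_BIT[direction]


# A pixel reached from boundary pixels in every one of the eight directions.
register = 0
for d in DIRECTIONS:
    register = set_direction_bit(register, d)
```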

The operation of setting the boundary direction bits is as follows.

First, as shown in FIG. 5A, to select each pixel lying 90 degrees to the boundary line direction of a boundary pixel extracted by the boundary pixel extraction unit (710), the shifter (730) shifts the pixels output from the boundary pixel extraction unit (710) and outputs them to the boundary direction bit set unit (740).

As explained above, the boundary direction bit set unit (740) sets to '1' whichever of the input pixel's eight boundary direction bits corresponds to the boundary line direction output from the boundary line direction extraction unit (720), and outputs the result.

The 8-bit boundary direction bits R(x, y) output from the boundary direction bit set unit (740) are input to the counter (810), which counts the number of '1' bits.

If the number of '1' bits counted by the counter (810) is greater than 6, that is, if seven or more lines drawn at 90 degrees to the boundary line direction converge on the pixel, the result image generation unit (820) judges that pixel to be the center point of the talker's face. It then extracts the result image formed by the boundary pixels surrounding the talker center point and generates the talker's face image.

The talker face image generated by the result image generation unit (820) is stored in the memory (620), and the center point calculation unit (900) calculates its center point according to Equations (3) and (4) below.

$$x_{center} = \frac{1}{N} \sum_{i,j} I(i, j) \cdot i \tag{3}$$

$$y_{center} = \frac{1}{N} \sum_{i,j} I(i, j) \cdot j \tag{4}$$

Here, N is the number of pixels that make up the result image, that is, the image of the talker's face.
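
Equations (3) and (4) amount to a centroid over the pixels of the talker image. The sketch below assumes that I(i, j) is 1 for pixels inside the talker image and 0 elsewhere, and that i indexes columns (x) and j indexes rows (y); both conventions are assumptions, since this passage does not define them.

```python
def center_point(mask):
    """Centroid of the non-zero pixels of `mask` (a list of rows).

    Returns (x_center, y_center), or None when the mask is empty.
    """
    n = sum_x = sum_y = 0
    for j, row in enumerate(mask):        # j: row index (y)
        for i, value in enumerate(row):   # i: column index (x)
            if value:
                n += 1
                sum_x += i
                sum_y += j
    if n == 0:
        return None
    return (sum_x / n, sum_y / n)


face_mask = [
    [0, 1, 1],
    [0, 1, 1],
]
center = center_point(face_mask)
```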

The center point calculated in this way is input to the control unit (500), which uses it to estimate the talker's movement and controls the motor driver (420) according to the estimated motion. Driven by the motor driver (420), the motor (410) moves the infrared camera (300) so that it tracks the talker.
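
The text does not specify how the motion is estimated or how the motor is commanded. One simple possibility, shown purely as an assumption, is to take the displacement of the center point between successive frames and issue a pan/tilt command when it exceeds a small deadband. The function names, the deadband value, and the command strings are all hypothetical.

```python
def estimate_motion(previous_center, current_center):
    """Displacement of the talker center point between two frames."""
    return (
        current_center[0] - previous_center[0],
        current_center[1] - previous_center[1],
    )


def motor_command(displacement, deadband=2.0):
    """Map a displacement to (pan, tilt) commands, ignoring small jitter.

    Image y is assumed to grow downward.
    """
    dx, dy = displacement
    pan = "right" if dx > deadband else "left" if dx < -deadband else "hold"
    tilt = "down" if dy > deadband else "up" if dy < -deadband else "hold"
    return pan, tilt


motion = estimate_motion((10.0, 8.0), (15.0, 8.5))
command = motor_command(motion)
```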

Meanwhile, when an infrared image is input from the infrared camera (300), the control unit (500) turns on the A/D converter (610) and the memory (620) so that the image signal coming from the infrared camera (300) can be A/D-converted and stored.

Once the image has been stored in the memory (620), the A/D converter (610) is turned off so that no further image signal from the camera (300) is accepted.

After the result image generation unit (820) has generated the result image, the A/D converter (610) and the memory (620) are turned on again so that the next image signal from the camera (300) can be accepted.

This sequence ensures that the currently stored input image does not change until the result image has been extracted from it and the center point has been calculated.
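
The capture gating described above can be sketched as a small state holder. The class and method names are hypothetical, and the hardware on/off control of the A/D converter is reduced to a boolean flag.

```python
class CaptureGate:
    """Accept one frame, then ignore new frames until processing is done."""

    def __init__(self):
        self.enabled = True   # models the A/D converter being on
        self.frame = None     # models the frame held in memory

    def on_frame(self, frame) -> bool:
        """Store `frame` if capture is enabled; return whether it was stored."""
        if not self.enabled:
            return False
        self.frame = frame
        self.enabled = False  # converter off: stored frame must not change
        return True

    def on_result_ready(self) -> None:
        """Re-enable capture once the result image has been generated."""
        self.enabled = True


gate = CaptureGate()
first = gate.on_frame("frame-1")
second = gate.on_frame("frame-2")  # arrives during processing, ignored
gate.on_result_ready()
third = gate.on_frame("frame-3")
```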

As described above, the talker tracking method and apparatus for a video phone according to the present invention use the magnitude and direction of the boundary pixels extracted from the infrared input image to assign register values to the pixels of each input image, and separate the talker from the background according to those register values. Tracking the talker in this way improves tracking performance.

Claims

A talker tracking method for a video phone, comprising: a boundary extraction step (100, 101, 102, 103) of extracting boundary pixels and their boundary line direction from the infrared image signal input in tracking mode; a talker extraction step (104, 105, 106, 107) of allocating a specific value to the boundary direction bit corresponding to the extracted boundary line direction for every pixel located 90 degrees to the boundary line direction of the extracted boundary pixel, and extracting and storing a talker image according to the number of specific values allocated; a center point calculation step (108) of calculating the center point of the extracted talker image; and a motion estimation and motor driving step (109) of estimating motion from the calculated center point of the talker image and moving the camera.

The method of claim 1, wherein the boundary extraction step (100, 101, 102, 103) comprises: an infrared image storage step (100, 101) of A/D (analog/digital) converting and storing the infrared image signal input in tracking mode; a boundary pixel extraction step (102) of extracting boundary pixels from the stored infrared image signal; and a boundary line direction extraction step (103) of extracting the boundary line direction as the arc tangent value (atan(h, v)) of the vertical and horizontal differential values of the extracted boundary pixels.

The method of claim 2, wherein the boundary pixel is determined by comparing the magnitude of the horizontal and vertical differential values of the brightness of each pixel within a set initial window against a threshold for boundary determination.

The method of claim 1, wherein the talker extraction step (104, 105, 106, 107) comprises: a boundary direction bit allocation step (104) of allocating, to each pixel located 90 degrees to the boundary line direction of the extracted boundary pixel, a specific value in the boundary direction bit corresponding to the extracted boundary line direction; a talker center point determination step (105) of determining the talker center point by counting the number of specific values allocated to the boundary direction bits; and a talker image generation step (106) of extracting and generating the talker image formed by the boundary pixels surrounding the talker center point.

5. The method of claim 4, wherein the boundary direction bit allocation step comprises: a search pixel number setting step of setting the number of search pixels to which boundary direction bits are to be allocated; a pixel selection step (201) of selecting, according to the set number of search pixels, a pixel located 90 degrees to the boundary line direction of the extracted boundary pixel; an allocation step (202) of allocating 8 bits to the selected pixel as boundary direction bits; and a boundary direction bit set step (203) of setting to a specific value the bit, among the allocated 8 boundary direction bits, that corresponds to the extracted boundary line direction.

6. The method of claim 5, wherein the boundary direction bit set step (203) sets the specific value in one of the allocated 8 boundary direction bits according to the extracted boundary line direction in the (x, y) directions: the first bit when the direction is (-1, -1); the second bit when it is (0, -1); the third bit when it is (1, -1); the fourth bit when it is (1, 0); the fifth bit when it is (1, 1); the sixth bit when it is (0, 1); the seventh bit when it is (-1, 1); and the eighth bit when it is (-1, 0).

7. The method of claim 5, wherein the number of search pixels is set to at least one half of the number of pixels corresponding to the expected length of the talker's face.

A talker tracking apparatus for a video phone, comprising: image storage means (600) for A/D converting and storing the infrared image input in tracking mode; boundary direction bit allocation means (700) for extracting boundary pixels and boundary line directions from the infrared image stored in the image storage means (600), and allocating and setting a specific value in the corresponding boundary direction bit of each pixel; talker generation means (800) for generating a talker image according to the number of specific values allocated to the boundary direction bits of each pixel output from the boundary direction bit allocation means (700), and storing it in the image storage means (600); center point calculation means (900) for scanning the talker image stored in the image storage means (600) and calculating the talker's center point; and control means (500) for controlling the operation of the image storage means (600) and controlling the movement of the camera by estimating the talker's motion from the center point calculated by the center point calculation means (900).

The apparatus of claim 8, wherein the image storage means (600) comprises: an A/D converter (610) for A/D converting the infrared image signal input in tracking mode under the control of the control means (500); and a memory (620) for storing, under the control of the control means (500), the infrared image signal output from the A/D converter (610) and the talker image signal output from the talker generation means (800).

The apparatus of claim 8, wherein the boundary direction bit allocation means (700) comprises: a boundary pixel extraction unit (710) for extracting boundary pixels from the infrared image signal stored in the image storage means (600); a boundary line direction extraction unit (720) for extracting the boundary line direction from the boundary pixels extracted by the boundary pixel extraction unit (710); a shifter (730) for shifting and outputting the pixels located 90 degrees to the boundary line direction of the boundary pixel obtained from the boundary pixel extraction unit (710) and the boundary line direction extraction unit (720); and a boundary direction bit set unit (740) for setting, in each pixel output from the shifter (730), the boundary direction bit corresponding to the extracted boundary line direction.

11. The apparatus of claim 8, wherein the talker generation means (800) comprises: a counter (810) for counting the number of specific values allocated to the boundary direction bits of each pixel output from the boundary direction bit allocation means (700); and a result image generation unit (820) for judging a pixel to be the talker's center point when the count of the counter (810) is equal to or greater than a predetermined value, generating the talker image, and storing it in the image storage means (600).