KR100215206B1

KR100215206B1 - Video conference system

Info

Publication number: KR100215206B1
Application number: KR1019960048090A
Authority: KR
Inventors: 정성학
Original assignee: 전주범; 대우전자주식회사
Priority date: 1996-10-24
Filing date: 1996-10-24
Publication date: 1999-08-16
Also published as: KR19980028900A

Abstract

The present invention relates to a video conference system that recognizes a speaker from an infrared image and then tracks that speaker.

The method of the invention comprises: a model recognition step of receiving model image signals, each corresponding to an infrared image of one of the one or more speakers in a video conference, and extracting and recognizing the contour of each model face; a model selection step of selecting one of the recognized model image signals; a speaker recognition step of extracting and recognizing the contour of a speaker's face from the speaker's infrared image signal input through a camera; a matching step of matching the contour of the model face in the selected model image signal against the contour of the speaker's face in the recognized speaker image signal; and a speaker tracking step of tracking the matched speaker and moving the camera. The system comprises: recognition means that receives the model image signals and the speaker's infrared image signal and extracts and recognizes the face contours of each model and of the speaker; a CPU that controls the recognition means, matches the extracted model contour against the speaker's face contour, and directs tracking of the matched speaker; and a motor driver that changes the direction of the camera under control of the CPU.

Because the invention recognizes a speaker from an infrared image and then tracks that speaker, the camera can follow the speaker even when the speaker moves during a video conference.

Description

VIDEO CONFERENCE SYSTEM

The present invention relates to a video conference system, and more particularly to a video conference system having a speaker recognition function.

An object of the invention is to provide a video conference system with speaker recognition and tracking functions, so that a speaker who moves away from the front of the camera is still tracked and can therefore move freely.

The present invention relates to a video conference system, and more particularly to a video conference system that recognizes a speaker from an infrared image and then tracks that speaker.

In general, a video conference system allows participants to hold a meeting through a screen even when they are far apart.

A speaker therefore had to be positioned in front of the camera in order to be shown on the screen to the remote party.

That is, in a conventional video conference system the speaker must stay in front of the camera during the conference, so the speaker cannot move about.

To solve this problem, an object of the present invention is to provide a video conference system that recognizes a speaker from an infrared image and then tracks the speaker, allowing the speaker to move freely.

FIG. 1 is a flowchart showing the model recognition step according to the present invention.

FIG. 2 is a flowchart showing the speaker recognition, matching, and tracking steps according to the present invention.

FIG. 3a is a diagram for explaining the boundary extraction step of FIGS. 1 and 2.

FIG. 3b is a diagram for explaining the thinning step of FIGS. 1 and 2.

FIG. 3c is a diagram for explaining the straight-line approximation step of FIGS. 1 and 2.

FIG. 4 is a detailed flowchart of the tracking step of FIG. 2.

FIG. 5 is a block diagram of the video conference system according to the present invention.

Description of reference numerals for the main parts of the drawings

400: infrared camera
500: recognition unit
510: A/D converter
520: memory
530: boundary extraction unit
540: boundary direction calculation unit
550: thinning unit
560: straight-line approximation unit
570: linear feature extraction unit
600: CPU
610: ROM
700: motor driver
800: motor

To achieve the above object, the video conference system according to the present invention is characterized in that it operates through: a model recognition step of receiving model image signals, each corresponding to an infrared image of one of the one or more speakers in a video conference, and extracting and recognizing the contour of each model face; a model selection step of selecting one of the recognized model image signals; a speaker recognition step of extracting and recognizing the contour of a speaker's face from the speaker's infrared image signal input through a camera; a matching step of matching the contour of the model face in the selected model image signal against the contour of the speaker's face in the recognized speaker image signal; and a speaker tracking step of tracking the matched speaker and moving the camera. The system is further characterized in that it comprises: recognition means that receives the model image signals and the speaker's infrared image signal and extracts and recognizes the face contours of each model and of the speaker; a CPU that controls the recognition means, matches the extracted model contour against the speaker's face contour, and directs tracking of the matched speaker; and a motor driver that changes the direction of the camera under control of the CPU.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

As shown in FIGS. 1 and 2, the video conference system according to the present invention operates through a model recognition step (100 to 107), a model selection step (200), a speaker recognition step (201 to 207), a matching step (208, 209), a speaker tracking step (210), and a model change step (211, 212).

As shown in FIG. 1, the model recognition step (100 to 107) receives model image signals, each corresponding to an infrared image of one of the one or more speakers in the video conference, and extracts and recognizes the contour of each model face. It consists of: a model image signal storage step (100) of A/D-converting and storing the model image signal corresponding to the infrared image of a speaker to be recognized; a boundary extraction step (101, 102) of computing boundary pixels and boundary directions from the stored model image signal; a thinning step (103) of thinning the extracted boundary pixels using the computed boundary directions; a straight-line linear feature extraction step (104, 105, 106) of tracing the thinned boundary pixels, approximating them by straight lines, and extracting and storing the linear features of those lines; and a model presence determination step (107) of determining whether there is a model image signal for another speaker to be recognized and, if so, returning to the model image signal storage step (100).

Here, a boundary pixel is determined by comparing the vertical and horizontal derivative values of the stored model image signal against a preset boundary-detection threshold, and the boundary direction is computed as the arctangent, arctan(h/v), of the horizontal and vertical derivative values of the boundary pixel.

The model selection step (200) selects one of the recognized model image signals.

As shown in FIG. 2, the speaker recognition step (201 to 207) extracts and recognizes the contour of the speaker's face from the speaker's infrared image signal input through the camera. It consists of: a speaker image signal storage step (201) of A/D-converting and storing the speaker's infrared image signal input through the camera; a boundary extraction step (202, 203) of computing boundary pixels and boundary directions from the stored speaker image signal; a thinning step (204) of thinning the extracted boundary pixels using the computed boundary directions; and a straight-line linear feature extraction step (205, 206, 207) of tracing the thinned boundary pixels, approximating them by straight lines, and extracting and storing the linear features of those lines.

Here, a boundary pixel is determined by comparing the vertical and horizontal derivative values of the stored speaker image signal against a preset boundary-detection threshold, and the boundary direction is computed as the arctangent, arctan(h/v), of the horizontal and vertical derivative values of the boundary pixel.

The matching step (208, 209) matches the contour of the model face in the selected model image signal against the contour of the speaker's face in the recognized speaker image signal.

The speaker tracking step (210) tracks the matched speaker and moves the camera. As shown in FIG. 4, it consists of: a window setting step (300) of setting a window of fixed size that contains the speaker's face at the initial position of the matched speaker; a binarization step (301) of binarizing the pixel values inside the window; a threshold setting step (302) of setting a threshold for tracking the center point of the matched speaker; a binarized value search step (303) of checking whether a binarized pixel value is greater than the set threshold; a camera movement and tracking step (304) of moving the camera to perform tracking when the search finds a value greater than the threshold; and an untrackable-case handling step (305, 306) of returning to the speaker recognition step (201 to 207) to re-recognize and re-track the speaker when the speaker can no longer be tracked during the camera movement and tracking step (304).

The model change step (211, 212) applies when a different one of the recognized model image signals is selected: the camera is moved and processing returns to the speaker recognition and matching steps (201 to 209).

The detailed operation of the video conference system according to the present invention is now described with reference to FIGS. 1 to 5.

First, an infrared model image signal for every speaker attending the video conference must be captured through the camera, and the face contour of each model extracted.

That is, the model recognition step (100 to 107) is performed: the model image signals, each corresponding to the infrared image of a reference speaker, are received and the contour of each model face is extracted and recognized, as described below with reference to the drawings.

First, the model image signal corresponding to the infrared image of a speaker to be recognized during the video conference is captured through the infrared camera, A/D-converted, and stored (100).

The signal from the infrared camera, for example an NTSC signal, becomes after A/D conversion a digital image I(x, y) whose values on a two-dimensional matrix lie within a specified range. This image is stored in memory and used as the input image.

After the model image signal storage step (100), the boundary extraction step (101, 102) is performed to compute boundary pixels and boundary directions from the stored model image signal.

The stored model image signal is differentiated in the horizontal and vertical directions to obtain the horizontal and vertical derivative values (h, v), and the magnitude of the derivative, $\sqrt{h^2 + v^2}$, is compared against the preset boundary-detection threshold to find boundary pixels.

The digital image signal I(x, y), whose values on a two-dimensional matrix lie within a specified range, is differentiated in the horizontal and vertical directions according to Equations (1) and (2) below.

$$h = I(x-1, y-1) + I(x-1, y) + I(x-1, y+1) - I(x+1, y-1) - I(x+1, y) - I(x+1, y+1) \tag{1}$$

$$v = I(x-1, y-1) + I(x, y-1) + I(x+1, y-1) - I(x-1, y+1) - I(x, y+1) - I(x+1, y+1) \tag{2}$$

That is, as shown in FIG. 3a, the horizontal derivative is the difference between the brightness values (A4, A5, A6) of the pixels to the left of the reference pixel (x, y) and the brightness values (A1, A2, A3) of the pixels to its right, and the vertical derivative is the difference between the brightness values (A4, A7, A1) of the pixels above the reference pixel and the brightness values (A6, A8, A3) of the pixels below it.

The horizontal derivative value h is computed by Equation (1) and the vertical derivative value v by Equation (2), and the magnitude of the derivative is then computed: h and v are each squared, the squares are added, and the square root of the sum is taken, giving $\sqrt{h^2 + v^2}$.

The computed magnitude $\sqrt{h^2 + v^2}$ is compared against the preset boundary-detection threshold; if it is larger than the threshold, the reference pixel (x, y) is a boundary pixel.

This magnitude computation and comparison is carried out for every pixel of the model image signal, extracting all boundary pixels in one frame of the model image signal.
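
The boundary extraction described above can be illustrated with a short sketch. It is a minimal Python illustration under stated assumptions, not the circuit of the invention: the function name `boundary_pixels`, the nested-list image layout `image[y][x]`, and the choice to skip the one-pixel image border are all introduced here for the example. The horizontal difference is taken as the left column minus the right column and the vertical difference as the upper row minus the lower row, following the 3×3 neighborhood described in the text.

```python
import math


def boundary_pixels(image, threshold):
    """Return {(x, y): (h, v, magnitude)} for interior pixels whose
    derivative magnitude exceeds `threshold`.

    `image[y][x]` holds the brightness I(x, y); this layout is an assumption.
    """
    height, width = len(image), len(image[0])

    def I(x, y):
        return image[y][x]

    edges = {}
    # The outermost border is skipped because it lacks a full 3x3 neighborhood.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal derivative: left column minus right column.
            h = (I(x - 1, y - 1) + I(x - 1, y) + I(x - 1, y + 1)
                 - I(x + 1, y - 1) - I(x + 1, y) - I(x + 1, y + 1))
            # Vertical derivative: upper row minus lower row.
            v = (I(x - 1, y - 1) + I(x, y - 1) + I(x + 1, y - 1)
                 - I(x - 1, y + 1) - I(x, y + 1) - I(x + 1, y + 1))
            magnitude = math.sqrt(h * h + v * v)
            if magnitude > threshold:
                edges[(x, y)] = (h, v, magnitude)
    return edges
```

On a sharp vertical step, an operator of this kind tends to mark two adjacent columns rather than one, which is consistent with the boundaries of two to three pixels in width that the thinning step is meant to reduce.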

After the boundary pixels, and hence the boundary lines, have been extracted, the boundary direction is computed (102) as the arctangent, $\arctan(h/v)$, of the horizontal and vertical derivative values of each boundary pixel.

Specifically, a lookup table prepared in the internal ROM returns the arctangent for a given value of h/v, and the boundary direction is obtained from it (102).
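
As an illustration of the lookup-table approach, the sketch below precomputes arctangent values once and then answers each query by indexing. The table size, the clamping limit for h/v, the handling of v = 0, and the names `ARCTAN_TABLE` and `boundary_direction` are all assumptions made for this example; the source does not say how the ROM table is organized.

```python
import math

RATIO_LIMIT = 8.0   # hypothetical clamp applied to h/v
TABLE_SIZE = 257    # hypothetical number of table entries

# Built once, standing in for the ROM contents: entry k holds arctan(ratio_k)
# in degrees for evenly spaced ratios in [-RATIO_LIMIT, RATIO_LIMIT].
ARCTAN_TABLE = [
    math.degrees(math.atan(-RATIO_LIMIT + 2 * RATIO_LIMIT * k / (TABLE_SIZE - 1)))
    for k in range(TABLE_SIZE)
]


def boundary_direction(h, v):
    """Approximate arctan(h / v) in degrees by table lookup."""
    if v == 0:
        # The ratio is unbounded; return the limiting angle (sign follows h).
        return 90.0 if h >= 0 else -90.0
    ratio = max(-RATIO_LIMIT, min(RATIO_LIMIT, h / v))
    index = round((ratio + RATIO_LIMIT) * (TABLE_SIZE - 1) / (2 * RATIO_LIMIT))
    return ARCTAN_TABLE[index]
```

A coarser or finer table trades memory against angular resolution; ratios beyond the clamp are all mapped to the last table entry in this sketch.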

Once the boundary pixels and boundary directions have been obtained, the boundary pixels are thinned using the boundary directions (103).

Owing to the nature of infrared images, the boundary extracted by differentiation is not one pixel wide but about two to three pixels wide, so the boundary pixels are thinned to obtain a more accurate boundary.

This is described with reference to FIG. 3b.

For each boundary pixel, the pixels lying at 90 degrees to the boundary direction are examined; only the one with the largest value is retained and the rest are removed from the set of boundary pixels.

That is, as shown in FIG. 3b, the pixels in the direction (b) at 90 degrees to the boundary direction (a) of boundary pixel (e) are examined, and only the boundary pixel with the largest derivative magnitude $\sqrt{h^2 + v^2}$ is kept; the other boundary pixels are discarded.
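
The thinning rule can be sketched as a search along the normal to the boundary. This is a hedged illustration only: the function name `thin_boundary`, the dictionary inputs, the `reach` parameter (how many pixels to examine on each side), and the tie-breaking rule for equal magnitudes are assumptions introduced here, not details given in the source.

```python
import math


def thin_boundary(magnitudes, directions, reach=2):
    """Keep only the strongest boundary pixel across the boundary's width.

    magnitudes: {(x, y): derivative magnitude} for boundary pixels.
    directions: {(x, y): boundary direction in degrees}.
    reach: hypothetical number of pixels examined on each side.
    """
    kept = set()
    for (x, y), mag in magnitudes.items():
        # Step along the direction at 90 degrees to the boundary direction.
        normal = math.radians(directions[(x, y)] + 90.0)
        dx, dy = round(math.cos(normal)), round(math.sin(normal))
        is_peak = True
        for step in range(1, reach + 1):
            ahead = magnitudes.get((x + step * dx, y + step * dy))
            behind = magnitudes.get((x - step * dx, y - step * dy))
            # Strict on one side and non-strict on the other, so that two
            # equal neighbors do not both survive or both disappear.
            if (ahead is not None and ahead > mag) or \
               (behind is not None and behind >= mag):
                is_peak = False
                break
        if is_peak:
            kept.add((x, y))
    return kept
```

With `reach=2` the search covers boundaries up to three pixels wide, matching the width the text attributes to infrared images; a wider boundary would need a larger value.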

After the thinning step (103), the straight-line linear feature extraction step (104, 105, 106) traces the thinned boundary pixels, approximates them by straight lines, and extracts and stores the linear features of those lines.

That is, as shown in FIG. 3c, a straight line (l1) is started at the current pixel (e1) and the boundary pixels (e) are traced; when the distance between a pixel (e) and the line (l1) reaches or exceeds a set distance, line (l1) is ended and a new line (l2) is started. The boundary is approximated by straight lines in this way.

After the approximating lines (l1, l2) have been extracted by tracing the boundary pixels, their linear features are extracted (105). If n approximating lines are extracted, the linear features are expressed as [(line 1, start point, end point, length, slope), (line 2, start point, end point, length, slope), ..., (line n, start point, end point, length, slope)].
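
The tracing and feature extraction can be sketched as follows. The source does not state exactly how the running line is defined while tracing, so this example assumes the chord from the segment's start pixel to the newest pixel and closes the segment once any traced pixel strays from that chord by more than a tolerance. The names `approximate_lines`, `line_features`, and `tolerance`, and the choice to store slope as an angle in degrees, are likewise assumptions.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def line_features(a, b):
    """Feature record for one approximating line from a to b."""
    (ax, ay), (bx, by) = a, b
    return {
        "start": a,
        "end": b,
        "length": math.hypot(bx - ax, by - ay),
        # Stored as an angle so that vertical lines stay finite.
        "slope": math.degrees(math.atan2(by - ay, bx - ax)),
    }


def approximate_lines(traced_pixels, tolerance=1.0):
    """Split an ordered chain of boundary pixels into straight segments."""
    segments = []
    start = 0
    for i in range(2, len(traced_pixels)):
        a, b = traced_pixels[start], traced_pixels[i]
        # Close the current line when a traced pixel leaves the chord a-b.
        if any(point_line_distance(p, a, b) > tolerance
               for p in traced_pixels[start + 1:i]):
            segments.append((traced_pixels[start], traced_pixels[i - 1]))
            start = i - 1
    if len(traced_pixels) >= 2:
        segments.append((traced_pixels[start], traced_pixels[-1]))
    return [line_features(a, b) for a, b in segments]
```

Each returned record carries the start point, end point, length, and slope listed in the text; the position of a record in the returned list plays the role of the line number.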

The number of lines that can be extracted from the face contour depends on the input image, but in a video conference the number of lines n extracted per face is about 10 to 20.

The linear features extracted from the model image signal are stored in memory and later used for matching against the linear features extracted from the incoming speaker image signal.

After the linear features have been extracted for one model image signal, it is determined whether there is a model image signal for another speaker to be recognized. If another speaker will attend and another model image signal exists, processing returns to the model image signal storage step (100) and linear features are extracted for that signal as well. When linear features have been extracted for all model image signals and none remain, the model recognition step ends and the model selection step (200) is performed to select one of the recognized model image signals.

That is, the model whose image is to be transmitted through the camera is selected (200).

For example, from among the many speakers attending the video conference, the one who does most of the talking is selected.

Once a model has been selected, the attendee who is the same person as the selected model must be found among the conference attendees and tracked.

Accordingly, as shown in FIG. 2, the speaker recognition step (201 to 207) is performed to extract and recognize the contour of the speaker's face from the speaker's infrared image signal input through the camera.

The speaker recognition step is carried out in the same way as the model recognition step above.

That is, as shown in FIG. 2, the speaker image signal storage step (201) first A/D-converts and stores the speaker's infrared image signal input through the camera, and the boundary extraction step (202, 203) then computes boundary pixels and boundary directions from the stored speaker image signal.

The boundary pixels and boundary directions are extracted in the same way as in the boundary extraction step of the model recognition step.

That is, as shown in FIG. 3a, the speaker's digital image signal I(x, y), A/D-converted so that its values on a two-dimensional matrix lie within a specified range, is differentiated in the horizontal and vertical directions according to Equations (1) and (2) to obtain the derivative values (h, v), and the magnitude $\sqrt{h^2 + v^2}$ is compared against the preset boundary-detection threshold to find boundary pixels (202).

The boundary direction is then computed as the arctangent, $\arctan(h/v)$, of the horizontal and vertical derivative values of each extracted boundary pixel (203).

As in the model recognition step, the thinning step (204) thins the extracted boundary pixels using the computed boundary directions, as shown in FIG. 3b, and the straight-line linear feature extraction step (205, 206, 207) traces the thinned boundary pixels, approximates them by straight lines, and extracts and stores the linear features, as shown in FIG. 3c.

The speaker's linear features extracted and stored in this way are compared with the linear features of the selected model, previously extracted from the model image signal, to determine whether they match.

That is, the linear features of the model face contour in the selected model image signal are compared with the linear features of the speaker's face contour in the recognized speaker image signal to decide whether they match; the method used for this is a tree search.

That is, every model line that could match is assigned to each line of the input image, the value m is computed according to Equation (3) below, and a match is considered to have been found when m is smallest.

$$m = \frac{1}{\sum_{i}^{n} \left( \lvert \text{input line length}_i - \text{model line length}_i \rvert + \lvert \text{input line slope}_i - \text{model line slope}_i \rvert \right)} \tag{3}$$
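
A rough sketch of the comparison behind Equation (3) is given below. It makes several assumptions that the source does not confirm: lines are already paired by position (the tree search over possible assignments is not reproduced), each line is reduced to a `(length, slope)` tuple, and candidates are ranked by the summed absolute differences directly, since a smaller sum means closer agreement. How that sum maps onto the value m of Equation (3) is left open here. The names `line_set_difference` and `best_match` are hypothetical.

```python
def line_set_difference(input_lines, model_lines):
    """Sum of |length difference| + |slope difference| over paired lines.

    Each line is a (length, slope) tuple; pairing by position is an
    assumption standing in for the tree search described in the text.
    """
    if len(input_lines) != len(model_lines):
        raise ValueError("this sketch expects the same number of lines")
    return sum(abs(li - lm) + abs(si - sm)
               for (li, si), (lm, sm) in zip(input_lines, model_lines))


def best_match(candidates, model_lines):
    """Return the key of the candidate whose lines differ least from the model."""
    return min(candidates,
               key=lambda name: line_set_difference(candidates[name], model_lines))
```

In practice the length and slope terms have different units, so some weighting between them would probably be needed; the sketch adds them unweighted, as the terms appear in the equation.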

If the matching step against the selected model does not produce a match, the camera is moved to select another speaker, and processing is repeated from the speaker image signal storage step (201) with that speaker's image signal as input, until the speaker with the same linear features as the selected model is found.

Once the speaker who matches the model has been found through this process, tracking is performed by a center-point tracking method.

That is, once matching succeeds the initial position of the speaker can be estimated, and this initial information is used to move the camera and track the speaker (210), as described in detail with reference to FIG. 4.

First, the window setting step (300) sets a window of fixed size that contains the speaker's face at the initial position of the matched speaker.

Center-point tracking is normally performed over the whole image. In the present invention, however, the initial position is known from the recognition function, so tracking is confined to the vicinity of the initial position. That is, a window large enough to contain a person's face is set, and tracking is performed only inside this window.

The window need only be large enough to contain the face in the image while excluding the faces of other people who were rejected in the speaker recognition step.

After the window setting step (300), the binarization step (301) binarizes the pixel values inside the window, and the threshold setting step (302) sets the threshold (Thres) for tracking the center point of the matched speaker.

With a threshold (Thres) set inside the window, the face region can be extracted easily, so the threshold is set and the binarized value search step is performed as shown in Equation (4) below.

$$I(x, y) = 1 \quad \text{if } I(x, y) > \mathrm{Thres} \tag{4}$$

By searching the binarized values of Equation (4) and continually following the part of the image whose value is '1', the speaker can be tracked. That is, the motor is driven so that the camera moves toward the part marked '1' by the initial recognition; when the speaker moves, the part marked '1' moves as well, and this movement information is passed to the motor again to move the camera (304).

Because the window moves together with the camera, tracking can continue.
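
The windowed center-point tracking can be sketched as below. The sketch binarizes with a greater-than comparison, in line with the description of the binarized value search step, and reports how far the center of the '1' pixels lies from the window center. The function name `track_in_window`, the `(left, top, width, height)` window tuple, and the idea of handing the resulting offset to the motor driver as a movement command are assumptions for illustration.

```python
def track_in_window(image, window, threshold):
    """Return the (dx, dy) offset of the centroid of above-threshold pixels
    from the window center, or None if no pixel exceeds the threshold.

    image[y][x] holds brightness; window = (left, top, width, height).
    """
    left, top, width, height = window
    sum_x = sum_y = count = 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            if image[y][x] > threshold:   # binarized value is 1
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None   # nothing to follow inside the window
    center_x = left + (width - 1) / 2
    center_y = top + (height - 1) / 2
    return (sum_x / count - center_x, sum_y / count - center_y)
```

A nonzero offset would be translated into a pan or tilt command, and a `None` result is one possible signal that tracking has been lost and recognition should be run again.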

Meanwhile, if the target face moves abruptly while the camera is moving to track it, or if two or more faces are captured within the window, the recognizer must be reactivated so that the target face is recognized again through the recognition function.

That is, if the talker cannot be tracked during the camera moving and tracking step (304), the process returns to the talker recognition steps (201 to 207) and performs the tracking-failure handling steps (305, 306), in which the talker is recognized and tracked again.
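The decision to fall back from tracking to recognition might be sketched as below. The description names only the two triggering conditions (an abrupt movement of the target face, or two or more faces inside the window); the function name `needs_rerecognition`, the `max_jump` pixel limit, and the assumption that the number of faces in the window is supplied by some other routine are hypothetical.

```python
import math


def needs_rerecognition(prev_center, center, faces_in_window, max_jump=20.0):
    """Decide whether tracking should hand control back to the recognizer.

    prev_center, center : (x, y) tuples, or None when no center was found
    faces_in_window     : face count reported by a separate (hypothetical) check
    max_jump            : assumed limit, in pixels, for frame-to-frame movement
    """
    if center is None:
        return True  # the '1' region was lost entirely
    if faces_in_window >= 2:
        return True  # another face has entered the window
    if prev_center is not None and math.dist(prev_center, center) > max_jump:
        return True  # abrupt movement of the target face
    return False
```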

Next, as shown in FIG. 7, the video conference system comprises a recognition unit (500), a CPU (600), and a motor driver (700).

The recognition unit (500) receives, during a video conference, the model image signals corresponding to the infrared images of one or more talkers and the infrared image signal of the talker, and extracts and recognizes the facial contours of each model and of the talker. It comprises: an A/D (Analog/Digital) converter (510) that, under the control of the CPU (600), A/D-converts each model image signal corresponding to the infrared images of one or more talkers and the talker image signal; a memory (520) that, under the control of the CPU (600), stores the image signal output from the A/D converter (510); a boundary extracting unit (530) that extracts boundary pixels from the image signal output from the memory (520); a boundary line direction calculating unit (540) that calculates the boundary line direction using the boundary pixels extracted by the boundary extracting unit (530); a thinning unit (550) that thins the boundary pixels extracted by the boundary extracting unit (530) using the boundary line direction calculated by the boundary line direction calculating unit (540); a linear approximation unit (560) that approximates straight lines while following the boundary pixels thinned by the thinning unit (550); and a linear feature extraction unit (570) that extracts the linear features of the straight lines output from the linear approximation unit (560) and stores them in the memory (520).

The CPU (600), which includes a ROM (610), controls the operation of the recognition unit (500), matches the extracted facial contours of the model image signal and of the talker, and controls tracking of the matched talker.

The motor driver (700) drives a motor (800) to change the direction of the camera under the control of the CPU (600).

The operation of the hardware of the video conference system according to the present invention, configured as described above, will now be described.

First, the signal from the infrared camera (400) must be stored in memory to produce an input image. The signal coming from the infrared camera, for example an NTSC signal, passes through the A/D converter (510) and becomes a digital image I(x, y) whose values lie within a specified range on a two-dimensional matrix. This image is stored in the memory (520), which consists of RAM, and is used as the input image.

Once the digital image has been stored in the memory (520), the CPU (600) turns off the function by which the A/D converter (510) stores signals in the memory (520) until the linear feature extraction unit (570) has finished operating, so that the input image does not change.

Naturally, once the linear feature extraction unit (570) has completed its linear feature extraction, the A/D converter (510) is turned on again to accept a new input image.

Boundary pixels are extracted from the image signal I(x, y) output from the memory (520) by differentiation in the boundary extracting unit (530). That is, horizontal and vertical differentiation is performed according to Equations (1) and (2) above, as shown in FIG. 3, to obtain the derivative values (h, v); the magnitude of the derivative, (h² + v²)^(1/2), is then computed and compared with the preset boundary-detection threshold to find the boundary pixels.
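A rough sketch of this boundary pixel test follows. Equations (1) and (2) and the masks of FIG. 3 are not reproduced in this passage, so simple central differences are substituted for them here as an assumption; the function name `extract_boundary_pixels` and the dictionary output format are likewise illustrative only.

```python
import math


def extract_boundary_pixels(image, edge_thres):
    """Map each boundary pixel (x, y) to its (h, v, magnitude).

    image      : list of rows, indexed as image[y][x] (assumed layout)
    edge_thres : boundary-detection threshold on the derivative magnitude
    Central differences stand in for Equations (1) and (2); this is an
    assumption, not the operator defined in the patent.
    """
    height, width = len(image), len(image[0])
    boundary = {}
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            h = image[y][x + 1] - image[y][x - 1]  # horizontal derivative
            v = image[y + 1][x] - image[y - 1][x]  # vertical derivative
            magnitude = math.sqrt(h * h + v * v)   # (h^2 + v^2)^(1/2)
            if magnitude > edge_thres:
                boundary[(x, y)] = (h, v, magnitude)
    return boundary
```

Keeping h and v alongside the magnitude lets the later direction and thinning stages reuse them without recomputing the derivatives.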

The boundary line direction calculating unit (540) then calculates the boundary line direction as the arctangent, arctan(h/v), of the vertical and horizontal derivative values of the boundary pixels extracted in this way.
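The direction calculation can be illustrated with a small helper. The text specifies arctan(h/v); the sketch below uses the two-argument arctangent instead, which agrees with arctan(h/v) when v is positive and avoids a division by zero when v is 0. That substitution, the function name `boundary_direction`, and the choice of degrees as the unit are assumptions.

```python
import math


def boundary_direction(h, v):
    """Return the angle, in degrees, whose tangent is h / v.

    math.atan2(h, v) is used in place of atan(h / v) so that v == 0
    (a purely horizontal derivative) does not raise ZeroDivisionError.
    """
    return math.degrees(math.atan2(h, v))
```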

As shown in FIG. 4, the thinning unit (550) examines, among the boundary pixels extracted by the boundary extracting unit (530), the pixels lying at 90 degrees to the boundary line direction calculated by the boundary line direction calculating unit (540), keeps only the one pixel with the largest boundary pixel value, and deletes the rest from the set of boundary pixels.
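One way to picture this thinning is a non-maximum suppression across the boundary line, sketched below. It is a simplification under stated assumptions: only the two immediate neighbours across the line are compared (FIG. 4 may examine more pixels), the direction across the line is taken to be the derivative vector (h, v) rounded to the nearest of the eight neighbours, and ties are broken in favour of the pixel further along that direction. The function name `thin_boundary` and the input format are hypothetical.

```python
def thin_boundary(boundary):
    """Keep only locally strongest boundary pixels across the boundary line.

    boundary : dict mapping (x, y) -> (h, v, magnitude), with magnitude > 0
    returns  : set of (x, y) pixels that survive thinning
    """
    kept = set()
    for (x, y), (h, v, mag) in boundary.items():
        # Unit step at 90 degrees to the boundary line, i.e. along (h, v).
        dx = round(h / mag)
        dy = round(v / mag)
        # Pixels that are not boundary pixels count as magnitude 0.
        ahead = boundary.get((x + dx, y + dy), (0, 0, 0.0))[2]
        behind = boundary.get((x - dx, y - dy), (0, 0, 0.0))[2]
        # Strict on one side so that a two-pixel-wide tie keeps one pixel.
        if mag > ahead and mag >= behind:
            kept.add((x, y))
    return kept
```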

The boundary pixels thinned by the thinning unit (550) are approximated by straight lines in the linear approximation unit (560), which follows the boundary pixels as shown in FIG. 5, and the linear features of the straight lines are then extracted by the linear feature extraction unit (570).

The linear features of the straight lines are expressed as [(line 1, start point, end point, length, slope), (line 2, start point, end point, length, slope), …, (line n, start point, end point, length, slope)].
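The feature list above maps naturally onto a small record type. The sketch below is illustrative: the names `LineFeature` and `make_line_feature` are hypothetical, and the patent does not say how the slope is encoded, so the ratio dy/dx (with infinity for a vertical line) is an assumption.

```python
import math
from typing import NamedTuple, Tuple


class LineFeature(NamedTuple):
    index: int              # line number (1..n)
    start: Tuple[int, int]  # start point (x, y)
    end: Tuple[int, int]    # end point (x, y)
    length: float           # Euclidean length of the segment
    slope: float            # dy / dx, or math.inf for a vertical line


def make_line_feature(index, start, end):
    """Build one (line, start point, end point, length, slope) record."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    length = math.hypot(dx, dy)
    slope = math.inf if dx == 0 else dy / dx
    return LineFeature(index, start, end, length, slope)
```

A contour is then simply a list of such records, for example `[make_line_feature(1, (0, 0), (3, 4)), make_line_feature(2, (3, 4), (3, 9))]`.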

The linear features of the straight lines extracted for the model in this way are stored back in the memory (520).

For the models, that is, the faces of one or more talkers photographed in advance with an infrared camera, the linear features are extracted and built into a database.

Meanwhile, linear features must also be extracted by the same process from the infrared image signal of the talker currently input through the camera. That is, the talker image signal is stored in the memory (520) via the A/D converter (510) and then passes through the boundary extracting unit (530), the boundary line direction calculating unit (540), the thinning unit (550), the linear approximation unit (560), and the linear feature extraction unit (570), whereby the linear features of the contour around the face are extracted.

The linear features extracted in this way are stored in the memory (520). The CPU (600) then identifies, by matching, the talker who is the same as the one model selected from the plurality of models, and tracks the identified talker using the center point tracking method.

As described above, the video conference system according to the present invention recognizes a talker using an infrared image and then tracks that talker, so the camera can follow the talker even if the talker moves during a video conference.

Claims

A model recognizing step (100 to 107) for extracting and recognizing contours of respective model faces by inputting respective model image signals corresponding to infrared images of one or more talkers during a video conference;

A model selection step (200) for selecting one of the recognized one or more model image signals;

A talker recognizing step (201 to 207) for extracting and recognizing the contour line of the talker's face from the infrared image signal of the talker inputted through the camera;

A matching step (208, 209) for matching the contour of the model face of the selected model video signal with the contour of the talker face of the recognized talker image signal; And

A talker tracking step (210) of tracking the matched talker and moving the camera.

The method of claim 1, further comprising a model changing step (211, 212) of, when another one of the recognized one or more model image signals is selected, moving the camera and proceeding to the talker recognition and matching steps (201 to 209).

3. The method according to claim 1, wherein the model recognition step (100 to 107) comprises:

A model image signal storing step (100) for A / D converting and storing a model image signal corresponding to an infrared image of a talker to be recognized at the time of a video conference;

A boundary extracting step (101, 102) for calculating a boundary pixel and a boundary line direction from the stored model video signal;

A thinning step (103) of thinning the extracted boundary pixels using the extracted boundary direction;

A straight line feature extraction step (104, 105, 106) for extracting and storing a linear feature of a straight line while following the thinned boundary pixel; And

and a step of determining whether there is another model image signal corresponding to the image of another talker to be recognized.

4. The method of claim 3, wherein the boundary pixels are determined by comparing the derivative values in the vertical and horizontal directions of the stored model image signal with a threshold for boundary determination.

5. The method of claim 4, wherein the boundary line direction is calculated as the arctangent, arctan(h/v), of the derivative values in the vertical and horizontal directions of the boundary pixels.

6. The method according to claim 1, wherein the talker recognition step (201 to 207) comprises:

A talker image signal storing step (201) for A/D-converting the infrared image signal of a talker input through a camera and storing the result;

A boundary extracting step (202, 203) for calculating a boundary pixel and a boundary line direction from the stored talker image signal;

A thinning step (204) of thinning the extracted boundary pixels using the extracted boundary direction; And

and a linear feature extraction step (205, 206, 207) for extracting and storing the linear features of straight lines while following the thinned boundary pixels.

7. The method of claim 6, wherein the boundary pixels are determined by comparing the derivative values in the vertical and horizontal directions of the stored talker image signal with a threshold for boundary determination.

8. The method of claim 7,

9. The method of claim 1, wherein the talker tracking step (210) comprises:

A window setting step (300) of setting a window of a predetermined size including the face of the talker at the initial position where the matched talker is located;

A binarization step (301) of binarizing pixel values in the set window;

A threshold setting step (302) of setting a threshold for tracking the center point of the matched talker;

A binarization value search step (303) of checking whether the binarized pixel value is larger than the set threshold; and

And a camera moving and tracking step (304) for moving the camera and performing tracking if the binarized value is greater than the threshold value.

10. The method of claim 9, wherein the talker tracking step (210) further comprises a tracking-failure handling step (305, 306) of, if the talker cannot be tracked during the camera moving and tracking step (304), returning to the talker recognition step (201 to 207) to recognize and track the talker again.

Recognition means (500) for extracting and recognizing contours of faces of respective models and talkers by inputting respective model image signals corresponding to infrared images of one or more talkers during a video conference and infrared image signals of talkers;

A CPU (600) for controlling the operation of the recognition means (500), matching the extracted facial contours of the model image signal and of the talker, and controlling tracking of the matched talker; and

And a motor driver (700) for moving the direction of the camera under the control of the CPU (600).

12. The apparatus of claim 11, wherein the recognition means (500)

An A/D (Analog/Digital) converter (510) for A/D-converting, under the control of the CPU (600), each model image signal corresponding to the infrared images of one or more talkers during a video conference and the talker image signal;

A memory 520 for storing a video signal output from the A / D converter 510 under the control of the CPU 600;

A boundary extracting unit 530 for extracting boundary pixels from the video signal output from the memory 520;

A boundary line direction calculating unit 540 for calculating a boundary line direction using the boundary pixels extracted by the boundary extracting unit 530;

A thinning unit (550) for thinning the boundary pixels extracted by the boundary extracting unit (530) using the boundary line direction calculated by the boundary line direction calculating unit (540);

A linear approximation unit (560) for approximating straight lines while following the boundary pixels thinned by the thinning unit (550); and

And a straight line feature extraction unit (570) for extracting a linear feature of a straight line output from the linear approximation unit (560) and storing the linear feature in the memory (520).