KR20040014053A

KR20040014053A - Mobile station and face detection method for mobile station

Info

Publication number: KR20040014053A
Application number: KR1020020047221A
Authority: KR
Inventors: 이진수; 박기수
Original assignee: 엘지전자 주식회사
Priority date: 2002-08-09
Filing date: 2002-08-09
Publication date: 2004-02-14
Also published as: KR100438283B1

Abstract

PURPOSE: A mobile communication terminal and a method for extracting a face region in a mobile communication terminal are provided that efficiently mount a face region extraction engine and a background switching engine on a DSP chip for the mobile communication terminal and share a small amount of memory, thereby minimizing memory usage and enabling real-time processing by using the internal memory of the DSP chip. CONSTITUTION: Memory is divided into an external memory (110) and an internal memory (162). Program code and data stored in the external memory (110) are transferred to a DSP chip (160) through a traffic controller (140) that has a DMA (Direct Memory Access) function. An instruction cache (161) speeds up program execution. The program codes of an encoder, a decoder, a face region extraction engine, a background switching engine, and an ROI (Region Of Interest)-based bit rate control engine are each assigned to the external memory (110). Infrequently used data are assigned to the external memory (110), and frequently used data are assigned to the internal memory (162). For the face region extraction engine and the background switching engine, the program codes are assigned to the external memory (110) and the data they use are assigned to the internal memory (162).

Description

Mobile communication terminal and method for extracting a face region in a mobile communication terminal {Mobile station and face detection method for mobile station}

The present invention relates to a mobile communication terminal, and more particularly to a mobile communication terminal for video communication that can perform region of interest (ROI) based bit rate control communication and background switching communication by processing face region extraction in real time.

Today, with IMT-2000 at its forefront, mobile communication has become one of the major fields representing the information and communication industry. Mobile communication services continue to improve in quality on the basis of ever-higher bandwidth, and supplementary services are diversifying as well. Accordingly, the role of the mobile communication terminal is becoming more important, and ultra-compact terminals that provide a variety of functions while consuming little power are appearing.

The main processing functions of a mobile communication terminal are encoding the data to be transmitted and decoding the received data. Unlike earlier terminals that transmitted and received only voice data, the recently introduced mobile communication terminals for video communication must encode and decode video. Encoding and decoding video requires far more processing than encoding and decoding voice data. Faster real-time processing therefore needs the help of dedicated hardware, and the hardware used for this is the DSP (Digital Signal Processing) chip. A DSP chip is a dedicated chip that can quickly perform signal processing that is simple but highly repetitive, such as the DCT (Discrete Cosine Transform) used in encoding and decoding. Most recent mobile communication terminals carry such a DSP chip, and the most important feature of a DSP chip for mobile communication terminals is that it can perform a large amount of processing on little power.

Representative DSP chips mounted in such mobile communication terminals include the TMS320C52x series and the TMS320C55x series released by TI (Texas Instruments). These DSP chips are characterized by low power consumption while maintaining 150 to 200 MHz performance. They also have an internal memory, although a limited one; because the internal memory can be accessed once per chip cycle, using it is generally advantageous compared with using external memory. As for programs, certain DSP chips are equipped with a program cache, so that program code can run quickly through the cache even when it resides in external memory. Complex algorithms can therefore be processed in real time only if such DSP features as the limited internal memory and the program cache are exploited well.

Recent DSP chips for mobile communication terminals are moving from designs that depend on specific hardware toward general-purpose designs. TI's C54x and C55x are likewise DSP chips for mobile communication terminals that can serve as such general-purpose DSP chips. That is, they can be designed to adapt well to the characteristics of mobile communication terminals developed by different companies, and because of this general-purpose nature, the DSP chip can be put to greater use by running various additional kinds of processing on it.

Meanwhile, in video coding research, much work has been done on techniques that extract the face region and code it at higher quality while coding the remaining background at lower quality, so that high-quality video can be transmitted and received with a small amount of data.

Because the face region is the main region of interest (ROI) in video communication, the user perceives less degradation in picture quality as long as the face region is of high quality. What matters here is a face region extraction engine that extracts the face region accurately, and face region extraction has long been the subject of much research. Accordingly, many extraction methods have been introduced: methods that use face color information, methods that use feature information such as the eyes, nose, and mouth, and template matching methods that construct face templates in advance and compare against them. Of these, the most commonly used is the method based on face color information, which allows relatively fast processing while still delivering reasonable performance.

However, most of the face region extraction techniques introduced above concentrate only on extracting the face region accurately in a personal computer environment, and when applied to a mobile communication terminal they take a great deal of processing time, from several seconds to, in severe cases, several minutes. Therefore, for a mobile communication terminal, which must process in real time on limited hardware, a face region extraction algorithm must be developed with a structure well suited to a DSP chip.

Two main points must be considered when running a face region extraction algorithm on a DSP chip. The first is central processing unit (CPU) utilization: the CPU must be used as little as possible. To that end, the algorithm itself must be very simple and must consist only of the simple repetitive operations that a DSP chip can process quickly. The second is low memory use. With most DSP chips, fast processing is possible only when the internal memory is used, so the amount of data used must be small. This calls for an optimized memory design that analyzes the data memory required at each point in the processing sequence so that the same memory can be shared. The program code likewise occupies memory; it too is best placed in internal memory, but where a program cache is available, as in the C55x, processing remains fast even if the program code is placed in external memory.

An object of the present invention is to provide a mobile communication terminal that can process face region extraction in real time, by developing a face region extraction algorithm that takes the characteristics of the DSP chip into account and then designing the memory allocation to be optimized for the operations of that algorithm.

Another object of the present invention is to provide a mobile communication terminal equipped with a DSP chip that carries, in addition to the encoder/decoder normally run on the DSP chip, a face region extraction engine, a bit rate control engine, and a background switching engine, so that high picture quality is maintained even in a poor network environment and services such as background switching are possible.

FIG. 1 is a block diagram schematically showing the configuration of a mobile communication terminal according to the present invention.

FIG. 2 is a diagram for explaining the process of extracting a face region by the face region extraction method in a mobile communication terminal according to the present invention.

FIG. 3 is a diagram showing the memory usage state in each step of extracting a face region by the face region extraction method in a mobile communication terminal according to the present invention.

FIG. 4 is a diagram showing an example of a screen on which background switching communication is performed in a mobile communication terminal according to the present invention.

FIG. 5 is a diagram for explaining the process by which background switching communication is performed in a mobile communication terminal according to the present invention.

FIG. 6 is a diagram conceptually showing the memory area allocation of a mobile communication terminal according to the present invention.

<Explanation of reference numerals for the main parts of the drawings>

110 ... external memory    120 ... camera

130 ... liquid crystal display    140 ... traffic controller

150 ... frame buffer    160 ... DSP

161 ... instruction cache    162 ... internal memory

To achieve the above object, a mobile communication terminal according to the present invention comprises:

a first memory area allocated as a storage area for data for 'skin-color image' processing and data for 'Dilation image' processing;

a second memory area allocated as a storage area for data for 'index image' processing and data for 'Erosion image' processing;

a third memory area allocated as a storage area for data for 'gray image' processing;

a fourth memory area allocated as a storage area for the global variables needed for operations in the face region extraction process;

a fifth memory area allocated as a storage area for the local variables needed for operations in the face region extraction process; and

a DSP (Digital Signal Processing) chip provided with a face region extraction engine for extracting a face region from an input image using the data stored in the first to fifth memory areas.

Here, the 'index image' is an image composed of the color index values of the pixels in the input image that satisfy the skin-color condition. The 'skin-color image' is an image in which the colors of the 'index image' are grouped around the dominant skin color and only the pixels belonging to that skin-color group are represented. The 'gray image' is the input image converted to gray. The 'Erosion image' is the 'skin-color image' with the skin-color pixels that arose from noise removed. The 'Dilation image' is the image formed by filling the small holes, caused by noise, inside the skin-color regions of the 'Erosion image'.
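The two morphology images defined above can be illustrated with a minimal sketch. The text does not state the structuring element, so the 3x3 neighbourhood used here, the treatment of border pixels, and the names `erode`, `dilate`, `noisy`, and `opened` are all assumptions made for illustration only.

```python
def erode(img):
    """Binary erosion with an assumed 3x3 window.

    A pixel stays On only if it and all 8 neighbours are On; border
    pixels are treated as Off, which is an illustrative choice.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(img[y + dy][x + dx]
                                for dy in (-1, 0, 1)
                                for dx in (-1, 0, 1)))
    return out


def dilate(img):
    """Binary dilation with an assumed 3x3 window.

    A pixel turns On if it or any neighbour inside the image is On.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))))
    return out


# Hypothetical 8x8 'skin-color image': a 4x4 skin blob plus one noise pixel.
noisy = [[0] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        noisy[y][x] = 1
noisy[0][7] = 1

# Erosion removes the isolated pixel; dilation restores the blob's extent.
opened = dilate(erode(noisy))
```

In this toy input the isolated pixel disappears after erosion and the 4x4 blob returns to its original size after dilation, which matches the noise-removal role the text assigns to the 'Erosion image'.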

Further, according to the present invention, the code section of the face region extraction engine is located in a memory external to the DSP chip, and the first to fifth memory areas are located in the internal memory of the DSP chip.

Further, according to the present invention, each of the first to third memory areas has a size of 88*72 words, the fourth memory area has a size of 890 words, and the fifth memory area has a size of 100 words.
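As a worked check of these sizes, the following sketch adds up the five regions. The 2-byte word follows the C55x embodiment described later in the document; the dictionary keys and variable names are illustrative.

```python
# Region sizes in words, taken from the sizes stated above.
IMAGE_WORDS = 88 * 72
REGION_WORDS = {
    "first (skin-color / Dilation)": IMAGE_WORDS,
    "second (index / Erosion)": IMAGE_WORDS,
    "third (gray)": IMAGE_WORDS,
    "fourth (global variables)": 890,
    "fifth (local variables)": 100,
}
WORD_BYTES = 2  # 1 word = 2 bytes on the C55x embodiment

total_words = sum(REGION_WORDS.values())
total_bytes = total_words * WORD_BYTES

# Words needed if each of the five images had its own buffer instead
# of sharing the first and second regions.
unshared_words = 5 * IMAGE_WORDS + 890 + 100
saved_words = unshared_words - total_words

print(total_words, total_bytes, saved_words)
```

The five regions come to 19,998 words (39,996 bytes), and sharing two of the image buffers saves 12,672 words compared with giving every image its own buffer.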

Further, according to the present invention, the terminal further comprises a sixth memory area storing a background image to be used as the background of the screen on which the image is displayed, and the DSP chip further comprises a background switching engine that selectively composites the input image with the background image.

Further, according to the present invention, the sixth memory area and the code section of the background switching engine are located in the external memory of the DSP chip.

Further, according to the present invention, a seventh memory area storing the original image used as the input image is further provided in the external memory of the DSP chip, and, to composite the original image with the background image, the two are composited macroblock by macroblock in the internal memory of the DSP chip.

Further, according to the present invention, the DSP chip further comprises an ROI-based bit rate control engine that controls the bit rate differently depending on whether a region is a region of interest (ROI), so that the ROI is set to relatively higher picture quality than non-ROI areas.

Also, to achieve the above object, a method for extracting a face region in a mobile communication terminal according to the present invention comprises:

checking, for every pixel of an input image, whether the pixel satisfies a skin-color condition;

constructing an 'index image' composed of index values by obtaining, for each pixel that satisfies the skin-color condition, the index of its color value;

constructing a 'skin-color image' by gathering into one group the colors similar to the color that appears most often in the 'index image', setting only the pixels belonging to that color group to 'On' and marking the rest 'Off';

constructing an 'Erosion image' by applying 'Erosion morphology' to the 'skin-color image';

constructing a 'Dilation image' by applying 'Dilation morphology' to the 'Erosion image';

a connected component construction step of joining mutually connected On-pixels in the 'Dilation image' into single blobs and marking them as regions;

designating as face candidate regions those connected components, among the ones constructed in the connected component construction step, that satisfy a predetermined condition;

an ellipse matching step of checking whether a designated face candidate region has an elliptical shape;

constructing a 'gray image' by converting the initial input image to a gray image;

examining the eye and mouth regions around the face candidate region in the 'gray image'; and

a face region determination step of determining the face region using the ellipse matching result and the result of examining the eye and mouth regions.

Here, according to the present invention, the same memory area is allocated to the data for 'skin-color image' processing and 'Dilation image' processing and is shared between them over time; the same memory area is allocated to the data for 'index image' processing and 'Erosion image' processing and is shared between them over time; and a separate memory area is allocated to the data for 'gray image' processing.
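The time-sharing described above works only if no image is overwritten while a later step still needs it. The sketch below checks that property for one plausible ordering of the steps. The stage order, the stage and buffer names, and the helper `check_sharing` are assumptions for illustration; the document itself only fixes which images share a region.

```python
# Each stage: (image produced, buffer written or None, images read).
# Buffer assignment follows the text; the order of stages is assumed.
STAGES = [
    ("index",    "region2", []),
    ("gray",     "region3", []),
    ("skin",     "region1", ["index"]),
    ("erosion",  "region2", ["skin"]),      # overwrites the index image
    ("dilation", "region1", ["erosion"]),   # overwrites the skin-color image
    ("decision", None,      ["dilation", "gray"]),
]


def check_sharing(stages):
    """Return True if every image is still resident when it is read."""
    resident = {}  # buffer name -> image currently stored there
    home = {}      # image name -> buffer it was written to
    for image, buffer, reads in stages:
        for needed in reads:
            if resident.get(home[needed]) != needed:
                return False  # the needed image was already overwritten
        if buffer is not None:
            resident[buffer] = image
            home[image] = buffer
    return True


# A deliberately broken plan: the gray image is written over the index
# image before the skin-color step has read it.
BAD_STAGES = [
    ("index", "region2", []),
    ("gray",  "region2", []),
    ("skin",  "region1", ["index"]),
]

plan_ok = check_sharing(STAGES)
bad_plan_ok = check_sharing(BAD_STAGES)
```

Under these assumptions the plan from the text passes the check, because the index image is dead by the time the 'Erosion image' is written and the skin-color image is dead by the time the 'Dilation image' is written, while the gray image keeps its own region until the final decision.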

Also, to achieve the other object above, a mobile communication terminal according to the present invention comprises: a DSP chip provided with a face region extraction engine for extracting a face region and a background switching engine for selectively compositing an input image with a background image; a code section of the face region extraction engine, loaded in the external memory of the DSP chip; a global variable section used by the face region extraction engine for face region extraction, loaded in the internal memory of the DSP chip; a local variable section used by the face region extraction engine for face region extraction, loaded in the internal memory of the DSP chip; a code section of the background switching engine, loaded in the external memory of the DSP chip; a global variable section used by the background switching engine, loaded in the internal memory of the DSP chip; and a local variable section used by the background switching engine, loaded in the internal memory of the DSP chip,

wherein background switching is performed by selectively compositing the input image with the background image on the basis of the extracted face region information.
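A minimal sketch of this selective compositing follows, walking the frame one macroblock at a time in the way the document describes for loading data into internal memory. The 16x16 macroblock size, the per-pixel face mask, and the names `switch_background`, `frame`, `background`, and `face_mask` are assumptions; the document does not say at what granularity the face region information is applied.

```python
MB = 16  # assumed macroblock size in pixels


def switch_background(frame, background, face_mask, mb=MB):
    """Keep the input pixel where the mask is On, else use the background.

    The image is visited tile by tile to mimic loading one macroblock
    at a time into a small working memory.
    """
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, mb):
        for bx in range(0, w, mb):
            for y in range(by, min(by + mb, h)):
                for x in range(bx, min(bx + mb, w)):
                    if face_mask[y][x]:
                        out[y][x] = frame[y][x]
                    else:
                        out[y][x] = background[y][x]
    return out


# Hypothetical 32x32 single-channel demo: face occupies the centre 16x16.
frame = [[1] * 32 for _ in range(32)]
background = [[9] * 32 for _ in range(32)]
face_mask = [[1 if 8 <= y < 24 and 8 <= x < 24 else 0 for x in range(32)]
             for y in range(32)]
composited = switch_background(frame, background, face_mask)
```

Only one tile of each source image is needed at any moment, which is why the full original and background images can stay in external memory.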

Also, another example of a mobile communication terminal according to the present invention for achieving the other object above comprises: a DSP chip provided with a face region extraction engine for extracting a face region and an ROI-based bit rate control engine that controls the bit rate differently depending on whether a region is a region of interest (ROI); a code section of the face region extraction engine, loaded in the external memory of the DSP chip; a global variable section used by the face region extraction engine, loaded in the internal memory of the DSP chip; a local variable section used by the face region extraction engine, loaded in the internal memory of the DSP chip; and a code section of the ROI-based bit rate control engine, loaded in the external memory of the DSP chip,

wherein, on the basis of the extracted face region information, the bit rate is controlled differently depending on whether a region is a face region, so that the face region is set to relatively higher picture quality than non-face regions.
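One common way to realise such face-dependent quality, shown here only as a sketch, is to give face macroblocks a finer quantizer than background macroblocks. The document does not specify the control mechanism, so the use of a quantization parameter, the values 8 and 20, and the names `assign_qp` and `face_mb_map` are all hypothetical.

```python
def assign_qp(face_mb_map, qp_face=8, qp_background=20):
    """Return a per-macroblock quantizer map.

    A smaller quantizer means finer quantization, hence more bits and
    higher quality. The default values are placeholders, not taken
    from the document.
    """
    return [[qp_face if is_face else qp_background for is_face in row]
            for row in face_mb_map]


# Hypothetical 2x3 macroblock map: 1 marks a macroblock in the face region.
face_mb_map = [[0, 1, 0],
               [0, 1, 1]]
qp_map = assign_qp(face_mb_map)
```

An encoder driven by such a map would spend proportionally more of the available bit rate on the face region than on the rest of the frame.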

According to the present invention as described, the face region can be extracted in real time so that communication can be carried out with ROI-based bit rate control or with the background switched.

Further, according to the present invention, the face region extraction engine and the background switching engine are mounted effectively on a DSP chip for mobile communication terminals, which is characterized by small memory and limited power; in particular, the design shares a small amount of memory to minimize memory use and keeps the main data in the internal memory of the DSP chip, which makes real-time processing possible.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

The present invention proposes a DSP-based mobile communication terminal equipped with a face region extraction engine, an ROI-based bit rate control engine, a background switching engine, an encoder, and a decoder. The embodiment is described on the assumption that TI's TMS320C55x is used as the DSP chip and that the overall architecture is the OMAP1510, which carries two CPUs, a C55x and an ARM925™ CPU. The technical idea of the present invention is not limited to this configuration, however, and should be interpreted on the basis of the description below.

In the OMAP1510, the C55x has 80 KWord (160 KB) of internal memory, and 1.5 Mbit of on-chip memory is provided for use as a buffer when displaying frames on the terminal's liquid crystal display (LCD). A 24 KB program cache is also provided, and the DSP chip itself runs at 170 to 200 MHz. All data processing on the C55x is done in units of words, and unless special handling is applied the smallest data unit is 1 word (2 bytes). The mobile communication terminal to which the present invention is applied adds an external memory to this OMAP1510. The external memory holds the data and program code for the DSP chip that could not be placed in internal memory, as well as the data and program code used by the ARM CPU.

<Design of the mobile communication terminal>

The configuration of a mobile communication terminal according to the present invention is described with reference to FIG. 1. FIG. 1 shows an example configuration of a mobile communication terminal, as proposed by the present invention, that is capable of ROI-based bit rate control and background switching communication.

Referring to FIG. 1, viewed from the DSP chip (160) the memory is divided broadly into an external memory (110) and an internal memory (162), and program code and data stored in the external memory (110) can be transferred into the DSP chip (160) through a traffic controller (140) that has a direct memory access (DMA) function. Because this is a mobile communication terminal for video communication, a camera (120) is connected, and a dedicated frame buffer (150) is provided for decoding received video and displaying it on the liquid crystal display (130). In addition, an instruction cache (I-Cache) (161) is provided inside the DSP chip (160) so that programs execute faster.

The program codes of the encoder, the decoder, the face region extraction engine, the background switching engine, and the ROI-based bit rate control engine are each allocated to the external memory (110). Because program code can use the program cache of the C55x, it runs quickly even when allocated to the external memory (110). As for the encoder/decoder data, infrequently used data are allocated to the external memory (110) (the Enc./Dec. data1 area) and frequently used data are allocated to the internal memory (162) (the Enc./Dec. data2 area). However, because the encoder and decoder are modules that every mobile communication terminal must carry, the present invention places no restriction on how they are mounted.

For the face region extraction engine and the background switching engine, the program code is allocated to the external memory (110) and the data they use are allocated to the internal memory (162). The data used by these two engines have been designed to share the same memory locations, on the basis of a life-cycle analysis of each data item over the processing sequence. The memory design is described in detail below.

<DSP design of the face region extraction engine>

Before mounting the face region extraction engine on the DSP, the face region extraction algorithm is examined first. A flow diagram of the algorithm is shown in FIG. 2; for the purposes of DSP design, the flow of every algorithm is described in terms of the data that is generated or input. The face region extraction algorithm consists broadly of a color-based face candidate region extraction module and a feature-based face region extraction module.

The first step is image input. A frame from the camera is stored in the external frame buffer and is brought into internal memory macroblock by macroblock, in order, for processing. A full image of 176*144*1.5 words would occupy about 50% of the limited memory if loaded into internal memory all at once, so it is loaded in parts wherever possible. The original frame is never processed repeatedly; it is read once at the start and not used afterwards, so it can be loaded into internal memory and processed one macroblock at a time. For each macroblock loaded into memory, every pixel is checked to see whether its color falls within the skin-color range; if it does, the color is quantized, converted to a color index, and written to the corresponding position of the 'index image'.
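The arithmetic behind the "about 50%" figure can be checked with a short calculation. The frame size and the 80 KWord internal memory come from the document; reading "K" as 1024, taking the 1.5 factor as one luma plus half a chroma sample per pixel, and the 16x16 macroblock size are assumptions.

```python
WIDTH, HEIGHT = 176, 144
SAMPLES_PER_PIXEL = 1.5       # factor stated in the text, one word per sample
INTERNAL_WORDS = 80 * 1024    # 80 KWord internal memory, assuming K = 1024
MB = 16                       # assumed macroblock size in pixels

frame_words = int(WIDTH * HEIGHT * SAMPLES_PER_PIXEL)
macroblock_words = int(MB * MB * SAMPLES_PER_PIXEL)
macroblocks = (WIDTH // MB) * (HEIGHT // MB)

frame_share = frame_words / INTERNAL_WORDS
macroblock_share = macroblock_words / INTERNAL_WORDS

print(frame_words, macroblock_words, macroblocks)
print(round(frame_share * 100, 1), round(macroblock_share * 100, 2))
```

Under these assumptions a whole frame needs 38,016 words, roughly 46% of internal memory, while a single macroblock needs 384 words, under 0.5%, and the frame is covered by 99 macroblocks.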

Color index conversion means grouping the many colors expressible in the YCbCr color space and representing all colors of one group by a single index. In the present invention, the colors within the skin color range are represented by 72 indices in total. The 'index image' is 88*72, that is, the original 176*144 size halved both horizontally and vertically. This reduction is obtained by examining only every other pixel, horizontally and vertically, of the original image loaded in macroblock units.
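
A minimal sketch of the color index conversion and the 2:1 subsampling is given below. The specification states neither the numeric skin color range nor how the 72 groups are formed, so the Cb/Cr bounds and the 8 by 9 binning used here are purely hypothetical, as are the names color_index and build_index_image.

```python
# Hypothetical skin color box in the Cb/Cr plane; the specification gives no numbers.
CB_MIN, CB_MAX = 77, 127
CR_MIN, CR_MAX = 133, 173
CB_BINS, CR_BINS = 8, 9   # 8 * 9 = 72 indices, matching the count in the text
NOT_SKIN = -1             # marker for pixels outside the skin color range

def color_index(cb, cr):
    """Quantize a (Cb, Cr) pair inside the skin box to an index in 0..71."""
    if not (CB_MIN <= cb <= CB_MAX and CR_MIN <= cr <= CR_MAX):
        return NOT_SKIN
    cb_bin = (cb - CB_MIN) * CB_BINS // (CB_MAX - CB_MIN + 1)
    cr_bin = (cr - CR_MIN) * CR_BINS // (CR_MAX - CR_MIN + 1)
    return cb_bin * CR_BINS + cr_bin

def build_index_image(cbcr, step=2):
    """cbcr: 2-D list of (cb, cr) tuples at full resolution.
    Examining every other pixel in both directions halves each dimension."""
    return [[color_index(*cbcr[y][x]) for x in range(0, len(cbcr[0]), step)]
            for y in range(0, len(cbcr), step)]
```

Applied to a 176*144 input, build_index_image returns 72 rows of 88 entries, which is the 88*72 'index image' size given in the text.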

While the 'index image' is being constructed, each examined pixel is also converted to gray to construct the 'gray image', which is likewise 88*72. Once all macroblocks have been loaded into the internal memory in turn and processed in this way, the 'index memory' and 'gray memory' (the memory areas holding the 'index image' and the 'gray image') are complete.

Color grouping is then performed using the index information stored in the 'index memory'. Here, color grouping means taking the color that is most widely distributed among the skin color pixels extracted from the current image and gathering the colors similar to it into one group. To do this, a color histogram is built with the 72 color indices as its bins, and the indices that represent colors similar to the color of the bin with the largest value are merged into one group. When grouping is finished, a 'skin color image' is constructed in which only the pixels whose indices belong to that group are 'On' and all others are 'Off'.
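
The grouping step may be sketched as follows. The specification does not define when two indices count as similar, so this sketch assumes the hypothetical 8 by 9 bin layout (index = cb_bin * 9 + cr_bin) and treats indices within one bin of the peak in both directions as similar. The name skin_color_image and the radius parameter are likewise assumptions.

```python
NUM_INDICES = 72
NOT_SKIN = -1  # marker for pixels that failed the skin color test

def skin_color_image(index_image, radius=1, cr_bins=9):
    """Return a binary image: 1 ('On') where the pixel's index belongs to the
    group around the most frequent index, 0 ('Off') elsewhere."""
    hist = [0] * NUM_INDICES
    for row in index_image:
        for idx in row:
            if idx != NOT_SKIN:
                hist[idx] += 1
    if sum(hist) == 0:                       # no skin pixels at all
        return [[0] * len(row) for row in index_image]
    peak = max(range(NUM_INDICES), key=hist.__getitem__)
    peak_cb, peak_cr = divmod(peak, cr_bins)
    # Hypothetical similarity rule: neighbouring bins of the peak bin.
    group = {i for i in range(NUM_INDICES)
             if abs(i // cr_bins - peak_cb) <= radius
             and abs(i % cr_bins - peak_cr) <= radius}
    return [[1 if idx in group else 0 for idx in row] for row in index_image]
```

Because the group is chosen from the histogram of the current image, the resulting mask adapts to the dominant skin tone of each frame rather than relying on a fixed color alone.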

Because the 'skin color image' is built pixel by pixel, it contains a good deal of noise. This noise takes two forms: isolated groups of one or two 'On' pixels in the background, and small holes inside the face region. First, 'Erosion morphology' is applied to construct an 'Erosion image' from which the background noise has been removed. 'Dilation morphology' is then applied to the 'Erosion image' to construct a 'Dilation image' in which the small holes caused by noise are filled.
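
For illustration, binary erosion and dilation can be sketched as below. The specification does not name a structuring element, so a 3*3 square is assumed, and positions outside the image are assumed to be 'Off'; the function names are hypothetical.

```python
def neighbors(img, y, x):
    """Values in the 3x3 window around (y, x); outside the image counts as 0."""
    h, w = len(img), len(img[0])
    return [img[ny][nx] if 0 <= ny < h and 0 <= nx < w else 0
            for ny in (y - 1, y, y + 1) for nx in (x - 1, x, x + 1)]

def erode(img):
    """A pixel stays On only if its whole 3x3 window is On."""
    return [[1 if all(neighbors(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]

def dilate(img):
    """A pixel becomes On if any pixel in its 3x3 window is On."""
    return [[1 if any(neighbors(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]
```

With this structuring element, an isolated 'On' pixel in the background disappears under erode, and a subsequent dilate restores the extent of the larger regions that survived.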

The 'Dilation image' is then divided into a grid of 4*4-pixel cells. A cell is set 'On' only if the number of 'On' pixels it contains is at or above a certain threshold, and is set 'Off' otherwise, which yields a 'grid image' of 22*18 cells. Next, connected components are computed on the 'grid image': 'On' cells that are adjacent to one another are labeled as a single connected component.
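
The reduction to a 'grid image' may be sketched as follows. The threshold is not given in the specification, so the value 8 (half of the 16 pixels in a cell) is only a placeholder, and the name grid_image is hypothetical.

```python
def grid_image(img, cell=4, threshold=8):
    """Divide a binary image into cell x cell blocks and mark a block On (1)
    when it contains at least `threshold` On pixels."""
    h, w = len(img), len(img[0])
    return [[1 if sum(img[y][x]
                      for y in range(gy, gy + cell)
                      for x in range(gx, gx + cell)) >= threshold else 0
             for gx in range(0, w, cell)]
            for gy in range(0, h, cell)]
```

For an 88*72 input this returns 18 rows of 22 cells, the 22*18 'grid image' described in the text.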

In other words, adjacent cells are merged and each resulting blob is defined as one component. Each component forms a partial region of the image, and one of these partial regions is likely to be a face. Accordingly, the connected components that exceed a certain size are designated as face candidate regions, and the feature-based face region extraction module subsequently verifies and readjusts them to extract the face region. When the image is not complex, the connected component extracted on the basis of color may itself be taken as the face region, even without the feature-based face region extraction module.
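
A minimal sketch of the connected component step and the size test is shown below. It assumes 4-connectivity between cells and a hypothetical minimum size of 4 cells; the specification fixes neither, and the function names are illustrative.

```python
from collections import deque

def connected_components(grid):
    """Group On cells that touch horizontally or vertically (4-connectivity assumed)."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for sy in range(h):
        for sx in range(w):
            if not grid[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue, cells = deque([(sy, sx)]), []
            while queue:                      # breadth-first flood fill
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(cells)
    return components

def face_candidates(grid, min_cells=4):
    """Bounding boxes (top, left, bottom, right) of components with at least
    min_cells cells; min_cells is a hypothetical size condition."""
    boxes = []
    for cells in connected_components(grid):
        if len(cells) >= min_cells:
            ys = [y for y, _ in cells]
            xs = [x for _, x in cells]
            boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes
```

The bounding boxes are expressed in grid cells; multiplying by the cell size gives the corresponding region of the 88*72 images that a later verification stage would examine.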

The feature-based face region extraction module basically uses the 'Dilation image', which is the skin color image with noise removed, and the 'gray image'. The 'Dilation image' is used for ellipse matching, which checks whether the distribution of skin color is close to an ellipse. Because a face is generally elliptical, a region whose skin color is distributed close to an ellipse is more likely to be a face. The 'gray image' is used to search for the eyes and mouth inside the ellipse. Eyes and mouths appear as dark horizontal line patterns, and this property is used to locate them. When the eye, mouth, and ellipse conditions are all satisfied, the ellipse region is finally defined as the face region.

To mount the face region extraction algorithm described above on the DSP chip, the program code is first allocated to the external memory, as explained earlier. As for the data being processed, the 'index image', 'gray image', 'skin color image', 'Erosion image', and 'Dilation image' each use 88*72 Word, and the other data used for processing, such as the 'grid image', uses roughly 890 Word or less. All of these are declared and used as global variables. The five images of 88*72 Word each are the variables that occupy the most space, so the design lets them share memory effectively. To that end, the life cycle of the data in each process is defined as shown in FIG. 3.

Referring to FIG. 3, if the memory area of a variable that is no longer used is reused later by another variable, three memory areas of 88*72 Word each are enough to use the memory effectively. These three areas are called 'Image1', 'Image2', and 'Image3'.

First, 'Image1' is used as the area for the single macroblock image that is loaded into the internal memory when the original image in the external memory is brought in macroblock by macroblock. It is then used for the 'skin color image' and, after 'Dilation morphology' is applied, for the 'Dilation image'. 'Image2' is used as the area for the 'index image' and, after 'Erosion morphology' is applied, for the 'Erosion image'. Once the color-based face candidate region extraction module has finished, it is used for some of the global variables of the feature-based face region extraction module. 'Image3' is used for the 'gray image'. The 'gray image' is created at the very beginning and is used for eye and mouth extraction late in the face region extraction, so it has the longest life cycle.
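
The sharing scheme above can be checked with a small life cycle table. FIG. 3 is not reproduced here, so the stage numbers below are only one hypothetical reading of the processing order in the text; the names LIFETIMES, conflicts, and peak_live are likewise assumptions.

```python
# Hypothetical stage numbering derived from the text, not from FIG. 3:
# 0 input, 1 color grouping, 2 erosion, 3 dilation, 4 grid/components, 5 feature-based.
LIFETIMES = {
    # name: (buffer, first stage, last stage), both inclusive
    "macroblock":       ("Image1", 0, 0),
    "skin color image": ("Image1", 1, 2),
    "Dilation image":   ("Image1", 3, 5),
    "index image":      ("Image2", 0, 1),
    "Erosion image":    ("Image2", 2, 3),
    "feature globals":  ("Image2", 5, 5),
    "gray image":       ("Image3", 0, 5),
}

def conflicts(lifetimes):
    """Pairs of items assigned to the same buffer whose life cycles overlap."""
    items = sorted(lifetimes.items())
    return [(a, b)
            for i, (a, (buf_a, start_a, end_a)) in enumerate(items)
            for b, (buf_b, start_b, end_b) in items[i + 1:]
            if buf_a == buf_b and start_a <= end_b and start_b <= end_a]

def peak_live(lifetimes):
    """Largest number of items that are live during any single stage."""
    last = max(end for _, _, end in lifetimes.values())
    return max(sum(1 for _, start, end in lifetimes.values() if start <= t <= end)
               for t in range(last + 1))
```

Under this reading, conflicts(LIFETIMES) is empty and peak_live(LIFETIMES) is 3, which is consistent with the statement that three 88*72 Word areas suffice.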

By sharing three memory areas of 88*72 Word in this way, and using 890 Word of internal memory for the remaining global variables, all global variables can be allocated to the internal memory. The small number of temporary local variables are, on the C55x, allocated in an area called the stack. In the present invention a stack of 100 Word is allocated for local variables. The total memory size thus comes to about 20K Word, so that only 25% of the DSP's total internal memory is used.
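
The memory budget can be verified with simple arithmetic. The sketch below takes the stack size as 100 Word, the figure given later in the description and in the claims; the variable names are illustrative only.

```python
IMAGE_WORDS = 88 * 72   # one half-resolution image: 6336 Word
NUM_IMAGES = 5          # index, gray, skin color, Erosion, Dilation
NUM_SHARED = 3          # Image1, Image2, Image3
GLOBAL_WORDS = 890      # other global variables
STACK_WORDS = 100       # local variables (stack)

shared_total = NUM_SHARED * IMAGE_WORDS + GLOBAL_WORDS + STACK_WORDS
unshared_total = NUM_IMAGES * IMAGE_WORDS + GLOBAL_WORDS + STACK_WORDS
saved = unshared_total - shared_total
print(shared_total, unshared_total, saved)   # 19998 32670 12672
```

The shared layout needs 19998 Word, matching the figure of about 20K Word, whereas giving each of the five images its own area would need 32670 Word.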

All memory allocation needed to mount the face region extraction engine on the DSP has been designed in the manner described above. Using the face region information extracted by the face region extraction engine, the bit rate can be controlled on an ROI basis, or background switching communication can be performed. ROI-based bit rate control is a control technique that sets a relatively high picture quality for the face region, which is the main region of interest, so that the face region is kept at high quality during communication even with a small overall amount of data. ROI-based bit rate control can be carried out in the encoder's own bit rate control section using the face region information.

Background switching communication, as shown in FIG. 4, is a function that replaces the area other than the face with a different still image during communication. It enables a variety of useful and entertaining forms of communication, and because the background is a still image, the video can be compressed to a very small amount of data. Using the extracted face region information, the background switching engine composites the background selected by the user for the background portion with the camera input for the face region portion, and feeds the result to the encoder. Because the encoder receives and processes its input in macroblock units, the background switching engine also composites the image in macroblock units. That is, the background images reside in the external memory. When the user selects a background, the relevant macroblock of that background is loaded into the internal memory, and if that macroblock contains part of the face region, the corresponding macroblock of the camera image is also loaded into the internal memory and the two are composited.

The two macroblock-sized images (16*16*1.5) loaded into the internal memory at this point can share one of the areas 'Image1', 'Image2', and 'Image3' described above. The composited macroblock-sized image is passed to the encoder as input and encoded. As one way of performing this encoding, an H.263 video encoder may be used. The processing performed by the background switching engine on the DSP chip, as described above, is shown in FIG. 5.
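
The macroblock-wise compositing described above may be sketched as follows. For brevity the sketch handles a single plane and a per-pixel face mask; the mask representation and the name switch_background are assumptions rather than details taken from the specification.

```python
MB = 16  # macroblock edge in pixels

def switch_background(camera, background, face_mask, mb=MB):
    """Composite one plane macroblock by macroblock.
    camera, background, face_mask: 2-D lists of equal size; mask values are 0 or 1.
    Returns the composited plane and the number of camera macroblocks needed."""
    h, w = len(camera), len(camera[0])
    out = [row[:] for row in background]     # every background macroblock is used
    camera_loads = 0
    for y0 in range(0, h, mb):
        for x0 in range(0, w, mb):
            ys = range(y0, min(y0 + mb, h))
            xs = range(x0, min(x0 + mb, w))
            if not any(face_mask[y][x] for y in ys for x in xs):
                continue                     # no face pixels: camera block not needed
            camera_loads += 1
            for y in ys:
                for x in xs:
                    if face_mask[y][x]:
                        out[y][x] = camera[y][x]
    return out, camera_loads
```

Skipping camera macroblocks that contain no face pixels mirrors the text, in which the camera macroblock is loaded into internal memory only when the current macroblock includes part of the face region.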

The configuration of the mobile communication terminal with ROI-based bit rate control and background switching using the TI320C55x DSP chip, as described so far, is shown in FIG. 6. The external memory holds the program code of each engine, including the face region extraction engine, the bit rate control engine, the background switching engine, the encoder, and the decoder. The internal memory holds three global variable areas of 88*72 Word each ('Image1', 'Image2', 'Image3') for the global variables used by the face region extraction engine and the background switching engine. A global variable area (Global Variable) of 890 Word for the remaining global variables also resides in the internal memory, and a 100 Word area (stack) for local variables is allocated in the internal memory. The background and camera input frames are allocated to buffers in the external memory. Owing to the memory management characteristics of the C55x, every global variable has an initial value, and this initial value information, which is the same size as the global variables, is allocated to a separate memory section called '.init'. Because '.init' holds initial values that need to be read only once when the program starts, it is allocated to the external memory.

As described above, the mobile communication terminal according to the present invention has the advantage that it can extract the face region in real time and communicate while controlling the bit rate on an ROI basis or switching the background.

In addition, the mobile communication terminal according to the present invention effectively mounts the face region extraction engine and the background switching engine on a DSP chip for mobile communication terminals, which is characterized by little memory and limited power. In particular, memory use is minimized by a design in which a small amount of memory is shared, and real-time processing is made possible by having the main data use the internal memory of the DSP chip.

Claims

A mobile communication terminal comprising:

a first memory area allocated as a storage area for data for 'skin color image' processing and data for 'Dilation image' processing;

a second memory area allocated as a storage area for data for 'index image' processing and data for 'Erosion image' processing;

a third memory area allocated as a storage area for data for 'gray image' processing;

a fourth memory area allocated as a storage area for global variables needed for operations in the face region extraction process;

a fifth memory area allocated as a storage area for local variables needed for operations in the face region extraction process; and

a digital signal processing (DSP) chip having a face region extraction engine that extracts a face region from an input image using the data stored in the first to fifth memory areas,

wherein the 'index image' is an image composed of the color index values of the pixels in the input image that satisfy the skin color condition; the 'skin color image' is an image showing only the pixels that belong to the skin color group obtained from the 'index image' by grouping colors around the dominant skin color; the 'gray image' is the input image converted to gray; the 'Erosion image' is the 'skin color image' with the skin color pixels caused by noise removed; and the 'Dilation image' is the 'Erosion image' with the small holes that noise produced in the skin color region filled in.

The mobile communication terminal of claim 1,

wherein the code section of the face region extraction engine is located in an external memory of the DSP chip, and the first to fifth memory areas are located in an internal memory of the DSP chip.

The mobile communication terminal of claim 1,

wherein the first to third memory areas are each 88*72 Word, the fourth memory area is 890 Word, and the fifth memory area is 100 Word.

The mobile communication terminal of claim 1,

further comprising a sixth memory area storing a background image to be used as the background of the screen on which an image is displayed, wherein the DSP chip further includes a background switching engine that selectively composites the input image with the background image.

The mobile communication terminal of claim 4,

wherein the sixth memory area and the code section of the background switching engine are located in an external memory of the DSP chip.

The mobile communication terminal of claim 4,

wherein a seventh memory area storing an original image used as the input image is further included in an external memory of the DSP chip, and, to composite the original image and the background image, the original image and the background image are composited in macroblock units in the internal memory of the DSP chip.

The mobile communication terminal of claim 1,

wherein the DSP chip further comprises an ROI-based bit rate control engine that controls the bit rate differently according to the region of interest (ROI), so that the ROI is set to a higher picture quality than the non-ROI area.

A method for extracting a face region in a mobile communication terminal, comprising:

examining whether each pixel of the input image satisfies the skin color condition;

constructing an 'index image' composed of index values by obtaining the index of the color value of each pixel that satisfies the skin color condition;

constructing a 'skin color image' by grouping colors similar to the most widely distributed color in the 'index image' into one group, setting 'On' only the pixels that belong to that color group, and setting the rest 'Off';

constructing an 'Erosion image' by applying 'Erosion morphology' to the 'skin color image';

constructing a 'Dilation image' by applying 'Dilation morphology' to the 'Erosion image';

a connected component construction step of joining mutually connected 'On' pixels in the 'Dilation image' into single blobs and labeling the connected regions;

designating as a face candidate region any component, among the connected components constructed in the connected component construction step, that satisfies a predetermined condition;

an ellipse matching step of examining whether the designated face candidate region has an elliptical shape;

constructing a 'gray image' by converting the originally input image to gray;

searching for eye and mouth regions around the face candidate region in the 'gray image'; and

a face region determination step of determining the face region using the ellipse matching result and the result of the eye and mouth region search.

The method of claim 8,

wherein the data for 'skin color image' processing and the data for 'Dilation image' processing are allocated to the same memory area and share it over time, the data for 'index image' processing and the data for 'Erosion image' processing are allocated to the same memory area and share it over time, and the data for 'gray image' processing is allocated to a separate memory area.

A mobile communication terminal comprising:

a DSP chip having a face region extraction engine for extracting a face region and a background switching engine for selectively compositing an input image and a background image;

a code section of the face region extraction engine, placed in an external memory of the DSP chip;

a global variable section, placed in an internal memory of the DSP chip, used by the face region extraction engine for face region extraction;

a local variable section, placed in the internal memory of the DSP chip, used by the face region extraction engine for face region extraction;

a code section of the background switching engine, placed in the external memory of the DSP chip;

a global variable section used by the background switching engine, placed in the internal memory of the DSP chip; and

a local variable section used by the background switching engine, placed in the internal memory of the DSP chip,

wherein background switching is performed by selectively compositing the input image and the background image based on the extracted face region information.

A mobile communication terminal comprising:

a DSP chip including a face region extraction engine for extracting a face region and an ROI-based bit rate control engine for controlling the bit rate differently according to a region of interest (ROI);

a global variable section used by the face region extraction engine, placed in an internal memory of the DSP chip;

a local variable section used by the face region extraction engine, placed in the internal memory of the DSP chip; and

a code section of the ROI-based bit rate control engine, placed in an external memory of the DSP chip,

wherein, based on the extracted face region information, the bit rate is controlled differently according to whether an area belongs to the face region, so that the face region is set to a higher picture quality than the non-face region.