KR20120031309A

KR20120031309A - Information processing apparatus and information processing method

Info

Publication number: KR20120031309A
Application number: KR1020127004619A
Authority: KR
Inventors: Nobuhiro Saijo
Original assignee: Sony Corporation
Priority date: 2009-06-30
Filing date: 2010-05-21
Publication date: 2012-04-02
Also published as: JP2014064047A; EP2384485A1; CN102138148A; US8107706B2; US20110194774A1; JP4548542B1; JPWO2011001761A1; JP4831267B2; KR20120039498A; WO2011001761A1; WO2011001593A1; US20110142349A1; CN102138148B; US8285054B2; US20110216941A1; TW201112168A; EP2378759A4; EP2378759A1

Abstract

A method and apparatus for detecting a plurality of target pixels in an image and for identifying luminance values corresponding to a predetermined subject are disclosed. The detection apparatus includes a memory configured to store a first image and a second image captured using a first wavelength and a second wavelength, respectively. The detection apparatus further includes at least one processor configured to detect a plurality of target pixels in the first captured image based on the luminance values of the stored first and second captured images. The identification apparatus includes a memory configured to store a processed image, and at least one processor configured to determine the frequencies of the luminance values of the plurality of target pixels in the processed image and to determine, based on the determined frequencies, a range of luminance values corresponding to a predetermined subject in the processed image.

Description

Information processing apparatus and information processing method

The present invention relates to an information processing apparatus and an information processing method, and more particularly to an information processing apparatus and an information processing method suitable for extracting, for example, the shape of a user's hand from a captured image obtained by imaging the user.

In recent years, as input devices for entering data into a personal computer or the like, data input techniques that use a user's gesture (motion) or posture (pose) have been studied in addition to the mouse, graphics tablet, and touch pad.

In such a data input technique, data is entered using, for example, the gesture or posture of the user's hand, so the shape of the user's hand must be extracted accurately from a captured image obtained by imaging the user.

Extraction techniques for extracting the shape of a user's hand include a pattern matching method, which uses pattern matching of images, and a skin region extraction method, which extracts the user's skin region.

In the pattern matching method, for example, a plurality of shape images obtained by imaging hands of various shapes and sizes are learned in advance, and the hand shape shown in the shape image most similar to the captured image (for example, the shape image for which the sum of the differences between the pixel values of corresponding pixels is smallest) is extracted as the shape of the user's hand.
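The similarity measure mentioned above can be illustrated with a minimal Python sketch. It assumes grayscale images stored as equally sized nested lists and uses the sum of absolute differences, which is one common reading of "sum of the differences"; the function names and the dictionary of learned shape images are hypothetical and do not come from the patent.

```python
def sum_abs_diff(image_a, image_b):
    """Sum of absolute pixel-value differences between two same-sized images."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )


def most_similar_shape(captured, shape_images):
    """Return the label of the learned shape image closest to the captured image.

    shape_images maps a (hypothetical) shape label to a learned shape image.
    """
    return min(shape_images, key=lambda label: sum_abs_diff(captured, shape_images[label]))
```

Every learned shape image has to be compared pixel by pixel with the captured image, which gives a sense of why the text goes on to describe the computational cost as a problem for real-time use.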

However, with this pattern matching method, it can be difficult to extract the shape of the user's hand accurately when the captured image is obtained under conditions that differ from those under which the shape images were captured (for example, the imaging direction, the degree of illumination, the background, and the size of the subject at the time of imaging).

In particular, when extracting the shape of a hand, if the shape of the hand in the captured image differs greatly from the shape of the hand in the shape images, or if the hand in the captured image overlaps the face or the like, it is more difficult to extract the shape of the hand accurately than it is to extract, for example, the shape of a face.

Furthermore, when the shape of the hand must be extracted in real time, pattern matching requires an enormous amount of calculation, which is a problem in most cases.

In the skin region extraction method, a skin region representing the user's skin in the captured image is extracted using skin information that represents the color of human skin.

However, with the skin region extraction method using skin information, it is difficult to distinguish the color of skin from colors close to it. In addition, because skin color differs by race, it may be impossible to extract an appropriate skin region for all races.

In this respect, an extraction technique has recently been proposed that uses spectral reflectance characteristics to extract a skin region from a captured image, based on the fact that the change in skin reflectance with wavelength is the same regardless of race (see, for example, Non-Patent Document 1).

(Non-Patent Document 1) NPL 1: Yasuhiro Suzuki et al., "Detection Method of Skin Region by Near-IR Spectrum Multi-Band", IEEJ Transactions on Electronics, Information and Systems, Vol. 127, No. 4, Japan, 2007

However, in the conventional extraction technique using the spectral reflectance characteristics described above, when both the face and the hand of the subject are present as skin regions in the captured image, the shapes of both the face and the hand are extracted as skin regions, so it is difficult to extract only the shape of the hand as the skin region.

In view of the above circumstances, it is desirable to extract the accurate shape of the user's hand or the like at high speed from a captured image obtained by imaging the user, while suppressing an increase in the amount of computation required for the series of processes.

According to an embodiment of the present invention, an information processing apparatus, a method, a computer-readable recording medium, and a computer program for detecting a plurality of target pixels in an image are provided. The information processing apparatus includes a first memory configured to store a first image captured using light of a first wavelength and a second image captured using light of a second wavelength different from the first wavelength. The information processing apparatus further includes at least one processor configured to detect a plurality of target pixels in the first captured image based on the luminance values of the stored first and second captured images.

According to another embodiment of the present invention, an information processing apparatus, a method, a computer-readable recording medium, and a computer program for identifying luminance values corresponding to a predetermined subject are provided. The information processing apparatus includes a memory and at least one processor. The memory is configured to store a processed image that is generated from an image and includes a plurality of target pixels. The at least one processor is configured to determine the frequencies of the luminance values of the plurality of target pixels in the processed image and to determine, based on the determined frequencies, a range of luminance values corresponding to a predetermined subject in the processed image.

According to the embodiments of the present invention, it is possible to extract the accurate shape of the user's hand or the like at high speed while suppressing an increase in the amount of computation required for the series of processes.

FIG. 1 is a block diagram showing a configuration example of an information processing system.
FIG. 2 is a block diagram showing a configuration example of an information processing apparatus.
FIG. 3 is a diagram showing an example of the reflection characteristics of human skin.
FIG. 4 is a diagram showing examples of first and second captured images.
FIG. 5 is a diagram showing an example of a binarized skin image generated by a binarization unit.
FIG. 6 is a diagram showing an example of a skin image extracted by a skin extraction unit.
FIG. 7 is a diagram showing an example of a histogram of a skin image.
FIG. 8 is a diagram showing an example of a mask image generated by a mask image generation unit.
FIG. 9 is a diagram showing an example of an extracted image generated by a shape extraction unit.
FIG. 10 is a flowchart for explaining shape extraction processing.
FIG. 11 is a diagram showing a first captured image used in FFT threshold determination processing.
FIG. 12 is a flowchart for explaining the FFT threshold determination processing.
FIG. 13 is a diagram showing the relative sensitivity characteristics of a camera.
FIG. 14 is a diagram showing a method of arranging LEDs.
FIG. 15 is a diagram showing a configuration example of a computer.

Hereinafter, an embodiment for carrying out the present invention (hereinafter referred to as the present embodiment) will be described. The description is given in the following order.

1. Present embodiment (example of extracting the shape of a user's hand)

2. Modifications

(1. Present embodiment)

(Configuration example of the information processing system 1)

FIG. 1 shows a configuration example of the information processing system 1 of the present embodiment.

The information processing system 1 executes predetermined processing in accordance with a gesture (or posture) made with the user's hand, and includes an information processing apparatus 21, a camera 22, and a light emitting device 23.

To cause the information processing system 1 to execute the predetermined processing, the user changes the shape of his or her hand in front of the lens surface of the camera 22.

At this time, the information processing system 1 recognizes the shape of the user's hand and executes the predetermined processing in accordance with the recognition result.

In the present embodiment, the user changes the shape of the hand in front of the lens surface of the camera 22 and makes the gesture (or posture) with the hand moved to a position closer to the lens surface of the camera 22 than the user's face, chest, and the like.

The information processing apparatus 21 controls the camera 22 and the light emitting device 23. The information processing apparatus 21 also recognizes the shape of the user's hand based on the captured images captured by the camera 22, and executes the predetermined processing in accordance with the recognition result.

The camera 22 includes a lens used for imaging a subject such as the user, and the front surface of the lens is covered with a visible light cut filter 22a that blocks visible light.

With this configuration, apart from the infrared components of fluorescent lamps or sunlight, the camera 22 receives only the reflected light of the invisible light with which the light emitting device 23 irradiates the subject, and supplies the resulting captured image to the information processing apparatus 21.

That is, for example, the camera 22 receives only the reflected light of light of a first wavelength (for example, near-infrared light of 870 nm), which is invisible light with which the light emitting device 23 irradiates the subject, and supplies the resulting first captured image to the information processing apparatus 21.

The camera 22 also receives only the reflected light of light of a second wavelength different from the first wavelength (for example, near-infrared light of 950 nm), which is likewise invisible light with which the light emitting device 23 irradiates the subject, and supplies the resulting second captured image to the information processing apparatus 21.

The light emitting device 23 includes LEDs (light emitting diodes) 23a₁ and 23a₂, which emit light of the first wavelength, and LEDs 23b₁ and 23b₂, which emit light of the second wavelength.

Hereinafter, when the LEDs 23a₁ and 23a₂ need not be distinguished from each other, they are simply referred to as the LED 23a. Likewise, when the LEDs 23b₁ and 23b₂ need not be distinguished from each other, they are simply referred to as the LED 23b.

The LED 23a and the LED 23b emit light alternately under the control of the information processing apparatus 21.

The outputs of the LED 23a and the LED 23b are adjusted so that the intensity (amount of light) of the reflected light received by the camera 22 is the same for the reflected light of the first wavelength and the reflected light of the second wavelength.

As shown in FIG. 1, the LEDs 23a and the LEDs 23b are arranged alternately in a checkerboard pattern, and a diffusion plate 23c that uniformly diffuses the light emitted from the LEDs 23a and 23b is provided in front of them. With this configuration, the subject is irradiated evenly with the light of the first and second wavelengths.

The light emitting device 23 is placed at a position where the light emitted from the LED 23a or the LED 23b reliably reaches at least the user's hand. In the present embodiment, the user changes the shape of the hand in front of the lens surface of the camera 22, so the light emitting device 23 is placed, for example, close to the camera 22.

(Configuration example of the information processing apparatus 21)

FIG. 2 shows a configuration example of the information processing apparatus 21.

The information processing apparatus 21 includes a control unit 41, a binarization unit 42, a skin extraction unit 43, a threshold determination unit 44, a mask image generation unit 45, and a shape extraction unit 46.

The control unit 41 controls the light emitting device 23 so that the LED 23a and the LED 23b emit light alternately.

The binarization unit 42 is supplied with the first captured image and the second captured image from the camera 22. Based on the first and second captured images supplied from the camera 22, the binarization unit 42 extracts (detects) target pixels. In one embodiment, the target pixels correspond to one or more skin regions representing the user's skin and to the region of the first captured image other than the skin regions.

The binarization unit 42 then generates a binarized skin image by binarizing the pixel values of the pixels constituting the extracted skin region and the pixel values of the pixels constituting the region other than the skin region to mutually different values (for example, 0 and 1), and supplies the binarized skin image to the skin extraction unit 43 and the shape extraction unit 46.

The skin extraction unit 43 and the mask image generation unit 45 are supplied with the first captured image from the camera 22.

Based on the binarized skin image supplied from the binarization unit 42, the skin extraction unit 43 extracts, from the first captured image supplied from the camera 22, the region corresponding to the skin region in the binarized skin image (the region representing the user's skin).

The skin extraction unit 43 then generates a skin image that includes the extracted region and supplies the skin image to the threshold determination unit 44. Note that the skin extraction unit 43 may instead supply the extracted region itself to the threshold determination unit 44 as the skin image.
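The skin extraction step can be sketched in Python as follows. This is only an illustration under stated assumptions: images are nested lists of luminance values, the binarized skin image uses 1 for skin pixels, and non-skin pixels are marked with `None`. The placeholder value and the function name `extract_skin_image` are hypothetical and are not specified in the text.

```python
def extract_skin_image(first_image, binary_skin):
    """Keep the luminance of pixels flagged as skin; mark all others as None.

    first_image: nested list of luminance values (the first captured image).
    binary_skin: nested list of 0/1 flags of the same size (1 = skin).
    """
    return [
        [lum if flag == 1 else None for lum, flag in zip(row_img, row_bin)]
        for row_img, row_bin in zip(first_image, binary_skin)
    ]
```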

Based on the skin image supplied from the skin extraction unit 43, the threshold determination unit 44 creates a histogram of a processed image such as the skin image (that is, a histogram of the luminance values of the pixels constituting the skin image). Based on the created histogram of the skin image, the threshold determination unit 44 then determines a mask threshold used for generating a mask image (described later), and supplies the mask threshold to the mask image generation unit 45.
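The histogram step can be sketched as a count of how often each luminance value occurs among the skin pixels. The sketch below assumes the skin image is a nested list in which non-skin pixels are marked with `None`; that convention and the function name `luminance_histogram` are assumptions made for illustration. How the mask threshold is chosen from the histogram is not described in this passage, so it is not shown.

```python
from collections import Counter


def luminance_histogram(skin_image):
    """Count occurrences of each luminance value among the skin pixels.

    Pixels marked None (assumed to be non-skin) are ignored.
    """
    return Counter(
        lum for row in skin_image for lum in row if lum is not None
    )
```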

Based on the mask threshold supplied from the threshold determination unit 44, the mask image generation unit 45 generates a mask image from the first captured image supplied from the camera 22, and supplies the mask image to the shape extraction unit 46.

Note that the mask image is an image obtained by binarizing the first captured image into a mask region, made up of pixels whose luminance values fall within the range of luminance values specified by the mask threshold, and a non-mask region, which is everything other than the mask region.
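The definition of the mask image can be sketched as a range test on each pixel. In this Python sketch the range is given by two hypothetical parameters, `lower` and `upper`, with inclusive bounds; the text only says that the mask threshold specifies a range of luminance values, so the exact form of the bounds is an assumption.

```python
def make_mask_image(first_image, lower, upper):
    """Binarize an image: 1 where lower <= luminance <= upper, otherwise 0."""
    return [
        [1 if lower <= lum <= upper else 0 for lum in row]
        for row in first_image
    ]
```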

Based on the mask image from the mask image generation unit 45, the shape extraction unit 46 extracts at least one predetermined subject from the binarized skin image supplied from the binarization unit 42. Specifically, it extracts the region corresponding to the mask region in the mask image, for example a shape region representing the shape of the user's hand.
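One way to read this step is as a pixel-wise AND of the binarized skin image and the mask image: a pixel belongs to the shape region only if it is skin and lies in the mask region. The Python sketch below illustrates that reading; the function name `extract_shape_region` and the use of 1 for both skin pixels and mask pixels are assumptions.

```python
def extract_shape_region(binary_skin, mask_image):
    """Return 1 only where a pixel is both skin (1) and inside the mask region (1)."""
    return [
        [skin & mask for skin, mask in zip(row_skin, row_mask)]
        for row_skin, row_mask in zip(binary_skin, mask_image)
    ]
```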

The shape extraction unit 46 then recognizes the shape of the hand based on the extracted shape region, performs processing corresponding to the recognition result, and outputs the processing result to a subsequent stage.

Note that although the binarization unit 42 extracts the skin region and the region other than the skin region from the first captured image, it may instead extract them from the second captured image. In that case, the skin extraction unit 43 and the mask image generation unit 45 are supplied with the second captured image from the camera 22 instead of the first captured image.

The skin extraction unit 43 then generates the skin image from the second captured image, and the mask image generation unit 45 generates the mask image from the second captured image.

(Generation of the binarized skin image)

Next, the processing in which the binarization unit 42 generates the binarized skin image will be described in detail with reference to FIGS. 3 to 5.

FIGS. 3 and 4 are used to describe the first captured image and the second captured image captured by the camera 22. FIG. 5 is used to describe the binarized skin image generated by the binarization unit 42 based on the first and second captured images.

FIG. 3 shows the reflection characteristics of human skin for irradiation light of different wavelengths.

Note that these reflection characteristics hold generally, regardless of differences in human skin color (racial differences) or skin condition (such as a suntan).

In FIG. 3, the horizontal axis represents the wavelength of the light irradiated onto human skin, and the vertical axis represents the reflectance of that light.

The reflectance of light irradiated onto human skin peaks near 800 nm, decreases sharply from around 900 nm, reaches a minimum near 1000 nm, and then rises again.

Specifically, as shown in FIG. 3, for example, the reflectance of the reflected light obtained by irradiating human skin with light having a wavelength of 870 nm is 63%, and the reflectance of the reflected light obtained by irradiating human skin with light having a wavelength of 950 nm is 50%.

This behavior is peculiar to human skin. For objects other than human skin (for example, hair or clothing), the change in reflectance between about 800 nm and 1000 nm is often gradual.

Next, the first and second captured images obtained by imaging with the camera 22 will be described with reference to FIG. 4.

FIG. 4 shows an example of a first captured image, obtained by receiving the reflected light of light with a wavelength of 870 nm irradiated onto the user, and an example of a second captured image, obtained by receiving the reflected light of light with a wavelength of 950 nm irradiated onto the user.

FIG. 4A shows the first captured image, in which the user's face 61 and hand 62 appear as the user's skin region, and the shirt 63 worn by the user and the background 64 appear as the region other than the user's skin region.

FIG. 4B shows the second captured image, in which the user's face 81 and hand 82 appear as the user's skin region, and the shirt 83 worn by the user and the background 84 appear as the region other than the user's skin region.

As described with reference to FIG. 3, the reflection characteristics of the user's skin are such that the reflectance of light with a wavelength of 870 nm is higher than the reflectance of light with a wavelength of 950 nm.

Therefore, when the user is irradiated with light having a wavelength of 870 nm, the reflected light from the user's skin that enters the lens of the camera 22 is brighter than the reflected light of light having a wavelength of 950 nm.

As a result, the luminance values of the pixels constituting the user's skin region (face 61 and hand 62) in the first captured image are larger than the luminance values of the pixels constituting the user's skin region (face 81 and hand 82) in the second captured image.

Accordingly, the difference obtained by subtracting the luminance values of the pixels constituting the user's skin region in the second captured image from the luminance values of the pixels constituting the user's skin region in the first captured image is a positive value.

In contrast, for portions other than the user's skin, the reflectance of light with a wavelength of 870 nm is often equal to or lower than the reflectance of light with a wavelength of 950 nm.

Therefore, when the user is irradiated with light having a wavelength of 870 nm, the reflected light from portions other than the user's skin that enters the lens of the camera 22 is as bright as, or darker than, the reflected light of light having a wavelength of 950 nm.

As a result, the luminance values of the pixels constituting the region other than the user's skin region (shirt 63 and background 64) in the first captured image are equal to or smaller than the luminance values of the pixels constituting the region other than the user's skin region (shirt 83 and background 84) in the second captured image.

Accordingly, the difference obtained by subtracting the luminance values of the pixels constituting the corresponding portion in the second captured image from the luminance values of the pixels constituting the portion other than the user's skin in the first captured image is a value of 0 or less (a non-positive value).

Accordingly, the binarization unit 42 calculates the difference between the luminance values of corresponding pixels in the first captured image and the second captured image, and based on the calculated differences extracts the target pixels (for example, the skin region) and the region other than the user's skin region. The binarization unit 42 then generates a binarized skin image in which the extracted skin region of the user is represented by the value 1 and the extracted region other than the user's skin region is represented by the value 0.

In other words, for example, when the calculated difference is a positive value, the binarization unit 42 extracts the corresponding pixel as a pixel constituting the user's skin region, and when the calculated difference is not a positive value, it extracts the corresponding pixel as a pixel constituting the region other than the user's skin region.

The binarization unit 42 then generates the binarized skin image by setting to 1 the value of each pixel extracted as part of the user's skin region and setting to 0 the value of each pixel extracted as part of the region other than the user's skin region, and supplies the binarized skin image to the skin extraction unit 43 and the shape extraction unit 46.

Note that, depending on the reflectance of a portion other than the user's skin, the difference calculated for that portion may be positive, although smaller than the difference calculated for the skin. Therefore, when the difference is positive but smaller than a predetermined threshold, it may be desirable to treat the difference as belonging to a portion other than the user's skin and to set the value 0 for that portion.

Alternatively, the binarizing unit 42 may calculate the absolute difference between the luminance values of corresponding pixels of the first and second captured images and, based on whether the calculated absolute difference is equal to or greater than a predetermined threshold, extract the user's skin portion (the skin region) and the portion other than the skin (the region other than the skin region) to generate the binarized skin image.

This processing exploits the fact that, owing to the reflection characteristics, the absolute difference corresponding to the user's skin is relatively large, whereas the absolute difference corresponding to portions other than the skin is relatively small.
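
The binarization variants described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the two captured images are equally sized 2-D lists of integer luminance values, and the function name `binarize_skin` and its parameters `threshold` and `use_abs` are hypothetical.

```python
def binarize_skin(first, second, threshold=0, use_abs=False):
    """Return a binary image: 1 where a pixel is judged to be skin, else 0.

    first, second: equally sized 2-D lists of luminance values captured
    under the first and second wavelengths (assumed layout).
    threshold: hypothetical "predetermined threshold" from the text.
    use_abs: if True, compare the absolute difference with the threshold
             (choose a positive threshold in that case).
    """
    result = []
    for row_a, row_b in zip(first, second):
        out_row = []
        for a, b in zip(row_a, row_b):
            diff = a - b
            if use_abs:
                is_skin = abs(diff) >= threshold
            else:
                # Skin only if the difference is positive and not below
                # the threshold; small positive differences become 0.
                is_skin = diff > 0 and diff >= threshold
            out_row.append(1 if is_skin else 0)
        result.append(out_row)
    return result


first = [[100, 50], [60, 40]]
second = [[70, 55], [58, 40]]
print(binarize_skin(first, second))                # sign of the difference only
print(binarize_skin(first, second, threshold=10))  # small positive differences rejected
```

With the default `threshold=0` the function reproduces the plain rule "positive difference means skin"; a positive threshold implements the refinement for weakly reflecting non-skin surfaces.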

Next, FIG. 5 shows an example of the binarized skin image generated by the binarizing unit 42.

In the binarized skin image shown in FIG. 5, the portion shown in black is the skin region represented by the value 1. This skin region includes a face region 101 representing the skin of the user's face and a hand region 102 representing the skin of the user's hand.

Note that, for convenience of illustration, the face region 101 shown in FIG. 5 includes the eyebrows, eyes, hair, and so on in addition to the skin of the face, but the face region 101 actually consists only of the skin of the face.

In the binarized skin image shown in FIG. 5, the portion shown in white is the region other than the skin region and is represented by the value 0.

The binarizing unit 42 supplies the generated binarized skin image to the skin extraction unit 43 and the shape extraction unit 46.

Based on the binarized skin image supplied from the binarizing unit 42, the skin extraction unit 43 extracts, from the first captured image supplied from the camera 22, the region corresponding to the face region 101 and the hand region 102 in the binarized skin image (the region including the face 61 and the hand 62). The skin extraction unit 43 then generates a skin image including the extracted region.

(Generation of skin image)

Next, with reference to FIG. 6, the processing in which the skin extraction unit 43 generates a processed image (for example, a skin image) from the first captured image based on the binarized skin image supplied from the binarizing unit 42 is described.

FIG. 6 shows an example of the skin image extracted by the skin extraction unit 43. The skin image shown in FIG. 6 shows the user's face 61 and hand 62.

Note that, for convenience of illustration, the skin image shown in FIG. 6 includes the eyebrows, eyes, hair, and so on as part of the user's face 61 in addition to the skin of the face, but the face 61 shown in FIG. 6 actually represents only the skin of the face.

The skin extraction unit 43 multiplies the luminance value of each pixel of the binarized skin image supplied from the binarizing unit 42 by the luminance value of the corresponding pixel of the first captured image supplied from the camera 22.

The skin extraction unit 43 then extracts, from the pixels constituting the first captured image, the region made up of pixels whose multiplication result is not 0 (the region including the face 61 and the hand 62), and generates a skin image including the extracted region.

As a result, among the regions in the first captured image, the face 61 contained in the region corresponding to the face region 101 of the binarized skin image and the hand 62 contained in the region corresponding to the hand region 102 of the binarized skin image are extracted as they are. The region corresponding to the region other than the skin region of the binarized skin image (shown in white in FIG. 6) is given the luminance value 225, and the skin image shown in FIG. 6 is thereby generated from the first captured image.
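
The multiplication described above can be sketched as follows. This is only an illustration under stated assumptions: images are 2-D lists of integer luminance values, and the names `make_skin_image` and `FILL_LUMINANCE` are hypothetical. The value 225 is the one the text assigns to the non-skin region.

```python
FILL_LUMINANCE = 225  # luminance the text gives to the non-skin region


def make_skin_image(first, binary_skin, fill=FILL_LUMINANCE):
    """Keep first-image luminance where the binarized skin image is 1.

    first: 2-D list of luminance values of the first captured image.
    binary_skin: 2-D list of 0/1 values of the same size.
    """
    skin = []
    for lum_row, bin_row in zip(first, binary_skin):
        out_row = []
        for lum, bit in zip(lum_row, bin_row):
            product = lum * bit
            # As in the text, a pixel is kept only if the product is not 0;
            # every other pixel receives the fill luminance.
            out_row.append(product if product != 0 else fill)
        skin.append(out_row)
    return skin


first = [[100, 50], [60, 40]]
binary_skin = [[1, 0], [1, 0]]
print(make_skin_image(first, binary_skin))
```

One consequence of testing the product rather than the binary value is that a skin pixel whose luminance happens to be 0 is also filled; the sketch follows the wording of the text on this point.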

The skin extraction unit 43 supplies the generated skin image to the threshold determination unit 44.

Based on the skin image supplied from the skin extraction unit 43, the threshold determination unit 44 determines the mask threshold used to generate a mask image.

(Determination of mask threshold)

Next, with reference to FIG. 7, the processing in which the threshold determination unit 44 determines the mask threshold is described in detail.

FIG. 7 shows an example of a histogram of the skin image.

In FIG. 7, the horizontal axis represents the luminance values of the pixels constituting the skin image, and the vertical axis represents the number of pixels having each luminance value on the horizontal axis.

Note that the number of pixels with the luminance value 225, which constitute the region shown in white in the skin image of FIG. 6, would normally appear in the histogram of FIG. 7; it is omitted because it is not used to determine the mask threshold.

The threshold determination unit 44 creates a histogram such as the one shown in FIG. 7 from the luminance values of the pixels constituting the skin image supplied from the skin extraction unit 43.

In the histogram of FIG. 7, large numbers of pixels are concentrated between luminance values 0 and 54 and between luminance values 55 and 110. That is, in the histogram of FIG. 7, the target pixels fall into two separate groups.

As described above, it is assumed that the hand is located close to the camera 22 and that the face, chest, and so on are located far from the camera 22.

For example, because the LED 23a and the LED 23b of the light-emitting device 23 emit light from a position close to the camera 22, the luminance values of a part of the user's body located closer to the camera 22 (the light-emitting device 23), in this case the hand, are larger, and the luminance values of a part located farther from the camera 22, in this case the face and so on, are smaller.

Therefore, the luminance values of the pixels constituting the skin of the hand, which is close to the camera 22, are larger than the luminance values of the pixels constituting the skin of the face, which is far from the camera 22.

For this reason, the luminance values from 0 to 54 are those of the pixels constituting (the region of) the face 61, and the luminance values from 55 to 110 are those of the pixels constituting a predetermined subject such as the hand 62.

The threshold determination unit 44 determines the minimum luminance value of this range (55 in this example) as the lower limit threshold Th_L and the maximum luminance value (110 in this example) as the upper limit threshold Th_H.

The threshold determination unit 44 then supplies the determined lower limit threshold Th_L and upper limit threshold Th_H to the mask image generation unit 45 as the mask threshold.
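
The text does not spell out how the boundary between the two histogram groups is located, so the following is only one possible reading, sketched under explicit assumptions: the skin image is a 2-D list of integer luminance values in the range 0 to 255, pixels equal to the fill value 225 are ignored, Th_L is taken as the least populated luminance between the two highest histogram peaks, and Th_H is the largest luminance that occurs. The function names and the `min_gap` parameter are hypothetical.

```python
def luminance_histogram(skin_image, fill=225, levels=256):
    """Count pixels per luminance value, ignoring the fill value."""
    hist = [0] * levels
    for row in skin_image:
        for lum in row:
            if lum != fill:
                hist[lum] += 1
    return hist


def mask_thresholds(hist, min_gap=20):
    """Return (Th_L, Th_H) from a two-group histogram (assumed heuristic).

    Assumes the histogram contains at least one skin pixel and two groups
    whose peaks lie at least min_gap luminance levels apart.
    """
    levels = range(len(hist))
    # Highest peak (first one on ties).
    peak1 = max(levels, key=hist.__getitem__)
    # Highest peak at least min_gap away from the first one.
    far = [v for v in levels if abs(v - peak1) >= min_gap]
    peak2 = max(far, key=hist.__getitem__)
    lo, hi = sorted((peak1, peak2))
    # Th_L: least populated luminance between the two peaks.
    th_l = min(range(lo + 1, hi), key=hist.__getitem__)
    # Th_H: largest luminance that actually occurs.
    th_h = max(v for v, n in enumerate(hist) if n > 0)
    return th_l, th_h


# Synthetic example: a darker "face" group around 30, a brighter "hand"
# group around 80, and background pixels carrying the fill value.
pixels = []
for v in range(10, 51):
    pixels += [v] * (50 - 2 * abs(v - 30))
for v in range(61, 100):
    pixels += [v] * (40 - 2 * abs(v - 80))
pixels += [225] * 100
th_l, th_h = mask_thresholds(luminance_histogram([pixels]))
print(th_l, th_h)
```

A real histogram is noisy, so a practical version would probably smooth the counts before searching for the valley; that step is left out here to keep the sketch short.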

Based on the mask threshold (the lower limit threshold Th_L and the upper limit threshold Th_H) supplied from the threshold determination unit 44, the mask image generation unit 45 detects a mask region and a non-mask region in the first captured image supplied from the camera 22, and generates a mask image in which the detected mask region and non-mask region are binarized to different values.

(Generation of mask image)

Next, the processing in which the mask image generation unit 45 generates the mask image based on the mask threshold from the threshold determination unit 44 is described in detail with reference to FIG. 8.

FIG. 8 shows an example of the mask image. The mask region 121, shown in black in the mask image of FIG. 8, is the region of the corresponding first captured image whose luminance values are at least the lower limit threshold Th_L and at most the upper limit threshold Th_H.

The non-mask region, shown in white in the mask image of FIG. 8, is the region of the corresponding first captured image whose luminance values are less than the lower limit threshold Th_L or greater than the upper limit threshold Th_H.

When the luminance value of a pixel constituting the first captured image supplied from the camera 22 is at least the lower limit threshold Th_L and at most the upper limit threshold Th_H, the mask image generation unit 45 detects that pixel as a pixel included in the mask region and converts its luminance value to the value 1.

When the luminance value of a pixel constituting the first captured image supplied from the camera 22 is less than the lower limit threshold Th_L or greater than the upper limit threshold Th_H, the mask image generation unit 45 detects that pixel as a pixel included in the non-mask region and converts its luminance value to the value 0.

In this way, the mask image generation unit 45 generates a mask image consisting of the mask region 121 (shown in black), made up of pixels with the value 1, and the non-mask region (shown in white), made up of pixels with the value 0, and supplies this mask image to the shape extraction unit 46.
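
The range test described above can be sketched as follows, assuming the first captured image is a 2-D list of integer luminance values; the function name `make_mask` is hypothetical.

```python
def make_mask(first, th_l, th_h):
    """Return 1 for pixels with Th_L <= luminance <= Th_H, else 0."""
    return [[1 if th_l <= lum <= th_h else 0 for lum in row] for row in first]


# Thresholds 55 and 110 are the example values given in the text.
first = [[30, 55], [110, 111]]
print(make_mask(first, 55, 110))
```

Both bounds are inclusive, matching the "at least Th_L and at most Th_H" wording.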

Based on the mask image supplied from the mask image generation unit 45, the shape extraction unit 46 extracts, from the face region 101 and the hand region 102 in the binarized skin image supplied from the binarizing unit 42, the region corresponding to the mask region 121 in the mask image as a shape region representing, for example, the shape of the user's hand.

(Extraction of the shape of the hand)

Next, the processing in which the shape extraction unit 46 extracts the shape of the user's hand from the binarized skin image is described in detail with reference to FIG. 9.

FIG. 9 shows a display example of an extracted image including the shape region extracted by the shape extraction unit 46.

In the extracted image shown in FIG. 9, the shape region 141 is the shape of the user's hand.

The shape extraction unit 46 multiplies the luminance value of each pixel constituting the mask image supplied from the mask image generation unit 45 by the luminance value of the corresponding pixel constituting the binarized skin image supplied from the binarizing unit 42.

The shape extraction unit 46 then extracts, as the shape region 141, the region in the binarized skin image for which the multiplication result is not 0, that is, the part of the face region 101 and the hand region 102 in the binarized skin image (FIG. 5) that overlaps the mask region 121 in the mask image (FIG. 8).

The shape extraction unit 46 also recognizes the shape of the user's hand based on the extracted shape region 141 and performs processing according to the recognition result.

Note that the mask region 121 in the mask image shown in FIG. 8 includes, in addition to the user's hand, the shirt that the user is wearing.

However, because the face region 101 and the hand region 102 in the binarized skin image do not include the shirt that the user is wearing, the shape extraction unit 46 can accurately extract the shape region 141 representing only the shape of the hand, without extracting a region representing the shape of the shirt.
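
The pixel-wise product described above can be sketched as follows, assuming the mask image and the binarized skin image are equally sized 2-D lists of 0/1 values; the function name `extract_shape` is hypothetical.

```python
def extract_shape(mask, binary_skin):
    """Pixel-wise product of two 0/1 images: 1 only where both are 1."""
    return [
        [m * s for m, s in zip(mask_row, skin_row)]
        for mask_row, skin_row in zip(mask, binary_skin)
    ]


# Columns: hand pixel, shirt pixel, face pixel, background pixel.
mask = [[1, 1, 0, 0]]         # the hand and the shirt fall inside [Th_L, Th_H]
binary_skin = [[1, 0, 1, 0]]  # the hand and the face are skin
print(extract_shape(mask, binary_skin))
```

The example mirrors the argument in the text: the shirt passes the luminance mask but is not skin, the face is skin but fails the mask, so only the hand pixel survives the product.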

(Description of the operation of the shape extraction processing)

Next, the shape extraction processing in which the information processing system 1 extracts the shape of the user's hand and the like is described in detail.

FIG. 10 is a flowchart illustrating the shape extraction processing. Note that this shape extraction processing is executed repeatedly from the time the information processing system 1 is powered on.

The following describes the shape extraction processing performed when the user is in front of the camera 22.

In step S1, the control unit 41 controls the LED 23a of the light-emitting device 23 to start emitting light of the first wavelength. Note that, if the LED 23b is emitting light, the control unit 41 stops the emission of the LED 23b before starting the emission of the LED 23a.

In step S2, the camera 22 captures an image of the user irradiated with the light of the first wavelength and supplies the resulting first captured image to the information processing apparatus 21.

In step S3, the control unit 41 controls the LED 23a of the light-emitting device 23 to stop emitting the light of the first wavelength, and controls the LED 23b of the light-emitting device 23 to start emitting light of the second wavelength.

In step S4, the camera 22 captures an image of the user irradiated with the light of the second wavelength and supplies the resulting second captured image to the information processing apparatus 21.

In step S5, the binarizing unit 42 generates a binarized skin image such as the one shown in FIG. 5 based on the differences between the luminance values of corresponding pixels of the first and second captured images supplied from the camera 22, and supplies this binarized skin image to the skin extraction unit 43 and the shape extraction unit 46.

In step S6, based on the binarized skin image supplied from the binarizing unit 42, the skin extraction unit 43 extracts, from the first captured image supplied from the camera 22, the region corresponding to the skin region in the binarized skin image (the region showing the user's skin).

The skin extraction unit 43 then generates a skin image including the extracted region and supplies this skin image to the threshold determination unit 44.

In step S7, the threshold determination unit 44 creates a histogram of the skin image, such as the one shown in FIG. 7, based on the luminance values of the pixels constituting the skin image supplied from the skin extraction unit 43.

In step S8, based on the created histogram of the skin image, the threshold determination unit 44 determines the luminance value at which the number of pixels is smallest as the lower limit threshold Th_L and the maximum luminance value as the upper limit threshold Th_H.

The threshold determination unit 44 then supplies the determined lower limit threshold Th_L and upper limit threshold Th_H to the mask image generation unit 45 as the mask threshold.

In step S9, based on the mask threshold (the lower limit threshold Th_L and the upper limit threshold Th_H) supplied from the threshold determination unit 44, the mask image generation unit 45 binarizes the first captured image supplied from the camera 22 to generate a mask image such as the one shown in FIG. 8, and supplies this mask image to the shape extraction unit 46.

In step S10, based on the mask image supplied from the mask image generation unit 45, the shape extraction unit 46 extracts, from the binarized skin image supplied from the binarizing unit 42, the region corresponding to the mask region in the mask image as an extracted region representing, for example, the shape of the user's hand.

The shape extraction unit 46 then recognizes the shape of the hand from the extracted region, performs processing according to the recognition result, and outputs the processing result to the subsequent stage.

With this, the shape extraction processing ends.
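
The alternating illumination and capture order of steps S1 to S4 can be sketched as follows. The `Led` class, the `capture_pair` function, and the `capture` callback are hypothetical stand-ins for the LED 23a, the LED 23b, and the camera 22; no real device API is implied.

```python
class Led:
    """Hypothetical stand-in for the LED 23a or the LED 23b."""

    def __init__(self, name):
        self.name = name
        self.on = False

    def start(self):
        self.on = True

    def stop(self):
        self.on = False


def capture_pair(led_a, led_b, capture):
    """Capture one image per wavelength, never with both LEDs lit."""
    if led_b.on:
        led_b.stop()       # S1: make sure the second LED is off first
    led_a.start()          # S1: start emitting the first wavelength
    first = capture()      # S2: first captured image
    led_a.stop()           # S3: stop the first wavelength
    led_b.start()          # S3: start emitting the second wavelength
    second = capture()     # S4: second captured image
    return first, second


led_a, led_b = Led("870 nm"), Led("950 nm")
led_b.start()  # state left over from a previous cycle


def capture():
    # Dummy camera: reports which LEDs are lit instead of returning pixels.
    return [led.name for led in (led_a, led_b) if led.on]


first, second = capture_pair(led_a, led_b, capture)
print(first, second)
```

Because the processing repeats, the second LED is still lit when the next cycle begins, which is why the sketch checks and stops it before starting the first LED.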

As described above, in the shape extraction processing, a mask image is generated from the first captured image captured by the single camera 22 based on the mask threshold, and the shape of the user's hand is extracted from the binarized skin image based on the generated mask image.

Therefore, compared with a case in which, for example, a distance image representing the distance between the cameras and the user's hand and the like is generated from a plurality of images captured by a plurality of cameras and that distance image is used as the mask image to extract the shape of the user's hand, the amount of computation needed to generate the mask image is reduced and the shape of the user's hand can be extracted with fewer components.

Furthermore, in the shape extraction processing, a mask image is generated that consists of a non-mask region and the mask region 121 which, as far as skin is concerned, contains only the skin of the hand and not the skin of the face, based on the difference between the distance from the camera 22 to the user's face and the distance from the camera 22 to the user's hand.

Consequently, even when the hand region 102 containing the hand to be extracted overlaps, in the binarized skin image, the face region 101 containing the face (a skin portion other than the hand), only the hand region 102 can be extracted from the binarized skin image, because the mask region 121 contains, as skin, only the skin of the hand and not the skin of the face.

As a result, the shape of the user's hand can be extracted accurately.

In the shape extraction processing, the LED 23a and the LED 23b emit invisible near-infrared light that humans cannot see.

Because the user cannot see the light emitted from the LED 23a and the LED 23b, the user is not made uncomfortable by glare from that light.

In the light-emitting device 23 of the information processing system 1, a diffusion plate 23c is provided in front of the LED 23a and the LED 23b.

With this configuration, the invisible light emitted from the LED 23a and the LED 23b is diffused uniformly, so the subject is irradiated with uniform light free from unevenness in light quantity.

The invisible light reflected from the subject is therefore received by the camera 22 as uniform light free from unevenness in light quantity, and the camera 22 can obtain first and second captured images free from such unevenness.

Accordingly, because the information processing system 1 uses first and second captured images free from unevenness in light quantity to extract the shape of the hand and the like, it can extract the shape of the hand and the like more accurately than when, for example, first and second captured images with unevenness in light quantity are used.

In the information processing system 1, it is desirable to extract the shape of the hand in, for example, about 80 ms from the start of the shape extraction processing, so that the changed shape of the hand can be recognized each time the user changes the shape of the hand.

(2. Modifications)

In the shape extraction processing described above, each time the processing is performed, the skin image is extracted and the mask threshold (the lower limit threshold Th_L and the upper limit threshold Th_H) is determined from the histogram of the extracted skin image by the processing of steps S6 to S8; however, the shape extraction processing is not limited to this.

That is, for example, when the shape extraction processing is performed, the mask threshold previously determined in steps S6 to S8 may be used as is.

In this case, the processing of steps S6 to S8 can be omitted, so the shape of the hand and the like can be extracted quickly by the shape extraction processing.

It is also possible to omit the processing of steps S6 to S8 from the shape extraction processing by performing the same processing as steps S6 to S8 before the shape extraction processing to determine the mask threshold in advance.

Note that, as another way of determining the mask threshold in advance before the shape extraction processing, the mask threshold can also be determined based on, for example, the average of the luminance values of the pixels constituting the user's hand region.

(Method of Determining the Mask Threshold)

Next, the FFT (Fast Fourier Transform) threshold determination process, in which the threshold determination unit 44 determines the mask threshold based on the average luminance value of the pixels constituting the user's hand region, is described with reference to FIG. 11.

FIG. 11 shows an example of a first captured image obtained by imaging a user irradiated with light having a wavelength of 870 nm.

Note that when the FFT threshold determination process is performed, the camera 22 supplies the threshold determination unit 44 with a plurality of first captured images obtained by imaging a user who is waving a hand.

The threshold determination unit 44 performs FFT processing on the plurality of first captured images and detects the hand region in the first captured images, which is moving at a constant frequency.

The threshold determination unit 44 then calculates the average value ave_L of the luminance values of the pixels constituting a rectangular region 161, which is part of the detected hand region.

The threshold determination unit 44 further determines the value ave_L - a, obtained by subtracting an adjustment value a from the average value ave_L, as the lower threshold Th_L, and determines the value ave_L + b, obtained by adding an adjustment value b to the average value ave_L, as the upper threshold Th_H.

Note that the adjustment values a and b are used to adjust the average value ave_L when determining the lower threshold Th_L and the upper threshold Th_H.

The adjustment values a and b are variables that depend on the intensity (amount) of the light emitted from the LEDs 23a and 23b, the distance from the camera 22 to the user, and the light sensitivity of the CCD (Charge Coupled Device) image sensor used in the camera 22. In practice, however, they are determined experimentally in most cases.
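The threshold rule described above can be illustrated with a minimal Python sketch. It assumes the image is a plain 2-D list of luminance values and that the rectangle is given as (top, left, height, width). The function name `mask_thresholds`, the sample image, and the values used for a and b are hypothetical; the source states only that a and b are usually found experimentally.

```python
from statistics import mean


def mask_thresholds(image, rect, a, b):
    """Return (Th_L, Th_H) = (ave_L - a, ave_L + b) for a rectangular region.

    image: 2-D list of luminance values (rows of pixels).
    rect:  (top, left, height, width) of a rectangle inside the hand region.
    a, b:  adjustment values (hypothetical here; found experimentally in practice).
    """
    top, left, height, width = rect
    pixels = [
        image[y][x]
        for y in range(top, top + height)
        for x in range(left, left + width)
    ]
    ave_l = mean(pixels)  # average luminance ave_L of the rectangle
    return ave_l - a, ave_l + b


# Hypothetical 4x4 luminance image; the bright 2x2 block stands in for region 161.
image = [
    [10, 10, 10, 10],
    [10, 200, 204, 10],
    [10, 196, 200, 10],
    [10, 10, 10, 10],
]
th_l, th_h = mask_thresholds(image, rect=(1, 1, 2, 2), a=20, b=30)
print(th_l, th_h)  # ave_L is 200, so this prints 180 230
```

Using separate values for a and b lets the accepted luminance band be asymmetric around ave_L, which matches the description of two independent adjustment values.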

(Operation of the FFT Threshold Determination Process)

Next, the operation of the FFT threshold determination process, in which the threshold determination unit 44 determines the mask threshold based on the average luminance value of the pixels constituting the user's hand region, is described.

FIG. 12 is a flowchart for explaining the FFT threshold determination process. The process starts, for example, when the power supply of the information processing system is turned on and before the shape extraction process is performed.

In step S31, the control unit 41 controls the LED 23a of the light emitting device 23 to start emitting light of the first wavelength.

In step S32, the control unit 41 controls a display, a speaker, or the like (not shown) provided in the information processing apparatus 21 to instruct the user to wave a hand.

In step S33, the camera 22 images the user waving a hand and supplies the resulting first captured images to the threshold determination unit 44 of the information processing apparatus 21.

In step S34, the threshold determination unit 44 performs FFT processing on the first captured images and detects the hand region in the first captured images, which is moving at a constant frequency.

In step S35, the threshold determination unit 44 calculates the average value ave_L of the luminance values of the pixels constituting the rectangular region 161, which is part of the detected hand region.

In step S36, the threshold determination unit 44 determines the value ave_L - a, obtained by subtracting the adjustment value a from the average value ave_L, as the lower threshold Th_L, and determines the value ave_L + b, obtained by adding the adjustment value b to the average value ave_L, as the upper threshold Th_H.

This completes the FFT threshold determination process. As described above, the FFT threshold determination process determines the mask threshold before the shape extraction process is performed, so steps S6 to S8 can be omitted from the shape extraction process and the shape of the hand and the like can be extracted more quickly.
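The source does not say how the FFT output of step S34 is turned into a hand region. One plausible reading, sketched below under that assumption, is to take the temporal spectrum of every pixel across the frame sequence and keep the pixels with a strong component at the waving frequency. The function names, the frame data, the frequency bin, and the magnitude threshold are all hypothetical, and the small recursive FFT is included only to keep the example self-contained.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def periodic_pixels(frames, bin_index, min_magnitude):
    """Return the set of (y, x) pixels whose luminance over time has a
    spectral magnitude of at least min_magnitude at frequency bin bin_index."""
    height, width = len(frames[0]), len(frames[0][0])
    hits = set()
    for y in range(height):
        for x in range(width):
            series = [frame[y][x] for frame in frames]
            if abs(fft(series)[bin_index]) >= min_magnitude:
                hits.add((y, x))
    return hits


# Hypothetical sequence of eight 2x3 frames: pixel (0, 1) alternates between
# bright and dark (a hand passing over it); every other pixel stays constant.
frames = [
    [[10, 200 if t % 2 == 0 else 20, 10], [10, 10, 10]]
    for t in range(8)
]
hand_pixels = periodic_pixels(frames, bin_index=4, min_magnitude=100.0)
print(hand_pixels)  # {(0, 1)}
```

The detected pixels would then supply the rectangle over which ave_L is averaged in step S35.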

Note that in the FFT threshold determination process described above, FFT processing is performed on a plurality of first captured images, the hand region in the first captured images is detected, and the mask threshold (the lower threshold Th_L and the upper threshold Th_H) is determined based on the average luminance value of the pixels in that hand region. However, the FFT threshold determination process is not limited to this.

That is, for example, the FFT threshold determination process may perform FFT processing on a plurality of second captured images obtained by the camera 22 imaging a user waving a hand, detect the hand region in the second captured images, and determine the mask threshold based on the average luminance value of the pixels in that hand region.

In the present embodiment, the binarization unit 42 extracts the user's skin region and the region other than the skin region from the first captured image, and supplies a binarized skin image composed of the extracted skin region and the region other than the skin region to the skin extraction unit 43 and the shape extraction unit 46. However, the present invention is not limited to this.

That is, for example, the binarization unit 42 may extract the user's skin region from the first captured image and supply a binarized skin image including at least the extracted skin region to the skin extraction unit 43 and the shape extraction unit 46.

In this case, the skin extraction unit 43 extracts, from the first captured image taken by the camera 22, the region corresponding to the skin region included in the binarized skin image supplied from the binarization unit 42. The shape extraction unit 46 extracts the shape region from the skin region included in the binarized skin image supplied from the binarization unit 42.

In the present embodiment, the mask image generation unit 45 detects, for example, a mask region and a non-mask region from the first captured image and generates a mask image composed of the detected mask region and non-mask region. However, the present invention is not limited to this.

That is, for example, the mask image generation unit 45 may detect only the mask region as the extraction region used to extract the shape region from the binarized skin image, and generate a mask image including at least the detected mask region. In this case, the shape extraction unit 46 extracts, as the shape region, the part of the skin region in the binarized skin image supplied from the binarization unit 42 that corresponds to the mask region in the mask image.

Alternatively, for example, the mask image generation unit 45 may detect only the non-mask region as the extraction region and generate a mask image including at least the detected non-mask region. In this case, the shape extraction unit 46 extracts, as the shape region, the part of the skin region in the binarized skin image supplied from the binarization unit 42 that corresponds to the area outside the non-mask region in the mask image.
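The two variants above amount to the same pixel-wise operation, as the following minimal sketch suggests. It assumes both images are 2-D lists of 0/1 flags with identical dimensions; the function names and the sample arrays are hypothetical and are not taken from the source.

```python
def shape_from_mask(skin, mask):
    """Shape region = skin pixels that also lie inside the mask region."""
    return [
        [1 if s and m else 0 for s, m in zip(s_row, m_row)]
        for s_row, m_row in zip(skin, mask)
    ]


def shape_from_non_mask(skin, non_mask):
    """Shape region = skin pixels that lie outside the non-mask region."""
    return [
        [1 if s and not n else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(skin, non_mask)
    ]


# Hypothetical binarized skin image (1 = skin) and mask image (1 = mask region).
skin = [
    [1, 1, 0],
    [1, 1, 0],
]
mask = [
    [0, 1, 0],
    [0, 1, 1],
]
# The non-mask region is the complement of the mask region.
non_mask = [[1 - m for m in row] for row in mask]

shape_a = shape_from_mask(skin, mask)
shape_b = shape_from_non_mask(skin, non_mask)
print(shape_a)             # [[0, 1, 0], [0, 1, 0]]
print(shape_a == shape_b)  # True
```

Because the non-mask region is the complement of the mask region, either one is enough to recover the same shape region, which is why the mask image needs to contain only one of them.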

(Performance of the Camera 22, the LEDs 23a, and the LEDs 23b)

Next, with reference to FIGS. 13 and 14, the performance of the camera 22 and the light emitting device 23 constituting the information processing system 1 is described for the case in which the applicant of the present invention actually carried out the shape extraction process and the FFT threshold determination process.

The applicant used a video camera manufactured by Sony Corporation as the camera 22. This camera 22 has the model number XC-EI50 and includes a 1/2-type IT CCD as its imaging element.

The camera 22 has 768 × 494 effective pixels, uses a C mount as its lens mount, and uses 525-line interlaced scanning as its scanning method.

The sensitivity is F11 (400 lx), and the minimum subject illumination is 0.1 lx. The S/N (signal-to-noise) ratio of images captured by the camera 22 is 60 dB.

In the camera 22, the shutter speed with the shutter button provided on the camera 22 (normal shutter) is 1/100 to 1/10,000 second, and the shutter speed with a release switch connected externally to the camera 22 (external trigger shutter) is 1/4 to 1/10,000 second.

The external dimensions of the camera 22 are 29 (width) × 29 (height) × 32 (depth) mm, and its weight is about 50 g. The vibration resistance of the camera 22 is 70 G.

The camera 22 is sensitive over a range from the visible region at 400 nm to the near-infrared region at 1,000 nm.

FIG. 13 shows an example of the relative sensitivity characteristic of the camera 22.

Note that in FIG. 13 the horizontal axis indicates the wavelength of the light incident on the lens of the camera 22, and the vertical axis indicates the relative sensitivity at that wavelength.

As the light emitting device 23, the applicant used eight LEDs 23a and eight LEDs 23b arranged alternately in a checkerboard pattern, as shown in FIG. 14.

As the LED 23a, the applicant actually used an LED that emits light with a wavelength of 870 nm, and as the LED 23b, an LED that emits light with a wavelength of 950 nm.

As the LEDs 23a and 23b, LEDs with a DC forward current (absolute maximum rating) of 100 mA and a forward voltage of 1.6 V were used.

Using the camera 22 with the performance described above and the LEDs 23a and 23b arranged as shown in FIG. 14, the applicant actually carried out the shape extraction process and the FFT threshold determination process and confirmed the notable effects described above.

In the present embodiment, the mask image generation unit 45 generates the mask image from the first captured image supplied from the camera 22, based on the mask threshold supplied from the threshold determination unit 44. However, the method of generating the mask image is not limited to this.

That is, for example, the mask image generation unit 45 can perform stereo processing, which generates a distance image representing the distance from the cameras to the user based on captured images taken by a plurality of cameras imaging from different directions, and adopt the resulting distance image as the mask image.

In this case, the shape extraction unit 46 extracts, as the shape region 141 representing the shape of the user's hand, the portion where the region of the distance image supplied from the mask image generation unit 45 that represents the distance from the camera to the hand overlaps the face region 101 and the hand region 102 in the binarized skin image supplied from the binarization unit 42.

As a method of generating a distance image as the mask image, besides stereo processing, a distance image of the user can also be generated using, for example, a laser range finder, which calculates the distance to the user based on the time taken for infrared light irradiated onto the user to be reflected by the user and return.
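The distance-image variant can be sketched as follows. This is an illustration under stated assumptions rather than the method of the source: the distance image is assumed to hold a distance in metres per pixel, the hand is assumed to lie within a known distance band, and the function names, the band limits, and the scene values are hypothetical. The time-of-flight helper uses the standard relation that the distance is half the round-trip path of the light pulse.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance(round_trip_seconds):
    """Distance to the reflecting surface from the round-trip time of a light pulse."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


def shape_from_distance(distance_image, skin, near_m, far_m):
    """Keep skin pixels whose measured distance lies within [near_m, far_m]."""
    return [
        [1 if s and near_m <= d <= far_m else 0 for d, s in zip(d_row, s_row)]
        for d_row, s_row in zip(distance_image, skin)
    ]


# Hypothetical scene: hand at about 0.5 m, face at about 0.9 m, background at 3 m.
distance_image = [
    [3.0, 0.9, 0.9, 3.0],
    [3.0, 0.5, 0.5, 3.0],
]
# Binarized skin image: both the face row and the hand row are skin.
skin = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
shape = shape_from_distance(distance_image, skin, near_m=0.4, far_m=0.6)
print(shape)  # [[0, 0, 0, 0], [0, 1, 1, 0]]
```

The face pixels are skin but fall outside the distance band, so only the hand row survives the overlap.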

In the present embodiment, the first wavelength emitted from the LED 23a is set to 870 nm and the second wavelength emitted from the LED 23b is set to 950 nm, but the combination of wavelengths is not limited to this.

That is, any combination of wavelengths may be used as long as the absolute difference between the reflectance at the first wavelength and the reflectance at the second wavelength for the user's skin is sufficiently larger than the absolute difference between the reflectances obtained for subjects other than the user's skin. Specifically, as is clear from FIG. 3, in addition to the combination of 870 nm and 950 nm, combinations such as 800 nm and 950 nm, 870 nm and 1,000 nm, and 800 nm and 1,000 nm are possible.
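The selection criterion for a wavelength pair can be written as a small check. The sketch below is only an illustration: the source does not quantify "sufficiently larger", so the `margin` parameter is an assumption, and every reflectance value is a made-up placeholder rather than data read from FIG. 3.

```python
def pair_is_usable(skin_r1, skin_r2, other_r1, other_r2, margin):
    """Return True if the skin reflectance difference between the two wavelengths
    exceeds the non-skin reflectance difference by at least `margin`.

    skin_r1, skin_r2:   skin reflectance at the first and second wavelength.
    other_r1, other_r2: reflectance of a non-skin subject at the same wavelengths.
    margin:             hypothetical stand-in for "sufficiently larger".
    """
    skin_diff = abs(skin_r1 - skin_r2)
    other_diff = abs(other_r1 - other_r2)
    return skin_diff >= other_diff + margin


# Placeholder reflectances (not measured values).
usable = pair_is_usable(0.60, 0.45, 0.50, 0.49, margin=0.05)
not_usable = pair_is_usable(0.60, 0.45, 0.50, 0.38, margin=0.05)
print(usable, not_usable)  # True False
```

In the second call the non-skin subject changes reflectance almost as much as skin does, so the pair would not separate skin from that subject.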

Note that when visible light is used as the light emitted from the LED 23a, a filter that passes only the visible light emitted from the LED 23a, so that it enters the lens of the camera 22, is used in place of the visible light cut filter 22a. The same applies to the LED 23b.

In the present embodiment, the LEDs 23a and 23b emit light separately in the shape extraction process. However, the first and second captured images can also be acquired while the LEDs 23a and 23b emit light simultaneously.

That is, for example, two cameras having the same functions as the camera 22 are provided close to each other in place of the camera 22. A filter that passes only light of the first wavelength is provided in front of one of the two cameras, and a filter that passes only light of the second wavelength is provided in front of the other.

In this case, even if the LEDs 23a and 23b emit light simultaneously, only light of the first wavelength enters the one camera, so that camera can obtain the first captured image. Likewise, only light of the second wavelength enters the other camera, so the other camera can obtain the second captured image.

In the present embodiment, the number of LEDs 23a and the number of LEDs 23b are each set to two, but the numbers are not limited to this.

In the present embodiment, the hand (its shape) is varied as the object representing a part of the user's body in order to cause the information processing apparatus 21 to execute predetermined processing. However, besides the hand, the user's foot or the like can also be adopted as the object.

The series of processes described above can be executed by dedicated hardware or by software. When the series of processes is executed by software, the program constituting the software is installed from a recording medium onto a so-called embedded computer, or onto a general-purpose personal computer or the like that can execute various functions when various programs are installed.

(Configuration Example of the Computer)

FIG. 15 shows a configuration example of a personal computer that executes the above-described series of processes by means of a program. For example, the units of the information processing apparatus 21 shown in FIG. 2 can be implemented by at least one processor, such as the CPU 201 shown in FIG. 15. In one embodiment, the binarization unit 42, the skin extraction unit 43, the threshold determination unit 44, the mask image generation unit 45, and the shape extraction unit 46 can be implemented by a single processor or by a plurality of different processors.

The CPU (Central Processing Unit) 201 executes various processes in accordance with programs stored in the ROM (Read Only Memory) 202 or the storage unit 208. The RAM (Random Access Memory) 203 stores, as appropriate, the programs executed by the CPU 201, data, and the like. The CPU 201, the ROM 202, and the RAM 203 are connected to one another via the bus 204.

The CPU 201 is also connected to the input/output interface 205 via the bus 204. The input/output interface 205 is connected to the input unit 206, which includes, for example, a keyboard, a mouse, and a microphone, and to the output unit 207, which includes a display, a speaker, and the like. The CPU 201 executes various processes in response to commands input from the input unit 206 and outputs the results of the processes to the output unit 207.

The storage unit 208 connected to the input/output interface 205 consists of, for example, a hard disk, and stores the programs executed by the CPU 201 and various types of data. The communication unit 209 communicates with external devices via a network such as the Internet or a LAN.

The program may also be acquired via the communication unit 209 and stored in the storage unit 208.

The drive 210 connected to the input/output interface 205 drives a removable medium 211, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, when the medium is loaded into the drive, and acquires the programs and data stored on the removable medium 211. The acquired programs and data are transferred to and stored in the storage unit 208 as necessary.

The recording medium that records (stores) the program to be installed on and executed by the computer consists, as shown in FIG. 15, of the removable medium 211, which is a package medium such as a magnetic disk (including a flexible disk), an optical disc (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), a magneto-optical disc (including an MD (Mini-Disc)), or a semiconductor memory; the ROM 202, in which the program is stored temporarily or permanently; or the hard disk constituting the storage unit 208. The program is recorded on the recording medium, as necessary, via the communication unit 209, which is an interface such as a router or a modem, using a wired or wireless communication medium such as a LAN, the Internet, or digital satellite broadcasting.

Note that, in this specification, the steps describing the series of processes above include not only processes performed in time series in the described order, but also processes that are executed in parallel or individually without necessarily being processed in time series.

In this specification, a system refers to an entire apparatus composed of a plurality of devices.

Note that embodiments of the present invention are not limited to the embodiment described above, and various modifications can be made without departing from the gist of the present invention.

This application contains subject matter related to that disclosed in Japanese Priority Patent Application JP 2009-154921, filed with the Japan Patent Office on June 30, 2009, the entire contents of which are incorporated herein by reference.

1: information processing system
21: information processing apparatus
22: camera
23: light emitting device
41: control unit
42: binarization unit
43: skin extraction unit
44: threshold determination unit
45: mask image generation unit
46: shape extraction unit

Claims

An information processing apparatus that extracts, from a captured image obtained by imaging a user, the shape of a subject representing a predetermined skin part of the user's body, the apparatus comprising:
irradiation means for irradiating the user with light of a first wavelength and light of a second wavelength different from the first wavelength;
acquisition means for acquiring a first image obtained by receiving reflected light of the light of the first wavelength irradiated onto the user, and a second image obtained by receiving reflected light of the light of the second wavelength irradiated onto the user;
skin region extraction means for extracting a skin region representing the skin of the user based on the first and second images; and
shape region extraction means for extracting a shape region representing the shape of the subject on the skin region,
wherein the shape region extraction means extracts the shape region based on a distribution of luminance values of pixels constituting a region corresponding to the skin region on at least one of the first and second images, on which the subject and a part of the user corresponding to the area of the skin region excluding the shape region are displayed.

An information processing apparatus that extracts, from a captured image obtained by imaging a user, the shape of a subject representing a predetermined skin part of the user's body, the apparatus comprising:
irradiation means for irradiating the user with light of a first wavelength and light of a second wavelength different from the first wavelength;
acquisition means for acquiring a first image obtained by receiving reflected light of the light of the first wavelength irradiated onto the user, and a second image obtained by receiving reflected light of the light of the second wavelength irradiated onto the user;
skin region extraction means for extracting a skin region representing the skin of the user based on the first and second images; and
shape region extraction means for extracting a shape region representing the shape of the subject on the skin region,
wherein the shape region extraction means extracts the shape region based on a distribution of luminance values of pixels constituting a region corresponding to the skin region on a display image, the display image being an image on which the subject and a part of the user corresponding to the area of the skin region excluding the shape region are displayed, and
the irradiation means is arranged adjacent to the acquisition means.

The information processing apparatus of claim 2, wherein the shape region extraction means
extracts the shape region based on the difference in relative distance from the irradiation means between the subject and the part of the user corresponding to the area of the skin region excluding the shape region, the difference being derived from the distribution of luminance values of the pixels constituting the region corresponding to the skin region on the display image.

The information processing apparatus of claim 3, wherein the shape region extraction means
extracts the shape region based on the difference in relative distance from the irradiation means between the subject and the part of the user corresponding to the area of the skin region excluding the shape region, the difference being derived from a histogram of luminance values of the pixels constituting the region corresponding to the skin region on the display image.