KR101314687B1

KR101314687B1 - Providing device of eye scan path and method for providing eye scan path

Info

Publication number: KR101314687B1
Application number: KR1020110129935A
Authority: KR
Inventors: 이민호; 장영민; 박형민; 김민욱; 정성문; 김범휘
Original assignee: 서강대학교산학협력단; 경북대학교 산학협력단
Priority date: 2011-12-06
Filing date: 2011-12-06
Publication date: 2013-10-07
Also published as: KR20130063429A

Abstract

A gaze path providing apparatus is disclosed. The apparatus includes: an input unit that receives a plurality of images captured at spaced-apart positions and a plurality of sound sources recorded at spaced-apart positions; a position determination unit that analyzes the plurality of sound sources to determine the position of the sound source; a saliency map generation unit that generates a mono saliency map for each of the plurality of images and generates a dynamic saliency map from the generated mono saliency maps; a gaze path generation unit that generates a gaze path for the plurality of images based on the generated dynamic saliency map and the determined sound source position; and an output unit that outputs the generated gaze path.

Description

Apparatus for providing a gaze path and method for providing a gaze path {PROVIDING DEVICE OF EYE SCAN PATH AND METHOD FOR PROVIDING EYE SCAN PATH}

The present invention relates to a gaze path providing apparatus and a gaze path providing method, and more particularly, to a gaze path providing apparatus capable of providing a human-like gaze movement path based on fused audio-visual information.

Sensor technology began as an imitation of the human sense organs. Recently, among sensor technologies, active artificial vision systems have been gaining importance.

However, because many artificial vision systems developed so far focus on detecting and recognizing specific objects in the input image, they offer no solution to the problem at the very first stage of a vision system: how to select the necessary information effectively, as the human visual system does, from an arbitrary natural image of high complexity.

In contrast, the human visual system delivers visual processing performance that is robust to changes in diverse and complex environmental factors. Rather than processing all the information in a complex visual scene at once, the human visual system has an efficient mechanism that processes the necessary information selectively and sequentially. This is made possible by a selective attention function, which lets visual cells with only simple functions process complex image information quickly in real time, and by an active vision function, which can freely move the gaze to a target object.

Accordingly, developing vision system models based on the brain's information processing and the biological information processing mechanisms of the human visual system has recently become important as a way to overcome the limitations of existing artificial vision systems and to process complex real-world visual information efficiently.

Accordingly, an object of the present invention is to provide a gaze path providing apparatus capable of providing a human-like gaze movement path based on fused audio-visual information.

To achieve the above object, a gaze path providing apparatus according to the present invention includes: an input unit that receives a plurality of images captured at spaced-apart positions and a plurality of sound sources recorded at spaced-apart positions; a position determination unit that analyzes the plurality of sound sources to determine the position of the sound source; a saliency map generation unit that generates a mono saliency map for each of the plurality of images and generates a dynamic saliency map from the generated mono saliency maps; a gaze path generation unit that generates a gaze path for the plurality of images based on the generated dynamic saliency map and the determined sound source position; and an output unit that outputs the generated gaze path.

In this case, the saliency map generation unit includes: an image information extraction unit that extracts at least one piece of image information among brightness, edge, symmetry, and complementary colors of the input image; a CSD processing unit that performs center-surround difference (CSD) and normalization processing on the extracted image information to output at least one feature map among a brightness feature map, an orientation feature map, a symmetry feature map, and a color feature map; and an ICA processing unit that performs independent component analysis (ICA) on the output feature map to generate a mono saliency map.

In this case, the saliency map generation unit preferably further includes a merging unit that merges the plurality of mono saliency maps generated by the ICA processing unit to generate the dynamic saliency map.

Meanwhile, the saliency map generation unit preferably generates the plurality of mono saliency maps using a biologically based selective attention model.

Meanwhile, the gaze path generation unit preferably reinforces or suppresses a plurality of salient points included in the generated dynamic saliency map based on the determined sound source position, thereby assigning priorities to the plurality of salient points, and generates the gaze path according to the assigned priorities.

Meanwhile, the input unit preferably receives the plurality of images and the plurality of sound sources at a preset time period.

Meanwhile, a gaze path providing method in a gaze path providing apparatus according to the present embodiment includes: receiving a plurality of images captured at spaced-apart positions and a plurality of sound sources recorded at spaced-apart positions; analyzing the plurality of sound sources to determine the position of the sound source; generating a mono saliency map for each of the plurality of images; generating a dynamic saliency map from the generated mono saliency maps; generating a gaze path for the plurality of images based on the generated dynamic saliency map and the determined sound source position; and outputting the generated gaze path.

In this case, generating the plurality of mono saliency maps includes: extracting at least one piece of image information among brightness, edge, symmetry, and complementary colors for each of the plurality of input images; performing center-surround difference (CSD) and normalization processing on the extracted image information to output at least one feature map among a brightness feature map, an orientation feature map, a symmetry feature map, and a color feature map; and performing independent component analysis (ICA) on the output feature map to generate a mono saliency map.

In this case, generating the dynamic saliency map preferably merges the generated plurality of mono saliency maps to generate the dynamic saliency map.

Meanwhile, generating the plurality of mono saliency maps preferably uses a biologically based selective attention model.

Meanwhile, generating the gaze path preferably reinforces or suppresses a plurality of salient points included in the generated dynamic saliency map based on the determined sound source position, thereby assigning priorities to the plurality of salient points, and generates the gaze path according to the assigned priorities.

Meanwhile, the receiving step preferably receives the plurality of images and the plurality of sound sources at a preset time period.

Accordingly, because the gaze path providing apparatus and the gaze path providing method according to the present embodiment fuse visual and auditory information and consider the dynamic motion in the images together with the position of the sound source, they can select information with high reliability.

FIG. 1 is a block diagram illustrating the configuration of a gaze path providing apparatus according to an embodiment of the present invention;
FIG. 2 is a block diagram illustrating a detailed configuration of the saliency map generation unit of FIG. 1;
FIG. 3 is a view for explaining the operation of the position determination unit of FIG. 1;
FIG. 4 is a view for explaining the operation of the saliency map generation unit of FIG. 1; and
FIG. 5 is a flowchart for explaining the operation of a gaze path providing method according to the present embodiment.

Hereinafter, the present invention will be described in more detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating the configuration of a gaze path providing apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the gaze path providing apparatus 100 according to the present embodiment may include an input unit 110, an output unit 120, a storage unit 130, a position determination unit 140, a saliency map generation unit 150, a gaze path generation unit 160, and a control unit 170. Although the present embodiment is described as an apparatus used simply to identify a user's gaze path, the gaze path providing apparatus according to the present embodiment may also be implemented as a component of a robot vision system, a security system, or a surveillance system.

The input unit 110 receives a plurality of images captured at spaced-apart positions and a plurality of sound sources recorded at spaced-apart positions. Specifically, the input unit 110 may receive a plurality of images captured by an imaging device such as an external digital camera or an image reading device (scanner). The input unit 110 may also receive the plurality of sound sources through a microphone having a plurality of channels.

The output unit 120 may output the generated gaze path. Specifically, the output unit 120 may be implemented as a display device such as a monitor, and may display an image received through the input unit 110 together with the gaze path for that image.

The storage unit 130 stores the input images and the input sound sources. Specifically, the storage unit 130 may store the plurality of images and the plurality of sound sources received by the input unit 110. The storage unit 130 may also temporarily store the saliency maps generated by the saliency map generation unit 150 and the gaze path generated by the gaze path generation unit 160, both described later. The storage unit 130 may be a memory mounted inside the gaze path providing apparatus 100, for example a ROM, a flash memory, or an HDD, or it may be an external HDD or a memory card connected to the gaze path providing apparatus 100, for example a flash memory (M/S, xD, SD, etc.) or a USB memory.

The position determination unit 140 analyzes a plurality of sound sources (for example, stereo sound sources) to determine the position of the sound source. Specifically, the position determination unit 140 may determine where the sound originated by analyzing the phase difference between the input sound sources. Because this operation is a widely known technique, a detailed description is omitted.
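The source leaves the localization method unspecified beyond "phase difference", so the following is only a minimal sketch of one common approach, not the method of the patent. It uses time-domain cross-correlation as a stand-in for phase-difference analysis: the delay between two channels is estimated and converted to an azimuth under a far-field model. The function name `estimate_azimuth`, the microphone spacing, and the speed-of-sound constant are all assumptions introduced for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the source


def estimate_azimuth(left, right, sample_rate, mic_distance):
    """Estimate the source azimuth (radians) from a two-channel recording.

    Positive angles mean the sound reached the left microphone first.
    mic_distance is the (assumed) spacing between the microphones in meters.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Full cross-correlation; the peak index gives the lag of best alignment.
    corr = np.correlate(right, left, mode="full")
    lag = int(np.argmax(corr)) - (len(left) - 1)  # samples by which right lags left
    delay = lag / sample_rate
    # Far-field model: delay = mic_distance * sin(azimuth) / speed of sound.
    sin_theta = np.clip(delay * SPEED_OF_SOUND / mic_distance, -1.0, 1.0)
    return float(np.arcsin(sin_theta))
```

The clipping step guards against delays that exceed what the microphone spacing physically allows, which can happen with noisy correlation peaks.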

The saliency map generation unit 150 generates a mono saliency map for each of the plurality of images and generates a dynamic saliency map from the generated mono saliency maps. The detailed operation and configuration of the saliency map generation unit 150 are described later with reference to FIG. 2.

The gaze path generation unit 160 generates a gaze path for the plurality of images based on the generated dynamic saliency map and the determined sound source position. Specifically, the gaze path generation unit 160 may reinforce or suppress the salient points included in the dynamic saliency map generated by the saliency map generation unit 150, based on the sound source position determined by the position determination unit 140, thereby assigning priorities to the salient points, and may generate the gaze path according to the assigned priorities. The function of the gaze path generation unit 160 may also be integrated into the saliency map generation unit 150.

The gaze path generation unit 160 may perform this operation using a biologically based selective attention model. Here, the biologically based selective attention model models part of the human perceptual structure and its processing, and is divided into a data-driven process, which responds immediately to the input image, and a concept-driven process, which uses learned information. Because data-driven and concept-driven processing are widely known techniques, a detailed description is omitted.

The control unit 170 controls each component of the gaze path providing apparatus 100. Specifically, when a plurality of images and a plurality of sound sources are input through the input unit 110, the control unit 170 may control the saliency map generation unit 150 to generate a dynamic saliency map for the input images, and may control the position determination unit 140 to determine the sound source position from the sound sources. The control unit 170 may then control the gaze path generation unit 160 to generate the user's gaze path based on the generated dynamic saliency map and the sound source position, and may control the output unit 120 to display the generated gaze path.

Accordingly, because the gaze path providing apparatus 100 according to the present embodiment fuses visual and auditory information and considers the dynamic motion in the images together with the position of the sound source, it can generate a gaze path with high reliability in information selection.

Although the present embodiment describes only displaying the generated gaze path through the output unit 120, in an actual implementation the generated gaze path may also be stored in the storage unit 130, printed by a printing device, or transmitted to a specific device.

FIG. 2 is a block diagram illustrating a detailed configuration of the saliency map generation unit of FIG. 1.

Referring to FIG. 2, the saliency map generation unit 150 includes an image information extraction unit 151, a CSD processing unit 152, an ICA processing unit 153, and a merging unit 154.

The image information extraction unit 151 extracts image information on brightness (I), edge (E), and complementary colors (RG, BY) from the input image. Specifically, it may extract at least one piece of image information among brightness, edge, symmetry, and complementary colors based on the R (Red), G (Green), and B (Blue) values of the input image.
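The source does not give formulas for these channels, so the following is a minimal sketch under stated assumptions: intensity is taken as the mean of R, G, and B, the complementary-color channels as simple opponent differences, and edge strength as the gradient magnitude of the intensity. The function name `extract_features` and these particular definitions are illustrative choices, and the symmetry channel is left out.

```python
import numpy as np


def extract_features(rgb):
    """Return I, E, RG, and BY maps for an H x W x 3 RGB image.

    All channel definitions here are assumptions; the source only names them.
    """
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    # Opponent-color (complementary color) channels.
    rg = r - g
    by = b - (r + g) / 2.0
    # Edge strength as the gradient magnitude of the intensity map.
    grad_rows, grad_cols = np.gradient(intensity)
    edge = np.hypot(grad_cols, grad_rows)
    return {"I": intensity, "E": edge, "RG": rg, "BY": by}
```

Each returned map has the same height and width as the input, so the maps can be passed on to a per-channel center-surround stage.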

The CSD processing unit 152 may perform center-surround difference (CSD) and normalization processing on the extracted image information to generate a brightness feature map, an orientation feature map, a symmetry feature map, and a color feature map.
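As a rough illustration of the center-surround idea, the sketch below compares a fine-scale local average (center) with a coarse-scale one (surround) and rescales the absolute difference to the range 0 to 1. The box filter, the two radii, and the min-max normalization are assumptions made for brevity; the source does not specify the filters or the normalization it uses.

```python
import numpy as np


def box_blur(image, radius):
    """Mean filter over a (2*radius+1) square window with edge padding."""
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(image)
    rows, cols = image.shape
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + rows, dx:dx + cols]
    return out / (size * size)


def center_surround(feature, center_radius=1, surround_radius=4):
    """Center-surround difference followed by min-max normalization."""
    feature = np.asarray(feature, dtype=float)
    center = box_blur(feature, center_radius)
    surround = box_blur(feature, surround_radius)
    diff = np.abs(center - surround)
    span = diff.max() - diff.min()
    # A flat input has no contrast, so return an all-zero map in that case.
    return (diff - diff.min()) / span if span > 0 else np.zeros_like(diff)
```

A region that differs from its wider neighborhood produces a high response, while uniform regions produce none.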

The ICA processing unit 153 then performs independent component analysis (ICA) on the output feature maps to generate a mono saliency map (SM: Salient Map).

Using the image information extraction unit 151, the CSD processing unit 152, and the ICA processing unit 153 in this way, a mono saliency map is generated for each image.

The merging unit 154 merges the plurality of mono saliency maps generated by the ICA processing unit 153 to generate a dynamic saliency map. Specifically, the dynamic saliency map may be generated by Equations 1 to 4 below.

Here, Sp(v) is the bottom-up saliency map in which depth information is not considered, and L(sp.v.σ) is the Laplace expression given in Equation 2.

Although a dynamic saliency map such as that shown in FIG. 2 produces a saliency map resembling the human selective attention function, a salient region may be one that is of no interest to a human, or one that deserves more attention. This is because the saliency map is generated using only primitive features such as complementary colors, brightness, edge, and symmetry information. To address this, auditory information may be taken into account to suppress or reinforce each salient region of the dynamic saliency map. This operation may be modeled with a Fuzzy ART neural network.

In FIG. 2, the saliency map generation unit 150 is illustrated and described as including two image information extraction units 151, two CSD processing units 152, and two ICA processing units 153. When three or more images are input, however, it may be implemented with as many image information extraction units 151, CSD processing units 152, and ICA processing units 153 as there are input images. In addition, although the present embodiment describes only generating the mono saliency maps for the plurality of images in parallel, a single image information extraction unit, CSD processing unit, and ICA processing unit may instead be used to generate the mono saliency map for each of the plurality of images.

FIG. 3 is a view for explaining the operation of the position determination unit of FIG. 1.

Referring to FIG. 3, when two sound sources (for example, stereo sound sources) are input to the input unit, the position determination unit 140 analyzes the spectrum of each sound source and, from the analyzed spectra, can estimate the position where the sound originated.

FIG. 4 is a view for explaining the operation of the saliency map generation unit of FIG. 1.

Referring to FIG. 4, a plurality of mono saliency maps 410 generated by the ICA processing unit 153, a dynamic saliency map 420 generated by the merging unit 154, and a final saliency map 430 are shown.

The plurality of mono saliency maps 410 are the saliency maps corresponding to each of the plurality of images input through the input unit 110.

The dynamic saliency map 420 is the saliency map generated by merging the plurality of mono saliency maps output from the ICA processing unit 153.

The final saliency map 430 is the saliency map generated by reinforcing and suppressing the salient regions of the dynamic saliency map 420 according to the sound source position determined by the position determination unit 140.

FIG. 5 is a flowchart for explaining the operation of a gaze path providing method according to the present embodiment.

A plurality of images captured at spaced-apart positions and a plurality of sound sources recorded at spaced-apart positions are received (S510). Specifically, a plurality of images captured by an imaging device such as an external digital camera or an image reading device (scanner) may be received, and a plurality of sound sources may be received through a microphone having a plurality of channels.

The plurality of sound sources are then analyzed to determine the position of the sound source (S520). Specifically, the position where the sound originated may be determined by analyzing the phase difference between the input sound sources.

A mono saliency map is then generated for each of the plurality of images (S530). Specifically, at least one piece of image information among brightness, edge, symmetry, and complementary colors is extracted for each input image; center-surround difference (CSD) and normalization processing are performed on the extracted image information to output at least one feature map among a brightness feature map, an orientation feature map, a symmetry feature map, and a color feature map; and independent component analysis (ICA) is performed on the output feature map to generate the mono saliency map.

A dynamic saliency map is generated from the generated plurality of mono saliency maps (S540). Specifically, the dynamic saliency map may be generated by merging the generated mono saliency maps.
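The merge rule used by the patent is given in its own equations, which this sketch does not attempt to reproduce. Purely as a placeholder that shows where a merge step fits, the example below assumes a pixelwise average of same-sized mono saliency maps, rescaled so that the strongest point equals 1. The function name `merge_saliency_maps` is hypothetical.

```python
import numpy as np


def merge_saliency_maps(maps):
    """Merge same-sized mono saliency maps by pixelwise averaging.

    Averaging is an assumed stand-in, not the merge rule of the source.
    """
    stack = np.stack([np.asarray(m, dtype=float) for m in maps])
    merged = stack.mean(axis=0)
    peak = merged.max()
    # Rescale so the most salient pixel is 1; leave an all-zero map untouched.
    return merged / peak if peak > 0 else merged
```

Under this assumption, a location that is salient in several input maps ends up stronger in the merged map than one that is salient in only one of them.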

A gaze path for the plurality of images is generated based on the generated dynamic saliency map and the determined sound source position (S550). Specifically, the salient points included in the generated dynamic saliency map may be reinforced or suppressed based on the determined sound source position, thereby assigning priorities to the salient points, and the gaze path may be generated according to the assigned priorities.
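The prioritization described above can be sketched as follows. This is an illustrative sketch under several assumptions rather than the procedure of the patent: salient points are picked as successive maxima with a small inhibition window, the sound source is assumed to be already mapped to an image column (`sound_column`), and a Gaussian of the horizontal distance to that column scales each point's saliency between `1 - gain` (suppressed) and `1 + gain` (reinforced). All names and parameter values are hypothetical.

```python
import numpy as np


def generate_gaze_path(saliency, sound_column, n_points=3,
                       inhibit_radius=2, sigma=3.0, gain=0.5):
    """Rank salient points of a 2-D map, favoring those near the sound source.

    Returns a list of (row, col) points in visiting order.
    """
    saliency = np.asarray(saliency, dtype=float)
    work = saliency.copy()
    points = []
    for _ in range(n_points):
        if not np.isfinite(work.max()):
            break  # every pixel has already been inhibited
        y, x = np.unravel_index(np.argmax(work), work.shape)
        points.append((int(y), int(x)))
        # Inhibit the neighborhood so the next pick is a different region.
        work[max(0, y - inhibit_radius):y + inhibit_radius + 1,
             max(0, x - inhibit_radius):x + inhibit_radius + 1] = -np.inf

    def priority(point):
        y, x = point
        closeness = np.exp(-((x - sound_column) ** 2) / (2.0 * sigma ** 2))
        # Weight runs from 1 - gain (far, suppressed) to 1 + gain (near, reinforced).
        weight = 1.0 + gain * (2.0 * closeness - 1.0)
        return saliency[y, x] * weight

    return sorted(points, key=priority, reverse=True)
```

With this weighting, a slightly weaker salient point next to the sound source can be visited before a stronger point far from it, which is the reordering effect the reinforcement and suppression step is meant to produce.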

The generated gaze path is output (S560). Specifically, the generated gaze path may be output through a display device such as a monitor. At this time, the plurality of input images and the gaze path for those images may be displayed together.

Accordingly, because the gaze path providing method according to the present embodiment fuses visual and auditory information and considers the dynamic motion in the images together with the position of the sound source, it can generate a gaze path with high reliability in information selection. The gaze path providing method of FIG. 5 may be executed on a gaze path providing apparatus having the configuration of FIG. 1, and may also be executed on a gaze path providing apparatus having a different configuration.

Although preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Anyone of ordinary skill in the art to which the invention pertains may make various modifications without departing from the gist of the invention as claimed in the claims, and such modifications fall within the scope of the claims.

100: gaze path providing apparatus 110: input unit
120: output unit 130: storage unit
140: position determination unit 150: saliency map generation unit
160: gaze path generation unit 170: control unit

Claims

1. A gaze path providing apparatus comprising:
an input unit configured to receive a plurality of images photographed at positions spaced at predetermined intervals and a plurality of sound sources heard at positions spaced at predetermined intervals;
a position determination unit configured to determine the position of the sound source by analyzing the plurality of sound sources;
a saliency map generation unit configured to generate a plurality of mono saliency maps, one for each of the plurality of images, and to generate a dynamic saliency map using the generated plurality of mono saliency maps;
a gaze path generation unit configured to generate a gaze path for the plurality of images based on the generated dynamic saliency map and the determined sound source position; and
an output unit configured to output the generated gaze path.

2. The apparatus of claim 1,
wherein the saliency map generation unit comprises:
an image information extraction unit configured to extract at least one piece of image information among brightness, edge, symmetry, and complementary colors of the input image;
a CSD processing unit configured to perform center-surround difference (CSD) and normalization processing on the extracted image information to output at least one feature map among a brightness feature map, an orientation feature map, a symmetry feature map, and a color feature map; and
an ICA processing unit configured to generate a mono saliency map by performing independent component analysis on the output feature map.

3. The apparatus of claim 2,
wherein the saliency map generation unit further comprises
a merging unit configured to generate the dynamic saliency map by merging the plurality of mono saliency maps generated by the ICA processing unit.

4. The apparatus of claim 1,
wherein the saliency map generation unit
generates the plurality of mono saliency maps using a biologically based selective attention model.

5. The apparatus of claim 1,
wherein the gaze path generation unit
reinforces or suppresses a plurality of salient points included in the generated dynamic saliency map based on the determined sound source position, thereby assigning priorities to the plurality of salient points, and generates the gaze path according to the assigned priorities.

6. (Deleted)

7. A gaze path providing method in a gaze path providing apparatus, the method comprising:
receiving a plurality of images photographed at positions spaced at predetermined time intervals and a plurality of sound sources heard at positions spaced at predetermined time intervals;
analyzing the plurality of sound sources to determine the position of the sound source;
generating a plurality of mono saliency maps, one for each of the plurality of images;
generating a dynamic saliency map using the generated plurality of mono saliency maps;
generating a gaze path for the plurality of images based on the generated dynamic saliency map and the determined sound source position; and
outputting the generated gaze path.

8. The method of claim 7,
wherein generating the plurality of mono saliency maps comprises:
extracting at least one piece of image information among brightness, edge, symmetry, and complementary colors for each of the plurality of input images;
performing center-surround difference (CSD) and normalization processing on the extracted image information to output at least one feature map among a brightness feature map, an orientation feature map, a symmetry feature map, and a color feature map; and
generating a mono saliency map by performing independent component analysis on the output feature map.

9. The method of claim 8,
wherein generating the dynamic saliency map comprises
merging the generated plurality of mono saliency maps to generate the dynamic saliency map.

10. The method of claim 7,
wherein generating the plurality of mono saliency maps comprises
generating the plurality of mono saliency maps using a biologically based selective attention model.

11. The method of claim 7,
wherein generating the gaze path comprises
reinforcing or suppressing a plurality of salient points included in the generated dynamic saliency map based on the determined sound source position, thereby assigning priorities to the plurality of salient points, and generating the gaze path according to the assigned priorities.

12. (Deleted)