KR20110037183A

KR20110037183A - Power controllable computer system combining neuro-fuzzy system and parallel processor, method and apparatus for recognizing objects using the computer system in images

Info

Publication number: KR20110037183A
Application number: KR1020090094495A
Authority: KR
Inventors: 유회준; 김주영; 박준영
Original assignee: 한국과학기술원
Priority date: 2009-10-06
Filing date: 2009-10-06
Publication date: 2011-04-13

Abstract

PURPOSE: A power-controllable computer system combining a neuro-fuzzy system and a parallel processor is provided so that, by applying neural network and fuzzy techniques, only a limited portion of the input data is processed in parallel, and only on the processors that need it. CONSTITUTION: A neuro-fuzzy system (110) includes at least two of a neural network block (111), a fuzzy logic block (112), and a neuro-fuzzy block (113). A parallel processor (120) includes a plurality of processing units (121). A network on chip (130) is connected between the neuro-fuzzy system and the parallel processor and carries data communication among the neuro-fuzzy system, the parallel processor, and a power supply unit (160).

Description

POWER CONTROLLABLE COMPUTER SYSTEM COMBINING NEURO-FUZZY SYSTEM AND PARALLEL PROCESSOR, METHOD AND APPARATUS FOR RECOGNIZING OBJECTS USING THE COMPUTER SYSTEM IN IMAGES

The present invention relates to a parallel processing computer system and to a technique for recognizing objects in images using the same.

Object recognition is a core technology in recent advanced vision applications such as autonomous driving, intelligent robot vision systems, and alarm systems. Given 2-D image data as input, it consists of finding the feature points of an object in the image, generating a feature vector (descriptor vector) for each, and comparing these against an object database, a set of vectors for previously registered objects, to determine the closest object.

To perform such object recognition, how feature points are extracted from the input image and how they are described as vectors becomes important, and this requires complex and varied operations such as filtering and histogram computation over a large amount of image data. The most widely used method today is the scale-invariant feature transform (SIFT), but it requires so much computation that even a 2 GHz high-performance CPU achieves only about 0.5 frames per second, which makes real-time processing difficult.

To overcome this, object recognition can be performed on a parallel processor containing a large number of processing units. Parallel processing improves the frame rate, but power consumption increases in proportion to the number of processing units, which makes the approach difficult to apply to low-power real-time applications such as mobile robots and mobile phones.

An object of the present invention is to apply neural network and fuzzy techniques so that only a limited portion of the input data is processed in parallel, and only on the processors that are needed.

Another object of the present invention is to extract a region of interest (ROI) from an input image and process only the data of the extracted region of interest in parallel on only the processors that are needed, thereby increasing object recognition speed and reducing the power consumption of the object recognition apparatus so that objects can be recognized in real time.

A computer system according to the present invention includes: a neuro-fuzzy system including at least two of a neural network block, a fuzzy logic block, and a neuro-fuzzy block in which a neural network and fuzzy logic are combined; a parallel processor including a plurality of processing units; a power supply that controls the power supplied to the parallel processor according to a control signal from the neuro-fuzzy system; and a network on chip that is connected between the neuro-fuzzy system and the parallel processor and carries data communication among the neuro-fuzzy system, the parallel processor, and the power supply.

Preferably, the computer system further includes a memory for storing data extracted by the neuro-fuzzy system and intermediate data from the parallel processor's computations.

Preferably, the neuro-fuzzy system further includes a task scheduler, and the task scheduler generates a schedule for distributing the output data of the neuro-fuzzy system to the parallel processor.

Preferably, the processing units are grouped, a specific number at a time, into independent power domains; the power supply includes a plurality of power regulators; and each power regulator controls its power domain according to the control signal from the neuro-fuzzy system.

An object recognition apparatus according to the present invention includes: a neuro-fuzzy system including a cellular neural network visual attention engine, a fuzzy motion estimator, a neuro-fuzzy classifier, and a task scheduler; a parallel processor including a plurality of processing units; a power supply that controls the power supplied to the parallel processor according to a control signal from the neuro-fuzzy system; an object determination unit that compares feature vectors with the vectors in a database and recognizes the object corresponding to the vector at the closest distance; and a network on chip that carries data communication among the neuro-fuzzy system, the parallel processor, the object determination unit, and the power supply. The fuzzy motion estimator generates dynamic motion vectors between consecutive image frames. The cellular neural network visual attention engine extracts the static features of intensity, color, and orientation and accumulates them together with the dynamic motion vectors to generate a feature map. The neuro-fuzzy classifier extracts seed points based on the feature map and, by judging homogeneity through region growing from each seed point, extracts the region of interest (ROI) of each object in units of tiles. The task scheduler converts the ROI tiles into ROI tile tasks and distributes them to the parallel processor. The parallel processor performs single-instruction-multiple-data (SIMD) parallel operations on the ROI tile tasks to generate the feature points of the object and a feature vector for each feature point, and passes the feature vectors to the object determination unit.
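The seed-point and region-growing step of the neuro-fuzzy classifier can be illustrated with a minimal sketch. It assumes the feature map has already been reduced to one value per tile, that the seed is the tile with the highest value, and that "homogeneity" means a value within a fixed tolerance of the seed; the function names and the tolerance rule are hypothetical and are not taken from the patent.

```python
from collections import deque

def find_seed(feature_map):
    """Return the (row, col) of the tile with the highest feature-map value (assumed seed rule)."""
    cells = ((r, c) for r in range(len(feature_map)) for c in range(len(feature_map[0])))
    return max(cells, key=lambda rc: feature_map[rc[0]][rc[1]])

def grow_region(feature_map, seed, tolerance):
    """Grow a region of tiles from the seed, adding 4-connected neighbours
    whose value is within `tolerance` of the seed value (assumed homogeneity test)."""
    rows, cols = len(feature_map), len(feature_map[0])
    seed_value = feature_map[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(feature_map[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region
```

The returned set of tile indices plays the role of one object's ROI expressed in tile units.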

Preferably, the ROI tile task includes a start address, which is the address of the data at the top-left point of the ROI tile extracted from the full image data; start coordinates, consisting of the X and Y coordinates that give the two-dimensional position of the top-left point of the ROI tile within the full image; and a tile size, which describes the width and height of the ROI tile. The task scheduler distributes the converted ROI tile tasks to the plurality of processing units of the parallel processor and manages them.

Preferably, the neuro-fuzzy system predicts in advance the amount of computation the parallel processor will have to perform, based on the number of ROI tiles extracted by the neuro-fuzzy classifier.

Preferably, the processing units are grouped, a specific number at a time, into independent power domains; the power supply includes a plurality of power regulators; and each power regulator controls its power domain according to the control signal from the neuro-fuzzy system.

A method for predicting the amount of computation to be performed by the parallel processor in the object recognition apparatus according to the present invention includes: a first step of measuring the number of ROI tiles extracted by the neuro-fuzzy classifier; a second step of comparing the measured number of tiles with a distribution reference number to select a specific number of power domains; a third step in which the power supply supplies power only to the selected power domains; and a fourth step of updating the distribution reference number of the second step according to the computation results of the parallel processor in the powered power domains.
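The four steps can be read as a small feedback loop. The sketch below is one possible interpretation, not the patented implementation: it assumes four power domains, treats the distribution reference number as "tiles one domain is expected to handle per frame", and updates it from a measured processing time against a frame-time budget. The class name, the default values, and the averaging rule are all hypothetical.

```python
import math

class WorkloadPredictor:
    def __init__(self, num_domains=4, tiles_per_domain=8.0):
        self.num_domains = num_domains            # assumed number of power domains
        self.tiles_per_domain = tiles_per_domain  # the "distribution reference number"

    def select_domains(self, roi_tile_count):
        """Steps 1-2: compare the tile count with the reference and pick a domain count."""
        if roi_tile_count == 0:
            return 0
        needed = math.ceil(roi_tile_count / self.tiles_per_domain)
        return min(self.num_domains, needed)

    def update(self, roi_tile_count, domains_on, elapsed_ms, budget_ms):
        """Step 4: move the reference toward the per-domain tile count that would
        have just met the time budget at the observed speed (assumed update rule)."""
        if domains_on == 0 or elapsed_ms <= 0:
            return
        achievable = roi_tile_count / domains_on * (budget_ms / elapsed_ms)
        self.tiles_per_domain = 0.5 * self.tiles_per_domain + 0.5 * achievable
```

Step 3, powering only the selected domains, would be carried out by the power regulators; here it is represented only by the count that `select_domains` returns.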

A method for determining the number of independent power domains, each containing one or more processing units, in the object recognition apparatus according to the present invention includes: a first step of plotting a graph with the number of power domains on the X axis and the power reduction achieved for that number of power domains on the Y axis; a second step of plotting a graph with the number of power domains on the X axis and the additional area cost of the power regulators required for that number of power domains on the Y axis; and a third step of selecting the number of power domains at the crossover point where the slope of the first-step graph is decreasing and the slope of the second-step graph is increasing.
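As a rough illustration of the third step, the crossover can be located numerically once both curves are sampled at each candidate domain count. The sketch assumes the two curves have been normalised to a common scale so that they can be compared directly, which the text does not specify; the function name is hypothetical.

```python
def crossover_domain_count(power_saving, area_cost):
    """Return the smallest domain count at which the area-cost curve meets or
    exceeds the power-saving curve. Index i of each list holds the value for
    i + 1 power domains; both lists are assumed to share one normalised scale."""
    for count, (saving, cost) in enumerate(zip(power_saving, area_cost), start=1):
        if cost >= saving:
            return count
    # No crossover within the sampled range: fall back to the largest count sampled.
    return len(power_saving)
```

With a saving curve that flattens and a cost curve that steepens, the returned count marks the point beyond which an extra regulator costs more than it saves.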

An object recognition method using the object recognition apparatus according to the present invention includes: a first step in which the neuro-fuzzy system extracts the region of interest of an object in units of tiles; a second step of converting each extracted ROI tile into an ROI tile task that includes the start address of the tile's data within the full image data, the two-dimensional coordinates of the tile's start position within the full image, and the size of the tile; a third step in which the neuro-fuzzy system predicts the amount of computation the parallel processor will have to perform and sends a control signal to the power supply; a fourth step in which each power regulator controls its power domain according to the control signal; a fifth step in which the converted ROI tile tasks are distributed over the network on chip to the parallel processor composed of a plurality of processing units; a sixth step in which each processing unit of the parallel processor generates the feature points and feature vectors of the object for the ROI tile tasks assigned to it; and a seventh step in which the object determination unit compares the feature vectors with the vectors in the database and recognizes the object corresponding to the vector at the closest distance.

Preferably, in the first step, the neuro-fuzzy system generates dynamic motion vectors and extracts static features, accumulates the dynamic motion vectors and static features together to classify the region of interest, and represents the region of interest using basic tiles of a fixed size.

According to the present invention, neural network and fuzzy techniques can be applied so that only a limited portion of the input data is processed in parallel, and only on the processors that are needed.

In addition, according to the present invention, a region of interest is extracted from the input image and only the data of the extracted region of interest is processed in parallel on only the processors that are needed, which increases object recognition speed and reduces the power consumption of the object recognition apparatus so that objects can be recognized in real time.

Computer system

FIG. 1 is a block diagram of a computer system (100) according to an embodiment of the present invention. The computer system (100) includes a neuro-fuzzy system (110), a parallel processor (120), a network on chip (130), a memory (140), and a power supply (160).

The neuro-fuzzy system (110) is a system that includes at least two of a neural network block (111), a fuzzy logic block (112), and a neuro-fuzzy block (113) in which a neural network and fuzzy logic are combined.

The neuro-fuzzy system (110) implements both human learning ability (neural network) and the approximate reasoning process (fuzzy logic), so the functions of a neural network and of fuzzy logic must be combined. In the first case, the system may consist of neuro-fuzzy blocks (113) only, but it is not limited to this case. In the second case, the same functions can be implemented by combining a neural network block (111) with a fuzzy logic block (112); in the third case, by combining a neuro-fuzzy block (113) with a neural network block (111); and in the fourth case, by combining a neuro-fuzzy block (113) with a fuzzy logic block (112). In the fifth case, the same functions can be implemented by combining all of a neuro-fuzzy block (113), a neural network block (111), and a fuzzy logic block (112). The neuro-fuzzy system (110) is therefore defined as a system covering all five cases.

The neural network block (111) is used in a wide range of fields such as data mining, network management, machine vision, and signal processing. Conventional computer models excel at arithmetic driven by sequential instructions but lack the ability to generalize and learn from experience. Based on this fact, the neural network block applies an algorithm that models the neural network of the human brain on a computer in order to imitate human learning ability.

Such an algorithm consists of basic units designed to model the brain's neural activity. Each unit combines its data inputs into a single output value; this combination is called the unit's activation function. The activation function has two parts. The first part, the combination function, merges all inputs into a single value. Each input to the unit has its own weight. The most common combination function is the weighted sum, in which each input is multiplied by its weight and the products are added together. Other combination functions are sometimes useful, including the maximum or minimum of the weighted inputs or their logical AND or OR. The second part of the activation function is the transfer function, so named because it transfers the value from the combination function to the unit's output. Transfer functions include the sigmoid, linear, and hyperbolic tangent functions.
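The two-part activation function described above can be sketched in a few lines. This is a generic illustration of a single unit, not the circuit of the neural network block (111); the names `weighted_sum`, `TRANSFER`, and `unit_output` are chosen here for the example.

```python
import math

def weighted_sum(inputs, weights):
    """Combination function: multiply each input by its weight and add the products."""
    return sum(x * w for x, w in zip(inputs, weights))

# Transfer functions named in the text.
TRANSFER = {
    "sigmoid": lambda v: 1.0 / (1.0 + math.exp(-v)),
    "linear": lambda v: v,
    "tanh": math.tanh,
}

def unit_output(inputs, weights, transfer="sigmoid", combine=weighted_sum):
    """Activation of one unit: combine the inputs, then apply the transfer function."""
    return TRANSFER[transfer](combine(inputs, weights))
```

Passing a different `combine` callable (for example one that returns the maximum weighted input) covers the alternative combination functions mentioned in the text.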

The fuzzy logic block (112) is widely used in many intelligent control applications such as plant control, motor control, and aircraft control. Fuzzy logic departs from the binary logic in which an element either belongs or does not belong to a set, and instead expresses mathematically the degree to which each element belongs to the set by means of a membership function. A block implemented with fuzzy logic, which handles ambiguous values through such membership functions, is called a fuzzy logic block (112). The fuzzy logic block (112) imitates the approximate, non-numerical reasoning that takes place in the human brain. It can be implemented using any of several known membership functions, depending on the purpose of the computer system (100).
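To make the contrast with binary logic concrete, the sketch below shows a triangular membership function, one of the commonly used shapes. The patent does not say which membership functions the fuzzy logic block (112) uses, so the choice of a triangle and the parameter names are assumptions made for illustration.

```python
def triangular(x, left, peak, right):
    """Degree of membership in [0, 1]: 0 outside (left, right), 1 at the peak,
    and linear in between. Assumes left < peak < right."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def crisp(x, left, right):
    """Binary (crisp) membership for comparison: the element is either in or out."""
    return 1.0 if left < x < right else 0.0
```

For a set centred on 5 with support from 0 to 10, the value 2.5 is fully "in" under the crisp rule but only half a member under the triangular rule.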

The neuro-fuzzy block (113) is implemented by combining neural network technology with fuzzy logic technology, joining the learning capability of neural networks with fuzzy logic's handling of ambiguous values in order to imitate the inference process of the human brain.

The parallel processor (120) includes a plurality of processing units (121) and processes in parallel the data delivered over the network on chip (130).

The network on chip (130) is connected between the neuro-fuzzy system (110) and the parallel processor (120). It delivers the data collected or extracted by the neuro-fuzzy system (110) to the parallel processor (120) and delivers the power control signal of the neuro-fuzzy system (110) to the power supply (160). It also enables communication of data and signals among the neuro-fuzzy system (110), the parallel processor (120), the power supply (160), and the memory (140).

The memory (140) stores data extracted by the neuro-fuzzy system (110) and intermediate data from the computations of the parallel processor (120).

The power supply (160) controls the power supplied to the parallel processor (120) according to the control signal that the neuro-fuzzy system (110) sends over the network on chip (130). That is, it supplies power only to the portions of the parallel processor (120) that are needed for computation and cuts power to the portions that are not. Because the power consumed by the parallel processor (120) can thus be controlled efficiently according to the amount of data to be processed, the computer system (100) can run at low power and in real time even when processing massive amounts of data.

In addition, the computer system (100) applying neuro-fuzzy technology does not process all of the data input for a given purpose. Instead, the neuro-fuzzy technology selects or collects only the data needed to achieve the purpose of the computer system (100), and the computer system (100) processes only that selected or collected data. As a result, processing speed and power consumption improve over a conventional computer system without neuro-fuzzy technology, even when achieving the same purpose. The computer system (100) can also run in real time even when processing massive amounts of data.

The neuro-fuzzy system (110) of the computer system (100) shown in FIG. 1 may further include a task scheduler (not shown), which efficiently distributes the data output by the neuro-fuzzy system to the parallel processor. For example, to achieve high throughput, the task scheduler can efficiently assign multiple data items to the processing units (121) and assign new data to each processing unit (121) as soon as it finishes its current computation.
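The "assign new data as soon as a unit finishes" policy is a form of dynamic list scheduling, and can be simulated with a priority queue. The sketch below is a software model only; the per-task cost values, the tie-breaking by unit number, and the function name are assumptions, and the patent does not describe how the scheduler is realised.

```python
import heapq

def schedule(task_costs, num_units):
    """Simulate handing each task, in order, to whichever processing unit becomes
    free first. Returns (assignment, makespan), where assignment is a list of
    (task_id, unit_id, start_time) tuples."""
    # Heap of (time at which the unit becomes free, unit id).
    free_at = [(0, unit) for unit in range(num_units)]
    heapq.heapify(free_at)
    assignment = []
    for task_id, cost in enumerate(task_costs):
        start, unit = heapq.heappop(free_at)
        assignment.append((task_id, unit, start))
        heapq.heappush(free_at, (start + cost, unit))
    makespan = max(t for t, _ in free_at)
    return assignment, makespan
```

With two units and task costs 3, 1, 2, 2, the second unit finishes its short task early and picks up the third task while the first unit is still busy.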

Object recognition apparatus

Among the various embodiments that use the computer system (100) of FIG. 1, an object recognition apparatus and an object recognition method are described below with reference to the drawings.

Object recognition takes 2-D image data as input, finds the feature points of an object in the image, generates a feature vector (descriptor vector) for each, and compares these against an object database, a set of vectors for previously registered objects, to determine the closest object. To perform such object recognition, feature points must be extracted from the input image and described as vectors, which requires complex and varied operations over a large amount of image data. The most widely used method today is the scale-invariant feature transform (SIFT), but it requires so much computation that even a 2 GHz high-performance CPU achieves only about 0.5 frames per second, which makes real-time processing difficult.

When the computer system (100) proposed in the present invention is applied to an object recognition apparatus, the amount of data the parallel processor (120) has to process is reduced at the source, and the power consumed by the parallel processor can be controlled efficiently according to the amount of data to be processed. Object recognition, which demands a large amount of computation, can therefore be performed at lower power than in the prior art, and the improved processing speed makes real-time object recognition possible.

FIG. 2 is a block diagram of an object recognition apparatus according to an embodiment of the present invention.

The object recognition apparatus includes a neuro-fuzzy system (210), a parallel processor (220), a network on chip (230), a memory (240), an object determination unit (250), and a power supply (260).

The neuro-fuzzy system (210) includes a neural network block, a fuzzy logic block, a neuro-fuzzy block in which a neural network and fuzzy logic are combined, and a task scheduler (217).

The neuro-fuzzy system (210) receives the full image (215), extracts the regions of interest (211, 212, 213) of the objects, converts the extracted region-of-interest images into two-dimensional tile tasks (214), and outputs the converted tile tasks (214). For example, in the 640x480-pixel example image (215), the region of interest (211, 212, 213) of each object is extracted by the neuro-fuzzy system (210) in units of 40x40 basic tiles (216).
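With 40x40 tiles, a 640x480 image divides into a 16 by 12 grid of 192 tiles, so an ROI covering only a few tiles is a small fraction of the full workload. The sketch below lists the basic tiles that cover a region; as a simplification it takes the region as a rectangular pixel box, whereas the apparatus derives the ROI through its neuro-fuzzy classifier. The function name and the box convention are hypothetical.

```python
def tiles_covering(box, tile=40, image_size=(640, 480)):
    """Return the top-left pixel coordinates (x, y) of every basic tile that
    overlaps `box`, given as (x0, y0, x1, y1) with x1 and y1 exclusive."""
    x0, y0, x1, y1 = box
    width, height = image_size
    # Clip the box to the image.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    tiles = []
    for ty in range(y0 // tile, (y1 + tile - 1) // tile):
        for tx in range(x0 // tile, (x1 + tile - 1) // tile):
            tiles.append((tx * tile, ty * tile))
    return tiles
```

A box spanning pixels 50 to 130 horizontally and 50 to 90 vertically touches six tiles, about 3% of the 192 tiles in the full image.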

It also predicts the amount of computation the parallel processor (220) will have to perform and sends a control signal, which causes power to be supplied only to specific processing units (221), to the power supply (260) over the network on chip (230).

The ROI tiles extracted in this way are converted by the task scheduler (217) into 12-byte tile tasks (214), each consisting of the start address, start coordinates, and tile size of an ROI basic tile (216), and are assigned to the individual processing units (221) of the parallel processor (220), which consists of 16 processing units (221).
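One way to fit the start address, start coordinates, and tile size into 12 bytes is a 4-byte address followed by four 2-byte fields. The text gives only the total size, so this field layout, the little-endian byte order, the row-major one-byte-per-pixel address calculation, and all names in the sketch are assumptions.

```python
import struct
from collections import namedtuple

TileTask = namedtuple("TileTask", "start_address x y width height")

# Assumed layout: uint32 address, then uint16 x, y, width, height (12 bytes total).
TILE_TASK_FORMAT = "<IHHHH"

def make_tile_task(x, y, tile=40, image_width=640, bytes_per_pixel=1, base_address=0):
    """Build a task for the tile whose top-left pixel is (x, y), assuming the
    image is stored row by row starting at base_address."""
    start_address = base_address + (y * image_width + x) * bytes_per_pixel
    return TileTask(start_address, x, y, tile, tile)

def pack_tile_task(task):
    """Serialise a task into its 12-byte wire form."""
    return struct.pack(TILE_TASK_FORMAT, *task)

def unpack_tile_task(data):
    """Recover a task from its 12-byte wire form."""
    return TileTask(*struct.unpack(TILE_TASK_FORMAT, data))
```

Because a task carries only an address and geometry rather than pixel data, distributing it to a processing unit costs 12 bytes of network traffic regardless of tile contents.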

The parallel processor (220) includes a plurality of processing units (221), for example 16, and generates the feature points and feature vectors of the objects by performing SIMD parallel operations on the tile tasks (214) of the extracted ROI tiles (211, 212, 213).

The network on chip (230) is connected between the neuro-fuzzy system (210) and the parallel processor (220) and transmits the ROI tile tasks (214) or control signals output by the neuro-fuzzy system (210) to the parallel processor (220). It also carries data communication among the neuro-fuzzy system (210), the parallel processor (220), the memory (240), the object determination unit (250), and the power supply (260).

The memory (240) stores the full input image (215) as well as data of the neuro-fuzzy system (210), the object determination unit (250), and the processing units (221).

Based on the feature points and feature vectors generated by the parallel processor (220), the object determination unit (250) determines the type of the object by finding the closest vector in a database built in advance for the target objects and casting a vote for it. When there are several objects, it repeats the matching process to determine the type of each object.
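The nearest-vector search followed by voting can be sketched as below. The sketch assumes Euclidean distance, a database held as a flat list of (label, vector) pairs, and a simple majority vote; the text does not specify the distance metric or the voting rule, and the function names are hypothetical.

```python
import math
from collections import Counter

def nearest_label(vector, database):
    """Return the label of the database vector closest to `vector`
    (Euclidean distance, an assumed metric)."""
    return min(database, key=lambda entry: math.dist(vector, entry[1]))[0]

def decide_object(feature_vectors, database):
    """Each feature vector votes for the object owning its nearest database
    vector; the object with the most votes wins. Requires at least one vector."""
    votes = Counter(nearest_label(v, database) for v in feature_vectors)
    return votes.most_common(1)[0][0]
```

Voting makes the decision tolerant of a few mismatched feature vectors, since a single stray match cannot outweigh the majority.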

According to an embodiment of the present invention, the parallel processor (220) performs SIMD operations not on the entire input image (215) but only on the tile tasks (214) converted from the tiles of the regions of interest (211, 212, 213) extracted by the neuro-fuzzy system (210). In addition, the power supply (260) supplies power to only some of the processing units (221) of the parallel processor (220), according to the number of ROI tiles output by the neuro-fuzzy system. The proposed object recognition apparatus can therefore recognize objects at lower power than the prior art and, with its improved processing speed, in real time.

FIG. 3A shows the configuration that the neuro-fuzzy system 210 of the object recognition apparatus of FIG. 2 needs in order to extract the regions of interest 211, 212, and 213 of objects.

The neuro-fuzzy system 210 includes a fuzzy motion estimator 310, a cellular neural network visual attention engine 320, a neuro-fuzzy classifier 330, a task scheduler 340, and a controller 350.

The fuzzy motion estimator 310 of FIG. 3A uses fuzzy logic to obtain motion vectors for two temporally consecutive image frames. The motion vector of each point is found by solving the correspondence problem, that is, determining which point in one frame corresponds to which point in the other. A full search over the whole image area would require too much computation, so fuzzy inference is used to narrow the search area. In detail, the pixels to be compared are first extracted one at a time from the input image frames, their differences are accumulated and stored in a memory element, and the accumulated value is fuzzified through a membership function. The similarity of the two image pixels is then determined from the distribution of the fuzzified values, which in turn determines the magnitude of the motion vector.

The cellular neural network visual attention engine 320 of FIG. 3A implements in hardware, using a neural-network-based visual attention algorithm, the visual attention phenomenon that occurs in the visual cortex of the human brain. Using a two-dimensionally connected neural network, it efficiently extracts static features such as intensity, color, and orientation from the entire input image, and it accumulates the motion vectors generated by the fuzzy motion estimator 310 together with the extracted intensity, color, and orientation features to produce a saliency map.

FIG. 3B shows the cellular neural network visual attention engine 320. It is implemented with a cellular neural network 322 whose cells 321 are arranged in a 2-D array. Each cell 321 of the array is mapped to one pixel of the input image and extracts the intensity, color, and orientation features, which makes feature extraction from the input image more efficient. The cellular neural network 322 uses this one-to-one relationship between the array of cells 321 and the pixels of the input image to perform operations such as image filtering, which shortens the processing time of such operations and reduces the power consumption of the visual attention engine 320. Each cell 321 consists of a memory element 323 that stores image information, a shift register 324 for exchanging data with neighboring cells, and an arithmetic unit 325 that operates on the cell data. Under the control of the controller 350 of FIG. 3A, each cell 321 communicates with its neighboring cells 321 to perform its own image processing.

Returning to FIG. 3A, the neuro-fuzzy classifier 330 combines a neural network with fuzzy logic and, based on the saliency map obtained by the cellular neural network visual attention engine 320, extracts the region of interest 211, 212, 213 of each object from the input image 215 shown in FIG. 2.

FIG. 3C is a block diagram of the neuro-fuzzy classifier 330. The neuro-fuzzy classifier 330 uses homogeneity criteria to decide whether a given pixel belongs to the region of interest 211, 212, 213 of a given object. It extracts seed points, the most salient locations, from the saliency map generated by the cellular neural network visual attention engine 320 and then, starting from each seed point, classifies the surrounding pixels and gradually expands the region to derive the region of interest of each object. In this process, fuzzy logic is used to measure the similarity between the object's region of interest and the target pixel under the homogeneity test, so that the measure resembles human judgments of similarity. The quantities considered include intensity, saliency, and distance from the seed point. For each of these variables, a similarity (μ1 to μn) is measured by the Gaussian fuzzy logic 331 as in Equation (1).

… (1)

The measured similarities (μ1 to μn) are multiplied by weights (ω1 to ωn) in the neural network 332 as in Equation (2), and the products are added together in the summation unit 334. The homogeneity determination unit 335 then compares the sum with a threshold value b, as in Equation (3), to decide finally whether the pixel is a homogeneous pixel belonging to the region.

… (2)

… (3)
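
Because Equations (1) to (3) are not reproduced in this text, the following sketch only illustrates the kind of computation the surrounding description suggests: a Gaussian membership per variable, a weighted sum, a threshold test, and seed-based region growing. The exact Gaussian form, the width parameters (sigmas), the use of ">=" in the threshold test, 4-neighbour growth, and all function names are assumptions rather than the patent's formulas.

```python
import math


def gaussian_similarity(target, seed, sigma):
    # Assumed membership: 1.0 when target equals seed, falling off with distance.
    return math.exp(-((target - seed) ** 2) / (2.0 * sigma ** 2))


def is_homogeneous(target, seed, sigmas, weights, threshold):
    # Per-variable similarities mu_1..mu_n (e.g. intensity, saliency).
    mus = [gaussian_similarity(t, s, sg) for t, s, sg in zip(target, seed, sigmas)]
    # Weighted sum compared with the threshold b (comparison direction assumed).
    return sum(w * mu for w, mu in zip(weights, mus)) >= threshold


def grow_region(features, seed_rc, sigmas, weights, threshold):
    """Grow a region from seed_rc over a 2-D grid of per-pixel feature tuples."""
    rows, cols = len(features), len(features[0])
    seed = features[seed_rc[0]][seed_rc[1]]
    region, frontier = {seed_rc}, [seed_rc]
    while frontier:
        r, c = frontier.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if is_homogeneous(features[nr][nc], seed, sigmas, weights, threshold):
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region
```

The distance-from-seed variable mentioned in the description is omitted here for brevity; it could be appended to each feature tuple as one more component.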

After the decision, the weights (ω1 to ωn) of the connections are changed by learning, as in Equation (4), through the weight update path 333 included in the neural network 332.

… (4)

FIG. 3D shows a circuit 360 that implements, with CMOS devices, a Gaussian function, a representative example of the fuzzy membership functions used for similarity measurement, together with the resulting waveform 370. As shown in (b) of FIG. 3D, the output (Iout; 362) of the circuit 360 follows the Gaussian function 370. Changing the value of the circuit input Xseed 361 shifts the center point (seed, 371) of the waveform 370. When a target value 373 is applied to the other input, Xtarget 363, the Gaussian function value 372 corresponding to that input value 373 appears at the output terminal (Iout; 362).

An accurate digital implementation of the Gaussian function 370 requires the execution of 1,000 or more instructions, whereas the analog circuit 360 can realize the Gaussian function 370 with only a small number of MOSFET devices (M1 to M11) and an execution time of 100 ns or less.

Returning to FIG. 3A, the task scheduler 217 converts the ROI tiles 211, 212, and 213 extracted by the fuzzy motion estimator 310, the cellular neural network visual attention engine 320, and the neuro-fuzzy classifier 330 into ROI tile tasks 214 and assigns the ROI tile tasks 214 to the plurality of processing units 221 in parallel, so that the parallel processor 220 processes several ROI tiles 211, 212, and 213 at the same time. In assigning the tile tasks 214 to the processing units 221, the task scheduler 340 builds a task scheduling table that tracks which processing unit 221 is handling which ROI tile task 214 and which processing units 221 are available. For example, to achieve a high processing speed, the task scheduler 340 can assign the ROI tile tasks 214 to as many processing units 221 as possible and hand a new ROI tile task 214 to each processing unit 221 as soon as it finishes its current one. This is only one scheduling method, however, and the task scheduler can support various others. The size of an assigned region of interest is also not fixed; the tile size may be set to any multiple of the basic tile size.
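
The example scheduling policy described above (fill every free processing unit, then hand out a new task whenever a unit finishes) can be sketched as a small table-driven dispatcher. The class and method names are hypothetical, and the patent does not prescribe this data structure; it is one possible reading of the "task scheduling table".

```python
from collections import deque


class TaskScheduler:
    """Greedy dispatcher: the table maps each processing unit to its current task."""

    def __init__(self, num_units):
        # None marks a processing unit that is available.
        self.table = {pe: None for pe in range(1, num_units + 1)}
        self.pending = deque()

    def submit(self, task):
        self.pending.append(task)
        self._dispatch()

    def complete(self, pe):
        # Called when processing unit `pe` finishes its current task.
        self.table[pe] = None
        self._dispatch()

    def _dispatch(self):
        # Assign waiting tasks to free units, lowest unit number first.
        for pe, running in self.table.items():
            if running is None and self.pending:
                self.table[pe] = self.pending.popleft()
```

Other policies mentioned as possible in the text, such as limiting the number of active units to save power, would replace only the _dispatch step.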

The controller 350 controls the operation of the fuzzy motion estimator 310, the cellular neural network visual attention engine 320, the neuro-fuzzy classifier 330, and the task scheduler 217.

FIG. 4 shows the format of the 12-byte ROI tile task 214. The tile task 214 consists of a 32-bit start address 411, a 32-bit start coordinate 412, and a 32-bit tile size 413.

The 32-bit start address 411 is the address, within the data of the entire image 215 stored in memory, of the data at the top-left point 415 of the ROI tile 211, 212, 213. The 32-bit start coordinate 412 is the two-dimensional coordinate (X, Y) of the top-left point 415 of the ROI tile in the image 215, and it is divided into a 16-bit X coordinate and a 16-bit Y coordinate. The tile size 413 describes the size of the tile 414 relative to the start coordinate 412, with the tile width (W) and height (H) each given in 16 bits. With these three pieces of information (411, 412, 413) and the width of the entire image 215, a processing unit 221 of FIG. 2 knows the position and size of the image tile 414 it must process and can download it from the memory 240 and process it. Because the width of the entire image does not change, it can be set in each processing unit 221 when the program starts.
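
The 12-byte layout can be illustrated with Python's struct module. The field widths (32-bit address, 16-bit X, Y, W, H) come from the description; the little-endian byte order, the field order within each 32-bit word, the row-major address computation, and the names image_base and bytes_per_pixel are assumptions made for this sketch.

```python
import struct

# 32-bit start address, then 16-bit X, Y, W, H: 4 + 2 + 2 + 2 + 2 = 12 bytes.
# "<" selects little-endian with no padding (byte order is an assumption).
TILE_TASK = struct.Struct("<IHHHH")


def pack_tile_task(image_base, image_width, x, y, w, h, bytes_per_pixel=1):
    # Assumed row-major image layout: address of the tile's top-left pixel.
    start_address = image_base + (y * image_width + x) * bytes_per_pixel
    return TILE_TASK.pack(start_address, x, y, w, h)


def unpack_tile_task(raw):
    start_address, x, y, w, h = TILE_TASK.unpack(raw)
    return {"start_address": start_address, "x": x, "y": y, "w": w, "h": h}
```

For instance, pack_tile_task(0x1000, 640, 10, 20, 32, 32) yields a 12-byte record; the image width of 640 is passed separately, matching the statement that it is set once in each processing unit at program start.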

FIG. 5A shows the parallel processor 220 of the object recognition apparatus described above with reference to FIG. 2.

The parallel processor 220 includes a plurality of processing units 221, for example 16 (PE 1 to PE 16), and the 16 processing units are grouped four at a time into independent power domains. That is, the first to fourth processing units (PE 1 to PE 4) share the first power domain 222, the fifth to eighth (PE 5 to PE 8) share the second power domain (not shown), the ninth to twelfth (PE 9 to PE 12) share the third power domain (not shown), and the thirteenth to sixteenth (PE 13 to PE 16) share the fourth power domain 223.

FIG. 5B shows the power supply 260 of the object recognition apparatus described above with reference to FIG. 2.

The power supply 260 includes a plurality of power regulators, for example four (261 to 264), and each power regulator is mapped one-to-one to one of the power domains (222, 223, …) shown in FIG. 5A. That is, the first power regulator 261 controls the power of the first power domain 222, the second power regulator 262 controls that of the second power domain (not shown), the third power regulator 263 controls that of the third power domain (not shown), and the fourth power regulator 264 controls that of the fourth power domain 223.
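
The grouping of 16 processing units into four power domains, each served by one regulator, amounts to a simple index mapping. The helper names below are hypothetical; the group size of four and the 1-based numbering follow the example in the description.

```python
UNITS_PER_DOMAIN = 4  # example value from the description (16 PEs, 4 domains)


def power_domain_of(pe):
    """Return the 1-based power domain (and regulator) index for PE number `pe`."""
    return (pe - 1) // UNITS_PER_DOMAIN + 1


def units_in_domain(domain):
    """Return the 1-based PE numbers that share the given power domain."""
    first = (domain - 1) * UNITS_PER_DOMAIN + 1
    return list(range(first, first + UNITS_PER_DOMAIN))
```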

Because the power domains (222, 223, …) are separated in this way, when the parallel processor 220 has only a small amount of computation (few ROI tasks) to process, power is supplied only to the parts of the parallel processor 220 that are needed and cut off from the parts that are not. Specifically, the neuro-fuzzy system 210 predicts the amount of computation that the parallel processor 220 must process and sends the power supply 260 a control signal that causes power to be supplied only to the required power domains (222, 223, …). An embodiment of the present invention can therefore recognize objects at low power and in real time.

Method of predicting the amount of computation to be processed by the parallel processor

FIG. 6 is a flowchart of the method, performed by the neuro-fuzzy system of an object recognition apparatus according to an embodiment of the present invention, of predicting the amount of computation that the parallel processor must process.

First, the neuro-fuzzy system counts the ROI tiles extracted by the neuro-fuzzy classifier (S601).

Next, the measured number of tiles is compared with the distribution reference number to select a specific number of power domains (S602). That is, the number of ROI tiles is classified into N levels: at the lowest level only the first power domain is selected, at the next level the first and second power domains, at the level after that the first, second, and third power domains, and at the highest level all of the first to N-th power domains.

The power supply then supplies power only to the selected power domains (S603).

Finally, the distribution reference number of step S602 is updated according to the computation results of the portion of the parallel processor included in the powered power domains (S604).
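
Step S602 can be sketched as a lookup of the tile count against ascending reference counts. The threshold values, the treatment of a count equal to a threshold, and the function name are assumptions; the update rule of step S604 is not specified in the text and is therefore not modelled here.

```python
import bisect


def select_power_domains(tile_count, thresholds):
    """Return the 1-based power domains to power for a given ROI tile count.

    thresholds: ascending list of N-1 distribution reference counts
                (hypothetical values) separating the N levels.
    The lowest level powers domain 1 only; the highest powers domains 1..N.
    """
    level = bisect.bisect_right(thresholds, tile_count)  # 0 .. N-1
    return list(range(1, level + 2))
```

With four domains and illustrative thresholds [4, 8, 12], three tiles would power only the first domain, while twenty tiles would power all four.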

Method of determining the number of power domains

FIG. 7 illustrates a method of determining the number of power domains in the object recognition apparatus of FIG. 2 according to an embodiment of the present invention.

The fewer processing units 221 each power domain (222, 223, …) contains, that is, the more power domains there are, the greater the power reduction effect 720 of the object recognition apparatus. On the other hand, the number of power regulators (261 to 264, …) required grows with the number of power domains, and so does the area those regulators occupy. The two quantities grow differently, however: the slope of the power reduction effect 720 decreases as the number of power domains increases, whereas the slope of the additional area cost 730 increases. The number of power domains is therefore determined by weighing the additional area cost 730 against the power reduction effect 720.

First, a graph 740 is drawn with the number of power domains 710 on the X axis and the corresponding power reduction effect 720 on the Y axis. Next, a graph 750 is drawn with the number of power domains 710 on the X axis and, on the Y axis, the additional area cost 730 of the power regulators (261 to 264, …) required for that number of power domains. The number of power domains is then taken at the intersection 760, where the slope of the power reduction graph 740 is decreasing and the slope of the additional area cost graph 750 is increasing.

For example, in FIG. 7, when the number of power domains (222, 223, …) increases from two to four, the power reduction effect 720 rises from 56% to 78% and the additional area cost 730 rises from 4.5% to 6.4%; when the number increases from four to eight, the power reduction effect 720 rises from 78% to 88% and the additional area cost 730 rises from 6.4% to 9.2%. For the best efficiency, the number of power domains should therefore be fewer than five. That is, it should be chosen between two and four by comparing the power reduction effect 720 with the additional area cost 730.
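
The trade-off in this example can be made concrete by computing how many percentage points of power reduction each additional percentage point of area buys. The figures are those quoted above for FIG. 7; the ratio itself is only an illustrative metric and is not defined in the patent.

```python
# domains -> (power reduction effect in %, additional area cost in %), from FIG. 7
POINTS = {2: (56.0, 4.5), 4: (78.0, 6.4), 8: (88.0, 9.2)}


def marginal_ratio(a, b):
    """Power-reduction points gained per area point spent when going from a to b domains."""
    gain = POINTS[b][0] - POINTS[a][0]
    cost = POINTS[b][1] - POINTS[a][1]
    return gain / cost


ratio_2_to_4 = marginal_ratio(2, 4)  # 22 points of reduction for 1.9 points of area
ratio_4_to_8 = marginal_ratio(4, 8)  # 10 points of reduction for 2.8 points of area
```

Going from two to four domains yields roughly 11.6 points of power reduction per point of area, whereas going from four to eight yields only about 3.6, which is consistent with stopping at four domains.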

Object recognition method

FIG. 8 is a flowchart of an object recognition method performed by an object recognition apparatus according to an embodiment of the present invention.

First, the neuro-fuzzy system extracts the regions of interest of objects in units of basic tiles (S801).

Next, each extracted ROI tile is converted into an ROI tile task that contains the start address of the tile's data within the entire image data, the two-dimensional coordinates of the tile's start position in the entire image, and the size of the tile (S802).

The neuro-fuzzy system then predicts the amount of computation that the parallel processor must process and sends a control signal to the power supply through the network on chip (S803).

Each power regulator then controls its power domain according to the control signal (S804).

The converted ROI tile tasks are then distributed through the network on chip to the parallel processor, which consists of a plurality of processing units (S805).

Each processing unit of the parallel processor then generates the feature points and feature vectors of the object for the ROI tile task assigned to it (S806).

Finally, the object determiner compares the feature vectors with the vectors in the database and recognizes the object corresponding to the vector at the closest distance (S807).

In step S801, the region of interest 211, 212, 213 of each object can be extracted by the fuzzy motion estimator 310, the cellular neural network visual attention engine 320, and the neuro-fuzzy classifier 330 of the neuro-fuzzy system 210 shown in FIG. 3A.

The computer system 100 of the present invention is not limited to the embodiments described above. It is applicable to any computer system in which a parallel processor 120 is combined, for processing computations, with a neuro-fuzzy system 110 that includes at least two of a neural network block 111, a fuzzy logic block 112, and a neuro-fuzzy block 113 in which a neural network and fuzzy logic are combined.

In the object recognition apparatus according to an embodiment of the present invention, the neuro-fuzzy system 210 extracts the regions of interest 211, 212, and 213 of each object from the entire image 215, so the parallel processor 220 operates only on the data of the regions of interest 211, 212, and 213, a coarse subset of the image, rather than on the entire image 215, which accelerates object recognition. As a result, real-time object recognition at 30 frames per second or more becomes possible for a 640x480 input image 215 at a low power of 500 mW. For example, in tests with COIL-100, a database widely used in object recognition experiments, the image area that the parallel processor 220 had to process was reduced by more than 50% on average, thanks to the neuro-fuzzy ROI extraction, compared with the case in which the neuro-fuzzy system was not applied. Moreover, because feature points are extracted only from the necessary regions of interest 211, 212, and 213, the number of feature points decreases, which in turn reduces the computation needed to build the vectors for object recognition and to match them against the database. This is what makes low-power, real-time object recognition possible.

As described above, those skilled in the art to which the present invention pertains will understand that the invention can be embodied in other specific forms without changing its technical spirit or essential features.

The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of the present invention is indicated by the claims below rather than by the detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

FIG. 1 is a block diagram of a computer system 100 according to an embodiment of the present invention.

FIG. 2 is a block diagram of an object recognition apparatus according to an embodiment of the present invention.

FIG. 3B shows the cellular neural network visual attention engine 320.

FIG. 3C is a block diagram of the neuro-fuzzy classifier 330.

FIG. 3D shows a circuit 360 that implements, with CMOS devices, a Gaussian function, a representative example of the fuzzy membership functions used for similarity measurement, together with the resulting waveform 370.

FIG. 4 shows the format of the 12-byte ROI tile task 214.

FIG. 5A shows the parallel processor 220 of the object recognition apparatus of FIG. 2.

FIG. 5B shows the power supply 260 of the object recognition apparatus of FIG. 2.

********** Description of reference numerals for the main parts of the drawings **********

110, 210: neuro-fuzzy system

120, 220: parallel processor

130, 230: network on chip

140, 240: memory

160, 260: power supply

217: task scheduler

214: ROI tile task

222, 223: power domain

261, 262, 263, 264: power regulator

310: fuzzy motion estimator

320: cellular neural network visual attention engine

330: neuro-fuzzy classifier

Claims

A neuro-fuzzy system including at least two of a neural network block, a fuzzy logic block, and a neuro-fuzzy block in which a neural network and fuzzy logic are combined;

A parallel processor including a plurality of processing units;

A power supply device that controls the power supplied to the parallel processor in accordance with a control signal of the neuro-fuzzy system; and

A network on chip connected between the neuro-fuzzy system and the parallel processor, for data communication among the neuro-fuzzy system, the parallel processor, and the power supply device.

The system of claim 1,

Further comprising a memory for storing data extracted by the neuro-fuzzy system and intermediate data of the computation process of the parallel processor.

The system of claim 1,

Wherein the neuro-fuzzy system further includes a task scheduler, and

The task scheduler generates a schedule for distributing output data of the neuro-fuzzy system to the parallel processor.

The system of claim 2,

Further comprising a task scheduler that generates a schedule for distributing data of the neuro-fuzzy system to the parallel processor.

The system according to any one of claims 1 to 4,

Wherein the plurality of processing units are divided, a specific number at a time, into independent power domains,

The power supply device includes a plurality of power regulators, and

Each of the power regulators controls a respective one of the power domains in accordance with a control signal of the neuro-fuzzy system.

Neuro-fuzzy systems including cellular neural networks visual attention engine, fuzzy motion estimator, neuro fuzzy classifier, and task scheduler;

A parallel processing processor including a plurality of processing units;

A power supply device controlling power supplied to the parallel processor in accordance with a control signal of the neuro-purge system;

An object determination unit which recognizes an object corresponding to the vector having the closest distance by comparing the feature vector with the vectors in the database; And

A network on chip for performing data communication between the neuro-fuzzy system, the parallel processor, the object determination unit, and the power supply device;

The fuzzy motion meter generates a dynamic motion vector between successive image frames,

The cellular neural network visual concentrator extracts intensity, color, and direction which are static features, accumulates together with the dynamic motion vector to generate a feature map,

The neuro-fuzzy classifier extracts a seed point based on the feature map and tiles a region-of-interest (ROI) of each object by determining homogeneity through region expansion based on the seed point. extracted in (tile) units,

The task scheduler converts the region of interest tile into a region of interest tile task, distributes the converted region of interest tile task to the parallel processing processor,

And the parallel processing processor performs a single-instruction-multiple-data (SIMD) parallel operation on the ROI tile task to generate a feature point of the object and a feature vector for the feature point, and transmits the feature vector to the object determination unit; these elements together constitute the object recognition device.

The method of claim 6,

The ROI tile task comprises:

A start address indicating the address, within the entire image data, of the data at the leftmost point of the extracted ROI tile;

A starting coordinate consisting of an X-direction coordinate and a Y-direction coordinate, which are the two-dimensional coordinate values of the leftmost point of the ROI tile in the entire image; and

A tile size describing the width and the height of the ROI tile,

And the task scheduler distributes the converted ROI tile tasks to the plurality of processing units of the parallel processor and manages them.
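
The three fields of the ROI tile task can be pictured with a minimal Python sketch. The claim does not define the memory layout or the distribution policy, so the row-major address formula, the `bytes_per_pixel` parameter, the reading of "leftmost point" as the tile's top-left pixel, and the round-robin assignment are all assumptions made for illustration; the names `RoiTileTask`, `make_tile_task`, and `distribute` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoiTileTask:
    start_address: int  # address of the tile's first pixel in the image data
    x: int              # X-direction coordinate of that pixel in the whole image
    y: int              # Y-direction coordinate of that pixel in the whole image
    width: int          # tile width in pixels
    height: int         # tile height in pixels


def make_tile_task(x, y, width, height, image_width,
                   base_address=0, bytes_per_pixel=1):
    """Build a task for one ROI tile, assuming a row-major image buffer."""
    offset = (y * image_width + x) * bytes_per_pixel
    return RoiTileTask(base_address + offset, x, y, width, height)


def distribute(tasks, num_units):
    """Assign tasks to processing units round-robin (an assumed policy)."""
    queues = [[] for _ in range(num_units)]
    for i, task in enumerate(tasks):
        queues[i % num_units].append(task)
    return queues
```

Carrying both the address and the coordinates lets a processing unit fetch the tile data directly while still knowing where its feature points lie in the full image.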

The method of claim 6,

The neuro-fuzzy system predicts the amount of computation to be processed by the parallel processor based on the number of ROI tiles extracted by the neuro-fuzzy classifier.

The method according to any one of claims 6 to 8,

The power supply device includes a plurality of power regulators, and each power regulator controls a respective power domain according to a control signal of the neuro-fuzzy system.

A method of predicting a computation amount to be processed by the parallel processor in an object recognition apparatus according to claim 9,

A first step of measuring the number of ROI tiles extracted by the neuro-fuzzy classifier;

A second step of selecting a specific number of power domains by comparing the measured number of tiles with a distribution reference number;

A third step of supplying power to the selected power domain only; And

And a fourth step of updating the distribution reference number of the second step according to a calculation result of the parallel processor included in the power domain to which the power is supplied.
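
The four steps above amount to a small feedback loop, which the following Python sketch illustrates. The claim does not say how the tile count is compared with the reference number or how the reference is updated, so the ceiling division, the time budget, and the step-wise adjustment are assumptions; `select_power_domains` and `update_reference` are hypothetical names.

```python
import math


def select_power_domains(tile_count, tiles_per_domain, total_domains):
    """Second step: choose how many power domains to switch on.

    tiles_per_domain plays the role of the distribution reference number
    (assumed here to mean the tiles one domain can handle per frame).
    """
    if tile_count == 0:
        return 0
    return min(total_domains, math.ceil(tile_count / tiles_per_domain))


def update_reference(tiles_per_domain, elapsed_ms, budget_ms, step=1):
    """Fourth step: adjust the reference from the measured processing time.

    Assumed rule: lower the reference when the frame overran its budget
    (so more domains are powered next time), raise it when the frame
    finished in under half the budget.
    """
    if elapsed_ms > budget_ms:
        return max(1, tiles_per_domain - step)
    if elapsed_ms < 0.5 * budget_ms:
        return tiles_per_domain + step
    return tiles_per_domain
```

For instance, with 10 tiles, a reference of 4 tiles per domain, and 4 domains available, the sketch powers 3 domains and leaves the fourth off.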

A method for determining the number of independent power domains including at least one processing unit in an object recognition apparatus according to claim 9,

A first step of displaying a graph in which the number of power domains is X-axis and the power reduction effect according to the number of power domains is Y-axis;

A second step of displaying a graph in which the number of power domains is X-axis and the additional area cost required for the power regulator according to the number of power domains is Y-axis; And

And a third step of determining, as the number of power domains, the intersection at which the slope of the first-step graph decreases and the slope of the second-step graph increases.
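
One way to read this graphical procedure is as finding where a flattening power-saving curve meets a steepening area-cost curve. The Python sketch below assumes both curves are sampled at the same candidate domain counts and normalised to a common scale, which the claim does not state; `choose_domain_count` and the sample values in the usage note are hypothetical.

```python
def choose_domain_count(counts, power_saving, area_cost):
    """Return the candidate domain count closest to the curve crossing.

    counts, power_saving and area_cost are equal-length sequences sampled
    at the same candidate domain counts, on a common normalised scale
    (an assumption). The crossing is taken as the first sample where the
    area cost catches up with the power saving.
    """
    seen_positive = False
    for i, (saving, cost) in enumerate(zip(power_saving, area_cost)):
        gap = saving - cost
        if gap > 0:
            seen_positive = True
        elif seen_positive:
            # The crossing lies between samples i-1 and i; pick the nearer one.
            prev_gap = power_saving[i - 1] - area_cost[i - 1]
            return counts[i - 1] if abs(prev_gap) <= abs(gap) else counts[i]
    return counts[-1]  # the curves never cross within the sampled range
```

With illustrative values `counts = [1, 2, 4, 8, 16]`, `power_saving = [0.0, 0.30, 0.50, 0.60, 0.65]`, and `area_cost = [0.0, 0.05, 0.15, 0.40, 1.0]`, the crossing falls between 8 and 16 domains and the sketch returns 8.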

An object recognition method using the object recognition device according to claim 9, comprising:

A first step of the neuro-fuzzy system extracting a region of interest of an object in units of tiles;

A second step of converting the extracted region of interest tile into a region of interest tile task including a start address of the tile data in all image data, a two-dimensional coordinate value of the start position of the tile in the whole image, and a size of the tile;

A third step of the neuro-fuzzy system estimating a calculation amount to be processed by the parallel processor and transmitting a control signal to the power supply device;

A fourth step of each of the power regulators controlling the respective power domains according to the control signal;

A fifth step of distributing the converted region of interest tile task to the parallel processing processor including a plurality of processing units through a network on chip;

A sixth step of each processing unit of the parallel processor generating a feature point and a feature vector of the object for the ROI tile task distributed to it; and

A seventh step of the object determination unit comparing the feature vector with the vectors in the database and recognizing the object corresponding to the vector having the closest distance.
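
The final matching step is a nearest-neighbour search over the database vectors. The Python sketch below assumes Euclidean distance and a database that maps each object label to a list of registered vectors; neither detail is specified in the claim, and `recognize` is a hypothetical name.

```python
import math


def recognize(feature_vector, database):
    """Return (label, distance) of the database vector nearest to feature_vector.

    database maps an object label to a list of registered vectors.
    Euclidean distance is an assumed metric.
    """
    best_label, best_distance = None, math.inf
    for label, vectors in database.items():
        for vector in vectors:
            distance = math.dist(feature_vector, vector)
            if distance < best_distance:
                best_label, best_distance = label, distance
    return best_label, best_distance
```

A linear scan is enough to show the idea; a practical database would likely need an indexed or approximate search.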

The method of claim 12,

The first step comprises:

Generating a dynamic motion vector and extracting static features;

Accumulating the dynamic motion vector and the static features together to determine a region of interest; and

Extracting the region of interest using a basic tile having a predetermined size.
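
A simplified view of this first step is sketched below in Python. It assumes the dynamic and static features are already available as per-pixel maps of equal size, accumulates them by plain addition, and keeps every basic tile whose mean accumulated value reaches a threshold. The threshold test stands in for the seed-point and region-expansion procedure and is an assumption, as is the name `roi_tiles`.

```python
def roi_tiles(motion, static, tile, threshold):
    """Return the (x, y) origins of the basic tiles selected as ROI.

    motion and static are equal-sized 2-D lists (rows of per-pixel values);
    tile is the side length of the square basic tile.
    """
    height, width = len(motion), len(motion[0])
    # Accumulate the dynamic and static features into one map.
    accumulated = [[motion[r][c] + static[r][c] for c in range(width)]
                   for r in range(height)]
    selected = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            cells = [accumulated[r][c]
                     for r in range(y, min(y + tile, height))
                     for c in range(x, min(x + tile, width))]
            if sum(cells) / len(cells) >= threshold:
                selected.append((x, y))
    return selected
```

Working in fixed-size tiles keeps every unit of work the same shape, which suits the SIMD processing units that later consume the tiles.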