KR100956747B1

KR100956747B1 - Computer Architecture Combining Neural Network and Parallel Processor, and Processing Method Using It

Info

Publication number: KR100956747B1
Application number: KR1020080012058A
Authority: KR
Inventors: 유회준; 김관호
Original assignee: 한국과학기술원
Priority date: 2008-02-11
Filing date: 2008-02-11
Publication date: 2010-05-12
Also published as: KR20090086660A

Abstract

The present invention relates to a computer architecture that combines a neural network circuit with a parallel processor, and to a processing method using it. The architecture comprises: a cellular neural network circuit that extracts feature points from an image of an object, converts them into two-dimensional coordinate values on the image, and outputs those coordinate values as packets; a parallel processor that includes a plurality of processing units, sets an image patch region centered on each output two-dimensional coordinate value, and performs SIMD operations on the pixels inside that patch region; and an on-chip network device that is connected between the cellular neural network circuit and the parallel processor and transfers the coordinate packets from the former to the latter. The neural network accelerates coarse processing of the whole image, and the high-performance parallel digital processor then performs detailed image processing on that basis, which enables fast, real-time object recognition at low power.

Object recognition, neural network circuit, parallel processor, SIMD

Description

Computer Architecture Combining Neural Network and Parallel Processor, and Processing Method Using It

The present invention relates to a computer architecture that combines a neural network circuit with a parallel processor, and to a processing method using it. In particular, it combines a neural network that mimics and accelerates visual attention, the phenomenon by which the human brain grasps only the distinctive parts of a scene at a glance, with a parallel digital processor for detailed, pixel-level image processing.

Object recognition is the process of taking a 2-D image as input, finding the feature points of the image, computing an orientation and magnitude vector (descriptor) for each feature point, and comparing those vectors against a database, which is a set of vectors for various pre-registered objects, to find the object with the closest vector. It is used in many applications, such as autonomous navigation of intelligent robots and automotive cruise control.

Neural network circuits bridge the gap between computers, which excel at executing sequential instructions, and humans, who are adept at generalizing from experience, by modeling the neural connections of the human brain on a digital computer. Such circuits are used in a wide range of fields, including data mining, network management, machine vision, and signal processing. The cellular neural network in particular is well suited to hardware implementation because each cell exchanges data only with neighboring neurons, and it is widely used for basic image processing and for image classification and segmentation.

Two approaches are mainly used today to perform such object recognition: implementing the widely used but complex SIFT (Scale Invariant Feature Transform) algorithm in software and running it on a high-performance general-purpose processor, or accelerating the object recognition algorithm with a parallel processor that contains a large number of processing units suited to pixel-level image processing.

Because the object recognition algorithm is complex, it is computationally heavy: on a general-purpose processor it runs at only about 2 frames per second. When a parallel processor with many processing units is used instead, power consumption grows in proportion to the number of processing units required, which makes the approach hard to apply to low-power real-time applications such as mobile robots and mobile phones.

To solve these problems, the present invention combines a neural network circuit with a parallel processor and applies the combination to hardware for low-power real-time object recognition. Its object is to provide an object recognition operation that uses the neural network circuit and the parallel processor together.

To achieve this technical object, a computer architecture combining a neural network circuit and a parallel processor according to the present invention comprises: a cellular neural network circuit that extracts feature points from an image of an object, converts them into two-dimensional coordinate values on the image, and outputs the two-dimensional coordinate values as packets; a parallel processor that includes a plurality of processing units, sets an image patch region centered on each output two-dimensional coordinate value, and performs SIMD operations on the pixels inside the patch region; and an on-chip network device that is connected between the cellular neural network circuit and the parallel processor and transfers the packets carrying the two-dimensional coordinate values converted by the neural network circuit to the parallel processor.

Preferably, the cellular neural network circuit uses a visual attention algorithm to perform at least one of building a feature map of the image of the object, contour extraction, and texture extraction.

The cellular neural network circuit may include a CNN (Cellular Neural Network) whose cells are arranged in a 2-D array.

The plurality of processing units may be arranged in a 1-D array and may consist of eight identical processing units.

Preferably, the parallel processor divides the image of the object vertically into eight equal image regions and assigns one region to each of the eight processing units; the pixels on each line of the image are processed simultaneously, and the image is output or updated line by line.

An object recognition processing method using the computer architecture described above may comprise: a first step of extracting the feature points contained in an image of an object with the cellular neural network circuit; a second step of converting the extracted feature points into two-dimensional coordinate values on the image and outputting the two-dimensional coordinate values as packets; a third step of transferring the output two-dimensional coordinate values to the parallel processor; and a fourth step of setting an image patch region centered on each transferred two-dimensional coordinate value and performing SIMD operations on the pixels inside the patch region.

In the first step, the feature regions are preferably extracted by the visual attention operation of the neural network circuit.

In the fourth step, the whole image is preferably divided vertically into eight equal image regions, one region is assigned to each of the processing units, the pixels on each line of the image are processed simultaneously, and the image is output or updated line by line.

The fourth step may further include generating an orientation histogram and a feature vector for each feature point from the results of the operations inside the image patch region, and recognizing the object that corresponds to the vector with the shortest distance by comparison with the vectors in a database.

As described above, with the computer architecture combining a neural network circuit and a parallel processor according to the present invention, and the processing method using it, the neural network accelerates coarse processing of the whole image and the high-performance parallel digital processor then performs detailed image processing on that basis, which enables fast, real-time object recognition at low power.

The computer architecture combining a neural network circuit and a parallel processor, and the processing method using it, are described in detail below with reference to the drawings.

As one embodiment of the present invention, the description applies the new computer architecture to an object recognition processing system, the application to which a combination of a neural network circuit and a parallel processor is best suited.

Fig. 1 is a block diagram showing the overall structure of a computer architecture combining a neural network circuit and a parallel processor according to an embodiment of the present invention. Fig. 2 shows the image processing method of the parallel processor according to the embodiment. Fig. 3 shows how the parallel processor according to the embodiment processes an image patch region centered on a feature point.

As shown in Fig. 1, the computer architecture according to an embodiment of the present invention consists of: a cellular neural network circuit (10) that extracts feature points from an image of an object, converts them into two-dimensional coordinate values on the image, and outputs the two-dimensional coordinate values as packets; a parallel processor (30) that includes a plurality of processing units (PE1 to PE8, 35), sets an image patch region centered on each output two-dimensional coordinate value, and performs SIMD operations on the pixels inside the patch region; an on-chip network device (20) that is connected between the cellular neural network circuit (10) and the parallel processor (30) and transfers the packets carrying the coordinate values converted by the cellular neural network circuit (10) to the parallel processor (30); and a memory (40) that stores the data values of the feature points extracted by the cellular neural network circuit (10).

Here, the cellular neural network circuit (10) extracts the feature points of the image and passes them to the parallel processor (30), and in doing so controls the overall computer architecture; the parallel processor (30) mainly performs the computation under the control of the neural network circuit (10).

The cellular neural network circuit (10) uses a CNN (Cellular Neural Network), whose cells are arranged in a 2-D array, a structure that favors hardware implementation. The parallel processor (30) consists of eight SIMD (Single Instruction Multiple Data) processing units (35) arranged in a 1-D array, which accelerate identical pixel-level operations.
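The local connectivity that makes a cellular neural network hardware-friendly can be illustrated with a minimal sketch. The patent does not give the cell dynamics or the templates, so everything below is an assumption made for illustration: a single feedforward update with a 3×3 template, a bias term, and the usual piecewise-linear output clamp, with `cnn_step` and `EDGE_TEMPLATE` as hypothetical names.

```python
def cnn_step(u, template_b, bias):
    """One simplified feedforward update of a 2-D cell array.

    Each cell reads only its 3x3 neighbourhood of the input u.
    This is an illustrative simplification, not the patent's circuit.
    """
    h, w = len(u), len(u[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            state = bias
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    # Cells outside the array contribute nothing.
                    if 0 <= yy < h and 0 <= xx < w:
                        state += template_b[dy + 1][dx + 1] * u[yy][xx]
            # Piecewise-linear output: clamp the state to [-1, 1].
            out[y][x] = max(-1.0, min(1.0, state))
    return out


# Hypothetical edge-detecting template (assumed, not from the patent).
EDGE_TEMPLATE = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
```

Because every cell depends only on its immediate neighbours, all cells can in principle be updated at the same time, which is the property the text relies on.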

To implement in hardware the visual attention phenomenon that occurs in the visual cortex of the human brain, the cellular neural network circuit (10) uses a neural-network-based visual attention algorithm to build a feature map of the whole image or to perform operations such as contour extraction and texture extraction.

The conventional method of building a feature map, which uses a Gaussian pyramid and the center-surround difference operation, is therefore recast as a neural-network-based algorithm so that it can run on a cellular neural network circuit, which accelerates the visual attention step that finds the feature points.
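For orientation, the conventional center-surround difference mentioned above can be sketched as follows. This is the textbook idea, not the patent's cellular-neural-network variant, which the text does not detail. The box filter standing in for Gaussian smoothing, the radii, and the function names are all assumptions.

```python
def box_blur(img, radius):
    """Mean filter that averages only over in-bounds neighbours.

    Stands in for one Gaussian smoothing level (an assumption).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def center_surround(img, center_radius=0, surround_radius=2):
    """Absolute difference between a fine (center) and a coarse (surround) scale."""
    center = box_blur(img, center_radius)
    surround = box_blur(img, surround_radius)
    return [[abs(c - s) for c, s in zip(crow, srow)]
            for crow, srow in zip(center, surround)]


def strongest_point(feature_map):
    """Return the (x, y) coordinate with the largest feature-map response."""
    _, x, y = max((v, x, y)
                  for y, row in enumerate(feature_map)
                  for x, v in enumerate(row))
    return x, y
```

A pixel that differs strongly from its surroundings produces a large response, so the maxima of the map are natural candidates for feature points.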

In addition, because the cellular neural network circuit (10) accelerates coarse processing of the whole image and the high-performance parallel digital processor (30) then performs detailed image processing on that basis, real-time object recognition at 22 frames per second becomes possible at a low power of 600 mW.

For example, in tests on a database commonly used in object recognition experiments, such as COIL-100, the number of feature points fell by about 36% on average, while the object recognition rate was almost the same as that of the conventional algorithm without the neural network circuit.

Moreover, because there are fewer feature points, the amount of computation needed to build the vectors for object recognition and to match them against the database drops markedly, which is what makes real-time object recognition possible.

The parallel processor (30) of the computer architecture according to an embodiment of the present invention is described in detail with reference to Fig. 2. The parallel processor first divides the image (U) vertically into eight equal parts, and the processing units (PE1 to PE8) process their respective image regions (I) simultaneously.

When the same operation is applied to the whole image (U), as in Gaussian filtering, the pixels on each line (L) of the image are processed simultaneously, and the operation is carried out line by line over the whole image of width (W) and height (H).
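The partitioning described above can be sketched in a few lines. The sketch assumes that "divided vertically into eight" means eight side-by-side column strips, so that each processing unit handles its own segment of every line; the patent text does not spell this out. The function names and the sequential inner loop, which stands in for concurrent hardware, are illustrative.

```python
NUM_PES = 8  # the embodiment uses eight processing units


def strip_bounds(width, num_pes=NUM_PES):
    """Return one (x_start, x_end) column range per PE, end exclusive."""
    base, extra = divmod(width, num_pes)
    bounds, start = [], 0
    for pe in range(num_pes):
        end = start + base + (1 if pe < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def process_line_by_line(image, op, num_pes=NUM_PES):
    """Apply op to every pixel, one line at a time, strip by strip."""
    bounds = strip_bounds(len(image[0]), num_pes)
    out = []
    for line in image:                 # lines are visited one after another
        new_line = []
        for x0, x1 in bounds:          # conceptually concurrent across PEs
            new_line.extend(op(p) for p in line[x0:x1])
        out.append(new_line)
    return out
```

For a 16-pixel-wide image, `strip_bounds(16)` gives each processing unit a two-column strip.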

Fig. 3 shows how the parallel processor processes operations on the image region centered on each feature point extracted by the neural network circuit.

Here, the image pixels (Px) in the region within a distance R of a feature point (P) obtained by the neural network circuit form one image patch region (A). The operations among the pixels (Px) inside the image patch region (A) are carried out by the SIMD operation unit inside a processing unit (PE), which performs the operation required for each pixel (Px) simultaneously.
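A minimal sketch of patch selection follows. It assumes a square window of side 2R+1 clipped to the image borders; the patent only says the patch holds the pixels within distance R of the feature point, so the window shape is an assumption. `patch_pixels` and `simd_map` are hypothetical names.

```python
def patch_pixels(image, center, radius):
    """Collect (x, y, value) for pixels in a square window around center."""
    cx, cy = center
    h, w = len(image), len(image[0])
    return [
        (x, y, image[y][x])
        for y in range(max(0, cy - radius), min(h, cy + radius + 1))
        for x in range(max(0, cx - radius), min(w, cx + radius + 1))
    ]


def simd_map(pixels, op):
    """Apply one operation to every patch pixel, as a SIMD unit would in lockstep."""
    return [(x, y, op(v)) for x, y, v in pixels]
```

With R = 1 an interior feature point yields a 3×3 patch of nine pixels, while a feature point in the image corner yields only the four pixels that exist.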

If a feature point lies on the boundary between processing units (PE), that is, if the image patch (A) spans two processing units (PE), image data is transferred between the processing units (PE) over the on-chip network so that the operation on the image patch (A) is performed inside a single processing unit (PE).
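The boundary case can be made concrete with a small sketch that decides which processing unit owns a patch and which neighbouring units would have to send pixel data. It assumes column-strip partitioning and zero-based PE indices (PE1 corresponds to index 0), and it picks the unit containing the feature point as the owner; the patent does not state which unit is chosen, so that rule is an assumption.

```python
def pe_for_column(x, width, num_pes=8):
    """Index of the PE whose column strip contains column x."""
    return min(x * num_pes // width, num_pes - 1)


def patch_owner_and_sources(cx, radius, width, num_pes=8):
    """Return (owner PE, list of other PEs whose strips overlap the patch)."""
    owner = pe_for_column(cx, width, num_pes)
    left = pe_for_column(max(0, cx - radius), width, num_pes)
    right = pe_for_column(min(width - 1, cx + radius), width, num_pes)
    sources = [pe for pe in range(left, right + 1) if pe != owner]
    return owner, sources
```

An empty source list means the patch lies entirely inside one strip and no transfer over the on-chip network is needed.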

The value of R can be set by a processor inside the chip or through an external interface.

An object recognition processing method using the computer architecture configured as described above is explained below with reference to Fig. 4.

Fig. 4 is a flowchart of an object recognition processing method using the computer architecture combining a neural network circuit and a parallel processor according to an embodiment of the present invention.

As shown in Fig. 4, the object recognition processing method according to an embodiment of the present invention comprises: a first step (S10) of extracting the feature points contained in an image of an object with the cellular neural network circuit; a second step (S20) of converting the extracted feature points into two-dimensional coordinate values on the image and outputting the coordinate values as packets; a third step (S30) of transferring the output coordinate values to the parallel processor; a fourth step (S40) of performing SIMD operations on the pixels inside the image patch formed around each transferred two-dimensional coordinate value; a fifth step (S50) of generating an orientation histogram and a feature vector for each feature point from the results of the operations inside the image patch; and a sixth step (S60) of recognizing the object that corresponds to the vector with the shortest distance by comparison with the vectors in a database.

In the first step (S10), the feature points are extracted by the visual attention operation of the neural network circuit.

A feature point extracted by the neural network circuit can be expressed as an (x, y) coordinate on the whole image. The coordinate is made into a packet (S20), and the on-chip network delivers the packet to the part of the parallel processor that is responsible for the image region containing the feature point (S30).
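The packetizing and routing of a coordinate can be sketched as below. The packet layout is entirely hypothetical, since the patent does not define one: an 8-bit destination PE index followed by 16-bit x and y fields in big-endian order. The destination is derived under the assumption of column-strip partitioning with zero-based PE indices.

```python
import struct

# Hypothetical layout: destination PE (8 bits), x (16 bits), y (16 bits).
PACKET_FORMAT = ">BHH"


def make_packet(x, y, width, num_pes=8):
    """Pack a feature-point coordinate, addressed to the PE that owns column x."""
    dest = min(x * num_pes // width, num_pes - 1)
    return struct.pack(PACKET_FORMAT, dest, x, y)


def read_packet(packet):
    """Unpack a packet into (destination PE, x, y)."""
    return struct.unpack(PACKET_FORMAT, packet)
```

For a 320-pixel-wide image, a feature point at x = 100 falls in the third strip, so its packet is addressed to PE index 2.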

In the fourth step (S40), the whole image is divided vertically into eight equal image regions, one region is assigned to each of the eight processing units of the parallel processor, the pixels on each line of the image are processed simultaneously, and the image is output or updated line by line.

Then, in the fifth step (S50), an orientation histogram is generated for each feature point from the results of the operations inside the image patch, which yields a 16-dimensional feature vector.
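One way to obtain a 16-dimensional vector from an orientation histogram is sketched below. The patent states only that the histogram leads to a 16-dimensional feature vector; the choice of 16 orientation bins, central-difference gradients, magnitude weighting, and L2 normalization are assumptions made for illustration.

```python
import math


def orientation_histogram(patch, bins=16):
    """Magnitude-weighted histogram of gradient orientations, L2-normalized."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central differences approximate the image gradient.
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm else hist
```

A patch whose brightness increases only from left to right puts all of its weight in the first bin, because every gradient points along the positive x axis.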

Finally, in the sixth step (S60), the feature vectors are compared with the vectors in the database, and the object that corresponds to the vector with the shortest distance is taken as the final recognition result.
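The matching step amounts to a nearest-neighbour search, sketched here with Euclidean distance. The distance metric and the database layout, a dictionary mapping an object label to its stored vectors, are assumptions; the patent specifies only that the closest vector determines the object.

```python
import math


def recognize(query, database):
    """Return (label, distance) of the stored vector closest to query.

    database maps an object label to a list of feature vectors.
    """
    best_label, best_dist = None, math.inf
    for label, vectors in database.items():
        for vector in vectors:
            d = math.dist(query, vector)  # Euclidean distance
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist
```

The cost of this search grows with the number of query feature points, which is why reducing the feature points in the attention stage also reduces the matching work.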

The neural network circuit thus identifies the feature points, or regions of focus, across the whole image, and the parallel processor does not process the whole image but concentrates only on the distinctive regions extracted by the neural network circuit. The amount of computation needed for object recognition is therefore reduced markedly.

Those skilled in the art to which the present invention pertains will understand that the invention can be embodied in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in every respect and not restrictive. The scope of the present invention is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

Fig. 1 is a block diagram showing the overall structure of a computer architecture combining a neural network circuit and a parallel processor according to an embodiment of the present invention.

Fig. 2 shows the image processing method of a parallel processor according to an embodiment of the present invention.

Fig. 3 shows how a parallel processor according to an embodiment of the present invention processes an image patch region centered on a feature point.

Claims

A computer architecture combining a neural network circuit and a parallel processor, comprising:

a cellular neural network circuit that extracts feature points from an image of an object, converts the feature points into two-dimensional coordinate values on the image of the object, and outputs the two-dimensional coordinate values as packets;

a parallel processor that includes a plurality of processing units, sets an image patch region centered on each output two-dimensional coordinate value, and performs SIMD operations on the pixels inside the set image patch region; and

an on-chip network device that is connected between the cellular neural network circuit and the parallel processor and transfers the packets carrying the two-dimensional coordinate values converted by the neural network circuit to the parallel processor.

The computer architecture of claim 1,

wherein the cellular neural network circuit uses a visual attention algorithm to perform at least one of building a feature map of the image of the object, contour extraction, and texture extraction.

The computer architecture of claim 1,

wherein the cellular neural network circuit includes a CNN (Cellular Neural Network) whose cells are arranged in a 2-D array.

The computer architecture of claim 1,

wherein the plurality of processing units are arranged in a 1-D array.

The computer architecture of claim 4,

wherein the plurality of processing units consist of eight identical processing units.

The computer architecture of claim 1,

wherein the parallel processor divides the image of the object vertically into eight equal image regions and assigns one region to each of eight processing units, the pixels on each line of the image of the object are processed simultaneously, and the image is output or updated line by line.

An object recognition processing method using the computer architecture combining a neural network circuit and a parallel processor according to claim 1, the method comprising:

a first step of extracting the feature points contained in an image of an object with the cellular neural network circuit;

a second step of converting the extracted feature points into two-dimensional coordinate values on the image of the object and outputting the two-dimensional coordinate values as packets;

a third step of transferring the output two-dimensional coordinate values to the parallel processor; and

a fourth step of setting an image patch region centered on each transferred two-dimensional coordinate value and performing SIMD operations on the pixels inside the set image patch region.

The method of claim 7,

wherein, in the first step, feature regions are extracted by the visual attention operation of the neural network circuit.

The method of claim 7,

wherein, in the fourth step, the whole image is divided vertically into eight equal image regions, one region is assigned to each of the plurality of processing units, the pixels on each line of the image are processed simultaneously, and the image is output or updated line by line.

The method of claim 7,

wherein the fourth step further comprises: generating an orientation histogram and a feature vector for each feature point from the results of the operations inside the image patch region; and recognizing the object that corresponds to the vector with the shortest distance by comparison with the vectors in a database.