KR20210111014A

KR20210111014A - Electronic apparatus and method for controlling thereof

Info

Publication number: KR20210111014A
Application number: KR1020200026010A
Authority: KR
Inventors: 박배성; 김병욱; 이동수; 권세중; 전용권
Original assignee: 삼성전자주식회사
Priority date: 2020-03-02
Filing date: 2020-03-02
Publication date: 2021-09-10
Also published as: US20210271981A1; WO2021177617A1

Abstract

An electronic device that performs computation of a neural network model is disclosed. The electronic device comprises: a memory storing weight data that includes quantized weight values of the neural network model; and a processor that obtains operation data based on input data and on binary data items that differ from one another in at least one bit value, generates a lookup table in which the operation data is matched to the binary data, obtains the operation data corresponding to the weight data from the lookup table, and performs the computation of the neural network model based on the obtained operation data. Accordingly, an output value for an input value can be derived accurately within a short time.

Description

ELECTRONIC APPARATUS AND METHOD FOR CONTROLLING THEREOF

The present disclosure relates to an electronic device and a control method thereof, and more particularly, to an electronic device that operates based on artificial intelligence technology and a control method thereof.

Recently, artificial intelligence systems that implement human-level intelligence have been developed. Unlike existing rule-based systems, an artificial intelligence system is a system in which a machine learns and makes judgments by itself, and such systems are used in a wide range of fields such as voice recognition, image recognition, and future prediction.

In particular, artificial intelligence systems that solve a given problem through a deep neural network based on deep learning have recently been developed.

A deep neural network is a neural network that includes a plurality of hidden layers between an input layer and an output layer, and refers to a model that implements artificial intelligence technology through the neurons included in each layer.

Such a deep neural network generally includes a large number of neurons in order to derive an accurate result value.

However, when a vast number of neurons is present, the accuracy of the output value for an input value may increase, but the computation required to derive the output value takes a long time.

In addition, because of the vast number of neurons, a deep neural network cannot be used on a mobile device with limited memory, such as a smartphone, due to capacity constraints.

The present disclosure has been devised to solve the above problems, and an object of the present disclosure is to provide an electronic device that accurately derives an output value for an input value within a short time and that allows artificial intelligence technology to be realized even on a mobile device or the like with limited memory.

An electronic device that performs computation of a neural network model according to an embodiment of the present disclosure includes: a memory storing weight data that includes quantized weight values of the neural network model; and a processor that obtains operation data based on input data and on binary data items that differ from one another in at least one bit value, generates a lookup table in which the operation data is matched to the binary data, obtains operation data corresponding to the weight data from the lookup table, and performs the computation of the neural network model based on the obtained operation data.

Here, each binary data item may consist of n bit values, the input data may include a plurality of input values of a matrix, and the processor may obtain n input values from each column of the matrix and obtain the operation data for each binary data item based on the binary data and the n input values.

The weight data may include a plurality of weight values of a matrix, and the processor may identify, in each row of the matrix, n weight values corresponding to the n input values, identify the binary data item that corresponds to the identified n weight values, obtain the operation data corresponding to the identified binary data item from the lookup table, and perform the computation of the neural network model based on the obtained operation data.
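The lookup-table idea summarized above can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the function names `build_lookup_table` and `lookup_dot`, the sample values, and the convention that bit 1 stands for weight +1 and bit 0 for weight -1 are all assumptions made for illustration.

```python
from itertools import product


def build_lookup_table(inputs):
    """Map every n-bit pattern to the signed sum of the n input values.

    Assumed encoding: bit 1 means weight +1, bit 0 means weight -1.
    """
    n = len(inputs)
    table = {}
    for bits in product((0, 1), repeat=n):
        table[bits] = sum(x if b == 1 else -x for b, x in zip(bits, inputs))
    return table


def lookup_dot(weight_bits, table):
    """Return the dot product of a +/-1 weight row with the inputs via one table read."""
    return table[tuple(weight_bits)]


inputs = [0.5, -1.0, 2.0]            # n = 3 values taken from one input column
table = build_lookup_table(inputs)   # 2**3 = 8 entries, one per bit pattern
weights = [1, -1, 1]                 # quantized weights from one weight-matrix row
bits = [1 if w == 1 else 0 for w in weights]
result = lookup_dot(bits, table)     # replaces three multiply-adds with one lookup
```

Under these assumptions the table is built once per group of n input values and can then be read by every weight row that multiplies the same inputs, which is where the saving in arithmetic would come from.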

The processor may determine, from among a plurality of lookup tables generated based on the input values of each column of the matrix, the lookup table corresponding to each column of an output matrix for the input data, and may obtain the output values of each column of the output matrix from the respective lookup table.

The processor may divide the matrix including the plurality of input values into a first matrix and a second matrix with respect to a predetermined row, and divide the matrix including the plurality of weight values into a third matrix and a fourth matrix with respect to a predetermined column. The processor may then generate a plurality of lookup tables based on the input values of each column of the first matrix and obtain, from those lookup tables, the operation data corresponding to each row of the third matrix, and may generate a plurality of lookup tables based on the input values of each column of the second matrix and obtain, from those lookup tables, the operation data corresponding to each row of the fourth matrix.
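The matrix division described above can be sketched as a blocked product. This is an illustrative sketch only: the helper names `build_lut` and `split_lut_matmul`, the 0/1 encoding of -1/+1 weights, and the step of summing the two partial results into the output are assumptions (the summation follows ordinary block matrix multiplication and is not stated in the passage).

```python
from itertools import product


def build_lut(values):
    """Signed sum of `values` for every bit pattern (bit 1 -> +1, bit 0 -> -1)."""
    return {
        bits: sum(v if b else -v for b, v in zip(bits, values))
        for bits in product((0, 1), repeat=len(values))
    }


def split_lut_matmul(W, X, split):
    """Compute W @ X for a -1/+1 matrix W using per-block lookup tables.

    X is divided at row `split` (first and second matrices) and W at
    column `split` (third and fourth matrices).
    """
    x_blocks = (X[:split], X[split:])
    w_blocks = ([row[:split] for row in W], [row[split:] for row in W])
    rows, cols = len(W), len(X[0])
    out = [[0.0] * cols for _ in range(rows)]
    for x_blk, w_blk in zip(x_blocks, w_blocks):
        for j in range(cols):
            column = [x_row[j] for x_row in x_blk]
            table = build_lut(column)       # one table per input column of this block
            for i in range(rows):
                key = tuple(1 if w == 1 else 0 for w in w_blk[i])
                out[i][j] += table[key]     # assumed: partial results are accumulated
    return out


W = [[1, -1, 1, 1], [-1, -1, 1, -1]]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
Y = split_lut_matmul(W, X, split=2)
```

Splitting keeps each table at 2^split entries rather than 2^K for the full inner dimension K, which is one plausible reason for dividing the matrices.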

The processor may obtain eight input values from each column of the matrix and obtain the operation data for each binary data item based on the binary data and the eight input values.

When, among a plurality of arithmetic expressions based on the binary data and the n input values, a first arithmetic expression and a second arithmetic expression share the same intermediate expression, the processor may evaluate the second arithmetic expression based on the value already computed for the first arithmetic expression.
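One way to reuse shared intermediate expressions while filling the table is to grow it one input value at a time, so that each partial sum is computed once and extended with both signs of the next input. The sketch below is an assumption about how such reuse could be organized, not the procedure of the disclosure; `build_lut_incremental` and the bit-index convention are hypothetical.

```python
def build_lut_incremental(inputs):
    """Return a list of 2**n signed sums.

    Index p encodes the pattern: if bit i of p is 1, inputs[i] is added,
    otherwise it is subtracted (assumed convention).
    """
    table = [0.0]
    for x in inputs:
        # Each existing partial sum is an intermediate expression shared by
        # two longer expressions, one subtracting x and one adding x.
        table = [s - x for s in table] + [s + x for s in table]
    return table


lut = build_lut_incremental([0.5, -1.0, 2.0])
```

With this ordering the total work is 2^(n+1) - 2 additions or subtractions, roughly two per final entry, instead of about n per entry when every expression is evaluated from scratch.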

A control method of an electronic device that performs computation of a neural network model according to an embodiment of the present disclosure includes: obtaining operation data based on input data and on binary data items that differ from one another in at least one bit value; generating a lookup table in which the operation data is matched to the binary data; obtaining, from the lookup table, operation data corresponding to weight data that includes quantized weight values of the neural network model; and performing the computation of the neural network model based on the obtained operation data.

Here, each binary data item may consist of n bit values, the input data may include a plurality of input values of a matrix, and the obtaining of the operation data may include obtaining n input values from each column of the matrix and obtaining the operation data for each binary data item based on the binary data and the n input values.

The weight data may include a plurality of weight values of a matrix, and the performing of the computation of the neural network model may include identifying, in each row of the matrix, n weight values corresponding to the n input values, identifying the binary data item that corresponds to the identified n weight values, obtaining the operation data corresponding to the identified binary data item from the lookup table, and performing the computation of the neural network model based on the obtained operation data.

The performing of the computation of the neural network model may include determining, from among a plurality of lookup tables generated based on the input values of each column of the matrix, the lookup table corresponding to each column of an output matrix for the input data, and obtaining the output values of each column of the output matrix from the respective lookup table.

The obtaining of the operation data may include dividing the matrix including the plurality of input values into a first matrix and a second matrix with respect to a predetermined row, dividing the matrix including the plurality of weight values into a third matrix and a fourth matrix with respect to a predetermined column, generating a plurality of lookup tables based on the input values of each column of the first matrix and obtaining, from those lookup tables, the operation data corresponding to each row of the third matrix, and generating a plurality of lookup tables based on the input values of each column of the second matrix and obtaining, from those lookup tables, the operation data corresponding to each row of the fourth matrix.

The obtaining of the operation data may include obtaining eight input values from each column of the matrix and obtaining the operation data for each binary data item based on the binary data and the eight input values.

The generating of the lookup table may include, when a first arithmetic expression and a second arithmetic expression among a plurality of arithmetic expressions based on the binary data and the n input values share the same intermediate expression, evaluating the second arithmetic expression based on the value already computed for the first arithmetic expression.

According to the various embodiments of the present disclosure described above, an output value for an input value can be derived accurately within a short time, and artificial intelligence technology can be realized even on a mobile device or the like with limited memory.

FIG. 1 is a block diagram illustrating an electronic device according to an embodiment of the present disclosure.
FIG. 2A is a diagram illustrating a matrix corresponding to input data according to an embodiment of the present disclosure.
FIG. 2B is a diagram illustrating a lookup table according to an embodiment of the present disclosure.
FIG. 3A is a diagram for explaining the computation of a neural network model using a lookup table according to an embodiment of the present disclosure.
FIG. 3B is a diagram for explaining the lookup table used for each column of a matrix corresponding to output data according to an embodiment of the present disclosure.
FIG. 3C is a diagram for explaining an embodiment of obtaining the output values of a first column of a matrix corresponding to output data according to an embodiment of the present disclosure.
FIG. 3D is a diagram for explaining an embodiment of obtaining the output values of a second column of a matrix corresponding to output data according to an embodiment of the present disclosure.
FIG. 3E is a diagram for explaining an embodiment of obtaining the output values of a third column of a matrix corresponding to output data according to an embodiment of the present disclosure.
FIG. 4A is a diagram for explaining the arithmetic expressions used to generate a lookup table according to an embodiment of the present disclosure.
FIG. 4B is a diagram for explaining the same intermediate expression included in a plurality of arithmetic expressions according to an embodiment of the present disclosure.
FIG. 4C is a diagram for explaining the same intermediate expression included in a plurality of arithmetic expressions according to an embodiment of the present disclosure.
FIG. 4D is a diagram for explaining the same intermediate expression included in a plurality of arithmetic expressions according to an embodiment of the present disclosure.
FIG. 4E is a diagram for explaining the same intermediate expression included in a plurality of arithmetic expressions according to an embodiment of the present disclosure.
FIG. 4F is a diagram for explaining an embodiment of obtaining an operation value based on the same intermediate expression according to an embodiment of the present disclosure.
FIG. 5 is a diagram for explaining a computation method of a neural network model according to an embodiment of the present disclosure.
FIG. 6 is a detailed block diagram illustrating an electronic device according to an embodiment of the present disclosure.
FIG. 7 is a flowchart illustrating a control method of an electronic device according to an embodiment of the present disclosure.

Hereinafter, various embodiments of the present disclosure are described with reference to the accompanying drawings. This is not intended to limit the technology described in the present disclosure to specific embodiments, and the disclosure should be understood to cover various modifications, equivalents, and/or alternatives of the embodiments. In the description of the drawings, similar reference numerals may be used for similar components.

In the present disclosure, expressions such as "has," "may have," "includes," or "may include" indicate the presence of the corresponding feature (e.g., a numerical value, a function, an operation, or a component such as a part) and do not exclude the presence of additional features.

In the present disclosure, expressions such as "A or B," "at least one of A or/and B," or "one or more of A or/and B" may include all possible combinations of the items listed together. For example, "A or B," "at least one of A and B," or "at least one of A or B" may refer to any of the cases of (1) including at least one A, (2) including at least one B, or (3) including both at least one A and at least one B.

Expressions such as "first," "second," "firstly," or "secondly" used in the present disclosure may modify various components regardless of order and/or importance, and are used only to distinguish one component from another without limiting those components.

When a component (e.g., a first component) is referred to as being "(operatively or communicatively) coupled with/to" or "connected to" another component (e.g., a second component), it should be understood that the component may be connected to the other component directly or through yet another component (e.g., a third component). On the other hand, when a component (e.g., a first component) is referred to as being "directly coupled" or "directly connected" to another component (e.g., a second component), it may be understood that no other component (e.g., a third component) exists between the two.

The expression "configured to (or set to)" used in the present disclosure may be used interchangeably with, for example, "suitable for," "having the capacity to," "designed to," "adapted to," "made to," or "capable of," depending on the situation. The term "configured to (or set to)" does not necessarily mean only "specifically designed to" in hardware. Instead, in some situations, the expression "a device configured to" may mean that the device is "capable of" an operation together with other devices or parts. For example, the phrase "a processor configured (or set) to perform A, B, and C" may mean a dedicated processor (e.g., an embedded processor) for performing the corresponding operations, or a generic-purpose processor (e.g., a CPU or an application processor) capable of performing the corresponding operations by executing one or more software programs stored in a memory device.

In the present disclosure, a "module" or a "unit" performs at least one function or operation, and may be implemented as hardware, as software, or as a combination of hardware and software. In addition, a plurality of "modules" or a plurality of "units" may be integrated into at least one module and implemented with at least one processor (not shown), except for a "module" or "unit" that needs to be implemented with specific hardware.

Hereinafter, the present disclosure is described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating an electronic device according to an embodiment of the present disclosure.

Referring to FIG. 1, an electronic device 100 according to an embodiment of the present disclosure includes a memory 110 and a processor 120.

The electronic device 100 according to an embodiment of the present disclosure is a device that obtains output data for input data using a neural network model (or an artificial intelligence model). For example, the electronic device 100 may be a desktop PC, a laptop, a smartphone, a tablet PC, a server, or the like. Alternatively, the electronic device 100 may be a system itself in which a cloud computing environment is built. However, the electronic device 100 is not limited thereto, and may be any device capable of performing the computation of a neural network model.

The memory 110 may be implemented as a hard disk, a non-volatile memory, a volatile memory, or the like. Here, the non-volatile memory may be a one time programmable ROM (OTPROM), a programmable ROM (PROM), an erasable and programmable ROM (EPROM), an electrically erasable and programmable ROM (EEPROM), a mask ROM, a flash ROM, or the like, and the volatile memory may be a dynamic RAM (DRAM), a static RAM (SRAM), a synchronous dynamic RAM (SDRAM), or the like.

Meanwhile, although FIG. 1 illustrates the memory 110 as a component separate from the processor 120, the memory 110 may also be included in the processor 120. That is, the memory 110 may be implemented as an off-chip memory or as an on-chip memory.

In addition, although FIG. 1 illustrates a single memory 110, the memory 110 may be implemented as a plurality of memories depending on the embodiment.

The memory 110 may store weight data of the neural network model. Here, the weight data is data used in the computation of the neural network model, and the memory 110 may store a plurality of weight data corresponding to the plurality of layers that constitute the neural network model.

In particular, the memory 110 may store weight data including quantized weight values. Here, a quantized weight value may be -1 or 1, and the weight data may be expressed as an m×n matrix composed of -1 and 1 values. A weight value of -1 may also be replaced with 0 before being stored in the memory 110; that is, the memory 110 may store weight data composed of 0s and 1s. Depending on the embodiment, weight data including weight values of -1 or 1 may be stored in a first memory (e.g., a hard disk), and weight data including weight values of 0 or 1 may be stored in a second memory (e.g., an SDRAM). Here, the weight values of 0 or 1 may be used in the computation of the neural network model described below.
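The substitution of 0 for -1 means each quantized weight fits in a single bit. A minimal sketch of that packing follows; `encode_row` and `decode_row` are hypothetical helper names, and the choice of placing the first weight in the least significant bit is an assumption.

```python
def encode_row(weights):
    """Pack a row of -1/+1 weights into an int, mapping -1 to bit 0 and +1 to bit 1."""
    value = 0
    for i, w in enumerate(weights):
        if w not in (-1, 1):
            raise ValueError("quantized weights must be -1 or +1")
        if w == 1:
            value |= 1 << i   # assumed layout: weight i occupies bit i
    return value


def decode_row(value, n):
    """Recover the n original -1/+1 weights from the packed integer."""
    return [1 if (value >> i) & 1 else -1 for i in range(n)]


row = [1, -1, 1, 1, -1, -1, 1, -1]
packed = encode_row(row)   # eight weights stored in one byte-sized value
```

A packed value of this kind can also serve directly as the index into a lookup table built over the same group of input values.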

Meanwhile, the quantization of the neural network model may be performed by the processor 120 of the electronic device 100, or by an external device (e.g., a server). When the quantization of the neural network model is performed by an external device, the processor 120 may receive weight data including the quantized weight values from the external device and store it in the memory 110. The quantization method of the neural network model is described below.

Such a neural network model may be a model based on a neural network. As an example, the neural network model may be a model based on a recurrent neural network (RNN). Here, an RNN is a type of deep learning model for learning data that changes over time, such as time series data.

However, the neural network model is not limited thereto, and may be a model based on various networks such as a convolutional neural network (CNN), a deep neural network (DNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), or a bidirectional recurrent deep neural network (BRDNN). Alternatively, the memory 110 may store a model generated based on rules rather than a model trained through an artificial intelligence algorithm, and there is no particular limitation on the model stored in the memory 110.

The processor 120 controls the overall operation of the electronic device 100. To this end, the processor 120 may be composed of one or a plurality of processors. Here, the one or more processors may be a general-purpose processor such as a central processing unit (CPU), a graphics-dedicated processor such as a graphics processing unit (GPU), or an artificial-intelligence-dedicated processor such as a neural network processing unit (NPU). In addition, the processor 120 may be a system on chip (SoC) (e.g., an on-device AI chip), a large scale integration (LSI), or a field programmable gate array (FPGA).

The processor 120 may quantize the weight values of the neural network model. Specifically, when quantizing a weight value into k bits, the processor 120 may quantize the weight values of the neural network model through various quantization algorithms that satisfy Equation 1 below.

[Equation 1]

(Here, w is a weight value before quantization, a is a scaling factor, and b is a quantized weight value, which may be -1 or +1.)
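The formula for Equation 1 is not reproduced in this text. A form that is consistent with the symbol definitions above, assuming the common k-bit binary-coding decomposition of a weight vector of length n, would be the following; the subscripts, the approximation sign, and the vector dimensions are assumptions rather than the wording of the disclosure.

```latex
w \approx \sum_{i=1}^{k} a_i \, b_i ,
\qquad a_i \in \mathbb{R}, \quad b_i \in \{-1, +1\}^{n}
```

Under this reading, k = 1 gives a single scaling factor and a single sign vector, and each additional bit adds one more scaled sign vector to the approximation.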

As an example, the processor 120 may quantize the weight values through a greedy algorithm. In this case, the processor 120 may obtain the scaling factor and the quantized weight value for the case of k=1 in Equation 1 above based on Equation 2 below.

[Equation 2]

(Here, w is a weight value before quantization, a* is the scaling factor for k=1, b* is the quantized weight value for k=1, which is -1 or +1, and n is an integer of 1 or more.)

The processor 120 may then obtain the scaling factor and the quantized weight value for the i-th bit (1 < i ≤ k) by repeatedly evaluating Equation 3 below. That is, the processor 120 may use r, the difference between the weight value before quantization and the weight value quantized for k=1, to obtain the scaling factor and the quantized weight value for the i-th bit (1 < i ≤ k).

[Equation 3]

(Here, w is a weight value before quantization, a is a scaling factor, b is a quantized weight value, which is -1 or +1, and r is the difference between the weight value before quantization and the weight value quantized for k=1.)
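Because the bodies of Equations 2 and 3 are not reproduced in this text, the following Python sketch shows only one common greedy scheme that matches the description: the sign of each value gives the quantized weight, the mean absolute value gives the scaling factor, and each further bit quantizes what remains. The function names and the choice of updating the residual after every bit are assumptions, not the formulas of the disclosure.

```python
def greedy_quantize(w, k):
    """Approximate the vector w by k scaled sign vectors, one bit at a time."""
    residual = list(w)
    scales, codes = [], []
    for _ in range(k):
        b = [1 if r >= 0 else -1 for r in residual]          # quantized values, -1 or +1
        a = sum(abs(r) for r in residual) / len(residual)    # scaling factor for this bit
        scales.append(a)
        codes.append(b)
        # Assumed update: the next bit quantizes what this bit left unexplained.
        residual = [r - a * bi for r, bi in zip(residual, b)]
    return scales, codes


def dequantize(scales, codes):
    """Rebuild the approximation as the sum of scaled sign vectors."""
    n = len(codes[0])
    return [sum(a * b[j] for a, b in zip(scales, codes)) for j in range(n)]


weights = [0.5, -1.5, 1.0, -1.0]
scales, codes = greedy_quantize(weights, k=2)
approx = dequantize(scales, codes)
```

In this sketch each extra bit can only keep or reduce the squared error of the approximation, which is the usual motivation for a greedy, bit-by-bit procedure.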

Accordingly, the electronic device 100 may store, in the memory 110, weight data including the scaling factors and the weight values quantized to -1 or 1. Although an embodiment that quantizes through a greedy algorithm has been described above, the method of quantizing the weight values is not particularly limited. For example, quantization may be performed through various algorithms, such as unitary quantization, adaptive quantization, uniform quantization, or supervised iterative quantization.

The processor 120 may obtain output data for input data based on the quantized weight values of the neural network model. Here, the input data may be text, an image, a user voice, or the like. For example, the text may be text entered through an input unit (not shown) of the electronic device 100 such as a keyboard or a touchpad, and the image may be an image captured by a camera of the electronic device 100. The user voice may be a user voice input to a microphone of the electronic device 100.

The output data may differ according to the input data and/or the type of the neural network model. That is, the output data may differ depending on which input data is input to which neural network model. For example, when the neural network model of the present disclosure is a model for language translation, the processor 120 may obtain output data expressed in a second language for input data expressed in a first language. When the neural network model is a model for image analysis, the processor 120 may input an image as the input data of the neural network model and obtain information about an object detected in the image as the output data. When the neural network model is a model for voice recognition, the processor 120 may take a user voice as the input data and obtain text corresponding to the user voice as the output data. The output data described above are examples, and the types of output data of the present disclosure are not limited thereto.

To this end, when input data is received, the processor 120 may express the input data as a matrix (or a vector or a tensor) including a plurality of input values. The method of expressing the input data as a matrix (or a vector or a tensor) may differ according to the kind and type of the input data. For example, when text (or text converted from a user voice) is input as the input data, the processor 120 may express the text as a vector through one-hot encoding or through word embedding. One-hot encoding expresses only the value at the index of a specific word as 1 and the values at the remaining indices as 0, whereas word embedding expresses a word as real numbers in a vector whose dimension (e.g., 128 dimensions) is set by the user. Word2Vec, FastText, GloVe, and the like may be used as the word embedding method. When an image is input as the input data, the processor 120 may express the pixels of the image as a matrix. For example, the processor 120 may express each pixel of the image as a value from 0 to 255 for each RGB color, or may express the image as a matrix of values obtained by dividing the values from 0 to 255 by a preset value (e.g., 255).
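The two simplest encodings mentioned above can be sketched as follows. This is only an illustration: the function names `one_hot` and `normalize_pixels`, the vocabulary size, and the sample pixel values are hypothetical.

```python
import numpy as np

def one_hot(index, vocab_size):
    """Vector with 1 at the word's index and 0 at every other index."""
    vec = np.zeros(vocab_size)
    vec[index] = 1.0
    return vec

def normalize_pixels(image, divisor=255.0):
    """Divide 0..255 channel values by a preset value (255 here)."""
    return np.asarray(image, dtype=float) / divisor

word_vector = one_hot(index=2, vocab_size=5)
# A hypothetical 1x2 RGB image with channel values in 0..255.
pixels = normalize_pixels([[[0, 128, 255], [255, 255, 0]]])
print(word_vector, pixels.shape)
```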

Based on the quantized weight values and the input values of the input data, the processor 120 may obtain at least one piece of intermediate data for the input data, and may obtain output data from the at least one piece of intermediate data.

Here, unlike a conventional electronic device, which obtains output data for input data by performing a matmul operation on a plurality of quantized weight values and a plurality of input values, the present disclosure may obtain the output data for the input data by using a lookup table.

This is to prevent the latency problem and the memory overload that occur when a large number of matmul operations are performed. The lookup table of the present disclosure is first described in detail with reference to FIGS. 2A and 2B.

FIGS. 2A and 2B are diagrams for explaining a lookup table according to an embodiment of the present disclosure.

As described above, the processor 120 may obtain input data in the form of a matrix (or a vector or a tensor) including a plurality of input values. For example, the processor 120 may obtain a 4×3 matrix including a plurality of input values, as shown in FIG. 2A. Hereinafter, a matrix including a plurality of input values is referred to as an input matrix (or a matrix corresponding to the input data).

The processor 120 may generate a lookup table based on the input values of the input data and binary data. Here, the binary data may be data composed of n bit values, each having a value of 0 or 1, and there may be 2^n pieces of binary data. For example, 2-bit binary data is composed of two bit values of 0 or 1 and may be one of 00, 01, 10, and 11. As another example, 4-bit binary data is composed of four bit values of 0 or 1 and may be one of 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, and 1111.

Specifically, the processor 120 may obtain n input values from each column of the input matrix, and may obtain operation data based on the bit values of the binary data and the obtained n input values. As described above, each bit value of the binary data may be 0 or 1; the processor 120 may multiply the corresponding input value by -1 when the bit value is 0 and by 1 when the bit value is 1, and sum the results. The processor 120 may then generate the lookup table by matching the obtained operation data to each piece of binary data.

For example, referring to FIGS. 2A and 2B, when n = 2, the processor 120 may obtain the input values 0.03 and -0.17 of the first and second rows from the first column of the input matrix. The processor 120 may calculate -(0.03)-(-0.17) to obtain the operation value 0.14 and match it to the binary data 00. Similarly, the processor 120 may calculate -(0.03)+(-0.17) to obtain the operation value -0.20 and match it to the binary data 01, calculate (0.03)-(-0.17) to obtain the operation value 0.20 and match it to the binary data 10, and calculate (0.03)+(-0.17) to obtain the operation value -0.14 and match it to the binary data 11.

The processor 120 may then obtain the input values 0.20 and -0.17 of the third and fourth rows from the first column. The processor 120 may calculate -(0.20)-(-0.17) to obtain the operation value -0.03 and match it to the binary data 00. Similarly, the processor 120 may calculate -(0.20)+(-0.17) to obtain the operation value -0.37 and match it to the binary data 01, calculate (0.20)-(-0.17) to obtain the operation value 0.37 and match it to the binary data 10, and calculate (0.20)+(-0.17) to obtain the operation value 0.03 and match it to the binary data 11.

In a similar manner, the processor 120 may obtain operation values for the input values of the second column and of the third column of the input matrix and match the operation values to the binary data. The case of n = 2 described above is one embodiment, and n may be changed according to a user setting.
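The table construction for the first pair of input values (0.03 and -0.17) can be sketched as follows. The helper name `build_lut` and the use of a dictionary keyed by bit strings are illustrative assumptions; the sign convention (bit 0 means multiply by -1, bit 1 means multiply by +1, leftmost bit pairs with the first input value) follows the description above.

```python
from itertools import product

def build_lut(values):
    """Map every n-bit pattern to the signed sum of `values`.

    Bit 0 contributes -value and bit 1 contributes +value; the leftmost
    bit of the pattern pairs with the first input value.
    """
    lut = {}
    for bits in product((0, 1), repeat=len(values)):
        key = "".join(str(bit) for bit in bits)
        lut[key] = sum(v if bit else -v for bit, v in zip(bits, values))
    return lut

# First and second rows of the first column of the input matrix.
lut = build_lut([0.03, -0.17])
for key, value in lut.items():
    print(key, round(value, 2))
```

A table built this way has 2^n entries, so n trades table size against the number of lookups needed per output value.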

Accordingly, the processor 120 may generate at least one lookup table for each column of the input matrix. As described above, when the input matrix is a 4×3 matrix and n = 2, the processor 120 may generate two lookup tables for each column of the input matrix, that is, six lookup tables in total, as shown in FIG. 2B. If n = 4, the processor 120 may generate one lookup table for each column of the input matrix, that is, three lookup tables in total.
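The table count above is (rows / n) × columns, which a short sketch can confirm. The function name `build_luts`, the `(column, first_row)` key, and the values in the second and third columns of the sample matrix are hypothetical placeholders; only the first column uses values given in the description.

```python
from itertools import product

def build_luts(matrix, n):
    """Return {(column, first_row): {bit_pattern: signed_sum}} for a matrix."""
    rows, cols = len(matrix), len(matrix[0])
    if rows % n:
        raise ValueError("this sketch assumes the row count is a multiple of n")
    luts = {}
    for col in range(cols):
        for start in range(0, rows, n):
            group = [matrix[r][col] for r in range(start, start + n)]
            luts[(col, start)] = {
                bits: sum(v if bit else -v for bit, v in zip(bits, group))
                for bits in product((0, 1), repeat=n)
            }
    return luts

# 4x3 input matrix; columns 2 and 3 hold placeholder values.
X = [[0.03, 0.10, -0.40],
     [-0.17, 0.25, 0.05],
     [0.20, -0.30, 0.15],
     [-0.17, 0.45, -0.10]]
print(len(build_luts(X, 2)), len(build_luts(X, 4)))
```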

FIGS. 3A to 3E are diagrams for explaining the computation of a neural network model using a lookup table according to an embodiment of the present disclosure.

As described above, the processor 120 may obtain the output data based on the quantized weight values and the input values of the input data. Specifically, the processor 120 may obtain the output data for input data X based on an operation between weight data W (which may include a scaling factor A and quantized weight values B) and the input data X. For example, when the weight values of the weight data are quantized to 3 bits, the processor 120 may obtain the output data for the input data X based on Equation 4 below.

[Equation 4]

Equation 4 may be expressed in matrix form as follows.

For example, the scaling factor A0 and the quantized weights B0 for the case of k=1, and the input data X, may have the values shown in FIG. 3A. Here, unlike a conventional electronic device, which obtains the operation value of B*X through a matmul operation, the present disclosure may obtain the operation value of B*X by referring to a lookup table. For convenience of description, a method of obtaining the operation value of B0*X is described below, but the same technical idea may also be applied to obtaining the operation value of Bn*X, such as B1*X and B2*X.

Referring to FIG. 3B, the output matrix including the operation values of B0*X may have output values y1 to y109. To obtain the output values, the processor 120 may determine, from among the plurality of lookup tables generated for each column of the input matrix, the lookup table corresponding to each column of the output matrix. Specifically, the processor 120 may determine a lookup table generated from the column of the input matrix that has the same column index as a column of the output matrix to be the lookup table for obtaining the output values of that column. For example, referring to FIG. 3B, the processor 120 may determine, from among the plurality of lookup tables, the first lookup table 311 and the second lookup table 312, which are generated based on the input values of the first column of the input matrix, to be the lookup tables for obtaining the output values of the first column of the output matrix.

The processor 120 may then identify, in each row of a weight matrix including weight values of 0 or 1, n weight values corresponding to the n input values. That is, when the lookup table has been generated by obtaining n input values from each column of the input matrix, the processor 120 may identify n weight values corresponding to those n input values in each row of the weight matrix. For example, when n=2 as described above, the processor 120 may identify, for each row of the matrix including the weight values (hereinafter referred to as a weight matrix or a matrix corresponding to the weight data), two weight values corresponding to two input values. That is, in FIG. 3C, the processor 120 may identify 1 and 0, and 0 and 1, in the first row of the weight matrix, identify 0 and 1, and 1 and 0, in the second row, and similarly identify pairs of weight values in the remaining rows.

The processor 120 may then identify, from among the binary data, the binary data corresponding to the identified weight values. Here, the identified binary data is data whose bit values are identical to the weight values. For example, as shown in FIG. 3C, when the identified weight values are 1 and 0, the corresponding binary data may be 10, and when the identified weight values are 0 and 1, the corresponding binary data may be 01.

The processor 120 may then obtain, from the lookup table, the operation value corresponding to the identified binary data.

To this end, the processor 120 may determine, from among a plurality of lookup tables (e.g., the first and second lookup tables 311 and 312 described above), the lookup table that includes the operation values corresponding to the identified weight values. Specifically, when the identified weight values are values included in column k of the weight matrix, the processor 120 may determine, from among the plurality of lookup tables, the lookup table generated based on the input values of row k of the input matrix to be the lookup table that includes the operation values corresponding to the identified weight values. For example, referring to FIG. 3C, when the identified weight values are included in the first and second columns of the weight matrix, the processor 120 may select, from the first and second lookup tables 311 and 312, the first lookup table 311, which is generated based on the input values of the first and second rows of the input matrix. When the identified weight values are included in the third and fourth columns of the weight matrix, the processor 120 may select the second lookup table 312, which is generated based on the input values of the third and fourth rows of the input matrix.

The processor 120 may then obtain the operation value 0.20 matched to the identified binary data 10 from the first lookup table 311, obtain the operation value -0.37 matched to the identified binary data 01 from the second lookup table 312, and obtain the value y1 of the output matrix by summing the operation values 0.20 and -0.37. Likewise, for the second row of the weight matrix, the processor 120 may obtain the operation value -0.20 matched to the identified binary data 01 from the first lookup table 311, obtain the operation value 0.37 matched to the identified binary data 10 from the second lookup table 312, and obtain the value y4 of the output matrix by summing the operation values -0.20 and 0.37. Output values for the n-th row of the weight matrix may be obtained in a similar manner.
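The lookup-and-sum step can be checked against an ordinary matrix product. The sketch below uses the first-column input values stated earlier (0.03, -0.17, 0.20, -0.17) and the first two weight rows from the description (1 0 0 1 and 0 1 1 0); the function names `build_lut` and `lut_matvec` are illustrative assumptions.

```python
import numpy as np
from itertools import product

def build_lut(values):
    """Map each bit tuple to the signed sum (bit 0 -> -value, bit 1 -> +value)."""
    return {bits: sum(v if bit else -v for bit, v in zip(bits, values))
            for bits in product((0, 1), repeat=len(values))}

def lut_matvec(weight_bits, x, n):
    """Compute (2 * weight_bits - 1) @ x with one lookup per group of n weights."""
    luts = [build_lut(x[s:s + n]) for s in range(0, len(x), n)]
    out = []
    for row in weight_bits:
        total = 0.0
        for g, lut in enumerate(luts):
            key = tuple(row[g * n:(g + 1) * n])  # n weight bits select one entry
            total += lut[key]
        out.append(total)
    return out

x = [0.03, -0.17, 0.20, -0.17]   # first column of the input matrix
weight_bits = [[1, 0, 0, 1],     # first row of the weight matrix
               [0, 1, 1, 0]]     # second row of the weight matrix
y = lut_matvec(weight_bits, x, n=2)
reference = (2 * np.array(weight_bits) - 1) @ np.array(x)
print(y, reference)
```

Each output value costs len(x) / n lookups and additions, rather than len(x) multiply-accumulate operations.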

The output values of the second column of the output matrix may be obtained from the third and fourth lookup tables 313 and 314, as shown in FIG. 3D, and the output values of the third column of the output matrix may be obtained from the fifth and sixth lookup tables 315 and 316, as shown in FIG. 3E. Since the technical idea described above applies here as well, a detailed description is omitted.

Thereafter, once the output values of the output matrix are obtained, the processor 120 may perform an operation between the output matrix and the scaling factor, thereby obtaining the result value of Equation 4 described above. The processor 120 may then output final output data using the obtained result value. As described above, the output data may be text in a language different from that of the text input as the input data when the neural network model is a model for language translation, or may be data including information about an object included in an image when the neural network model is a model for image analysis, but is not necessarily limited thereto.
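Equation 4 is not reproduced in this text, so the following is only a sketch of the scaling step under an assumed layout: one scaling factor per weight-matrix row and per bit plane, with the result formed as the sum over bit planes of A_i times (B_i X). The function name `quantized_matmul` and this layout are assumptions. An ordinary product stands in for the lookup-table pass that would produce each B_i X.

```python
import numpy as np

def quantized_matmul(scales, bit_planes, x):
    """Sum over bit planes of scale_i * (B_i @ x), with B_i in {-1, +1}.

    scales[i] has one entry per weight row (assumed layout).
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros((bit_planes[0].shape[0], x.shape[1]))
    for a, b in zip(scales, bit_planes):
        out += a[:, None] * (b @ x)  # b @ x is what the lookup tables supply
    return out

rng = np.random.default_rng(0)
planes = [rng.choice([-1.0, 1.0], size=(5, 4)) for _ in range(3)]  # 3-bit case
scales = [rng.random(5) for _ in range(3)]
x = rng.random((4, 3))
result = quantized_matmul(scales, planes, x)
print(result.shape)
```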

As described above, the present disclosure obtains the output values through a lookup table, without a matmul operation between the quantized weight values and the input values of the input data, and can thereby prevent the latency problem and the memory overload caused by a large amount of computation.

FIGS. 4A to 4F are diagrams for explaining the arithmetic expressions used to generate a lookup table according to an embodiment of the present disclosure.

As described above, the processor 120 may generate the lookup table by obtaining operation values through a plurality of arithmetic expressions based on the binary data and the n input values, and matching an operation value to each piece of binary data.

Here, the plurality of arithmetic expressions based on the binary data and the n input values may include the same intermediate expression. For example, when n=8, the plurality of arithmetic expressions based on 8-bit binary data and eight input values (x0 to x7) may be as shown in FIG. 4A. That is, when n=8, the expression for obtaining the operation value R0 corresponding to the binary data 00000000 is -x0-x1-x2-x3-x4-x5-x6-x7, the expression for obtaining the operation value R1 corresponding to the binary data 00000001 is -x0-x1-x2-x3-x4-x5-x6+x7, and there may similarly be a plurality of expressions for obtaining the operation values R2 to R255. In this case, when obtaining the operation values based on the plurality of arithmetic expressions, the processor 120 may use the result of an intermediate expression that the expressions have in common.

Specifically, referring to FIG. 4B, the expression -x0-x1-x2-x3-x4-x5-x6-x7 for obtaining R0 and the expression -x0-x1-x2-x3-x4-x5-x6+x7 for obtaining R1 include the same intermediate expression -x0-x1-x2-x3-x4-x5-x6. Referring to FIG. 4C, the expression -x0-x1-x2-x3-x4-x5-x6-x7 for obtaining R0 and the expression -x0-x1-x2-x3-x4-x5+x6-x7 for obtaining R2 include the same intermediate expression -x0-x1-x2-x3-x4-x5-x7, and the expression -x0-x1-x2-x3-x4-x5-x6+x7 for obtaining R1 and the expression -x0-x1-x2-x3-x4-x5+x6+x7 for obtaining R3 include the same intermediate expression -x0-x1-x2-x3-x4-x5+x7. As shown in FIGS. 4D and 4E, the plurality of arithmetic expressions for obtaining the operation values may include further groups of expressions that share the same intermediate expression.

In this case, among a plurality of arithmetic expressions having the same intermediate expression, the processor 120 may evaluate one expression based on the operation value of another. That is, as in the embodiment described above, when the expression -x0-x1-x2-x3-x4-x5-x6-x7 for obtaining R0 and the expression -x0-x1-x2-x3-x4-x5-x6+x7 for obtaining R1 include the same intermediate expression -x0-x1-x2-x3-x4-x5-x6, the processor 120 may obtain R0 through the expression -x0-x1-x2-x3-x4-x5-x6-x7 and obtain R1 by adding 2*x7 to the operation value of R0 (that is, R1=R0+2*x7). The processor 120 may obtain the values of R2 to R255 in a similar manner, and as a result may obtain the operation values through the plurality of expressions shown in FIG. 4F. Although the case of n=8 has been described here as an embodiment, the technical idea described above applies equally when n is another integer.
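The reuse of intermediate results can be sketched as a table fill in which every entry after R0 costs a single addition. The particular reuse order below (clear the lowest set bit of the index and add twice the matching input value) is one possible choice and is an assumption; the order shown in FIG. 4F may differ. The function names are hypothetical.

```python
import random

def build_lut_incremental(values):
    """Fill R[0 .. 2^n - 1], deriving each entry from an earlier one.

    R[0] negates every value. For any other index, clearing its lowest
    set bit gives an already computed entry that differs in exactly one
    sign, so R[index] = R[index without that bit] + 2 * value.
    """
    n = len(values)
    table = [0.0] * (1 << n)
    table[0] = -sum(values)
    for index in range(1, 1 << n):
        low_bit = index & -index             # lowest set bit of the index
        position = n - low_bit.bit_length()  # leftmost bit pairs with values[0]
        table[index] = table[index ^ low_bit] + 2 * values[position]
    return table

def build_lut_direct(values):
    """Reference: evaluate every signed sum from scratch."""
    n = len(values)
    return [sum(v if (index >> (n - 1 - j)) & 1 else -v
                for j, v in enumerate(values))
            for index in range(1 << n)]

random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(8)]  # x0 .. x7
R = build_lut_incremental(xs)
print(len(R), R[1] - R[0], 2 * xs[7])
```

For n = 8 this replaces seven additions per entry with one for all entries other than R0.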

As described above, when expressions include the same intermediate expression, one expression is evaluated using the operation value of another, so the present disclosure can greatly reduce the amount of computation the processor performs to generate the lookup table.

FIG. 5 is a diagram for explaining a method of computing a neural network model according to an embodiment of the present disclosure.

As described above, the processor 120 may generate a lookup table and perform the computation of the neural network model based on the lookup table. Here, the processor 120 may generate a plurality of lookup tables for all input values of the input data and perform the computation of the neural network model based on them. Alternatively, the processor 120 may generate lookup tables for some of the input values of the input data, perform part of the computation of the neural network model based on those lookup tables, then generate lookup tables for the remaining input values and perform the rest of the computation based on those lookup tables.

Specifically, the processor 120 may divide the input matrix into a first input matrix and a second input matrix at a predetermined row, and divide the weight matrix into a third weight matrix and a fourth weight matrix at a predetermined column. Here, the predetermined row may be row n/2 when the input matrix has n rows, and the predetermined column may be column n/2 when the weight matrix has n columns, but they are not necessarily limited thereto.

The processor 120 may then generate a plurality of lookup tables based on the input values of each column of the first input matrix and obtain, from those lookup tables, the operation data corresponding to each row of the third weight matrix. After that, the processor 120 may generate a plurality of lookup tables based on the input values of each column of the second input matrix and obtain, from those lookup tables, the operation data corresponding to each row of the fourth weight matrix. Since the technical idea described above applies to generating the lookup tables and obtaining the operation data from them, a detailed description is omitted.

For example, referring to FIG. 5, when the input matrix has 512 rows and the weight matrix has 512 columns, the processor 120 may divide the input matrix at row 256 into an input matrix X1 and an input matrix X2, and divide the weight matrix at column 256 into a weight matrix W1 and a weight matrix W2. The processor 120 may then generate a lookup table based on the input values of the input matrix X1 and obtain from it the operation values corresponding to the weight matrix W1, and afterwards generate a lookup table based on the input values of the input matrix X2 and obtain from it the operation values corresponding to the weight matrix W2. By obtaining the operation values part by part through the divided matrices in this way, the present disclosure can prevent memory overhead and use memory efficiently. In addition, when the processor 120 is implemented as a plurality of processors according to an embodiment, the operation values corresponding to the weight matrix W1 (from the lookup table based on the input matrix X1) and the operation values corresponding to the weight matrix W2 (from the lookup table based on the input matrix X2) may be obtained in parallel by the plurality of processors, which reduces the time required for the operation.
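The partitioning above rests on a simple identity: if W is split by column into W1 and W2, and X is split by row at the same index into X1 and X2, then W·X = W1·X1 + W2·X2. The following is a minimal Python sketch of that identity only. It uses a plain matrix product rather than lookup tables, and the function names, the `split` parameter, and the way the two partial results are summed are assumptions made for illustration, not details taken from the disclosure.

```python
def matmul(a, b):
    """Plain dense matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def split_matmul(w, x, split):
    """Compute W @ X as W1 @ X1 + W2 @ X2.

    W is split by column and X by row at the same index `split`
    (hypothetical parameter; the text uses 256 for a size of 512).
    """
    w1 = [row[:split] for row in w]
    w2 = [row[split:] for row in w]
    x1, x2 = x[:split], x[split:]
    p1 = matmul(w1, x1)   # first partial result, could run on one processor
    p2 = matmul(w2, x2)   # second partial result, could run on another
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(p1, p2)]

# Toy data: a 2x4 binary weight matrix and a 4x2 input matrix.
W = [[1, 0, 1, 1],
     [0, 1, 1, 0]]
X = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0],
     [7.0, 8.0]]
```

With this toy data, `split_matmul(W, X, 2)` and `matmul(W, X)` both give `[[13.0, 16.0], [8.0, 10.0]]`. Only one half of the input matrix needs to have its lookup tables resident at a time, which is the memory benefit the paragraph describes.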

FIG. 6 is a detailed block diagram illustrating an electronic device according to an embodiment of the present disclosure.

Referring to FIG. 6, the electronic device 100 according to an embodiment of the present disclosure may include a first memory 110, a processor 120, a lookup table generator (LUT generator) 130, a second memory 140, and a multiplier 150.

The first memory 110 may store an input matrix, a scaling factor for the operation of the neural network model, and a weight matrix. As described above, the input matrix may include a plurality of input values, and the weight matrix may include weight values quantized to 0 or 1.

The lookup table generator 130 may load the input matrix from the first memory 110. The lookup table generator 130 may then obtain, for each binary data, operation values for the input values of the input matrix. Specifically, when generating a lookup table for n-bit binary data, the lookup table generator 130 may obtain n input values from each column of the input matrix and obtain an operation value for each binary data based on the binary data and the n input values. The lookup table generator 130 may also store, matched to each lookup table, information on the column and information on the rows of the input matrix from which that lookup table was generated. The column information may be used to determine, from among the plurality of lookup tables generated for the columns of the input matrix, the lookup table corresponding to each column of the output matrix. The row information may be used to determine, from among the plurality of lookup tables corresponding to each column of the output matrix, the lookup table that contains the operation value corresponding to each column of the weight matrix.
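The table generation described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: a bit value of 1 is assumed to select an input value and a bit value of 0 to skip it, the most significant bit is assumed to correspond to the first of the n input values, the number of rows is assumed to be a multiple of n, and the names `build_lut` and `build_luts` and the `(column, row_block)` dictionary key are hypothetical stand-ins for the column and row information stored with each table.

```python
def build_lut(values):
    """Return 2**n partial sums, indexed by an n-bit pattern.

    Assumption: bit value 1 selects an input value, bit value 0 skips it,
    and the most significant bit corresponds to values[0].
    """
    n = len(values)
    table = []
    for pattern in range(1 << n):
        total = 0.0
        for i, v in enumerate(values):
            if (pattern >> (n - 1 - i)) & 1:
                total += v
        table.append(total)
    return table

def build_luts(x, n):
    """Build one table per column and per block of n rows of input matrix x.

    The (column, row_block) key is a hypothetical way of keeping the
    column and row information matched to each table.
    """
    luts = {}
    for col in range(len(x[0])):
        for start in range(0, len(x), n):
            chunk = [x[r][col] for r in range(start, start + n)]
            luts[(col, start // n)] = build_lut(chunk)
    return luts
```

For instance, `build_lut([1.0, 2.0])` returns `[0.0, 2.0, 1.0, 3.0]`, one entry for each of the patterns 00, 01, 10 and 11. A different weight encoding (for example, one in which a 0 bit stands for a negative weight) would change the formula for each entry but not the indexing scheme.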

Meanwhile, according to an embodiment, the lookup table generator 130 may generate a lookup table based on 8-bit binary data. Specifically, the lookup table generator 130 may obtain 8 input values from each column of the input matrix and obtain operation data for each binary data based on the 8-bit binary data and the 8 input values. This takes into account that a processor 120 such as a CPU processes data in units of bytes. Because no separate operation, such as a shift, is then needed to handle bit-level data as byte-unit data, the present disclosure can prevent the processor from being overloaded.
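The byte-aligned case can be illustrated with a short Python sketch: if eight binary weights are packed into one byte ahead of time, the byte value itself is the index into a 256-entry table, so no shifting or masking is needed at lookup time. The helper names `pack_bits` and `build_lut8`, the most-significant-bit-first packing order, and the reading of a 1 bit as "add this input" are assumptions for illustration. The table is filled by reusing an already computed entry for each new one, which is one possible way to build it and is not claimed to be the disclosed procedure.

```python
def pack_bits(bits):
    """Pack 0/1 weights (length a multiple of 8) into bytes, MSB first."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

def build_lut8(values):
    """Return a 256-entry table of partial sums for 8 input values."""
    table = [0.0] * 256
    for pattern in range(1, 256):
        low = pattern & -pattern            # lowest set bit of the pattern
        idx = 7 - (low.bit_length() - 1)    # MSB-first position of that bit
        # Reuse the entry for the pattern without its lowest set bit.
        table[pattern] = table[pattern ^ low] + values[idx]
    return table

inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
weights = [1, 0, 0, 0, 0, 0, 1, 1]
packed = pack_bits(weights)            # a single byte with value 131
partial = build_lut8(inputs)[packed[0]]  # direct index by the byte value
```

Here `partial` is `16.0`, the sum of the first, seventh and eighth input values.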

The second memory 140 may store at least one lookup table. The second memory 140 may be a scratchpad memory (SPM) that temporarily stores data such as a lookup table, but is not necessarily limited thereto.

The processor 120 may load the weight matrix from the first memory 110 and load the lookup table from the second memory 140. The processor 120 may then obtain, from the lookup table, the operation values for the weight values of the weight matrix, accumulate them in an accumulator, and obtain the output values of the output matrix (that is, weight matrix B * input matrix X) by summing the operation values accumulated in the accumulator. The processor 120 may store information on the output values of the output matrix in the first memory 110. Thereafter, the multiplier 150 may load the output values and the scaling factor stored in the first memory 110 and multiply the output values by the scaling factor.
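The lookup, accumulation and scaling stages described above can be sketched end to end for a single input column. This is a minimal Python illustration under assumptions: the weights are 0/1 values where 1 means "add the input", the scaling factor is a single scalar, the column length is a multiple of n, and `lut_matvec` is a hypothetical name.

```python
def lut_matvec(weight_rows, x_col, n, scale):
    """Compute scale * (B @ x) for a binary matrix B using lookup tables."""
    # One table per block of n inputs: entry p is the sum of the inputs
    # selected by the set bits of p (most significant bit = first input).
    tables = []
    for start in range(0, len(x_col), n):
        chunk = x_col[start:start + n]
        tables.append([sum(v for i, v in enumerate(chunk)
                           if (p >> (n - 1 - i)) & 1)
                       for p in range(1 << n)])
    outputs = []
    for row in weight_rows:
        acc = 0.0                                   # accumulator
        for blk, table in enumerate(tables):
            key = 0
            for bit in row[blk * n:(blk + 1) * n]:  # n weights -> n-bit key
                key = (key << 1) | bit
            acc += table[key]                       # one lookup per block
        outputs.append(acc)
    return [scale * y for y in outputs]             # multiplier stage
```

For example, `lut_matvec([[1, 0, 1, 1], [0, 1, 1, 0]], [1.0, 3.0, 5.0, 7.0], 2, 0.5)` returns `[6.5, 4.0]`. Each table is built once per input column and then shared by every weight row, which is where the saving over a direct multiply-accumulate comes from.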

Meanwhile, although FIG. 6 shows the first memory 110 and the second memory 140 as separate components, the first memory 110 and the second memory 140 may be a single memory. The first memory 110 and the second memory 140 may be components separate from the processor 120, or may be included inside the processor 120. The lookup table generator 130 may be implemented as a separate operator for generating lookup tables, or its function may be built into the processor 120. Likewise, the multiplier 150 may be implemented as a separate operator that multiplies the operation values obtained from the lookup table by the scaling factor, or its function may be built into the processor 120.

The embodiments above generate the lookup table from the input values of the input data. However, the electronic device 100 according to an embodiment of the present disclosure may also apply the method in reverse and generate the lookup table from the weight values of the weight data. That is, the processor 120 may leave the weight values as they are (that is, treat them as real values) and quantize the input values of the input data. The processor 120 may then generate, based on the weight values and n-bit binary data, a lookup table in which an operation value is matched to each binary data, and obtain operation data corresponding to the input data from the lookup table. A lookup table based on weight data in this way may be used for the operation of a language model, in which the input data is small and the weight data is large. Conversely, the lookup table based on input data described above with reference to FIG. 2 and elsewhere may be used for the operation of an image model, in which the input data is large and the weight data is small.

FIG. 7 is a diagram illustrating a method of controlling an electronic device according to an embodiment of the present disclosure.

The electronic device 100 may obtain operation data based on input data and on binary data that differ from one another in at least one bit value (S710). Here, the binary data is data composed of bit values of 0 or 1, the input data is data including a plurality of input values, and the operation data is data including a plurality of operation values. Each of the input data and the operation data may be expressed as a matrix. Specifically, the electronic device 100 may obtain n input values from each column of the input matrix and obtain operation data for each binary data based on the binary data and the n input values.

The electronic device 100 may then generate a lookup table in which the operation data is matched to the binary data (S720) and obtain, from the lookup table, operation data corresponding to the weight data (S730). Here, the weight data may include a plurality of weight values in matrix form. Specifically, the electronic device 100 may identify, in each row of the weight matrix, the n weight values corresponding to the n input values, and identify, among the binary data, the binary data corresponding to the identified n weight values. The electronic device 100 may then obtain from the lookup table the operation data corresponding to the identified binary data and perform the operation of the neural network model based on the obtained operation data (S740).

Meanwhile, the methods according to the various embodiments of the present disclosure described above may be implemented as software or an application that can be installed on an existing electronic device.

In addition, a non-transitory computer readable medium storing a program that sequentially performs the method of controlling an electronic device according to the present disclosure may be provided.

A non-transitory readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only for a short time, such as a register, cache, or memory. Specifically, the various applications or programs described above may be stored and provided on a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB, memory card, or ROM.

Although preferred embodiments of the present invention have been illustrated and described above, the present invention is not limited to the specific embodiments described. Various modifications may be made by those of ordinary skill in the art to which the invention pertains without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or perspective of the present invention.

100: electronic device
110: memory
120: processor

Claims

An electronic device for performing an operation of a neural network model, the electronic device comprising:
a memory storing weight data including quantized weight values of the neural network model; and
a processor configured to obtain operation data based on input data and on binary data that differ from one another in at least one bit value, generate a lookup table in which the operation data is matched to the binary data, obtain operation data corresponding to the weight data from the lookup table, and perform the operation of the neural network model based on the obtained operation data.

The electronic device of claim 1, wherein
each of the binary data consists of n bit values,
the input data includes a plurality of input values in a matrix, and
the processor is configured to:
obtain n input values from each column of the matrix, and
obtain the operation data for each binary data based on the binary data and the n input values.

The electronic device of claim 2, wherein
the weight data includes a plurality of weight values in a matrix, and
the processor is configured to:
identify, in each row of the matrix, n weight values corresponding to the n input values, identify, among the binary data, the binary data corresponding to the identified n weight values, obtain operation data corresponding to the identified binary data from the lookup table, and perform the operation of the neural network model based on the obtained operation data.

The electronic device of claim 3, wherein
the processor is configured to:
determine, from among a plurality of lookup tables generated based on the input values of each column of the matrix, the lookup table corresponding to each column of an output matrix for the input data, and obtain the output values of each column of the output matrix from the respective lookup tables.

The electronic device of claim 3, wherein
the processor is configured to:
divide the matrix including the plurality of input values into a first matrix and a second matrix at a predetermined row, and divide the matrix including the plurality of weight values into a third matrix and a fourth matrix at a predetermined column;
generate a plurality of lookup tables based on the input values of each column of the first matrix, and obtain operation data corresponding to each row of the third matrix from the plurality of lookup tables; and
generate a plurality of lookup tables based on the input values of each column of the second matrix, and obtain operation data corresponding to each row of the fourth matrix from the plurality of lookup tables.

The electronic device of claim 2, wherein
the processor is configured to:
obtain 8 input values from each column of the matrix, and
obtain the operation data for each binary data based on the binary data and the 8 input values.

The electronic device of claim 2, wherein
the processor is configured to:
when, among a plurality of arithmetic expressions based on the binary data and the n input values, a first arithmetic expression and a second arithmetic expression share the same intermediate arithmetic expression, perform the operation of the second arithmetic expression based on an operation value of the first arithmetic expression.

A method of controlling an electronic device that performs an operation of a neural network model, the method comprising:
obtaining operation data based on input data and on binary data that differ from one another in at least one bit value;
generating a lookup table in which the operation data is matched to the binary data;
obtaining, from the lookup table, operation data corresponding to weight data including quantized weight values of the neural network model; and
performing the operation of the neural network model based on the obtained operation data.

The method of claim 8, wherein
each of the binary data consists of n bit values,
the input data includes a plurality of input values in a matrix, and
the obtaining of the operation data comprises:
obtaining n input values from each column of the matrix, and
obtaining the operation data for each binary data based on the binary data and the n input values.

The method of claim 9, wherein
the weight data includes a plurality of weight values in a matrix, and
the performing of the operation of the neural network model comprises:
identifying, in each row of the matrix, n weight values corresponding to the n input values, identifying, among the binary data, the binary data corresponding to the identified n weight values, obtaining operation data corresponding to the identified binary data from the lookup table, and performing the operation of the neural network model based on the obtained operation data.

The method of claim 10, wherein
the performing of the operation of the neural network model comprises:
determining, from among a plurality of lookup tables generated based on the input values of each column of the matrix, the lookup table corresponding to each column of an output matrix for the input data, and obtaining the output values of each column of the output matrix from the respective lookup tables.

The method of claim 10, wherein
the obtaining of the operation data comprises:
dividing the matrix including the plurality of input values into a first matrix and a second matrix at a predetermined row, and dividing the matrix including the plurality of weight values into a third matrix and a fourth matrix at a predetermined column;
generating a plurality of lookup tables based on the input values of each column of the first matrix, and obtaining operation data corresponding to each row of the third matrix from the plurality of lookup tables; and
generating a plurality of lookup tables based on the input values of each column of the second matrix, and obtaining operation data corresponding to each row of the fourth matrix from the plurality of lookup tables.

The method of claim 9, wherein
the obtaining of the operation data comprises:
obtaining 8 input values from each column of the matrix, and
obtaining the operation data for each binary data based on the binary data and the 8 input values.

The method of claim 9, wherein
the generating of the lookup table comprises:
when, among a plurality of arithmetic expressions based on the binary data and the n input values, a first arithmetic expression and a second arithmetic expression share the same intermediate arithmetic expression, performing the operation of the second arithmetic expression based on an operation value of the first arithmetic expression.