KR20220154902A

KR20220154902A - Electronic device and method for controlling electronic device

Info

Publication number: KR20220154902A
Application number: KR1020210062422A
Authority: KR
Inventors: 이동수; 권세중; 오대환
Original assignee: 삼성전자주식회사
Priority date: 2021-05-14
Filing date: 2021-05-14
Publication date: 2022-11-22
Also published as: WO2022239967A1

Abstract

Disclosed are an electronic device and a control method thereof. According to the present disclosure, the electronic device includes: a memory that stores scaling factor data of a neural network model; and a processor that acquires quantization data by quantizing weight data used in operations of the neural network model into a combination of sign data and scaling factor data. The processor acquires sign data corresponding to the weight data by generating random numbers with a preset seed using a random number generator, and trains the neural network model by updating the scaling factor data based on the acquired sign data and the scaling factor data pre-stored in the memory. Accordingly, the electronic device may acquire a neural network model with an improved compression ratio.

Description

Electronic device and its control method {ELECTRONIC DEVICE AND METHOD FOR CONTROLLING ELECTRONIC DEVICE}

The present disclosure relates to an electronic device and a control method thereof, and more particularly, to an electronic device that quantizes the weights of a neural network model and a control method thereof.

Recently, artificial intelligence systems that implement human-level intelligence have been researched and developed. Unlike conventional rule-based systems, an artificial intelligence system performs learning and inference based on a neural network model, and is used in a wide range of fields such as voice recognition, image recognition, and future prediction.

In particular, artificial intelligence systems that solve a given problem through a deep neural network based on deep learning have recently been developed.

A deep neural network is a neural network that includes multiple hidden layers between an input layer and an output layer, and refers to a model that implements artificial intelligence technology through operations between the weight values included in each layer and input data. A deep neural network generally includes a large number of weight values in order to derive accurate result values.

However, because a deep neural network includes a vast number of weight values, the resources required for computation keep growing. In addition, when the computation of a deep neural network is compressed or simplified, the accuracy of the computation may decrease.

The present disclosure has been made to solve the above-mentioned problems, and an object of the present disclosure is to provide an electronic device that makes a neural network model lightweight by quantizing the weights of the neural network model, and a control method thereof.

According to an embodiment of the present disclosure, an electronic device includes: a memory that stores scaling factor data corresponding to a neural network model; and a processor that obtains quantization data by quantizing weight data used in operations of the neural network model into a combination of sign data and the scaling factor data, wherein the processor generates random numbers with a preset seed using a random number generator to obtain sign data corresponding to the weight data, and trains the neural network model by updating the scaling factor data based on the obtained sign data and the scaling factor data pre-stored in the memory.

The processor may obtain output data by performing a forward pass on the neural network model based on the sign data and the scaling factor data stored in the memory, obtain a gradient value corresponding to the scaling factor data by performing a backward pass on the neural network model based on the output data, the sign data, and the scaling factor data, update the scaling factor data based on the gradient value, and store the updated scaling factor data in the memory.

The processor may generate first sign data corresponding to first weight data according to a preset seed using the random number generator, perform a multiplication operation on the first sign data and at least one first scaling factor data corresponding to the first sign data, generate second sign data corresponding to second weight data according to a preset seed using the random number generator, and perform a multiplication operation on the second sign data and at least one second scaling factor data corresponding to the second sign data.

The neural network model may be composed of a plurality of layers, the first sign data may be data corresponding to weights between a plurality of nodes included in a first layer and a plurality of nodes included in a second layer of the neural network model, and the second sign data may be data corresponding to weights between the plurality of nodes included in the second layer and a plurality of nodes included in a third layer of the neural network model.

The obtained sign data may include a random number corresponding to each weight included in the weight data, and each random number may be one of -1 and +1.

The processor may normalize the obtained sign data and train the neural network model by performing an operation on the normalized sign data and the scaling factor data.

The obtained sign data may not be stored in the memory.

When the neural network model has been trained, the processor may generate the sign data using the random number generator based on the preset seed, and generate output data by performing a forward pass on the trained neural network model using the updated scaling factor data stored in the memory and the generated sign data.

Meanwhile, according to another embodiment of the present disclosure, a control method of an electronic device including a memory that stores scaling factor data corresponding to a neural network model includes: generating random numbers with a preset seed using a random number generator to obtain sign data corresponding to weight data used in operations of the neural network model; and training the neural network model by updating the scaling factor data based on the obtained sign data and the scaling factor data stored in the memory.

The training may include: obtaining output data by performing a forward pass on the neural network model based on the sign data and the scaling factor data stored in the memory; obtaining a gradient value corresponding to the scaling factor data by performing a backward pass on the neural network model based on the output data, the sign data, and the scaling factor data; updating the scaling factor data based on the gradient value; and storing the updated scaling factor data in the memory.

The obtaining of the output data may include: generating first sign data corresponding to first weight data according to a preset seed using the random number generator, and performing a multiplication operation on the first sign data and at least one first scaling factor data corresponding to the first sign data; and generating second sign data corresponding to second weight data according to a preset seed using the random number generator, and performing a multiplication operation on the second sign data and at least one second scaling factor data corresponding to the second sign data.

The obtained sign data may include a random number corresponding to each weight included in the weight data, and each random number may be one of -1 and +1.

The training may include: normalizing the generated sign data; and training the neural network model by performing an operation on the normalized sign data and the scaling factor data.

The obtained sign data may not be stored in the memory.

The method may further include: when the neural network model has been trained, generating the sign data using the random number generator based on the preset seed; and generating output data by performing a forward pass on the trained neural network model using the updated scaling factor data stored in the memory and the generated sign data.

As described above, according to various embodiments of the present disclosure, an electronic device may obtain a neural network model with an improved compression ratio.

FIG. 1 is a block diagram briefly illustrating the configuration of an electronic device according to an embodiment of the present disclosure.
FIG. 2 is a diagram illustrating quantization of weight data into sign data and scaling factor data according to the present disclosure.
FIG. 3 is a diagram for explaining a method of generating sign data through a random number generator according to the present disclosure.
FIG. 4 is a diagram for explaining sign data generated by the random number generator according to a seed, according to the present disclosure.
FIG. 5 is a diagram for explaining a process of updating scaling factor data through forward propagation and backpropagation according to the present disclosure.
FIG. 6A is a diagram for explaining a uniform quantization method.
FIG. 6B is a diagram for explaining a non-uniform quantization method.
FIG. 7 is a diagram for explaining an embodiment in which a random number generator generates a plurality of sign data, according to an embodiment of the present disclosure.
FIG. 8 is a diagram for explaining a control method of an electronic device according to the present disclosure.
FIG. 9 is a block diagram illustrating in detail the configuration of an electronic device according to the present disclosure.

The present disclosure relates to an electronic device that obtains a lightweight neural network model by quantizing the weight data of the neural network model to obtain quantization data, and a control method thereof.

By quantizing weight data into sign data and scaling factor data, the electronic device of the present disclosure can reduce the floating-point multiplication operations required to perform operations between the weight data and input data.

In addition, the electronic device may obtain a neural network model with an improved compression ratio by not storing the sign data and instead composing the sign data from a combination of random numbers generated by the random number generator according to the present disclosure.

Hereinafter, the present disclosure will be described in detail with reference to the drawings.

FIG. 1 is a block diagram briefly illustrating the configuration of an electronic device 100 according to an embodiment of the present disclosure. As shown in FIG. 1, the electronic device 100 may include a memory 110 and a processor 120. However, the configuration shown in FIG. 1 is an example for implementing embodiments of the present disclosure, and appropriate hardware and software components that are obvious to those skilled in the art may additionally be included in the electronic device 100.

Meanwhile, in describing the present disclosure, the electronic device 100 is a device that trains or compresses a neural network model (or artificial intelligence model), or that obtains output data for input data using a neural network model. For example, the electronic device 100 may be implemented as a desktop PC, a laptop, a smartphone, a tablet PC, a server, or the like.

In addition, the various operations performed by the electronic device 100 may be performed by a system in which a cloud computing environment is established. For example, such a system may use the quantization data of a neural network model to perform operations between the quantization data and input data.

The memory 110 may store commands or data related to at least one other component of the electronic device 100. The memory 110 is accessed by the processor 120, which may read, write, modify, delete, and update data in it.

In the present disclosure, the term memory may include the memory 110, a ROM (not shown) or RAM (not shown) in the processor 120, or a memory card (not shown) mounted in the electronic device 100 (e.g., a micro SD card or a memory stick). In addition, the memory 110 may store programs and data for composing various screens to be displayed in the display area of a display.

The memory 110 may include a non-volatile memory, which retains stored information even when the power supply is interrupted, and a volatile memory, which requires a continuous power supply to retain stored information. For example, the non-volatile memory may be implemented as at least one of OTPROM (one-time programmable ROM), PROM (programmable ROM), EPROM (erasable and programmable ROM), EEPROM (electrically erasable and programmable ROM), mask ROM, or flash ROM, and the volatile memory may be implemented as at least one of DRAM (dynamic RAM), SRAM (static RAM), or SDRAM (synchronous dynamic RAM).

The memory 110 may store weight data used in operations of the neural network model. According to the present disclosure, however, the memory 110 may store only the scaling factor data among the quantization data obtained by quantizing the weight data.

The weight data may include a plurality of weight values. Each weight value may be implemented as an n-bit floating-point number (n being a natural number greater than or equal to 1). For example, the weight data may be implemented as 32-bit floating-point values. The weight data may be expressed as at least one of a vector, a matrix, or a tensor.

The memory 110 may store quantization data in which the weight data is quantized into a combination of sign data and scaling factor data. Depending on the format of the weight data, the quantization data may be expressed as at least one of a vector, a matrix, or a tensor.

The sign data may include sign values of 1 or -1, which determine only the sign of the scaling factor they are multiplied with, without changing its magnitude. Like the weight data, the scaling factor data may be expressed in floating-point form (e.g., 32-bit floating point). How the weight data is quantized is described below.

The memory 110 may store various types of input data. For example, the memory 110 may store voice data input through a microphone, and image data or text data input through an input unit (e.g., a camera or a keyboard). The input data stored in the memory 110 may include data received from an external device.

The processor 120 may be electrically connected to the memory 110 to control the overall operations and functions of the electronic device 100. The processor 120 may be composed of one or a plurality of processors for controlling the operation of the electronic device 100.

The processor 120 may load data needed to perform various operations from the non-volatile memory into the volatile memory. Loading refers to reading data stored in the non-volatile memory into the volatile memory and storing it there so that the processor 120 can access it.

The volatile memory may be implemented as a component included in the processor 120, but this is only one embodiment, and the volatile memory may instead be implemented as a component separate from the processor 120.

The processor 120 may obtain quantization data by quantizing the weight data. Quantizing a weight means simplifying the unit of the weight or expressing it in a different form so that the weight can be used efficiently.

For example, the processor 120 may obtain quantization data by performing binary-coding quantization on the weight values included in the weight data, and may store the obtained quantization data in the memory 110. Performing binary-coding quantization on weight values means quantizing each weight value into a combination of sign data and scaling factor data.

For example, performing binary-coding quantization on a weight value with k bits (k being a natural number greater than or equal to 1) means expressing the weight as the sum of k products of a sign value and a scaling factor.

When k is 3, the weight data may be quantized as shown in Equation 1 below. In Equation 1, W denotes the weight data, A denotes the scaling factor data, and B denotes the sign data.

[Equation 1]
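The body of Equation 1 is not reproduced in this text. As a hedged reconstruction from the surrounding description (a weight expressed as the sum of k products of a sign value and a scaling factor, with k = 3), it would take a form like the following. The subscript i, indexing the i-th scaling factor and the i-th sign component, is an assumed notation rather than one confirmed by the source.

```latex
W \approx \sum_{i=1}^{3} A_i B_i = A_1 B_1 + A_2 B_2 + A_3 B_3,
\qquad B_i \in \{-1, +1\}
```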

When performing binary-coding quantization on the weights, the processor 120 may determine the value of k based on the accuracy level required for the operations of the neural network model. Because a larger k represents the weights more precisely, k may be set to a larger value to increase the accuracy of the output data obtained through the neural network model.

Accordingly, when the accuracy level required for the operations of the neural network model is high, the processor 120 may set k to a high value. The required accuracy level may be determined by the type of input data, or may be determined by the user when designing the neural network model.

For example, when the input data is language data or voice data, which require high computational accuracy, the processor 120 may set k to 5, and when the input data is image data, which requires relatively low computational accuracy, the processor 120 may set k to 3. However, this is only one embodiment; a k value may be assigned to each type of input data, and the value may of course be freely changed by the user.

FIG. 2 is a diagram illustrating quantization of weight data (W) into sign data (b) and scaling factor data (α) according to the present disclosure.

Specifically, referring to FIG. 2, the weight data (W) may be expressed as a 4x4 array of floating-point values. In an embodiment of the present disclosure, the weight data (W) of FIG. 2 may be the weight data connecting adjacent nodes in the neural network model. For example, the weight data (W) of FIG. 2 may be the set of weights connecting the nodes included in a first layer of the neural network model to the nodes included in a second layer. That is, each floating-point value in the weight data (W) of FIG. 2 may be the weight value connecting one node in the first layer to one node in the second layer.

The quantization data may then be composed of sign data (b), in which each floating-point value in the weight data (W) is represented as -1 or +1, and scaling factor data (α) in floating-point form.

In an embodiment according to the present disclosure, each floating-point value in the weight data (W) of FIG. 2 may be implemented with 32 bits, each +1 or -1 in the sign data (b) may be implemented with 1 bit, and the scaling factor data (α) may be implemented with 32 bits. However, the present disclosure is not limited thereto and may be implemented in various ways depending on the data precision.
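The relationship between the sign data (b), the scaling factor data (α), and the approximated weights can be illustrated with a minimal Python sketch. It assumes one scaling factor shared by each row of a 4x4 sign matrix; that grouping, the numeric values, and the name `dequantize` are illustrative choices, not details taken from FIG. 2.

```python
def dequantize(signs, alphas):
    """Rebuild approximate weights: each row of signs is scaled by that row's alpha.

    Assumption for illustration: one scaling factor is shared per row.
    """
    return [[alpha * s for s in row] for row, alpha in zip(signs, alphas)]


# Sign data b: every entry is -1 or +1 (1 bit of information each).
signs = [
    [+1, -1, +1, +1],
    [-1, -1, +1, -1],
    [+1, +1, -1, -1],
    [-1, +1, +1, -1],
]

# Scaling factor data alpha: floating-point values (hypothetical numbers).
alphas = [0.5, 0.25, 1.0, 0.75]

w_hat = dequantize(signs, alphas)
for row in w_hat:
    print(row)
```

Every entry of the result has the magnitude of its row's scaling factor, and the sign data only decides whether that magnitude is added or subtracted.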

According to the present disclosure, the processor 120 may generate the sign data (b) of FIG. 2 using a random number generator. That is, the processor 120 may generate the sign data (b) corresponding to the weight data (W) by generating random numbers with a preset seed.

FIG. 3 is a diagram for explaining a method of generating sign data through a random number generator according to the present disclosure.

A random number generator is a component for generating random numbers, that is, random values within a given range. The random number generator according to the present disclosure may randomly generate one of -1 and +1, and the sign data may be composed of a combination of the random numbers generated by the random number generator.

Referring to FIG. 3, the random number generator may obtain a combination of a plurality of random numbers by randomly generating one of -1 and +1 each time. The obtained random numbers may then be arranged to compose the sign data.

The random number generator according to the present disclosure may generate sign data by generating a random number for each floating-point value of the weight data. The combination of random numbers generated is determined by the seed value, which is an initial setting, so when the seed value is fixed the random number generator produces the same combination of random numbers.
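The seed-determined behavior described above can be sketched with Python's standard `random` module. The disclosure does not name a specific generator, so the use of `random.Random` and the helper name `generate_sign_data` are assumptions made only for illustration.

```python
import random


def generate_sign_data(seed, count):
    """Return `count` values, each -1 or +1, from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [1 if rng.getrandbits(1) else -1 for _ in range(count)]


first = generate_sign_data(seed=42, count=16)
again = generate_sign_data(seed=42, count=16)  # same seed as `first`
other = generate_sign_data(seed=7, count=16)   # a different seed

print(first == again)  # a fixed seed reproduces the same sign data
print(first)
print(other)
```

Because the sequence depends only on the seed and the number of values drawn, the sign data can be regenerated at any time instead of being kept.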

FIG. 4 is a diagram for explaining sign data generated by the random number generator according to a seed, according to the present disclosure.

Referring to FIG. 4, the combination of random numbers generated by the random number generator is determined by the seed value. That is, the random number generator may generate first sign data 31 by generating random numbers with a first seed value, and generate second sign data 32 by generating random numbers with a second seed value.

The processor 120 according to the present disclosure can improve the compression ratio of the neural network model by not separately storing the sign data of the quantization data in memory and instead using sign data generated by the random number generator. That is, only the scaling factor data of the quantization data is stored in the memory 110, while the sign data is generated by the random number generator instead of being stored, so there is no need to keep the sign data in memory. That the sign data is not stored in the memory 110 may mean, for example, that the sign data is not stored in a memory such as DRAM. Accordingly, the processor 120 can improve the compression ratio that the quantization data achieves for the neural network model.
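As a rough sketch of what "storing only the scaling factor data" could look like in practice, the Python example below persists just a seed and a list of scaling factors and regenerates the signs whenever the weights are needed. The dictionary layout, the `group_size` parameter (how many weights share one scaling factor), and the function name are hypothetical and not taken from the disclosure.

```python
import random


def regenerate_weights(seed, alphas, group_size):
    """Rebuild approximate weights from stored scaling factors and a seed.

    Sign values are drawn on the fly and never persisted.
    """
    rng = random.Random(seed)
    weights = []
    for alpha in alphas:
        for _ in range(group_size):
            sign = 1 if rng.getrandbits(1) else -1
            weights.append(alpha * sign)
    return weights


# Everything that is kept for this (tiny, hypothetical) layer:
stored = {"seed": 1234, "alphas": [0.5, 0.25]}

w1 = regenerate_weights(stored["seed"], stored["alphas"], group_size=4)
w2 = regenerate_weights(stored["seed"], stored["alphas"], group_size=4)
print(w1 == w2)  # the same weights are recovered on every regeneration
```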

For example, assume that each floating-point value in the weight data (W) is 32 bits, each +1 or -1 in the sign data is 1 bit, the scaling factor data (α) is 32 bits, and 128 sign values share one scaling factor. In the conventional case, the compression rate is 1 - (128 * 1 bit + 1 * 32 bits) / (128 * 32 bits) = 96.09%. When the random number generator according to the present disclosure is used, however, the sign data does not need to be stored in the memory, so a compression rate of 1 - (128 * 0 bits + 1 * 32 bits) / (128 * 32 bits), approximately 99.22%, may be achieved.
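The arithmetic in this example can be checked with a short sketch. The function name `compression_ratio` and its parameter names are illustrative assumptions; the bit widths and the group size of 128 are taken from the example above.

```python
def compression_ratio(group_size, weight_bits=32, sign_bits=1, scale_bits=32):
    """Fraction of storage saved when one scaling factor is shared by a group."""
    original = group_size * weight_bits               # uncompressed weights
    compressed = group_size * sign_bits + scale_bits  # signs plus one scaling factor
    return 1 - compressed / original

stored = compression_ratio(128, sign_bits=1)     # sign data kept in memory
generated = compression_ratio(128, sign_bits=0)  # sign data regenerated from a seed

print(f"signs stored:    {stored:.2%}")
print(f"signs generated: {generated:.2%}")
```

The only difference between the two cases is the 128 bits of sign data per group, which the seeded generator makes unnecessary.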

FIG. 5 is a diagram for explaining a process in which scaling factor data is updated through forward propagation and back propagation according to the present disclosure.

According to the present disclosure, the processor 120 may generate sign data with an initially set seed value through the random number generator, and may train the neural network model using the generated sign data. In this case, the processor 120 may update the scaling factor data based on the sign data generated by the random number generator and the scaling factor data stored in the memory. That is, in conventional training of a quantized neural network model, both the sign data and the scaling factor data are updated. According to the present disclosure, only the scaling factor data is updated, so the resources and time required for the update may be saved.

Specifically, the processor 120 may obtain output data by performing a forward pass on the neural network model based on the sign data generated by the random number generator and the scaling factor data stored in the memory. The forward pass is the process of obtaining output data by performing operations between input data and weight data. That is, when input data is input to the neural network model, the processor 120 may perform a forward pass on the neural network model, carrying out operations on the input data, the sign data, and the scaling factor data to obtain output data.

The processor 120 may then perform a backward pass using the obtained output data, the scaling factor data stored in the memory, and the sign data generated by the random number generator, to obtain a gradient value corresponding to the scaling factor data. The backward pass is a process for minimizing the error of the output data, in which the weight data is updated in the direction that reduces that error. The gradient value reflects the degree of error in the output data and refers to the slope pointing toward the point at which that error is minimized. A smaller error in the output data may indicate better performance of the neural network model, and the gradient value may serve as a learning index for the performance of the neural network model.

The processor 120 may then update the scaling factor data based on the gradient value corresponding to the scaling factor data, and may store the updated scaling factor data in the memory 110.
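The forward pass, backward pass, and scaling factor update described above can be sketched for the simplest possible case. This is an illustration under stated assumptions, not the disclosed implementation: it assumes a single linear layer whose weights are α·s_i, a squared-error loss, and plain gradient descent. All function names and hyperparameters are hypothetical.

```python
import random

def make_signs(seed, count):
    """Regenerate the fixed sign data from the seed."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(count)]

def forward(alpha, signs, x):
    # y = sum_i (alpha * s_i) * x_i
    return alpha * sum(s * xi for s, xi in zip(signs, x))

def alpha_gradient(alpha, signs, x, target):
    # Loss L = 0.5 * (y - target)^2, so dL/dalpha = (y - target) * sum_i s_i * x_i
    projection = sum(s * xi for s, xi in zip(signs, x))
    return (alpha * projection - target) * projection

def train_alpha(seed, samples, alpha=0.1, lr=0.01, epochs=200):
    """Update only the scaling factor; the signs are never modified."""
    signs = make_signs(seed, len(samples[0][0]))
    for _ in range(epochs):
        for x, target in samples:
            alpha -= lr * alpha_gradient(alpha, signs, x, target)
    return alpha

# Usage: targets produced by a "true" scaling factor of 0.5 with the same signs.
signs = make_signs(7, 4)
x = [float(s) for s in signs]
samples = [(x, forward(0.5, signs, x))]
print(train_alpha(7, samples))
```

Only one scalar per group is learned and written back to memory, which is where the saving in update time and resources described in the text comes from.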

Conventionally, both the sign data and the scaling factor data of the quantization data are updated through back propagation. In the present disclosure, however, the sign data is generated through the random number generator, so the processor 120 may update only the scaling factor data. That is, because only the scaling factor data is updated, the time and resources conventionally spent on updating the sign data may be saved.

The processor 120 may also generate output data through the trained neural network model. Specifically, the processor 120 may generate sign data based on the seed value that was used when the neural network model was trained, and may generate output data by performing a forward pass using the trained scaling factor data and the generated sign data. To this end, in one embodiment, the processor 120 may store the seed value used during training of the neural network model in the memory 110.

In one embodiment of the present disclosure, the processor 120 may normalize the sign data generated by the random number generator. The processor 120 may generate the sign data through the random number generator so that the random numbers in the sign data are uniform. That is, the random number generator may generate random numbers such that the number of -1 values and the number of +1 values in the sign data are equal. However, the disclosure is not limited thereto, and the random number generator may generate random numbers such that the ratio of -1 values to +1 values is similar within a preset error range (for example, 5%).
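One way to obtain sign data with equal numbers of -1 and +1 is a seeded shuffle of a pre-balanced list. This is a hedged sketch of one possible approach, not the method of the disclosure; the function names are hypothetical, and interpreting the 5% error range as a deviation of the +1 share from 50% is an assumption.

```python
import random

def balanced_sign_data(seed, count):
    """Return `count` signs with equal numbers of +1 and -1 (count must be even)."""
    if count % 2:
        raise ValueError("count must be even for an exact balance")
    signs = [1] * (count // 2) + [-1] * (count // 2)
    random.Random(seed).shuffle(signs)  # seeded shuffle keeps the result reproducible
    return signs

def within_tolerance(signs, tolerance=0.05):
    """Check that the share of +1 values deviates from 50% by at most `tolerance`."""
    share = signs.count(1) / len(signs)
    return abs(share - 0.5) <= tolerance

signs = balanced_sign_data(seed=3, count=128)
print(sum(signs), within_tolerance(signs))
```

The exact-balance variant always satisfies the tolerance check; an unconstrained generator could instead be checked with `within_tolerance` and re-seeded if it falls outside the range.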

The processor 120 may also use a distribution convertor 500 to normalize the generated sign data so that it follows a Gaussian distribution.

There are broadly two methods of normalizing the sign data. The first is the uniform quantization method, which represents quantized sign data values at equal intervals. FIG. 6A is a diagram for explaining the uniform quantization method, and FIG. 6B is a diagram for explaining the non-uniform quantization method.

As shown in FIG. 6A, the uniform quantization method sets a minimum value and a maximum value for the sign data, divides the range between them into equal intervals, and represents each value as an integer. In this case, the difference between the maximum value and the minimum value of the sign data may be the value of the scaling factor data.
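Uniform quantization in general can be sketched as follows. This is a generic illustration of equal-interval quantization between a minimum and a maximum, not code from the disclosure; the function names, the bit width, and the sample values are assumptions.

```python
def uniform_quantize(values, bits):
    """Map values to integer codes on an equally spaced grid from min to max."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1                    # number of intervals
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step

def dequantize(codes, lo, step):
    """Recover approximate values from the integer codes."""
    return [lo + c * step for c in codes]

codes, lo, step = uniform_quantize([-1.0, -0.4, 0.2, 1.0], bits=2)
print(codes)
print(dequantize(codes, lo, step))
```

Every representable value is an integer multiple of `step` above the minimum, so neighboring levels are always the same distance apart.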

The second is the non-uniform quantization method, which uses a plurality of scaling factor data to represent quantized sign data values at unequal intervals. For example, when non-uniform quantization is expressed with only 2 bits as shown in FIG. 6B, one of the two bits represents the value of first scaling factor data (α1), and the other bit represents the value of second scaling factor data (α2). In this case, the four possible combinations allow the quantized sign data values to form a distribution with unequal intervals.
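The four combinations in the 2-bit case can be enumerated directly. The sketch below assumes that each bit selects the sign (+1 or -1) applied to its scaling factor, so that a value is ±α1 ± α2; this reading, the function name, and the example values α1 = 1.0 and α2 = 0.25 are assumptions for illustration.

```python
from itertools import product

def representable_levels(alphas):
    """All values sum_k alpha_k * b_k with each b_k in {-1, +1}, sorted ascending."""
    return sorted(
        sum(a * b for a, b in zip(alphas, bits))
        for bits in product((-1, 1), repeat=len(alphas))
    )

levels = representable_levels([1.0, 0.25])
gaps = [b - a for a, b in zip(levels, levels[1:])]
print(levels)  # four levels from two bits
print(gaps)    # the spacing between neighboring levels is not constant
```

With two different scaling factors the gaps between neighboring levels differ, which is what distinguishes this scheme from the equal-interval grid of uniform quantization.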

The processor 120 may normalize the sign data by either of the two methods described above using the distribution convertor 500, and may perform forward propagation and back propagation of the neural network model using the normalized sign data.

Meanwhile, functions related to artificial intelligence according to the present disclosure are operated through the processor 120 and the memory 110. The processor 120 may consist of one or more processors. The one or more processors may be a general-purpose processor such as a CPU (Central Processing Unit), an AP (Application Processor), or a DSP (Digital Signal Processor); a graphics-dedicated processor such as a GPU (Graphics Processing Unit) or a VPU (Vision Processing Unit); or an artificial-intelligence-dedicated processor such as an NPU (Neural Processing Unit).

The one or more processors 120 control input data to be processed according to a predefined operation rule or an artificial intelligence model stored in the memory 110. Alternatively, when the one or more processors are artificial-intelligence-dedicated processors, they may be designed with a hardware structure specialized for processing a specific artificial intelligence model.

The predefined operation rule or artificial intelligence model is characterized by being created through learning. Here, being created through learning means that a basic artificial intelligence model is trained by a learning algorithm using a large amount of training data, so that a predefined operation rule or artificial intelligence model set to perform a desired characteristic (or purpose) is created. Such learning may be performed in the device itself in which the artificial intelligence according to the present disclosure runs, or through a separate server and/or system.

Examples of learning algorithms include supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning, but are not limited to these examples.

The artificial intelligence model includes a plurality of artificial neural networks, and each artificial neural network may consist of a plurality of layers. Each of the neural network layers has a plurality of weight values and performs a neural network operation through an operation between the operation result of the previous layer and the plurality of weight values. The weight values of the neural network layers may be optimized by the learning result of the artificial intelligence model. For example, the weight values may be updated so that a loss value or a cost value obtained from the artificial intelligence model during the learning process is reduced or minimized.

Examples of artificial neural networks include a CNN (Convolutional Neural Network), a DNN (Deep Neural Network), an RNN (Recurrent Neural Network), an RBM (Restricted Boltzmann Machine), a DBN (Deep Belief Network), a BRDNN (Bidirectional Recurrent Deep Neural Network), and Deep Q-Networks. The artificial neural network in the present disclosure is not limited to these examples except where specified.

FIG. 7 is a diagram for explaining an embodiment in which the random number generator generates a plurality of sign data, according to an embodiment of the present disclosure.

Referring to FIG. 7, the random number generator 300 according to the present disclosure may generate a plurality of sign data corresponding to the number of layers of the neural network model. For example, the random number generator 300 may generate random numbers according to a first seed and sequentially generate first sign data 40-1, second sign data 40-2, through Nth sign data 40-N. Here, the first sign data 40-1 may correspond to weight values connecting a plurality of nodes in a first layer of the neural network model to a plurality of nodes in a second layer. The second sign data 40-2 may correspond to weight values connecting a plurality of nodes in the second layer to a plurality of nodes in a third layer. Likewise, the Nth sign data 40-N may correspond to weight values connecting a plurality of nodes in an Nth layer to a plurality of nodes in an (N+1)th layer.

In one embodiment, scaling factor data may be assigned to each of the plurality of sign data. That is, first scaling factor data 45-1 corresponding to the first sign data 40-1 may be operated on together with the first sign data 40-1, and Nth scaling factor data 45-N corresponding to the Nth sign data 40-N may be operated on together with the Nth sign data 40-N. However, this is only one embodiment, and a plurality of sign data may share one scaling factor data. Alternatively, a plurality of scaling factor data may be assigned to one sign data. That is, each of the first scaling factor data 45-1, the second scaling factor data 45-2, and the Nth scaling factor data 45-N may be implemented as a plurality of scaling factors.

Although FIG. 7 is described as generating the plurality of sign data from the first seed, the present disclosure is not limited thereto. For example, the first sign data 40-1 may be generated from a first seed, the second sign data 40-2 from a second seed, and the Nth sign data 40-N from an Nth seed.
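Both seeding options, a single seed consumed layer by layer or one seed per layer, can be sketched as follows. The function names, the layer shapes, and the use of Python's `random` module are assumptions made for illustration.

```python
import random

def signs_from_stream(rng, rows, cols):
    """Draw a rows x cols matrix of +1/-1 signs from an existing generator."""
    return [[1 if rng.getrandbits(1) else -1 for _ in range(cols)]
            for _ in range(rows)]

def layers_from_single_seed(seed, shapes):
    """One generator is seeded once and consumed sequentially, layer by layer."""
    rng = random.Random(seed)
    return [signs_from_stream(rng, r, c) for r, c in shapes]

def layers_from_per_layer_seeds(seeds, shapes):
    """Each layer gets its own seed, so any layer can be regenerated on its own."""
    return [signs_from_stream(random.Random(s), r, c)
            for s, (r, c) in zip(seeds, shapes)]

shapes = [(2, 3), (3, 4)]  # hypothetical (rows, cols) of two weight matrices
shared = layers_from_single_seed(1, shapes)
separate = layers_from_per_layer_seeds([1, 2], shapes)
print(len(shared), len(separate))
```

With a single seed, layer k can only be reproduced after drawing the values for all earlier layers; per-layer seeds remove that ordering dependency at the cost of storing N seed values.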

In addition, although the weight data and the sign data are described as two-dimensional data in the embodiments above, the present disclosure is not limited thereto and may be implemented in various forms, such as a four-dimensional tensor.

FIG. 8 is a diagram for explaining a control method of an electronic device according to the present disclosure.

The electronic device 100 may store scaling factor data (S810). The electronic device 100 according to the present disclosure may obtain quantization data by quantizing the weight data used for operations of the neural network model into a combination of sign data and scaling factor data, and may store the scaling factor data in the memory. In one embodiment, the scaling factor data may be expressed in floating-point form, and in the initial stage of training the neural network model, the scaling factor data may be stored in the memory as arbitrary floating-point numbers. The scaling factor data stored in the memory may then be updated as the neural network model is trained.

The electronic device 100 may obtain sign data corresponding to the weight data by generating random numbers with a preset seed using the random number generator (S820). The sign data may include a random number corresponding to each weight in the weight data. For example, each random number may be either -1 or +1.

In one embodiment, the electronic device 100 may generate first sign data corresponding to first weight data according to the preset seed using the random number generator, and may perform a multiplication operation on the first sign data and at least one first scaling factor data corresponding to the first sign data. The electronic device 100 may also generate second sign data corresponding to second weight data according to the preset seed using the random number generator, and may perform a multiplication operation on the second sign data and at least one second scaling factor data corresponding to the second sign data. Here, the neural network model consists of a plurality of layers. The first sign data corresponds to the weights between a plurality of nodes in the first layer of the neural network model and a plurality of nodes in the second layer, and the second sign data may correspond to the weights between a plurality of nodes in the second layer and a plurality of nodes in the third layer.

The electronic device 100 may then train the neural network model by updating the scaling factor data based on the obtained sign data and the pre-stored scaling factor data (S830).

Specifically, the electronic device 100 may obtain output data by performing a forward pass on the neural network model based on the sign data and the scaling factor data stored in the memory. After the forward pass, in which input data is input to the neural network model to obtain output data, the electronic device 100 may perform back propagation on the neural network model based on the output data, the sign data, and the scaling factor data to obtain a gradient value corresponding to the scaling factor data. The electronic device 100 may then update the scaling factor data based on the gradient value and store the updated scaling factor data in the memory.

That is, the electronic device 100 stores in the memory the scaling factor data updated through training of the neural network model, but does not store the sign data generated by the random number generator. Of the information related to the sign data, only information about the preset seed used during training may be stored in the memory. Once the neural network model has been trained, the electronic device 100 may generate sign data using the random number generator based on the preset seed, and may obtain output data using the generated sign data and the scaling factor data stored in the memory.

FIG. 9 is a block diagram illustrating in detail the configuration of an electronic device according to the present disclosure. As shown in FIG. 9, the electronic device 100 may include a memory 910, a processor 920, a communication unit 930, a display 940, a speaker 950, a microphone 960, and an input unit 970. Since the memory 910 and the processor 920 have been described in detail with reference to the memory 110 and the processor 120 of FIG. 1, redundant description is omitted.

The communication unit 930 includes circuitry and may communicate with an external device. Here, the communication unit 930 being communicatively connected to an external device may include communicating via a third device (for example, a repeater, a hub, an access point, a server, or a gateway).

The communication unit 930 may include various communication modules for communicating with external devices. For example, the communication unit 930 may include a wireless communication module, such as a cellular communication module using at least one of 5G (5th Generation), LTE, LTE-A (LTE Advanced), CDMA (code division multiple access), WCDMA (wideband CDMA), and the like.

As another example, the wireless communication module may include at least one of WiFi (wireless fidelity), Bluetooth, Bluetooth Low Energy (BLE), Zigbee, radio frequency (RF), or a body area network (BAN). However, this is only one embodiment, and the communication unit 930 may include a wired communication module.

The communication unit 930 may transmit the scaling factor data constituting the neural network model trained according to the present disclosure to an external server. That is, the processor 920 according to the present disclosure may train the neural network model, update the scaling factor data corresponding to the neural network model, and store it in the memory 910. The communication unit 930 may then transmit the scaling factor data stored in the memory 910 to the external server.

In this case, information about the seed used to train the neural network model may also be transmitted to the external server. That is, the processor 920 may update the scaling factor data using sign data generated with the preset seed, and may store information about that preset seed in the memory 910. The communication unit 930 may then transmit the scaling factor data stored in the memory 910 to the external server together with the information about the preset seed.

The external server may then generate sign data by generating random numbers according to the received seed using a random number generator, and may obtain output data by performing a forward pass using the received scaling factor data and the generated sign data.
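The exchange between the device and the external server can be sketched as a small serialization example. The byte layout (a 32-bit seed, a 32-bit count, then 32-bit floating-point scaling factors), the function names, and the group size are all assumptions for illustration; the disclosure does not specify a transfer format. The sketch also assumes both sides run the same random number generator algorithm.

```python
import random
import struct

def pack_model(seed, scaling_factors):
    """Serialize the seed and scaling factors (assumed little-endian layout)."""
    count = len(scaling_factors)
    return struct.pack(f"<II{count}f", seed, count, *scaling_factors)

def unpack_model(payload):
    """Recover the seed and scaling factors on the receiving side."""
    seed, count = struct.unpack_from("<II", payload, 0)
    factors = list(struct.unpack_from(f"<{count}f", payload, 8))
    return seed, factors

def rebuild_weights(seed, factors, group_size):
    """Regenerate signs from the seed and scale each group by its factor."""
    rng = random.Random(seed)
    return [[alpha * (1 if rng.getrandbits(1) else -1) for _ in range(group_size)]
            for alpha in factors]

payload = pack_model(42, [0.5, 0.25])
seed, factors = unpack_model(payload)
weights = rebuild_weights(seed, factors, group_size=128)
print(len(payload), "bytes sent for", sum(len(g) for g in weights), "weights")
```

Only the seed and the scaling factors cross the network; the receiver reconstructs the full weight groups locally before running the forward pass.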

The communication unit 930 may receive input data to be input to the neural network model from an external device communicatively connected to the electronic device 100. For example, the communication unit 930 may receive various kinds of input data from an input device wirelessly connected to the electronic device 100 (for example, a camera, a microphone, or a keyboard) or from an external server capable of providing various content.

The display 940 may display various information under the control of the processor 120. In particular, the display 940 may display input data, or may display output data obtained by performing an operation between the weight data and the input data. Here, displaying the output data may include displaying a screen containing text or an image generated based on the output data.

The display 940 may be implemented with various display technologies such as LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), AM-OLED (Active-Matrix Organic Light-Emitting Diode), LCoS (Liquid Crystal on Silicon), or DLP (Digital Light Processing). The display 940 may also be coupled, in the form of a flexible display, to at least one of a front area, a side area, and a rear area of the electronic device 100.

The display 940 may also be implemented as a touch screen having a touch sensor.

The speaker 950 is a component that outputs various audio data on which an audio processing unit (not shown) has performed processing operations such as decoding, amplification, and noise filtering. The speaker 950 may also output various notification sounds or voice messages.

For example, when the neural network model outputs the result of an operation between the weight data and the input data, that is, output data, the speaker 950 may output a notification sound indicating that the output data has been obtained.

The microphone 960 is a component capable of receiving voice input from a user. The microphone 960 may be provided inside the electronic device 100, or may be provided externally and electrically connected to the electronic device 100. When the microphone 960 is provided externally, it may transmit the generated user voice signal to the processor 120 through a wired or wireless interface (for example, Wi-Fi or Bluetooth).

The microphone 960 may receive a user voice that includes a wake-up word (or trigger word) capable of activating an artificial intelligence model composed of various artificial neural networks. When a user voice including the wake-up word is received through the microphone 960, the processor 120 may activate the artificial intelligence model and perform an operation with the weight data using the user voice as input data.

The input unit 970 includes circuitry and may receive a user input for controlling the electronic device 100. In particular, the input unit 970 may include a touch panel for receiving a user touch made with the user's hand or a stylus pen, and a button for receiving a user operation. As another example, the input unit 970 may be implemented as another input device (for example, a keyboard, a mouse, or a motion input unit). The input unit 970 may also receive first input data entered by the user, or may receive various user commands.

The present embodiments may be modified in various ways and may take various forms; specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the scope to specific embodiments, and should be understood to include various modifications, equivalents, and/or alternatives of the embodiments of the present disclosure. In connection with the description of the drawings, like reference numerals may be used for like elements.

In describing the present disclosure, a detailed description of a related known function or configuration is omitted where it is determined that such a description may unnecessarily obscure the gist of the present disclosure.

In addition, the above-described embodiments may be modified in various other forms, and the scope of the technical spirit of the present disclosure is not limited to the following embodiments. Rather, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the technical spirit of the present disclosure to those skilled in the art.

Terms used in the present disclosure are used only to describe specific embodiments and are not intended to limit the scope of rights. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In the present disclosure, expressions such as "has," "may have," "includes," or "may include" indicate the presence of a corresponding feature (e.g., a numerical value, function, operation, or component such as a part) and do not preclude the presence of additional features.

In the present disclosure, expressions such as "A or B," "at least one of A or/and B," or "one or more of A or/and B" may include all possible combinations of the items listed together. For example, "A or B," "at least one of A and B," or "at least one of A or B" may refer to any of the following cases: (1) including at least one A, (2) including at least one B, or (3) including both at least one A and at least one B.

Expressions such as "1st," "2nd," "first," or "second" used in the present disclosure may modify various elements regardless of order and/or importance, and are used only to distinguish one element from another without limiting the corresponding elements.

When an element (e.g., a first element) is referred to as being "(operatively or communicatively) coupled with/to" or "connected to" another element (e.g., a second element), it should be understood that the element may be connected to the other element directly or through yet another element (e.g., a third element).

On the other hand, when an element (e.g., a first element) is referred to as being "directly coupled" or "directly connected" to another element (e.g., a second element), it may be understood that no other element (e.g., a third element) exists between the element and the other element.

The expression "configured to (or set to)" used in the present disclosure may be used interchangeably with, for example, "suitable for," "having the capacity to," "designed to," "adapted to," "made to," or "capable of," depending on the situation. The term "configured to (or set to)" does not necessarily mean only "specifically designed to" in hardware.

Instead, in some situations, the expression "a device configured to" may mean that the device is "capable of" an operation together with other devices or components. For example, the phrase "a processor configured (or set) to perform A, B, and C" may mean a dedicated processor (e.g., an embedded processor) for performing the corresponding operations, or a general-purpose processor (e.g., a CPU or an application processor) capable of performing the corresponding operations by executing one or more software programs stored in a memory device.

In the embodiments, a 'module' or 'unit' performs at least one function or operation, and may be implemented as hardware or software, or as a combination of hardware and software. In addition, a plurality of 'modules' or a plurality of 'units' may be integrated into at least one module and implemented by at least one processor, except for a 'module' or 'unit' that needs to be implemented with specific hardware.

Meanwhile, various elements and regions in the drawings are drawn schematically. Therefore, the technical spirit of the present invention is not limited by the relative sizes or spacings drawn in the accompanying drawings.

Meanwhile, the various embodiments described above may be implemented in a recording medium readable by a computer or a similar device, using software, hardware, or a combination thereof. In a hardware implementation, the embodiments described in the present disclosure may be implemented using at least one of application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and other electrical units for performing functions. In some cases, the embodiments described in this specification may be implemented by the processor itself. In a software implementation, embodiments such as the procedures and functions described in this specification may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described in this specification.

Meanwhile, the methods according to the various embodiments of the present disclosure described above may be stored in a non-transitory readable medium. Such a non-transitory readable medium may be mounted and used in various devices.

A non-transitory readable medium is not a medium that stores data for a short moment, such as a register, cache, or memory, but a medium that stores data semi-permanently and can be read by a device. Specifically, programs for performing the various methods described above may be stored and provided in a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB, memory card, or ROM.

According to an embodiment, the methods according to the various embodiments disclosed in this document may be provided as part of a computer program product. A computer program product may be traded between a seller and a buyer as a commodity. A computer program product may be distributed in the form of a device-readable storage medium (e.g., compact disc read only memory (CD-ROM)) or distributed online through an application store (e.g., Play Store™). In the case of online distribution, at least part of the computer program product may be at least temporarily stored in, or temporarily created in, a storage medium such as a memory of a manufacturer's server, an application store's server, or a relay server.

In addition, although preferred embodiments of the present disclosure have been shown and described above, the present disclosure is not limited to the specific embodiments described above. Various modifications may of course be made by those of ordinary skill in the art to which the present disclosure pertains without departing from the gist of the present disclosure as claimed in the claims, and such modifications should not be understood separately from the technical spirit or scope of the present disclosure.

100: electronic device
110: memory
120: processor

Claims

An electronic device comprising:
a memory configured to store scaling factor data corresponding to a neural network model; and
a processor configured to obtain quantization data by quantizing weight data used in operations of the neural network model into a combination of code data and scaling factor data,
wherein the processor is configured to:
obtain code data corresponding to the weight data by generating random numbers with a preset seed based on a random number generator, and
learn the neural network model by updating the scaling factor data based on the obtained code data and the scaling factor data pre-stored in the memory.
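As an illustration of the quantization recited above, the following is a minimal sketch and not the claimed implementation. It assumes Python's standard `random` module as a stand-in for the random number generator, one scaling factor per output row, and +1/-1 code values; the names `generate_code`, `quantized_weights`, `SEED`, and `scales` are hypothetical.

```python
import random


def generate_code(seed, rows, cols):
    """Regenerate a rows x cols matrix of +1/-1 code values from a fixed seed."""
    rng = random.Random(seed)  # private generator; global RNG state is untouched
    return [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]


def quantized_weights(seed, scales, cols):
    """Combine scaling factors with regenerated code: w[i][j] = scales[i] * code[i][j]."""
    code = generate_code(seed, len(scales), cols)
    return [[s * c for c in row] for s, row in zip(scales, code)]


SEED = 1234            # hypothetical preset seed
scales = [0.5, 0.25]   # hypothetical scaling factors, one per output row (assumption)
w = quantized_weights(SEED, scales, cols=4)
```

Because the generator is seeded, calling `generate_code` again with the same seed and shape yields the same code matrix, so in this sketch only the seed and the scaling factors would need to be kept.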

The electronic device of claim 1,
wherein the processor is configured to:
obtain output data by performing a forward pass on the neural network model based on the code data and the scaling factor data stored in the memory;
obtain a gradient value corresponding to the scaling factor data by performing a backward pass on the neural network model based on the output data, the code data, and the scaling factor data;
update the scaling factor data based on the gradient value; and
store the updated scaling factor data in the memory.
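The forward pass, backward pass, and scaling factor update recited above might be sketched as follows. This is an assumption-laden illustration for a single linear layer: the squared-error loss, the plain gradient-descent update, the per-row scaling factors, and all names (`train_step`, `lr`, `target`) are hypothetical and are not taken from the disclosure.

```python
import random


def generate_code(seed, rows, cols):
    """Regenerate +1/-1 code values from a fixed seed (never stored)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]


def train_step(seed, scales, x, target, lr):
    """One forward/backward pass that updates only the scaling factors."""
    code = generate_code(seed, len(scales), len(x))
    # Forward pass: y[i] = scales[i] * sum_j code[i][j] * x[j]
    pre = [sum(c * xj for c, xj in zip(row, x)) for row in code]
    y = [s * p for s, p in zip(scales, pre)]
    # Assumed loss: 0.5 * sum_i (y[i] - target[i]) ** 2
    loss = 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))
    # Backward pass: dL/dscales[i] = (y[i] - target[i]) * pre[i]
    grads = [(yi - ti) * p for yi, ti, p in zip(y, target, pre)]
    # Gradient-descent update of the scaling factors; the code is left unchanged
    new_scales = [s - lr * g for s, g in zip(scales, grads)]
    return new_scales, loss


scales = [0.1, 0.1]          # hypothetical initial scaling factors
x = [1.0, -2.0, 0.5, 3.0]    # hypothetical input
target = [1.0, -1.0]         # hypothetical target output
losses = []
for _ in range(20):
    scales, loss = train_step(1234, scales, x, target, lr=0.01)
    losses.append(loss)
```

In this sketch the code matrix is regenerated from the seed on every step and only `scales` changes between steps, which mirrors the idea that training updates the scaling factor data alone.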

The electronic device of claim 2,
wherein the processor is configured to:
generate first code data corresponding to first weight data according to the preset seed based on the random number generator, and perform a multiplication operation on the first code data and at least one first scaling factor data corresponding to the first code data; and
generate second code data corresponding to second weight data according to the preset seed based on the random number generator, and perform a multiplication operation on the second code data and at least one second scaling factor data corresponding to the second code data.

The electronic device of claim 3,
wherein the neural network model is composed of a plurality of layers,
the first code data is data corresponding to weights between a plurality of nodes included in a first layer and a plurality of nodes included in a second layer of the neural network model, and
the second code data is data corresponding to weights between the plurality of nodes included in the second layer and a plurality of nodes included in a third layer of the neural network model.

The electronic device of claim 1,
wherein the obtained code data includes a random number corresponding to each weight included in the weight data, and each random number is one of -1 and +1.

The electronic device of claim 1,
wherein the processor is configured to:
normalize the generated code data, and
learn the neural network model by performing an operation on the normalized code data and the scaling factor data.

The electronic device of claim 1,
wherein the obtained code data is not stored in the memory.

The electronic device of claim 1,
wherein the processor is configured to:
when the neural network model has been learned, generate the code data using the random number generator based on the preset seed; and
generate output data by performing forward propagation on the learned neural network model using the updated scaling factor data stored in the memory and the generated code data.
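Inference with regenerated code data, as recited above, can be illustrated with the following minimal sketch. It assumes a single linear layer, Python's `random` module as the random number generator, and a `stored` dictionary standing in for the memory; all of these names and choices are hypothetical.

```python
import random


def generate_code(seed, rows, cols):
    """Regenerate +1/-1 code values on demand instead of loading them from memory."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]


def forward(seed, scales, x):
    """Forward propagation: y[i] = scales[i] * sum_j code[i][j] * x[j]."""
    code = generate_code(seed, len(scales), len(x))
    return [s * sum(c * xj for c, xj in zip(row, x)) for s, row in zip(scales, code)]


# Only the seed and the learned scaling factors are persisted (hypothetical values).
stored = {"seed": 7, "scales": [0.5, 2.0]}
x = [1.0, 2.0, 3.0]
y1 = forward(stored["seed"], stored["scales"], x)
y2 = forward(stored["seed"], stored["scales"], x)
```

Both calls regenerate the same code matrix from the seed, so `y1` and `y2` agree even though no code data is kept between calls.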

A control method of an electronic device including a memory for storing scaling factor data corresponding to a neural network model, the control method comprising:
obtaining code data corresponding to weight data used in operations of the neural network model by generating random numbers with a preset seed based on a random number generator; and
learning the neural network model by updating the scaling factor data based on the obtained code data and the scaling factor data stored in the memory.

The control method of claim 9,
wherein the learning comprises:
obtaining output data by performing a forward pass on the neural network model based on the code data and the scaling factor data stored in the memory;
obtaining a gradient value corresponding to the scaling factor data by performing a backward pass on the neural network model based on the output data, the code data, and the scaling factor data;
updating the scaling factor data based on the gradient value; and
storing the updated scaling factor data in the memory.

The control method of claim 10,
wherein the obtaining of the output data comprises:
generating first code data corresponding to first weight data according to the preset seed based on the random number generator, and performing a multiplication operation on the first code data and at least one first scaling factor data corresponding to the first code data; and
generating second code data corresponding to second weight data according to the preset seed based on the random number generator, and performing a multiplication operation on the second code data and at least one second scaling factor data corresponding to the second code data.

The control method of claim 11,
wherein the neural network model is composed of a plurality of layers,
the first code data is data corresponding to weights between a plurality of nodes included in a first layer and a plurality of nodes included in a second layer of the neural network model, and
the second code data is data corresponding to weights between the plurality of nodes included in the second layer and a plurality of nodes included in a third layer of the neural network model.

The control method of claim 9,
wherein the obtained code data includes a random number corresponding to each weight included in the weight data, and each random number is one of -1 and +1.

The control method of claim 9,
wherein the learning comprises:
normalizing the generated code data; and
learning the neural network model by performing an operation on the normalized code data and the scaling factor data.

The control method of claim 9,
wherein the obtained code data is not stored in the memory.

The control method of claim 9, further comprising:
generating the code data using the random number generator based on the preset seed when the neural network model has been learned; and
generating output data by performing forward propagation on the learned neural network model using the updated scaling factor data stored in the memory and the generated code data.