WO2021049829A1 - Method, system, and non-transitory computer-readable recording medium for performing artificial neural network operations - Google Patents
- Publication number
- WO2021049829A1 (application PCT/KR2020/012024)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- artificial neural
- neural network
- external memory
- layer
- present
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Definitions
- The present invention relates to a method, a system, and a non-transitory computer-readable recording medium for performing artificial neural network operations.
- Artificial neural networks are inspired by neurons in the brain and their connection structures, and are used in many application fields owing to their overwhelmingly high performance and versatility. Artificial neural networks have made significant advances in accuracy in applications such as vision, speech, and language, and have recently shown performance at or above the human level in various fields. Their superior performance comes from the ability to extract features through statistical learning over large amounts of data, which differs from conventional algorithms that rely on features or rules devised from human experience and intuition.
- Because the internal memory capacity of an artificial neural network computing device is inevitably limited, it is common to provide an external memory.
- Accessing the external memory, however, costs far more than accessing the internal memory in terms of power and latency. In other words, minimizing external memory accesses while performing artificial neural network operations is directly linked to improving the performance of artificial neural network computing devices.
- Accordingly, the present inventor(s) propose a novel and advanced technology capable of maintaining the processing speed above a certain level while minimizing access to an external memory in the process of computing an artificial neural network.
- An object of the present invention is to solve all of the problems of the prior art described above.
- Another object of the present invention is to derive an optimal layer partitioning scheme and operation order in the process of computing an artificial neural network, thereby minimizing access to external memory while maintaining the processing speed above a certain level.
- A typical configuration of the present invention for achieving the above objects is as follows.
- According to one aspect of the present invention, there is provided a method of performing an artificial neural network operation, comprising the steps of: (a) obtaining, from an external memory, information about an input layer and weights associated with the operation of the artificial neural network; (b) transmitting to the external memory, for storage, an operation result generated by dividing at least one hidden layer into a plurality of dimensions with reference to the obtained information; and (c) repeating steps (a) and (b) and, when the operation for a plurality of layers associated with the input layer is completed, transmitting an output layer based on the operation result to the external memory for storage.
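The three steps (a)-(c) above can be sketched as a streaming loop. This is an illustrative reading only, not the patent's implementation: all names (`external_memory`, `run_network`), the choice of ReLU, and the channel-wise split are hypothetical, and a plain dictionary stands in for the actual external DRAM.

```python
import numpy as np

def run_network(external_memory, num_layers, num_splits):
    """Sketch of steps (a)-(c): fetch from external memory, compute a
    hidden layer in per-dimension chunks, and store results back."""
    # (a) obtain the input layer from external memory
    activation = external_memory["input_layer"]
    for layer in range(num_layers):
        # (a) obtain the weights for the current layer
        weights = external_memory[f"weights_{layer}"]
        # (b) divide the layer along one dimension (here: output channels)
        partial_results = [
            np.maximum(activation @ w_chunk, 0.0)   # ReLU on each chunk
            for w_chunk in np.array_split(weights, num_splits, axis=1)
        ]
        activation = np.concatenate(partial_results, axis=1)
        external_memory[f"result_{layer}"] = activation   # (b) store the result
    # (c) once all layers are done, store the output layer
    external_memory["output_layer"] = activation
    return activation
```

In this sketch the results are written back layer by layer, so the internal buffer never has to hold more than one layer's weight chunk at a time, mirroring the limited-internal-memory setting the document describes.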
- According to another aspect of the present invention, there is provided a system for performing an artificial neural network operation, comprising: (a) a buffer unit configured to obtain, from an external memory, information about an input layer and weights associated with the operation of the artificial neural network; (b) an operation management unit configured to transmit to the external memory, for storage, an operation result generated by dividing at least one hidden layer into a plurality of dimensions with reference to the obtained information; and (c) an output layer management unit configured to repeat the above processes and, when the operation for a plurality of layers associated with the input layer is completed, transmit an output layer based on the operation result to the external memory for storage. In this system, the dimension division level and the operation processing order of the at least one hidden layer are determined with reference to at least one of the number of times the external memory must be accessed and the degree of overlapping processing generated when the at least one hidden layer is divided into a plurality of dimensions.
- According to the present invention, an optimal layer partitioning scheme and operation order can be derived in the process of computing an artificial neural network, thereby minimizing access to external memory while maintaining the processing speed above a certain level.
- FIG. 1 is a diagram schematically showing the configuration of an entire system that performs an artificial neural network operation according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating an internal configuration of an artificial neural network computing system according to an embodiment of the present invention.
- FIG. 3 is a diagram illustrating a process of performing an artificial neural network operation through a limited internal memory according to an embodiment of the present invention.
- An artificial neural network is a concept in machine learning and cognitive science encompassing statistical learning algorithms inspired by biological neural networks, and may mean the overall model in which a plurality of artificial neurons forming a network through synaptic connections acquires problem-solving ability by changing the coupling strengths (e.g., weights) of the synapses through learning.
- The artificial neural network model may be composed of a hierarchical structure including an input layer, a plurality of hidden layers, and an output layer.
- FIG. 1 is a diagram schematically showing the configuration of an entire system that performs an artificial neural network operation according to an embodiment of the present invention.
- The entire system may include a communication network 100, an artificial neural network computing system 200, and an external memory 300.
- The communication network 100 may mean a bus, an interface circuit, or the like that transmits and receives data within one system (e.g., chip, memory) or between a plurality of systems (e.g., chip-chip, chip-memory, memory-memory).
- The artificial neural network computing system 200 can communicate with the external memory 300, described later, through the communication network 100, and may (a) obtain information about the input layer and the weights associated with the operation of the artificial neural network from the external memory 300, (b) transmit to the external memory 300, for storage, the operation result generated by dividing at least one hidden layer into a plurality of dimensions with reference to the obtained information, and (c) repeatedly perform processes (a) and (b) and, when the operation for the plurality of layers related to the input layer is completed, transmit an output layer based on the operation result to the external memory 300 for storage.
- The artificial neural network computing system 200 may determine the dimension division level (e.g., the number of dimension divisions) and the operation order of the at least one hidden layer based on at least one of the number of times the external memory 300 must be accessed (e.g., read/write) and the degree of overlapping processing generated as the at least one hidden layer is divided into a plurality of dimensions.
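One way to picture this selection criterion is as a toy cost model: for each candidate split count, estimate the external-memory accesses and the redundant overlap work, and keep the cheapest configuration whose tiles fit the internal buffer. The cost formula below is a guess for illustration; the patent does not specify an actual cost function, and all names are hypothetical.

```python
def choose_split_level(layer_size, buffer_size, halo, max_splits=16):
    """Pick the number of dimension splits that minimizes a toy cost:
    external-memory traffic plus overlap (halo) overhead."""
    best = None
    for splits in range(1, max_splits + 1):
        tile = layer_size / splits
        if tile > buffer_size:           # a tile must fit in internal memory
            continue
        accesses = 2 * splits            # one read + one write per tile (toy model)
        overlap = halo * (splits - 1)    # redundant work grows with split count
        cost = accesses + overlap
        if best is None or cost < best[1]:
            best = (splits, cost)
    return best  # (num_splits, cost), or None if nothing fits
```

For example, `choose_split_level(1024, 300, 1)` settles on 4 splits under this toy model: fewer splits do not fit the 300-unit buffer, while more splits only add traffic and overlap.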
- Although the artificial neural network operation system 200 has been described as above, this description is exemplary; at least some of the functions or components required for the artificial neural network operation system 200 may be implemented in a digital device or IC chip having computing capability (equipped with a memory means and a microprocessor), such as a smartphone or tablet PC, or may be included in an external system (not shown).
- The external memory 300 may communicate with the artificial neural network computing system 200 through the communication network 100, and may perform a function of storing the input data of each input neuron associated with the artificial neural network and information including the synaptic weights for performing operations from those input neurons.
- The external memory 300 may include a volatile memory such as Double Data Rate Synchronous Dynamic Random Access Memory (DDR-SDRAM).
- FIG. 2 is a diagram illustrating an internal configuration of an artificial neural network computing system 200 according to an embodiment of the present invention.
- The artificial neural network operation system 200 may include a buffer unit 210, an operation management unit 220, an output layer management unit 230, a communication unit 240, and a control unit 250. According to an embodiment of the present invention, at least some of the buffer unit 210, the operation management unit 220, the output layer management unit 230, the communication unit 240, and the control unit 250 may be program modules that communicate with an external system (not shown). These program modules may be included in the artificial neural network operation system 200 in the form of an operating system, application program modules, or other program modules, and may be physically stored on various known storage devices. In addition, these program modules may be stored in a remote storage device capable of communicating with the artificial neural network computing system 200. Meanwhile, these program modules encompass, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform specific tasks or execute specific abstract data types according to the present invention.
- The buffer unit 210 may perform a function of acquiring, from the external memory 300, information about an input layer and the synaptic weights associated with an artificial neural network operation.
- The buffer unit 210 according to an embodiment of the present invention may temporarily store information inside the artificial neural network computing system 200, and may hold all or part of the information stored in the external memory 300, depending on its storage capacity.
- The buffer unit 210 may obtain, from the external memory 300, the input layer and information on the synaptic weights used to derive an operation result from the input layer to the next layer of the artificial neural network.
- The operation management unit 220 may perform a function of transmitting to the external memory 300, for storage, an operation result generated by dividing (or distributing) at least one hidden layer into a plurality of dimensions with reference to the information obtained by the buffer unit 210.
- The dimension according to an embodiment of the present invention may include concepts such as depth and channel used in known artificial neural networks. For example, an image with RGB channels is represented as 224 X 224 X 3 (i.e., width, height, and color channels), and the dimension may mean three.
- The dimension according to an embodiment of the present invention may be a concept defined on the activation volume corresponding to the data of each input neuron, rather than the dimension of the entire artificial neural network.
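To make the channel example above concrete, the snippet below (illustrative shapes only, not taken from the patent) shows that the "dimension" in this sense is simply the channel axis of an activation volume:

```python
# An RGB image as an activation volume: width x height x color channels
shape = (224, 224, 3)
dimension = shape[-1]                          # "dimension" here = channel count, i.e. 3
activations = shape[0] * shape[1] * shape[2]   # total number of activations in the volume
```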
- The operation management unit 220 may determine the dimension division level and the operation processing order of the at least one hidden layer based on at least one of the number of times the external memory 300 must be accessed and the degree of redundancy generated as the at least one hidden layer is divided into a plurality of dimensions (for example, the overhead occurring with respect to the regions where the divided layers overlap).
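The overhead arising from overlapping regions can be illustrated with a small helper that estimates halo overhead when a feature map is cut into stripes for a convolution. The stripe-splitting scheme and the formula are assumptions for illustration only, not taken from the patent.

```python
def overlap_overhead(width, kernel_size, splits):
    """Toy estimate of redundant work when a feature map is cut into `splits`
    horizontal stripes for a kernel_size x kernel_size convolution: each of the
    (splits - 1) interior boundaries needs a halo of (kernel_size - 1) rows
    that must be re-fetched or recomputed."""
    halo_rows = (kernel_size - 1) * (splits - 1)
    return halo_rows * width   # redundant activations summed over all stripes

# e.g. a 224-wide map with a 3x3 kernel split into 4 stripes
```

Together with the external-memory access count, an estimate like this lets the division level trade a smaller working set (finer splits) against growing duplicated halo work.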
- The output layer management unit 230 may repeatedly perform the process in which the buffer unit 210 acquires and stores information from the external memory 300 and the process of transmitting to the external memory 300 the operation result generated by dividing the hidden layers with reference to the information provided from the buffer unit 210, and, when the operation for the plurality of layers related to the input layer is completed, may perform a function of transmitting an output layer based on the operation result to the external memory 300 for storage.
- The level (e.g., depth) of the above layers may be set based on at least one of the number of hidden layers and the number of external memory accesses.
- The communication unit 240 may perform a function of enabling data transmission to and from the buffer unit 210, the operation management unit 220, and the output layer management unit 230.
- The control unit 250 may perform a function of controlling the flow of data among the buffer unit 210, the operation management unit 220, the output layer management unit 230, and the communication unit 240. That is, the control unit 250 according to the present invention controls the data flow to and from the outside of the artificial neural network operation system 200, or the data flow between the components of the artificial neural network operation system 200, so that the buffer unit 210, the operation management unit 220, the output layer management unit 230, and the communication unit 240 each perform their own functions.
- FIG. 3 is a diagram illustrating a process of performing an artificial neural network operation through a limited internal memory according to an embodiment of the present invention.
- Information about all synaptic weights required for the artificial neural network operation may be delivered to the external memory 300 from a flash memory (not shown) composed of a nonvolatile memory such as a Secure Digital (SD) card.
- Information about an input layer and the weights associated with the artificial neural network operation may be obtained from the external memory 300 (310).
- Among the information about the weights associated with the artificial neural network operation, only the information about the weights corresponding to the specific layer to be computed, starting from the input layer, may be obtained.
- At least one hidden layer is divided into a plurality of dimensions 321 (e.g., channels) with reference to the obtained information, and the resulting operation result may be transmitted to and stored in the external memory 300.
- The dimension division level of the hidden layer and the order of operation processing may be determined as described above.
- The process (330, 350) of obtaining, from the external memory 300, information about the input layer and the weights associated with the artificial neural network operation, and the process (340, 360) of transmitting and storing the operation result generated by dividing at least one hidden layer into a plurality of dimensions with reference to the obtained information, are repeatedly performed.
- When these operations are completed, an output layer based on the operation result may be transmitted to and stored in the external memory 300 (370).
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (3)
- 1. A method of performing an artificial neural network operation, comprising the steps of: (a) obtaining, from an external memory, information about an input layer and weights associated with the operation of the artificial neural network; (b) transmitting to the external memory, for storage, an operation result generated by dividing at least one hidden layer into a plurality of dimensions with reference to the obtained information; and (c) repeatedly performing steps (a) and (b) and, when the operation for a plurality of layers associated with the input layer is completed, transmitting an output layer based on the operation result to the external memory for storage, wherein a dimension division level and an operation processing order of the at least one hidden layer are determined with reference to at least one of the number of times the external memory must be accessed and the degree of overlapping processing generated as the at least one hidden layer is divided into a plurality of dimensions.
- 2. A non-transitory computer-readable recording medium storing a computer program for executing the method according to claim 1.
- 3. A system for performing an artificial neural network operation, comprising: (a) a buffer unit configured to obtain, from an external memory, information about an input layer and weights associated with the operation of the artificial neural network; (b) an operation management unit configured to transmit to the external memory, for storage, an operation result generated by dividing at least one hidden layer into a plurality of dimensions with reference to the obtained information; and (c) an output layer management unit configured to repeatedly perform the above processes and, when the operation for a plurality of layers associated with the input layer is completed, transmit an output layer based on the operation result to the external memory for storage, wherein a dimension division level and an operation processing order of the at least one hidden layer are determined with reference to at least one of the number of times the external memory must be accessed and the degree of overlapping processing generated as the at least one hidden layer is divided into a plurality of dimensions.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020190112054A KR102491202B1 (ko) | 2019-09-10 | 2019-09-10 | Method, system, and non-transitory computer-readable recording medium for performing artificial neural network operations |
KR10-2019-0112054 | 2019-09-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021049829A1 | 2021-03-18 |
Family
ID=74866736
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2020/012024 WO2021049829A1 (ko) | 2020-09-07 | Method, system, and non-transitory computer-readable recording medium for performing artificial neural network operations |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR102491202B1 (ko) |
WO (1) | WO2021049829A1 (ko) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20180075368A (ko) * | 2016-12-26 | 2018-07-04 | Korea Advanced Institute of Science and Technology (KAIST) | Dropout method for improving memory efficiency and learning speed in artificial neural network models, and learning method using the same |
US20180204118A1 (en) * | 2017-01-18 | 2018-07-19 | Hitachi, Ltd. | Calculation System and Calculation Method of Neural Network |
KR20180109619A (ko) * | 2017-03-28 | 2018-10-08 | Samsung Electronics Co., Ltd. | Method and apparatus for processing a convolutional neural network |
KR20190055608A (ko) * | 2017-11-15 | 2019-05-23 | Samsung Electronics Co., Ltd. | Memory device performing parallel operation processing and memory module including the same |
KR20190085444A (ko) * | 2018-01-10 | 2019-07-18 | Seoul National University R&DB Foundation | GPU memory management method for deep neural networks and computing device performing the same |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102139740B1 (ko) * | 2017-06-09 | 2020-07-31 | Korea Advanced Institute of Science and Technology (KAIST) | Electronic device and method for optimizing a learning model |
KR102098713B1 (ko) * | 2018-01-29 | 2020-04-08 | UX Factory Co., Ltd. | Heterogeneous processor architecture integrating a CNN and an RNN into a single high-performance, low-power chip |
- 2019-09-10: KR application KR1020190112054A filed; patent KR102491202B1, status: active, IP right granted
- 2020-09-07: WO application PCT/KR2020/012024 filed; publication WO2021049829A1, status: active, application filing
Also Published As
Publication number | Publication date |
---|---|
KR20210030654A (ko) | 2021-03-18 |
KR102491202B1 (ko) | 2023-01-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
- WO2020221200A1 | Neural network construction method, image processing method, and apparatus | |
- US11580367B2 | Method and system for processing neural network | |
- WO2021190296A1 | Dynamic gesture recognition method and device | |
- KR101950786B1 | Method for accelerating artificial neural network operations for distributed processing | |
- CN110728364A | Operation device and operation method | |
- CN112840356A | Operation accelerator, processing method, and related device | |
- CN111783937A | Neural network construction method and system | |
- CN111465943A | On-chip computation network | |
- EP3754503A1 | Allocation system, method and apparatus for machine learning, and computer device | |
- US11580369B2 | Inference apparatus, convolution operation execution method, and program | |
- KR102137802B1 | Artificial neural network operation acceleration device for distributed processing, artificial neural network acceleration system using the same, and acceleration method for the artificial neural network | |
- EP4064134B1 | Neural network processing method, device and system | |
- WO2021049829A1 | Method, system, and non-transitory computer-readable recording medium for performing artificial neural network operations | |
- CN111199276B | Data processing method and related products | |
- CN111831356B | Weight precision configuration method, apparatus, device, and storage medium | |
- CN113837922A | Computing device, data processing method, and related products | |
- CN112805727A | Artificial neural network operation acceleration device for distributed processing, artificial neural network acceleration system using the same, and acceleration method of the artificial neural network | |
- WO2023122896A1 | Data processing method and apparatus | |
- CN116091844A | Image data processing method and system based on edge computing | |
- WO2022227024A1 | Operation method, training method, and apparatus for neural network models | |
- CN112099850A | Multi-core Hourglass network acceleration method | |
- CN114467121A | Perception network and image processing method | |
- WO2024135862A1 | Data processing and manipulation apparatus supporting unstructured data processing | |
- WO2021107170A1 | Low-power deep learning acceleration device | |
- WO2023231559A1 | Neural network accelerator, acceleration method, and apparatus | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20864065; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 20864065; Country of ref document: EP; Kind code of ref document: A1 |
| 32PN | EP: public notification in the EP bulletin as the address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 08/09/2022) |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 20864065; Country of ref document: EP; Kind code of ref document: A1 |