KR20230021140A - Neurons using Posit

Info

Publication number: KR20230021140A
Application number: KR1020237000890A
Authority: KR
Inventors: Vijay S. Ramesh; Richard C. Murphy
Original assignee: Micron Technology, Inc.
Priority date: 2020-06-29
Filing date: 2021-06-28
Publication date: 2023-02-13
Also published as: EP4172874A1; KR20230027250A; WO2022005673A1; CN115668224A; EP4172875A1; WO2022005944A1; CN115516463A

Abstract

Systems, apparatuses, and methods related to neurons using posits are described. An example apparatus can include a memory array including a plurality of memory cells configured to store data. The data can include a plurality of bit strings. The example apparatus can include a neuron component coupled to the memory array. The neuron component can include neuron circuitry configured to perform neuromorphic operations on at least one of the plurality of bit strings.

Description

Neurons using Posit

The present disclosure relates generally to semiconductor memory and methods, and more particularly to apparatuses, systems, and methods for neurons using posits.

Memory devices are typically provided as internal, semiconductor, integrated circuits in computers or other electronic systems. There are many different types of memory, including volatile and non-volatile memory. Volatile memory can require power to maintain its data (e.g., host data, error data, etc.) and includes random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), synchronous dynamic random access memory (SDRAM), and thyristor random access memory (TRAM), among others. Non-volatile memory can provide persistent data by retaining stored data when not powered and can include NAND flash memory, NOR flash memory, and resistance variable memory such as phase change random access memory (PCRAM), resistive random access memory (RRAM), and magnetoresistive random access memory (MRAM), such as spin torque transfer random access memory (STT RAM), among others.

Memory devices can be coupled to a host (e.g., a host computing device) to store data, commands, and/or instructions for use by the host while the computer or electronic system is operating. For example, data, commands, and/or instructions can be transferred between the host and the memory device(s) during operation of a computing or other electronic system.

FIG. 1 is a functional block diagram in the form of a computing system including an apparatus including a host and a memory device in accordance with a number of embodiments of the present disclosure.
FIG. 2A is another functional block diagram in the form of a computing system including an apparatus including a host and a memory device in accordance with a number of embodiments of the present disclosure.
FIG. 2B is a functional block diagram in the form of a computing system including a host, a memory device, an application-specific integrated circuit, and a field programmable gate array in accordance with a number of embodiments of the present disclosure.
FIG. 3 is an example of an n-bit posit with es exponent bits.
FIG. 4A is an example of positive values for a 3-bit posit.
FIG. 4B is an example of posit construction using two exponent bits.
FIG. 5A is a functional block diagram in the form of peripheral sense amplifiers, a memory array, and a plurality of arithmetic logic units (ALUs) in accordance with a number of embodiments of the present disclosure.
FIG. 5B is a functional block diagram in the form of a neuron including a multiplier, an accumulator, an internal ALU, and a register in accordance with a number of embodiments of the present disclosure.
FIG. 5C is a functional block diagram in the form of a memory array of sense amplifiers and neurons in accordance with a number of embodiments of the present disclosure.
FIG. 6 is a functional block diagram in the form of control circuitry in accordance with a number of embodiments of the present disclosure.
FIG. 7 is a flow diagram representing an example method for performing neuromorphic operations using neurons in accordance with a number of embodiments of the present disclosure.

Systems, apparatuses, and methods related to neurons using posits are described. An example apparatus can include a memory array including a plurality of memory cells configured to store data. The data can include a plurality of bit strings. The example apparatus can include a neuron component coupled to the memory array. The neuron component can be configured to perform neuromorphic operations on at least one of the plurality of bit strings.

The memory array can be within a memory device. The memory device can include control circuitry. The control circuitry can include a memory resource and a processing resource. The control circuitry can be coupled to the memory array. The memory array can include a plurality of neurons (e.g., neuron components). The memory array can store data including a plurality of bit strings. The control circuitry can control performance of neuromorphic operations on at least one of the plurality of bit strings. In this way, the neuromorphic memory array can act as a neural network. Analog weights can be input into operations of the neural network in order to train the neural network. Once data including posits has had neuromorphic operations performed on it and is stored in the neural network, the data can simulate learning and can be tested or analyzed to determine the degree of learning that has occurred. The data can have a format that includes a universal number (unum) format, such as a Type III unum or posit format.

A neural network can include a set of instructions that can be executed to recognize patterns in data. Some neural networks can be used to recognize underlying relationships in a set of data in a manner that mimics the way a human brain operates. A neural network can adapt to varying or changing inputs such that the neural network can generate the best possible result without the output criteria being redesigned.

A neural network can consist of multiple neurons, which can be represented by one or more equations. In the context of neural networks, a neuron can receive a quantity of numbers or vectors as inputs and, based on properties of the neural network, produce an output. For example, a neuron can receive inputs X_k, where k corresponds to the index of the input. For each input, the neuron can assign a weight vector W_k to the input. The weight vectors can, in some embodiments, make a neuron in the neural network distinct from one or more other neurons in the neural network. In some neural networks, respective input vectors are multiplied by respective weight vectors to yield a value, as shown by Equation 1, which gives an example of a linear combination of the input vectors and the weight vectors.

$$f(x_1, x_2) = w_1 x_1 + w_2 x_2 \qquad \text{(Equation 1)}$$

In some neural networks, a non-linear function (e.g., an activation function) can be applied to the value f(x₁, x₂) that results from Equation 1. An example of a non-linear function that can be applied to the value resulting from Equation 1 is the rectified linear unit (ReLU) function. Application of the ReLU function, shown by Equation 2, yields the value input to the function if that value is greater than zero, or zero if the value input to the function is less than zero. The ReLU function is used here merely as an illustrative example of an activation function and is not intended to be limiting. Other non-limiting examples of activation functions that can be applied in the context of neural networks include sigmoid functions, binary step functions, linear activation functions, hyperbolic functions, leaky ReLU functions, parametric ReLU functions, softmax functions, and/or swish functions, among others.

$$\mathrm{ReLU}(x) = \max(0, x) \qquad \text{(Equation 2)}$$
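The weighted sum of Equation 1 followed by the ReLU function of Equation 2 can be sketched in a few lines of Python. This is an illustration only: the function names `relu` and `neuron_output` are hypothetical, ordinary floating-point numbers stand in for the posit bit strings of the disclosure, and nothing here is meant to describe the neuron circuitry itself.

```python
def relu(value: float) -> float:
    """Return the input if it is greater than zero, otherwise zero."""
    return value if value > 0.0 else 0.0


def neuron_output(inputs, weights):
    """Linear combination of inputs and weights, then ReLU.

    `inputs` and `weights` are equal-length sequences of numbers
    (hypothetical stand-ins for the X_k and W_k of the description).
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return relu(weighted_sum)
```

With inputs `[1.0, 2.0]` and weights `[0.5, 0.25]` the weighted sum is positive and passes through unchanged; with weights `[-1.0, 0.25]` the sum is negative and the neuron outputs zero.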

During a process of training a neural network, the input vectors and/or the weight vectors can be altered to "tune" the network. In one example, a neural network can be initialized with random weights (e.g., analog weights). Over time, the weights can be adjusted to improve the accuracy of the neural network. This can, over time, yield a neural network with high accuracy.
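The idea of starting from random weights and adjusting them over time can be illustrated with a single linear neuron. The description does not state how the weights are adjusted, so the update rule below (a plain gradient step on the squared error) is an assumption chosen for brevity, as are the names `train_linear_neuron`, `mean_squared_error`, and `SAMPLES` and the use of digital floating-point weights.

```python
import random

# Hypothetical training data: each target equals 2*x1 - 1*x2.
SAMPLES = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0),
           ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]


def mean_squared_error(weights, samples):
    """Average squared difference between the neuron's output and the target."""
    total = 0.0
    for inputs, target in samples:
        prediction = sum(x * w for x, w in zip(inputs, weights))
        total += (prediction - target) ** 2
    return total / len(samples)


def train_linear_neuron(samples, epochs=200, learning_rate=0.05, seed=0):
    """Start from random weights and nudge them to reduce the error.

    Returns (initial_weights, trained_weights) so the two can be compared.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    weights = [rng.uniform(-1.0, 1.0) for _ in samples[0][0]]
    initial = list(weights)
    for _ in range(epochs):
        for inputs, target in samples:
            prediction = sum(x * w for x, w in zip(inputs, weights))
            error = prediction - target
            # Assumed update rule: step against the squared-error gradient.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return initial, weights
```

Comparing `mean_squared_error` for the initial and trained weights shows the error shrinking as the weights approach the values that generated the targets.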

Neural networks have a wide range of applications. For example, neural networks can be used for system identification and control (vehicle control, trajectory prediction, process control, natural resource management), quantum chemistry, general game playing, pattern recognition (radar systems, face identification, signal classification, 3D reconstruction, object recognition, and more), sequence recognition (gesture, speech, handwritten and printed text recognition), medical diagnosis, finance (e.g., automated trading systems), data mining, visualization, machine translation, social network filtering, and/or e-mail spam filtering, among others.

Due to the computing resources that some neural networks demand, in some approaches neural networks are deployed in a computing system, such as a host computing system (e.g., a desktop computer, a supercomputer, etc.) or a cloud computing environment. In such approaches, data to be subjected to the neural network as part of an operation to train the neural network can be stored in a memory resource, such as a NAND storage device, and a processing resource, such as a central processing unit, can access the data and execute instructions to process the data using the neural network. Some approaches can also utilize specialized hardware, such as a field programmable gate array or an application-specific integrated circuit, as part of neural network training. In other approaches, storage and training of one or more neural networks can occur within a memory device, such as a dynamic random access memory (DRAM) device.

Data that can be used to perform neural network (or neuromorphic) operations can be stored in particular formats. For example, the data can be stored in an analog or digital format to increase the accuracy of the data. One such format is referred to as the "universal number" (unum) format. There are several forms of the unum format: Type I unums, Type II unums, and Type III unums, the last of which can be referred to as "posits" and/or "valids." Type I unums are a superset of the IEEE 754 standard floating-point format that use a "ubit" at the end of the fraction to indicate whether a real number is an exact float or lies in the interval between adjacent floats. The sign, exponent, and fraction bits in a Type I unum take their definitions from the IEEE 754 floating-point format; however, the length of the exponent and fraction fields of a Type I unum can vary dramatically, from a single bit to a maximum user-definable length. By taking the sign, exponent, and fraction bits from the IEEE 754 standard floating-point format, Type I unums can behave similarly to floating-point numbers; however, the variable bit length exhibited in the exponent and fraction bits of a Type I unum can require additional management in comparison to floats.

Referring to the floating-point standard, bit strings (e.g., strings of bits that can represent a number), such as binary number strings, are represented in terms of three sets of integers or bits: a set of bits referred to as a "base," a set of bits referred to as an "exponent," and a set of bits referred to as a "mantissa" (or significand). The sets of integers or bits that define the format in which a binary number string is stored can be referred to herein as a "numeric format," or "format," for simplicity. For example, the three sets of integers or bits described above (e.g., the base, exponent, and mantissa) that define a floating-point bit string can be referred to as a format (e.g., a first format). As described in more detail below, a posit bit string can include four sets of integers or bits (e.g., a sign, a regime, an exponent, and a mantissa), which can also be referred to as a "numeric format," or "format" (e.g., a second format). In addition, under the floating-point standard, two infinities (e.g., +∞ and -∞) and/or two kinds of "NaN" (not-a-number), a quiet NaN (qNaN) and a signaling NaN (sNaN), can be included in a bit string.
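As a concrete instance of such a format, the sketch below splits an IEEE 754 single-precision (binary32) bit string into its stored fields. Binary32 stores a 1-bit sign, an 8-bit biased exponent, and a 23-bit fraction, with the base (2) implied rather than stored; the choice of binary32 and the helper name `float32_fields` are assumptions made for illustration and are not taken from the disclosure.

```python
import struct


def float32_fields(value: float):
    """Return (sign, biased_exponent, fraction) of the binary32 encoding."""
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits, leading 1 is implicit
    return sign, exponent, fraction


def is_infinity(value: float) -> bool:
    """Infinities use an all-ones exponent and a zero fraction."""
    _, exponent, fraction = float32_fields(value)
    return exponent == 0xFF and fraction == 0


def is_nan(value: float) -> bool:
    """NaNs use an all-ones exponent and a non-zero fraction."""
    _, exponent, fraction = float32_fields(value)
    return exponent == 0xFF and fraction != 0
```

For instance, `float32_fields(1.0)` yields a zero sign, a biased exponent of 127, and a zero fraction, while both infinities and every NaN share the all-ones exponent field.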

The floating-point standard has been used in computing systems for a number of years and defines arithmetic formats, interchange formats, rounding rules, operations, and exception handling for computation carried out by many computing systems. Arithmetic formats can include binary and/or decimal floating-point data, which can include finite numbers, infinities, and/or special NaN values. Interchange formats can include encodings (e.g., bit strings) that can be used to exchange floating-point data. Rounding rules can include a set of properties that can be satisfied when rounding numbers during arithmetic operations and/or conversion operations. Floating-point operations can include arithmetic operations and/or other computational operations such as trigonometric functions. Exception handling can include indications of exceptional conditions, such as division by zero, overflows, etc.

Returning to the universal number format, Type II unums are generally incompatible with floats, which permits a clean mathematical design based on the projective reals. A Type II unum can include n bits and can be described in terms of a "u-lattice" in which the quadrants of a circular projection are populated with an ordered set of 2^(n-3) - 1 real numbers. The values of a Type II unum can be reflected about an axis bisecting the circular projection such that positive values lie in the upper right quadrant of the circular projection, while their negative counterparts lie in the upper left quadrant. The lower half of the circular projection representing a Type II unum can include the reciprocals of the values in the upper half. Type II unums generally rely on a look-up table for most operations. As a result, the size of the look-up table can limit the efficacy of Type II unums in some circumstances. However, Type II unums can provide improved computational functionality in comparison with floats under some conditions.

The Type III unum format is referred to herein as a "posit format" or, for simplicity, a "posit." In contrast to floating-point bit strings, posits can, under certain conditions, allow for a broader dynamic range and higher accuracy (e.g., precision) than floating-point numbers with the same bit width. This can allow operations performed by a computing system to be performed at a higher rate (e.g., faster) when using posits than when using floating-point numbers, which, in turn, can improve the performance of the computing system by, for example, reducing the number of clock cycles used in performing such operations, thereby reducing the processing time and/or power consumed in performing them. In addition, the use of posits in computing systems can allow for higher accuracy and/or precision than floating-point numbers, which can further improve the functioning of a computing system in comparison with some approaches (e.g., approaches that rely on floating-point format bit strings). However, performing neural network or neuromorphic operations on data stored in the Type III unum format can be more difficult than performing such operations on data stored in an analog format.
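To make the dynamic-range claim concrete, the sketch below decodes an n-bit posit with es exponent bits into a Python float. The decoding rules (two's complement for negative values, a run-length-encoded regime, and a scale of k·2^es plus the exponent) follow the publicly described posit format rather than anything stated in this section, and the function name `decode_posit` is hypothetical. The single pattern with only the sign bit set is returned here as infinity, which is one convention; later posit specifications treat it as "not a real."

```python
def decode_posit(bits: int, n: int, es: int) -> float:
    """Decode an n-bit posit with es exponent bits (illustrative sketch)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("inf")  # assumed convention for the 100...0 pattern
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are stored in two's complement
    body = bits & ((1 << (n - 1)) - 1)  # everything after the sign bit

    # Regime: a run of identical bits, ended by the opposite bit (or the end).
    first = (body >> (n - 2)) & 1
    run, pos = 0, n - 2
    while pos >= 0 and ((body >> pos) & 1) == first:
        run += 1
        pos -= 1
    k = run - 1 if first else -run
    pos -= 1                       # skip the terminating bit, if any
    remaining = max(pos + 1, 0)    # bits left for exponent and fraction
    tail = body & ((1 << remaining) - 1)

    # Exponent: up to es bits; missing low-order bits are treated as zero.
    exp_bits = min(es, remaining)
    exponent = (tail >> (remaining - exp_bits)) << (es - exp_bits)

    # Fraction: whatever is left, with an implicit leading 1.
    frac_bits = remaining - exp_bits
    fraction = tail & ((1 << frac_bits) - 1)

    scale = k * (1 << es) + exponent
    value = (1 + fraction / (1 << frac_bits)) * 2.0 ** scale
    return -value if sign else value
```

Under these rules a 16-bit posit with es = 1 ranges from 2^-28 (pattern `0x0001`) to 2^28 (pattern `0x7FFF`), whereas the largest finite IEEE 754 half-precision value is 65504.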

Embodiments herein are directed to hardware circuitry (e.g., control circuitry, a digital-to-analog converter, an analog-to-digital converter, etc.) configured to perform various operations on bit strings to improve the overall functioning of a computing device. For example, embodiments herein are directed to hardware circuitry that is configured to perform neuromorphic operations in order to train a neural network or to cause a neural network to simulate learning using posits.

In some embodiments, by performing operations in this manner, the hardware circuitry can enable improved performance of neuromorphic operations for such neural network purposes while still maintaining the improved accuracy and/or precision of operations performed on data in the unum or posit format outside the memory device and/or neuromorphic memory array, the improved speed in performing unum or posit operations, and/or the reduced storage space required for bit strings before, during, or after performance of arithmetic and/or logical operations.

As used herein, the term "resident on" refers to something that is physically located on a particular component. For example, neuron circuitry, such as the neuron circuitry 552 illustrated in FIG. 5B, being resident on the array portion 132 of the memory array 130 illustrated in FIG. 1 refers to a condition in which the physical hardware (e.g., circuitry, logic, or other hardware components) forming the neuron circuitry is physically contained within the array portion and/or memory array described below (i.e., within the same die or package). The term "resident on" can be used interchangeably herein with other terms such as "deployed on" or "located on."

By performing neuromorphic operations using neurons within a memory array (e.g., a DRAM memory array) and/or performing arithmetic operations using hardware circuitry, neural network training and/or learning using data in the posit format can be performed in an improved manner that increases the efficiency of neural network processing of the data, in comparison with approaches in which such operations are performed external to the memory array or without the use of data in the posit format. For example, performing neuromorphic operations on devices external to the memory array, or on data that does not use the posit format, can reduce the efficiency of the neuromorphic operations and can reduce the performance of the memory array and/or the memory device.

In the following detailed description of the present disclosure, reference is made to the accompanying drawings, which form a part hereof and in which is shown, by way of illustration, how one or more embodiments of the disclosure may be practiced. These embodiments are described in sufficient detail to enable those of ordinary skill in the art to practice the embodiments of this disclosure, and it is to be understood that other embodiments may be utilized and that process, electrical, and structural changes may be made without departing from the scope of the present disclosure.

As used herein, designators such as "N," "M," etc., particularly with respect to reference numerals in the drawings, indicate that a number of the particular feature so designated can be included. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used herein, the singular forms "a," "an," and "the" can include both singular and plural referents, unless the context clearly dictates otherwise. In addition, "a number of," "at least one," and "one or more" (e.g., a number of memory banks) can refer to one or more memory banks, whereas a "plurality of" is intended to refer to more than one of such things.

Furthermore, the words "can" and "may" are used throughout this application in a permissive sense (i.e., having the potential to, being able to), not in a mandatory sense (i.e., must). The term "include," and derivations thereof, means "including, but not limited to." The terms "coupled" and "coupling" mean to be directly or indirectly connected physically or for access to and movement (transmission) of commands and/or data, as appropriate to the context. The terms "bit strings," "data," and "data values" are used interchangeably herein and can have the same meaning, as appropriate to the context.

The figures herein follow a numbering convention in which the first digit or digits correspond to the figure number and the remaining digits identify an element or component in the figure. Similar elements or components between different figures may be identified by the use of similar digits. For example, 120 may reference element "20" in FIG. 1, and a similar element may be referenced as 220 in FIG. 2A. A group or plurality of similar elements or components may generally be referred to herein with a single element number. For example, a plurality of reference elements 431-1, 431-2, and 431-3 may be referred to generally as 431. As will be appreciated, elements shown in the various embodiments herein can be added, exchanged, and/or eliminated so as to provide a number of additional embodiments of the present disclosure. In addition, the proportion and the relative scale of the elements provided in the figures are intended to illustrate certain embodiments of the present disclosure and should not be taken in a limiting sense.

FIG. 1 is a functional block diagram in the form of a computing system 100 including an apparatus including a host 102 and a memory device 104 in accordance with a number of embodiments of the present disclosure. As used herein, an "apparatus" can refer to, but is not limited to, any of a variety of structures or combinations of structures, such as a circuit or circuitry, a die or dice, a module or modules, a device or devices, or a system or systems, for example. In addition, each of the components (e.g., the host 102, the control circuitry 120, the processing resource (or logic circuitry) 122, the memory resource 124, and/or the neuromorphic memory array 130) can be separately referred to herein as an "apparatus."

The memory device 104 can include one or more memory modules (e.g., single in-line memory modules, dual in-line memory modules, etc.). The memory device 104 can include volatile memory and/or non-volatile memory. In a number of embodiments, the memory device 104 can include a multi-chip device. A multi-chip device can include a number of different memory types and/or memory modules. For example, the memory device 104 can include non-volatile or volatile memory on any type of module.

The memory device 104 can provide main memory for the memory system 100 or can be used as additional memory or storage throughout the memory system 100. The memory device 104 can include one or more neuromorphic memory arrays 130 (e.g., arrays of memory cells), which can include volatile and/or non-volatile memory cells. The neuromorphic memory array 130 can be, for example, a flash array with a NAND architecture. Embodiments are not limited to a particular type of memory device; the memory device 104 can include RAM, ROM, DRAM, SDRAM, PCRAM, RRAM, and flash memory, among others.

In embodiments in which the memory device 104 includes non-volatile memory, the memory device 104 can include flash memory devices such as NAND or NOR flash memory devices. Embodiments are not so limited, however, and the memory device 104 can include other non-volatile memory devices such as non-volatile random-access memory devices (e.g., NVRAM, ReRAM, FeRAM, MRAM, PCM), "emerging" memory devices such as 3-D Crosspoint (3D XP) memory devices, etc., or combinations thereof. A 3D XP array of non-volatile memory can perform bit storage based on a change of bulk resistance, in conjunction with a stackable cross-gridded data access array. Additionally, in contrast to many flash-based memories, 3D XP non-volatile memory can perform a write in-place operation, in which a non-volatile memory cell can be programmed without the non-volatile memory cell being previously erased.

As illustrated in FIG. 1, a host 102 can be coupled to the memory device 104. In a number of embodiments, the host 102 can be coupled to the memory device 104 via one or more channels 103 (e.g., buses, interfaces, communication paths, etc.). In addition, the control circuitry 120 of the memory device 104 can be coupled to the neuromorphic memory array 130 via a channel 107. The channel(s) 103 can be used to transfer data between the memory system 104 and the host 102 and can be in the form of a standardized interface. For example, when the memory system 104 is used for data storage in the computing system 100, the channel(s) 103 can be a serial advanced technology attachment (SATA), peripheral component interconnect express (PCIe), universal serial bus (USB), or double data rate (DDR) interface, among other physical connectors and interfaces. In general, however, the channel(s) 103 can provide an interface for passing control, address, data, and other signals between the memory system 104 and a host 102 having compatible receptors for the host interface 103.

The host 102 can be a host system such as a personal laptop computer, a desktop computer, a digital camera, a mobile telephone, an internet-of-things (IoT) enabled device, a memory card reader, or a graphics processing unit (e.g., a video card), among various other types of hosts. The host 102 can include a system motherboard and/or backplane and can include a number of memory access devices, e.g., a number of processing devices (e.g., one or more processors, microprocessors, or some other type of controlling circuitry). One of ordinary skill in the art will appreciate that "a processor" can refer to one or more processors, such as a parallel processing system, a number of coprocessors, etc.

The system 100 can include separate integrated circuits, or the host 102, the memory device 104, and the neuromorphic memory array 130 can be on the same integrated circuit. The system 100 can be, for instance, a server system and/or a high-performance computing (HPC) system and/or a portion thereof. Although the example shown in FIG. 1 illustrates a system having a Von Neumann architecture, embodiments of the present disclosure can be implemented in non-Von Neumann architectures, which may not include one or more components (e.g., CPU, ALU, etc.) often associated with a Von Neumann architecture.

In some embodiments, the host 102 can be responsible for executing an operating system for a computing system 100 that includes the memory device 104. Accordingly, in some embodiments, the host 102 can be responsible for controlling operation of the memory device 104. For example, the host 102 can execute instructions (e.g., in the form of an operating system) that manage the hardware of the computing system 100, such as by scheduling tasks, executing applications, and controlling peripherals.

The memory device 104, which is shown in more detail in FIGS. 2A and 2B herein, can include control circuitry 120, which can include a processing resource 122 and a memory resource 124. The processing resource 122 can be provided in the form of an integrated circuit, such as an application-specific integrated circuit (ASIC), field programmable gate array (FPGA), system-on-a-chip, or other combination of hardware and/or circuitry that is configured to perform arithmetic and/or logical operations on bit strings received from the host 102 and/or other external devices, as described in more detail herein. In some embodiments, the processing resource 122 can include an arithmetic logic unit (ALU). The ALU can include circuitry (e.g., hardware, logic, one or more processing devices, etc.) to perform operations such as those described above (e.g., arithmetic operations, logical operations, bitwise operations, etc.) on integer binary bit strings, such as bit strings in the posit format. Embodiments are not limited to an ALU, however, and in some embodiments the processing resource 122 can include a state machine and/or an instruction set architecture (or combinations thereof) in addition to, or in lieu of, the ALU, as described in more detail in connection with FIG. 5 herein.

For example, the processing resource 122 can be configured to receive one or more bit strings (e.g., a plurality of bits) in the posit format and/or cause performance of operations such as arithmetic and/or logical operations using the bit strings in the posit format. Bit strings in the floating-point format include three integers or sets of bits: a set of bits referred to as a "base," a set of bits referred to as an "exponent," and a set of bits referred to as a "mantissa" (or significand). In contrast, the bit string(s) in the posit format include four sets of bits: at least one bit referred to as a "sign," a set of bits referred to as a "regime," a set of bits referred to as an "exponent," and a set of bits referred to as a "mantissa" (or significand). As used herein, a set of bits is intended to refer to a subset of bits included in a bit string. Examples of the sign, regime, exponent, and mantissa sets of bits are described in more detail in connection with FIGS. 3 and 4A-4B herein.
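
By way of a non-limiting illustration, the four sets of bits named above can be separated and evaluated in software as sketched below. The sketch follows the commonly published posit decoding rules rather than any circuitry of the present disclosure; the function name `decode_posit`, the 8-bit width `n`, and the single exponent bit `es` are assumptions chosen only to keep the example small.

```python
def decode_posit(bits, n=8, es=1):
    """Decode an n-bit posit with es exponent bits into a Python float.

    Illustrative sketch only; n and es are assumed example parameters.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")  # NaR ("not a real") pattern: 1 followed by zeros

    # Sign: most significant bit; negative posits are two's-complemented.
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask
    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign

    # Regime: a run of identical bits, ended by the opposite bit (if present).
    pos = n - 2
    first = (body >> pos) & 1
    run = 0
    while pos >= 0 and ((body >> pos) & 1) == first:
        run += 1
        pos -= 1
    k = run - 1 if first else -run
    pos -= 1  # skip the terminating regime bit

    # Exponent: up to es bits; bits cut off by the word boundary count as 0.
    exp = 0
    for _ in range(es):
        exp <<= 1
        if pos >= 0:
            exp |= (body >> pos) & 1
            pos -= 1

    # Mantissa: remaining bits form a fraction with an implicit leading 1.
    frac_bits = max(pos + 1, 0)
    frac = body & ((1 << frac_bits) - 1)
    mantissa = 1.0 + frac / (1 << frac_bits)

    value = mantissa * 2.0 ** (k * (1 << es) + exp)
    return -value if sign else value
```

Under these assumed parameters, `decode_posit(0b01000000)` evaluates to 1.0 and `decode_posit(0b01100000)` evaluates to 4.0, since a longer regime run scales the value by a further factor of 2^(2^es).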

In some embodiments, the processing resource 122 can be configured to perform (or cause performance of) arithmetic operations such as addition, subtraction, multiplication, division, fused multiply-add, multiply-accumulate, dot product, greater than or less than, absolute value (e.g., FABS()), fast Fourier transform, inverse fast Fourier transform, sigmoid function, convolution, square root, exponent, and/or logarithm operations, logical operations such as AND, OR, XOR, NOT, etc., and trigonometric operations such as sine, cosine, tangent, etc., using the posit bit strings. As will be appreciated, the foregoing list of operations is not intended to be exhaustive, nor is it intended to be limiting, and the processing resource 122 may be configured to perform (or cause performance of) other arithmetic and/or logical operations.

After the control circuitry 120 has performed an arithmetic and/or logical operation on the bit string(s), the control circuitry 120 can cause a resultant bit string (e.g., a bit string representing a result of the arithmetic and/or logical operation) to be transferred to the host 102 and/or the neuromorphic memory array 130. In some embodiments, the resultant bit string can be sent to the neuromorphic memory array 130, and data of the resultant bit string can be input into a neural network of the neuromorphic memory array 130. For example, the resultant bit string can be used as an input to a number of neuromorphic operations performed by a plurality of neurons (e.g., neuron components) 125-1, 125-2, 125-3, 125-4, 125-5, 125-6 (hereinafter referred to as neuron components 125). In some embodiments, the control circuitry 120 can transfer the resultant bit string(s) to the host 102 and/or the neuromorphic memory array 130 in, for example, the posit format. The neuron components 125 can be coupled to each other and to other elements within the neuromorphic memory array 130, as will be described further below in connection with FIGS. 5A-5C.
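
As a rough software analogy of a resultant value being used as an input to a neuron, the sketch below computes a multiply-accumulate over the inputs followed by a sigmoid. This is an assumption made for illustration: this portion of the disclosure does not specify the operation a neuron component 125 performs, and the name `neuron_output`, the weights, and the bias are hypothetical.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of inputs (multiply-accumulate) passed through a sigmoid.

    Hypothetical model of a single neuron; not the disclosed circuitry.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    acc = bias
    for x, w in zip(inputs, weights):
        acc += x * w  # multiply-accumulate step
    return 1.0 / (1.0 + math.exp(-acc))  # sigmoid activation


# Example: decoded values feed the neuron as ordinary floats.
activation = neuron_output([1.0, 4.0], [0.5, -0.25], bias=0.0)
```

In this analogy, the inputs would be the decoded values of the resultant bit strings, and the activation would in turn serve as an input to other neurons coupled to this one.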

The control circuitry 120 can further include a memory resource 124, which can be communicatively coupled to the processing resource 122. The memory resource 124 can include volatile memory resources, non-volatile memory resources, or a combination of volatile and non-volatile memory resources. In some embodiments, the memory resource 124 can be a random-access memory (RAM) such as static random-access memory (SRAM). Embodiments are not so limited, however, and the memory resource 124 can be a cache, one or more registers, NVRAM, ReRAM, FeRAM, MRAM, PCM, "emerging" memory devices such as 3-D Crosspoint (3D XP) memory devices, etc., or combinations thereof.

The control circuitry 120 can be communicatively coupled to the neuromorphic memory array 130 via one or more channels 107. The neuromorphic memory array 130 can be a DRAM array, SRAM array, STT RAM array, PCRAM array, TRAM array, RRAM array, NAND flash array, and/or NOR flash array, for instance. The array 130 can include memory cells arranged in rows coupled by access lines, which may be referred to herein as word lines or select lines, and columns coupled by sense lines, which may be referred to herein as digit lines or data lines. Although a single array 130 is shown in FIG. 1, embodiments are not so limited. For instance, the memory device 104 can include a number of memory arrays 130 (e.g., a number of banks of DRAM cells, NAND flash cells, etc.). In addition, the array 130 can include neurons 125 arranged in rows and columns and coupled to the memory cells, as will be described further below in connection with FIG. 5C.

The embodiment of FIG. 1 can include additional circuitry that is not illustrated so as not to obscure embodiments of the present disclosure. For example, the memory device 104 can include address circuitry to latch address signals provided over I/O connections through I/O circuitry. Address signals can be received and decoded by a row decoder and a column decoder to access the memory device 104 and/or the neuromorphic memory array 130. It will be appreciated by those skilled in the art that the number of address input connections can depend on the density and architecture of the memory device 104 and/or the neuromorphic memory array 130.

FIG. 2A is another functional block diagram in the form of a computing system including an apparatus 200 that includes a host 202 and a memory device 204 in accordance with a number of embodiments of the present disclosure. The memory device 204 can include control circuitry 220, which can be analogous to the control circuitry 120 illustrated in FIG. 1. Similarly, the host 202 can be analogous to the host 102 illustrated in FIG. 1, the memory device 204 can be analogous to the memory device 104 illustrated in FIG. 1, and the neuromorphic memory array 230 can be analogous to the neuromorphic memory array 130 illustrated in FIG. 1. Each of the components (e.g., the host 202, the control circuitry 220, the processing resource 222, the memory resource 224, and/or the neuromorphic memory array 230, etc.) can be separately referred to herein as an "apparatus."

The host 202 can be communicatively coupled to the memory device 204 via one or more channels 203, 205. The channels 203, 205 can be interfaces, buses, communication paths, or other physical connections that allow data and/or commands to be transferred between the host 202 and the memory device 204. The channels 203, 205 can be used to transfer data between the memory system 204 and the host 202 and can be in the form of a standardized interface.

When the memory system 204 is used for data storage in the computing system 200, the channels 203, 205 can be a serial advanced technology attachment (SATA), peripheral component interconnect express (PCIe), universal serial bus (USB), or double data rate (DDR) interface, among other physical connectors and interfaces. In general, however, the channels 203, 205 can provide an interface for passing control, address, data, and other signals between the memory system 204 and a host 202 having compatible receptors for the channels 203, 205. In some embodiments, commands to cause initiation of an operation to be performed by the control circuitry 220 (e.g., an operation to perform arithmetic and/or logical operations on bit string(s) in the posit format) can be transferred from the host via the channels 203, 205.

It is noted that, in some embodiments, the control circuitry 220 can perform the arithmetic and/or logical operations in response to an initiation command transferred from the host 202 via one or more of the channels 203, 205, in the absence of an intervening command from the host 202. That is, once the control circuitry 220 has received a command from the host 202 to initiate performance of an operation, the operations can be performed by the control circuitry 220 in the absence of additional commands from the host 202. In some embodiments, however, the control circuitry 220 can perform the operations in response to receipt of a bit string (e.g., a bit string in the posit format) in the absence of a command (e.g., an initiation command) from the host 202 specifying that the operation is to be performed. For example, the control circuitry 220 can be configured to self-initiate performance of arithmetic and/or logical operations on the received bit string(s) in response to receiving the bit string(s). In other embodiments, in response to receiving a command to perform neuromorphic operations on data, the data can be converted to an analog format as it is sent to the neuromorphic memory array 230.
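
The two triggering behaviors described above (performance after a single initiation command, and self-initiation on receipt of a bit string) can be contrasted with the minimal software model below. The class `ControlCircuitrySketch`, its method names, and the `self_initiate` flag are hypothetical names introduced for illustration and do not correspond to an interface of the disclosed control circuitry 220.

```python
class ControlCircuitrySketch:
    """Toy model of command-initiated versus self-initiated operation."""

    def __init__(self, operation, self_initiate=False):
        self.operation = operation          # callable applied to each bit string
        self.self_initiate = self_initiate  # assumed configuration flag
        self.pending = []                   # bit strings received, not yet processed
        self.results = []                   # resultant bit strings

    def receive_bit_strings(self, *bit_strings):
        """Accept bit strings; run at once only in the self-initiating mode."""
        self.pending.extend(bit_strings)
        if self.self_initiate:
            self._run()

    def receive_initiation_command(self):
        """One command from the host; no intervening commands are needed."""
        self._run()

    def _run(self):
        # Process every pending bit string without further host involvement.
        while self.pending:
            self.results.append(self.operation(self.pending.pop(0)))


# Command-initiated: nothing happens until the host sends the command.
commanded = ControlCircuitrySketch(lambda b: b ^ 0xFF)
commanded.receive_bit_strings(0x0F, 0xF0)
commanded.receive_initiation_command()

# Self-initiated: receipt of the bit string alone triggers the operation.
autonomous = ControlCircuitrySketch(lambda b: b ^ 0xFF, self_initiate=True)
autonomous.receive_bit_strings(0x0F)
```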

As used herein, a "first level of precision" and a "second level of precision" generally refer to the accuracy of a bit string and/or of a resultant bit string that represents a result of an operation performed using one or more bit strings. For example, floating-point format bit strings can be described herein as having a "first level of precision," while unum bit strings (e.g., posit format bit strings) can be referred to as having a particular precision or a "second level of precision," because, as described in more detail herein, unum bit strings can offer a higher level of precision under certain conditions than floating-point formatted numbers.

In some embodiments, the floating-point format or the unum format can be referred to as a digital format, while an additional format can include an analog format. The digital format can use discrete values such as a "0" or a "1," while the analog format can use more continuous values and can represent physical measurements along that continuum.
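
The distinction between discrete digital codes and a more continuous analog level can be illustrated with an idealized digital-to-analog mapping. The sketch below is an assumption for illustration only: the disclosure does not describe the conversion in this passage, and the function name `digital_to_analog`, the 8-bit code width, and the reference level `v_ref` are hypothetical.

```python
def digital_to_analog(code, n_bits=8, v_ref=1.0):
    """Map an unsigned n_bits code onto a level between 0 and v_ref.

    Idealized, linear model; real converters add noise and non-linearity.
    """
    full_scale = (1 << n_bits) - 1
    if not 0 <= code <= full_scale:
        raise ValueError("code does not fit in n_bits")
    return v_ref * code / full_scale


# Each discrete code selects one point along the continuous range.
levels = [digital_to_analog(c) for c in (0, 128, 255)]
```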

As shown in FIG. 2A, the memory device 204 can include a register access component 206, a high-speed interface (HSI) 208, a controller 210, one or more peripheral sense amplifiers (PSAs) 212, main memory input/output (I/O) circuitry 214, row address strobe (RAS)/column address strobe (CAS) chain control circuitry 216, a RAS/CAS chain component 218, control circuitry 220, and a neuromorphic memory array 230. The control circuitry 220 is, as shown in FIG. 2A, located in an area of the memory device 204 that is physically distinct from the neuromorphic memory array 230. That is, in some embodiments, the control circuitry 220 is located in a periphery location of the neuromorphic memory array 230.

The register access component 206 can facilitate transferring and fetching of data from the host 202 to the memory device 204 and from the memory device 204 to the host 202. For example, the register access component 206 can store addresses (or facilitate lookup of addresses), such as memory addresses, that correspond to data that is to be transferred to the host 202 from the memory device 204 or transferred from the host 202 to the memory device 204. In some embodiments, the register access component 206 can facilitate transferring and fetching data that is to be operated upon by the control circuitry 220, and/or the register access component 206 can facilitate transferring and fetching data that has been operated upon by the control circuitry 220 for transfer to the host 202.

The HSI 208 can provide an interface between the host 202 and the memory device 204 for commands and/or data traversing the channel 205. The HSI 208 can be a double data rate (DDR) interface such as a DDR3, DDR4, DDR5, etc. interface. Embodiments are not limited to a DDR interface, however, and the HSI 208 can be a quad data rate (QDR) interface, a peripheral component interconnect (PCI) interface (e.g., peripheral component interconnect express (PCIe)), or another suitable interface for transferring commands and/or data between the host 202 and the memory device 204 via the channel(s) 203, 205.

The controller 210 can be responsible for executing instructions from the host 202 and accessing the control circuitry 220 and/or the neuromorphic memory array 230. The controller 210 can be a state machine, a sequencer, or some other type of controller. The controller 210 can receive commands from the host 202 (via the HSI 208, for example) and, based on the received commands, control operation of the control circuitry 220 and/or the neuromorphic memory array 230. In some embodiments, the controller 210 can receive a command from the host 202 to cause performance of an arithmetic and/or logical operation on received bit strings using the control circuitry 220. Responsive to receipt of such a command, the controller 210 can instruct the control circuitry 220 to begin performance of the arithmetic and/or logical operation(s).

In some embodiments, the controller 210 can be a global processing controller and can provide power management functions to the memory device 204. Power management functions can include control over power consumed by the memory device 204 and/or the neuromorphic memory array 230. For example, the controller 210 can control power provided to various banks of the neuromorphic memory array 230 to control which banks of the neuromorphic memory array 230 are operational at different times during operation of the memory device 204. This can include shutting down certain banks of the neuromorphic memory array 230 while providing power to other banks of the neuromorphic memory array 230 in order to optimize power consumption of the memory device 230. In some embodiments, the controller 210 controlling power consumption of the memory device 204 can include controlling power to various cores of the memory device 204 and/or to the control circuitry 220, the neuromorphic memory array 230, etc.

The PSAs 212 are intended to sense (e.g., read, store, cache) data values of memory cells in the neuromorphic memory array 230 and to provide additional functionalities (e.g., peripheral amplifiers) that are separate from the neuromorphic memory array 230. The PSAs 212 can include latches and/or registers. For example, additional latches can be included in the PSAs 212. The latches of the PSAs 212 can be located on a periphery of the neuromorphic memory array 230 of the memory device 204 (e.g., on a periphery of one or more banks of memory cells).

The main memory input/output (I/O) circuitry 214 can facilitate transfer of data and/or commands to and from the neuromorphic memory array 230. For example, the main memory I/O circuitry 214 can facilitate transfer of bit strings, data, and/or commands from the host 202 and/or the control circuitry 220 to and from the neuromorphic memory array 230. In some embodiments, the main memory I/O circuitry 214 can include one or more direct memory access (DMA) components that can transfer bit strings (e.g., posit bit strings stored as blocks of data) from the control circuitry 220 to the neuromorphic memory array 230, and vice versa.

In some embodiments, the main memory I/O circuitry 214 can facilitate transfer of bit strings, data, and/or commands from the neuromorphic memory array 230 to the control circuitry 220 so that the control circuitry 220 can perform arithmetic and/or logical operations on the bit strings. Similarly, the main memory I/O circuitry 214 can facilitate transfer of bit strings that have had one or more operations performed on them by the control circuitry 220 to the neuromorphic memory array 230. In this manner, data in the unum or posit format can be operated on by the control circuitry 220 to perform arithmetic and/or logical operations while the data is stored in an array or location other than the neuromorphic memory array 230, and the data can be used to perform neuromorphic operations while it is stored in the neuromorphic memory array 230. As described in more detail herein, the operations can include arithmetic operations performed on bit strings in the posit format, logical operations performed on bits in the posit format, bitwise operations performed on bit strings in the posit format, etc., and neuromorphic operations performed on bitstreams in the posit format.

The row address strobe (RAS)/column address strobe (CAS) chain control circuitry 216 and the RAS/CAS chain component 218 can be used in conjunction with the neuromorphic memory array 230 to latch a row address and/or a column address to initiate a memory cycle. In some embodiments, the RAS/CAS chain control circuitry 216 and/or the RAS/CAS chain component 218 can resolve row and/or column addresses of the neuromorphic memory array 230 at which read and write operations associated with the neuromorphic memory array 230 are to be initiated or terminated. For example, upon completion of an operation using the control circuitry 220, the RAS/CAS chain control circuitry 216 and/or the RAS/CAS chain component 218 can latch and/or resolve a specific location in the neuromorphic memory array 230 at which the bit strings that have been operated upon by the control circuitry 220 are to be stored in order to perform neuromorphic operations. Similarly, the RAS/CAS chain control circuitry 216 and/or the RAS/CAS chain component 218 can latch and/or resolve a specific location in the neuromorphic memory array 230 from which bit strings are to be transferred to the control circuitry 220 prior to the control circuitry 220 performing a neuromorphic operation on the bit string(s) in the analog format.

As described above in connection with FIG. 1, the neuromorphic memory array 230 can be, for example, a DRAM array, SRAM array, STT RAM array, PCRAM array, TRAM array, RRAM array, NAND flash array, and/or NOR flash array, although embodiments are not limited to these particular examples. The neuromorphic memory array 230 can function as main memory for the computing system 200 shown in FIG. 2. In some embodiments, the neuromorphic memory array 230 can be configured to store bit strings operated on by the control circuitry 220 and/or to store bit strings to be transferred to the control circuitry 220. The neuromorphic memory array 230 can include an array portion 232. The array portion 232 can include a plurality of neurons (e.g., neuron components) 225-1, 225-2, 225-3 (hereinafter referred to collectively as neurons 225) that are used to perform a number of neuromorphic operations, as described below. The neurons 225 can be coupled to a plurality of arithmetic logic units (ALUs) 226-1, 226-2, 226-3 (hereinafter referred to collectively as ALUs 226).

The control circuitry 220 can include logic circuitry (e.g., the processing resource 122 illustrated in FIG. 1) and/or memory resource(s) (e.g., the memory resource 124 illustrated in FIG. 1). As described above in connection with FIG. 1 and in more detail below in connection with FIG. 6, the control circuitry 220 can be configured to receive one or more bit strings in a posit format and to cause neuromorphic operations to be performed by the neuromorphic memory array 230 using the one or more bit strings in the posit format.

For example, bit strings (e.g., data, a plurality of bits, etc.) can be received by the control circuitry 220 from the host 202 in a first format (e.g., a posit format) and/or from the neuromorphic memory array 230 in the first format (e.g., the posit format), and can be stored by the control circuitry 220 in, for example, a memory resource of the control circuitry 220 (e.g., the memory resource 624 illustrated in FIG. 6 herein). The control circuitry 220 can perform arithmetic and/or logical operations on the bit string(s), or cause such operations to be performed on them, as described in more detail in connection with FIG. 6 herein.

As described in more detail in connection with FIG. 3 and FIGS. 4A-4B, posits can provide improved accuracy (e.g., precision) and can require less storage space than corresponding bit strings represented in a floating-point format. Accordingly, by using posit bit strings to perform neuromorphic operations with the neurons of a neural network under control of the control circuitry 220, performance of the computing system 200 can be improved in comparison to approaches that use posit bit strings to perform neuromorphic operations outside the memory array, because the neuromorphic operations can be performed more quickly inside the memory array (e.g., because the neuromorphic operations are more efficient within the memory array).

As described above, once the control circuitry 220 has received the posit bit strings from the host 202 and/or the neuromorphic memory array 230, the control circuitry 220 can perform arithmetic and/or logical operations on the received posit bit strings (or cause them to be performed). For example, the control circuitry 220 can be configured to perform (or cause performance of) arithmetic operations on the received posit bit strings such as addition, subtraction, multiplication, division, fused multiply-add, multiply-accumulate, dot product, greater-than or less-than comparison, absolute value (e.g., FABS()), fast Fourier transform, inverse fast Fourier transform, sigmoid function, convolution, square root, exponent, and/or logarithm operations; trigonometric operations such as sine, cosine, and tangent; and/or logical operations such as AND, OR, XOR, and NOT. As will be appreciated, the foregoing list of operations is intended to be neither exhaustive nor limiting, and the control circuitry 220 can be configured to perform (or cause performance of) other arithmetic and/or logical operations on posit bit strings.

In some embodiments, the control circuitry 220 can perform the operations listed above in conjunction with execution of one or more machine learning algorithms. For example, the control circuitry 220 can perform operations related to one or more neural networks, such as those used in the neuromorphic memory array 230. Neural networks allow an algorithm to be trained over time to determine an output response based on input signals. For example, over time, a neural network can essentially learn to better maximize the chance of completing a particular goal. This can be advantageous in machine learning applications because the neural network can be trained over time with new data to achieve better maximization of the chance of completing the particular goal. A neural network can be trained over time to improve operation of particular tasks and/or particular goals. However, in some approaches, machine learning (e.g., neural network training) can be processing intensive (e.g., can consume large amounts of computer processing resources) and/or time intensive (e.g., can require lengthy calculations that consume multiple cycles) when data is transferred in and out of a memory array to be processed by external devices. In contrast, by using the neurons 225 to perform such neuromorphic operations on the bit strings, the amount of processing resources and/or the amount of time consumed in performing the operations can be reduced in comparison to approaches in which such operations are performed using elements external to the memory array.

FIG. 2B is a functional block diagram in the form of a computing system 200 including a host 202, a memory device 204, an application-specific integrated circuit 223, and a field programmable gate array 221 in accordance with a number of embodiments of the present disclosure. Each of the components (e.g., the host 202, the memory device 204, the FPGA 221, the ASIC 223, etc.) can be separately referred to herein as an "apparatus."

As shown in FIG. 2B, the host 202 can be coupled to the memory device 204 via channel(s) 203, which can be analogous to the channel(s) 103 illustrated in FIG. 1. The field programmable gate array (FPGA) 221 can be coupled to the host 202 via channel(s) 217, and the application-specific integrated circuit (ASIC) 223 can be coupled to the host 202 via channel(s) 219. In some embodiments, the channel(s) 217 and/or the channel(s) 219 can include a Peripheral Component Interconnect Express (PCIe) interface; however, embodiments are not so limited, and the channel(s) 217 and/or the channel(s) 219 can include other types of interfaces, buses, communication channels, etc. to facilitate transfer of data between the host 202 and the FPGA 221 and/or the ASIC 223.

As described above, non-limiting examples of arithmetic and/or logical operations that can be performed by the FPGA 221 and/or the ASIC 223 using posit bit strings include arithmetic operations such as addition, subtraction, multiplication, division, fused multiply-add, multiply-accumulate, dot product, greater-than or less-than comparison, absolute value (e.g., FABS()), fast Fourier transform, inverse fast Fourier transform, sigmoid function, convolution, square root, exponent, and/or logarithm operations; trigonometric operations such as sine, cosine, and tangent; and/or logical operations such as AND, OR, XOR, and NOT.

The FPGA 221 can include a state machine 227 and/or register(s) 229. The state machine 227 can include one or more processing devices that are configured to perform operations on an input and produce an output. For example, the FPGA 221 can be configured to receive posit bit strings from the host 202 and perform arithmetic and/or logical operations on the posit bit strings to produce resultant posit bit strings that represent a result of the operation performed on the received posit bit strings. In addition, the FPGA 221 can be configured to cause neuromorphic operations to be performed in the neuromorphic memory array 230 using neurons, as described further below in connection with FIGS. 5A-5C.

The register(s) 229 of the FPGA 221 can be configured to buffer and/or store the posit bit strings received from the host 202 before the state machine 227 performs an operation on the received posit bit strings. In addition, the register(s) 229 of the FPGA 221 can be configured to buffer and/or store a resultant posit bit string that represents a result of the operation performed on the received posit bit strings before the result is transferred to circuitry external to the FPGA 221, such as the host 202 or the memory device 204.

The ASIC 223 can include logic 241 and/or a cache 243. The logic 241 can include circuitry configured to perform operations on an input and produce an output. In some embodiments, the ASIC 223 is configured to receive posit bit strings from the host 202 and perform arithmetic and/or logical operations on the posit bit strings to produce resultant posit bit strings that represent a result of the operation performed on the received posit bit strings. Likewise, the ASIC 223 can facilitate performance of subsequent neuromorphic operations on the bit strings.

The cache 243 of the ASIC 223 can be configured to buffer and/or store the posit bit strings received from the host 202 before the logic 241 performs an operation on the received posit bit strings. In addition, the cache 243 of the ASIC 223 can be configured to buffer and/or store a resultant posit bit string that represents a result of the operation performed on the received posit bit strings before the result is transferred to circuitry external to the ASIC 223, such as the host 202 or the memory device 204.

Although the FPGA 221 is shown as including a state machine 227 and register(s) 229, in some embodiments the FPGA 221 can include logic, such as the logic 241, and/or a cache, such as the cache 243, in addition to or in lieu of the state machine 227 and/or the register(s) 229. Similarly, the ASIC 223 can, in some embodiments, include a state machine, such as the state machine 227, and/or register(s), such as the register(s) 229, in addition to or in lieu of the logic 241 and/or the cache 243.

FIG. 3 is an example of an n-bit universal number, or "unum," with es exponent bits. In the example of FIG. 3, the n-bit unum is a posit bit string 331. As shown in FIG. 3, the n-bit posit 331 can include a set of sign bit(s) (e.g., a sign bit 333), a set of regime bits (e.g., the regime bits 335), a set of exponent bits (e.g., the exponent bits 337), and a set of mantissa bits (e.g., the mantissa bits 339). The mantissa bits 339 can alternatively be referred to as a "fraction portion" or "fraction bits," and can represent the portion of the bit string (e.g., of the number) that follows the binary point.

The sign bit 333 can be zero (0) for positive numbers and one (1) for negative numbers. The regime bits 335 are described in connection with Table 1 below, which shows (binary) bit strings and their related numerical meaning, k. In Table 1, the numerical meaning k is determined by the run length of the bit string. The letter X in the binary portion of Table 1 indicates that the bit value is irrelevant to the determination of the regime, because the (binary) bit string is terminated either by the first bit flip after the leading run or when the end of the bit string is reached. For example, in the (binary) bit string 0010, the run terminates when the zeros flip to a one, and the bit after that flips back to zero. The last zero is therefore irrelevant to the regime; all that is considered for the regime are the leading identical bits and the first opposite bit that terminates the run (if the bit string includes such a bit).

Table 1

Binary         0000   0001   001X   01XX   10XX   110X   1110   1111
Numerical (k)  -4     -3     -2     -1     0      1      2      3

In FIG. 3, the regime bits 335 r correspond to the identical bits in the bit string, while the regime bit 335 $\bar{r}$ corresponds to the opposite bit that terminates the run. For example, for the numerical k value -2 shown in Table 1, the regime bits r correspond to the first two leading zeros, while the regime bit $\bar{r}$ corresponds to the one. As mentioned above, the final bit of the entry for that k value, represented by X in Table 1, is irrelevant to the regime.
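The run-length reading of the regime in Table 1 can be sketched as follows. This is a minimal illustration rather than the circuitry described in the disclosure: the function name `regime_k` is hypothetical, the input is assumed to be the bits that follow the sign bit, and each X in Table 1 is replaced by 0 for the check.

```python
def regime_k(bits: str) -> int:
    """Return the regime value k for the bits that follow the sign bit.

    The leading run of identical bits has length m. A run of zeros gives
    k = -m and a run of ones gives k = m - 1, which reproduces Table 1.
    """
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("expected a non-empty string of '0' and '1'")
    first = bits[0]
    m = len(bits) - len(bits.lstrip(first))  # length of the leading run
    return -m if first == "0" else m - 1


# Table 1 entries, with each don't-care X written as 0 (an assumption).
table_1 = ["0000", "0001", "0010", "0100", "1000", "1100", "1110", "1111"]
ks = [regime_k(b) for b in table_1]
print(ks)
```

Because only the leading run is inspected, changing any don't-care bit leaves k unchanged.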

If m corresponds to the number of identical bits in the run, then k = -m if the bits are zeros and k = m - 1 if the bits are ones. This is illustrated in Table 1 where, for example, the (binary) bit string 10XX has a single one, so k = m - 1 = 1 - 1 = 0. Similarly, the (binary) bit string 0001 includes three zeros, so k = -m = -3. The regime can indicate a scale factor of $useed^{k}$, where $useed = 2^{2^{es}}$. Several example values of useed are shown in Table 2 below.

Table 2

es      0    1         2          3            4
useed   2    2² = 4    4² = 16    16² = 256    256² = 65536
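The values in Table 2 follow directly from useed = 2^(2^es), and each increment of es squares the previous useed. A short sketch (the function name `useed` is chosen here for illustration only):

```python
from fractions import Fraction


def useed(es: int) -> int:
    """useed = 2 ** (2 ** es), the base of the regime scale factor."""
    return 2 ** (2 ** es)


values = [useed(es) for es in range(5)]
print(values)

# The regime contributes useed ** k; Fraction keeps negative powers exact.
scale = Fraction(useed(3)) ** -3
print(scale)
```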

The exponent bits 337 correspond to an exponent e, as an unsigned number. In contrast to floating-point numbers, the exponent bits 337 described herein may not have a bias associated with them. As a result, the exponent bits 337 described herein can represent scaling by a factor of $2^{e}$. As shown in FIG. 3, there can be up to es exponent bits ($e_1, e_2, e_3, \ldots, e_{es}$), depending on how many bits remain to the right of the regime bits 335 of the n-bit posit 331. In some embodiments, this can allow for tapered accuracy of the n-bit posit 331, in which numbers whose magnitude is nearer to one have higher accuracy than numbers that are very large or very small. Because very large or very small numbers may be used less frequently in certain kinds of operations, the tapered accuracy behavior of the n-bit posit 331 shown in FIG. 3 can be desirable in a wide range of situations.

The mantissa bits 339 (or fraction bits) represent any additional bits of the n-bit posit 331 that lie to the right of the exponent bits 337. Similar to floating-point bit strings, the mantissa bits 339 represent a fraction f that can be read as the value 1.f, where f includes one or more bits to the right of the binary point that follows the one. In contrast to floating-point bit strings, however, in the n-bit posit 331 shown in FIG. 3 the "hidden bit" (e.g., the one) is always one, whereas floating-point bit strings can include subnormal numbers with a "hidden bit" of zero (e.g., 0.f).

FIG. 4A is an example of positive values for a 3-bit posit 431. In FIG. 4A, only the right half of the projective real numbers is shown; however, it will be appreciated that negative projective real numbers corresponding to the positive counterparts shown in FIG. 4A can exist on a curve that is the reflection about the y-axis of the curves shown in FIG. 4A.

In the example of FIG. 4A, es = 2, so $useed = 2^{2^{es}} = 16$. The precision of a posit 431 can be increased by appending bits to the bit string, as shown in FIG. 4B. For example, appending a bit with a value of one (1) to the bit strings of the posit 431-1 increases the accuracy of the posit 431, as shown by the posit 431-2 in FIG. 4B. Similarly, appending a bit with a value of one to the bit strings of the posit 431-2 in FIG. 4B increases the accuracy of the posit 431-2, as shown by the posit 431-3 in FIG. 4B. An example of interpolation rules that can be used to append bits to the bit strings of the posit 431-1 shown in FIG. 4A to obtain the posits 431-2 and 431-3 shown in FIG. 4B follows.

If maxpos is the largest positive value of a bit string of the posits 431-1, 431-2, 431-3 shown in FIG. 4B and minpos is the smallest positive value of a bit string of the posits 431-1, 431-2, 431-3, then maxpos can be equal to useed and minpos can be equal to $\frac{1}{useed}$. Between maxpos and ±∞, a new bit value can be $maxpos \times useed$, and between zero and minpos, a new bit value can be $\frac{minpos}{useed}$. These new bit values can correspond to a new regime bit 335. Between existing values $x = 2^{m}$ and $y = 2^{n}$, where m and n differ by more than one, the new bit value can be given by the geometric mean $\sqrt{x \times y} = 2^{(m+n)/2}$, which corresponds to a new exponent bit 337. Otherwise, the new bit value lies midway between the existing x and y values adjacent to it and can be given by the arithmetic mean $\frac{x + y}{2}$, which corresponds to a new mantissa bit 339.
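The interpolation rules above can be exercised on the positive values of the 3-bit posit of FIG. 4A. The sketch below is illustrative only and rests on stated assumptions: es is held at 2 (useed = 16) while bits are appended, the helper names `log2_exact`, `insert_between`, and `refine` are hypothetical, and m + n is assumed to be even whenever the geometric-mean rule applies.

```python
from fractions import Fraction


def log2_exact(v):
    """Return e if v == 2**e exactly, otherwise None."""
    v = Fraction(v)
    n, d = v.numerator, v.denominator
    if n > 0 and (n & (n - 1)) == 0 and (d & (d - 1)) == 0:
        return n.bit_length() - d.bit_length()
    return None


def insert_between(lo, hi, useed):
    """New value created between neighbours lo and hi (hi=None means ±∞)."""
    if hi is None:                       # between maxpos and ±∞
        return Fraction(lo) * useed
    if lo == 0:                          # between zero and minpos
        return Fraction(hi) / useed
    m, n = log2_exact(lo), log2_exact(hi)
    if m is not None and n is not None and n - m > 1:
        return Fraction(2) ** ((m + n) // 2)   # geometric mean sqrt(lo*hi)
    return (Fraction(lo) + Fraction(hi)) / 2   # arithmetic mean


def refine(values, useed):
    """Append one bit: interleave a new value after every existing value."""
    out = []
    for lo, hi in zip(values, values[1:] + [None]):
        out += [lo, insert_between(lo, hi, useed)]
    return out


three_bit = [Fraction(0), Fraction(1, 16), Fraction(1), Fraction(16)]
four_bit = refine(three_bit, 16)
five_bit = refine(four_bit, 16)
print([str(v) for v in four_bit])
```

Under these assumptions the 4-bit values include 1/4 and 4 between the existing powers of useed, and the largest value grows from 16 to 256 and then to 4096.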

FIG. 4B is an example of posit construction using two exponent bits. In FIG. 4B, only the right half of the projective real numbers is shown; however, it will be appreciated that negative projective real numbers corresponding to the positive counterparts shown in FIG. 4B can exist on a curve that is the reflection about the y-axis of the curves shown in FIG. 4B. The posits 431-1, 431-2, 431-3 shown in FIG. 4B each include only two exception values: zero (0), when all the bits of the bit string are zero, and ±∞, when the bit string is a one (1) followed by all zeros. Note that the numerical values of the posits 431-1, 431-2, 431-3 shown in FIG. 4B are exactly $useed^{k}$; that is, they are exactly useed raised to the power of the k value represented by the regime (e.g., the regime bits 335 described above in connection with FIG. 3). In FIG. 4B, the posit 431-1 has es = 2, so $useed = 2^{2^{es}} = 16$; the posit 431-2 has es = 3, so $useed = 2^{2^{es}} = 256$; and the posit 431-3 has es = 4, so $useed = 2^{2^{es}} = 65536$.

As an illustrative example of adding bits to the 3-bit posit 431-1 to create the 4-bit posit 431-2 of FIG. 4B, useed = 256, so the bit string corresponding to the useed of 256 has an additional regime bit appended to it, and the former useed, 16, has a terminating regime bit ($\bar{r}$) appended to it. As described above, between existing values, the corresponding bit strings have an additional exponent bit appended to them. For example, the numerical values 1/16, 1/4, 1, and 4 will have an exponent bit appended to them. That is, the final one corresponding to the numerical value 4 is an exponent bit, the final zero corresponding to the numerical value 1 is an exponent bit, and so on. This pattern can be further seen in the posit 431-3, which is a 5-bit posit generated from the 4-bit posit 431-2 according to the rules above. If another bit were added to the posit 431-3 in FIG. 4B to generate a 6-bit posit, mantissa bits 339 would be appended to the numerical values between 1/16 and 16.

A non-limiting example of decoding a posit (e.g., a posit 431) to obtain its numerical equivalent follows. In some embodiments, the bit string corresponding to a posit p is a signed integer ranging from $-2^{n-1}$ to $2^{n-1}$, k is an integer corresponding to the regime bits 335, and e is an unsigned integer corresponding to the exponent bits 337. If the set of mantissa bits 339 is represented as $\{f_1 f_2 \ldots f_{fs}\}$ and f is the value represented by $1.f_1 f_2 \ldots f_{fs}$ (e.g., by a one followed by a binary point followed by the mantissa bits 339), then the numerical value x of p can be given by Equation 1 below.

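Equation 1

$$x = \begin{cases} 0, & p = 0 \\ \pm\infty, & p = -2^{n-1} \\ \operatorname{sign}(p) \times useed^{k} \times 2^{e} \times f, & \text{all other } p \end{cases}$$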

A further illustrative example of decoding a posit bit string is provided below in connection with the posit bit string 0000110111011101 shown in Table 3.

Table 3

Sign   Regime   Exponent   Mantissa
0      0001     101        11011101

In Table 3, the posit bit string 0000110111011101 is broken up into its constituent sets of bits (e.g., the sign bit 333, the regime bits 335, the exponent bits 337, and the mantissa bits 339). Since es = 3 in the posit bit string shown in Table 3 (e.g., because there are three exponent bits), $useed = 2^{2^{es}} = 256$. Because the sign bit 333 is zero, the value of the numerical expression corresponding to the posit bit string shown in Table 3 is positive. The regime bits 335 have a run of three consecutive zeros, which corresponds to a value of -3 (as described above in connection with Table 1). As a result, the scale factor contributed by the regime bits 335 is $256^{-3}$ (e.g., $useed^{k}$). The exponent bits 337 represent five (5) as an unsigned integer and therefore contribute an additional scale factor of $2^{e} = 2^{5} = 32$. Lastly, the mantissa bits 339, given in Table 3 as 11011101, represent two hundred twenty-one (221) as an unsigned integer, so the mantissa bits 339, given above as f, are

$$f = 1 + \frac{221}{256}.$$

Using these values and Equation 1, the numerical value corresponding to the posit bit string given in Table 3 is

$$+256^{-3} \times 2^{5} \times \left(1 + \frac{221}{256}\right) = \frac{477}{134217728} \approx 3.55393 \times 10^{-6}.$$
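The decoding walk-through above can be checked with a short Python sketch. It is only an illustration, not the circuitry the disclosure describes: the function name `decode_posit` is hypothetical, the sketch handles only a zero sign bit, and it assumes the rule used in the worked example (value = useed^k × 2^e × f, with useed = 2^(2^es)). Exact rational arithmetic is used so that the Table 3 result can be compared without rounding.

```python
from fractions import Fraction


def decode_posit(bits: str, es: int) -> Fraction:
    """Decode a posit bit string with a zero sign bit into an exact Fraction.

    Hypothetical helper for illustration; negative posits and the
    infinity pattern are deliberately left out of this sketch.
    """
    n = len(bits)
    if bits == "0" * n:
        return Fraction(0)
    if bits[0] == "1":
        raise ValueError("this sketch only handles a zero sign bit")

    body = bits[1:]
    # Regime: a run of identical bits, ended by the opposite bit
    # (or by the end of the string).
    first = body[0]
    run = len(body) - len(body.lstrip(first))
    k = run - 1 if first == "1" else -run

    rest = body[run + 1:]      # skip the terminating regime bit
    exp_bits = rest[:es]       # up to es exponent bits
    frac_bits = rest[es:]      # whatever remains is the mantissa

    # Exponent bits cut off by the end of the string are padded with zeros.
    e = int(exp_bits.ljust(es, "0"), 2) if es else 0
    # f = 1.f1 f2 ... (a hidden one followed by the mantissa bits).
    f = 1 + (Fraction(int(frac_bits, 2), 2 ** len(frac_bits)) if frac_bits else 0)

    useed = 2 ** (2 ** es)
    return Fraction(useed) ** k * 2 ** e * f


# The Table 3 bit string: sign 0, regime 0001 (k = -3), exponent 101, mantissa 11011101.
value = decode_posit("0000110111011101", es=3)
print(value)          # 477/134217728
print(float(value))   # approximately 3.55393e-06
```

Running the sketch on the Table 3 bit string reproduces the regime value of -3, the exponent value of 5, and the mantissa value of 221 described above.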

FIG. 5A is a functional block diagram 550 in the form of peripheral sense amplifiers 512, a memory array 530, a plurality of multiplexers 545, and a plurality of arithmetic logic units (ALUs) 526 in accordance with a number of embodiments of the present disclosure. The neuromorphic memory array 530 can receive data from a host (e.g., the host 102, 202 in FIGS. 1, 2A, and 2B) or from another external device. The data can include bit strings in a particular format (e.g., a unum or posit format). Data in the posit format can be stored in the peripheral sense amplifiers (PSAs) 512 and/or in other locations within a memory device (such as the memory device 104 in FIG. 1). The PSAs 512 can be coupled to a plurality of multiplexers ("MUXs") 545-1, 545-2, 545-3 (hereinafter referred to collectively as MUXs 545), and the data in the posit format can be sent through the MUXs 545 to neuron components 525-1, 525-2, 525-3, 525-4, 525-5, 525-6, 525-7, 525-8, 525-8 (hereinafter referred to collectively as neuron components 525). The neuron components 525 can perform a number of neuromorphic operations on the data. In addition, the neuron components 525 can send the data and/or the results of the neuromorphic operations performed on the data to additional devices. For example, the neuron components 525 can be coupled to a plurality of arithmetic logic units (ALUs) 526-1, 526-2, 526-3 (hereinafter referred to as ALUs 526). The neuron components 525 can send data or results to the ALUs 526 for performance of additional operations (e.g., RELU operations, sigma operations, etc.).

In one embodiment, the neuromorphic operations can be performed using the neuron components 525 as a neural network. Some neuromorphic systems can use resistive RAM (RRAM), such as PCM devices or self-selecting memory devices, to store a value (or weight) of a synapse (e.g., a synaptic weight). Such variable resistance memory can include memory cells that are configured to store multiple levels and/or that have wide sense windows. These types of memory can be configured to perform training operations by pulse (e.g., spike) control. Such training operations can include spike-timing-dependent plasticity (STDP). STDP can be a form of Hebbian learning induced by the correlation between spikes transmitted between nodes (e.g., neurons). STDP can be an example of a process that adjusts the strength of connections between nodes (e.g., neurons).
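As a rough illustration of how STDP can adjust connection strength, the following Python sketch applies a common textbook pair-based rule with an exponential timing window. The disclosure does not specify an update rule, so the rule itself, the function names `stdp_delta` and `update_weight`, and every parameter value (amplitudes, time constant, weight bounds) are assumptions chosen only for illustration.

```python
import math


def stdp_delta(dt_ms: float, a_plus: float = 0.01, a_minus: float = 0.012,
               tau_ms: float = 20.0) -> float:
    """Weight change for one pre/post spike pair (assumed pair-based rule).

    dt_ms is (post-synaptic spike time) - (pre-synaptic spike time).
    """
    if dt_ms > 0:
        # Pre fires before post (causal): strengthen the connection.
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:
        # Post fires before pre (anti-causal): weaken the connection.
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0


def update_weight(w: float, dt_ms: float, w_min: float = 0.0,
                  w_max: float = 1.0) -> float:
    """Apply the change and clamp, mimicking a bounded cell conductance."""
    return min(w_max, max(w_min, w + stdp_delta(dt_ms)))


w = 0.5
w = update_weight(w, dt_ms=5.0)    # causal pair: weight goes up
w = update_weight(w, dt_ms=-5.0)   # anti-causal pair: weight goes down
print(w)
```

The clamp stands in for the finite range of conductance levels a variable resistance cell can hold; a real device would quantize the weight into discrete levels rather than a continuous value.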

In neural networks, a synaptic weight refers to the strength or amplitude of a connection between two nodes (e.g., neurons). The nature and content of information transmitted through a neural network can be based in part on the properties of the connections, which represent synapses formed between the nodes. For example, a property of the connections can be the synaptic weights. Among other things, neuromorphic systems and devices can be designed to achieve results that may not be possible with traditional computer architectures. For example, neuromorphic systems can be used to achieve results more commonly associated with biological systems, such as learning, vision or visual processing, auditory processing, advanced computing, or other processes, or a combination thereof. As an example, the synaptic weight and/or the connections between at least two memory cells can represent a synapse, or the strength or degree of connectivity of the synapse, and can be associated with a respective short-term or long-term connection that corresponds to the biological occurrence of short-term and long-term memory. As described below, a series of neural network operations can be performed to increase the synaptic weight between at least two memory cells in a short-term or a long-term manner, depending on which type of memory cell is used.

A learning event of a neural network operation can represent the causal propagation of spikes between neurons, enabling a weight increase for the connecting synapses. A weight increase of a synapse can be represented by an increase in the conductivity of a memory cell. A variable resistance memory array (e.g., a 3D cross-point or self-selecting memory (SSM) array) can mimic an array of synapses, each characterized by a weight, or memory cell conductance. The greater the conductance, the greater the synaptic weight and the higher the degree of memory learning. Short-term memory learning can be fast and/or reversible memory learning in which the analog weight of a synapse is enhanced, that is, its electrical conductivity is increased by a reversible mechanism. Long-term memory learning can be slow and/or irreversible memory learning in which the cell conductance is irreversibly increased for a particular state (e.g., SET or RESET), resulting in longer-lasting memory that is not forgotten and that comes from experience-dependent learning.

Neuromorphic operations can be used to mimic neuro-biological architectures that may be present in a nervous system and/or to store synaptic weights associated with long-term and short-term learning or relationships. A memory apparatus can include a memory array that includes a first portion and a second portion. The first portion of the memory array can include a first plurality of variable resistance memory cells, and the second portion can include a second plurality of variable resistance memory cells. The second portion can be degraded through forced write cycling. The degradation mechanism can include damage to the chalcogenide material. In some embodiments that include memory cells composed of a material other than a chalcogenide material, the degradation mechanism can include a thermal relationship between memory cells, control via control gate coupling between memory cells, charge loss corresponding to the memory cells, temperature-induced loss of signal or threshold, and the like.

These neuromorphic operations can be performed on the data received by the neuromorphic memory array 530. In anticipation of the neural network being used to detect a particular event represented by the data or by a pattern in the data, the neural network can receive a large amount of data (which may be referred to as analog weights 559 in FIG. 5B, described below) that is used to train the neural network (e.g., using data sent to the neuron components 525). In one embodiment, when data including bit strings is received, analog weights can be added to the data values to train the neural network for subsequent neuromorphic processing. Training can involve a large amount of data that enables the neural network of the neuromorphic memory array 530 to interpret data accurately and provide desired or helpful results. Using the neural network processing described above, the large amount of data can train the neural network of the neuromorphic memory array 530 to detect an event or pattern and to become more effective and efficient at doing so.

A result value of a neuromorphic operation can be sent out of the neuromorphic memory array 530 to the ALUs 526 and/or to other storage locations for further processing. In this way, multiple levels of processing can occur on the same portion of data. For example, a command to perform both arithmetic/logical operations and neuromorphic operations on a set of data can be sent from the host.

Depending on the order of processing, arithmetic operations can be performed while the data is outside the neuromorphic memory array 530. The result of an arithmetic operation can be sent to the neuromorphic memory array 530 for performance of neuromorphic operations. Subsequent to the neuromorphic operations being performed, the data can be sent from the neuron components 525 to other locations within the memory device. The result of a neuromorphic operation can be sent out of the neuron components 525, and subsequent arithmetic operations can be performed. The results of the neuromorphic operations and/or arithmetic operations can be sent back to the host.

As used herein, a neural network operation or neuromorphic operation can include operations performed to determine one or more hidden layers of at least one neural network. In general, a neural network can include at least one input layer, at least one hidden layer, and at least one output layer. Each layer can include multiple neurons, each of which can receive an input and generate a weighted output. In some embodiments, the neurons of the hidden layer(s) can calculate weighted sums and/or averages of the inputs received from the input layer(s) and their respective weights, and pass this information to the output layer(s).
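The weighted-sum step performed by hidden-layer neurons can be sketched in a few lines of Python. This is a minimal illustration on ordinary floating-point numbers rather than posit bit strings; the function names `weighted_sum`, `weighted_average`, and `hidden_layer` and the sample values are assumptions, not part of the disclosure.

```python
from typing import List, Sequence


def weighted_sum(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Sum of each input multiplied by its respective weight."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    return sum(x * w for x, w in zip(inputs, weights))


def weighted_average(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum normalized by the total weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return weighted_sum(inputs, weights) / total


def hidden_layer(inputs: Sequence[float],
                 weight_rows: Sequence[Sequence[float]]) -> List[float]:
    """One output per hidden neuron; each row holds that neuron's weights."""
    return [weighted_sum(inputs, row) for row in weight_rows]


# Two inputs feeding two hidden neurons.
outputs = hidden_layer([1.0, 2.0], [[0.5, 0.25], [1.0, -1.0]])
print(outputs)   # [1.0, -1.0]
```

Each row of `weight_rows` plays the role of one neuron's synaptic weights, and the returned list is what would be passed on to the output layer.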

In some embodiments, neural network or neuromorphic operations can be performed by using knowledge learned by trained neural networks to train untrained neural networks. This can reduce the amount of time and resources spent training the untrained neural networks by reducing retraining of information already learned by the trained neural networks. In addition, embodiments herein can allow a neural network trained under a particular training methodology to train an untrained neural network under a different training methodology. For example, a neural network can be trained under a TensorFlow methodology and can then train an untrained neural network under a MobileNet methodology. However, embodiments are not limited to these specific examples, and other training methodologies are contemplated within the scope of the present disclosure.

FIG. 5B is a functional block diagram in the form of a neuron component 525 that includes a multiplier 557, an accumulator 559, an internal ALU 551, and a register 555 in accordance with a number of embodiments of the present disclosure. The neuron component 525 can include neuron circuitry 552 configured to operate the multiplier 557, the accumulator 559, the internal ALU 551, and the register 555 to perform neuromorphic operations. The neuron circuitry 552 can include hardware, logic, or one or more processing devices used to perform the neuromorphic operations. As an example, the neuron component 525 can receive data from a multiplexer (e.g., the MUX 545 in FIG. 5A), and the neuron circuitry 552 of the neuron component 525 can cause the data to be sent as an input to the multiplier 557. The neuron circuitry 552 can cause the multiplier 557 to perform a multiplication operation on the received data values and to send the result of the multiplication operation as an output to an adder 559. The adder 559 can be coupled to the bit string register 555. The bit string register 555 can store a data value that indicates an overflow of the result of neuromorphic operations using posits. The internal ALU (e.g., "mini-ALU") 551 can provide an input along with the data that the multiplier 557 receives. In addition, an accumulator ("ACCUM") 553 can accumulate the output of the adder 559 and provide it as an input to the adder 559.

In this manner, the neuron component 525 can use the neuron circuitry 552 to operate the multiplier 557, the adder 559, the accumulator 553, and the internal ALU 551 to perform neuromorphic operations on data and store the results in the bit string register 555. The output of the bit string register 555 can be sent to an ALU (such as the ALU 526 in FIG. 5A) for additional operations to be performed on the data (e.g., a RELU operation, a sigma operation, etc.).
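The multiplier, adder, accumulator, and register datapath described for the neuron component can be modeled in software as a simple multiply-accumulate loop. The sketch below is an assumption-laden illustration: the class name `NeuronSketch`, its attributes, and the separate `relu` helper are hypothetical, exact fractions stand in for posit bit strings, and overflow handling in the register is not modeled.

```python
from fractions import Fraction


class NeuronSketch:
    """Software stand-in for a multiplier -> adder -> accumulator -> register path."""

    def __init__(self) -> None:
        self.accumulator = Fraction(0)   # running sum fed back into the adder
        self.register = Fraction(0)      # latest result, readable downstream

    def mac(self, data, weight) -> Fraction:
        product = Fraction(data) * Fraction(weight)   # multiplier stage
        total = product + self.accumulator            # adder: product plus fed-back sum
        self.accumulator = total                      # accumulator latches adder output
        self.register = total                         # register holds the result
        return total


def relu(value: Fraction) -> Fraction:
    """Example of a follow-on operation an external ALU might apply."""
    return value if value > 0 else Fraction(0)


neuron = NeuronSketch()
for data, weight in [(2, 3), (1, -10)]:
    neuron.mac(data, weight)

print(neuron.register)         # -4
print(relu(neuron.register))   # 0
```

Keeping the activation function outside the class mirrors the split described above, in which the register output is sent to a separate ALU for operations such as RELU.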

FIG. 5C is a functional block diagram of a portion of a memory array 530 in accordance with a number of embodiments of the present disclosure. The array 530 includes a plurality of memory cells 501-0 to 501-J (referred to generally as cells 501) coupled to rows of access lines 554-0, 554-1, 554-2, 554-3, 554-4, 554-5, 554-6, …, 554-R (referred to generally as access lines 554) and columns of sense lines 505-0, 505-1, 505-2, 505-3, 505-4, 505-5, 505-6, 505-7, …, 505-S (referred to generally as sense lines 556). The memory array 330 is not limited to a particular number of access lines and/or sense lines, and use of the terms "rows" and "columns" does not imply a particular physical structure and/or orientation of the access lines and/or sense lines. Although not pictured, each column of memory cells can be associated with a corresponding pair of complementary sense lines (e.g., the complementary sense lines 205-1 and 205-2 in FIG. 2A).

The memory array 530 can include a plurality of neurons 525-1, 525-2, each made up of a plurality of the memory cells 501. For example, as shown in FIG. 5C, sixteen memory cells can make up a neuron. However, examples are not so limited, and any number of memory cells can make up a neuron. Each of the memory cells 501 can make up components of a neuron, as shown in FIG. 5B.

Each column of memory cells can be coupled to sensing circuitry (e.g., sense amplifiers 547). In this example, the sensing circuitry includes a number of sense amplifiers 547-0, 547-1, 547-2, 547-3, 547-4, 547-5, 547-6, 547-7, …, 547-U (referred to generally as sense amplifiers 547) coupled to the respective sense lines 556-0, 556-1, 556-2, 556-3, 556-4, 556-5, 556-6, 556-7, …, 556-S. The sense amplifiers 547 are coupled to an input/output (I/O) line 334 (e.g., a local I/O line) via access devices (e.g., transistors) 549-0, 549-1, 549-2, 549-3, 549-4, 549-5, 549-6, 549-7, …, 549-V. In this example, the sensing circuitry can also include a number of compute components 331-0, 331-1, 331-2, 331-3, 331-4, 331-5, 331-6, 331-7, …, 331-X corresponding to the respective sense amplifiers 547 and coupled to the respective sense lines 556. Column decode lines 558-1 to 558-W are coupled to the gates of the transistors 549-1 to 549-V, respectively, and can be selectively activated to transfer data sensed by the respective sense amplifiers 547-0 to 547-U and/or stored in the respective compute components 331-0 to 331-X to a secondary sense amplifier 561. In a number of embodiments, the compute components 331 can be formed on pitch with the memory cells of their corresponding columns and/or with the corresponding sense amplifiers 547.

In a number of embodiments, the sensing circuitry (e.g., the compute components 331 and the sense amplifiers 547) is configured to perform a signed division operation on elements stored in the array 330. As an example, a plurality of elements, each comprising four data units (e.g., 4-bit elements), can be stored in a plurality of memory cells. A first 4-bit element of the plurality of elements can be stored in a first group of memory cells coupled to a first sense line (e.g., 556-0) and to a number of access lines (e.g., 554-0, 554-1, 554-2, 554-3), and a second element can be stored in a second group of memory cells coupled to a second sense line (e.g., 556-1) and to the number of access lines (e.g., 554-0, 554-1, 554-2, 554-3). As such, the first element and the second element are stored in a first column and a second column of memory cells, respectively.
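The column-wise layout described above, where each 4-bit element occupies one sense line and spans four access lines, can be pictured with a small Python sketch. The helper names `store_vertically` and `read_column` are hypothetical, the elements are treated as unsigned values, and placing the most significant bit on the first access line is an assumption; the text does not state the bit order, and the signed division itself is not modeled.

```python
from typing import List, Sequence


def store_vertically(elements: Sequence[int], width: int = 4) -> List[List[int]]:
    """Return rows[r][c]: the bit of element c held on access line r.

    Row 0 holds the most significant bit (an assumed ordering).
    """
    for el in elements:
        if not 0 <= el < (1 << width):
            raise ValueError(f"element {el} does not fit in {width} bits")
    return [[(el >> (width - 1 - r)) & 1 for el in elements]
            for r in range(width)]


def read_column(rows: Sequence[Sequence[int]], column: int) -> int:
    """Reassemble the element stored on one sense line (column)."""
    value = 0
    for row in rows:
        value = (value << 1) | row[column]
    return value


# Two 4-bit elements: the first on column 0, the second on column 1.
rows = store_vertically([0b1010, 0b0011])
for r, row in enumerate(rows):
    print(f"access line {r}: {row}")
print(read_column(rows, 0), read_column(rows, 1))   # 10 3
```

Activating one access line exposes the same bit position of every element at once, which is what allows per-column sensing circuitry to operate on many elements in parallel.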

FIG. 6 is a functional block diagram in the form of an apparatus 600 including control circuitry 620 in accordance with a number of embodiments of the present disclosure. The control circuitry 620 can include logic circuitry 622 and a memory resource 624, which can be analogous to the processing resource 122 and the memory resource 124 illustrated in FIG. 1, herein. The logic circuitry 622 and/or the memory resource 624 can separately be considered an "apparatus."

The control circuitry 620 can be configured to receive a bit string in the posit format from a host (e.g., the host 102/202 illustrated in FIGS. 1, 2A, and 2B, herein). In some embodiments, the posit bit string can be stored in the memory resource 624. Once the bit string has been received by the control circuitry 620, arithmetic and/or logical operations can be performed on the posit bit string in the absence of intervening commands from the host and/or a controller. For example, the control circuitry 620 can include sufficient processing resources and/or instructions to perform arithmetic and/or logical operations on the bit strings stored in the memory resource 624 without receiving additional commands from circuitry external to the control circuitry 620.

The logic circuitry 622 can be an arithmetic logic unit (ALU), a state machine, a sequencer, a controller, an instruction set architecture (ISA), or another type of control circuitry. As described above, an ALU can include circuitry to perform operations such as those described above (e.g., arithmetic operations, logical operations, bitwise operations, etc.) on integer binary numbers, such as bit strings in the posit format. An instruction set architecture (ISA) can include a reduced instruction set computing (RISC) device. In embodiments in which the logic circuitry 622 includes a RISC device, the RISC device can include a processing resource that can employ an ISA such as a RISC-V ISA; however, embodiments are not limited to RISC-V ISAs, and other processing devices and/or ISAs can be used.

In some embodiments, the logic circuitry 622 can be configured to execute instructions (e.g., instructions stored in the INSTR 625 portion of the memory resource 624) to perform the operations above. For example, the logic circuitry 622 is provisioned with sufficient processing resources to cause performance of arithmetic and/or logical operations on the data (e.g., on the bit strings) received by the control circuitry 620.

Once the arithmetic and/or logical operation(s) are performed by the logic circuitry 622, the resultant bit strings can be stored in the memory resource 624 and/or in a memory array (e.g., the neuromorphic memory array 230 illustrated in FIG. 2A, herein). The stored resultant bit strings can be addressed such that they are accessible for performance of the operations. For example, the bit strings can be stored in the memory resource 624 and/or the memory array at particular physical addresses (which may have corresponding logical addresses) so that the bit strings can be accessed in performing the operations.

In some embodiments, the memory resource 624 can be a memory resource such as random-access memory (e.g., RAM, SRAM, etc.). Embodiments are not so limited, however, and the memory resource 624 can include various registers, caches, buffers, and/or memory arrays (e.g., 1T1C, 2T2C, 3T, etc. DRAM arrays). The memory resource 624 can be configured to receive a bit string (e.g., a bit string in the posit format) from, for example, a host such as the host 102/202 illustrated in FIGS. 1 and 2A and/or a memory array such as the neuromorphic memory array 230 illustrated in FIG. 2A, herein. In some embodiments, the memory resource 624 can have a size of approximately 256 kilobytes (KB); however, embodiments are not limited to this particular size, and the memory resource 624 can have a size greater than, or less than, 256 KB.

The memory resource 624 can be partitioned into one or more addressable memory regions. As shown in FIG. 6, the memory resource 624 can be partitioned into addressable memory regions so that various types of data can be stored therein. For example, one or more memory regions can store instructions ("INSTR") 625 used by the memory resource 624, one or more memory regions can store data 626-1, . . ., 626-N (e.g., data such as a bit string retrieved from the host and/or the memory array), and/or one or more memory regions can serve as a local memory ("LOCAL MEM") 628 portion of the memory resource 624. Although twenty distinct memory regions are shown in FIG. 6, it will be appreciated that the memory resource 624 can be partitioned into any number of distinct memory regions.

As discussed above, the bit string(s) can be retrieved from the host and/or the memory array in response to messages and/or commands generated by the host, a controller (e.g., the controller 210 illustrated in FIG. 2A, herein), or the logic circuitry 622. In some embodiments, the commands and/or messages can be processed by the logic circuitry 622. Once the bit string(s) are received by the control circuitry 620 and stored in the memory resource 624, they can be processed by the logic circuitry 622. Processing the bit string(s) by the logic circuitry 622 can include performing arithmetic operations and/or logical operations on the received bit string.

In a non-limiting neural network training application, the control circuitry 620 can receive a floating analog bit string that has been converted from an 8-bit posit with es = 0. In contrast to some approaches that use an 8-bit posit bit string with es = 0, the analog bit string can provide neural network training results faster than the 8-bit posit bit string.

FIG. 7 is a flow diagram representing an example method 770 for neuromorphic operations on posits in accordance with a number of embodiments of the present disclosure. At block 772, the method 770 may include receiving a data value at a neuron component of a memory array from a multiplexer. The data value may include a bit string in a format that supports arithmetic operations to a particular level of precision. The format may be a posit format. In addition, the format may include a mantissa, a regime, a sign, and an exponent. The memory array may be analogous to the memory array 130 shown in FIG. 1 and the memory array 230 shown in FIG. 2A.
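To make the sign, regime, exponent, and mantissa fields concrete, the following is a minimal sketch of decoding a posit bit string into an exact value. It assumes the commonly published posit layout (sign bit first, then a run-length-encoded regime, then up to es exponent bits, then the mantissa), which this section does not spell out; the function name `decode_posit` and its parameters are hypothetical and are not taken from the disclosure.

```python
from fractions import Fraction


def decode_posit(bits, n=8, es=0):
    """Decode an n-bit posit pattern into an exact Fraction (None for NaR)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # "not a real" pattern: sign bit set, all other bits clear
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are stored in two's complement
    remaining = n - 1
    body = bits & ((1 << remaining) - 1)  # everything after the sign bit

    # Regime: a run of identical bits, ended by the opposite bit or the end of the string.
    first = (body >> (remaining - 1)) & 1
    run = 0
    while run < remaining and ((body >> (remaining - 1 - run)) & 1) == first:
        run += 1
    k = run - 1 if first else -run
    remaining -= min(run + 1, remaining)  # drop the run and its terminating bit
    tail = body & ((1 << remaining) - 1)

    # Exponent: up to es bits; bits cut off by the end of the string count as zero.
    exp_bits = min(es, remaining)
    exponent = (tail >> (remaining - exp_bits)) << (es - exp_bits)
    remaining -= exp_bits

    # Mantissa: whatever bits are left, with an implicit leading one.
    fraction = Fraction(tail & ((1 << remaining) - 1), 1 << remaining)
    magnitude = (1 + fraction) * Fraction(2) ** (k * (1 << es) + exponent)
    return -magnitude if sign else magnitude
```

Under these assumptions, the 8-bit, es = 0 pattern `0b01010000` decodes to 3/2, and the largest positive pattern `0b01111111` decodes to 64.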

At block 774, the method 770 may include performing a neuromorphic operation on the data value at the neuron component. The neuron component may be analogous to the neuron component 125 in FIG. 1, 225 in FIG. 2A, and 525 in FIGS. 5A-5C.

At block 776, the method 770 may include performing a multiply-accumulate (MAC) operation at the neuron component to generate a MAC result value. In some embodiments, performing the operation may include performing a neuromorphic operation on data written to a memory cell. In some examples, performing the operation on the data takes a shorter amount of time, or fewer sub-operations, than if the operation were performed on the same data represented in the first format. In some embodiments, the method 770 may further include performing a second MAC operation on the MAC result value to generate an additional MAC result value.

At block 778, the method 770 may include providing the MAC result value to a bit string register in the neuron component. The method 770 may include performing an operation using the data written to the memory cell to generate additional data in a second format. The method 770 may further include sending the MAC result value to a device external to the memory device. The device external to the memory device may be a host.
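The flow of blocks 772 through 778 can be sketched in software as follows. This is an illustration only: the `NeuronComponent` class and its method names are hypothetical, the weights are an assumed second operand that the method description does not name, and exact `Fraction` arithmetic stands in for the posit arithmetic that the neuron circuitry would perform.

```python
from fractions import Fraction


class NeuronComponent:
    """Hypothetical software stand-in for one neuron component."""

    def __init__(self):
        self.inputs = []
        self.accumulator = Fraction(0)
        self.bit_string_register = None

    def receive(self, data_values):
        # Block 772: data values arrive (from a multiplexer in the disclosure).
        self.inputs = [Fraction(v) for v in data_values]

    def mac(self, weights):
        # Blocks 774 and 776: multiply each input by its weight and accumulate.
        for x, w in zip(self.inputs, weights):
            self.accumulator += x * Fraction(w)
        return self.accumulator

    def store(self):
        # Block 778: provide the MAC result value to the bit string register.
        self.bit_string_register = self.accumulator
        return self.bit_string_register


neuron = NeuronComponent()
neuron.receive([1, 2, 3])
neuron.mac([Fraction(1, 2), Fraction(1, 4), 2])
result = neuron.store()  # 1*(1/2) + 2*(1/4) + 3*2 = 7
```

Because the accumulator is not cleared between calls, calling `mac` again continues from the stored result, which loosely corresponds to performing a second MAC operation on the MAC result value.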

In some embodiments, the method 770 may further include performing an additional neuromorphic operation on an additional data value at an additional neuron component to generate an additional MAC result value, the additional data value being related to the data value. The method may further include providing the additional MAC result value to an additional bit string register in the additional neuron component, and combining the MAC result value and the additional MAC result value. In some embodiments, combining the MAC result value and the additional MAC result value may increase the precision of the results of the neuron component and the additional neuron component.

In some embodiments, the method 770 may further include performing a plurality of MAC operations within a plurality of respective neuron components. The quantity of the plurality of respective neuron components may correspond to a precision of data values stored across a plurality of respective bit string registers within each of the plurality of respective neuron components. The data values stored across the plurality of respective bit string registers may be combined to produce a larger value that yields the precision. In some embodiments, the method 770 may further include performing at least a portion of the MAC operation in an internal ALU within the neuron component.
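One way to picture values stored across several bit string registers being combined into a larger value is plain concatenation of fixed-width chunks. The disclosure does not state how the combination is carried out, so the sketch below is an assumption: the helper names, the 8-bit register width, and the least-significant-first ordering are all hypothetical.

```python
def split_across_registers(value, register_bits, count):
    """Split a non-negative integer into `count` chunks, least significant first."""
    if value < 0 or value >= 1 << (register_bits * count):
        raise ValueError("value does not fit in the available registers")
    mask = (1 << register_bits) - 1
    return [(value >> (i * register_bits)) & mask for i in range(count)]


def combine_registers(registers, register_bits):
    """Concatenate register contents back into one wider value."""
    value = 0
    for i, chunk in enumerate(registers):
        value |= chunk << (i * register_bits)
    return value


# Four assumed 8-bit registers together hold one 32-bit value.
registers = split_across_registers(0x12345678, register_bits=8, count=4)
wide_value = combine_registers(registers, register_bits=8)
```

In this picture, adding neuron components adds registers, so the number of components sets how many bits the combined value can carry.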

Although specific embodiments have been illustrated and described herein, those of ordinary skill in the art will appreciate that an arrangement calculated to achieve the same results can be substituted for the specific embodiments shown. This disclosure is intended to cover adaptations or variations of one or more embodiments of the present disclosure. It is to be understood that the above description has been made in an illustrative fashion, and not a restrictive one. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of ordinary skill in the art upon reviewing the above description. The scope of the one or more embodiments of the present disclosure includes other applications in which the above structures and processes are used. Therefore, the scope of one or more embodiments of the present disclosure should be determined with reference to the appended claims, along with the full range of equivalents to which such claims are entitled.

In the foregoing Detailed Description, some features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the disclosed embodiments of the present disclosure have to use more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.

Claims

1. An apparatus, comprising:
a memory array comprising a plurality of memory cells configured to store data comprising a plurality of bit strings; and
a processing device residing on the memory array and configured to perform neuromorphic operations using at least one bit string of the plurality of bit strings in an input layer of the neuron component.

2. The apparatus of claim 1, wherein each of the plurality of bit strings is in a format that supports arithmetic operations at a respective level of precision.

3. The apparatus of claim 2, wherein the format supporting the arithmetic operations is a universal number format.

4. The apparatus of claim 2, wherein the format supporting the arithmetic operations is a Type III universal number format or a posit format.

5. The apparatus of claim 2, wherein the format supporting the arithmetic operations includes a mantissa, a sign, a regime, and an exponent.

6. The apparatus of any one of claims 1 to 5, wherein the neuron component comprises:
an internal ALU component configured to perform at least some of the MAC operations within the neuron component;
an accumulator; and
a bit string register,
wherein the bit string register is configured to determine a particular level of precision.

7. The apparatus of claim 6, wherein the internal ALU component is configured to perform MAC operations that are performed at a particular frequency or a particular number of times.

8. The apparatus of any one of claims 1 to 5, wherein:
the neuromorphic operations include neural network operations or machine learning operations;
the neuron component is configured to transmit an output resulting from performing at least one of the neuromorphic operations to external circuitry; and
the external circuitry is an arithmetic logic unit (ALU).

9. A method, comprising:
receiving a data value at a neuron component of a memory array from a multiplexer;
performing, using a processing device of the neuron component, a neuromorphic operation at the neuron component on the data value, the data value including at least one bit string of a plurality of bit strings in an input layer of the neuron component, wherein performing the neuromorphic operation comprises generating a MAC result value by performing a multiply-accumulate (MAC) operation in the neuron component; and
providing the MAC result value to a bit string register in the neuron component.

10. The method of claim 9, further comprising generating an additional MAC result value by performing a second MAC operation on the MAC result value.

11. The method of claim 9, further comprising:
performing an additional neuromorphic operation on an additional data value in an additional neuron component, the additional data value being related to the data value, to generate an additional MAC result value;
providing the additional MAC result value to an additional bit string register in the additional neuron component; and
combining the MAC result value and the additional MAC result value.

12. The method of claim 11, wherein combining the MAC result value and the additional MAC result value comprises increasing precision of the neuron component and the result of the additional neuron component.

13. The method of claim 11, further comprising performing a plurality of MAC operations within a plurality of respective neuron components, wherein:
the quantity of the plurality of respective neuron components corresponds to a precision of data values stored across a plurality of respective bit string registers within each of the plurality of respective neuron components; and
the data values stored across the plurality of respective bit string registers are combined to produce a larger value that yields the precision.

14. The method of any one of claims 9-13, wherein at least part of the MAC operations are performed in an internal ALU within the neuron component.

15. A system, comprising:
a plurality of peripheral sense amplifiers (PSAs);
a plurality of neuron components coupled to the plurality of PSAs through a plurality of multiplexers, each of the plurality of neuron components including neuron circuitry; and
a plurality of arithmetic logic units (ALUs) coupled to respective ones of the plurality of neuron components,
wherein the plurality of neuron components are configured to use the respective neuron circuitry to:
receive a data value from the PSAs via the plurality of multiplexers, the data value comprising at least one bit string;
perform a neuromorphic operation on the data value; and
output a result of performing the neuromorphic operation, wherein the output indicates a degree of neural network learning.

16. The system of claim 15, wherein each of the plurality of neuron components comprises:
a multiplier;
an accumulator;
an internal arithmetic logic unit (ALU); and
a bit string register.

17. The system of claim 16, wherein:
the neuromorphic operation is performed within one of the plurality of neuron components using the respective multiplier, accumulator, and internal ALU;
each bit string register is configured to be used to store an overflow of the output as a posit value; and
the neuron circuitry is configured to cause the overflow of the output to be sent to the internal ALU to perform an additional neuromorphic operation in response to a threshold number of neuromorphic operations being performed.

18. The system of any one of claims 15-17, wherein the neuron circuitry is configured to send the output to an external ALU, and the external ALU is configured to perform additional operations on the output.

19. The system of claim 18, wherein the external ALU is configured to perform a RELU operation or a sigma operation on the output.

20. The system of claim 18, wherein the neuron circuitry is configured to send the output to the external ALU in response to a threshold number of neuromorphic operations being performed.