KR102440692B1

KR102440692B1 - Accumulation of bit strings at the perimeter of a memory array

Info

Publication number: KR102440692B1
Application number: KR1020227000052A
Authority: KR
Inventors: Vijay S. Ramesh
Original assignee: Micron Technology, Inc.
Priority date: 2019-06-04
Filing date: 2020-04-17
Publication date: 2022-09-07
Also published as: CN113924622A; CN113924622B; KR20220003674A; EP3980996A4; WO2020247077A1; EP3980996A1

Abstract

Accumulation of bit strings at the periphery of a memory array is described. Control circuitry (e.g., a processing device) can be used to control performance of operations using bit strings within a memory device. Results of the operations can be accumulated in circuitry at the periphery of a memory array of the memory device. For example, a plurality of sense amplifiers can be coupled to the memory array and to the processing device. The quantity of sense amplifiers can be the same as the quantity of rows or columns of the array. The processing device can be configured to perform a recursive operation using one or more bit strings formatted according to a Type III universal number (unum) format or a posit format. The processing device can be further configured to accumulate, in the plurality of sense amplifiers, a resultant bit string that represents the result of an iteration of the recursive operation.

Description

Accumulation of bit strings at the perimeter of a memory array

The present invention relates generally to semiconductor memory and methods, and more particularly to apparatuses, systems, and methods for accumulation of bit strings at the periphery of a memory array.

Memory devices are typically provided as internal, semiconductor, integrated circuits in computers or other electronic systems. There are many different types of memory, including volatile and non-volatile memory. Volatile memory can require power to maintain its data (e.g., host data, error data, etc.) and includes random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), synchronous dynamic random access memory (SDRAM), and thyristor random access memory (TRAM), among others. Non-volatile memory can provide persistent data by retaining stored data when not powered and can include NAND flash memory, NOR flash memory, and resistance variable memory such as phase change random access memory (PCRAM), resistive random access memory (RRAM), and magnetoresistive random access memory (MRAM), such as spin torque transfer random access memory (STT RAM), among others.

Memory devices can be coupled to a host (e.g., a host computing device) to store data, commands, and/or instructions for use by the host while the computer or electronic system is operating. For example, data, commands, and/or instructions can be transferred between the host and the memory device(s) during operation of a computing or other electronic system.

FIG. 1 is a functional block diagram in the form of an apparatus including a host and a memory device in accordance with a number of embodiments of the present invention.
FIG. 2A is a functional block diagram in the form of a computing system including an apparatus including a host and a memory device in accordance with a number of embodiments of the present invention.
FIG. 2B is another functional block diagram in the form of a computing system including a host, a memory device, an application-specific integrated circuit, and a field-programmable gate array in accordance with a number of embodiments of the present invention.
FIG. 3 is an example of an n-bit posit with es exponent bits.
FIG. 4A is an example of positive values for a 3-bit posit.
FIG. 4B is an example of posit construction using two exponent bits.
FIG. 5 is a functional block diagram in the form of control circuitry in accordance with a number of embodiments of the present invention.
FIG. 6 is a block diagram illustrating an example of accumulation of bit strings at the periphery of a memory array in accordance with a number of embodiments of the present invention.
FIG. 7 is a flow diagram illustrating an example method for accumulation of bit strings at the periphery of a memory array in accordance with a number of embodiments of the present invention.

Systems, apparatuses, and methods related to accumulation of bit strings at the periphery of a memory array are described. Control circuitry (e.g., a processing device) can be used to control performance of operations using bit strings within a memory device. Results of the operations can be accumulated in circuitry at the periphery of a memory array of the memory device. For example, a plurality of sense amplifiers can be coupled to the memory array and to the processing device. The quantity of sense amplifiers can be the same as the quantity of rows or columns of the array. The processing device can be configured to perform a recursive operation using one or more bit strings formatted according to a Type III universal number format or a posit format. The processing device can be further configured to accumulate, in the plurality of sense amplifiers, a resultant bit string that represents the result of an iteration of the recursive operation.

Computing systems can perform a wide range of operations, including various calculations that can require differing degrees of accuracy. However, computing systems have a finite amount of memory in which to store the operands on which calculations are to be performed. To perform operations on stored operands within the constraints imposed by finite memory resources, the operands can be stored in particular formats. One such format is referred to as the "floating-point" format, or "float" for simplicity (e.g., the IEEE 754 floating-point format).

Under the floating-point standard, bit strings such as strings of binary numbers (e.g., bit strings that can represent a number) are represented in terms of three sets of integers or sets of bits: a set of bits referred to as a "base", a set of bits referred to as an "exponent", and a set of bits referred to as a "mantissa" (or significand). The sets of integers or sets of bits that define the format in which a binary number string is stored may be referred to herein as a "numeric format", or "format" for simplicity. For example, the three sets of bits described above that define a floating-point bit string (e.g., the base, exponent, and mantissa) may be referred to as a format (e.g., a first format). As described in more detail below, a posit bit string can include four sets of integers or sets of bits (e.g., a sign, a regime, an exponent, and a mantissa), which may also be referred to as a "numeric format" or "format" (e.g., a second format). In addition, under the floating-point standard, two infinities (e.g., +∞ and -∞) and/or two kinds of "NaN" (not-a-number), a quiet NaN and a signaling NaN, may be included in a bit string.
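The field layout of a floating-point bit string can be illustrated with a minimal sketch. It assumes the IEEE 754 binary32 layout (1 sign bit, 8 exponent bits, 23 mantissa bits, with the base fixed at 2); the function name float32_fields is hypothetical and is not taken from the specification.

```python
import struct
from typing import Tuple


def float32_fields(x: float) -> Tuple[int, int, int]:
    """Split a value into IEEE 754 binary32 (sign, exponent, mantissa) fields.

    Assumption: binary32 layout of 1 sign bit, 8 biased exponent bits,
    and 23 mantissa (fraction) bits; the base (2) is implicit.
    """
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned int.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                # most significant bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF       # low 23 bits, without the implicit leading 1
    return sign, exponent, mantissa


sign, exponent, mantissa = float32_fields(0.15625)  # 0.15625 = 1.25 * 2**-3
```

The three returned integers correspond to the sets of bits that the paragraph above calls a format; the bias of 127 and the implicit leading 1 are properties of binary32, not of the embodiments described here.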

The floating-point standard has been used in computing systems for many years and defines arithmetic formats, interchange formats, rounding rules, operations, and exception handling for computation carried out by many computing systems. Arithmetic formats can include binary and/or decimal floating-point data, which can include finite numbers, infinities, and/or special NaN values. Interchange formats can include encodings (e.g., bit strings) that may be used to exchange floating-point data. Rounding rules can include a set of properties that may be satisfied when rounding numbers during arithmetic operations and/or conversion operations. Floating-point operations can include arithmetic operations and/or other computational operations such as trigonometric functions. Exception handling can include indications of exceptional conditions, such as division by zero, overflow, etc.

An alternative format to floating-point is referred to as the "universal number" (unum) format. There are several forms of unum formats: Type I unums, Type II unums, and Type III unums, the last of which can be referred to as "posits" and/or "valids". Type I unums are a superset of the IEEE 754 standard floating-point format that use a "ubit" at the end of the mantissa to indicate whether a real number is an exact float or lies in the interval between adjacent floats. The sign, exponent, and mantissa bits in a Type I unum take their definition from the IEEE 754 floating-point format; however, the length of the exponent and mantissa fields of a Type I unum can vary dramatically, from a single bit to a maximum user-definable length. By taking the sign, exponent, and mantissa bits from the IEEE 754 standard floating-point format, Type I unums can behave similarly to floating-point numbers; however, the variable bit length exhibited in the exponent and fraction bits of a Type I unum can require additional management in comparison to floats.

Type II unums are generally incompatible with floats; however, Type II unums can permit a clean, mathematical design based on projected real numbers. A Type II unum can include n bits and can be described in terms of a "u-lattice" in which quadrants of a circular projection are populated with an ordered set of 2^(n-3) - 1 real numbers. The values of a Type II unum can be reflected about an axis bisecting the circular projection such that positive values lie in the upper-right quadrant of the circular projection, while their negative counterparts lie in the upper-left quadrant. The lower half of the circular projection representing a Type II unum can include reciprocals of the values that lie in the upper half. Type II unums generally rely on a look-up table for most operations. As a result, the size of the look-up table can limit the efficacy of Type II unums in some circumstances. However, Type II unums can provide improved computational functionality in comparison with floats under some conditions.

The Type III unum format is referred to herein as the "posit format" or, for simplicity, a "posit". In contrast to floating-point bit strings, posits can, under certain conditions, allow for higher precision (e.g., a broader dynamic range, higher resolution, and/or higher accuracy) than floating-point numbers of the same bit width. This can allow operations performed by a computing system to be performed at a higher rate (e.g., faster) when using posits than when using floating-point numbers, which in turn can improve the performance of the computing system by, for example, reducing the number of clock cycles used in performing operations, thereby reducing the processing time and/or power consumed in performing such operations. In addition, the use of posits in computing systems can allow for higher accuracy and/or precision in computations than floating-point numbers, which can further improve the functioning of a computing system in comparison to some approaches (e.g., approaches that rely on floating-point format bit strings).

Posits can be highly variable in precision and accuracy based on the total quantity of bits and/or the quantity of sets of integers or sets of bits included in the posit. In addition, posits can generate a wide dynamic range. The accuracy, precision, and/or dynamic range of a posit can be greater than that of a float, or other numerical formats, under certain conditions, as described in more detail herein. The variable accuracy, precision, and/or dynamic range of a posit can be manipulated, for example, based on the application in which the posit will be used. In addition, posits can reduce or eliminate the overflow, underflow, NaN, and/or other corner cases that are associated with floats and other numerical formats. Further, the use of posits can allow for a numerical value (e.g., a number) to be represented using fewer bits in comparison to floats or other numerical formats.
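How the four posit fields (sign, regime, exponent, mantissa) combine into a value can be sketched as follows. This is an illustrative decoder that assumes the commonly published posit encoding (two's-complement negation, a run-length-coded regime, es exponent bits, and the remaining bits as fraction); the function name decode_posit is hypothetical and this is not presented as the circuitry of the embodiments.

```python
from fractions import Fraction
from typing import Optional


def decode_posit(bits: int, n: int, es: int) -> Optional[Fraction]:
    """Decode an n-bit posit with es exponent bits into an exact Fraction.

    Returns None for the single "not a real" pattern (1 followed by zeros).
    Assumption: standard posit layout of sign, regime, exponent, fraction.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None
    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask          # negative posits are two's complement
    body = bits & ((1 << (n - 1)) - 1)

    # Regime: run of identical bits following the sign bit.
    i = n - 2
    first = (body >> i) & 1
    run = 0
    while i >= 0 and ((body >> i) & 1) == first:
        run += 1
        i -= 1
    k = run - 1 if first else -run
    i -= 1                             # skip the bit that terminates the run

    remaining = max(i + 1, 0)          # bits left for exponent and fraction
    tail = body & ((1 << remaining) - 1)
    exp_bits = min(es, remaining)      # exponent may be cut off by a long regime
    exponent = (tail >> (remaining - exp_bits)) << (es - exp_bits)
    frac_len = remaining - exp_bits
    fraction = tail & ((1 << frac_len) - 1)

    scale = k * (1 << es) + exponent   # power of two contributed by regime and exponent
    value = Fraction(2) ** scale * (1 + Fraction(fraction, 1 << frac_len))
    return -value if sign else value


one = decode_posit(0x40, 8, 0)         # 0b01000000 in an (8,0) posit
```

Because the regime length varies, the number of fraction bits (and so the precision) varies with magnitude, which is one way to read the statement above that posit precision depends on the bit counts of its fields.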

In some embodiments, these features can allow posits to be highly reconfigurable, which can provide improved application performance in comparison to approaches that rely on floats or other numerical formats. In addition, these features of posits can provide improved performance in machine learning applications in comparison to floats or other numerical formats. For example, in machine learning applications in which computational performance is paramount, posits can be used to train a network (e.g., a neural network) with the same or greater accuracy and/or precision than floats or other numerical formats while using fewer bits. In addition, inference operations in machine learning contexts can be achieved using posits with fewer bits (e.g., a smaller bit width) than floats or other numerical formats. By using fewer bits to achieve the same or enhanced outcome in comparison to floats or other numerical formats, the use of posits can therefore reduce the amount of time spent performing operations and/or reduce the amount of memory space required in applications, which can improve the overall function of a computing system in which posits are employed.

Embodiments herein are directed to hardware circuitry (e.g., control circuitry) configured to perform various operations on bit strings to improve the overall functioning of a computing device. For example, embodiments herein are directed to hardware circuitry that is configured to perform operations (e.g., recursive operations) using bit strings and/or to accumulate (e.g., store) results of the operations in peripheral circuitry of a memory device, such as peripheral sense amplifiers, extended row address components, or the like. As used herein, "peripheral sense amplifiers" can include sense amplifiers configured to latch data values that are located at the periphery of (e.g., external to) the memory device, while "extended row address components" can include a number of latches and/or flip-flops located at the periphery of the memory device. Examples of recursive operations that can be performed using the hardware circuitry include arithmetic operations, logical operations, bit-wise operations, vector operations, dot product operations, multiply-accumulate operations, and the like. In some embodiments, the bit strings can be formatted in a Type III universal number format or a posit format.

By utilizing peripheral circuitry of the memory device to store results (e.g., exact results) of the recursive operation at each iteration, the accuracy of the result of the recursive operation can be improved in comparison to approaches that do not utilize the peripheral circuitry in this manner. For example, some approaches provide a small cache or set of registers (e.g., a hidden scratch area) for temporary calculations such as intermediate results of recursive operations. However, in some approaches, such registers or cache(s) may not be large enough to store exact results of intermediate recursive operations on large bit strings (e.g., operations using 32-bit or 64-bit bit string operands) without incurring rounding errors, owing to the size constraints of the registers or cache(s). Even when smaller vectors (e.g., 8-bit or 16-bit bit string operands) are used for recursive operations, the registers or cache(s) can be overrun depending on the number of iterations used in the recursive operation.

For example, an operation using (8,0) posit operands (e.g., posit bit strings with a bit width of 8 bits and no exponent bits) may require a 64-bit register, whereas an operation using (64,4) posit operands (e.g., posit bit strings with a bit width of 64 bits and four exponent bits) may require a 4096-bit register, which can quickly overrun the register(s) and/or cache(s) of some approaches, particularly as the bit width of the bit string operands increases. This can be further exacerbated during performance of recursive operations, in which multiple successive operations are performed using the result of each iteration of the recursive operation.
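The growth in register size with posit width can be approximated with a rough calculation. The sketch below assumes the usual posit definition in which the largest magnitude is 2^((n-2)·2^es) and the smallest is its reciprocal, and it counts only the bit positions needed to hold the product of any two such values exactly; the function name exact_product_bits is hypothetical. It gives a lower bound, not the 64-bit and 4096-bit register sizes quoted above, which come from the source and presumably include sign, carry, and alignment headroom.

```python
def exact_product_bits(n: int, es: int) -> int:
    """Lower bound on fixed-point bits needed to hold any posit product exactly.

    Assumption: largest posit magnitude is 2**((n - 2) * 2**es) and the
    smallest is its reciprocal. Sign and carry guard bits are NOT counted.
    """
    max_scale = (n - 2) * (1 << es)  # log2 of the largest posit magnitude
    # Products span bit positions -2*max_scale .. +2*max_scale inclusive.
    return 4 * max_scale + 1


small = exact_product_bits(8, 0)     # (8,0) posit operands
large = exact_product_bits(64, 4)    # (64,4) posit operands
```

Under these assumptions the bound grows linearly in n and exponentially in es, which is consistent with the observation that wider operands quickly exceed a small scratch register.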

In some approaches, the small cache or set of registers (e.g., the hidden scratch area) can be "hidden" (e.g., not accessible to a user). In contrast, in some embodiments, access to the peripheral circuitry of the memory device can be provided to a user of the computing system in which the memory device operates. For example, the user can be given the ability to control access to the peripheral circuitry, which can allow greater control over operations that use it, such as recursive operations. This can allow greater control over the types of operations that are permitted to use the peripheral circuitry, over when a recursive operation ends, and/or over when the resultant bit string stored in the peripheral circuitry is truncated.

As described herein, storing the results of iterations of a recursive operation in peripheral circuitry of a memory device can improve the performance of a computing system by allowing for improved precision and/or accuracy in arithmetic and/or logical operations performed in applications where precision and/or accuracy are desirable. For example, in some embodiments, by providing enough space to store the exact result of each iteration of a recursive operation, only the final result of the recursive operation need be truncated (e.g., rounded) to the desired bit width, in contrast to the truncation of intermediate results of each iteration that is prevalent in some approaches. This can mitigate the rounding errors that are often present in such approaches and can thereby improve the performance of a computing system in which recursive operations are performed by increasing the accuracy of their results.
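The benefit of rounding only the final result can be illustrated numerically. The sketch below is a software analogy, not the peripheral-circuitry implementation: it uses IEEE 754 binary32 as the narrow result format and Python's exact Fraction type as a stand-in for a wide accumulator, and all function names are hypothetical.

```python
import struct
from fractions import Fraction


def to_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate_rounding_each_step(values):
    """Narrow accumulator: the running sum is rounded after every iteration."""
    acc = 0.0
    for v in values:
        acc = to_f32(acc + v)
    return acc


def accumulate_exact_then_round(values):
    """Wide accumulator: keep the exact sum and round once at the end."""
    exact = sum((Fraction(v) for v in values), Fraction(0))
    return to_f32(float(exact))


# 1000 iterations of adding the binary32 value closest to 0.1.
values = [to_f32(0.1)] * 1000
exact_sum = sum((Fraction(v) for v in values), Fraction(0))

err_each = abs(Fraction(accumulate_rounding_each_step(values)) - exact_sum)
err_once = abs(Fraction(accumulate_exact_then_round(values)) - exact_sum)
```

Comparing err_each with err_once shows how per-iteration rounding error accumulates, whereas rounding once leaves at most the error of a single rounding; the same reasoning motivates keeping exact intermediate results in a sufficiently wide accumulator.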

Other embodiments herein are directed to generating bit strings (e.g., posit bit strings) and/or storing them in a data structure of a memory array. The bit strings can include posit bit string operands and/or resultant posit bit strings that represent the result of an operation (e.g., an arithmetic and/or logical operation) performed between posit bit string operands. In some embodiments, a state machine can be included in a memory device to store bit strings in, and/or retrieve bit strings from, the memory array. The state machine can be configured to generate particular commands, which can include commands to retrieve a bit string from the memory array and/or to transfer the bit string from the array to circuitry external to the memory array. The stored resultant bit strings can be used in performing recursive operations, as described in more detail herein.

By using a state machine to retrieve bit strings from the memory array, the performance of a computing device, such as the memory device and/or a host coupled to the memory device, can be improved in comparison to some approaches. For example, the state machine can require minimal circuitry to perform the tasks and operations involved in storing bit strings in and/or retrieving them from the memory array, which can reduce the amount of circuitry used in comparison to some approaches. In addition, in embodiments described herein, the amount of processing resources and/or the amount of time consumed in performing operations using stored bit strings can be reduced in comparison to some approaches, because the result of an operation using bit strings can be stored and retrieved, in contrast to approaches in which a calculation using bit string operands is performed each time the operation is invoked.

In the following detailed description of the present invention, reference is made to the accompanying drawings, which form a part hereof and which show, by way of illustration, how one or more embodiments of the invention may be practiced. These embodiments are described in sufficient detail to enable those of ordinary skill in the art to practice the embodiments of the invention, and it is to be understood that other embodiments may be utilized and that process, electrical, and structural changes may be made without departing from the scope of the present invention.

As used herein, designators such as "N" and "M", particularly with respect to reference numerals in the drawings, indicate that a number of the particular feature so designated can be included. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used herein, the singular forms "a", "an", and "the" can include both singular and plural referents, unless the context clearly dictates otherwise. In addition, "a number of", "at least one", and "one or more" (e.g., a number of memory banks) can refer to one or more memory banks, whereas "a plurality of" is intended to refer to more than one of such things.

Furthermore, the words "can" and "may" are used throughout this application in a permissive sense (i.e., having the potential to, being able to), not in a mandatory sense (i.e., must). The term "include", and derivations thereof, means "including, but not limited to". The terms "coupled" and "coupling" mean to be directly or indirectly connected physically, or to be able to access and move (transmit) commands and/or data, as appropriate to the context. The terms "bit strings", "data", and "data values" are used interchangeably herein and can have the same meaning, as appropriate to the context. In addition, the terms "set of bits", "bit sub-set", and "portion" (in the context of a portion of the bits of a bit string) are used interchangeably herein and can have the same meaning, as appropriate to the context.

The figures herein follow a numbering convention in which the first digit or digits correspond to the figure number and the remaining digits identify an element or component in the figure. Similar elements or components between different figures may be identified by the use of similar digits. For example, 120 may reference element "20" in FIG. 1, and a similar element may be referenced as 220 in FIG. 2. A group or plurality of similar elements or components may generally be referred to herein with a single element number. For example, a plurality of reference elements 433-1, 433-2, ..., 433-N may be referred to generally as 433. As will be appreciated, elements shown in the various embodiments herein can be added, exchanged, and/or eliminated so as to provide a number of additional embodiments of the present invention. In addition, the proportion and/or relative scale of the elements provided in the figures are intended to illustrate certain embodiments of the present invention and should not be taken in a limiting sense.

FIG. 1 is a functional block diagram in the form of a computing system 100 including an apparatus that includes a host 102 and a memory device 104, in accordance with a number of embodiments of the present invention. As used herein, an "apparatus" may refer to, but is not limited to, any of a variety of structures or combinations of structures, such as a circuit or circuitry, a die or dice, a module or modules, a device or devices, or a system or systems. Memory device 104 may include one or more memory modules (e.g., single in-line memory modules, dual in-line memory modules, etc.). Memory device 104 may include volatile memory and/or non-volatile memory. In a number of embodiments, memory device 104 may include a multi-chip device. A multi-chip device may include a number of different memory types and/or memory modules. For example, a memory system may include non-volatile or volatile memory on any type of module. As shown in FIG. 1, apparatus 100 may include control circuitry 120, which may include logic circuitry 122 and a memory resource 124, a memory array 130, and sense amplifiers 111 (e.g., sense amps 111). In addition, each of the components (e.g., host 102, control circuitry 120, logic circuitry 122, memory resource 124, and/or memory array 130) may be separately referred to herein as an "apparatus." Control circuitry 120 may be referred to herein as a "processing device."

Memory device 104 may provide main memory for computing system 100 or may be used as additional memory or storage throughout computing system 100. Memory device 104 may include one or more memory arrays 130 (e.g., arrays of memory cells), which may include volatile and/or non-volatile memory cells. Memory array 130 may be, for example, a flash array with a NAND architecture. Embodiments are not limited to a particular type of memory device. For example, memory device 104 may include RAM, ROM, DRAM, SDRAM, PCRAM, RRAM, and flash memory, among others.

In embodiments in which memory device 104 includes non-volatile memory, memory device 104 may include flash memory devices such as NAND or NOR flash memory devices. Embodiments are not so limited, however, and memory device 104 may include other non-volatile memory devices such as non-volatile random-access memory devices (e.g., NVRAM, ReRAM, FeRAM, MRAM, PCM), "emerging" memory devices such as 3-D Crosspoint (3D XP) memory devices, etc., or combinations thereof. A 3D XP array of non-volatile memory may perform bit storage based on a change of bulk resistance, in conjunction with a stackable cross-gridded data access array. Additionally, in contrast to many flash-based memories, 3D XP non-volatile memory may perform a write in-place operation, in which a non-volatile memory cell may be programmed without first being erased.

As shown in FIG. 1, host 102 may be coupled to memory device 104. In a number of embodiments, memory device 104 may be coupled to host 102 via one or more channels (e.g., channel 103). In FIG. 1, memory device 104 is coupled to host 102 via channel 103, and control circuitry 120 of memory device 104 is coupled to memory array 130 via channel 107. Host 102 may be a host system such as a personal laptop computer, a desktop computer, a digital camera, a smartphone, a memory card reader, and/or an Internet of Things (IoT) enabled device, among various other types of hosts.

Host 102 may include a system motherboard and/or backplane and may include a memory access device, e.g., a processor (or processing device). One of ordinary skill in the art will appreciate that "a processor" can mean one or more processors, such as a parallel processing system, a number of coprocessors, etc. System 100 may include separate integrated circuits, or host 102, memory device 104, and memory array 130 may all be on the same integrated circuit. System 100 may be, for example, a server system and/or a high-performance computing (HPC) system and/or a portion thereof. Although the example shown in FIG. 1 illustrates a system having a von Neumann architecture, embodiments of the present invention may be implemented in non-von Neumann architectures, which may not include one or more components (e.g., CPU, ALU, etc.) often associated with a von Neumann architecture.

Memory device 104, which is shown in more detail in FIG. 2 herein, may include control circuitry 120, which may include logic circuitry 122 and a memory resource 124. Logic circuitry 122 may be provided in the form of an integrated circuit configured to perform the operations described in more detail herein, such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a reduced instruction set computing (RISC) device, an advanced RISC machine, a system-on-a-chip, or another combination of hardware and/or circuitry. For example, logic circuitry 122 may perform a recursive operation on bit strings stored by memory resource 124 and/or store the results of one or more iterations of the recursive operation in sense amplifiers 111.

In some embodiments, the operations may further include conversion operations to convert floating-point bit strings (e.g., floating-point numbers) to bit strings in a posit format, and vice versa. Once the floating-point bit strings are converted to bit strings in the posit format, logic circuitry 122 may be configured to perform (or cause performance of) recursive arithmetic operations using the posit bit strings, such as addition, subtraction, multiplication, division, fused multiply-add, multiply-accumulate, dot product units, greater than or less than, absolute value (e.g., FABS()), fast Fourier transforms, inverse fast Fourier transforms, sigmoid function, convolution, square root, exponent, and/or logarithm operations; recursive logical operations such as AND, OR, XOR, NOT, etc.; and trigonometric operations such as sine, cosine, tangent, etc. As will be appreciated, the foregoing list of operations is not intended to be exhaustive, nor is it intended to be limiting, and logic circuitry 122 may be configured to perform (or cause performance of) other arithmetic and/or logical operations.

Control circuitry 120 may further include a memory resource 124, which may be communicatively coupled to logic circuitry 122. Memory resource 124 may include volatile memory resources, non-volatile memory resources, or a combination of volatile and non-volatile memory resources. In some embodiments, the memory resource may be a random-access memory (RAM) such as static random-access memory (SRAM). Embodiments are not so limited, however, and the memory resource may be a cache, one or more registers, NVRAM, ReRAM, FeRAM, MRAM, PCM, "emerging" memory devices such as 3-D Crosspoint (3D XP) memory devices, etc., or combinations thereof.

Memory resource 124 may store one or more bit strings. In some embodiments, the bit string(s) stored by memory resource 124 may be stored according to a universal number (unum) or posit format. As used herein, a bit string stored in the unum (e.g., Type III unum) or posit format may include several subsets of bits, or "bit subsets." For example, a universal number or posit bit string may include a bit subset referred to as a "sign" or "sign portion," a bit subset referred to as a "regime" or "regime portion," a bit subset referred to as an "exponent" or "exponent portion," and a bit subset referred to as a "mantissa" or "mantissa portion" (or significand). As used herein, a bit subset is intended to refer to a subset of the bits included in a bit string. Examples of the sign, regime, exponent, and mantissa sets of bits are described in more detail herein in connection with FIG. 3 and FIGS. 4A-4B. Embodiments are not so limited, however, and the memory resource may store bit strings in other formats, such as a floating-point format or another suitable format.
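To make the four bit subsets concrete, the following Python sketch splits an n-bit posit pattern into its sign, regime, exponent, and mantissa fields and evaluates it exactly. The names `split_posit` and `posit_value`, the parameters `n` and `es`, and the field conventions used (a regime encoded as a run of identical bits ended by an opposite bit, negative values stored in two's complement) are assumptions taken from the general posit literature, not details stated in this description.

```python
from fractions import Fraction


def split_posit(bits: int, n: int = 8, es: int = 1):
    """Split an n-bit posit pattern into its bit subsets.

    Returns (sign, regime_k, exponent, mantissa, mantissa_width).
    Illustrative only: n and es are hypothetical parameters.
    """
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0 or bits == 1 << (n - 1):
        raise ValueError("zero and NaR patterns have no fields to split")

    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are decoded via two's complement

    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign bit

    # Regime: a run of identical bits starting at the top of the body.
    first = (body >> (n - 2)) & 1
    run, pos = 0, n - 2
    while pos >= 0 and ((body >> pos) & 1) == first:
        run += 1
        pos -= 1
    regime_k = run - 1 if first else -run

    pos -= 1  # skip the bit that terminates the regime, if present
    remaining = max(pos + 1, 0)
    tail = body & ((1 << remaining) - 1)

    # Exponent: up to es bits; bits cut off by the regime count as zeros.
    exp_width = min(es, remaining)
    mantissa_width = remaining - exp_width
    exponent = (tail >> mantissa_width) << (es - exp_width)
    mantissa = tail & ((1 << mantissa_width) - 1)
    return sign, regime_k, exponent, mantissa, mantissa_width


def posit_value(bits: int, n: int = 8, es: int = 1) -> Fraction:
    """Exact value of a posit pattern: (1 + f) * 2**(k * 2**es + e)."""
    sign, k, e, m, width = split_posit(bits, n, es)
    value = (1 + Fraction(m, 1 << width)) * Fraction(2) ** (k * (1 << es) + e)
    return -value if sign else value
```

Because the regime has variable length, the number of bits left for the exponent and mantissa changes with the magnitude of the value, which is why the sketch returns the mantissa width alongside the mantissa.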

For example, in some embodiments, memory resource 124 may receive data comprising a bit string having a first format that provides a first level of precision. Logic circuitry 122 may receive the data from the memory resource and convert the bit string to a second format that provides a second level of precision different from the first level of precision. In some embodiments, the first level of precision may be lower than the second level of precision. For example, if the first format is a floating-point format and the second format is a universal number or posit format, the floating-point bit string may provide a lower level of precision under certain conditions than the universal number or posit bit string, as described in more detail in connection with FIG. 3 and FIGS. 4A-4B herein.

The first format may be a floating-point format (e.g., an IEEE 754 format), and the second format may be a universal number (unum) format (e.g., a Type I unum format, a Type II unum format, a Type III unum format, a posit format, a valid format, etc.). As a result, the first format may include a mantissa, a base, and an exponent portion, and the second format may include a mantissa, a sign, a regime, and an exponent portion.
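As a point of comparison for the floating-point side, the following Python sketch pulls the stored fields out of an IEEE 754 single-precision value using the standard `struct` module. The function name `split_float32` is hypothetical. Note that the base (2) is implicit in the format and is not stored; the encoding also carries a sign bit alongside the exponent and mantissa fields.

```python
import struct


def split_float32(x: float):
    """Return (sign, biased_exponent, mantissa) of x encoded as IEEE 754 binary32."""
    # Pack as a big-endian 32-bit float, then reinterpret the bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31                 # 1 bit
    exponent = (raw >> 23) & 0xFF    # 8 bits, biased by 127
    mantissa = raw & 0x7FFFFF        # 23 stored fraction bits
    return sign, exponent, mantissa
```

Unlike a posit, the field widths here are fixed, so the number of mantissa bits does not depend on the magnitude of the value.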

Logic circuitry 122 may be configured to perform an arithmetic operation or a logical operation, or both, using the bit string having the second format (e.g., the unum or posit format). In some embodiments, the arithmetic operation and/or the logical operation may be a recursive operation. As used herein, a "recursive operation" generally refers to an operation that is performed a specified number of times, where the result of a previous iteration of the recursive operation is used as an operand for a subsequent iteration of the operation. For example, a recursive multiplication operation may be an operation in which two bit string operands (β and a second operand) are multiplied together and the result of each iteration of the recursive operation is used as a bit string operand for a subsequent iteration. Stated alternatively, a recursive operation may refer to an operation in which a first iteration includes multiplying β by the second operand to arrive at a result λ. The next iteration of this example recursive operation may include multiplying the result λ by the second operand to arrive at another result ω.

Another illustrative example of a recursive operation may be described in terms of computing the factorial of a natural number. This example, given in Equation 1, may include performing the recursive operation when the given number n is greater than 0 and returning 1 when n is equal to 0:

fact(n) = 1, if n = 0; fact(n) = n × fact(n − 1), if n > 0. (Equation 1)

As shown in Equation 1, the recursive operation to determine the factorial of the number n may be carried out until n equals 0, at which point the solution is reached and the recursive operation terminates. For example, using Equation 1, the factorial of the number n may be computed recursively by performing the operation n × (n − 1) × (n − 2) × ··· × 1.
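The factorial recursion described above can be sketched in a few lines of Python. The function names `fact` and `fact_iterative` are chosen here for illustration. The second form unrolls the recursion into a loop to show explicitly that each iteration takes the previous iteration's result as an operand, which is the property the term "recursive operation" relies on in this description.

```python
def fact(n: int) -> int:
    """Factorial by direct recursion: 1 when n == 0, else n * fact(n - 1)."""
    if n < 0:
        raise ValueError("factorial is defined here for natural numbers only")
    if n == 0:
        return 1  # base case: the recursion terminates
    return n * fact(n - 1)


def fact_iterative(n: int) -> int:
    """Same computation as a loop over n x (n - 1) x ... x 1."""
    if n < 0:
        raise ValueError("factorial is defined here for natural numbers only")
    result = 1
    for k in range(n, 0, -1):
        result *= k  # the previous iteration's result is this iteration's operand
    return result
```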

Yet another example of a recursive operation is a multiply-accumulate operation, in which an accumulator a is modified at each iteration according to the expression a ← a + (b × c). In a multiply-accumulate operation, the value of the accumulator a from the previous iteration is summed with the product of two operands b and c. In some approaches, multiply-accumulate operations may be performed with one or more roundings (e.g., a may be truncated at one or more iterations of the operation). In contrast, embodiments herein may perform a multiply-accumulate operation without rounding the results of intermediate iterations, thereby preserving the accuracy of each iteration until the final result of the multiply-accumulate operation is completed.
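The effect of deferring rounding in a multiply-accumulate loop can be illustrated with exact rational arithmetic. In the Python sketch below, `quantize`, `mac_rounded_each_step`, and `mac_rounded_once` are hypothetical helper names, and rounding to a fixed number of fractional bits stands in for whatever rounding a real number format would apply; none of this is taken from the described hardware.

```python
from fractions import Fraction


def quantize(x: Fraction, frac_bits: int) -> Fraction:
    """Round x to the nearest multiple of 2**-frac_bits (ties go to even)."""
    scale = 1 << frac_bits
    return Fraction(round(x * scale), scale)


def mac_rounded_each_step(pairs, frac_bits: int) -> Fraction:
    """a <- a + (b x c), rounding the accumulator after every iteration."""
    a = Fraction(0)
    for b, c in pairs:
        a = quantize(a + b * c, frac_bits)
    return a


def mac_rounded_once(pairs, frac_bits: int) -> Fraction:
    """a <- a + (b x c), kept exact; rounding happens only on the final result."""
    a = Fraction(0)
    for b, c in pairs:
        a = a + b * c
    return quantize(a, frac_bits)
```

As a usage example, take eight pairs of (1/4, 1/4) with three fractional bits. Each product is 1/16, which is half of the smallest representable step, so rounding at every iteration keeps discarding it and the accumulator stays at 0, while the deferred version returns 1/2.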

Examples of recursive operations contemplated herein are not limited to these examples. Rather, the above examples of recursive operations are merely illustrative and are provided to clarify the scope of the term "recursive operation" in the context of the present invention.

As shown in FIG. 1, a plurality of sense amplifiers (e.g., sense amplifiers 111) are coupled to memory array 130 and control circuitry 120. Control circuitry 120 may be configured to perform a recursive operation using one or more bit strings and/or to store (e.g., accumulate), in the plurality of sense amplifiers, a resultant bit string that represents the result of an iteration of the recursive operation. In some embodiments, the operation to accumulate the resultant bit string in the plurality of sense amplifiers is performed in response to receipt of a user-generated command. Embodiments are not so limited, however, and in some embodiments control circuitry 120 may be configured to perform the operation to accumulate the resultant bit string in the plurality of sense amplifiers in response to receipt of a host command, or in response to a determination that a bit string to be used in the recursive operation is stored in memory resource 124 of control circuitry 120. As described in more detail herein, the one or more bit strings, the resultant bit string, or both may be formatted according to a Type III universal number format or a posit format.

Sense amplifiers 111 may provide additional storage space for memory array 130 and may sense (e.g., read, store, cache) data values present in memory device 104. In some embodiments, sense amplifiers 111 may be located in a periphery area of memory device 104. For example, sense amplifiers 111 may be located in an area of memory device 104 that is physically distinct from memory array 130. Sense amplifiers 111 may include sense amplifiers, latches, flip-flops, etc., that may be configured to store data values as described herein. In some embodiments, sense amplifiers 111 may be provided in the form of a register or series of registers and may include the same quantity of storage locations (e.g., sense amplifiers, latches, etc.) as there are rows or columns of memory array 130. For example, if memory array 130 contains around 16K rows or columns, peripheral sense amplifiers 111 may include around 16K storage locations. Accordingly, in some embodiments, peripheral sense amplifiers 111 may be a register configured to hold up to 16K data values, although embodiments are not so limited, as described in more detail in connection with FIG. 2A.

Control circuitry 120 may be further configured to accumulate, in the plurality of sense amplifiers (e.g., sense amplifiers 111), the resultant bit string representing the result of an iteration of the recursive operation by overwriting a resultant bit string previously stored in the plurality of sense amplifiers. For example, control circuitry 120 may be configured to store each successive intermediate resultant bit string of the recursive operation in the same location in which the preceding intermediate bit string was stored. However, as described in more detail below in connection with FIGS. 2A and 2B, the result of a subsequent iteration of the recursive operation may have a greater bit width than the result of a preceding iteration. In this case, control circuitry 120 may be configured to overwrite the previous resultant bit string and store the additional bits of the subsequent bit string, which represents the subsequent iteration, in additional sense amplifiers 111.
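The overwrite-and-grow behavior described above can be modeled in software. The Python sketch below is a toy model under stated assumptions: the class `PeripheralAccumulator`, its one-bit-per-latch layout, and the helper `accumulate_products` are all hypothetical, and plain non-negative integers stand in for posit bit strings. It is not a description of the actual circuit.

```python
class PeripheralAccumulator:
    """Toy model of a row of peripheral latches, each holding one bit."""

    def __init__(self, num_latches: int = 16 * 1024):
        self.latches = [0] * num_latches
        self.width = 0  # number of latches currently holding the result

    def store(self, value: int, width: int) -> None:
        """Overwrite the stored result in place; wider results use more latches."""
        if width > len(self.latches):
            raise OverflowError("result is wider than the available latches")
        if value < 0 or value >> width:
            raise ValueError("value does not fit in the stated width")
        for i in range(width):
            self.latches[i] = (value >> i) & 1  # same locations as the old result
        for i in range(width, self.width):
            self.latches[i] = 0  # clear stale bits if the new result is narrower
        self.width = width

    def load(self) -> int:
        """Read the accumulated result back as an integer."""
        return sum(bit << i for i, bit in enumerate(self.latches[: self.width]))


def accumulate_products(acc: PeripheralAccumulator, operands) -> int:
    """Recursive multiplication: each result overwrites the previous one."""
    acc.store(1, 1)
    for x in operands:
        result = acc.load() * x
        acc.store(result, max(result.bit_length(), 1))
    return acc.load()
```

For example, accumulating the operands 3, 5, and 7 passes through intermediate results 3, 15, and 105, whose widths grow from 2 to 4 to 7 bits, so each step reuses the low-order latches and claims a few additional ones.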

In some embodiments, control circuitry 120 may be configured to determine that the recursive operation is complete and, in response to that determination, perform an operation to round the resultant bit string stored in the plurality of sense amplifiers so that the final resultant bit string has a particular bit width, by removing at least one bit from the mantissa bit subset, the exponent bit subset, or both. For example, once the recursive operation is complete, control circuitry 120 may round the final result of the operation to a bit width that can be transferred to circuitry external to sense amplifiers 111.

The final result of the recursive operation may be rounded to a particular bit width, such as 8 bits, 16 bits, 32 bits, 64 bits, etc. The particular bit width to which the final result of the recursive operation is rounded may be predetermined or may be selected, for example, by user input. For example, in some embodiments, a user may provide a command to control circuitry 120 that instructs control circuitry 120 to round the final result of the recursive operation to a desired bit width.
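Rounding a wide accumulated result down to a selected bit width can be sketched as follows. The Python function `round_to_width` is a hypothetical illustration that treats the result as an unsigned integer significand and uses round-to-nearest, ties-to-even; the description does not specify a rounding mode, so that choice is an assumption.

```python
def round_to_width(value: int, target_bits: int):
    """Round a non-negative integer to at most target_bits significant bits.

    Returns (kept, shift) such that kept << shift approximates value.
    """
    if value < 0 or target_bits < 1:
        raise ValueError("expects a non-negative value and a positive width")
    excess = value.bit_length() - target_bits
    if excess <= 0:
        return value, 0  # already fits; nothing is removed

    kept = value >> excess                   # high-order bits that survive
    dropped = value & ((1 << excess) - 1)    # low-order bits being removed
    half = 1 << (excess - 1)
    if dropped > half or (dropped == half and kept & 1):
        kept += 1                            # round up (ties go to even)
        if kept.bit_length() > target_bits:  # carry produced an extra bit
            kept >>= 1
            excess += 1
    return kept, excess
```

For instance, rounding the 7-bit value 105 to 4 bits keeps 13 with a shift of 3, which represents 104.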

In some embodiments, the recursive operation may be performed within memory device 104 without transferring the resultant bit string to circuitry external to memory device 104. For example, the recursive operation may be performed by logic circuitry 122 of the control circuitry, or by firing rows and columns of the memory array in particular combinations to carry out the recursive operation.

In some embodiments, control circuitry 120 may be configured to access an address space of memory array 130 in which a first resultant bit string representing the result of a first iteration of the recursive operation is stored, and/or to access an address space of memory array 130 in which a second resultant bit string representing the result of a second iteration of the recursive operation is stored. Control circuitry 120 may be further configured to store, in the plurality of sense amplifiers (e.g., sense amplifiers 111), a bit string representing the result of an operation performed using the first resultant bit string and the second resultant bit string.

In some embodiments, control circuitry 120 may be configured to execute a specified set of instructions to, for example, write, read, copy, and/or erase bit strings (e.g., data) stored in memory array 130. For example, as described in more detail herein, control circuitry 120 may execute instructions to read data from one or more rows and/or columns of memory array 130 in order to retrieve data stored in memory array 130. As described in more detail in connection with FIGS. 2A and 2B and, in particular, FIG. 5, the data may include one or more posit bit string operands and/or the results of one or more operations (e.g., arithmetic and/or logical operations) performed between posit bit string operands and stored in memory array 130.

Using control circuitry 120 configured to execute a specified set of instructions to write posit bit strings to, and/or retrieve them from, memory array 130 may improve the performance of memory device 104. Storing the result(s) of such operations in memory array 130 and retrieving the result(s) directly from memory array 130 can reduce the amount of time-consuming and/or computing-resource-intensive processing needed to perform operations between posit bit strings stored in memory array 130.

In some embodiments, control circuitry 120 may determine an address in memory array 130 at which a relevant posit bit string is stored. For example, control circuitry 120 may determine a row and/or column address in memory array 130 at which one or more posit bit string operands are stored, and/or a row and/or column address at which a resultant posit bit string, representing the performance of an arithmetic and/or logical operation between the one or more posit bit string operands, is stored. Control circuitry 120 may then send a command or request to retrieve the posit bit string(s) stored at that address of memory array 130 and/or, as part of performing a recursive operation using the stored bit strings, transfer the retrieved posit bit string(s) to sense amplifiers 111, host 102, a media device (e.g., a solid-state drive, a flash memory device, etc.) coupled to memory device 104, or other circuitry external to memory array 130.

The embodiment of FIG. 1 may include additional circuitry that is not illustrated so as not to obscure embodiments of the present invention. For example, memory device 104 may include address circuitry to latch address signals provided over I/O connections through I/O circuitry. Address signals may be received and decoded by a row decoder and a column decoder to access memory device 104 and/or memory array 130. One of ordinary skill in the art will appreciate that the number of address input connections may depend on the density and architecture of memory device 104 and/or memory array 130.

FIG. 2A is a functional block diagram in the form of a computing system including an apparatus 200 that includes a host 202 and a memory device 204, in accordance with a number of embodiments of the present invention. Memory device 204 may include control circuitry 220, which may be analogous to control circuitry 120 shown in FIG. 1. Similarly, host 202 may be analogous to host 102 shown in FIG. 1, and memory device 204 may be analogous to memory device 104 shown in FIG. 1. Each of the components (e.g., host 202, control circuitry 220, logic circuitry 222, memory resource 224, and/or memory array 230, etc.) may be separately referred to herein as an "apparatus."

The host 202 may be communicatively coupled to the memory device 204 via one or more channels 203, 205. The channels 203, 205 may be interfaces or other physical connections that allow data and/or commands to be transferred between the host 202 and the memory device 204. For example, commands to initiate operations performed using the control circuitry 220 (e.g., an operation to initiate a recursive operation using one or more bit strings, or an operation to store the result of an iteration of the recursive operation in the peripheral sense amplifiers 211) may be transferred from the host 202 via the channels 203, 205. It is noted that, in some embodiments, the control circuitry 220 may perform the operations in response to an initiation command transferred from the host 202 via one or more of the channels 203, 205 in the absence of an intervening command from the host 202. That is, once the control circuitry 220 has received a command from the host 202 to initiate performance of an operation, the operation may be performed by the control circuitry 220 without any additional commands from the host 202.

As shown in FIG. 2A, the memory device 204 may include a register access component 206, a high speed interface (HSI) 208, a controller 210, peripheral sense amplifiers 211, which may include one or more extended row address (XRA) component(s), main memory input/output (I/O) circuitry 214, row address strobe (RAS)/column address strobe (CAS) chain control circuitry 216, a RAS/CAS chain component 218, control circuitry 220, and a memory array 230. As shown in FIG. 2A, the peripheral sense amplifiers 211 and/or the control circuitry 220 are located in an area of the memory device 204 that is physically distinct from the memory array 230. That is, in some embodiments, the peripheral sense amplifiers 211 and/or the control circuitry 220 are located at the periphery of the memory array 230.

The register access component 206 may facilitate transferring and fetching of data from the host 202 to the memory device 204 and from the memory device 204 to the host 202. For example, the register access component 206 may store addresses (or facilitate lookup of addresses), such as memory addresses, that correspond to data that is to be transferred from the memory device 204 to the host 202 or from the host 202 to the memory device 204. In some embodiments, the register access component 206 may transfer and fetch data that is to be operated upon by the control circuitry 220, and/or the register access component 206 may transfer and fetch data that has been operated upon by the control circuitry 220, including data that is to be transferred to the host 202 in response to an action taken by the control circuitry 220.

The HSI 208 may provide an interface between the host 202 and the memory device 204 for commands and/or data traversing the channel 205. The HSI 208 may be a double data rate (DDR) interface such as a DDR3, DDR4, or DDR5 interface. Embodiments are not limited to a DDR interface, however, and the HSI 208 may be a quad data rate (QDR) interface, a peripheral component interconnect (PCI) interface (e.g., a peripheral component interconnect express (PCIe) interface), or another suitable interface for transferring commands and/or data between the host 202 and the memory device 204.

The controller 210 may be responsible for executing instructions from the host 202 and accessing the control circuitry 220 and/or the memory array 230. The controller 210 may be a state machine, a sequencer, or some other type of controller. The controller 210 may receive commands from the host 202 (e.g., via the HSI 208) and, based on the received commands, control operation of the control circuitry 220 and/or the memory array 230. In some embodiments, the controller 210 may receive a command from the host 202 to cause an operation to be performed using the control circuitry 220. In response to receiving such a command, the controller 210 may instruct the control circuitry 220 to begin performing the operation(s).

In a non-limiting example, the controller 210 may instruct the control circuitry 220 to perform an operation to retrieve one or more bit strings stored in the memory array 230 and/or a resultant bit string stored in the memory array 230 that represents the result of an operation performed between the one or more bit strings. For example, the controller 210 may receive a command from the host 202 requesting performance of an operation between one or more bit strings and send a command to the control circuitry 220 to perform the operation. The control circuitry 220 may determine whether a result of the requested operation is stored in the memory array 230, determine the address in the memory array 230 at which the result of the requested operation is stored, and/or retrieve the result of the requested operation from the memory array 230. The control circuitry 220 and/or the controller 210 may then transfer the result of the requested operation to the peripheral sense amplifiers 211, the data structure 209, the host 202, or other circuitry external to the memory array 230.
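The lookup described above (determining whether a result is stored, determining its address, and retrieving it) can be pictured with a short sketch. The class `ResultStore`, its dictionaries, and the address counter are hypothetical names introduced only for illustration; the specification does not say how the control circuitry 220 indexes stored results.

```python
class ResultStore:
    """Hypothetical index from a requested operation to the array address of its result."""

    def __init__(self):
        self.array = {}     # address -> result bit string (stands in for memory array 230)
        self.index = {}     # (operation, operands) -> address (assumed bookkeeping)
        self.next_addr = 0

    def save(self, op, operands, result_bits):
        """Store a result bit string and remember where it was placed."""
        addr = self.next_addr
        self.next_addr += 1
        self.array[addr] = result_bits
        self.index[(op, tuple(operands))] = addr
        return addr

    def lookup(self, op, operands):
        """Return (address, result) if the result is stored, otherwise None."""
        addr = self.index.get((op, tuple(operands)))
        if addr is None:
            return None
        return addr, self.array[addr]


store = ResultStore()
store.save("mul", ["0101", "0011"], "00001111")
hit = store.lookup("mul", ["0101", "0011"])    # stored: address and bit string
miss = store.lookup("add", ["0101", "0011"])   # not stored: None
```

In this sketch a miss would correspond to the case in which the operation still has to be performed; a hit corresponds to retrieving the stored result and forwarding it to the peripheral sense amplifiers 211, the host 202, or other circuitry.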

In some embodiments, the controller 210 may be a global processing controller and may provide power management functions to the memory device 204. The power management functions may include control over power consumed by the memory device 204 and/or the memory array 230. For example, the controller 210 may control the power provided to various banks of the memory array 230 in order to control which banks of the memory array 230 are operational at different times during operation of the memory device 204. This may include shutting down certain banks of the memory array 230 while providing power to other banks of the memory array 230 to optimize power consumption of the memory device 204. In some embodiments, control of the power consumption of the memory device 204 by the controller 210 may include controlling the power supplied to various cores of the memory device 204 and/or to the control circuitry 220, the memory array 230, etc.

As mentioned above, the peripheral sense amplifiers 211 may provide additional storage space for the memory array 230 and may sense (e.g., read, store, cache) data values present in the memory device 204. The peripheral sense amplifiers 211 may include sense amplifiers, latches, flip-flops, extended row address (XRA) component(s), etc. that may be configured to store data values (e.g., bit strings) as described herein. As shown in FIG. 2A, the peripheral sense amplifiers 211 are in a location of the memory device 204 that is physically distinct from the memory array 230. In some embodiments, the peripheral sense amplifiers 211 may be provided in the form of a register or a series of registers and may include the same quantity of storage locations (e.g., sense amplifiers, latches, etc.) as there are rows or columns of the memory array 230. For example, if the memory array 230 contains around 16K rows or columns, the peripheral sense amplifiers 211 may include around 16K storage locations. Accordingly, in some embodiments, the peripheral sense amplifiers 211 may be a register that is configured to hold up to around 16K data values.

Embodiments are not limited, however, to scenarios in which the peripheral sense amplifiers 211 include around 16K locations in which to store data values. For example, the peripheral sense amplifiers 211 may be configured to store around 2K data values, around 4K data values, around 8K data values, etc. Further, although a single box is shown as illustrating the peripheral sense amplifiers 211 in FIG. 2A, in some embodiments there may be more than a single "row" of peripheral sense amplifiers 211. For example, there may be two, four, or eight, among other quantities, of "rows" of peripheral sense amplifiers 211, each of which may be configured to store around 2K data values, around 4K data values, around 8K data values, around 16K data values, etc.

As described above, in some embodiments the peripheral sense amplifiers 211 may be configured to store intermediate results of recursive operations performed using bit strings. In some embodiments, the intermediate results of a recursive operation may represent the result generated at each iteration of the recursive operation. In contrast to some approaches, because the peripheral sense amplifiers 211 may be configured to store up to 16K data values, the intermediate results of the recursive operation need not be rounded (e.g., truncated) during performance of the recursive operation.

Instead, in some embodiments, the final result of the recursive operation, which is stored in the peripheral sense amplifiers 211 upon completion of the recursive operation, may be rounded to a desired bit width (e.g., 8 bits, 16 bits, 32 bits, 64 bits, etc.). This may improve the accuracy of the result of the recursive operation in comparison to approaches that do not use the peripheral sense amplifiers 211 to store the intermediate results, because the intermediate results need not be rounded before the final result of the recursive operation is computed.
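The accuracy benefit of rounding only the final result can be illustrated numerically. The following is a minimal sketch, not the claimed circuitry: exact rational arithmetic stands in for the unrounded intermediate results held in the peripheral sense amplifiers 211, and the helper names `round_to_bits` and `recursive_multiply`, the iteration count, and the choice of four fractional bits are assumptions made for illustration.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, frac_bits: int) -> Fraction:
    """Round x to the nearest multiple of 2**-frac_bits (a stand-in for a fixed bit width)."""
    scale = 1 << frac_bits
    return Fraction(round(x * scale), scale)


def recursive_multiply(start, factor, iterations, frac_bits, round_each_step):
    """Multiply start by factor repeatedly, rounding every step or only at the end."""
    acc = Fraction(start)
    for _ in range(iterations):
        acc *= Fraction(factor)
        if round_each_step:
            acc = round_to_bits(acc, frac_bits)   # intermediate result loses precision
    return round_to_bits(acc, frac_bits)          # final rounding to the desired width


exact = Fraction("2.51") * Fraction("3.73") ** 6
deferred = recursive_multiply("2.51", "3.73", 6, 4, round_each_step=False)
stepwise = recursive_multiply("2.51", "3.73", 6, 4, round_each_step=True)

print("error with rounding deferred:  ", float(abs(deferred - exact)))
print("error with rounding every step:", float(abs(stepwise - exact)))
```

Because the deferred variant rounds the exact product once, its result is the closest representable value at the chosen width, so its error can never exceed that of the variant that rounds at every iteration.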

The peripheral sense amplifiers 211 may be configured to overwrite previously stored intermediate results of the recursive operation when a new iteration of the recursive operation is completed. For example, a result that represents the first iteration of the recursive operation may be stored in the peripheral sense amplifiers 211 once the first iteration is complete. Once the second iteration of the recursive operation is complete, the result of the second iteration may be stored in the peripheral sense amplifiers 211. Similarly, once the third iteration of the recursive operation is complete, the result of the third iteration may be stored in the peripheral sense amplifiers 211. In some embodiments, the result of each subsequent iteration may be stored in the peripheral sense amplifiers 211 by overwriting the stored result of the previous iteration.

Depending on the bit width of the result of each iteration, a subsequent bit string that represents the result of an iteration and is stored in the peripheral sense amplifiers 211 may occupy more sense amplifiers than the previously stored bit string. For example, the result of the first iteration may contain a first quantity of bits, and the result of the second iteration may contain a second quantity of bits that is greater than the first quantity. When the result of the second iteration is written to, or stored by, the peripheral sense amplifiers 211, it may be stored so as to overwrite the result of the first iteration. However, because the result of the second iteration may contain more bits than the result of the first iteration, in some embodiments additional sense amplifiers of the peripheral sense amplifiers 211 may be used to store the result of the second iteration, in addition to the sense amplifiers that were used to store the result of the first iteration.

In a simplified, non-limiting example in which the recursive operation comprises a recursive multiplication operation in which the number 2.51 is recursively multiplied by the number 3.73, the result of the first iteration may be 9.3623. In this example, the result of the first iteration contains five bits and may be stored, for example, in five sense amplifiers of the peripheral sense amplifiers 211. The result of the second iteration (e.g., the result of multiplying the first result, 9.3623, by 3.73) may be 34.921379, which contains eight bits. In some embodiments, the result of the second iteration may be stored in eight sense amplifiers of the peripheral sense amplifiers 211 by, for example, overwriting the result of the first iteration stored in the five sense amplifiers and writing the additional three bits to three other sense amplifiers of the peripheral sense amplifiers 211. The results of subsequent iterations of the recursive operation may similarly be stored in the peripheral sense amplifiers 211 so as to overwrite the result of the preceding iteration. Embodiments are not so limited, however, and in some embodiments the result of each iteration may be stored in adjacent sense amplifiers of the peripheral sense amplifiers 211 or in particular sense amplifiers of the peripheral sense amplifiers 211.
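The overwrite-and-grow behavior in this example can be sketched in software. The class `PeripheryRegister` below is a hypothetical model introduced for illustration (one bit per latch, results written from latch 0 upward); the placeholder bit patterns are arbitrary 5-bit and 8-bit strings and are not encodings of 9.3623 or 34.921379.

```python
class PeripheryRegister:
    """Hypothetical model of one row of peripheral sense amplifiers, one bit per latch."""

    def __init__(self, size=16 * 1024):
        self.latches = [0] * size
        self.used = 0                      # latches holding the current result

    def store(self, bits: str) -> int:
        """Overwrite the previous result; return how many additional latches were needed."""
        if len(bits) > len(self.latches):
            raise ValueError("result is wider than the register")
        for i, b in enumerate(bits):       # overwrite starting at latch 0
            self.latches[i] = int(b)
        for i in range(len(bits), self.used):
            self.latches[i] = 0            # clear stale bits if the new result is narrower
        extra = max(0, len(bits) - self.used)
        self.used = len(bits)
        return extra

    def load(self) -> str:
        """Read back the currently stored result as a bit string."""
        return "".join(str(b) for b in self.latches[: self.used])


reg = PeripheryRegister()
first_extra = reg.store("10110")       # first iteration: 5 bits, 5 latches newly used
second_extra = reg.store("11010011")   # second iteration: 8 bits, 3 more latches used
```

Here the second store reuses the five latches that held the first result and claims three more, mirroring the five-plus-three allocation described in the example.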

In some embodiments, access to the peripheral sense amplifiers 211 may be controlled using a register mapping. For example, bit strings may be stored in the peripheral sense amplifiers 211, deleted from the peripheral sense amplifiers 211, and/or the bit width of bit strings stored in the peripheral sense amplifiers 211 may be altered in response to commands associated with a register mapping that may be stored in the control circuitry 220. In addition, bit strings stored in the memory array 230 (e.g., in the data structure 209 of the memory array 230) may be added to or subtracted from (e.g., accumulated with) bit strings stored in the peripheral sense amplifiers 211 in response to commands associated with the control circuitry 220.

The control circuitry 220 may also include commands associated with converting the result of an operation performed as part of a recursive operation using posit bit strings between the posit format and a format that may be stored in the peripheral sense amplifiers 211 and/or the memory array 230, as described in more detail in connection with FIG. 6 herein. For example, the control circuitry 220 may include one or more registers that may include commands associated with representing a posit bit string in terms of a sign bit, mantissa bits, exponent bits, and a k-value that may be used to expand the bit string so that it is represented in the posit format.
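As a rough illustration of how a sign bit, a k-value, exponent bits, and mantissa bits combine into a numeric value, the sketch below decodes a posit bit string using the commonly published posit (Type III unum) rules. It is not the conversion logic of the control circuitry 220; the function name `decode_posit` and the parameter `es` (the number of exponent bits) are assumptions introduced for illustration.

```python
from fractions import Fraction


def decode_posit(bits: str, es: int) -> Fraction:
    """Decode an n-bit posit with es exponent bits into an exact rational value."""
    n = len(bits)
    word = int(bits, 2)
    if word == 0:
        return Fraction(0)
    if word == 1 << (n - 1):
        raise ValueError("NaR (not a real) has no numeric value")
    negative = bits[0] == "1"
    if negative:
        word = (1 << n) - word                 # two's complement of the whole word
    body = format(word, f"0{n}b")[1:]          # bits after the sign bit
    first = body[0]
    run = len(body) - len(body.lstrip(first))  # length of the regime run
    k = run - 1 if first == "1" else -run      # k-value encoded by the regime
    rest = body[run + 1:]                      # skip the bit that terminates the regime
    exp_bits = rest[:es].ljust(es, "0")        # missing exponent bits are read as zeros
    e = int(exp_bits, 2) if es else 0
    frac_bits = rest[es:]                      # remaining bits form the mantissa
    f = Fraction(int(frac_bits, 2), 1 << len(frac_bits)) if frac_bits else Fraction(0)
    magnitude = Fraction(2) ** (k * (1 << es) + e) * (1 + f)
    return -magnitude if negative else magnitude


# 16-bit posit with es = 3: sign 0, regime 0001 (k = -3), exponent 101, mantissa 11011101
value = decode_posit("0000110111011101", 3)    # Fraction(477, 134217728)
```

The magnitude is computed as 2^(k * 2^es + e) * (1 + f), so the k-value scales the result in large steps while the exponent and mantissa bits refine it.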

The main memory input/output (I/O) circuitry 214 may facilitate transfer of data and/or commands to and from the memory array 230. For example, the main memory I/O circuitry 214 may transfer bit strings, data, and/or commands from the host 202 and/or the control circuitry 220 to the memory array 230, and from the memory array 230 to the host 202 and/or the control circuitry 220. In some embodiments, the main memory I/O circuitry 214 may include one or more direct memory access (DMA) components that may transfer bit strings (e.g., posit bit strings stored as blocks of data) from the control circuitry 220 to the memory array 230, and vice versa.

In some embodiments, the main memory I/O circuitry 214 may transfer bit strings, data, and/or commands from the memory array 230 to the control circuitry 220 so that the control circuitry 220 may perform operations on the bit strings. Similarly, the main memory I/O circuitry 214 may transfer bit strings on which one or more operations have been performed by the control circuitry 220 to the memory array 230. As described in more detail herein, the operations may include recursive operations performed using bit strings (e.g., unum or posit bit strings), in which the results of intermediate iterations are stored in the peripheral sense amplifiers 211.

As described above, posit bit strings (e.g., data) may be stored in and/or retrieved from the memory array 230. In some embodiments, the main memory I/O circuitry 214 may store posit bit strings in the memory array 230 and/or retrieve posit bit strings from the memory array 230. For example, the main memory I/O circuitry 214 may be enabled to transfer posit bit strings to the memory array 230 for storage, and/or the main memory I/O circuitry 214 may retrieve posit bit strings (e.g., posit bit strings representing the result of an operation performed between one or more posit bit string operands) from the memory array 230 in response to, for example, a command from the controller 210 and/or the control circuitry 220.

The row address strobe (RAS)/column address strobe (CAS) chain control circuitry 216 and the RAS/CAS chain component 218 may be used in conjunction with the memory array 230 to latch a row address and/or a column address to initiate a memory cycle. In some embodiments, the RAS/CAS chain control circuitry 216 and/or the RAS/CAS chain component 218 may resolve the row and/or column addresses of the memory array 230 at which read and write operations associated with the memory array 230 are to be initiated or terminated. For example, upon completion of an operation using the control circuitry 220, the RAS/CAS chain control circuitry 216 and/or the RAS/CAS chain component 218 may latch and/or resolve a specific location in the peripheral sense amplifiers 211 and/or the memory array 230 at which the bit strings that have been operated upon by the control circuitry 220 are to be stored. Similarly, the RAS/CAS chain control circuitry 216 and/or the RAS/CAS chain component 218 may latch and/or resolve a specific location in the peripheral sense amplifiers 211 and/or the memory array 230 from which bit strings are to be transferred to the control circuitry 220 before or after the control circuitry 220 performs an operation (e.g., a recursive operation) on the bit string(s).

The control circuitry 220 may include logic circuitry (e.g., the logic circuitry 122 illustrated in FIG. 1) and/or memory resource(s) (e.g., the memory resource 124 illustrated in FIG. 1). Bit strings (e.g., data, a plurality of bits, etc.) may be received by the control circuitry 220 from, for example, the host 202, the memory array 230, and/or an external memory device, and may be stored by the control circuitry 220, for example, in the memory resource of the control circuitry 220. The control circuitry (e.g., the logic circuitry 222 of the control circuitry 220) may perform operations (or cause operations to be performed) on the bit string(s) and store intermediate results of the operations in the peripheral sense amplifiers 211. As described above, in some embodiments the bit string(s) may be formatted in a unum or posit format.

As described in more detail in connection with FIG. 3 and FIGS. 4A-4B, universal numbers and posits may provide improved accuracy and may require less storage space (e.g., may contain a smaller number of bits) than corresponding bit strings represented in the floating-point format. For example, a numerical value represented by a floating-point number may be represented by a posit with a smaller bit width than that of the corresponding floating-point number. Accordingly, by performing operations (e.g., arithmetic operations, logical operations, bit-wise operations, vector operations, etc.) using posit bit strings, the performance of the memory device 204 may be improved in comparison to approaches that use only floating-point bit strings, because subsequent operations (e.g., arithmetic and/or logical operations) may be performed more quickly on the posit bit strings (e.g., because the data in the posit format is smaller and therefore requires less time to operate on). In addition, the performance of the memory device 204 may be improved in comparison to approaches that use only floating-point bit strings because less memory space is required in the memory device 204 to store bit strings in the posit format, which may free up additional space in the memory device 204 for other bit strings, data, and/or other operations.

In some embodiments, the control circuitry 220 may perform (or cause performance of) recursive arithmetic and/or logical operations on posit bit strings. For example, the control circuitry 220 may be configured to perform (or cause performance of) recursive arithmetic operations such as recursive addition, recursive subtraction, recursive multiplication, recursive division, fused multiply-add operations, multiply-accumulate operations, recursive dot product operations, greater than or less than, absolute value (e.g., FABS()), fast Fourier transforms, inverse fast Fourier transforms, sigmoid function operations, convolution operations, recursive square root operations, recursive exponent operations, and/or recursive logarithm operations, and/or recursive logical operations such as AND, OR, XOR, NOT, etc., as well as recursive trigonometric operations such as sine, cosine, tangent, etc. As will be appreciated, the foregoing list of operations is not intended to be exhaustive, nor is it intended to be limiting, and the control circuitry 220 may be configured to perform (or cause performance of) other arithmetic and/or logical operations using posit bit strings.

In some embodiments, the control circuitry 220 may perform the operations listed above in conjunction with execution of one or more machine learning algorithms. For example, the control circuitry 220 may perform operations related to one or more neural networks. Neural networks may allow an algorithm to be trained over time to determine an output response based on input signals. For example, over time, a neural network may essentially learn to better maximize the chance of completing a particular goal. This may be advantageous in machine learning applications because the neural network may be trained over time with new data to achieve better maximization of the chance of completing the particular goal. A neural network may be trained over time to improve performance of particular tasks and/or particular goals. However, in some approaches, machine learning (e.g., neural network training) may be processing intensive (e.g., may consume large amounts of computer processing resources) and/or time intensive (e.g., may require lengthy calculations that consume multiple cycles).

In contrast, by performing such operations using the control circuitry 220, for example, by performing such operations on bit strings in the posit format, the amount of processing resources and/or the amount of time consumed in performing the operations may be reduced in comparison to approaches in which such operations are performed using bit strings in a floating-point format. Further, by storing intermediate results of the recursive operations in the peripheral sense amplifiers 211, the accuracy of the bit string that represents the final result of the recursive operation may be higher than in approaches that truncate intermediate results of the recursive operation or that store intermediate results of the recursive operation in a hidden scratch area.

In some embodiments, the controller 210 may be configured to cause the control circuitry 220 to perform operations using bit strings without encumbering the host 202 (e.g., without receiving from the host 202 an intervening command or a command separate from the command that initiates performance of the operation, and/or without transferring results of the operations to the host 202). Embodiments are not so limited, however, and in some embodiments, the controller 210 may be configured to cause the control circuitry 220 (e.g., the logic circuitry) to perform recursive arithmetic and/or recursive logical operations using bit strings, to store intermediate results of such operations in the peripheral sense amplifiers 211, and/or to round the final result of the recursive operation (which may be stored in the peripheral sense amplifiers 211 and/or XRA component(s)) such that the final result has a particular bit width associated with it.

For example, the control circuitry 220 may be configured to cause performance of a recursive operation using one or more bit strings and/or to accumulate (e.g., store) successive resultant bit strings, each representing the result of a corresponding iteration of the recursive operation, in the peripheral sense amplifiers 211 (e.g., a plurality of sense amplifiers). In some embodiments, the control circuitry 220 may be further configured to accumulate each successive resultant bit string in the plurality of sense amplifiers 211 by overwriting the previous resultant bit string stored in the plurality of sense amplifiers 211, as described below.

The one or more bit strings, the resultant bit strings, or both may be formatted according to a Type III universal number format or a posit format. Further, as described above, the peripheral sense amplifiers 211 may be located at the periphery of the memory array 230. That is, in some embodiments, the peripheral sense amplifiers 211 may be located in an area of the memory device 204 that is physically distinct from the area in which the memory array 230 is located.

In some embodiments, performance of the recursive operation may include performing an arithmetic operation, a logical operation, a bit-wise operation, a vector operation, or a combination thereof. In response to a determination that the recursive operation is complete, the control circuitry 220 may be configured to round (e.g., truncate) the last resultant bit string stored in the plurality of sense amplifiers 211 such that the last resultant bit string has a particular bit width. For example, the control circuitry 220 may round the last resultant bit string stored in the plurality of sense amplifiers 211 to a bit width of 8 bits, 16 bits, 32 bits, 64 bits, etc. In some embodiments, the control circuitry 220 may be configured to delete at least one bit from the mantissa bit subset or the exponent bit subset of the last resultant bit string (described in more detail in connection with FIGS. 3, 4A, and 4B) to truncate the last resultant bit string to the particular bit width.
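The accuracy benefit of accumulating full-width results and rounding only once can be illustrated with a minimal sketch. It is not the circuitry described here: it assumes a plain fixed-point fraction instead of a posit, models the peripheral sense amplifiers as a single exact accumulator variable, and uses the hypothetical helper names `truncate` and `accumulate`.

```python
from fractions import Fraction

def truncate(value: Fraction, frac_bits: int) -> Fraction:
    """Drop everything below 2**-frac_bits (assumed fixed-point stand-in, not a posit)."""
    scale = 1 << frac_bits
    return Fraction(int(value * scale), scale)  # int() truncates toward zero

def accumulate(terms, frac_bits: int, truncate_each_step: bool) -> Fraction:
    """Run a multiply-accumulate recursion over (a, b) pairs."""
    acc = Fraction(0)  # stands in for the accumulator at the array periphery
    for a, b in terms:
        acc = acc + a * b  # each new result overwrites the previous one
        if truncate_each_step:
            acc = truncate(acc, frac_bits)  # approach that truncates intermediates
    return truncate(acc, frac_bits)  # single rounding step once the recursion is done

# Nine products of 1/3 * 1/3 sum to exactly 1.
terms = [(Fraction(1, 3), Fraction(1, 3))] * 9
full_width = accumulate(terms, frac_bits=4, truncate_each_step=False)
stepwise = accumulate(terms, frac_bits=4, truncate_each_step=True)
print(full_width, stepwise)  # 1 9/16
```

In this toy case, truncating after every iteration loses almost half of the true value, while deferring the truncation to the end returns the exact sum.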

As described above in connection with FIG. 1, the memory array 230 may be, for example, a DRAM array, an SRAM array, an STT RAM array, a PCRAM array, a TRAM array, an RRAM array, a NAND flash array, and/or a NOR flash array, although embodiments are not limited to these particular examples. The memory array 230 may function as main memory for the computing system 200 shown in FIGS. 2A and 2B. In some embodiments, the memory array 230 may be configured to store bit strings to be operated on by the control circuitry 220 (e.g., a bit string representing the final result of a performed recursive operation) and/or to store bit strings to be transferred to the control circuitry 220 prior to performance of operations using those bit strings.

In some embodiments, bit strings (e.g., posit bit strings) may be generated and/or stored in the memory array 230 without encumbering the host 202. For example, the bit strings may be generated and/or stored in the memory array 230 without receiving multiple commands from the host 202. In other words, in some embodiments, the host 202 may send a single command to the memory device to request performance of an operation using one or more bit strings. In response to receipt of the command requesting performance of the operation, the memory device 204 (e.g., the controller 210, the control circuitry 220, or another component of the memory device 204) may perform the operation and/or retrieve a stored result of the operation in the absence of additional commands from the host 202. This may reduce traffic across the channels 203/205, which may increase performance of a computing device associated with the host 202 and/or the memory device 204.

As shown in FIG. 2A, the memory array may include a plurality of memory cells, some of which may be grouped into a data structure 209. For example, in some embodiments, the data structure 209 may be made up of a plurality of memory cells; the distinction between the memory cells of the memory array 230 and the data structure 209 is drawn in FIG. 2A to help distinguish the portion of the memory cells reserved for use as the data structure 209 from the remaining memory cells of the memory array 230, which are free to perform the functions generally performed by memory cells in the operation of the memory array 230.

The data structure 209 may allow bit strings (e.g., posit bit strings) to be organized and stored. In some embodiments, the data structure 209 may be a table (e.g., a look-up table), a tree, a record, or another suitable data structure that organizes and stores posit bit strings within the memory array 230.

The data structure 209 may have a predetermined size (e.g., upon receipt of a power signal, such as a power-up or initiation signal that initializes the memory array, the memory array 230 may allocate a fixed number of memory cells for use as the data structure), or the data structure 209 may be dynamically allocated, for example, by the controller 210. In some embodiments, the data structure 209 may have a size of around 8 megabytes (MB), although embodiments are not limited to this particular size. For example, where the posit bit strings in the example described above each have a bit width of 8 bits (e.g., a posit bit string operand A, a posit bit string operand B, and a resultant posit bit string representing the result of an operation performed between posit bit string operand A and posit bit string operand B), the data structure 209 may be around 8 MB in size. However, in embodiments in which more than three 8-bit posit bit strings are stored in the data structure 209 of the memory array 230, and/or in which the posit bit strings are narrower than 8 bits (e.g., 6-bit posit bit strings, 4-bit posit bit strings, etc.) or wider than 8 bits (e.g., 16-bit, 32-bit, 64-bit, etc.), the data structure may be smaller or larger than 8 MB.

In a non-limiting example, the data structure 209 may be configured to store three posit bit strings. The three posit bit strings may correspond to a first posit bit string operand ("β"), a second posit bit string operand, and the result of an arithmetic or logical operation performed using the posit bit string operand β and the second posit bit string operand. In this example, the control circuitry 220 may perform the requested operation (e.g., an arithmetic and/or logical operation) between the posit bit string operand β and the second posit bit string operand and store the result of the operation (as well as both operands) in the data structure 209 of the memory array 230. If performance of the operation is required at a subsequent point in time, the controller 210 may request retrieval of the result of the operation between the two posit bit strings from the data structure 209 of the memory array 230, for example, as part of performance of a recursive operation.

Continuing with this non-limiting example, if the operation performed using the posit bit string β and the second posit bit string is a recursive operation, the result of the operation (e.g., the arithmetic or logical operation) performed using the two posit bit strings stored in the memory array 230 may be transferred to and stored in the peripheral sense amplifiers 211. Thereafter, results of subsequent operations performed as part of the recursive operation may be transferred to the peripheral sense amplifiers 211 and stored such that the iterations of the recursive operation are accumulated in the peripheral sense amplifiers 211. As described herein, once the final result of the recursive operation has been accumulated in the peripheral sense amplifiers 211, an operation to round the final result may be performed to truncate the result of the recursive operation to a particular bit width.

In another non-limiting example in which the memory array 230 is coupled to a plurality of sense amplifiers (e.g., the peripheral sense amplifiers 211) and the control circuitry 220, the control circuitry 220 may be configured to determine respective address locations in the data structure 209 within the memory array 230 at which a first bit string and a second bit string are stored. The first posit bit string and the second posit bit string may each represent the result of an arithmetic operation, a logical operation, or both. The control circuitry 220 may be configured to execute a command to retrieve at least one of the first posit bit string and the second posit bit string from the memory array 230 and/or to store at least one of the first posit bit string and the second posit bit string in the plurality of sense amplifiers 211. Embodiments are not limited to storing the first posit bit string and/or the second posit bit string in the plurality of sense amplifiers, however, and in some embodiments, the control circuitry 220 may be configured to store at least one of the first posit bit string and the second posit bit string in peripheral sense amplifiers that include one or more XRA component(s).

As described above, the control circuitry 220 is configured to perform the arithmetic operation, the logical operation, or both before the first bit string and the second bit string are stored in the data structure. For example, the control circuitry 220 may be configured to perform arithmetic and/or logical operations using one or more posit bit string operands and to store the results of the operations in the data structure 209 of the memory array 230 for later use.

By performing the arithmetic and/or logical operations using the control circuitry 220 and then storing the results of the operations in the data structure 209 of the memory array 230, the results (and/or the posit bit string operands A and B) may be made available for use by the memory device 204 and/or the host 202 more quickly than in approaches in which arithmetic and/or logical operations are performed in "real time" (e.g., approaches in which the arithmetic and/or logical operation is performed each time its result is needed).
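The trade-off between storing results ahead of time and computing them on demand can be sketched as a look-up table keyed by operand bit patterns. This is only an illustration under stated assumptions: `build_result_table` and `wrapping_add` are hypothetical names, a Python dictionary stands in for the data structure 209, and wrapping integer addition on raw bit patterns stands in for a real posit operation.

```python
from itertools import product

def build_result_table(op, width: int = 8) -> dict:
    """Precompute op(a, b) for every pair of width-bit operand patterns."""
    mask = (1 << width) - 1
    patterns = range(1 << width)
    # Key: (operand A pattern, operand B pattern); value: result pattern.
    return {(a, b): op(a, b) & mask for a, b in product(patterns, repeat=2)}

def wrapping_add(a: int, b: int) -> int:
    """Placeholder operation (assumption); a real device would use posit arithmetic."""
    return a + b

# Build once, then answer later requests by retrieval instead of recomputation.
table = build_result_table(wrapping_add, width=4)
print(len(table))      # 256 entries for 4-bit operands
print(table[(15, 1)])  # 0, since 15 + 1 wraps in 4 bits
```

The table grows with the square of the number of operand patterns, which is one reason the size of such a data structure depends on the bit width of the stored bit strings.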

As described herein, the arithmetic and/or logical operations may be performed as part of a recursive operation. For example, at least one of the first posit bit string and the second posit bit string may be stored in the plurality of sense amplifiers 211 as part of performance of a recursive operation that uses at least one of the first posit bit string and the second posit bit string as an operand. By storing at least one of the first posit bit string and the second posit bit string in the plurality of sense amplifiers 211 during performance of the recursive operation, the accuracy of intermediate iterations of the recursive operation may be preserved until the recursive operation is complete.

Once it is determined that the recursive operation is complete, the control circuitry 220 may be configured to round the result of the recursive operation. For example, the control circuitry 220 may be configured to remove at least one bit from the mantissa bit subset or the exponent bit subset of at least one of the first posit bit string and the second posit bit string stored in the plurality of sense amplifiers 211, thereby rounding or truncating that bit string.

The control circuitry 220 may be configured to determine the respective address locations within the memory array 230 at which the first bit string and the second bit string are stored in response to receipt of an initiation command from the host 202 coupled to the memory device 204. In some embodiments, the control circuitry 220 may be further configured to execute the command to retrieve at least one of the first posit bit string and the second posit bit string from the memory array 230 without receiving any command in addition to the initiation command. For example, the control circuitry 220 may be configured to store at least one of the first posit bit string and the second posit bit string in the plurality of sense amplifiers 211 without receiving any command in addition to the initiation command.

In some embodiments, the control circuitry 220 may be configured to store at least one of the first posit bit string and the second posit bit string in the plurality of sense amplifiers 211 by, for example, sending a signal that causes at least one of the first posit bit string and the second posit bit string to be transferred via the main memory input/output (I/O) circuitry 214 to circuitry external to the array 230.

FIG. 2B is a functional block diagram in the form of a computing system 200 including a host 202, a memory device 204, an application-specific integrated circuit 223, and a field-programmable gate array 221 in accordance with a number of embodiments of the present disclosure. Each of the components (e.g., the host 202, the memory device 204, the FPGA 221, the ASIC 223, etc.) may be separately referred to herein as an "apparatus."

As shown in FIG. 2B, the host 202 may be coupled to the memory device 204 via channel(s) 203, which may be analogous to the channel(s) 203 shown in FIG. 2A. The field-programmable gate array (FPGA) 221 may be coupled to the host 202 via channel(s) 217, and the application-specific integrated circuit (ASIC) 223 may be coupled to the host 202 via channel(s) 219. In some embodiments, the channel(s) 217 and/or the channel(s) 219 may include a Peripheral Component Interconnect Express (PCIe) interface, although embodiments are not so limited, and the channel(s) 217 and/or the channel(s) 219 may include other types of interfaces, buses, communication channels, etc. to facilitate transfer of data between the host 202 and the FPGA 221 and/or the ASIC 223.

As described above, circuitry located on the memory device 204 (e.g., the control circuitry 220 shown in FIG. 2A) may perform recursive operations using posit bit strings and may store intermediate results of the recursive operations in a peripheral location of the memory device 204 (e.g., the peripheral sense amplifiers 211 shown in FIG. 2A). Embodiments are not so limited, however, and in some embodiments, the recursive operation(s) may be performed by the FPGA 221 and/or the ASIC 223. In embodiments in which the FPGA 221 and/or the ASIC 223 are configured to perform recursive operations, the FPGA 221 and/or the ASIC 223 may be configured to store intermediate results of the recursive operations in the memory device 204, for example, in the peripheral sense amplifiers 211 shown in FIG. 2A.

As described above, non-limiting examples of recursive arithmetic and/or recursive logical operations that may be performed by the FPGA 221 and/or the ASIC 223 include arithmetic operations using posit bit strings, such as addition, subtraction, multiplication, division, fused multiply-add, multiply-accumulate, dot product units, greater than or less than, absolute value (e.g., FABS()), fast Fourier transforms, inverse fast Fourier transforms, sigmoid functions, convolution, square root, exponent, and/or logarithm operations, and/or logical operations such as AND, OR, XOR, NOT, etc., as well as trigonometric operations such as sine, cosine, tangent, etc.

The FPGA 221 may include a state machine 227 and/or register(s) 229. The state machine 227 may include one or more processing devices configured to perform operations on an input and produce an output. For example, the FPGA 221 may be configured to receive posit bit strings from the host 202 or the memory device 204 and to perform one or more recursive operations using the posit bit strings as operands. After each iteration of the recursive operation is complete, the FPGA 221 may store a bit string representing the result of the iteration in the memory device 204, for example, in the peripheral sense amplifiers 211 shown in FIG. 2A.

The register(s) 229 of the FPGA 221 may be configured to buffer and/or store posit bit strings received from the host 202 before the state machine 227 performs recursive operations using the received posit bit strings. In addition, the register(s) 229 of the FPGA 221 may be configured to buffer and/or store intermediate results of iterations of the recursive operations before the results are transferred to circuitry external to the FPGA 221, such as the host 202 or the memory device 204.

The ASIC 223 may include logic circuitry 215 and/or a cache 217. The logic circuitry 215 may include circuitry configured to perform operations on an input and produce an output. In some embodiments, the ASIC 223 is configured to receive posit bit strings from the host 202 and/or the memory device 204 and to perform one or more recursive operations using the posit bit strings as operands.

The cache 217 of the ASIC 223 may be configured to buffer and/or store posit bit strings received from the host 202 before the logic circuitry 215 performs operations on the received posit bit strings. In addition, the cache 217 of the ASIC 223 may be configured to buffer and/or store intermediate results of iterations of the recursive operations before the results are transferred to circuitry external to the ASIC 223, such as the host 202 or the memory device 204.

Although the FPGA 221 is shown as including the state machine 227 and the register(s) 229, in some embodiments, the FPGA 221 may include logic circuitry, such as the logic circuitry 215, and/or a cache, such as the cache 217, in addition to or in lieu of the state machine 227 and/or the register(s) 229. Similarly, in some embodiments, the ASIC 223 may include a state machine, such as the state machine 227, and/or register(s), such as the register(s) 229, in addition to or in lieu of the logic circuitry 215 and/or the cache 217.

FIG. 3 is an example of an n-bit universal number, or "unum," with es exponent bits. In the example of FIG. 3, the n-bit unum is a posit bit string 331. As shown in FIG. 3, the n-bit posit 331 may include a set of sign bit(s) (e.g., a first bit subset or sign bit subset 333), a set of regime bits (e.g., a second bit subset or regime bit subset 335), a set of exponent bits (e.g., a third bit subset or exponent bit subset 337), and a set of mantissa bits (e.g., a fourth bit subset or mantissa bit subset 339). The mantissa bits 339 may alternatively be referred to as a "fraction portion" or as "fraction bits," and may represent the portion of a bit string (e.g., a number) that follows a decimal point.

The sign bit 333 may be zero (0) for positive numbers and one (1) for negative numbers. The regime bits 335 are described in connection with Table 1 below, which shows (binary) bit strings and their related numerical meaning, k. In Table 1, the numerical meaning, k, is determined by the run length of the bit string. The letter X in the binary portion of Table 1 indicates that the bit value is irrelevant for determination of the regime, because the (binary) bit string is terminated in response to successive bit flips or when the end of the bit string is reached. For example, in the (binary) bit string 0010, the bit string terminates in response to a zero flipping to a one and then back to a zero. Accordingly, the last zero is irrelevant with respect to the regime; all that is considered for the regime are the leading identical bits and the first opposite bit that terminates the bit string (if the bit string includes such a bit).

| Binary        | 0000 | 0001 | 001X | 01XX | 10XX | 110X | 1110 | 1111 |
|---------------|------|------|------|------|------|------|------|------|
| Numerical (k) | -4   | -3   | -2   | -1   | 0    | 1    | 2    | 3    |

In FIG. 3, the regime bits 335 r correspond to the identical bits in the bit string, while the regime bits 335 $\bar{r}$ correspond to the opposite bit that terminates the bit string. For example, for the numerical k value -2 shown in Table 1, the regime bits r correspond to the first two leading zeros, while the regime bit(s) $\bar{r}$ correspond to the one. As noted above, the final bit of that entry, represented by X in Table 1, is irrelevant to the regime.

If m corresponds to the number of identical bits in the bit string, then k = -m if the bits are zero, and k = m - 1 if the bits are one. This is illustrated in Table 1 where, for example, the (binary) bit string 10XX has a single one, so k = m - 1 = 1 - 1 = 0. Similarly, the (binary) bit string 0001 includes three zeros, so k = -m = -3. The regime can indicate a scale factor of $useed^{k}$, where $useed = 2^{2^{es}}$. Several example values of useed are shown in Table 2 below.

Table 2

es:       0    1         2          3            4
useed:    2    2² = 4    4² = 16    16² = 256    256² = 65536
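The run-length rule for k in Table 1 and the useed values in Table 2 can be reproduced with a short sketch. The Python below is illustrative only: the function names `regime_k` and `useed` are hypothetical and do not come from the disclosure, and the regime field is passed as a text string in which X marks a bit that does not affect the regime.

```python
def regime_k(regime_bits: str) -> int:
    """Return the numerical meaning k of a regime field such as '001X'.

    m counts the leading identical bits; the run ends at the first
    opposite bit or at the end of the string.
    """
    lead = regime_bits[0]
    m = 0
    for bit in regime_bits:
        if bit != lead:
            break
        m += 1
    # k = -m for a run of zeros, k = m - 1 for a run of ones
    return -m if lead == "0" else m - 1


def useed(es: int) -> int:
    """Scale base of the regime: useed = 2^(2^es)."""
    return 2 ** (2 ** es)


if __name__ == "__main__":
    for pattern in ["0000", "0001", "001X", "01XX", "10XX", "110X", "1110", "1111"]:
        print(pattern, regime_k(pattern))
    for es in range(5):
        print(es, useed(es))
```

Under these assumptions the loop walks the eight patterns of Table 1 and the five es values of Table 2, so the printed values can be compared against the tables directly.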

The exponent bits 337 correspond to an exponent e, as an unsigned number. In contrast to floating-point numbers, the exponent bits 337 described herein may not have a bias associated with them. As a result, the exponent bits 337 can represent scaling by a factor of $2^{e}$. As shown in FIG. 3, there can be up to es exponent bits ($e_1, e_2, e_3, \ldots, e_{es}$), depending on how many bits remain to the right of the regime bits 335 of the n-bit posit 331. In some embodiments, this can allow for tapered accuracy of the n-bit posit 331, in which numbers closer in magnitude to one have a higher accuracy than numbers that are very large or very small. However, because very large or very small numbers may be used less frequently in certain kinds of operations, the tapered accuracy behavior of the n-bit posit 331 shown in FIG. 3 can be desirable in a wide range of situations.

The mantissa bits 339 (or fraction bits) represent any additional bits that may be part of the n-bit posit 331 and that lie to the right of the exponent bits 337. Similar to floating-point bit strings, the mantissa bits 339 represent a fraction f, which can be analogous to the fraction 1.f, where f includes one or more bits to the right of the decimal point following the one. In contrast to floating-point bit strings, however, in the n-bit posit 331 shown in FIG. 3 the "hidden bit" (e.g., the one) can always be one (e.g., unity), whereas floating-point bit strings can include a subnormal number with a "hidden bit" of zero (e.g., 0.f).

As described herein, altering the numerical value or the quantity of bits of one or more of the sign 333 bit subset, the regime 335 bit subset, the exponent 337 bit subset, or the mantissa 339 bit subset can vary the precision of the n-bit posit 331. For example, changing the total number of bits in the n-bit posit 331 can alter the resolution of the n-bit posit bit string 331. That is, an 8-bit posit can be converted to a 16-bit posit by, for example, increasing the numerical values and/or the quantity of bits associated with one or more of the posit bit string's constituent bit subsets, thereby increasing the resolution of the posit bit string. Conversely, the resolution of a posit bit string can be decreased, for example from a 64-bit resolution to a 32-bit resolution, by decreasing the numerical values and/or the quantity of bits associated with one or more of the posit bit string's constituent bit subsets.

In some embodiments, altering the numerical value and/or the quantity of bits associated with one or more of the regime 335 bit subset, the exponent 337 bit subset, and/or the mantissa 339 bit subset to vary the precision of the n-bit posit 331 can lead to an alteration of at least one other of the regime 335 bit subset, the exponent 337 bit subset, and/or the mantissa 339 bit subset. For example, when the precision of the n-bit posit 331 is altered to increase the resolution of the n-bit posit bit string 331 (e.g., when performing an "up-convert" operation to increase the bit width of the n-bit posit bit string 331), the numerical value and/or the quantity of bits associated with one or more of the regime 335 bit subset, the exponent 337 bit subset, and/or the mantissa 339 bit subset may be altered.

In a non-limiting example in which the resolution of the n-bit posit bit string 331 is increased (e.g., the precision of the n-bit posit bit string 331 is varied to increase the bit width of the n-bit posit bit string 331) but the numerical value or the quantity of bits associated with the exponent 337 bit subset does not change, the numerical value or the quantity of bits associated with the mantissa 339 bit subset may be increased. In at least one embodiment, increasing the numerical value and/or the quantity of bits of the mantissa 339 bit subset while the exponent 337 bit subset remains unchanged can include adding one or more zero bits to the mantissa 339 bit subset.

In another non-limiting example in which the resolution of the n-bit posit bit string 331 is increased (e.g., the precision of the n-bit posit bit string 331 is varied to increase the bit width of the n-bit posit bit string 331) by altering the numerical value and/or the quantity of bits associated with the exponent 337 bit subset, the numerical value and/or the quantity of bits associated with the regime 335 bit subset and/or the mantissa 339 bit subset may be either increased or decreased. For example, if the numerical value and/or the quantity of bits associated with the exponent 337 bit subset is increased or decreased, corresponding alterations may be made to the numerical value and/or the quantity of bits associated with the regime 335 bit subset and/or the mantissa 339 bit subset. In at least one embodiment, increasing or decreasing the numerical value and/or the quantity of bits associated with the regime 335 bit subset and/or the mantissa 339 bit subset can include adding one or more zero bits to the regime 335 bit subset and/or the mantissa 339 bit subset, and/or truncating the numerical value or the quantity of bits associated with the regime 335 bit subset and/or the mantissa 339 bit subset.

In another example in which the resolution of the n-bit posit bit string 331 is increased (e.g., the precision of the n-bit posit bit string 331 is varied to increase the bit width of the n-bit posit bit string 331), the numerical value and/or the quantity of bits associated with the exponent 337 bit subset may be increased and the numerical value and/or the quantity of bits associated with the regime 335 bit subset may be decreased. Conversely, in some embodiments, the numerical value and/or the quantity of bits associated with the exponent 337 bit subset may be decreased and the numerical value and/or the quantity of bits associated with the regime 335 bit subset may be increased.

In a non-limiting example in which the resolution of the n-bit posit bit string 331 is decreased (e.g., the precision of the n-bit posit bit string 331 is varied to decrease the bit width of the n-bit posit bit string 331) but the numerical value or the quantity of bits associated with the exponent 337 bit subset does not change, the numerical value or the quantity of bits associated with the mantissa 339 bit subset may be decreased. In at least one embodiment, decreasing the numerical value and/or the quantity of bits of the mantissa 339 bit subset while the exponent 337 bit subset remains unchanged can include truncating the numerical value and/or the quantity of bits associated with the mantissa 339 bit subset.

In another non-limiting example in which the resolution of the n-bit posit bit string 331 is decreased (e.g., the precision of the n-bit posit bit string 331 is varied to decrease the bit width of the n-bit posit bit string 331) by altering the numerical value and/or the quantity of bits associated with the exponent 337 bit subset, the numerical value and/or the quantity of bits associated with the regime 335 bit subset and/or the mantissa 339 bit subset may be either increased or decreased. For example, if the numerical value and/or the quantity of bits associated with the exponent 337 bit subset is increased or decreased, corresponding alterations may be made to the numerical value and/or the quantity of bits associated with the regime 335 bit subset and/or the mantissa 339 bit subset. In at least one embodiment, increasing or decreasing the numerical value and/or the quantity of bits associated with the regime 335 bit subset and/or the mantissa 339 bit subset can include adding one or more zero bits to the regime 335 bit subset and/or the mantissa 339 bit subset, and/or truncating the numerical value or the quantity of bits associated with the regime 335 bit subset and/or the mantissa 339 bit subset.
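The simplest of the precision changes described above, adding zero bits to widen a bit string and truncating bits to narrow it, might be sketched as follows. This is a minimal illustration, assuming that es is left unchanged, that bit strings are handled as text, and that bits are added or removed at the least significant end; the function names `up_convert` and `down_convert_truncate` are hypothetical, and no rounding is modeled.

```python
def up_convert(bits: str, new_width: int) -> str:
    """Widen a posit bit string by appending zero bits (es unchanged)."""
    if new_width < len(bits):
        raise ValueError("new_width must not be smaller than the current width")
    return bits + "0" * (new_width - len(bits))


def down_convert_truncate(bits: str, new_width: int) -> str:
    """Narrow a posit bit string by truncating its least significant bits."""
    if new_width > len(bits):
        raise ValueError("new_width must not be larger than the current width")
    return bits[:new_width]


if __name__ == "__main__":
    eight_bit = "01011010"
    sixteen_bit = up_convert(eight_bit, 16)
    print(sixteen_bit)                            # widened copy of the 8-bit string
    print(down_convert_truncate(sixteen_bit, 8))  # back to the original 8 bits
```

Truncation in this sketch simply discards the low-order bits, so narrowing a string that carries nonzero bits in the discarded positions loses that information.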

In some embodiments, changing the numerical value and/or the quantity of bits in the exponent bit subset can alter the dynamic range of the n-bit posit 331. For example, a 32-bit posit bit string with an exponent bit subset having a numerical value of zero (e.g., a 32-bit posit bit string with es = 0, or a (32,0) posit bit string) can have a dynamic range of approximately 18 decades. However, a 32-bit posit bit string with an exponent bit subset having a numerical value of 3 (e.g., a 32-bit posit bit string with es = 3, or a (32,3) posit bit string) can have a dynamic range of approximately 145 decades.
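The figures of roughly 18 and 145 decades can be checked with a small calculation. The sketch below rests on an assumption taken from the usual posit definition rather than from the text above: the largest positive value of an n-bit posit is useed^(n-2) and the smallest positive value is its reciprocal. The function name `dynamic_range_decades` is hypothetical.

```python
import math


def dynamic_range_decades(n: int, es: int) -> float:
    """log10(maxpos / minpos) for an (n, es) posit.

    Assumes maxpos = useed**(n - 2) and minpos = 1 / maxpos,
    with useed = 2**(2**es).
    """
    log2_useed = 2 ** es              # log2(useed)
    log2_maxpos = (n - 2) * log2_useed
    # maxpos / minpos = maxpos**2, so the log2 of the ratio doubles
    return 2 * log2_maxpos * math.log10(2)


if __name__ == "__main__":
    print(round(dynamic_range_decades(32, 0), 1))
    print(round(dynamic_range_decades(32, 3), 1))
```

Under that assumption the (32,0) case comes out slightly above 18 decades and the (32,3) case slightly below 145 decades, in line with the approximate values stated above.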

FIG. 4A is an example of positive values for a 3-bit posit. Only the right half of the projective real numbers is shown in FIG. 4A; however, it will be appreciated that negative projective real numbers corresponding to the positive counterparts shown in FIG. 4A can exist on a curve representing a transformation about the y-axis of the curve shown in FIG. 4A.

In the example of FIG. 4A, es = 2, so $useed = 2^{2^{es}} = 16$. The precision of a posit 431-1 can be increased by appending bits to the bit string, as shown in FIG. 4B. For example, appending a bit with a value of one (1) to the bit strings of the posit 431-1 increases the accuracy of the posit 431-1, as shown by the posit 431-2 in FIG. 4B. Similarly, appending a bit with a value of one to the bit strings of the posit 431-2 in FIG. 4B increases the accuracy of the posit 431-2, as shown by the posit 431-3 in FIG. 4B. An example of the interpolation rules that may be used to append bits to the bit strings of the posit 431-1 shown in FIG. 4A to obtain the posits 431-2 and 431-3 shown in FIG. 4B follows.

If maxpos is the largest positive value of the bit strings of the posits 431-1, 431-2, 431-3 and minpos is the smallest positive value of the bit strings of the posits 431-1, 431-2, 431-3, maxpos may be equal to useed and minpos may be equal to $\frac{1}{useed}$. Between maxpos and ±∞, a new bit value may be maxpos × useed, and between zero and minpos, a new bit value may be $\frac{minpos}{useed}$. These new bit values can correspond to a new regime bit 335. Between existing values $x = 2^{m}$ and $y = 2^{n}$, where m and n differ by more than one, the new bit value may be given by the geometric mean, $\sqrt{x \times y} = 2^{(m+n)/2}$, which corresponds to a new exponent bit 337. If the new bit value is midway between the existing x and y values next to it, the new bit value can represent the arithmetic mean, $\frac{x+y}{2}$, which corresponds to a new mantissa bit 339.
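The interpolation rules can be tried out on the positive values of a 3-bit posit with useed = 16 (1/16, 1, and 16). The sketch below is illustrative only; the helper names `exact_log2`, `value_between`, and `refine` are hypothetical, exact rational arithmetic stands in for posit hardware, and the geometric-mean branch assumes that m + n is even so that the result is again a power of two.

```python
from fractions import Fraction


def exact_log2(value: Fraction):
    """Return e if value == 2**e for an integer e, otherwise None."""
    num, den = value.numerator, value.denominator
    if den == 1 and (num & (num - 1)) == 0:
        return num.bit_length() - 1
    if num == 1 and (den & (den - 1)) == 0:
        return -(den.bit_length() - 1)
    return None


def value_between(x, y) -> Fraction:
    """New value inserted between two adjacent positive values x < y."""
    x, y = Fraction(x), Fraction(y)
    m, n = exact_log2(x), exact_log2(y)
    if m is not None and n is not None and abs(n - m) > 1:
        if (m + n) % 2:
            raise ValueError("geometric mean is not an integer power of two")
        return Fraction(2) ** ((m + n) // 2)   # geometric mean sqrt(x * y)
    return (x + y) / 2                         # arithmetic mean


def refine(positive_values, useed):
    """Insert one new value into every gap of a sorted list of positive values."""
    vals = [Fraction(v) for v in positive_values]
    out = [vals[0] / useed]                    # between 0 and minpos
    for left, right in zip(vals, vals[1:]):
        out += [left, value_between(left, right)]
    out += [vals[-1], vals[-1] * useed]        # between maxpos and +/- infinity
    return out


if __name__ == "__main__":
    four_bit = refine([Fraction(1, 16), 1, 16], 16)
    print([str(v) for v in four_bit])
```

Applying `refine` a second time to its own output yields fifteen positive values, which is the count expected for a 5-bit posit once zero and ±∞ are set aside.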

FIG. 4B is an example of posit construction using two exponent bits. Only the right half of the projective real numbers is shown in FIG. 4B; however, it will be appreciated that negative projective real numbers corresponding to the positive counterparts shown in FIG. 4B can exist on a curve representing a transformation about the y-axis of the curve shown in FIG. 4B. The posits 431-1, 431-2, 431-3 shown in FIG. 4B each include only two exception values: zero (0) when all the bits of the bit string are zero, and ±∞ when the bit string is a one followed by all zeros. It is noted that the numerical values of the posits 431-1, 431-2, 431-3 shown in FIG. 4B are exactly $useed^{k}$. That is, the numerical values of the posits 431-1, 431-2, 431-3 shown in FIG. 4B are exactly useed raised to the power of the k value represented by the regime (e.g., the regime bits 335 described above in connection with FIG. 3). In FIG. 4B, the posit 431-1 has es = 2, so $useed = 2^{2^{es}} = 16$; the posit 431-2 has es = 3, so $useed = 2^{2^{es}} = 256$; and the posit 431-3 has es = 4, so useed = 4096.

As an illustrative example of adding bits to the 3-bit posit 431-1 to create the 4-bit posit 431-2 of FIG. 4B, useed = 256, so the bit string corresponding to the useed of 256 has an additional regime bit appended thereto, and the former useed, 16, has a terminating regime bit ($\bar{r}$) appended thereto. As described above, between existing values, the corresponding bit strings have an additional exponent bit appended thereto. For example, the numerical values 1/16, 1/4, 1, and 4 have an exponent bit appended thereto. That is, the final one corresponding to the numerical value 4 is an exponent bit, the final zero corresponding to the numerical value 1 is an exponent bit, and so on. This pattern can be further seen in the posit 431-3, which is a 5-bit posit generated according to the rules above from the 4-bit posit 431-2. If another bit were added to the posit 431-3 in FIG. 4B to generate a 6-bit posit, mantissa bits 339 would be appended to the numerical values between 1/16 and 16.

A non-limiting example of decoding a posit (e.g., a posit 431) to obtain its numerical equivalent follows. In some embodiments, the bit string corresponding to a posit p is a signed integer ranging from $-2^{n-1}$ to $2^{n-1}$, k is an integer corresponding to the regime bits 335, and e is an unsigned integer corresponding to the exponent bits 337. If the set of mantissa bits 339 is represented as $\{f_1 f_2 \ldots f_{fs}\}$ and f is the value represented by $1.f_1 f_2 \ldots f_{fs}$ (e.g., by a one followed by a decimal point followed by the mantissa bits 339), then the numerical value x represented by p can be given by Equation 2 below.

$$x = \begin{cases} 0, & p = 0 \\ \pm\infty, & p = -2^{n-1} \\ \operatorname{sign}(p) \times useed^{k} \times 2^{e} \times f, & \text{all other } p \end{cases} \qquad \text{(Equation 2)}$$

A further illustrative example of decoding a posit bit string is provided below in connection with the posit bit string 0000110111011101 shown in Table 3.

Table 3

Sign    Regime    Exponent    Mantissa
0       0001      101         11011101

In Table 3, the posit bit string 0000110111011101 is broken up into its constituent sets of bits (e.g., the sign bit 333, the regime bits 335, the exponent bits 337, and the mantissa bits 339). Since es = 3 in the posit bit string shown in Table 3 (e.g., because there are three exponent bits), useed = 256. Because the sign bit 333 is zero, the value of the numerical expression corresponding to the posit bit string shown in Table 3 is positive. The regime bits 335 have a run of three consecutive zeros, corresponding to a value of -3 (as described above in connection with Table 1). As a result, the scale factor contributed by the regime bits 335 is $256^{-3}$ (e.g., $useed^{k}$). The exponent bits 337 represent five (5) as an unsigned integer and therefore contribute an additional scale factor of $2^{e} = 2^{5} = 32$. Lastly, the mantissa bits 339, given in Table 3 as 11011101, represent two hundred twenty-one (221) as an unsigned integer, so the mantissa bits 339, given above as f, are $f = 1 + \frac{221}{256}$. Using these values and Equation 2, the numerical value corresponding to the posit bit string given in Table 3 is $+256^{-3} \times 2^{5} \times \left(1 + \frac{221}{256}\right) = \frac{477}{134217728} \approx 3.55393 \times 10^{-6}$.
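The decoding steps walked through for Table 3 can be sketched in code. The Python below is a minimal illustration rather than the disclosed circuitry: the function name `decode_posit` is hypothetical, the bit string is handled as text, and two conventions are assumed from the usual posit definition because the text above does not spell them out, namely that a negative bit string is two's-complemented before its fields are read and that exponent bits cut off by the end of the string count as zeros.

```python
from fractions import Fraction


def decode_posit(bits: str, es: int) -> Fraction:
    """Decode a posit bit string into an exact rational value."""
    n = len(bits)
    if set(bits) == {"0"}:
        return Fraction(0)                      # all zeros encodes 0
    if bits[0] == "1" and set(bits[1:]) <= {"0"}:
        raise ValueError("bit string encodes +/- infinity")

    sign = -1 if bits[0] == "1" else 1
    if sign < 0:
        # Assumption: negative posits are read after two's complement.
        bits = format((-int(bits, 2)) % (1 << n), f"0{n}b")

    body = bits[1:]
    lead = body[0]
    m = len(body) - len(body.lstrip(lead))      # run length of identical bits
    k = -m if lead == "0" else m - 1            # regime value per Table 1

    rest = body[m + 1:]                         # skip the terminating regime bit
    exp_bits, frac_bits = rest[:es], rest[es:]
    e = int(exp_bits, 2) if exp_bits else 0
    e <<= es - len(exp_bits)                    # assumption: missing bits are zero

    f = Fraction(1)                             # hidden bit is always one
    if frac_bits:
        f += Fraction(int(frac_bits, 2), 1 << len(frac_bits))

    useed = 2 ** (2 ** es)
    return sign * Fraction(useed) ** k * 2 ** e * f


if __name__ == "__main__":
    value = decode_posit("0000110111011101", es=3)
    print(value, float(value))
```

Calling the function on the Table 3 bit string with es = 3 gives the exact fraction, which can be compared with the decimal approximation stated in the worked example.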

FIG. 5 is a functional block diagram in the form of an apparatus 500 including control circuitry 520 in accordance with a number of embodiments of the present invention. The control circuitry 520 can include logic circuitry 522 and a memory resource 524, which can be analogous to the logic circuitry 122 and the memory resource 124 illustrated in FIG. 1, herein. The logic circuitry 522 and/or the memory resource 524 can separately be considered an "apparatus."

The control circuitry 520 can be configured to receive a command (e.g., an initiation command) from a host (e.g., the host 102/202 illustrated in FIGS. 1 and 2, herein) and/or a controller (e.g., the controller 210 illustrated in FIG. 2, herein) to initiate performance of one or more operations (e.g., recursive operations, etc.) on data stored in the memory resource 524. Once the initiation command has been received by the control circuitry 520, the control circuitry 520 can perform the operations described above in the absence of intervening commands from the host and/or the controller. For example, the control circuitry 520 can include sufficient processing resources and/or instructions to perform operations on the bit strings stored in the memory resource 524 without receiving additional commands from circuitry external to the control circuitry 520.

The logic circuitry 522 can be an arithmetic logic unit (ALU), a state machine, a sequencer, a controller, an instruction set architecture, or another type of control circuitry. As described above, an ALU can include circuitry to perform operations such as those described above (e.g., recursive operations using bit strings, etc.) using integer binary numbers, such as bit strings in the posit format. An instruction set architecture (ISA) can include a reduced instruction set computing (RISC) device. In embodiments in which the logic circuitry 522 includes a RISC device, the RISC device can include a processing resource or processing device that can employ an instruction set architecture (ISA) such as a RISC-V ISA; however, embodiments are not limited to RISC-V ISAs, and other processing devices and/or ISAs can be used.

In some embodiments, the logic circuitry 522 can be configured to execute instructions (e.g., instructions stored in the INSTR 525 portion of the memory resource 524) to perform the operations herein. For example, the logic circuitry 522 is provisioned with sufficient processing resources to cause performance of such operations on the data (e.g., on the bit strings) received by the control circuitry 520.

Once the operation(s) are performed by the logic circuitry 522, the resultant bit strings can be stored in the memory resource 524 and/or in a memory array (e.g., the memory array 230 illustrated in FIG. 2, herein). The stored resultant bit strings can be addressed such that they are accessible for performance of the operations. For example, the bit strings can be stored in the memory resource 524 and/or the memory array at particular physical addresses (which may have corresponding logical addresses) such that the bit strings can be accessed when performing the operations. In some embodiments, the bit strings can be transferred to peripheral sense amplifiers (e.g., the sense amplifiers 111 and/or the peripheral sense amplifiers 211 illustrated in FIGS. 1 and 2, respectively).

The memory resource 524 can, in some embodiments, be a memory resource such as random-access memory (e.g., RAM, SRAM, etc.). Embodiments are not so limited, however, and the memory resource 524 can include various registers, caches, buffers, and/or memory arrays (e.g., 1T1C, 2T2C, 3T, etc. DRAM arrays). The memory resource 524 can be configured to receive bit string(s) from, for example, a host such as the host 202 illustrated in FIGS. 2A-2C and/or a memory array such as the memory array 230 illustrated in FIGS. 2A and 2B. In some embodiments, the memory resource 524 can have a size of approximately 256 kilobytes (KB); however, embodiments are not limited to this particular size, and the memory resource 524 can have a size greater than or less than 256 KB.

The memory resource 524 can be partitioned into one or more addressable memory regions. As shown in FIG. 5, the memory resource 524 can be partitioned into addressable memory regions so that various types of data can be stored therein. For example, one or more memory regions can store instructions ("INSTR") 525 used by the memory resource 524, one or more memory regions can store bit strings 526-1, ..., 526-N (e.g., data such as bit strings retrieved from the host and/or the memory array), and/or one or more memory regions can serve as a local memory ("LOCAL MEM") 528 portion of the memory resource 524. Although twenty distinct memory regions are shown in FIG. 5, it will be appreciated that the memory resource 524 can be partitioned into any number of distinct memory regions.

As discussed above, the bit string(s) can be retrieved from the host and/or the memory array in response to messages and/or commands generated by the host, a controller (e.g., the controller 210 illustrated in FIG. 2, herein), or the logic circuitry 522. In some embodiments, the commands and/or messages can be processed by the logic circuitry 522. Once the bit string(s) are received by the control circuitry 520 and stored in the memory resource 524, they can be processed by the logic circuitry 522. Processing the bit string(s) by the logic circuitry 522 can include performing recursive operations, such as multiply-accumulate operations, using the bit strings as operands.

In a non-limiting neural network training application, the control circuitry 520 can convert a 16-bit posit with es = 0 into an 8-bit posit with es = 0 for use in the neural network training application. In some approaches, a half-precision 16-bit floating-point bit string can be used for neural network training; however, in contrast to such approaches, an 8-bit posit bit string with es = 0 can provide neural network training results two to four times faster than the half-precision 16-bit floating-point bit string.

For example, if the control circuitry 520 receives a 16-bit posit bit string with es = 0 for use in a neural network training application, the control circuitry 520 can selectively remove bits from one or more bit sub-sets of the 16-bit posit bit string to change its precision to that of an 8-bit posit bit string with es = 0. Embodiments are not so limited, and it will be appreciated that the control circuitry 520 can change the precision of the bit string to produce an 8-bit posit bit string with es = 1 (or some other value). In addition, the control circuitry 520 can change the precision of the 16-bit posit bit string to produce a 32-bit posit bit string (or a bit string of some other width).
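The narrowing step described above can be sketched in Python. This is only an illustration under stated assumptions, not the behavior of the control circuitry 520: the function name `posit16_to_posit8_es0` is hypothetical, both formats are assumed to use es = 0 (so the 8-bit fields are a prefix of the 16-bit fields), and the round-to-nearest-even and saturation policies are assumptions that the text does not specify.

```python
def posit16_to_posit8_es0(p16: int) -> int:
    """Narrow a 16-bit posit (es = 0) to an 8-bit posit (es = 0).

    Hypothetical helper: drops the low-order 8 bits of the encoding with
    round-to-nearest-even, and never rounds a nonzero value to zero or NaR.
    """
    p16 &= 0xFFFF
    if p16 == 0x0000:
        return 0x00          # zero stays zero
    if p16 == 0x8000:
        return 0x80          # NaR (not a real) stays NaR

    negative = bool(p16 & 0x8000)
    # Negative posits are stored in two's complement; work on the magnitude.
    magnitude = (-p16) & 0xFFFF if negative else p16

    kept = magnitude >> 8        # bits that survive the narrowing
    dropped = magnitude & 0xFF   # bits that are removed

    # Round to nearest, ties to even (an assumed policy).
    if dropped > 0x80 or (dropped == 0x80 and (kept & 1)):
        kept += 1

    # Saturate to the smallest/largest representable magnitude.
    kept = max(0x01, min(kept, 0x7F))

    return (-kept) & 0xFF if negative else kept


for pattern in (0x4000, 0xC000, 0x40FF, 0x0001):
    print(f"{pattern:#06x} -> {posit16_to_posit8_es0(pattern):#04x}")
```

Widening in the other direction (for example, 16-bit to 32-bit) would append zero bits instead of removing bits, again assuming the same es value in both formats.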

During performance of the operations associated with the example above, the control circuitry 520 can be configured to store the result of the operation at each iteration in circuitry in the periphery of the memory device or memory array. For example, the control circuitry 520 can be configured to store the result of the operation at each iteration in a plurality of periphery sense amplifiers, such as the periphery sense amplifiers 211 shown in FIG. 2A. In the context of a neural network training application, these intermediate results can be used in subsequent iterations of the recursive operation to improve the accuracy of the final result of the operation, as described herein.

A common function used in training neural networks is a sigmoid function f(x) (e.g., a function that asymptotically approaches 0 as x → −∞ and asymptotically approaches 1 as x → ∞). An example of a sigmoid function that may be used in neural network training applications is $f(x) = \frac{1}{1 + e^{-x}}$, which can require more than one hundred clock cycles to compute using half-precision 16-bit floating-point bit strings. However, using an 8-bit posit with es = 0, the same function can be evaluated by flipping the first bit of the posit representing x and shifting two bits to the right, operations that may take at least ten times fewer clock signals than evaluating the same function using a half-precision 16-bit floating-point bit string.
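The bit-level shortcut described above can be tried out with the following Python sketch. It is an illustration only: `decode_posit8_es0` and `fast_sigmoid_posit8_es0` are hypothetical helper names, the decoder assumes the usual posit layout (sign bit, regime run, terminating bit, fraction) with es = 0, and the shortcut yields an approximation of the sigmoid rather than its exact value.

```python
import math


def decode_posit8_es0(bits: int) -> float:
    """Decode an 8-bit posit with es = 0 into a Python float."""
    bits &= 0xFF
    if bits == 0x00:
        return 0.0
    if bits == 0x80:
        return float("nan")  # NaR (not a real)
    negative = bool(bits & 0x80)
    if negative:
        bits = (-bits) & 0xFF  # negative posits are two's complement
    body = format(bits & 0x7F, "07b")            # the 7 bits after the sign
    run = len(body) - len(body.lstrip(body[0]))  # length of the regime run
    k = run - 1 if body[0] == "1" else -run      # regime value
    frac_bits = body[run + 1:]                   # skip the terminating bit
    fraction = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0
    value = 2.0 ** k * (1.0 + fraction)
    return -value if negative else value


def fast_sigmoid_posit8_es0(bits: int) -> int:
    """Flip the first (sign) bit, then shift right by two bit positions."""
    return ((bits ^ 0x80) & 0xFF) >> 2


# Compare the shortcut with the exact sigmoid for a few posit inputs.
for pattern in (0xC0, 0x00, 0x40, 0x60):
    x = decode_posit8_es0(pattern)
    approx = decode_posit8_es0(fast_sigmoid_posit8_es0(pattern))
    exact = 1.0 / (1.0 + math.exp(-x))
    print(f"x = {x:+.3f}  shortcut = {approx:.4f}  exact = {exact:.4f}")
```

For x = 0 (pattern 0x00) the shortcut produces the pattern 0x20, which decodes to 0.5, matching the exact value. For other inputs the shortcut is close to the exact value but not identical.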

Further, by preserving the results of each iteration of the sigmoid function evaluation without rounding or truncating them, the accuracy of the final result can be improved in comparison to approaches that round or truncate intermediate results of the operation. For example, by storing the intermediate results of a recursive operation used to evaluate the sigmoid function in periphery sense amplifiers, such as the periphery sense amplifiers 211 shown in FIG. 2A, the accuracy of the final result can be improved in comparison to approaches that round or truncate intermediate results of the operation.

In this example, by operating the control circuitry 520 to vary the precision of the posit bit string to yield a more desirable level of precision, processing time, resource consumption, and/or storage space can be reduced in comparison to approaches that do not include control circuitry 520 configured to perform such conversions and/or subsequent operations. This reduction in processing time, resource consumption, and/or storage space can improve the function of the computing device in which the control circuitry 520 operates by reducing the number of clock signals used to perform such operations, which can reduce the amount of power consumed by the computing device and/or the amount of time needed to perform such operations, as well as by freeing up processing and/or memory resources for other tasks and functions.

FIG. 6 is a block diagram 640 illustrating an example of bit string accumulation in the periphery of a memory array in accordance with a number of embodiments of the present invention. Several functions that are available to, or performed by, the periphery sense amplifiers (e.g., the periphery sense amplifiers 211 shown in FIG. 2A) are described in connection with FIG. 6 to further illustrate aspects of the present invention. For example, a multiply-accumulate operation using the control circuitry 620 is described in connection with FIG. 6. As shown in FIG. 6, operations to accumulate bit strings in the periphery of the memory array can be performed using the control circuitry 620, which can be analogous to the control circuitry 120/220 shown in FIGS. 1 and 2A herein.

As shown in FIG. 6, a first bit string β can be received by the control circuitry 620 at block 641. In addition, as shown at block 642, a second bit string can be received by the control circuitry 620. For example, the first bit string β and the second bit string can be loaded into a memory resource (e.g., the memory resource 124 shown in FIG. 1) of the control circuitry 620. In some embodiments, the first bit string β and/or the second bit string can be formatted according to a unum or posit format.

At block 644, a multiplication operation can be performed using the first bit string β and the second bit string as operands. After performing the multiplication operation at block 644, the control circuitry 620 can be configured to convert the result of the multiplication operation into a format that can be stored in the periphery sense amplifiers 611 and/or the memory array 630. In some embodiments, performing the multiplication operation can cause bits of various bit sub-sets of the resultant bit string to be shifted. For example, bits of the mantissa bit sub-set and/or the regime bit sub-set of the resultant bit string can be shifted. To address this potential issue, the control circuitry 620 can convert the result of the multiplication operation into a format that can be stored in the periphery sense amplifiers 611 and/or the memory array 630 without introducing errors that could arise from the shifted bits.

At block 649, the result of the multiplication operation can be accumulated, for example, in a quire accumulator. In some embodiments, the result stored in the quire accumulator can be multiplexed with a bit string stored in the memory array 630, as shown at block 646. Embodiments are not so limited, however, and in some embodiments the result of the multiplication stored in the quire accumulator at block 649 can be multiplexed with an intermediate result of the recursive operation, which can be stored in the periphery sense amplifiers 611.

In some embodiments, the control circuitry 620 can be configured to perform an operation at block 646 to select either the result of the multiplication operation or a previous resultant bit string stored in the memory array 630. Whether the result of the multiplication operation or the previous resultant bit string stored in the memory array 630 is selected at block 646 can depend on the application. For example, depending on the type of recursive operation being performed, the bit string stored in the memory array 630 may be the result of a previous operation, so it may be beneficial to use the bit string stored in the memory array 630 in a subsequent operation performed by the control circuitry 620.

Once a bit string (e.g., the bit string resulting from performance of the multiplication operation or the bit string stored in the memory array 630) is selected, the selected bit string can be accumulated at block 648. For example, as part of the operation, the result of the multiplication operation or the bit string stored in the memory array 630 can be added to, or subtracted from, a bit string stored in the periphery sense amplifiers 611 to accumulate a bit string that is the result of the recursive operation.

As shown in FIG. 6, this result (e.g., the bit string resulting from performance of the operation to accumulate the selected bit string(s)) can be transferred to the periphery sense amplifiers 611. As described above, storing such results (e.g., the result of the recursive operation at each iteration) in the periphery sense amplifiers 611 can preserve the accuracy of the resultant bit string in comparison to approaches that truncate the bit string after one or more iterations of the recursive operation.
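The accuracy argument above can be made concrete with a small numerical sketch. The Python below is an illustration under stated assumptions, not a model of the hardware: exact rational arithmetic (`fractions.Fraction`) stands in for a wide accumulator, a fixed step of 1/16 stands in for a narrow storage format, and the function names are hypothetical.

```python
from fractions import Fraction

STEP = Fraction(1, 16)  # assumed resolution of the narrow storage format


def quantize(x: Fraction) -> Fraction:
    """Round x to the nearest multiple of STEP."""
    return round(x / STEP) * STEP


def mac_round_each_iteration(pairs):
    """Multiply-accumulate that rounds the running sum after every iteration."""
    acc = Fraction(0)
    for a, b in pairs:
        acc = quantize(acc + a * b)
    return acc


def mac_round_once(pairs):
    """Multiply-accumulate that keeps exact intermediates and rounds at the end."""
    acc = Fraction(0)
    for a, b in pairs:
        acc += a * b
    return quantize(acc)


# Ten products of 3/100 each; the exact sum is 3/10.
pairs = [(Fraction(3, 10), Fraction(1, 10))] * 10
print("rounded every iteration:", mac_round_each_iteration(pairs))
print("rounded once at the end:", mac_round_once(pairs))
```

In this toy case every individual product is smaller than half a step, so rounding after each iteration discards all of them, whereas rounding once at the end yields the nearest representable value to the exact sum.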

Once the accumulated result of the recursive operation has been transferred to the periphery sense amplifiers 611, the accumulated result can be copied to the memory array 630. In some embodiments, the copied accumulated bit string transferred from the periphery sense amplifiers 611 to the memory array 630 can be stored in the memory array 630 for subsequent use. The accumulated bit string stored in the memory array 630 can, in some embodiments, be stored in a data structure of the memory array 630, such as the data structure 209 shown in FIG. 2A, or it can be stored in another location within the memory array 630.

The accumulated bit string stored in the data structure 609 can, in some embodiments, represent the final result of a recursive operation performed using the control circuitry 620. For example, once the final result of the recursive operation is stored in the periphery sense amplifiers 611, the final result can be copied to the data structure 609 of the memory array 630 and stored for subsequent use. In some embodiments, the final result of the operation stored in the data structure 609 can be multiplexed, for example at block 646, with the result of a subsequent multiplication operation performed at block 644.

At block 648, the result of the multiplication operation performed at block 644 can be added to, or subtracted from, the current bit string stored in the periphery sense amplifiers 611. For example, while a recursive operation such as a multiply-accumulate operation is performed using the control circuitry 620, the result of each iteration of the recursive operation can be accumulated in the periphery sense amplifiers 611 at block 648. In some embodiments, accumulating the result of each iteration of the recursive operation can include overwriting a previously stored result of a prior iteration of the recursive operation in the periphery sense amplifiers 611, adding the result of the current iteration to the result of the prior iteration stored in the periphery sense amplifiers 611, or subtracting the result of the current iteration from the result of the prior iteration stored in the periphery sense amplifiers 611.
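The three accumulation behaviors described for block 648 (overwrite, add, subtract) can be sketched as a small Python class. This is a software analogy only: the class name `PeripheryAccumulator`, its method names, and the use of exact fractions in place of bit strings are all assumptions made for illustration.

```python
from fractions import Fraction


class PeripheryAccumulator:
    """Toy stand-in for the value held in the periphery sense amplifiers 611."""

    def __init__(self):
        self.value = Fraction(0)

    def clear(self):
        """Analogous to the PSA CLEAR input shown in FIG. 6."""
        self.value = Fraction(0)

    def accumulate(self, operand, mode="add"):
        """Combine an iteration result with the stored value (block 648)."""
        operand = Fraction(operand)
        if mode == "add":
            self.value += operand
        elif mode == "subtract":
            self.value -= operand
        elif mode == "overwrite":
            self.value = operand
        else:
            raise ValueError(f"unknown accumulation mode: {mode!r}")
        return self.value


def multiply_accumulate(acc, pairs):
    """Multiply each operand pair (block 644) and accumulate the product."""
    for a, b in pairs:
        acc.accumulate(Fraction(a) * Fraction(b), mode="add")
    return acc.value


acc = PeripheryAccumulator()
print(multiply_accumulate(acc, [(2, 3), (4, 5)]))  # 2*3 + 4*5
print(acc.accumulate(6, mode="subtract"))
acc.clear()
```

Keeping the running value in one place and choosing the mode per iteration mirrors the description above; a real implementation would operate on formatted bit strings rather than Python numbers.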

Once the recursive operation is complete, as shown at block 647, the final result of the recursive operation stored in the periphery sense amplifiers 611 can be transferred to the memory array 630, or it can be converted to a format different from the format in which it is stored in the periphery sense amplifiers 611. For example, if the final result of the recursive operation is stored in a posit format, the final result can be converted to a floating-point format, or vice versa. Similarly, the final result of the recursive operation can be converted between other formats. For example, if the bit string stored in the periphery sense amplifiers 611 is not stored in a posit format, the final result of the recursive operation can be converted to a posit format at block 647 after it is transferred out of the periphery sense amplifiers 611.
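As an illustration of one direction of such a conversion, the following Python sketch decodes a posit bit pattern of configurable width and es value into a floating-point number. It is a sketch under assumptions, not the conversion performed at block 647: the function name `posit_to_float` is hypothetical, and the decoder assumes the conventional posit field layout (sign, regime, exponent, fraction) with negative values stored in two's complement.

```python
def posit_to_float(bits: int, nbits: int = 16, es: int = 1) -> float:
    """Decode an (nbits, es) posit bit pattern into a Python float."""
    mask = (1 << nbits) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (nbits - 1):
        return float("nan")  # NaR (not a real)

    negative = bool(bits >> (nbits - 1))
    if negative:
        bits = (-bits) & mask  # two's complement magnitude

    body = format(bits & (mask >> 1), f"0{nbits - 1}b")  # bits after the sign
    first = body[0]
    run = len(body) - len(body.lstrip(first))            # regime run length
    k = run - 1 if first == "1" else -run                # regime value

    rest = body[run + 1:]                  # skip the regime terminator bit
    exp_bits = rest[:es].ljust(es, "0")    # truncated exponent bits read as 0
    frac_bits = rest[es:]

    exponent = int(exp_bits, 2) if es else 0
    fraction = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0

    value = 2.0 ** (k * (1 << es) + exponent) * (1.0 + fraction)
    return -value if negative else value


# The same 16-bit pattern represents different values under different es.
for es in (1, 2, 3):
    print(f"es = {es}: 0x5000 -> {posit_to_float(0x5000, 16, es)}")
```

Because the same bit pattern decodes to different values for different es settings, changing a bit string from one posit configuration to another requires re-encoding the value rather than copying the bits.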

In some embodiments, the final result of the recursive operation stored in the periphery sense amplifiers 611 can be rounded so that the final resultant bit string has a particular bit width. The final result of the recursive operation can be rounded by removing at least one bit from the mantissa bit sub-set, the exponent bit sub-set, or both, of the resultant bit string. For example, once the recursive operation is complete, the control circuitry 620 can round the final result of the operation to a bit width that can be transferred to circuitry external to the periphery sense amplifiers 611. As described above, the bit width of the rounded final result can be predetermined, or it can be set in response to a command, such as a user command.
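Rounding a wide result to a requested bit width by removing low-order bits can be sketched as follows. The Python below is a generic illustration, not the rounding performed by the control circuitry 620: it treats the result as an unsigned integer field, and the function name `round_to_width`, the round-to-nearest-even policy, and the saturation on overflow are assumptions.

```python
def round_to_width(value: int, src_bits: int, dst_bits: int) -> int:
    """Round an unsigned src_bits-wide field down to dst_bits bits.

    Low-order bits are removed with round-to-nearest, ties to even, and the
    result saturates at the largest dst_bits-wide value.
    """
    drop = src_bits - dst_bits
    if drop <= 0:
        return value  # nothing to remove

    kept = value >> drop                  # bits that remain
    removed = value & ((1 << drop) - 1)   # bits that are removed
    half = 1 << (drop - 1)

    if removed > half or (removed == half and (kept & 1)):
        kept += 1

    return min(kept, (1 << dst_bits) - 1)


# A requested width (for example, from a user command) selects dst_bits.
for requested_width in (8, 16):
    rounded = round_to_width(0xABCDEF12, 32, requested_width)
    print(f"{requested_width}-bit result: {rounded:#x}")
```

The requested width could equally come from a predetermined setting; only the `dst_bits` argument changes.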

In some embodiments, the periphery sense amplifiers 611 can be "cleared," as indicated by the arrow pointing from PSA CLEAR to the periphery sense amplifiers 611. For example, data stored in the periphery sense amplifiers 611 can be erased in response to a command to delete the information stored therein. This may be desirable at the conclusion of a recursive operation that accumulated its iteration results in the periphery sense amplifiers 611, in order to prepare for a subsequent recursive operation that may likewise accumulate its iteration results in the periphery sense amplifiers 611.

At block 643, bit strings to be used in performing operations, such as recursive arithmetic and/or recursive logical operations, can be transferred to the memory array 630, which can be analogous to the memory array 130/230 shown in FIGS. 1 and 2A herein. In some embodiments, the bit strings can be transferred from control circuitry external to the memory device in which the memory array 630 is deployed. For example, the bit strings can be transferred to the memory array 630 from a host (e.g., the host 102/202 shown in FIGS. 1, 2A, and 2B herein). Once the bit strings are stored by the memory array 630, they can be transferred to the control circuitry 620, which can perform, or cause performance of, a recursive operation using the bit strings as operands.

Embodiments are not so limited, however, and as described above in connection with FIG. 2A, the memory array 630 can be configured to store bit strings that represent the results of arithmetic and/or logical operations performed before the resultant bit string(s) are stored in the memory array 630. For example, the memory array 630 can store the resultant bit strings in a data structure, such as the data structure 209 shown in FIG. 2A, to increase the speed of performing operations that use the resultant bit strings.

In some embodiments, bit strings can be transferred between the memory array 630 and the periphery sense amplifiers 611, as indicated by the arrow connecting the periphery sense amplifiers 611 block and the memory array 630 block. Further, in some embodiments, bit strings stored in the memory array 630 can be transferred to an external memory, as shown at block 645. The external memory can be memory that is external to the memory device in which the memory array 630 is deployed. For example, the external memory can be an external storage volume such as an HDD, a flash memory device, an SSD, or other external memory.

In a non-limiting example, the posit bit string β (at block 641) and the second posit bit string (at block 642) are multiplied together at block 644 using the control circuitry 620. The result of this multiplication operation, for example, a posit bit string λ, can be stored in the periphery sense amplifiers 611, and/or a copy of the resultant posit bit string λ can be stored in the memory array 630. In this example, the posit bit string λ can be selected for accumulation at block 646. In some embodiments, prior to storing the result of the multiplication operation, the result can be converted into a format that can be stored in the periphery sense amplifiers 611 and/or the memory array 630, as described above. For example, the result can be converted to a binary format or a floating-point format, or the format of the bit string can be changed (e.g., from a (16,2) posit to a (16,3) posit, etc.).

At block 648, the posit bit string λ can be added to, or subtracted from, a previous bit string stored in the periphery sense amplifiers 611 as part of performing the recursive operation. The result of the addition or subtraction operation (e.g., the accumulation operation) performed at block 648 can be transferred to, and stored in, the periphery sense amplifiers 611. In some embodiments, the result of the addition or subtraction operation performed at block 648 can be stored in the periphery sense amplifiers 611 so as to overwrite the previous bit string (e.g., the posit bit string λ).

These operations can be repeated until the recursive operation is complete, at which point the final result stored in the periphery sense amplifiers 611 can be rounded as described above. In some embodiments, after the final result stored in the periphery sense amplifiers 611 is rounded, it can be converted to a unum or posit format (or another format, such as a floating-point format) and transferred to the memory array 630 or to external circuitry, such as a host.

In another non-limiting example, a bit string stored in the memory array 630 can be selected at block 646 for accumulation. As described above, the bit string stored in the memory array 630 can be a copy of a bit string stored in the periphery sense amplifiers 611, although embodiments are not so limited. In this example, the bit string stored in the memory array 630 can be accumulated, for example at block 648, with the bit string stored in the periphery sense amplifiers 611. The bit string resulting from the accumulation at block 648 can be stored back in the periphery sense amplifiers 611 and/or the memory array 630. In some embodiments, the result of the accumulation operation performed at block 648 can be stored in the periphery sense amplifiers 611 so as to overwrite the previous bit string (e.g., the posit bit string λ).

This operation can be repeated until the recursive operation is complete, at which point the final result stored in the periphery sense amplifiers 611 can be rounded as described above. In some embodiments, after the final result stored in the periphery sense amplifiers 611 is rounded, it can be converted to a unum or posit format (or another format, such as a floating-point format) and transferred to the memory array 630 or to external circuitry, such as a host.

FIG. 7 is a flow diagram illustrating an example method 750 for bit string accumulation in the periphery of a memory array in accordance with a number of embodiments of the present invention. At block 752, the method 750 can include performing a first operation using a first bit string and a second bit string. The first operation can be an arithmetic operation, a logical operation, a bit-wise operation, or a vector operation, among others. In some embodiments, the first bit string and the second bit string can be formatted according to a unum format (e.g., a Type III unum or posit format).

At block 754, the method 750 can include storing the result of the first operation in periphery circuitry of a memory array. The periphery circuitry can include periphery sense amplifiers, such as the periphery sense amplifiers 211 shown in FIG. 2A, and the memory array can be analogous to the memory array 130/230 shown in FIGS. 1, 2A, and 2B herein. Embodiments are not limited to storing the result of the first operation in periphery sense amplifiers, however, and in some embodiments the method 750 can include storing the result of the first operation in an extended row address component that is coupled to, but distinct from, the memory array.

At block 756, the method 750 can include performing a second operation using the result of the first operation and the second bit string. The second operation can be an arithmetic operation, a logical operation, a bit-wise operation, or a vector operation, among others. In some embodiments, the first operation and the second operation can be performed as part of a recursive operation. As a result, in some embodiments, the result of one of the first operation and the second operation can have a greater bit width than the result of the other.

In embodiments in which the first operation and the second operation are performed as part of a recursive operation, the method 750 can further include determining that the result of the second operation is the final resultant bit string of the recursive operation and/or, based on that determination, performing an operation to round the final resultant bit string stored in the extended row address component so that it has a particular bit width. For example, the method 750 can include rounding the final resultant bit string stored in the extended row address component by removing at least one bit from the mantissa bit sub-set or the exponent bit sub-set of the final resultant bit string.

In some embodiments, the method 750 can include receiving a user command to remove at least one bit and, in response to the user command, rounding the final resultant bit string so that it has a bit width specified by the user command. For example, the method 750 can include receiving a user command that specifies a requested bit width for the final resultant bit string of the recursive operation, and rounding the final resultant bit string to have the requested bit width. As described above, non-limiting examples of such bit widths include 8 bits, 16 bits, 32 bits, 64 bits, etc., and the bit width can be based on the application in which the final resultant bit string is used.

In some embodiments, as mentioned above, the first bit string and the second bit string can be formatted according to a Type III universal number (unum) format or a posit format. In such embodiments, the method 750 can include converting the result of the first operation from the Type III unum format or posit format to a different format before storing the result of the first operation in the extended row address component, and/or converting the result of the second operation from the Type III unum format or posit format to a different format before storing the result of the second operation in the extended row address component.

In some embodiments in which the first operation and the second operation are performed as part of a recursive operation, the method 750 can further include determining that the result of the second operation is the final result bit string of the recursive operation, and performing an operation to convert the final result bit string to the Type III universal number format or the posit format. For example, while the first operation and/or the second operation is performed, and/or while the result of the first operation and/or the result of the second operation is stored in the peripheral circuitry, the first bit string, the second bit string, and/or the bit string representing the result of the first operation can be converted to a format different than a unum format (e.g., the Type III unum or posit format). Accordingly, in some embodiments, the final result bit string can be converted from the format in which it is stored in the peripheral circuitry to the unum format (e.g., by control circuitry such as the control circuitry 120/220 illustrated in FIGS. 1 and 2A).
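
As an illustration of converting a posit bit string to a different format, the sketch below decodes a posit into an exact rational number. The text does not name the other format, so the use of Python's `Fraction` as the target is an assumption, as are the function name `posit_to_fraction` and the `es` parameter (the number of exponent bits). The decoding follows the usual posit layout of sign, regime, exponent, and fraction fields.

```python
from fractions import Fraction
from typing import Optional


def posit_to_fraction(bits: str, es: int) -> Optional[Fraction]:
    """Decode a posit bit string into an exact Fraction (None for NaR).

    Hypothetical helper; `es` is the number of exponent bits.
    """
    n = len(bits)
    pattern = int(bits, 2)
    if pattern == 0:
        return Fraction(0)
    if pattern == 1 << (n - 1):
        return None  # NaR ("not a real") has no numeric value

    negative = bits[0] == "1"
    if negative:
        # Negative posits are stored as the two's complement of the magnitude.
        pattern = (-pattern) & ((1 << n) - 1)
    body = format(pattern, f"0{n}b")[1:]  # everything after the sign bit

    # Regime: a run of identical bits, ended by the opposite bit (if present).
    first = body[0]
    run = len(body) - len(body.lstrip(first))
    k = run - 1 if first == "1" else -run
    rest = body[run + 1:]  # skip the regime terminator bit

    # Exponent bits cut off by a long regime count as zeros.
    exponent = int(rest[:es].ljust(es, "0"), 2) if es else 0
    frac_bits = rest[es:]
    fraction = Fraction(int(frac_bits, 2), 1 << len(frac_bits)) if frac_bits else Fraction(0)

    scale = k * (1 << es) + exponent
    value = (1 + fraction) * Fraction(2) ** scale
    return -value if negative else value
```

With `es = 3`, the 16-bit pattern `0000110111011101` decodes to `Fraction(477, 134217728)`, which is about 3.55e-6. A conversion in the opposite direction would need a rounding rule, since most rationals have no exact posit encoding.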

At block 758, the method 750 can include storing the result of the second operation in the peripheral circuitry using the universal number format. For example, the method 750 can include storing the result of the second operation in a plurality of sense amplifiers that are coupled to, but distinct from, the memory array (e.g., the sense amplifiers 111 and/or the peripheral sense amplifiers 211 illustrated in FIGS. 1 and 2A, respectively) and/or in XRA components located at the periphery of the memory array. In some embodiments, the result of the second operation can be stored in the peripheral circuitry such that it overwrites the result of the first operation performed at block 752.
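
The overwrite-on-store behavior can be modeled in software with a toy sketch. Everything here is hypothetical: the class `PeripheralAccumulator` merely stands in for the sense amplifiers or XRA components, `run_recursive_operation` is an illustrative driver, and exact `Fraction` arithmetic is assumed so that intermediate results are kept without rounding.

```python
from fractions import Fraction
from typing import Callable, Optional, Sequence


class PeripheralAccumulator:
    """Toy stand-in for peripheral storage (hypothetical, not the hardware)."""

    def __init__(self) -> None:
        self.value: Optional[Fraction] = None
        self.writes = 0

    def store(self, result: Fraction) -> None:
        # Each store overwrites the result of the previous iteration.
        self.value = result
        self.writes += 1


def run_recursive_operation(
    operands: Sequence[Fraction],
    op: Callable[[Fraction, Fraction], Fraction],
    accumulator: PeripheralAccumulator,
) -> Fraction:
    """Apply `op` iteratively, keeping only the latest result in the accumulator."""
    if len(operands) < 2:
        raise ValueError("need at least two operands")
    first, second, *rest = operands
    # First operation: combine the first two bit strings and store the result.
    accumulator.store(op(first, second))
    # Later operations: combine the stored result with the next operand.
    for operand in rest:
        accumulator.store(op(accumulator.value, operand))
    return accumulator.value


acc = PeripheralAccumulator()
total = run_recursive_operation(
    [Fraction(1, 3), Fraction(1, 6), Fraction(1, 2)],
    lambda a, b: a + b,
    acc,
)
```

In this run the accumulator is written twice and ends up holding exactly 1, so only the most recent result occupies the storage at any time.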

Although specific embodiments have been illustrated and described herein, those of ordinary skill in the art will appreciate that an arrangement calculated to achieve the same results can be substituted for the specific embodiments shown. This disclosure is intended to cover adaptations or variations of one or more embodiments of the present disclosure. It is to be understood that the above description has been made in an illustrative fashion, and not a restrictive one. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of ordinary skill in the art upon reviewing the above description. The scope of the one or more embodiments of the present disclosure includes other applications in which the above structures and processes are used. Therefore, the scope of one or more embodiments of the present disclosure should be determined with reference to the appended claims, along with the full range of equivalents to which such claims are entitled.

In the foregoing Detailed Description, some features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the disclosed embodiments of the present disclosure have to use more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.

Claims

An apparatus, comprising:
a plurality of sense amplifiers located in a peripheral region of a memory array, the memory array including a quantity of rows or columns equal to a quantity of sense amplifiers of the plurality of sense amplifiers; and
control circuitry comprising a processing device and a memory resource coupled to the plurality of sense amplifiers and the memory array,
wherein the processing device is configured to:
write one or more bit strings from the memory array to the memory resource;
perform a first iteration of a recursive operation using the one or more bit strings written to the memory resource, wherein the one or more bit strings are formatted according to a Type III universal number format or a posit format;
accumulate, in the plurality of sense amplifiers, a first result bit string representing a result of the first iteration of the recursive operation;
determine whether to perform a second iteration of the recursive operation using the one or more bit strings written to the memory resource or one or more bit strings stored in the memory array;
perform the second iteration of the recursive operation using the one or more bit strings written to the memory resource or the one or more bit strings stored in the memory array; and
accumulate, in the plurality of sense amplifiers, a second result bit string representing a result of the second iteration of the recursive operation.

The apparatus of claim 1, wherein the processing device is further configured to:
determine that the recursive operation is complete; and
in response to the determination, perform an operation to round at least one of the first result bit string or the second result bit string accumulated in the plurality of sense amplifiers, such that a final result bit string has a particular bit width, by removing at least one bit from a mantissa bit subset, an exponent bit subset, or both, of the at least one of the first result bit string or the second result bit string.

The apparatus of claim 1, wherein the apparatus comprises a memory device that includes the plurality of sense amplifiers, the memory array, and the control circuitry, and wherein the processing device is configured to perform the recursive operation within the memory device without transferring the first result bit string and the second result bit string to circuitry external to the memory device.

The apparatus of claim 1, wherein the processing device is configured to:
access an address space of the memory array in which the first result bit string representing the result of the first iteration of the recursive operation is stored;
access an address space of the memory array in which the second result bit string representing the result of the second iteration of the recursive operation is stored; and
store, in the plurality of sense amplifiers, a bit string representing a result of an operation performed using the first result bit string and the second result bit string.

The apparatus of claim 1, wherein the processing device is configured to accumulate the first result bit string and the second result bit string in the plurality of sense amplifiers in response to receipt of a user-generated command.

The apparatus of claim 1, wherein the processing device is further configured to accumulate, in the plurality of sense amplifiers, a result bit string representing a result of an iteration of the recursive operation by overwriting a result bit string previously stored in the plurality of sense amplifiers.

A method, comprising:
retrieving, by control circuitry external to a memory array, a first bit string and a second bit string for use in performing a recursive operation, the first bit string and the second bit string being formatted in a universal number format;
performing, by the control circuitry, a first operation using the first bit string and the second bit string;
storing an accurate result of the first operation in peripheral circuitry of the memory array;
determining, by the control circuitry, whether to perform a second operation using the result of the first operation and the second bit string or using the result of the first operation and a bit string stored in the memory array;
performing the second operation using the result of the first operation and the second bit string or using the result of the first operation and the bit string stored in the memory array; and
storing an accurate result of the second operation in the peripheral circuitry.

The method of claim 7, wherein the first operation and the second operation are performed as part of the recursive operation, the method further comprising:
determining that the result of the second operation is a final result bit string of the recursive operation; and
in response to the determination, performing an operation to round the final result bit string stored in the peripheral circuitry such that the final result bit string has a particular bit width.

The method of claim 8, further comprising rounding the final result bit string stored in the peripheral circuitry by removing at least one bit from a mantissa bit subset or an exponent bit subset of the final result bit string.

The method of claim 9, further comprising:
receiving a user command to remove the at least one bit; and
in response to the user command, rounding the final result bit string to have a bit width specified by the user command.

The method of claim 8, further comprising sending the rounded final result bit string to the memory array.

The method of claim 7, wherein the first operation and the second operation are performed as part of the recursive operation, the method further comprising:
determining that the result of the second operation is a final result bit string of the recursive operation; and
performing an operation to convert the final result bit string to a Type III universal number format or a posit format.

The method of claim 8, further comprising:
converting the result of the first operation from the universal number format to a different format prior to storing the result of the first operation in the peripheral circuitry; and
converting the result of the second operation from the universal number format to the different format prior to storing the result of the second operation in the peripheral circuitry.

A system, comprising:
a memory device comprising a memory array and a plurality of sense amplifiers, the memory array including a quantity of rows or columns equal to a quantity of sense amplifiers of the plurality of sense amplifiers; and
a processing device coupled to the memory device,
wherein the processing device is configured to:
write one or more bit strings formatted in a universal number format from the memory array to a memory resource that is external to the memory array and coupled to the memory array;
perform a first iteration of a recursive operation using the one or more bit strings;
accumulate, in the plurality of sense amplifiers, a first result bit string representing a result of the first iteration of the recursive operation;
determine whether to perform a second iteration of the recursive operation using the one or more bit strings written to the memory resource or one or more bit strings stored in the memory array;
perform the second iteration of the recursive operation using the one or more bit strings written to the memory resource or the one or more bit strings stored in the memory array; and
accumulate, in the plurality of sense amplifiers, a second result bit string representing a result of the second iteration of the recursive operation.

The system of claim 14, wherein the plurality of sense amplifiers are located in a peripheral region of the memory array.

The system of claim 14, wherein the one or more bit strings, the first result bit string, the second result bit string, or a combination thereof are formatted according to a Type III universal number format or a posit format.

(Deleted)

The system of claim 14, wherein the processing device is further configured to:
determine that the recursive operation is complete; and
truncate a final result bit string stored in the plurality of sense amplifiers such that the final result bit string has a particular bit width.

The system of claim 18, wherein the processing device is further configured to truncate the final result bit string by deleting at least one bit from a mantissa bit subset or an exponent bit subset of the final result bit string.

The system of claim 14, wherein the processing device is further configured to accumulate each successive result bit string in the plurality of sense amplifiers by overwriting a previous result bit string stored in the plurality of sense amplifiers.

(Deleted)