CN112989268B - Memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method - Google Patents


Info

Publication number
CN112989268B
CN112989268B · CN202110176004.5A · CN202110176004A
Authority
CN
China
Prior art keywords
memory
array
data
fully
operand
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110176004.5A
Other languages
Chinese (zh)
Other versions
CN112989268A (en
Inventor
虞致国
马晓杰
顾晓峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University filed Critical Jiangnan University
Priority to CN202110176004.5A priority Critical patent/CN112989268B/en
Publication of CN112989268A publication Critical patent/CN112989268A/en
Application granted granted Critical
Publication of CN112989268B publication Critical patent/CN112989268B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/15Correlation function computation including computation of convolution operations
    • G06F17/153Multidimensional correlation or convolution
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization


Abstract

The invention discloses a memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method, belonging to the fields of computing-in-memory and brain-inspired computing. The method comprises a memory array in which memory computing units are arranged. The array takes the data fed through its Data_In ports as one operand: the vector d obtained by unfolding the input matrix D. The data preprogrammed into the memory computing units serve as the other operand, the matrix W. Under the combined action of the Data_In inputs and the bias applied at the Bias_Voltage port, the vector d completes a matrix multiplication with the matrix W, thereby performing a two-dimensional convolution of the matrix D. Exploiting the structure of two-dimensional convolution, the invention redesigns the array layout and the interconnection of the memory computing units for the fully-unfolded case, greatly reducing the redundancy and sparsity of the overall array and effectively reducing the array area with no loss of computing power.

Description

Memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method
Technical Field
The invention discloses a memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method, and belongs to the fields of computing-in-memory and brain-inspired computing.
Background
Most traditional computer architectures follow the von Neumann model, in which memory and computation are separated. This separation not only incurs substantial energy consumption for data movement, but also creates a mismatch between the memory access rate and the computation rate that limits overall speed. In-memory computing integrates storage and computation, breaking through the speed and power-consumption walls of memory access. Moreover, by exploiting device characteristics, a single device can complete one multiply-accumulate operation; the device array as a whole offers high speed, high parallelism, and a good energy-efficiency ratio, making it well suited to neural-network workloads that require massive numbers of multiply-accumulate operations.
In compute-in-memory designs, the memory array mainly performs convolution. A fully-unfolded memory array can convolve all the input data in one pass and output the complete convolution matrix, but at the cost of large area redundancy: the fraction of devices that actually participate in the computation is very low, so the array is highly sparse. The invention optimizes this sparsity and proposes a fully-unfolded memory array with non-orthogonal wiring, greatly reducing both the sparsity of the participating devices and the overall array area.
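To make the sparsity of the conventional scheme concrete, the sketch below (hypothetical helper name; NumPy used for illustration only, not part of the patent) builds the fully-unfolded weight matrix for a k×k kernel over an m×n input, in which each output column holds only k² active devices, and counts the idle fraction:

```python
import numpy as np

def unfolded_weight_matrix(m, n, k):
    """Conventional fully-unfolded weight matrix for a k x k kernel applied
    to an m x n input (valid convolution, row-major flattening).
    A cell value of 1 marks a device that participates in the computation."""
    out_rows, out_cols = m - k + 1, n - k + 1
    W = np.zeros((m * n, out_rows * out_cols))
    for oi in range(out_rows):
        for oj in range(out_cols):
            col = oi * out_cols + oj
            for p in range(k):
                for q in range(k):
                    # output pixel (oi, oj) uses input element (oi+p, oj+q)
                    W[(oi + p) * n + (oj + q), col] = 1
    return W

W = unfolded_weight_matrix(48, 48, 3)
sparsity = 1.0 - W.sum() / W.size
print(f"active devices: {int(W.sum())} of {W.size} ({sparsity:.1%} idle)")
```

For m = n = 48 and a 3×3 kernel, well over 99% of the devices are idle, which is the redundancy the invention removes.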
Disclosure of Invention
(I) Technical problem to be solved
In view of the deficiencies of the prior art, the invention provides a memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method.
(II) technical scheme
In order to achieve the above purpose, the present invention provides the following technical solution: a memory array design method with fully-unfolded non-orthogonal wiring for memory operations, comprising an array in which memory computing units are arranged. Each unit takes the data input through its Data_In port as one operand d, while the data preprogrammed into the unit serves as the other operand w. Under the combined action of the Data_In input and the bias applied at the Bias_Voltage port, operand d and operand w complete a multiplication.
Further, the memory cells in the memory array are interconnected in a non-orthogonal manner.
Further, the memory array receives the input matrix through m×n Data_In ports, where m is the number of rows and n the number of columns of the input matrix.
Further, the memory array can accommodate convolution operations with kernels of various sizes.
Further, the input matrix is of size m×n and is unfolded into a 1×(m×n) vector by formula (1), where formula (1) is:
d[(i-1)·n + j] = D[i, j], for i = 1, …, m and j = 1, …, n,
i.e. the rows of D are concatenated in order (row-major order).
Further, the 1×(m×n) vector is input through the m×n Data_In ports.
Further, the array outputs data through the Data_Out port.
Further, the array performs the two-dimensional convolution of the input matrix according to formula (2), where formula (2) is:
O[i, j] = Σ_{p=1..k} Σ_{q=1..k} W[p, q] · D[i+p-1, j+q-1],
with k the convolution kernel size.
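The row-major unfolding of the input matrix and the double-sum form of the two-dimensional convolution can be sketched numerically as follows (NumPy and the helper names are illustrative, not part of the patent):

```python
import numpy as np

def unfold(D):
    """Unfold an m x n matrix into a 1 x (m*n) row vector by
    concatenating its rows in order (row-major flattening)."""
    return D.reshape(1, -1)

def conv2d(D, W):
    """Valid two-dimensional convolution of D with a k x k kernel W:
    O[i, j] = sum over p, q of W[p, q] * D[i+p, j+q] (0-based indices)."""
    m, n = D.shape
    k = W.shape[0]
    O = np.zeros((m - k + 1, n - k + 1))
    for i in range(m - k + 1):
        for j in range(n - k + 1):
            O[i, j] = np.sum(W * D[i:i + k, j:j + k])
    return O

D = np.arange(16, dtype=float).reshape(4, 4)   # small 4 x 4 input
W = np.ones((3, 3))                            # 3 x 3 kernel of ones
print(unfold(D).shape)                         # (1, 16)
print(conv2d(D, W))                            # 2 x 2 output of window sums
```

With a 3×3 kernel on a 4×4 input the output is (4−2)×(4−2) = 2×2, matching the output sizes stated in the embodiment.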
(III) beneficial effects
Compared with the prior art, the memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method has the following beneficial effects:
The memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method is suitable for convolutional-layer operations in in-memory computing and can adapt to convolution kernels of various sizes. Exploiting the structure of two-dimensional convolution, the invention redesigns the array layout and the interconnection of the memory computing units for the fully-unfolded case, greatly reducing the redundancy and sparsity of the overall array and effectively reducing the array area while keeping the computing power unchanged.
Drawings
Fig. 1 is a block diagram showing the overall structure of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to FIG. 1, a memory array design method of fully-unfolded non-orthogonal wiring for memory operations comprises an array in which memory computing units are arranged. Each unit takes the data input through its Data_In port as one operand d, while the data preprogrammed into the unit serves as the other operand w. Under the combined action of the Data_In input and the bias applied at the Bias_Voltage port, operands d and w complete a multiplication.
The data inputs of the memory cells in the array are routed non-orthogonally.
The invention will be further described with reference to the following specific drawings and examples.
FIG. 1 shows an example memory array according to the invention. The input data is an m×n matrix; the convolution kernel takes one of two sizes, 3×3 or 2×2, and the corresponding convolution output matrices are of size (m-2)×(n-2) and (m-1)×(n-1), respectively.
The blocks labeled CIM in the figure are the memory computing units. Each unit takes the data input through its Data_In port as one operand d, and the data preprogrammed into the unit as the other operand w. Under the combined action of the Data_In input and the bias applied at the Bias_Voltage port, operands d and w complete a multiplication.
The array has 2n+3 columns and m×n rows. At most, the memory cells in columns 1, 2, 3, n+1, n+2, n+3, 2n+1, 2n+2 and 2n+3 are preprogrammed with an operand w; the remaining cells are placeholders only and are called redundant cells. The redundant cells are placed, first, to ease the routing of the later layout and, second, to keep all operation units uniform. Likewise, only rows 1 through (m-1)×(n-1) contain memory cells, while rows (m-1)×(n-1)+1 through m×n carry only wiring and no memory cells. This design, first, accommodates the size mismatch between the input matrix and the output convolution matrix and, second, again eases the routing of the later layout.
The connections between the memory cells in the array are non-orthogonal. For example: the data input of the memory cell in row 1, column 1 is brought out on its own; the data input of the cell in row 1, column 2 is tied in parallel with the data input of the cell in row 2, column 1; the data inputs of the cells in row 1 column 3, row 2 column 2 and row 3 column 1 are tied together and brought out; and so on diagonally, down to the cells of row (m-1)×(n-1). Because, after this oblique routing, the data input lines of the last rows of cells extend beyond row (m-1)×(n-1), the invention reserves a routing region of m+n-2 rows in the array, namely rows (m-1)×(n-1)+1 through m×n, as shown in detail in FIG. 1.
In this example, the computing array accommodates kernels of two sizes, 3×3 and 2×2, which covers the operating requirements of most convolutional neural networks. When the kernel size is 3×3, the memory cells in columns 1, 2, 3, n+1, n+2, n+3, 2n+1, 2n+2 and 2n+3 are enabled and preprogrammed with the operand w, and the output is the convolution matrix with edges. When the kernel size is 2×2, the memory cells in columns 1, 2, n+1 and n+2 are enabled and preprogrammed with the operand w, and the output is the convolution matrix without edges.
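The column-enable rule above can be written as a small helper (the function name is hypothetical; it assumes the 1-based column numbering used in the text, with one group of n columns per kernel row):

```python
def enabled_columns(n, k):
    """Columns (1-based) whose memory cells are enabled and preprogrammed
    with the operand w in the (2n+3)-column array, for a k x k kernel.
    k = 3 uses columns {1,2,3, n+1,n+2,n+3, 2n+1,2n+2,2n+3};
    k = 2 uses columns {1,2, n+1,n+2}, per the embodiment."""
    if k == 3:
        return [g * n + c for g in range(3) for c in (1, 2, 3)]
    if k == 2:
        return [g * n + c for g in range(2) for c in (1, 2)]
    raise ValueError("this embodiment supports k = 2 or k = 3 only")

print(enabled_columns(8, 3))  # [1, 2, 3, 9, 10, 11, 17, 18, 19]
print(enabled_columns(8, 2))  # [1, 2, 9, 10]
```

All other cells remain redundant placeholders, so switching kernel size only changes which columns are programmed, not the wiring.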
The input matrix is unfolded into a 1×(m×n) vector in the manner of formula (1) and fed in through the Data_In ports of the m×n rows. At the same time, each row of memory cells is preprogrammed with the other operand w.
The analog outputs of the operation units in each row of the array are collected and read out at Data_Out, so that the array completes the two-dimensional convolution of the input matrix in the manner of formula (2).
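Under one consistent reading of the wiring example above — the cell in the row for output pixel (i, j), column group g, within-group column c is fed the flattened input element at index i·n + j + g·n + c (0-based; this index mapping is inferred from the text, not stated explicitly) — the per-row summation reproduces the two-dimensional convolution exactly. A simulation sketch:

```python
import numpy as np

def array_conv2d(D, W):
    """Simulate the non-orthogonally wired array: each enabled cell
    multiplies its diagonally routed input element by its preprogrammed
    weight W[g, c], and the analog products of each row are summed at
    Data_Out to give one output pixel."""
    m, n = D.shape
    k = W.shape[0]
    d = D.reshape(-1)                  # row-major unfolding of the input
    O = np.zeros((m - k + 1, n - k + 1))
    for i in range(m - k + 1):
        for j in range(n - k + 1):
            row = i * n + j            # array row serving this output pixel
            acc = 0.0                  # per-row analog summation
            for g in range(k):         # column group = kernel row
                for c in range(k):     # column within group = kernel column
                    acc += W[g, c] * d[row + g * n + c]
            O[i, j] = acc
    return O

rng = np.random.default_rng(0)
D = rng.integers(0, 4, size=(5, 6)).astype(float)
W = rng.integers(0, 3, size=(3, 3)).astype(float)
ref = np.array([[np.sum(W * D[i:i + 3, j:j + 3]) for j in range(4)]
                for i in range(3)])
assert np.allclose(array_conv2d(D, W), ref)  # matches the direct convolution
print("array output matches the direct two-dimensional convolution")
```

The check against a direct window-by-window convolution confirms that the diagonal input sharing delivers exactly the elements D[i+g, j+c] that output pixel (i, j) requires.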
In summary, the memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method greatly compresses the redundancy of the array. The array designed by the invention occupies an area of (2n+3)×m×n cells, whereas the conventional scheme requires an array of m²×n² cells, so the theoretical area compression ratio is m²n² / ((2n+3)·m·n) = m·n/(2n+3). Taking m = n = 48 as an example, the area compression ratio of the invention is approximately 24 times.
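The area arithmetic above can be checked directly; the ratio of the conventional cell count m²·n² to the proposed count (2n+3)·m·n simplifies to m·n/(2n+3):

```python
def compression_ratio(m, n):
    """Theoretical area compression ratio of the proposed array:
    conventional m^2 * n^2 cells versus (2n+3) * m * n cells."""
    conventional = m * m * n * n
    proposed = (2 * n + 3) * m * n
    return conventional / proposed     # equals m*n / (2n + 3)

print(f"{compression_ratio(48, 48):.1f}x")  # 2304/99 ≈ 23.3x, roughly 24x
```

For m = n = 48 the exact ratio is 2304/99 ≈ 23.3, which the text rounds to 24 times.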
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims (5)

1. A memory array design method of fully-unfolded non-orthogonal wiring for memory operations, characterized by comprising a memory array in which memory computing units are arranged, each unit taking the data input through its Data_In port as one operand d and the data preprogrammed into the unit as another operand w, the operand d and the operand w completing a multiplication under the combined action of the Data_In input and the bias applied at the Bias_Voltage port;
the storage array inputs an input matrix of the array through m multiplied by n Data-In ports, wherein m represents the number of rows of the input matrix, and n represents the number of columns of the input matrix;
the size of the input matrix is m×n, and the input matrix is developed into a vector of 1× (m×n) by a method of formula (1), wherein the formula (1) is:
the array completes two-dimensional convolution operation on an input matrix in a mode of a formula (2), wherein the formula (2) is as follows:
2. The memory array design method of fully-unfolded non-orthogonal wiring for memory operations according to claim 1, characterized in that: the memory cells in the memory array are connected in a non-orthogonal manner.
3. The memory array design method for the fully-expanded non-orthogonal wiring for memory operations according to claim 2, wherein the method comprises the following steps: the storage array may accommodate convolution operations of convolution kernels of various sizes.
4. The memory array design method of fully-unfolded non-orthogonal wiring for memory operations according to claim 3, characterized in that: the 1×(m×n) vector is input through the m×n Data_In ports of the array.
5. The memory array design method for the fully-expanded non-orthogonal wiring for memory operations according to claim 4, wherein the method comprises the following steps: the array outputs Data through the data_out port.
CN202110176004.5A 2021-02-06 2021-02-06 Memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method Active CN112989268B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110176004.5A CN112989268B (en) 2021-02-06 2021-02-06 Memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110176004.5A CN112989268B (en) 2021-02-06 2021-02-06 Memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method

Publications (2)

Publication Number Publication Date
CN112989268A CN112989268A (en) 2021-06-18
CN112989268B 2024-01-30

Family

ID=76392637

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110176004.5A Active CN112989268B (en) 2021-02-06 2021-02-06 Memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method

Country Status (1)

Country Link
CN (1) CN112989268B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343585B (en) * 2021-06-29 2024-08-23 江南大学 Method for designing weight discrete memory array for matrix multiplication operation
CN113672860B (en) * 2021-08-25 2023-05-12 恒烁半导体(合肥)股份有限公司 Positive and negative number compatible in-memory operation method, multiplication and addition operation device and application thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647983A (en) * 2019-09-30 2020-01-03 南京大学 Self-supervision learning acceleration system and method based on storage and calculation integrated device array
CN111241028A (en) * 2018-11-28 2020-06-05 北京知存科技有限公司 Digital-analog hybrid storage and calculation integrated chip and calculation device
CN111950718A (en) * 2019-05-16 2020-11-17 北京知存科技有限公司 Method for realizing progressive CNN operation by using storage and computation integrated chip
CN112115665A (en) * 2020-09-14 2020-12-22 上海集成电路研发中心有限公司 Storage and calculation integrated storage array and convolution operation method thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7120658B2 (en) * 2002-05-14 2006-10-10 Nash James G Digital systolic array architecture and method for computing the discrete Fourier transform
US8374045B2 (en) * 2009-12-07 2013-02-12 Spansion Israel Ltd Methods circuits devices and systems for operating an array of non-volatile memory cells

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111241028A (en) * 2018-11-28 2020-06-05 北京知存科技有限公司 Digital-analog hybrid storage and calculation integrated chip and calculation device
CN111950718A (en) * 2019-05-16 2020-11-17 北京知存科技有限公司 Method for realizing progressive CNN operation by using storage and computation integrated chip
CN110647983A (en) * 2019-09-30 2020-01-03 南京大学 Self-supervision learning acceleration system and method based on storage and calculation integrated device array
CN112115665A (en) * 2020-09-14 2020-12-22 上海集成电路研发中心有限公司 Storage and calculation integrated storage array and convolution operation method thereof

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A novel convolution computing paradigm based on NOR flash array with high computing speed and energy efficiency; HAN Runze et al.; IEEE Transactions on Circuits and Systems; full text *
Efficient and robust spike-driven deep convolutional neural networks based on NOR flash computing array; Xiang Yachen et al.; IEEE Transactions on Electron Devices; full text *
Research progress in FPGA-based hardware acceleration of machine learning; Wang Chao, Wang Teng, Ma Xiang, Zhou Xuehai; Chinese Journal of Computers (06); full text *
Approximate computation of deep convolutional neural networks with a memristor-based PIM architecture; Li Chuxi et al.; Journal of Computer Research and Development; Vol. 54, No. 6; full text *
A survey of hardware-accelerated neural networks; Chen Guilin et al.; Journal of Computer Research and Development; Vol. 56, No. 2; full text *

Also Published As

Publication number Publication date
CN112989268A (en) 2021-06-18

Similar Documents

Publication Publication Date Title
CN109062612B (en) Neural network processing device and method for executing plane rotation instruction
CN112989268B (en) Memory operation-oriented fully-unfolded non-orthogonal wiring memory array design method
WO2022037257A1 (en) Convolution calculation engine, artificial intelligence chip, and data processing method
CN109522052B (en) Computing device and board card
US10496855B2 (en) Analog sub-matrix computing from input matrixes
CN107993186A (en) 3D CNN acceleration method and system based on Winograd algorithm
CN110705703B (en) Sparse neural network processor based on systolic array
CN109032670A (en) Processing with Neural Network device and its method for executing vector duplicate instructions
CN109284824B (en) Reconfigurable technology-based device for accelerating convolution and pooling operation
Tang et al. AEPE: An area and power efficient RRAM crossbar-based accelerator for deep CNNs
CN108182959B (en) Method for realizing logic calculation based on crossing array structure of resistive device
US20220179823A1 (en) Reconfigurable reduced instruction set computer processor architecture with fractured cores
CN111723336A (en) Cholesky decomposition-based arbitrary-order matrix inversion hardware acceleration system adopting loop iteration mode
CN110059809B (en) Computing device and related product
US11934482B2 (en) Computational memory
CN111079908A (en) Network-on-chip data processing method, storage medium, computer device and apparatus
Waidyasooriya et al. FPGA implementation of heterogeneous multicore platform with SIMD/MIMD custom accelerators
Srinivasa et al. Trends and opportunities for SRAM based in-memory and near-memory computation
CN113743046B (en) Integrated layout structure for memory and calculation and integrated layout structure for data splitting and memory and calculation
CN112328536B (en) Inter-core structure of multi-core processor array and multi-core processor
CN112115665A (en) Storage and calculation integrated storage array and convolution operation method thereof
Zhang et al. A High-Efficient and Configurable Hardware Accelerator for Convolutional Neural Network
He et al. A systolic array implementation of common factor algorithm to compute DFT
CN111222632B (en) Computing device, computing method and related product
US20220207323A1 (en) Architecture and cluster of processing elements and operating method for convolution

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant