CN113705784A - Neural network weight coding method based on matrix sharing and hardware system - Google Patents
Neural network weight coding method based on matrix sharing and hardware system
- Publication number
- CN113705784A (application CN202110964903.1A)
- Authority
- CN
- China
- Prior art keywords
- matrix
- neural network
- coding
- weight
- positive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses a neural network weight coding method based on a shared matrix, and a hardware system, belonging to the technical field of hardware implementation of neural network algorithms. Aiming at the problems of current weight coding methods, namely the high demand for storage and calculation devices, the large storage and calculation array size and the high cost, the method codes the weight parameters in the neural network convolution kernels pairwise by means of a shared matrix. Meanwhile, the coding method of the invention has good compatibility and is not limited by the type of nonvolatile memory device. Compared with existing coding methods, the shared-matrix coding method is better suited to large-scale and ultra-large-scale storage and calculation architectures.
Description
Technical Field
The invention relates to a neural network weight coding method based on matrix sharing and a hardware system, and belongs to the technical field of neural network algorithm hardware implementation.
Background
With the development of deep learning, neural networks have been widely applied in fields such as image recognition, speech recognition and natural language processing. However, as network architectures grow more complex, the amount of data transmission and computation in neural networks increases dramatically. The resulting computation and data movement consume considerable power, making neural network applications difficult to deploy on hardware devices.
In recent years, storage and calculation integrated architectures for neural network computation have attracted wide attention and research. The basic idea is to map the weights into a storage and calculation array, so that logic computations that are simple but involve huge data volumes are performed inside the memory, reducing the amount and distance of data transmission between the memory and the processor.
The operation principle of a classical nonvolatile memory array is shown in Fig. 1. To compute the product of two matrices, the values of the weight matrix W are stored in the memory array in the form of conductances, the values of the input matrix are applied to the input terminals of the storage and calculation array in the form of voltages, and the operation result is obtained at the output terminals in the form of currents, completing the whole storage and calculation integration process.
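As a minimal numerical sketch of this storage and calculation integration principle (the matrices and values below are illustrative, not from the patent): conductances hold the weights, voltages carry the inputs, and the summed column currents give exactly the matrix-vector product.

```python
import numpy as np

# Weight matrix stored in the memory array as conductances (arbitrary units).
G = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Input values applied to the array's input terminals as voltages.
V = np.array([0.5, 0.25])

# Each output line accumulates the currents of its column (Kirchhoff's
# current law), so the output currents equal the matrix-vector product.
I = G.T @ V
assert np.allclose(I, [1.25, 2.0])  # same result as computing W^T * V digitally
```

The analog array thus performs the multiply-accumulate in place, which is why reducing the number of devices per weight directly shrinks the array.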
The weight parameters in a neural network are usually divided into positive and negative values. To represent them, the storage and calculation array usually adopts a positive-matrix/negative-matrix representation: the coded values are stored in a positive matrix and a negative matrix respectively, the output results of the two matrices are subtracted, and the weight parameter is expressed by the difference. As shown in Fig. 2, the current coding method processes each weight parameter independently and maps it into the positive and negative matrices, i.e. at least two storage and calculation devices are required for each weight represented. In application scenarios that pursue high precision, one weight is represented by several positive-matrix devices and several negative-matrix devices, so the device overhead per weight parameter is even higher.
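The conventional per-weight differential coding described here can be sketched as splitting each weight into non-negative positive and negative parts, one device per part (a hedged illustration; the function name and example values are ours):

```python
import numpy as np

def encode_pos_neg(W):
    """Per-weight differential coding: W = W_pos - W_neg,
    with one storage device holding each non-negative part."""
    W = np.asarray(W, dtype=float)
    return np.maximum(W, 0.0), np.maximum(-W, 0.0)

W = np.array([5.0, -3.0, 0.0, 11.0])
W_pos, W_neg = encode_pos_neg(W)
assert np.allclose(W_pos - W_neg, W)  # differential readout recovers W
# Device cost: 2 devices per weight (2 * W.size in total).
```

Every weight therefore occupies at least two devices, which is the overhead the shared-matrix scheme below reduces.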
Nowadays a simple neural network can hardly meet task requirements, so network structures are increasingly complex and their scale keeps growing, and the application scenarios of the storage and calculation integrated architecture become ever wider and more complex; implementing large-scale and ultra-large-scale neural networks in hardware is therefore a necessary path for the development of the storage and calculation integrated architecture. As the depth and parameter count of neural networks increase, the number of devices and the array sizes required in hardware implementations inevitably grow. Large array sizes cause a series of problems involving cost, chip area, parasitic parameters and testability, and the difficulty of hardware implementation multiplies as the number of devices increases.
Disclosure of Invention
Aiming at the problems of existing neural network weight coding methods in hardware implementation, namely the large number of storage and calculation devices, the large storage and calculation array size and the high cost, the invention provides a neural network weight parameter coding method based on matrix sharing and a hardware system.
The invention provides a neural network weight parameter coding method based on matrix sharing, which is characterized by comprising the following steps:
the method comprises the following steps: performing fixed-point processing on the weight parameters in the trained neural network convolution kernels;
step two: grouping the convolution kernels in the neural network: when the number of convolution kernels is even, grouping them in pairs; when the number is odd, taking out one convolution kernel at random and grouping the remaining convolution kernels in pairs;
step three: calculating an actual encoded value, comprising:
independently coding the weight parameters in one convolution kernel arbitrarily taken out in the step two, outputting two actual coding values after each weight parameter is coded, and mapping and storing the two actual coding values to a positive matrix and a negative matrix;
for the pairwise-grouped convolution kernels of step two: the two weight parameters at the same position in the same group of convolution kernels are coded pairwise; each pair of weight parameters yields three actual coded values after coding, which are mapped and stored to three matrices;
step four: and splicing the matrixes storing the actual coding values to obtain a coded neural network weight matrix.
Optionally, with w1 and w2 denoting the two weight parameters at the same position of the same group of convolution kernels, w1 and w2 are coded pairwise and the coding results are mapped and stored to three matrices, namely a w1 positive matrix, a w2 positive matrix and a shared negative matrix. The process of pairwise coding comprises:
when w1 ≥ 0 and w2 ≥ 0, w1 and w2 keep their original values unchanged;
the weight parameter w1 is jointly determined by the two coded values p1 and s, and the weight parameter w2 is jointly determined by the two coded values p2 and s, where s is the shared negative coded value, i.e. w1 = p1 − s and w2 = p2 − s;
p1 is mapped and stored to the w1 positive matrix, p2 is mapped and stored to the w2 positive matrix, and s is mapped and stored to the shared negative matrix.
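Writing w1 and w2 for the paired weights, the shared-negative variant admits a compact closed form: take s = max(0, −min(w1, w2)) so that p1 = w1 + s and p2 = w2 + s are both non-negative and each weight is recovered as wi = pi − s. The notation and the explicit formula are our reading of the description, checked against the embodiment's worked example (Tables 2 and 4) rather than quoted from the patent:

```python
def encode_shared_negative(w1, w2):
    """Pairwise coding with a shared negative value s:
    w1 = p1 - s and w2 = p2 - s, with p1, p2, s all non-negative."""
    s = max(0, -min(w1, w2))      # shared negative coded value
    return w1 + s, w2 + s, s      # (p1, p2, s)

# Column-by-column pairs of the embodiment's rearranged matrix (Table 2):
pairs = [(5, 12), (15, -3), (-7, -7), (-9, 11)]
coded = [encode_shared_negative(w1, w2) for w1, w2 in pairs]
assert coded == [(5, 12, 0), (18, 0, 3), (0, 0, 7), (0, 20, 9)]  # Table 4
assert all(p1 - s == w1 and p2 - s == w2
           for (w1, w2), (p1, p2, s) in zip(pairs, coded))
```

Three non-negative values per pair replace the four values of per-weight differential coding, which is where the device saving comes from.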
Optionally, with w1 and w2 denoting the two weight parameters at the same position of the same group of convolution kernels, w1 and w2 are coded pairwise and the coding results are mapped and stored to three matrices, namely a w1 negative matrix, a w2 negative matrix and a shared positive matrix. The process of pairwise coding comprises:
when w1 ≤ 0 and w2 ≤ 0, w1 and w2 keep their original values unchanged;
the weight parameter w1 is jointly determined by the two coded values n1 and s, and the weight parameter w2 is jointly determined by the two coded values n2 and s, where s is the shared positive coded value, i.e. w1 = s − n1 and w2 = s − n2;
n1 is mapped and stored to the w1 negative matrix, n2 is mapped and stored to the w2 negative matrix, and s is mapped and stored to the shared positive matrix.
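The shared-positive variant is the dual by symmetry: writing w1 and w2 for the paired weights, take s = max(w1, w2, 0) and ni = s − wi, so each weight is recovered as wi = s − ni with all stored values non-negative. As before, the formula is our inference from the description, not verbatim from the patent:

```python
def encode_shared_positive(w1, w2):
    """Pairwise coding with a shared positive value s:
    w1 = s - n1 and w2 = s - n2, with n1, n2, s all non-negative."""
    s = max(w1, w2, 0)            # shared positive coded value
    return s - w1, s - w2, s      # (n1, n2, s)

# The decode relation holds for mixed-sign, all-negative and all-positive pairs:
for w1, w2 in [(5, 12), (15, -3), (-7, -7), (-9, 11)]:
    n1, n2, s = encode_shared_positive(w1, w2)
    assert (s - n1, s - n2) == (w1, w2)
    assert n1 >= 0 and n2 >= 0 and s >= 0
```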
A second object of the present invention is to provide a neural network chip comprising a storage and calculation array composed of a plurality of storage and calculation devices, each weight value in the neural network weight matrix being written one by one into the corresponding storage and calculation device, the neural network weight matrix being obtained by the above neural network weight parameter coding method.
Optionally, the storage device is a non-volatile storage device.
Optionally, the precision of the storage device is 8 bits.
Optionally, the precision of the storage device is 4 bits.
A third object of the present invention is to provide a computing apparatus comprising a memory and a processor, the memory storing computer-executable instructions which, when executed by the processor, perform the above neural network weight parameter coding method.
A fourth object of the present invention is to provide a neural network hardware system, comprising: the invention provides a neural network chip and/or a computing device.
The invention also provides an application of the neural network weight parameter coding method and/or the neural network chip and/or the computing device and/or the neural network hardware system in the technical field of neural networks.
The invention has the beneficial effects that:
in the hardware implementation of the neural network, the convolution kernels are grouped in pairs and the weight parameters at the same positions of the grouped kernels are coded pairwise in a matrix-sharing manner; after coding, two weight parameters require only three storage devices, which reduces the number of storage devices needed in the storage and calculation array. Meanwhile, the coding method is not limited by the type of nonvolatile memory device and has good compatibility. Compared with the existing method, which codes each weight parameter independently and stores the coded values in 2 storage devices, the method uses on average only 1.5 storage devices per weight parameter, saving about 1/4 of the device overhead of the storage array.
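The device-count arithmetic behind this saving can be made explicit: the conventional scheme needs 2 devices per weight, pairwise sharing needs 3 devices per 2 weights (1.5 on average), and any leftover kernel from an odd grouping keeps the 2-device coding, so the realized saving is slightly below the ideal 1/4 (the function and numbers below are an illustrative sketch):

```python
def device_counts(n_paired, n_single=0):
    """Storage devices needed: 2 per weight conventionally;
    3 per pair (1.5 per weight) with matrix sharing, leftovers at 2 each."""
    conventional = 2 * (n_paired + n_single)
    shared = 3 * (n_paired // 2) + 2 * n_single
    return conventional, shared

conv, shared = device_counts(1000)
assert (conv, shared) == (2000, 1500)
assert 1 - shared / conv == 0.25  # ideal 1/4 saving with even pairing
```

With some weights left unpaired the ratio lands between 0 and 25%, consistent with the 23.2% and 25.0% figures reported in Table 5.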
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic diagram of the operation of a non-volatile memory array.
Fig. 2 is a schematic diagram of a storage array structure of the current encoding method.
FIG. 3 is a schematic diagram of a shared device structure according to the present invention.
Fig. 4 shows a schematic diagram of the whole process of mapping from algorithm to storage array according to the present invention, wherein (a) is algorithm encoding process and (b) is process of writing encoded value into storage array.
Fig. 5 is a flow chart of an implementation of the encoding method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The coding method provided by the invention is oriented to the weight parameters in a neural network. In practical application, the neural network is first trained to obtain the final trained convolution kernels, and the weight parameters in the convolution kernels are then coded with the method provided by the invention.
The first embodiment is as follows:
the embodiment provides a neural network coding method based on matrix sharing, which is applied to a neural network, and the coding process comprises the following steps:
the method comprises the following steps: performing fixed-point processing on the weight parameters in the trained neural network convolution kernel, wherein the fixed-point processing method in the embodiment is a linear transformation mode;
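The patent does not detail the linear transformation used for fixed-point processing; a common linear scheme of this kind scales by the largest weight magnitude and rounds to signed integers of a chosen width (the bit width, scale and rounding below are our assumptions, not the patent's):

```python
import numpy as np

def fixed_point(W, bits=8):
    """Linear fixed-point mapping: scale so the weights fit signed
    integers of the given width, then round (illustrative assumptions)."""
    W = np.asarray(W, dtype=float)
    scale = (2 ** (bits - 1) - 1) / np.max(np.abs(W))
    return np.round(W * scale).astype(int)

q = fixed_point([0.5, -0.25, 1.0])
assert list(q) == [64, -32, 127]
```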
Table 1 shows the convolution kernel weight parameter matrix after fixed-point processing in this embodiment;
TABLE 1 Weight matrix after fixed-point processing
Step two: grouping the convolution kernels processed in the step one, and rearranging the convolution kernels, wherein the weight parameters are processed pairwise;
table 2 is the matrix rearranged after grouping;
TABLE 2 Rearranged matrix

| 5 | 15 | -7 | -9 |
| 12 | -3 | -7 | 11 |
Step three: calculating an actual coding value; in this embodiment, a shared negative matrix is selected, and the rearrangement matrix is encoded according to the encoding flowchart of fig. 5;
The weight parameters at corresponding positions of the upper and lower rows are processed in pairs, each pair sharing one storage device, and primary coding gives the primary coding matrix of Table 3.
TABLE 3 Primary coding matrix

| | 5 | 15 | 0 | 0 |
| Shared | 0 | 0 | 7 | 9 |
| | 12 | -3 | 0 | 20 |
The primary coding matrix is then checked: the physical values in an actual storage and calculation array cannot be negative, so when a coded value is negative, an offset is introduced into that group of codes while the coded values of the remaining groups stay unchanged, yielding the actual coding matrix of Table 4.
TABLE 4 Actual coding matrix

| | 5 | 18 | 0 | 0 |
| Shared | 0 | 3 | 7 | 9 |
| | 12 | 0 | 0 | 20 |
Step four: and splicing the actual coding matrixes in the table 4 to obtain a coded neural network weight matrix.
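The embodiment's steps two to four can be reproduced end to end: pair the two rows column by column, code each pair with a shared negative value, and stack the three resulting rows (the function name is ours; the numbers match Tables 2 and 4):

```python
import numpy as np

def encode_rows(row1, row2):
    """Code two grouped rows column by column into three rows:
    row1 codes, shared negative codes, row2 codes."""
    p1, shared, p2 = [], [], []
    for w1, w2 in zip(row1, row2):
        s = max(0, -min(w1, w2))
        p1.append(w1 + s); shared.append(s); p2.append(w2 + s)
    return np.array([p1, shared, p2])

actual = encode_rows([5, 15, -7, -9], [12, -3, -7, 11])
expected = np.array([[5, 18, 0, 0],
                     [0,  3, 7, 9],
                     [12, 0, 0, 20]])
assert np.array_equal(actual, expected)  # reproduces Table 4
```

Splicing such three-row blocks for every pair of grouped kernels yields the complete coded weight matrix of step four.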
To verify the effectiveness and the low device overhead of the coding method, a series of experiments were carried out. The coding method was applied to different neural networks and verified on the MNIST handwritten digit data set, observing the network inference accuracy after coding and the device overhead required by the storage array in the hardware implementation.
The first experiment designed a small-scale neural network comprising 3 convolutional layers and 1 fully-connected layer. In the first convolutional layer, the convolution kernel size is 5 × 1 × 7 with no padding; in the second convolutional layer, the kernel size is 3 × 7 with padding; in the third convolutional layer, the kernel size is 3 × 7 with padding; the fourth layer is a fully-connected layer with 63 input neurons and 10 output neurons.
The coding method of the invention was then applied to a deeper, larger-scale neural network. The new network has 8 layers, comprising 5 convolutional layers and 3 fully-connected layers. In the first convolutional layer, the convolution kernel size is 7 × 1 × 16 with no padding; in the second, 5 × 16 × 32 with padding; in the third, 3 × 32 × 128 with padding; in the fourth, 3 × 128 with padding; in the fifth, 3 × 128 × 64 with padding. The last three layers are fully-connected layers performing classification, with 1024, 512 and 10 neurons respectively.
The results are summarized in Table 5 below. Applying the coding method to the two types of neural networks verifies its effectiveness, and a horizontal comparison of the two networks shows how the optimization effect of the coding method scales with network size.
TABLE 5 Summary of the results

| Network | Neural network 1 | Neural network 2 |
| Number of layers | 5 | 8 |
| Number of parameters | 1,718 | 1,065,098 |
| Accuracy | 96.71% | 98.94% |
| Number of computing devices, original coding | 3,436 | 2,130,196 |
| Number of computing devices, proposed coding | 2,639 | 1,595,328 |
| Overhead saved | 23.2% | 25.0% |
From the experimental results in Table 5, the target networks still perform well in accuracy after the coding method of the invention is applied, verifying its validity. Comparing the number of storage devices required before and after applying the coding method shows that it markedly reduces device overhead. The coding method can therefore, while preserving network accuracy, alleviate the need for large numbers of circuit devices in neural network hardware implementation, effectively reducing development cost and design difficulty.
Some steps in the embodiments of the present invention may be implemented by software, and the corresponding software program may be stored in a readable storage medium, such as an optical disc or a hard disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (10)
1. A neural network weight parameter coding method based on matrix sharing is characterized by comprising the following steps:
the method comprises the following steps: performing fixed-point processing on the weight parameters in the trained neural network convolution kernels;
step two: grouping the convolution kernels in the neural network: when the number of convolution kernels is even, grouping them in pairs; when the number is odd, taking out one convolution kernel at random and grouping the remaining convolution kernels in pairs;
step three: calculating an actual encoded value, comprising:
independently coding the weight parameters in one convolution kernel arbitrarily taken out in the step two, outputting two actual coding values after each weight parameter is coded, and mapping and storing the two actual coding values to a positive matrix and a negative matrix;
for the pairwise-grouped convolution kernels of step two: the two weight parameters at the same position in the same group of convolution kernels are coded pairwise; each pair of weight parameters yields three actual coded values after coding, which are mapped and stored to three matrices;
step four: and splicing the matrixes storing the actual coding values to obtain a coded neural network weight matrix.
2. The method of claim 1, wherein, with w1 and w2 denoting the two weight parameters at the same position of the same group of convolution kernels, w1 and w2 are coded pairwise and the coding results are mapped and stored to three matrices, namely a w1 positive matrix, a w2 positive matrix and a shared negative matrix, the process of pairwise coding comprising:
when w1 ≥ 0 and w2 ≥ 0, w1 and w2 keep their original values unchanged;
the weight parameter w1 is jointly determined by the two coded values p1 and s, and the weight parameter w2 is jointly determined by the two coded values p2 and s, where s is the shared negative coded value;
3. The method of claim 1, wherein, with w1 and w2 denoting the two weight parameters at the same position of the same group of convolution kernels, w1 and w2 are coded pairwise and the coding results are mapped and stored to three matrices, namely a w1 negative matrix, a w2 negative matrix and a shared positive matrix, the process of pairwise coding comprising:
when w1 ≤ 0 and w2 ≤ 0, w1 and w2 keep their original values unchanged;
the weight parameter w1 is jointly determined by the two coded values n1 and s, and the weight parameter w2 is jointly determined by the two coded values n2 and s, where s is the shared positive coded value;
4. A neural network chip comprising a storage and calculation array formed by a plurality of storage and calculation devices, each weight value in a neural network weight matrix being written one by one into the corresponding storage and calculation device, characterized in that the neural network weight matrix is obtained by using the neural network weight parameter coding method according to any one of claims 1 to 3.
5. The chip of claim 4, wherein the memory device is a non-volatile memory device.
6. The chip of claim 5, wherein the precision of the memory device is 8 bits.
7. The chip of claim 5, wherein the precision of the memory device is 4 bits.
8. A computing device comprising a memory and a processor, the memory having stored thereon computer-executable instructions, wherein the instructions, when executed by the processor, perform the neural network weight parameter encoding method of any one of claims 1-3.
9. A neural network hardware system, comprising: the neural network chip of any one of claims 4-7 and/or the computing device of claim 8.
10. Use of the neural network weight parameter coding method according to any one of claims 1 to 3 and/or the neural network chip according to any one of claims 4 to 7 and/or the computing device according to claim 8 and/or the neural network hardware system according to claim 9 in the field of neural network technology.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110964903.1A CN113705784A (en) | 2021-08-20 | 2021-08-20 | Neural network weight coding method based on matrix sharing and hardware system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110964903.1A CN113705784A (en) | 2021-08-20 | 2021-08-20 | Neural network weight coding method based on matrix sharing and hardware system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113705784A true CN113705784A (en) | 2021-11-26 |
Family
ID=78653841
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110964903.1A Pending CN113705784A (en) | 2021-08-20 | 2021-08-20 | Neural network weight coding method based on matrix sharing and hardware system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113705784A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116306811A (en) * | 2023-02-28 | 2023-06-23 | 苏州亿铸智能科技有限公司 | Weight distribution method for deploying neural network for ReRAM |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170200078A1 (en) * | 2014-08-28 | 2017-07-13 | Commissariat A L'energie Atomique Et Aux Energies Alternatives | Convolutional neural network |
CN107612555A (en) * | 2017-10-12 | 2018-01-19 | 江南大学 | A kind of improvement degree of rarefication Adaptive matching tracing algorithm based on dichotomy |
US20180046916A1 (en) * | 2016-08-11 | 2018-02-15 | Nvidia Corporation | Sparse convolutional neural network accelerator |
CN109993297A (en) * | 2019-04-02 | 2019-07-09 | 南京吉相传感成像技术研究院有限公司 | A kind of the sparse convolution neural network accelerator and its accelerated method of load balancing |
CN111242180A (en) * | 2020-01-03 | 2020-06-05 | 南京邮电大学 | Image identification method and system based on lightweight convolutional neural network |
CN112734025A (en) * | 2019-10-28 | 2021-04-30 | 复旦大学 | Neural network parameter sparsification method based on fixed base regularization |
CN112836757A (en) * | 2021-02-09 | 2021-05-25 | 东南大学 | Deep learning network convolution kernel internal parameter sharing method |
CN112990454A (en) * | 2021-02-01 | 2021-06-18 | 国网安徽省电力有限公司检修分公司 | Neural network calculation acceleration method and device based on integrated DPU multi-core isomerism |
US20210232897A1 (en) * | 2016-04-27 | 2021-07-29 | Commissariat A L'energie Atomique Et Aux Energies Al Ternatives | Device and method for calculating convolution in a convolutional neural network |
- 2021-08-20: application CN202110964903.1A filed, published as CN113705784A, status Pending
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170200078A1 (en) * | 2014-08-28 | 2017-07-13 | Commissariat A L'energie Atomique Et Aux Energies Alternatives | Convolutional neural network |
US20210232897A1 (en) * | 2016-04-27 | 2021-07-29 | Commissariat A L'energie Atomique Et Aux Energies Al Ternatives | Device and method for calculating convolution in a convolutional neural network |
US20180046916A1 (en) * | 2016-08-11 | 2018-02-15 | Nvidia Corporation | Sparse convolutional neural network accelerator |
CN107612555A (en) * | 2017-10-12 | 2018-01-19 | 江南大学 | A kind of improvement degree of rarefication Adaptive matching tracing algorithm based on dichotomy |
CN109993297A (en) * | 2019-04-02 | 2019-07-09 | 南京吉相传感成像技术研究院有限公司 | A kind of the sparse convolution neural network accelerator and its accelerated method of load balancing |
CN112734025A (en) * | 2019-10-28 | 2021-04-30 | 复旦大学 | Neural network parameter sparsification method based on fixed base regularization |
CN111242180A (en) * | 2020-01-03 | 2020-06-05 | 南京邮电大学 | Image identification method and system based on lightweight convolutional neural network |
CN112990454A (en) * | 2021-02-01 | 2021-06-18 | 国网安徽省电力有限公司检修分公司 | Neural network calculation acceleration method and device based on integrated DPU multi-core isomerism |
CN112836757A (en) * | 2021-02-09 | 2021-05-25 | 东南大学 | Deep learning network convolution kernel internal parameter sharing method |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116306811A (en) * | 2023-02-28 | 2023-06-23 | 苏州亿铸智能科技有限公司 | Weight distribution method for deploying neural network for ReRAM |
CN116306811B (en) * | 2023-02-28 | 2023-10-27 | 苏州亿铸智能科技有限公司 | Weight distribution method for deploying neural network for ReRAM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |