CN109784483B - FD-SOI (fully depleted silicon-on-insulator) process-based binary convolutional neural network in-memory computing accelerator - Google Patents

FD-SOI (fully depleted silicon-on-insulator) process-based binary convolutional neural network in-memory computing accelerator

Info

Publication number
CN109784483B
Authority
CN
China
Prior art keywords
module
convolution
input
memory
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910068644.7A
Other languages
Chinese (zh)
Other versions
CN109784483A (en)
Inventor
胡绍刚 (Hu Shaogang)
刘爽 (Liu Shuang)
邓阳杰 (Deng Yangjie)
罗鑫 (Luo Xin)
于奇 (Yu Qi)
刘洋 (Liu Yang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201910068644.7A priority Critical patent/CN109784483B/en
Publication of CN109784483A publication Critical patent/CN109784483A/en
Application granted granted Critical
Publication of CN109784483B publication Critical patent/CN109784483B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention belongs to the technical field of neural networks, and relates to a binarized convolutional neural network in-memory computing accelerator based on the FD-SOI (fully depleted silicon-on-insulator) process. The invention exploits the modulation of the threshold voltage of an FD-SOI-MOSFET by its back-gate voltage to realize exclusive-OR (XOR) processing of data. The convolution kernel parameters of the convolutional neural network are flattened to one dimension and stored in memory, and the convolution of the neural network is realized by performing XOR operations on the stored kernels with FD-SOI-MOSFETs. On the premise of in-memory computing, completing the convolution with XOR operations instead of the conventional convolution procedure maintains high accuracy, greatly increases the convolution processing speed of the neural network, saves storage space for the neural network parameters, reduces data transmission, and lowers the operating power consumption.

Description

FD-SOI (fully depleted silicon-on-insulator) process-based binary convolutional neural network in-memory computing accelerator
Technical Field
The invention belongs to the technical field of neural networks, and relates to a binarized convolutional neural network in-memory computing accelerator based on the FD-SOI (fully depleted silicon-on-insulator) process.
Background
A Convolutional Neural Network (CNN) is a common deep learning architecture. It was inspired by the biological mechanism of natural visual cognition (cells in the animal visual cortex are responsible for detecting optical signals) and is a special multilayer feedforward neural network. Its artificial neurons respond to surrounding units within a local receptive field, which gives it excellent performance in large-scale image processing. A CNN mainly consists of convolutional layers (Convolutional Layer), pooling layers (Pooling Layer) and fully connected layers (Fully Connected Layer); the convolutional layers extract different input features, where the first convolutional layer may extract only low-level features such as edges, lines and corners, and deeper layers iteratively extract more complex features from these low-level features. In a conventional binarized convolutional neural network, the weights and the hidden-layer activation values are binarized to 1 or -1, and through binarization the parameters of the neural network occupy much less storage space.
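As a hedged illustration that is not part of the patent text, the binarization mentioned above is commonly obtained by taking the sign of the real-valued weights and activations; the function and variable names below are assumptions for illustration only.

```python
import numpy as np

def binarize(x):
    """Binarize real-valued weights or activations to +1/-1 by their sign."""
    return np.where(x >= 0, 1, -1)

# Example: a 3x3 real-valued kernel becomes a {-1, +1} kernel,
# which needs only one bit per weight instead of a full float.
w = np.array([[0.3, -1.2, 0.0],
              [0.7, -0.1, 2.4],
              [-0.5, 0.9, -0.8]])
w_bin = binarize(w)
```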
As semiconductor processes advanced to the 22 nm node, both FinFET and FD-SOI process technologies were developed to meet performance, cost and power-consumption requirements. FD-SOI is a planar technology; because a complete industrial ecosystem had not yet formed around it, its range of application was relatively narrow. In recent years, however, FD-SOI technology has attracted more and more attention from industry, its ecosystem is gradually taking shape, and its technical advantages and application prospects are becoming increasingly attractive.
Conventionally, data is stored on disk and has to be fetched into memory before any operation can be performed, a process that requires a large number of I/O accesses. With in-memory computing, the computation is instead sent to the data and executed locally, which greatly increases the computing speed, saves storage area, reduces data transmission and lowers the computing power consumption.
At present, there is no circuit that increases the CNN convolution speed by realizing XOR processing of data based on the FD-SOI process.
Disclosure of Invention
In view of the above problems, the invention provides a binarized convolutional neural network in-memory computing accelerator based on the FD-SOI process.
The technical solution of the invention is as follows:
a binarization convolution neural network memory computing accelerator based on FD-SOI technology comprises the following steps:
the in-memory calculation module is used for storing convolution kernel parameters of the convolution neural network and completing convolution processing on input data;
a shift register module for storing convolution neural network input data and having a shift function;
a controller module for logically controlling the shift register module and the in-memory computing module;
a detection conversion module for converting the calculation result of the in-memory computing module into the conventional convolution calculation result;
the normalization module is used for adding the convolution results of different convolution kernels according to weights;
and an activation function module for adding a nonlinear factor to the neural network.
Furthermore, the in-memory computing module is constructed from an SRAM module, an input and inverting-input module, a pre-charge module and the like;
the SRAM module is a module constructed from 6 MOSFETs that can store one bit of data: two P-type MOSFETs and two N-type MOSFETs form two CMOS inverters that are cross-coupled (connected end to end), and this structure stores one bit of data; an FD-SOI-MOSFET is connected to the output of each of the two inverters, and the back-gate signals of these two FD-SOI-MOSFETs are connected respectively to the input signal and to the inverted input signal; because the back-gate signal of an FD-SOI-MOSFET modulates its threshold voltage, this modulation is used to complete a NAND operation between the input signal and the stored signal;
the input and inverting-input module provides the SRAM module with the input signal and the inverted version of the input signal;
the pre-charge module charges the pre-charge capacitors before the in-memory computing module performs the XOR operation.
Furthermore, the in-memory computing module stores the convolution kernel parameters of the convolutional neural network after they have been flattened to one dimension: the first (n × n) columns of each row can store one convolution kernel of n rows and n columns, further kernels can be stored in the following columns in the same way, each row can store multiple convolution kernel parameters, and the convolution kernel parameters of one convolutional layer can be stored in one or more rows; the output signals of the SRAM module are inverted by skewed inverters and then OR-ed, which realizes the XOR operation between the input signals and the signals stored in the SRAM module.
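For illustration, the kernel flattening described above can be sketched in software as follows; this is a minimal sketch under stated assumptions, and names such as flatten_kernels, row_width and the array shapes are illustrative, not part of the patented circuit.

```python
import numpy as np

def flatten_kernels(kernels, row_width):
    """Pack n x n binary kernels into memory rows, one kernel per (n*n)-column slot."""
    flat = [k.reshape(-1) for k in kernels]      # each kernel -> 1-D vector of length n*n
    slot = flat[0].size                          # n*n columns used per kernel
    per_row = row_width // slot                  # kernels that fit in one memory row
    rows = []
    for i in range(0, len(flat), per_row):
        row = np.zeros(row_width, dtype=np.uint8)
        for j, k in enumerate(flat[i:i + per_row]):
            row[j * slot:(j + 1) * slot] = k     # first n*n columns hold the first kernel, and so on
        rows.append(row)
    return np.stack(rows)

# Example: two 3x3 binary kernels packed into rows of 32 columns
k0 = np.random.randint(0, 2, (3, 3))
k1 = np.random.randint(0, 2, (3, 3))
memory_rows = flatten_kernels([k0, k1], row_width=32)
```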
Furthermore, the shift register module stores the input data of the convolutional neural network and can shift out the corresponding input data used for the convolution operation of the in-memory computing module.
Further, the controller module controls the shift register module to output the corresponding data; controls the enabling and disabling of the pre-charge module; and controls the XOR operation between the output data of the shift register and the convolution kernel parameters stored in the corresponding row of the in-memory computing module; this control function is implemented by a decoder.
Further, the detection conversion module detects the number of '1's in the output data of the in-memory computing module; the converted result is the bit width of that output data minus twice the detected count. The binarized convolutional neural network here binarizes the weights and hidden-layer activation values to 0 or 1, and after the in-memory computing module and the detection conversion module the realized function is equivalent to a binarized convolutional neural network that binarizes the weights and hidden-layer activation values to -1 or 1.
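As a hedged illustration of this conversion (the function and variable names below are assumptions, not the patent's notation): for vectors encoded as 0/1 in place of -1/+1, the ±1 dot product equals the bit width minus twice the population count of the XOR, which is exactly the quantity produced by the detection conversion module.

```python
import numpy as np

def xor_convolution_term(a_bits, w_bits):
    """Equivalent of a +/-1 dot product computed from 0/1 encodings via XOR.

    a_bits, w_bits: 0/1 arrays encoding activations and weights (0 -> -1, 1 -> +1).
    Returns bit_width - 2 * popcount(a XOR w), which equals the +/-1 dot product.
    """
    xor = np.bitwise_xor(a_bits, w_bits)
    return a_bits.size - 2 * int(xor.sum())

# Sanity check against the conventional +/-1 convolution term
a = np.random.randint(0, 2, 9)    # one 3x3 window, flattened
w = np.random.randint(0, 2, 9)    # one 3x3 kernel, flattened
conventional = int(np.dot(2 * a - 1, 2 * w - 1))
assert xor_convolution_term(a, w) == conventional
```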
Further, the normalization module adds the convolution results of different convolution kernels according to weights to obtain a normalized result.
Furthermore, the activation function module adds a nonlinear factor to the neural network, which overcomes the insufficient expressive and classification capability of a linear model.
Further, the invention also provides the convolution process of the binarized convolutional neural network in-memory computing accelerator based on the FD-SOI process, which includes:
step 1, a controller module sends out a control instruction to control a shift register to output corresponding data to an input and inverting input module of a calculation module in a memory;
step 2, the input and inverting-input module transmits the data to the SRAM modules in the same column;
step 3, the controller module enables the pre-charge module, the pre-charge capacitors are charged so that the BL and BLB potentials are high, and the pre-charge module is then turned off;
step 4, the controller module selects one of the signal lines VWL1-VWLc and drives it to a high potential, i.e., the row of SRAM modules connected to that signal line is enabled, and a NAND operation is completed between the data transmitted in step 2 and the data stored in those SRAM modules;
step 5, the SRAM calculation results are transmitted to the skewed inverters, and the skewed inverter outputs are transmitted to the OR gates; steps 4-5 complete the XOR operation between the data transmitted in step 2 and the data stored in the SRAM modules;
step 6, the OR gate result of each in-memory computing module is transmitted to the detection conversion module; the binarized convolutional neural network binarizes the weights and hidden-layer activations to 0 or 1, and after the in-memory computing module and the detection conversion module the realized function is equivalent to a binarized convolutional neural network that binarizes the weights and hidden-layer activations to -1 or 1;
step 7, transmitting output results of all detection conversion modules to a normalization module;
step 8, transmitting the output result of the normalization module to an activation function module;
and step 9, the output result of the activation function module is transmitted to the shift register module for storage; if the convolution operation is not finished, jump to step 1; otherwise, the process ends.
The invention has the beneficial effects that:
the core of the method provided by the invention is that the exclusive OR operation of data is realized by utilizing the adjustment effect of the FD-SOI-MOSFET back gate voltage on the threshold voltage of the FD-SOI-MOSFET. On the premise of adopting calculation in the memory, compared with the convolution process of the traditional convolution neural network, the convolution process is completed by utilizing exclusive-or operation, the high precision is kept, the convolution processing speed of the neural network is greatly improved, the parameter storage space of the neural network is saved, data transmission is realized, and the operation power consumption is reduced.
Drawings
FIG. 1 is a schematic diagram of a memory computing accelerator for a binary convolution neural network based on FD-SOI technology according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of CNN in FIG. 1;
FIG. 3 is a schematic diagram of the SRAM module of FIG. 1;
FIG. 4 is a schematic diagram of the FD-SOI-MOSFET of FIG. 3;
FIG. 5 is a schematic diagram of the threshold of the FD-SOI-MOSFET of FIG. 3;
FIG. 6 is a timing diagram of the NAND operation performed on the data by the circuit of FIG. 3;
FIG. 7 is the truth table of the NAND operation performed on the data by the circuit of FIG. 3;
FIG. 8 is a schematic diagram of the skewed inverter of FIG. 1;
FIG. 9 is a schematic diagram of the input-output transfer curve of the skewed inverter of FIG. 8;
FIG. 10 is a convolution flow chart of a calculation accelerator in a binarization convolution neural network memory based on FD-SOI technology according to the present invention.
Detailed Description
The present invention is described in detail below with reference to the attached drawings so that those skilled in the art can better understand the present invention.
In studying existing binarized convolutional neural networks, it is found that the convolution process uses multiplications and additions; the multiplications in particular consume a large amount of storage area, reduce the operation speed and generate considerable power consumption, and these drawbacks greatly degrade the performance of the binarized convolutional neural network.
On the basis of the prior art, the invention proposes a convolution algorithm and implements it with a circuit, replacing the conventional convolution calculation with XOR and related operations.
In order to achieve the above purpose, the invention provides a binarized convolutional neural network in-memory computing accelerator based on the FD-SOI process, comprising:
the in-memory computing module is used for storing convolution kernel parameters of the convolution neural network and completing convolution processing on input data;
a shift register module for storing convolution neural network input data and having a shift function;
a controller module for logically controlling the shift register module and the in-memory computing module;
a detection conversion module for converting the calculation result of the in-memory computing module into the conventional convolution calculation result;
the normalization module is used for adding the convolution results of different convolution kernels according to weights;
and an activation function module for adding a nonlinear factor to the neural network.
To make the objects, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail by specific embodiments with reference to the attached drawings, and it should be understood that the specific embodiments described herein are only for explaining the present invention and are not intended to limit the present invention.
As shown in fig. 1, the in-memory computing module 17 stores the convolution kernel parameters of the convolutional neural network after they have been flattened to one dimension: the first (n × n) columns of each row can store one convolution kernel of n rows and n columns, further kernels can be stored in the following columns in the same way, each row can store multiple convolution kernel parameters, and the convolution kernel parameters of one convolutional layer can be stored in one or more rows. The shift register module 3 stores the input data of the convolutional neural network and can shift out the corresponding input data 4 used for the convolution operation of the in-memory computing module 17. The controller module 1 controls the shift register module 3 to feed the corresponding CNN input data into the in-memory computing module 17; each column receives one input datum, and the input and inverting-input module 5 delivers the input data 6 and its inverted data 6 to the SRAM modules 12 of that column. Before the data operation, the pre-charge capacitors 10 are charged by the pre-charge module 8 so that the BL potential 9 and the BLB potential 9 are at a high potential, after which the pre-charge module 8 is turned off. The controller module 1 then enables one row of SRAM modules 12 by controlling the VWL signal 11 (this part of the control function is implemented by the decoder). The outputs 6 of the selected, enabled row of SRAM modules 12 pass through the skewed inverters 13 and the OR gates 15, which realizes the XOR operation between the input data and the stored data. In this example the CNN implemented by the circuit binarizes the weights and hidden-layer activation values to 0 or 1, whereas the conventional binarized CNN binarizes them to -1 or 1; the calculation result 18 of the in-memory computing module 17 is therefore input to the detection conversion module 19, so that the XOR-based convolution of this example becomes equivalent to the convolution of the conventional CNN, and the result 20 of the corresponding convolution kernel is obtained. The normalization module 21 adds the results 20 of the different convolution kernels according to weights to obtain the normalized result 22. Through the activation function module 23, the normalized result 22 adds a nonlinear factor to the neural network, which overcomes the insufficient expressive and classification capability of a linear model.
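To make the data flow described above concrete, the following is a minimal software sketch of one binarized convolution layer, modeling the shift register window, the XOR-based in-memory computation, the detection conversion, the normalization and the activation as plain array operations; all function and variable names are illustrative assumptions, not the circuit itself.

```python
import numpy as np

def binarized_conv_layer(inputs, kernels, weights):
    """Toy model of the accelerator data flow for one binarized convolution layer.

    inputs  : 2-D array of 0/1 activations (0 encodes -1, 1 encodes +1)
    kernels : list of n x n 0/1 convolution kernels stored "in memory"
    weights : per-kernel weights used by the normalization module
    """
    n = kernels[0].shape[0]
    h, w = inputs.shape
    out = np.zeros((h - n + 1, w - n + 1))
    flat_kernels = [k.reshape(-1) for k in kernels]          # kernels flattened to one dimension
    for i in range(h - n + 1):
        for j in range(w - n + 1):
            window = inputs[i:i + n, j:j + n].reshape(-1)    # shift register presents one window
            acc = 0.0
            for wk, k in zip(weights, flat_kernels):
                xor = np.bitwise_xor(window, k)               # in-memory XOR of input and stored kernel
                conv = window.size - 2 * int(xor.sum())       # detection conversion: bit width - 2 * popcount
                acc += wk * conv                              # normalization: weighted sum over kernels
            out[i, j] = 1 if acc >= 0 else 0                  # activation: binarize back to 0/1
    return out

# Example usage with a random 6x6 binary input and two 3x3 binary kernels
x = np.random.randint(0, 2, (6, 6))
ks = [np.random.randint(0, 2, (3, 3)) for _ in range(2)]
y = binarized_conv_layer(x, ks, weights=[0.5, 0.5])
```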
Fig. 2 shows an example of a CNN: the input is a picture, the network contains two convolutional layers and two pooling layers, and its fully connected layers are also shown.
As shown in fig. 3, two CMOS inverters 26-27 are connected end to end and thereby store one bit of data B, with B' being the inverse of B. The output of each of the two inverters 26-27 is connected to an FD-SOI-MOSFET 28, where A and A' are the input data 6 and its inverted data, respectively, VWL is the select-enable signal 11 that turns the FD-SOI-MOSFETs 28 on and off, and the BL potential 9 and the BLB potential 9 are the two outputs of the SRAM module 12. The NAND operation between the input signal and the stored signal (see fig. 6 and fig. 7) is completed through the modulation of the FD-SOI-MOSFET threshold voltage by its back-gate signal (see fig. 5).
Fig. 4 shows a cross-sectional view of an n-type FD-SOI-MOSFET. Compared with a conventional MOSFET, the doped regions and the channel (an extremely thin silicon film) are separated from the back gate by a buried oxide layer, and the threshold voltage is adjusted through the action of the back-gate voltage on the channel.
Fig. 5 shows the modulation of the threshold voltage by the back-gate voltage: the larger the back-gate voltage, the smaller the threshold voltage.
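As a hedged aside that is not stated in the patent text, the trend in fig. 5 is commonly summarized by a first-order back-gate (body-bias) model, Vth(VBG) ≈ Vth0 − γ·VBG, where Vth0 is the threshold voltage at zero back-gate bias and γ is an assumed device-dependent back-gate coupling coefficient; a larger back-gate voltage therefore gives a smaller threshold voltage, consistent with the figure.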
As shown in fig. 6, before the in-memory computing module 17 performs an operation, the pre-charge module 8 is enabled and the pre-charge capacitors 10 are charged so that the BL potential 9 and the BLB potential 9 are at a high potential; the pre-charge module 8 is then turned off. The controller module 1 then enables one row of SRAM modules 12 by driving the VWL signal 11, at which point the two FD-SOI-MOSFETs 28 of each selected SRAM module 12 are turned on, where A is the input data 6 and B is the stored data (see fig. 3). When A is 0 the threshold voltage is large, and when A is 1 the threshold voltage is small; with a large threshold voltage the pre-charge capacitor discharges more slowly than with a small threshold voltage. When B is 0 the pre-charge capacitor discharges quickly and its potential drops to a very low level, and when B is 1 the pre-charge capacitor discharges slowly and its potential drops only slightly. Combining these four cases, the potential BL 9 of the pre-charge capacitor 10 changes as shown in the figure: when resolved against the switching level 40, curves 32-34 read as '1' and curve 35 reads as '0'. The change of the potential BLB 9 of the pre-charge capacitor 10 is obtained in the same way: when resolved against the switching level 40, curves 36-38 read as '1' and curve 39 reads as '0'.
As shown in fig. 7, combining the above analysis of fig. 6, the truth table of the SRAM module 12 can be derived. From the truth table, the function realized by the SRAM module 12 of fig. 3 is BL = (A·B')' and BLB = (A'·B)'.
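The following brief sketch (an illustrative assumption, not the patent's circuit description) enumerates this truth table in software and checks that inverting BL and BLB with the skewed inverters and OR-ing the results yields A XOR B:

```python
# Enumerate the logic attributed to the SRAM module: BL = (A AND (NOT B))', BLB = ((NOT A) AND B)'.
# Inverting both with the skewed inverters and OR-ing the results recovers A XOR B.
for A in (0, 1):
    for B in (0, 1):
        BL = 1 - (A & (1 - B))          # (A . B')'
        BLB = 1 - ((1 - A) & B)         # (A' . B)'
        xor = (1 - BL) | (1 - BLB)      # skewed inverters followed by an OR gate
        assert xor == A ^ B
        print(A, B, BL, BLB, xor)
```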
Fig. 8 shows the circuit of the skewed inverter 13 of fig. 1: a CMOS inverter composed of two FD-SOI-MOSFETs 41-42 whose back-gate voltages are Vp and Vn, respectively. The key to realizing the function of fig. 6 is the switching level 40 shown there; by adjusting Vp and Vn, the switching level of the skewed inverter 13 can be set to the switching level 40 shown in fig. 6.
Fig. 9 shows the input curve 45 and output curve 46 of the skewed inverter 13 of fig. 8, together with its switching level 40.
Fig. 10 is the convolution flow chart of the binarized convolutional neural network in-memory computing accelerator based on the FD-SOI process, which includes:
step S1, the controller module sends out control instruction to control the shift register to output corresponding data to the input and inverting input module of the in-memory computing module;
step S2, the input and inverting-input module transmits the data to the SRAM modules in the same column;
step S3, the controller module enables the pre-charge module, the pre-charge capacitors are charged so that the BL and BLB potentials are high, and the pre-charge module is then turned off;
step S4, the controller module selects one of the signal lines VWL1-VWLc and drives it to a high potential, i.e., the row of SRAM modules connected to that signal line is enabled, and a NAND operation is completed between the data transmitted in step S2 and the data stored in those SRAM modules;
step S5, the SRAM calculation results are transmitted to the skewed inverters, and the skewed inverter outputs are transmitted to the OR gates; steps S4-S5 complete the XOR operation between the data transmitted in step S2 and the data stored in the SRAM modules;
step S6, the OR gate result of each in-memory computing module is transmitted to the detection conversion module; the binarized convolutional neural network binarizes the weights and hidden-layer activations to 0 or 1, and after the in-memory computing module and the detection conversion module the realized function is equivalent to a binarized convolutional neural network that binarizes the weights and hidden-layer activations to -1 or 1;
step S7, transmitting the output results of all detection conversion modules to a normalization module;
step S8, the output result of the normalization module is transmitted to the activation function module;
and step S9, the output result of the activation function module is transmitted to the shift register module for storage; if the convolution operation is not finished, jump to step S1; otherwise, the process ends.

Claims (2)

1. A binarized convolutional neural network in-memory computing accelerator based on the FD-SOI process, characterized by comprising an in-memory computing module, a shift register module, a controller module, a detection conversion module, a normalization module and an activation function module; wherein:
the shift register module is used for storing input data of the convolutional neural network and has a shift function, and the output of the shift register module is connected with the input of the in-memory computing module;
the in-memory computing module is used for storing the convolution kernel parameters of the convolutional neural network and completing the convolution processing of the input data, and the output of the in-memory computing module is connected with the input of the detection conversion module; the in-memory computing module consists of an SRAM module, an input and inverting-input module and a pre-charge module;
the SRAM module is a module constructed from 6 MOSFETs that stores one bit of data, and comprises two CMOS inverters composed of two P-type MOSFETs and two N-type MOSFETs, the two CMOS inverters being cross-coupled (connected end to end); this structure stores one bit of data; an FD-SOI-MOSFET is connected to the output of each of the two inverters, and the back-gate signals of the FD-SOI-MOSFETs are connected respectively to the input signal and to the inverted input signal; because the back-gate signal of an FD-SOI-MOSFET modulates its threshold voltage, this modulation is used to complete the NAND operation between the input signal and the stored signal;
the input and inverting input module provides an input signal and an inverting signal of the input signal for the SRAM module;
the pre-charging module charges a pre-charging capacitor before the calculation module in the memory performs exclusive-or operation;
the in-memory computing module is used for storing the convolution kernel parameters of the convolutional neural network after they have been flattened to one dimension: the first (n × n) columns of each row can store one convolution kernel of n rows and n columns, further kernels can be stored in the following columns in the same way, each row can store multiple convolution kernel parameters, and the convolution kernel parameters of one convolutional layer are stored in one or more rows; the output signals of the SRAM module are inverted by skewed inverters and then OR-ed, which realizes the XOR operation between the input signals and the signals stored in the SRAM module;
the detection conversion module is used for converting the calculation result of the in-memory computing module into a convolution calculation result, and the output of the detection conversion module is connected with the input of the normalization module; the specific working mode is as follows: the number of '1's in the output data of the in-memory computing module is detected, and the converted result is the bit width of that output data minus twice the detected count, which is equivalent to binarizing the weights and hidden-layer activation values to -1 or 1;
the normalization module is used for adding the convolution results of different convolution kernels according to weights, and the output of the normalization module is connected with the input of the activation function module;
the activation function module is used for adding a nonlinear factor to the neural network, and the output of the activation function module is respectively connected with the input of the controller module and the input of the shift register module;
the controller module is used for carrying out logic control on the shift register module and the in-memory computing module, and the specific control mode is as follows: controlling the shift register module to output corresponding data; controlling the enabling and closing of the pre-charging module; and controlling the output data of the shift register and convolution kernel parameters stored by the corresponding row of the calculation module in the memory to carry out exclusive OR operation.
2. The binarized convolutional neural network in-memory computing accelerator based on the FD-SOI process according to claim 1, wherein the convolution process of the in-memory computing accelerator is:
step 1, a controller module sends out a control instruction to control a shift register to output corresponding data to an input and inverting input module of a calculation module in a memory;
step 2, the input and inverting-input module transmits the data to the SRAM modules in the same column;
step 3, the controller module enables the pre-charge module, the pre-charge capacitors are charged so that the output ends of the SRAM modules are at a high potential, and the pre-charge module is then turned off;
step 4, the controller module selects one of the SRAM module enable signal lines and drives it to a high potential, i.e., the row of SRAM modules connected to that signal line is enabled, and a NAND operation is completed between the data transmitted in step 2 and the data stored in those SRAM modules;
step 5, the SRAM calculation results are transmitted to the skewed inverters, and the skewed inverter outputs are transmitted to the OR gates; steps 4-5 complete the XOR operation between the data transmitted in step 2 and the data stored in the SRAM modules;
step 6, the OR gate result of each in-memory computing module is transmitted to the detection conversion module; the binarized convolutional neural network binarizes the weights and hidden-layer activations to 0 or 1, and after the in-memory computing module and the detection conversion module the realized function is equivalent to a binarized convolutional neural network that binarizes the weights and hidden-layer activations to -1 or 1;
step 7, transmitting output results of all detection conversion modules to a normalization module;
step 8, transmitting the output result of the normalization module to an activation function module;
and step 9, the output result of the activation function module is transmitted to the shift register module for storage; if the convolution operation is not finished, jump to step 1; otherwise, the process ends.
CN201910068644.7A 2019-01-24 2019-01-24 FD-SOI (fully depleted silicon-on-insulator) process-based binary convolutional neural network in-memory computing accelerator Active CN109784483B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910068644.7A CN109784483B (en) 2019-01-24 2019-01-24 FD-SOI (fully depleted silicon-on-insulator) process-based binary convolutional neural network in-memory computing accelerator

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910068644.7A CN109784483B (en) 2019-01-24 2019-01-24 FD-SOI (fully depleted silicon-on-insulator) process-based binary convolutional neural network in-memory computing accelerator

Publications (2)

Publication Number Publication Date
CN109784483A CN109784483A (en) 2019-05-21
CN109784483B true CN109784483B (en) 2022-09-09

Family

ID=66502341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910068644.7A Active CN109784483B (en) 2019-01-24 2019-01-24 FD-SOI (fully depleted silicon-on-insulator) process-based binary convolutional neural network in-memory computing accelerator

Country Status (1)

Country Link
CN (1) CN109784483B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985602A (en) * 2019-05-24 2020-11-24 Huawei Technologies Co., Ltd. Neural network computing device, method and computing device
CN110277121B (en) * 2019-06-26 2020-11-27 University of Electronic Science and Technology of China Multi-bit memory integrated SRAM based on substrate bias effect and implementation method
CN110414677B (en) 2019-07-11 2021-09-03 Southeast University Memory computing circuit suitable for full-connection binarization neural network
CN110597555B (en) * 2019-08-02 2022-03-04 Beihang University Nonvolatile memory computing chip and operation control method thereof
CN110970071B (en) 2019-09-26 2022-07-05 ShanghaiTech University Memory cell of low-power consumption static random access memory and application
CN111126579B (en) * 2019-11-05 2023-06-27 Fudan University In-memory computing device suitable for binary convolutional neural network computation
CN110991623B (en) * 2019-12-20 2024-05-28 Institute of Automation, Chinese Academy of Sciences Neural network operation system based on digital-analog mixed neuron
CN113344170B (en) * 2020-02-18 2023-04-25 Hangzhou Zhicun Intelligent Technology Co., Ltd. Neural network weight matrix adjustment method, write-in control method and related device
CN111967586B (en) * 2020-07-15 2023-04-07 Peking University Chip for pulse neural network memory calculation and calculation method
CN112036552B (en) * 2020-10-16 2022-11-08 Suzhou Inspur Intelligent Technology Co., Ltd. Convolutional neural network operation method and device
CN112487750B (en) * 2020-11-30 2023-06-16 Xi'an Microelectronics Technology Institute Convolution acceleration computing system and method based on in-memory computing
CN113159276B (en) * 2021-03-09 2024-04-16 Peking University Model optimization deployment method, system, equipment and storage medium
CN113190208B (en) * 2021-05-07 2022-12-27 University of Electronic Science and Technology of China Storage and calculation integrated unit, state control method, integrated module, processor and equipment

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1577871A (en) * 2003-06-30 2005-02-09 Toshiba Corporation Semiconductor storage device and semiconductor integrated circuit
JP2005079127A (en) * 2003-08-29 2005-03-24 Foundation For The Promotion Of Industrial Science Soi-mosfet
CN102088027A (en) * 2009-12-08 2011-06-08 S.O.I.Tec Silicon On Insulator Technologies Circuit of uniform transistors on SeOI with buried back control gate beneath the insulating film
JP2013242960A (en) * 2013-07-01 2013-12-05 Hitachi Ltd Semiconductor device
CN104617925A (en) * 2013-11-01 2015-05-13 NXP B.V. Latch circuit
WO2016057973A1 (en) * 2014-10-10 2016-04-14 Schottky Lsi, Inc. Super cmos (scmostm) devices on a microelectronic system
CN105681628A (en) * 2016-01-05 2016-06-15 Xi'an Jiaotong University Convolution network arithmetic unit, reconfigurable convolutional neural network processor and image de-noising method of the reconfigurable convolutional neural network processor
CN106228240A (en) * 2016-07-30 2016-12-14 Fudan University Deep convolutional neural network implementation method based on FPGA
CN107239824A (en) * 2016-12-05 2017-10-10 Beijing Deephi Intelligent Technology Co., Ltd. Apparatus and method for realizing a sparse convolutional neural network accelerator
CN207440765U (en) * 2017-01-04 2018-06-01 STMicroelectronics S.r.l. System on chip and mobile computing device
CN108268942A (en) * 2017-01-04 2018-07-10 STMicroelectronics S.r.l. Configurable accelerator framework
CN108268940A (en) * 2017-01-04 2018-07-10 STMicroelectronics S.r.l. Tool for creating a reconfigurable interconnect framework
EP3346426A1 (en) * 2017-01-04 2018-07-11 STMicroelectronics Srl Reconfigurable interconnect, corresponding system and method
EP3346425A1 (en) * 2017-01-04 2018-07-11 STMicroelectronics Srl Hardware accelerator engine and method
EP3346427A1 (en) * 2017-01-04 2018-07-11 STMicroelectronics Srl Configurable accelerator framework, system and method
CN108288616A (en) * 2016-12-14 2018-07-17 iCometrue Co., Ltd. Chip package
CN108805270A (en) * 2018-05-08 2018-11-13 Huazhong University of Science and Technology A convolutional neural network system based on memory
CN109032781A (en) * 2018-07-13 2018-12-18 Chongqing University of Posts and Telecommunications An FPGA parallel system for a convolutional neural network algorithm

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7759714B2 (en) * 2007-06-26 2010-07-20 Hitachi, Ltd. Semiconductor device
US9111638B2 (en) * 2012-07-13 2015-08-18 Freescale Semiconductor, Inc. SRAM bit cell with reduced bit line pre-charge voltage
US8947970B2 (en) * 2012-07-13 2015-02-03 Freescale Semiconductor, Inc. Word line driver circuits and methods for SRAM bit cell with reduced bit line pre-charge voltage
KR20140016482A (en) * 2012-07-30 2014-02-10 에스케이하이닉스 주식회사 Sense amplifier circuit and memory device oncluding the same
FR3024917B1 (en) * 2014-08-13 2016-09-09 St Microelectronics Sa METHOD FOR MINIMIZING THE OPERATING VOLTAGE OF A MEMORY POINT OF SRAM TYPE
US10418369B2 (en) * 2015-10-24 2019-09-17 Monolithic 3D Inc. Multi-level semiconductor memory device and structure
US10014318B2 (en) * 2015-10-24 2018-07-03 Monolithic 3D Inc Semiconductor memory device, structure and methods
US10469076B2 (en) * 2016-11-22 2019-11-05 The Curators Of The University Of Missouri Power gating circuit utilizing double-gate fully depleted silicon-on-insulator transistor

Patent Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1577871A (en) * 2003-06-30 2005-02-09 Toshiba Corporation Semiconductor storage device and semiconductor integrated circuit
JP2005079127A (en) * 2003-08-29 2005-03-24 Foundation For The Promotion Of Industrial Science Soi-mosfet
CN102088027A (en) * 2009-12-08 2011-06-08 S.O.I.Tec Silicon On Insulator Technologies Circuit of uniform transistors on SeOI with buried back control gate beneath the insulating film
JP2013242960A (en) * 2013-07-01 2013-12-05 Hitachi Ltd Semiconductor device
CN104617925A (en) * 2013-11-01 2015-05-13 NXP B.V. Latch circuit
WO2016057973A1 (en) * 2014-10-10 2016-04-14 Schottky Lsi, Inc. Super cmos (scmostm) devices on a microelectronic system
CN105681628A (en) * 2016-01-05 2016-06-15 Xi'an Jiaotong University Convolution network arithmetic unit, reconfigurable convolutional neural network processor and image de-noising method of the reconfigurable convolutional neural network processor
CN106228240A (en) * 2016-07-30 2016-12-14 Fudan University Deep convolutional neural network implementation method based on FPGA
CN107239824A (en) * 2016-12-05 2017-10-10 Beijing Deephi Intelligent Technology Co., Ltd. Apparatus and method for realizing a sparse convolutional neural network accelerator
CN108288616A (en) * 2016-12-14 2018-07-17 iCometrue Co., Ltd. Chip package
CN108268942A (en) * 2017-01-04 2018-07-10 STMicroelectronics S.r.l. Configurable accelerator framework
CN108268943A (en) * 2017-01-04 2018-07-10 STMicroelectronics S.r.l. Hardware accelerator engine
CN108268941A (en) * 2017-01-04 2018-07-10 STMicroelectronics S.r.l. Deep convolutional network heterogeneous architecture
CN108268940A (en) * 2017-01-04 2018-07-10 STMicroelectronics S.r.l. Tool for creating a reconfigurable interconnect framework
EP3346426A1 (en) * 2017-01-04 2018-07-11 STMicroelectronics Srl Reconfigurable interconnect, corresponding system and method
EP3346425A1 (en) * 2017-01-04 2018-07-11 STMicroelectronics Srl Hardware accelerator engine and method
EP3346427A1 (en) * 2017-01-04 2018-07-11 STMicroelectronics Srl Configurable accelerator framework, system and method
CN207440765U (en) * 2017-01-04 2018-06-01 STMicroelectronics S.r.l. System on chip and mobile computing device
CN207731321U (en) * 2017-01-04 2018-08-14 STMicroelectronics S.r.l. Hardware accelerator engine
CN207993065U (en) * 2017-01-04 2018-10-19 STMicroelectronics S.r.l. Configurable accelerator framework device and system for a deep convolutional neural network
CN108805270A (en) * 2018-05-08 2018-11-13 Huazhong University of Science and Technology A convolutional neural network system based on memory
CN109032781A (en) * 2018-07-13 2018-12-18 Chongqing University of Posts and Telecommunications An FPGA parallel system for a convolutional neural network algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An in-memory VLSI architecture for convolutional neural networks; KANG M et al.; IEEE Journal on Emerging and Selected Topics in Circuits and Systems; 2018-12-31; Vol. 8, No. 3; pp. 494-505 *
Research on FPGA hardware acceleration platforms for deep learning; Hong Qifei; China Master's Theses Full-text Database, Information Science and Technology; 2018-09-15 (No. 9); pp. I135-287 *

Also Published As

Publication number Publication date
CN109784483A (en) 2019-05-21

Similar Documents

Publication Publication Date Title
CN109784483B (en) FD-SOI (fully depleted silicon-on-insulator) process-based binary convolutional neural network in-memory computing accelerator
Deng et al. DrAcc: A DRAM based accelerator for accurate CNN inference
US11151439B2 (en) Computing in-memory system and method based on skyrmion racetrack memory
CN112151091B (en) 8T SRAM unit and memory computing device
US11270764B2 (en) Two-bit memory cell and circuit structure calculated in memory thereof
US9697877B2 (en) Compute memory
CN110942792B (en) Low-power-consumption low-leakage SRAM (static random Access memory) applied to storage and calculation integrated chip
Kang et al. An energy-efficient memory-based high-throughput VLSI architecture for convolutional networks
US20230196079A1 (en) Enhanced dynamic random access memory (edram)-based computing-in-memory (cim) convolutional neural network (cnn) accelerator
US11500960B2 (en) Memory cell for dot product operation in compute-in-memory chip
US11456030B2 (en) Static random access memory SRAM unit and related apparatus
CN110941185B (en) Double-word line 6TSRAM unit circuit for binary neural network
CN112885386A (en) Memory control method and device and ferroelectric memory
CN114743580B (en) Charge sharing memory computing device
Bose et al. A 75kb SRAM in 65nm CMOS for in-memory computing based neuromorphic image denoising
Tabrizchi et al. Appcip: Energy-efficient approximate convolution-in-pixel scheme for neural network acceleration
CN114999544A (en) Memory computing circuit based on SRAM
CN115691613B (en) Charge type memory internal calculation implementation method based on memristor and unit structure thereof
Tabrizchi et al. TizBin: A low-power image sensor with event and object detection using efficient processing-in-pixel schemes
Shin et al. A PVT-robust customized 4T embedded DRAM cell array for accelerating binary neural networks
CN114093394B (en) Rotatable internal computing circuit and implementation method thereof
US20230045840A1 (en) Computing device, memory controller, and method for performing an in-memory computation
KR20240035492A (en) Folding column adder architecture for in-memory digital computing.
Xia et al. Transformers only look once with nonlinear combination for real-time object detection
Lin et al. Ensemble cross‐stage partial attention network for image classification

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant