CN112597079B - Data write-back system of convolutional neural network accelerator - Google Patents


Info

Publication number
CN112597079B
CN112597079B (application CN202011527851.3A)
Authority
CN
China
Prior art keywords
unit
write
buffer
data
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011527851.3A
Other languages
Chinese (zh)
Other versions
CN112597079A (en)
Inventor
王天一 (Wang Tianyi)
边立剑 (Bian Lijian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Anlu Information Technology Co ltd
Original Assignee
Shanghai Anlu Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Anlu Information Technology Co ltd filed Critical Shanghai Anlu Information Technology Co ltd
Priority to CN202011527851.3A priority Critical patent/CN112597079B/en
Publication of CN112597079A publication Critical patent/CN112597079A/en
Application granted granted Critical
Publication of CN112597079B publication Critical patent/CN112597079B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06F13/1605: Handling requests for interconnection or transfer for access to memory bus, based on arbitration (G: Physics; G06: Computing; G06F: Electric digital data processing)
    • G06F13/1668: Details of memory controller
    • G06F13/4063: Device-to-bus coupling
    • G06N3/063: Physical realisation, i.e. hardware implementation, of neural networks using electronic means (G06N: Computing arrangements based on specific computational models)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Neurology (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application provides a data write-back system of a convolutional neural network accelerator, comprising an input buffer module, N levels of write-back nodes, and a write-back control module. The input buffer module is connected with the computing units to receive data; the uppermost-level write-back nodes are connected with the input buffer module; each next-level write-back node is connected with at least two previous-level write-back nodes, N being a natural number greater than 1; and the write-back control module is connected with the lowest-level write-back node to receive data from it and transmit the data to the bus. Because this tree structure organizes the write-back nodes hierarchically, the system improves the transmission efficiency of data write-back.

Description

Data write-back system of convolutional neural network accelerator
Technical Field
The application relates to the technical field of deep learning, in particular to a data write-back system of a convolutional neural network accelerator.
Background
In the prior art, a cloud field programmable gate array (Field Programmable Gate Array, FPGA) provides far more logic and memory resources than an edge device. However, the neural network models run in the cloud are often huge and generate a large number of intermediate results during operation, and the on-chip random access memory (Random Access Memory, RAM) resources of the FPGA platform usually cannot buffer all of this data, so the data must be transmitted to off-chip memory. The prior art cannot meet the transmission requirements of high-throughput concurrent data, and its data transmission efficiency is low.
Therefore, there is a need for a new data write-back system for a convolutional neural network accelerator that solves the above-mentioned problems in the prior art.
Disclosure of Invention
The application aims to provide a data write-back system of a convolutional neural network accelerator, which improves the transmission efficiency of the data write-back of the convolutional neural network accelerator.
In order to achieve the above object, the data write-back system of the convolutional neural network accelerator of the present application includes:
the input buffer module is used for being connected with the computing unit to receive data;
N-level write-back nodes, wherein the uppermost-level write-back nodes are connected with the input buffer module, each next-level write-back node is connected with at least two previous-level write-back nodes, and N is a natural number greater than 1;
and the write-back control module is connected with the write-back node of the lowest stage so as to receive data from the write-back node of the lowest stage and transmit the data to the bus.
The data write-back system of the convolutional neural network accelerator has the following beneficial effects: the system comprises N levels of write-back nodes, the uppermost-level write-back nodes are connected with the input buffer module, each next-level write-back node is connected with at least two previous-level write-back nodes, and N is a natural number greater than 1. Because this tree structure organizes the write-back nodes hierarchically, the transmission efficiency of data write-back can be improved.
Preferably, the write-back node includes a first output buffer unit, a selection unit, and at least two receiving buffer units, wherein the output ends of the receiving buffer units are connected with the input ends of the selection unit, and the output end of the selection unit is connected with the input end of the first output buffer unit. The beneficial effects are that: the standardized design of the write-back node is simple, easy to use, and easy to port.
Further preferably, the number of write-back nodes at the previous level matches the number of receiving cache units of the write-back node at the next level. The beneficial effects are that: no receiving cache unit of the next-level write-back node is wasted.
Further preferably, the write-back control module includes an address mapping unit; the data received by the write-back control module from the lowest-level write-back node includes computing unit address information and calculation result data, and the address mapping unit calculates the write-back address from the computing unit address information and the start address information.
Further preferably, the write-back node further includes an arbitration unit and a buffer management unit, where the arbitration unit is connected to the selection unit, and the buffer management unit is connected to the receiving buffer unit and the first output buffer unit respectively. The beneficial effects are that: the collision in the data transmission process can be effectively avoided.
Further preferably, the receiving buffer unit includes a first buffer status unit and a first data buffer unit that are connected to each other, and the first buffer status unit is connected to the buffer management unit. The beneficial effects are that: it is easy to determine whether the first data cache unit holds data.
Further preferably, the first output buffer unit includes a second buffer status unit and a second data buffer unit that are connected to each other, and the second buffer status unit is connected to the buffer management unit. The beneficial effects are that: it is easy to determine whether the second data cache unit holds data.
Further preferably, the cache management units of the interconnected write-back nodes are interconnected. The beneficial effects are that: avoiding data collision.
Further preferably, the input buffer module includes input buffer units, and the number of the input buffer units is adapted to the number of the receiving buffer units of the write-back node at the uppermost level. The beneficial effects are that: and avoiding the waste of the receiving buffer unit of the top-level write-back node.
Further preferably, the input buffer unit includes a buffer control unit, a third data buffer unit, and a second output buffer unit, where the buffer control unit is connected to the calculation unit, the third data buffer unit, and the buffer management unit of the corresponding write-back node, and the third data buffer unit is connected to the second output buffer unit.
Preferably, the number of write-back nodes at the lowest level is 1. The beneficial effects are that: only one piece of data is transmitted to the bus at a time, which avoids data transmission collisions.
Drawings
FIG. 1 is a block diagram illustrating an arbitration unit according to some embodiments of the present application;
FIG. 2 is a block diagram illustrating a receive buffer unit according to some embodiments of the application;
FIG. 3 is a block diagram illustrating a first output buffer unit according to some embodiments of the present application;
FIG. 4 is a block diagram illustrating an input buffer unit according to some embodiments of the present application;
FIG. 5 is a block diagram of a convolutional neural network accelerator data write-back system in accordance with some embodiments of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application. Unless otherwise defined, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs. As used herein, the word "comprising" and the like means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof without precluding other elements or items.
Aiming at the problems in the prior art, an embodiment of the application provides a data write-back system of a convolutional neural network accelerator based on a cloud field programmable gate array (Field Programmable Gate Array, FPGA). The system includes an input buffer module, N levels of write-back nodes, and a write-back control module. The input buffer module is connected with the computing units to receive data; the uppermost-level write-back nodes are connected with the input buffer module, each next-level write-back node is connected with at least two previous-level write-back nodes, and N is a natural number greater than 1; the write-back control module is connected with the lowest-level write-back node to receive data from it and transmit the data to the bus. Preferably, the number of write-back nodes at the lowest level is 1.
In some embodiments, the write-back node includes a first output buffer unit, a selection unit, an arbitration unit, a buffer management unit, and at least two receiving buffer units, where the output ends of the receiving buffer units are connected to the input ends of the selection unit, the output end of the selection unit is connected to the input end of the first output buffer unit, the arbitration unit is connected to the selection unit, and the buffer management unit is connected to the receiving buffer units and the first output buffer unit respectively. Specifically, the arbitration unit is a shift register with at least 2 bits.
FIG. 1 is a block diagram illustrating an arbitration unit according to some embodiments of the present application. Referring to fig. 1, the arbitration unit 212 includes a shift register whose number of bits equals the number of receiving buffer units connected to it. For example, when 4 receiving buffer units are connected to the arbitration unit 212, the shift register includes 4 bits: a first bit 2121, a second bit 2122, a third bit 2123, and a fourth bit 2124. Taking a right shift as an example: in the first clock cycle the first bit 2121 is 1 and the second bit 2122, third bit 2123, and fourth bit 2124 are 0; in the second clock cycle the second bit 2122 is 1 and the other bits are 0; in the third clock cycle the third bit 2123 is 1 and the other bits are 0; in the fourth clock cycle the fourth bit 2124 is 1 and the other bits are 0; four clock cycles form one period. The principle of a left shift is the same as that of a right shift and is not described in detail.
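As a behavioral sketch of the right-shift sequence just described (a Python model, not the RTL; the names are illustrative), the 4-bit one-hot rotation can be written as:

```python
def rotate_right(bits):
    """Rotate a one-hot arbitration vector one position to the right per clock cycle."""
    return bits[-1:] + bits[:-1]

# first clock cycle: the first bit 2121 is 1, the others are 0
state = [1, 0, 0, 0]
history = [state]
for _ in range(3):                       # second, third, and fourth clock cycles
    state = rotate_right(state)
    history.append(state)
# rotating once more returns to [1, 0, 0, 0]: four clock cycles form one period
```

The bit that is 1 indicates which receiving buffer unit the selection unit services in that clock cycle.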
Fig. 2 is a block diagram illustrating the structure of a receiving buffer unit according to some embodiments of the present application. Referring to fig. 2, the receiving buffer unit 211 includes a first buffer status unit 2111 and a first data buffer unit 2112 connected to each other, and the first buffer status unit 2111 is connected to the buffer management unit (not shown). When the first buffer status unit 2111 detects that no data is stored in the first data buffer unit 2112, it feeds this back to the buffer management unit, which marks the first data buffer unit 2112 as 1; when the first buffer status unit 2111 detects that data is stored in the first data buffer unit 2112, it feeds this back to the buffer management unit, which marks the first data buffer unit 2112 as 0.
Fig. 3 is a block diagram illustrating a first output buffer unit according to some embodiments of the present application. Referring to fig. 3, the first output buffer unit 215 includes a second buffer status unit 2151 and a second data buffer unit 2152 connected to each other. The second buffer status unit 2151 is connected to the buffer management unit (not shown), the input end of the second data buffer unit 2152 is connected to the output end of the selection unit (not shown), and the output end of the second data buffer unit 2152 is connected to the input end of a first data buffer unit (not shown) in a receiving buffer unit of the next-level write-back node, or to the input end of the write-back control module (not shown). When the second buffer status unit 2151 detects that no data is stored in the second data buffer unit 2152, it feeds this back to the buffer management unit, which marks the second data buffer unit 2152 as 1; when the second buffer status unit 2151 detects that data is stored in the second data buffer unit 2152, it feeds this back to the buffer management unit, which marks the second data buffer unit 2152 as 0.
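The empty/occupied marking shared by the receiving and output buffer units (mark 1 when the data buffer holds nothing, 0 when it holds data) can be sketched as follows; `BufferUnit` is a hypothetical name introduced here, not a term from the patent:

```python
class BufferUnit:
    """Behavioral model of a data buffer plus its status unit.
    The cache management unit's mark is 1 when empty, 0 when occupied."""
    def __init__(self):
        self.data = None

    @property
    def mark(self):
        return 1 if self.data is None else 0

    def push(self, value):
        assert self.mark == 1, "may only write into an empty buffer"
        self.data = value

    def pop(self):
        assert self.mark == 0, "may only read from an occupied buffer"
        value, self.data = self.data, None
        return value
```

Transfers in the node tree then reduce to checking `mark` before each `push`, which is the collision-avoidance role the cache management unit plays in the patent.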
In some embodiments, the cache management units of the interconnected write-back nodes are interconnected.
Specifically, when the receiving buffer unit in the next-level write-back node is marked as 1 by its cache management unit, that is, no data is stored in that receiving buffer unit, and the output buffer unit in the previous-level write-back node is also marked as 1, the output buffer unit may receive data from the receiving buffer unit selected by the bit whose value is 1 in the arbitration unit.
For example, let the previous-level write-back node be a first write-back node and the next-level write-back node be a second write-back node. The first write-back node includes a first receiving buffer unit, a second receiving buffer unit, a third receiving buffer unit, a fourth receiving buffer unit, a first selection unit, a first arbitration unit, a first cache management unit, and a third output buffer unit. The output ends of the first to fourth receiving buffer units are respectively connected with the four input ends of the first selection unit, the first arbitration unit is connected with the control end of the first selection unit, and the output end of the first selection unit is connected with the third output buffer unit. The first cache management unit is respectively connected with the first to fourth receiving buffer units and the third output buffer unit so as to mark each of them as 1 or 0;
the second write-back node comprises a fifth receiving buffer unit, a sixth receiving buffer unit, a seventh receiving buffer unit, an eighth receiving buffer unit, a second selection unit, a second arbitration unit, a second cache management unit, and a fourth output buffer unit. The output ends of the fifth to eighth receiving buffer units are respectively connected with the four input ends of the second selection unit, the second arbitration unit is connected with the control end of the second selection unit, and the output end of the second selection unit is connected with the fourth output buffer unit. The second cache management unit is respectively connected with the fifth to eighth receiving buffer units and the fourth output buffer unit so as to mark each of them as 1 or 0;
the first write-back node and the second write-back node are connected with each other: specifically, the output end of the third output buffer unit is connected with the input end of the fifth receiving buffer unit, and the first cache management unit is connected with the second cache management unit. When no data is stored in the fifth receiving buffer unit, the second cache management unit feeds back to the first cache management unit that the fifth receiving buffer unit is marked as 1; when no data is stored in the third output buffer unit, the first cache management unit marks the third output buffer unit as 1. At this time, if the first bit of the first arbitration unit is 1, the first receiving buffer unit transmits its stored data to the third output buffer unit through the first selection unit, and the third output buffer unit then transmits the data to the fifth receiving buffer unit.
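The hand-off in this example can be simulated in a few lines; the dict-based buffers and the `step` function are illustrative assumptions, with `None` standing for an empty (mark 1) buffer:

```python
def step(arbiter, recv_bufs, out_buf, next_recv):
    """One clock cycle of a write-back node pair (behavioral sketch)."""
    # forward from the output buffer to the next-level receiving buffer when it is empty
    if next_recv["data"] is None and out_buf["data"] is not None:
        next_recv["data"], out_buf["data"] = out_buf["data"], None
    # load the output buffer from the receiving buffer picked by the arbiter's hot bit
    if out_buf["data"] is None:
        i = arbiter.index(1)
        if recv_bufs[i]["data"] is not None:
            out_buf["data"], recv_bufs[i]["data"] = recv_bufs[i]["data"], None

recv = [{"data": "r0"}, {"data": None}, {"data": None}, {"data": None}]
out = {"data": None}    # third output buffer unit
nxt = {"data": None}    # fifth receiving buffer unit of the second node
step([1, 0, 0, 0], recv, out, nxt)   # r0 moves into the output buffer
step([0, 1, 0, 0], recv, out, nxt)   # r0 is forwarded to the next level
```

Because each move is gated on the destination being empty, no datum is ever overwritten, which is the collision avoidance the patent attributes to the interconnected cache management units.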
Fig. 4 is a block diagram illustrating an input buffer unit according to some embodiments of the application. Referring to fig. 4, the input buffer module includes input buffer units 11, and the number of input buffer units 11 matches the number of receiving buffer units of the uppermost-level write-back nodes. Each input buffer unit 11 includes a buffer control unit 111, a third data buffer unit 112, and a second output buffer unit 113. The buffer control unit 111 is respectively connected to the control end of the computing unit (not shown), the third data buffer unit 112, and the cache management unit (not shown) of the corresponding write-back node; the input end of the third data buffer unit 112 is connected to the data output end of the computing unit; the third data buffer unit 112 is connected to the second output buffer unit 113; and the output end of the second output buffer unit 113 is connected to a first data buffer unit (not shown) in a receiving buffer unit of the uppermost-level write-back node. Specifically, the third data buffer unit 112 is a first-in first-out (First Input First Output, FIFO) memory.
In some embodiments, when the cache management unit of the uppermost-level write-back node feeds back 0 to the cache control unit, that is, the first data cache unit stores data, and the third data cache unit also stores data at this time, the cache control unit sends a non-empty signal to the computing unit so that the computing unit stops working. When the third data cache unit stores no data, the cache control unit takes no action, or the cache control unit sends an empty signal to the computing unit so that the computing unit immediately enters the working state. When the cache management unit of the uppermost-level write-back node feeds back 1 to the cache control unit, that is, no data is stored in the first data cache unit, and the third data cache unit stores data at this time, the second output cache unit reads the data from the third data cache unit.
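The three flow-control cases above can be sketched as one function; the string return values and container types are illustrative assumptions, not part of the patent:

```python
from collections import deque

def cache_control(downstream_mark, fifo, out_buf):
    """Decide what the cache control unit does in one cycle (behavioral sketch).
    downstream_mark: 1 if the downstream first data cache unit is empty, 0 if occupied."""
    if downstream_mark == 0 and fifo:
        return "stall"                    # non-empty signal: the computing unit stops
    if not fifo:
        return "run"                      # nothing buffered: the computing unit keeps working
    out_buf.append(fifo.popleft())        # downstream empty: drain the FIFO into the output buffer
    return "run"
```

This is classic backpressure: the computing unit is throttled exactly when both its local FIFO and the downstream receive buffer are occupied.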
In some embodiments, the number of write-back nodes at the previous level matches the number of receiving cache units of the write-back node at the next level.
In some embodiments, the write-back control module includes an address mapping unit. The data received by the write-back control module from the lowest-level write-back node includes computing unit address information and calculation result data; the address mapping unit calculates the write-back address from the computing unit address information and the start address information by address mapping, and transmits the write-back address together with the calculation result data over the bus to the block random access memory (Block Random Access Memory, BRAM) of the neural network accelerator.
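The patent states only that the write-back address is computed from the computing unit address information and a start address; a linear layout such as the one below is one plausible mapping, offered purely as an illustrative assumption:

```python
def map_write_back_address(start_addr, unit_addr, result_offset, unit_stride):
    """Hypothetical address mapping: each computing unit owns a fixed-size
    region of unit_stride words beginning at start_addr."""
    return start_addr + unit_addr * unit_stride + result_offset
```

For example, with a start address of 0x1000 and a stride of 256 words per computing unit, the result at offset 4 from computing unit 2 would land at 0x1204.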
In some embodiments, the second output buffer unit, the first data buffer unit, and the second data buffer unit in the present application are random access memories (Random Access Memory, RAM).
FIG. 5 is a block diagram of a convolutional neural network accelerator data write-back system in accordance with some embodiments of the present application. Referring to fig. 5, the data write-back system 100 of the convolutional neural network accelerator includes an input buffer module 10, two levels of write-back nodes 20, and a write-back control module 30. The two levels of write-back nodes 20 include first-level write-back nodes 21 and a second-level write-back node 22, where the first-level write-back nodes 21 are the level above the second-level write-back node 22. The input buffer module 10 is connected with the first-level write-back nodes 21, the first-level write-back nodes 21 are connected with the second-level write-back node 22, the second-level write-back node 22 is connected with the write-back control module 30, and the write-back control module 30 is connected with a bus (not labeled in the figure).
Referring to fig. 5, the input buffer module 10 includes 16 input buffer units 11, and the 16 input buffer units 11 are connected to 16 computing units (not labeled in the figure) in a one-to-one correspondence manner, so as to receive data from the corresponding computing units.
Referring to fig. 5, the first-level write-back nodes 21 comprise 4 write-back nodes and the second-level write-back node 22 comprises 1 write-back node; each of these write-back nodes includes 4 receiving buffer units 211, 1 arbitration unit 212, 1 selection unit 213, 1 cache management unit 214, and 1 first output buffer unit 215. In the first-level write-back nodes 21, the input ends of the receiving buffer units 211 are connected with the input buffer units 11 in one-to-one correspondence. Within the same write-back node, the output ends of the 4 receiving buffer units 211 are respectively connected with the 4 input ends of the selection unit 213, the output end of the arbitration unit 212 is connected with the control end of the selection unit 213, and the cache management unit 214 is connected with the 4 receiving buffer units 211 and the first output buffer unit 215.
Referring to fig. 5, the output ends of the first output buffer units 215 of the 4 write-back nodes in the first-level write-back nodes 21 are respectively connected with the input ends of the 4 receiving buffer units 211 of the second-level write-back node 22; the cache management units 214 of the 4 first-level write-back nodes are all connected with the cache management unit 214 of the second-level write-back node 22.
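The Fig. 5 topology (16 input buffer units, 4 first-level nodes with 4 receiving buffers each, and 1 second-level node) can be checked with a small constructor; the function name and the list encoding of the tree are assumptions made for illustration:

```python
def build_tree(n_inputs=16, fan_in=4):
    """Group input buffer units into first-level write-back nodes, then feed
    every first-level output into a single second-level node."""
    level1 = [list(range(i, i + fan_in)) for i in range(0, n_inputs, fan_in)]
    level2 = [list(range(len(level1)))]   # one node, one receive buffer per level-1 node
    return level1, level2

level1, level2 = build_tree()
# 4 first-level nodes of 4 inputs each, funneled into 1 second-level node
```

The same constructor generalizes to deeper trees: repeatedly grouping nodes by `fan_in` yields the N-level structure claimed in the patent, with exactly one node at the lowest level driving the bus.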
While embodiments of the present application have been described in detail hereinabove, it will be apparent to those skilled in the art that various modifications and variations can be made to these embodiments. It is to be understood that such modifications and variations are within the scope and spirit of the present application as set forth in the following claims. Moreover, the application described herein is capable of other embodiments and of being practiced or of being carried out in various ways.

Claims (7)

1. A data write-back system of a convolutional neural network accelerator, comprising:
the input buffer module is used for being connected with the computing unit to receive data;
N-level write-back nodes, wherein the uppermost-level write-back nodes are connected with the input buffer module, each next-level write-back node is connected with at least two previous-level write-back nodes, and N is a natural number greater than 1;
the write-back control module is connected with the write-back node of the lowest stage, and is used for receiving data from the write-back node of the lowest stage and transmitting the data to the bus;
the write-back node comprises a first output buffer unit, a selection unit, at least two receiving buffer units, an arbitration unit and a buffer management unit, wherein the output end of the receiving buffer unit is connected with the input end of the selection unit, the output end of the selection unit is connected with the input end of the first output buffer unit, the arbitration unit is connected with the selection unit, and the buffer management unit is respectively connected with the receiving buffer units and the first output buffer units;
the number of the write-back nodes at the upper stage is matched with the number of the receiving cache units of the write-back nodes at the lower stage;
the write-back control module comprises an address mapping unit, the data received by the write-back control module from the lowest-level write-back node comprises computing unit address information and calculation result data, and the address mapping unit calculates the write-back address from the computing unit address information and the start address information.
2. The data write-back system of a convolutional neural network accelerator of claim 1, wherein the receive buffer unit comprises a first buffer status unit and a first data buffer unit that are connected to each other, the first buffer status unit being connected to the buffer management unit.
3. The data write-back system of a convolutional neural network accelerator of claim 1, wherein the first output buffer unit comprises a second buffer status unit and a second data buffer unit that are connected to each other, the second buffer status unit being connected to the buffer management unit.
4. A data write-back system of a convolutional neural network accelerator according to any one of claims 1, 2 or 3, wherein the cache management units of the write-back nodes that are connected to each other are themselves connected to each other.
5. The data write-back system of a convolutional neural network accelerator of claim 1, wherein the input buffer module comprises input buffer cells, the number of input buffer cells being adapted to the number of receive buffer cells of the write-back node of the uppermost level.
6. The data write-back system of a convolutional neural network accelerator of claim 5, wherein the input buffer unit comprises a buffer control unit, a third data buffer unit, and a second output buffer unit, the buffer control unit is respectively connected with the computing unit, the third data buffer unit, and the buffer management unit of the corresponding write-back node, and the third data buffer unit is connected with the second output buffer unit.
7. The data write-back system of a convolutional neural network accelerator of claim 1, wherein the number of write-back nodes at the lowest level is 1.
CN202011527851.3A 2020-12-22 2020-12-22 Data write-back system of convolutional neural network accelerator Active CN112597079B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011527851.3A CN112597079B (en) 2020-12-22 2020-12-22 Data write-back system of convolutional neural network accelerator


Publications (2)

Publication Number Publication Date
CN112597079A CN112597079A (en) 2021-04-02
CN112597079B true CN112597079B (en) 2023-10-17

Family

ID=75199931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011527851.3A Active CN112597079B (en) 2020-12-22 2020-12-22 Data write-back system of convolutional neural network accelerator

Country Status (1)

Country Link
CN (1) CN112597079B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08123725A (en) * 1994-10-20 1996-05-17 Hitachi Ltd Write-back type cache system
US5924115A (en) * 1996-03-29 1999-07-13 Interval Research Corporation Hierarchical memory architecture for a programmable integrated circuit having an interconnect structure connected in a tree configuration
JP2006072832A (en) * 2004-09-03 2006-03-16 Nec Access Technica Ltd Image processing system
CN101430664A * 2008-09-12 2009-05-13 Institute of Computing Technology, Chinese Academy of Sciences Multiprocessor system and cache coherence message transmission method
CN107329734A * 2016-04-29 2017-11-07 Beijing Zhongke Cambricon Technology Co., Ltd. Apparatus and method for performing a convolutional neural network forward operation
CN109739696A * 2018-12-13 2019-05-10 Beijing Institute of Computer Technology and Applications Cache acceleration method for solid state disks in a dual-controller storage array
CN110516801A * 2019-08-05 2019-11-29 Xi'an Jiaotong University High-throughput dynamically reconfigurable convolutional neural network accelerator architecture
CN111126584A * 2019-12-25 2020-05-08 Shanghai Anlu Information Technology Co.,Ltd. Data write-back system
CN112100097A * 2020-11-17 2020-12-18 Hangzhou Changchuan Technology Co., Ltd. Multi-test-channel priority adaptive arbitration method and memory access controller

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8412881B2 (en) * 2009-12-22 2013-04-02 Intel Corporation Modified B+ tree to store NAND memory indirection maps
US10725934B2 (en) * 2015-10-08 2020-07-28 Shanghai Zhaoxin Semiconductor Co., Ltd. Processor with selective data storage (of accelerator) operable as either victim cache data storage or accelerator memory and having victim cache tags in lower level cache wherein evicted cache line is stored in said data storage when said data storage is in a first mode and said cache line is stored in system memory rather then said data store when said data storage is in a second mode
US10430706B2 (en) * 2016-12-01 2019-10-01 Via Alliance Semiconductor Co., Ltd. Processor with memory array operable as either last level cache slice or neural network unit memory


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Write-back design for distributed registers of a digital signal processor; Shao Zheng; Xie Jing; Wang Qin; Mao Zhigang; Microelectronics & Computer (No. 7); pp. 30-33 *

Also Published As

Publication number Publication date
CN112597079A (en) 2021-04-02

Similar Documents

Publication Publication Date Title
CN101667451B (en) Data buffer of high-speed data exchange interface and data buffer control method thereof
CN102402611B (en) Method for quickly searching keywords and reading lists by using ternary content addressable memory (TCAM)
CN1997987A (en) An apparatus and method for packet coalescing within interconnection network routers
CN101692651A (en) Method and device for Hash lookup table
CN102130833A Memory management method and system for traffic-management-chip linked lists in a high-speed router
CN106534368B Packet transmission and reception method and system for an automobile CAN-bus gateway
CN104065588B Device and method for data packet scheduling and caching
CN101771598A (en) Communication dispatching method of real-time Ethernet
CN112597079B (en) Data write-back system of convolutional neural network accelerator
CN102736888B Data retrieval circuit with data stream synchronization
CN102622323A (en) Data transmission management method based on switch matrix in dynamic configurable serial bus
CN1568464A (en) Tagging and arbitration mechanism in an input/output node of a computer system
CN102999443A (en) Management method of computer cache system
CN100512218C (en) Transmitting method for data message
CN103077198A (en) Operation system and file cache positioning method thereof
CN100387027C Packet preprocessing circuit assembly of an interface card for high-speed network traffic diversion equipment
CN111126584B (en) Data write-back system
CN112242963A (en) Rapid high-concurrency neural pulse data packet distribution and transmission method
CN100396044C ATM switching device with dynamic buffer memory management and switching method thereof
CN105049377B (en) AFDX exchange datas bus structures and method for interchanging data based on Crossbar frameworks
CN103501251A (en) Method and device for processing data packet under offline condition
CN105162725B Method and device for preprocessing message addresses in a protocol processing pipeline
CN110764733B (en) Multi-distribution random number generation device based on FPGA
CN111857817B (en) Data reading method, data reading device and data reading system
CN101185056B (en) Data pipeline management system and method for using the system

Legal Events

Date Code Title Description
PB01 Publication
CB02 Change of applicant information

Address after: 200434 Room 202, building 5, No. 500, Memorial Road, Hongkou District, Shanghai

Applicant after: Shanghai Anlu Information Technology Co.,Ltd.

Address before: Room 501-504, building 9, Pudong Software Park, 498 GuoShouJing Road, Pudong New Area, Shanghai, 201203

Applicant before: ANLOGIC INFOTECH Co.,Ltd.

SE01 Entry into force of request for substantive examination
GR01 Patent grant