CN111126584B - Data write-back system - Google Patents

Info

Publication number
CN111126584B
Authority
CN
China
Prior art keywords
data
unit
write
control unit
validity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911360228.0A
Other languages
Chinese (zh)
Other versions
CN111126584A (en)
Inventor
王天一
边立剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Anlu Information Technology Co.,Ltd.
Original Assignee
Shanghai Anlogic Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Anlogic Information Technology Co ltd
Priority to CN201911360228.0A
Publication of CN111126584A
Application granted
Publication of CN111126584B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Abstract

The invention provides a data write-back system comprising at least one write-back module. Each write-back module comprises two volatile storage units, a logic control unit, a data transmission unit and a cache control unit. The cache control unit is connected with an activator, so the write-back system can receive a plurality of data simultaneously, which greatly improves parallelism and data throughput. The logic control unit is connected with the two volatile storage units, the two volatile storage units are connected with the data transmission unit, and when the number of write-back modules is greater than 1, the data transmission units are connected in series. The cache control unit is used for receiving data from the activator, and the logic control unit is used for controlling the two volatile storage units to read data from the cache control unit and write data into the data transmission unit in a ping-pong read-write mode. Because data is read and written at the same time, continuous data output is ensured and data write-back efficiency is improved.

Description

Data write-back system
Technical Field
The invention relates to the technical field of deep learning, in particular to a data write-back system.
Background
Besides compute-intensive operations, frequent memory reads and writes are a notable characteristic of conventional convolutional neural networks. To keep the computing resources in the accelerator fully utilized, data must be written back to the memory promptly and accurately for use in the next layer of convolution operation.
The neural network accelerator computes quickly. If data output by the activator cannot be written back to the memory accurately and quickly, the calculation unit of the neural network accelerator enters an idle state because it cannot read calculation data from the memory, and overall performance suffers.
Therefore, there is a need to provide a new data write-back system to solve the above problems in the prior art.
Disclosure of Invention
The invention aims to provide a data write-back system which improves the write-back efficiency of data.
In order to achieve the above object, the data write-back system of the present invention includes at least one write-back module, where the write-back module includes two volatile memory units, a logic control unit, a data transmission unit, and a cache control unit, the cache control unit is connected to an activator, the logic control unit is connected to the two volatile memory units, the two volatile memory units are connected to the data transmission unit, and when the number of the write-back modules is greater than 1, the data transmission units are connected in series; the cache control unit is used for receiving data from the activator, and the logic control unit is used for controlling the two volatile storage units to read data from the cache control unit and write data into the data transmission unit in a ping-pong read-write mode.
The invention has the beneficial effects that: the write-back system comprises at least one write-back module, and each write-back module comprises a cache control unit connected with an activator, so the data write-back system can receive a plurality of data simultaneously, which greatly improves parallelism and data throughput. The write-back module also comprises two volatile storage units and a logic control unit. The logic control unit can control the two volatile storage units to read data from the cache control unit and write data into the data transmission unit in a ping-pong read-write mode, so data can be read and written at the same time, continuous data output is ensured, and data write-back efficiency is improved.
Preferably, the cache control unit comprises a write-back control unit and a write-back data receiving unit, the write-back data receiving unit is connected with the activator, and the write-back data receiving unit is used for receiving write-back data from the activator; the write-back control unit is used for obtaining the current storage address of the write-back data according to the initial address and the address offset of the write-back data and generating mask data according to the state of the write-back data receiving unit.
Further preferably, the write-back data receiving unit comprises at least one sub write-back data receiving unit for receiving the write-back data according to a time period. The beneficial effects are that: the write-back data are output together after a plurality of write-back data have been received, which reduces the number of memory accesses and improves efficiency.
Preferably, the data transmission unit includes a judgment unit, a selection unit and a data link unit, an output end of the judgment unit is connected with an input end of the selection unit, and an output end of the selection unit is connected with an input end of the data link unit.
Further preferably, the judgment unit includes a data link update judging unit, a first data validity judging unit and a second data validity judging unit. An input end of the data link update judging unit is connected with output ends of the two volatile memory units, a first output end of the data link update judging unit is connected with an input end of the first data validity judging unit, an output end of the first data validity judging unit is connected with a first input end of the selection unit, a second output end of the data link update judging unit is connected with an input end of the second data validity judging unit, and an output end of the second data validity judging unit is connected with a second input end of the selection unit. The data link update judging unit is configured to judge whether the data link unit is in an update state and, accordingly, to transmit data to the first data validity judging unit or the second data validity judging unit; the first data validity judging unit and the second data validity judging unit are used for judging whether data are valid. The beneficial effects are that: because the data link update judging unit judges whether the data link unit is in an update state before transmitting data to the first or the second data validity judging unit, the selection unit can output data in an orderly manner and data collisions are avoided.
Further preferably, between adjacent write-back modules, a third input end of the selection unit is connected to an output end of a data link unit in an adjacent data transmission unit. The beneficial effects are that: data transfer between adjacent write-back modules is guaranteed.
Further preferably, the data transmission unit further includes a cache unit, and the cache unit is disposed on a connection line between the first data validity judgment unit and the selection unit, and is configured to cache the data output by the first data validity judgment unit. The beneficial effects are that: the cache unit can cache the data output by the first data validity judging unit, which avoids conflicts with the data in the data link unit and ensures effective transmission of the data.
Further preferably, the data write-back module is constructed by an FPGA. The beneficial effects are that: the write-back modules are constructed through the FPGA, the number of the write-back modules can be reasonably configured according to application scenes and resources of the FPGA, and waste of resources is avoided.
Drawings
Fig. 1 is a block diagram of the overall structure of the present invention.
Fig. 2 is a block diagram of a data transmission unit in some embodiments of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs. As used herein, the word "comprising" and similar words are intended to mean that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items.
In order to solve the problems in the prior art, an embodiment of the present invention provides a data write-back system, and referring to fig. 1, the data write-back system 10 includes at least one write-back module 11, and the write-back modules 11 are sequentially connected in series. Preferably, the data write-back system comprises 32 of the write-back modules, and the write-back modules are implemented by an FPGA. The number of the write-back modules can be reasonably configured according to the application scene and the resources of the FPGA, and resource waste is avoided.
In some embodiments of the present invention, referring to fig. 1, the write-back module 11 includes two volatile memory units 111, a logic control unit 112, a data transmission unit 113, and a cache control unit 114, an input terminal of the cache control unit 114 is connected to the activator (not shown), an output terminal of the cache control unit 114 is connected to input terminals of the two volatile memory units 111, the logic control unit 112 is connected to control terminals of the two volatile memory units 111, and output terminals of the two volatile memory units 111 are connected to input terminals of the data transmission unit 113. The logic control unit 112 is configured to control the two volatile storage units 111 to read data from the cache control unit 114 and write data to the data transmission unit 113 in a ping-pong manner.
In some embodiments of the invention, the volatile storage unit is a first-in, first-out (FIFO) memory, and the logic control unit is a state controller.
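The ping-pong arrangement of the two FIFO memories can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `PingPongBuffer`, the `depth` parameter and the swap condition (roles swap once the FIFO being filled is full and the other FIFO has been drained) are assumptions made only for illustration, since the text does not state when the logic control unit switches the two units.

```python
from collections import deque


class PingPongBuffer:
    """Two FIFOs that alternate between being filled and being drained.

    Assumption: roles swap when the FIFO being filled holds `depth` items
    and the other FIFO is empty. The patent does not state this condition.
    """

    def __init__(self, depth):
        self.depth = depth
        self.fifos = (deque(), deque())
        self.fill = 0  # index of the FIFO currently receiving data

    def push(self, item):
        """Write one item from the cache control unit; False if stalled."""
        target = self.fifos[self.fill]
        if len(target) >= self.depth:
            return False  # fill side is full and no swap was possible yet
        target.append(item)
        self._maybe_swap()
        return True

    def pop(self):
        """Read one item for the data transmission unit; None if empty."""
        drain = self.fifos[1 - self.fill]
        item = drain.popleft() if drain else None
        self._maybe_swap()
        return item

    def _maybe_swap(self):
        full = len(self.fifos[self.fill]) >= self.depth
        drained = not self.fifos[1 - self.fill]
        if full and drained:
            self.fill = 1 - self.fill
```

In this sketch one FIFO accepts new data while the other is being read, so the reader is not blocked by the writer as long as the two sides keep pace; items leave in the order in which they arrived.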
In some embodiments of the present invention, the cache control unit includes a write-back control unit and a write-back data receiving unit, and the write-back data receiving unit is connected to the activator. Wherein the write-back data receiving unit is configured to receive write-back data from the activator; the write-back control unit is used for obtaining the current storage address of the write-back data according to the initial address and the address offset of the write-back data and generating mask data according to the state of the write-back data receiving unit.
In some embodiments of the present invention, the write-back control unit is a write-back controller.
In some preferred embodiments of the present invention, the write-back data receiving unit comprises at least one sub write-back data receiving unit for receiving the write-back data according to a time period. The write-back data are output together after a plurality of write-back data have been received, which reduces the number of memory accesses and improves efficiency.
In some embodiments of the present invention, the write-back data receiving unit includes four sub write-back data receiving units, namely a first, a second, a third and a fourth sub write-back data receiving unit. Every four clock cycles form one time period, and the first, second, third and fourth sub write-back data receiving units receive the write-back data in the first, second, third and fourth clock cycles of the period, respectively.
In some embodiments of the present invention, the write-back control unit generates mask data according to the state of the write-back data receiving unit, which is determined by the states of the sub write-back data receiving units, i.e. whether each sub write-back data receiving unit has received write-back data. Specifically, the mask data is composed of 0s and/or 1s, and the bits of the mask data correspond one to one to the sub write-back data receiving units. Within one time period, if a sub write-back data receiving unit receives write-back data, the write-back control unit marks that unit as 1; if it does not, the write-back control unit marks it as 0. For example, if the first sub write-back data receiving unit receives write-back data in the first clock cycle, while the second, third and fourth sub write-back data receiving units receive no write-back data in the second, third and fourth clock cycles respectively, the write-back control unit marks the first unit as 1 and the other three as 0, so the generated mask data is 1000.
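The mask generation described above can be sketched in a few lines of Python. The function name `make_mask` and the use of a list of booleans are illustrative assumptions; the bit order (first sub write-back data receiving unit leftmost) follows the 1000 example in the text.

```python
def make_mask(received):
    """Return the mask string for one time period.

    `received` holds one boolean per sub write-back data receiving unit,
    in unit order: True if that unit received write-back data.
    """
    return "".join("1" if got else "0" for got in received)


# The example from the text: only the first of four units received data.
mask = make_mask([True, False, False, False])
print(mask)  # 1000
```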
In some embodiments of the invention, the write back module further comprises a counting unit for counting time periods. Specifically, the counting unit is a counter or a timer.
In some embodiments of the present invention, the write back control unit transmits the mask data and the current address to the volatile memory unit, while all the sub write back data receiving units transmit the write back data to the volatile memory unit. And when the volatile storage unit outputs data to the data transmission unit, the volatile storage unit transmits the mask data, the current address and all the write-back data as transmission data to the data transmission unit.
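As an illustration of how the mask data, the current address and the write-back data might travel together as one piece of transmission data, consider the following sketch. The record layout `TransferRecord` and the helper `build_record` are hypothetical, and computing the current address as the sum of the initial address and the address offset is an assumption: the text only says the address is obtained "according to" those two values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TransferRecord:
    """One piece of transmission data (hypothetical layout)."""
    mask: str                   # one bit per sub receiving unit
    address: int                # current storage address
    data: List[Optional[int]]   # one slot per sub receiving unit


def build_record(initial_address, address_offset, slots):
    """Bundle one time period of write-back data.

    slots: one entry per sub receiving unit; None means nothing received.
    Assumption: current address = initial address + address offset.
    """
    mask = "".join("0" if s is None else "1" for s in slots)
    return TransferRecord(mask, initial_address + address_offset, list(slots))


record = build_record(0x1000, 8, [7, None, None, None])
```

Here `record.mask` is `"1000"` and `record.address` is `0x1008`, matching the case in which only the first sub write-back data receiving unit received data.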
Fig. 2 is a block diagram of a data transmission unit in some embodiments of the invention. Referring to fig. 1 and 2, the data transmission unit 113 includes a judging unit 1131, a selecting unit 1132 and a data link unit 1133, wherein an output end of the judging unit 1131 is connected to an input end of the selecting unit 1132, and an output end of the selecting unit 1132 is connected to an input end of the data link unit 1133. Specifically, the judging unit 1131 includes a data link update judging unit 11311, a first data validity judging unit 11312 and a second data validity judging unit 11313. The input terminal of the data link update judging unit 11311 is connected to the output terminals of the two volatile memory units 111, a first output terminal of the data link update judging unit 11311 is connected to an input terminal of the first data validity judging unit 11312, an output terminal of the first data validity judging unit 11312 is connected to a first input terminal of the selecting unit 1132, a second output terminal of the data link update judging unit 11311 is connected to an input terminal of the second data validity judging unit 11313, an output terminal of the second data validity judging unit 11313 is connected to a second input terminal of the selecting unit 1132, and a third input terminal of the selecting unit 1132 is connected to an output terminal of the data link unit 1133 in the adjacent write-back module.
The data link update judging unit judges, according to the clock cycle, whether the data link unit is in an update state. When it judges that the data link unit is in the update state, the transmission data is transmitted to the first data validity judging unit; when it judges that the data link unit is not in the update state, the transmission data is transmitted to the second data validity judging unit. The first data validity judging unit and the second data validity judging unit judge whether the transmission data is valid by means of the mask data and delete the mask data from the transmission data. Specifically, if the mask data includes at least one 1, the transmission data is valid; otherwise, the transmission data is invalid. Invalid transmission data contains neither write-back data nor a current address.
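The routing and validity check can be sketched as follows. The dictionary layout of a record and the names `is_valid` and `route` are hypothetical, and returning `None` for invalid transmission data (i.e. dropping it) is an assumption, since the text does not say what happens to invalid data after it is judged.

```python
def is_valid(mask):
    """Transmission data is valid if the mask contains at least one 1."""
    return "1" in mask


def route(record, chain_updating):
    """Return (destination, payload) for one piece of transmission data.

    record: dict with keys "mask", "address", "data" (hypothetical layout).
    chain_updating: True if the data link unit is in the update state.
    """
    # Update state -> first validity unit; otherwise -> second validity unit.
    destination = "first" if chain_updating else "second"
    if not is_valid(record["mask"]):
        return destination, None  # assumption: invalid data is dropped
    # The validity unit deletes the mask before passing the data on.
    payload = {k: v for k, v in record.items() if k != "mask"}
    return destination, payload
```

For example, `route({"mask": "1000", "address": 16, "data": [5]}, True)` selects the first data validity judging unit and forwards the address and data without the mask.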
In some embodiments of the invention, the data link unit is a data link.
In some preferred embodiments of the present invention, referring to fig. 2, the data transmission unit 113 further includes a cache unit 1134, and the cache unit 1134 is disposed on a connection line between the first data validity judgment unit 11312 and the selection unit 1132. Specifically, an output end of the first data validity judging unit 11312 is connected to an input end of the cache unit 1134, and an output end of the cache unit 1134 is connected to a first input end of the selecting unit 1132. The cache unit 1134 temporarily stores data, which avoids conflicts between the data in the data link unit 1133 and the data output by the first data validity judging unit 11312 and thereby ensures effective transmission of data.
In some embodiments of the present invention, the data write-back system includes two write-back modules, namely a first write-back module and a second write-back module. An input end of the write-back data receiving unit in the first write-back module is connected to a first activator for receiving write-back data from the first activator, and an input end of the write-back data receiving unit in the second write-back module is connected to a second activator for receiving write-back data from the second activator. An output end of the data link unit in the first write-back module is connected to an input end of a memory through an AXI bus, an output end of the data link unit in the second write-back module is connected to the third input end of the selection unit in the first write-back module, and the third input end of the selection unit in the second write-back module is left unconnected. Specifically, the memory is a Dynamic Random Access Memory (DRAM).
In some embodiments of the present invention, when the data link unit is not in the update state, first, the selection unit in the first write-back module reads data from the data link unit in the second write-back module and transmits the data to the data link unit in the first write-back module until there is no new data in the data link unit in the second write-back module; then, the selection unit in the first write-back module reads data from the second data validity judgment unit in the first write-back module and transmits the data to the data link unit in the first write-back module until no new data exists in the second data validity judgment unit in the first write-back module; finally, the selection unit in the first write-back module reads data from the cache unit in the first write-back module and transmits the data to the data link unit until no new data exists in the cache unit in the first write-back module. The data link unit in the first write-back module transmits the received data to the memory in real time. When the data link unit is in the update state, the selection unit stops working.
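The order in which the selection unit serves its three inputs can be sketched as a fixed-priority drain. This is only an illustration: the function name `select`, the modelling of each source as a `deque`, and the simplification that no new data arrives while a source is being drained are all assumptions not found in the text.

```python
from collections import deque


def select(neighbor_chain, second_unit, cache_unit, chain_updating):
    """Drain the three sources in the described order.

    neighbor_chain: data from the data link unit of the adjacent module
    second_unit:    data from the second data validity judging unit
    cache_unit:     data held in the cache unit
    Returns the data forwarded to this module's data link unit, in order.
    """
    if chain_updating:
        return []  # the selection unit stops working in the update state
    forwarded = []
    for source in (neighbor_chain, second_unit, cache_unit):
        while source:  # keep reading until this source has no new data
            forwarded.append(source.popleft())
    return forwarded
```

With `neighbor_chain = deque(["n1", "n2"])`, `second_unit = deque(["s1"])` and `cache_unit = deque(["c1"])`, a call with `chain_updating=False` forwards the neighbor's data first, then the second validity unit's data, then the cached data.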
Although the embodiments of the present invention have been described in detail hereinabove, it is apparent to those skilled in the art that various modifications and variations can be made to these embodiments. However, it is to be understood that such modifications and variations are within the scope and spirit of the present invention as set forth in the following claims. Moreover, the invention as described herein is capable of other embodiments and of being practiced or of being carried out in various ways.

Claims (5)

1. A data write-back system is characterized by comprising at least one write-back module, wherein the write-back module comprises two volatile storage units, a logic control unit, a data transmission unit and a cache control unit, the cache control unit is connected with an activator, the logic control unit is connected with the two volatile storage units, the data transmission unit comprises a judgment unit, a selection unit and a data link unit, the output end of the selection unit is connected with the input end of the data link unit, the judgment unit comprises a data link update judgment unit, a first data validity judgment unit and a second data validity judgment unit, the input end of the data link update judgment unit is connected with the output ends of the two volatile storage units, the first output end of the data link update judgment unit is connected with the input end of the first data validity judgment unit, the output end of the first data validity judging unit is connected with the first input end of the selecting unit, the second output end of the data link updating judging unit is connected with the input end of the second data validity judging unit, the output end of the second data validity judging unit is connected with the second input end of the selecting unit, and when the number of the write-back modules is larger than 1, between adjacent write-back modules, the third input end of the selecting unit is connected with the output end of the data link unit in the adjacent data transmission unit, wherein the cache control unit is used for receiving data from the activator, the logic control unit is used for controlling the two volatile storage units to read data from the cache control unit and write data into the data transmission unit in a ping-pong read-write mode, the data link updating judging unit is used for judging whether the data link unit is in an updating state or not, so as to transmit the data to the first data validity judging unit or the second data validity judging unit, and the first data validity judging unit and the second data validity judging unit are used for judging whether the data is valid or not.
2. The data write back system of claim 1, wherein the cache control unit comprises a write back control unit and a write back data receiving unit, the write back data receiving unit is connected to the activator, the write back data receiving unit is configured to receive write back data from the activator; the write-back control unit is used for obtaining the current storage address of the write-back data according to the initial address and the address offset of the write-back data and generating mask data according to the state of the write-back data receiving unit.
3. The data write back system of claim 2, wherein the write back data receiving unit comprises at least one sub write back data receiving unit for receiving the write back data according to a time period.
4. The data write-back system according to claim 1, wherein the data transmission unit further includes a cache unit, and the cache unit is disposed on a connection line between the first data validity judgment unit and the selection unit, and is configured to cache the data output by the first data validity judgment unit.
5. The data write-back system according to any one of claims 1 to 4, wherein the data write-back module is constructed by an FPGA.
CN201911360228.0A 2019-12-25 2019-12-25 Data write-back system Active CN111126584B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911360228.0A CN111126584B (en) 2019-12-25 2019-12-25 Data write-back system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911360228.0A CN111126584B (en) 2019-12-25 2019-12-25 Data write-back system

Publications (2)

Publication Number Publication Date
CN111126584A CN111126584A (en) 2020-05-08
CN111126584B CN111126584B (en) 2020-12-22

Family

ID=70502572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911360228.0A Active CN111126584B (en) 2019-12-25 2019-12-25 Data write-back system

Country Status (1)

Country Link
CN (1) CN111126584B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597079B (en) * 2020-12-22 2023-10-17 上海安路信息科技股份有限公司 Data write-back system of convolutional neural network accelerator

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101350036B (en) * 2008-08-26 2010-09-15 天津理工大学 High speed real-time data acquisition system
US10698657B2 (en) * 2016-08-12 2020-06-30 Xilinx, Inc. Hardware accelerator for compressed RNN on FPGA
CN206557767U (en) * 2016-11-11 2017-10-13 北京润科通用技术有限公司 A kind of caching system based on ping-pong operation structure control data buffer storage
CN108122031B (en) * 2017-12-20 2020-12-15 杭州国芯科技股份有限公司 Low-power consumption neural network accelerator device
CN110309088B (en) * 2019-06-19 2021-06-08 北京百度网讯科技有限公司 ZYNQ FPGA chip, data processing method thereof and storage medium

Also Published As

Publication number Publication date
CN111126584A (en) 2020-05-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 200434 Room 202, building 5, No. 500, Memorial Road, Hongkou District, Shanghai

Patentee after: Shanghai Anlu Information Technology Co.,Ltd.

Address before: Floor 4, no.391-393, dongdaming Road, Hongkou District, Shanghai 200080 (centralized registration place)

Patentee before: SHANGHAI ANLOGIC INFORMATION TECHNOLOGY Co.,Ltd.
