CN113220625B - Design method of calculation amount proving system based on stream processing chip - Google Patents

Design method of calculation amount proving system based on stream processing chip

Info

Publication number
CN113220625B
CN113220625B CN202110514566.6A CN202110514566A CN113220625B CN 113220625 B CN113220625 B CN 113220625B CN 202110514566 A CN202110514566 A CN 202110514566A CN 113220625 B CN113220625 B CN 113220625B
Authority
CN
China
Prior art keywords
chip
packet
bits
data
data packet
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110514566.6A
Other languages
Chinese (zh)
Other versions
CN113220625A (en)
Inventor
余光辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huizhichengxin Technology Co ltd
Original Assignee
Beijing Huizhichengxin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-05-12
Filing date
2021-05-12
Publication date
2024-05-14
Application filed by Beijing Huizhichengxin Technology Co ltd
Priority to CN202110514566.6A
Publication of CN113220625A
Application granted
Publication of CN113220625B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17356 Indirect interconnection networks
    • G06F15/17368 Indirect interconnection networks non hierarchical topologies
    • G06F15/17375 One dimensional, e.g. linear array, ring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4063 Device-to-bus coupling
    • G06F13/4068 Electrical coupling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 Interprocessor communication
    • G06F15/167 Interprocessor communication using a common memory, e.g. mailbox
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306 Intercommunication techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0056 Systems characterized by the type of code used
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0056 Systems characterized by the type of code used
    • H04L1/0071 Use of interleaving
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/06 Notations for structuring of protocol data, e.g. abstract syntax notation one [ASN.1]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mathematical Physics (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a design method of a calculation amount proving (proof-of-work) system based on a stream processing chip, belonging to the technical field of computer processing. The design method defines four message types exchanged between the chips and the main controller: WRITE PACKET, READ REQ PACKET, READ DATA SEND BACK packet and Error packet. The beneficial effects of the invention are as follows: by adopting a whole-system structural design built on stream processing chips and using the command and message formats between the chips and the main controller, the Ethash algorithm is run across the whole system, which solves the problem of fast computation of the Ethereum proof of work. Bits [183:176], [175:168] and [167:160] of the data packet carry ECC64 check bits that protect {[191:184], [159:112]}, [111:56] and [55:0] respectively, and all bits in the packet are interleaved so that any 3 consecutive bits belong to different ECC groups, which allows burst errors of up to 3 bits to be tolerated.

Description

Design method of calculation amount proving system based on stream processing chip
Technical field:
The invention belongs to the technical field of computer processing, and particularly relates to a design method of a calculation amount proving system based on a stream processing chip.
Background:
Proof of Work (PoW) is the underlying consensus mechanism adopted by mainstream blockchains such as Ethereum. It requires a large number of hash operations and consumes a large amount of memory bandwidth to find hash values that meet a specific difficulty condition. Because the Ethereum proof-of-work algorithm must frequently access a data set larger than 1 GB (the data set grows approximately linearly with time; it will exceed 4 GB in 2021 and 6 GB in 2024), it places high demands on the memory capacity and access bandwidth of the system.
To compute the Ethereum proof of work quickly, most current solutions use high-performance GPU graphics cards from NVIDIA and AMD. Their strong computing capability and high memory bandwidth give graphics cards a better price/performance ratio and performance/power ratio than CPUs and FPGAs when computing the proof of work.
In addition, some chip design companies have built GPU-like chips tailored specifically to the Ethereum computation. These chips have memory access bandwidth comparable to that of a GPU and use customized compute modules, whose computing performance exceeds that of GPU compute units at greatly reduced power consumption.
GPUs and GPU-like ASIC chips share a common problem: off-chip memory bandwidth is low, which makes it difficult to improve chip performance.
Moreover, the Ethereum proof of work requires more than 4 GB of memory, and with current chip manufacturing processes it is not possible to integrate 4 GB of SRAM inside a single chip. This limits further increases in GPU and ASIC chip performance.
Summary of the invention:
To solve the above problems and overcome the shortcomings of the prior art, the invention provides a design method of a calculation amount proving system based on a stream processing chip, which effectively solves the problem of fast computation of the Ethereum proof of work.
The specific technical scheme is as follows: the design method of the calculation amount proving system based on the stream processing chip is characterized by comprising the following steps:
(1) WRITE PACKET: a write operation, sent by the main controller, to the SRAM and registers (REG) inside the chip identified by chipid. Bits [191:190] of the data packet = 'b01 (binary 01) indicate that the message is a WRITE PACKET. Bit [189] selects the destination: 0 writes only to the chip matching chip_id, and 1 broadcasts to all chips. Bit [188] is reserved for future address expansion. The target chip number (chipid) occupies bits [187:184], 4 bits in total. The address occupies bits [159:128], 32 bits in total. The data occupies bits [127:0]. Bits [183:176], [175:168] and [167:160] carry ECC64 check bits that protect {[191:184], [159:112]}, [111:56] and [55:0] respectively, and all bits in the packet are interleaved so that any 3 consecutive bits belong to different ECC groups, tolerating burst errors of up to 3 bits. The data packet exchanged between the main controller and the chip is 192 bits wide.
(2) READ REQ PACKET: a request from the main controller to read the SRAM or registers inside the chip identified by chipid. It is used in two situations: after the SRAM is updated, to check that it was written correctly; and when the main controller needs to inspect the internal registers of the chip. Bits [191:188] = 'b0001 (binary 0001) indicate that the message is a READ PACKET, that is, a read request. Bits [187:184], 4 bits in total, give the number of the target chip of the packet initiated by the main controller. The address occupies bits [159:128], 32 bits in total, and the data field occupies bits [127:0]. The data packet exchanged between the main controller and the chip is 192 bits wide.
(3) READ DATA SEND BACK packet: the chip identified by chipid returns internal SRAM or REG data to the main controller. Bits [191:188] = 'b1001 (binary 1001) indicate that the message is a READ DATA SEND BACK packet. The address and data occupy the same positions as in the READ PACKET. Bits [187:184], 4 bits in total, give the number of the chip that initiated the packet. The data packet exchanged between the main controller and the chip is 192 bits wide.
(4) Error packet: sent to the main controller by the chip identified by chipid after it detects an error. It is generated in the following situations: SERDES ECC error, illegal address, illegal packet type, FIFO underflow/overflow, and SERDES lane hang. Bits [191:188] = 'b1011 (binary 1011) indicate that the message is an Error packet. The address and ECC fields occupy the same positions as in the WRITE PACKET. Bits [187:184], 4 bits in total, give the number of the chip that initiated the packet. Bits [127:116], 12 bits in total, are all 0. Bits [115:112], 4 bits in total, give the error type: 4'b0000 means the ECC corrected an error; 4'b0001 means an ECC parity error; 4'b0010 means the ECC detected an error but did not correct it; 4'b0100 means an illegal read address; 4'b1000 means an illegal write address; 4'b1100 means an illegal packet type; 4'b1110 means FIFO underflow/overflow; and 4'b1111 means a lane is hung. Bits [111:96] give the ID of the lane on which the error occurred. The data packet exchanged between the main controller and the chip is 192 bits wide.
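The field layouts in steps (1) to (4) can be illustrated with a minimal Python sketch that packs and inspects a 192-bit packet held in an ordinary integer. The bit positions come from the text; the helper names (`set_field`, `get_field`, `make_write_packet`, `make_read_request`, `packet_type`) are hypothetical, and the ECC bytes in [183:160] are simply left at zero because the text does not specify the ECC code itself.

```python
# Illustrative sketch only: bit positions follow the text, names are hypothetical.
PACKET_BITS = 192


def set_field(word, hi, lo, value):
    """Return word with bits [hi:lo] replaced by value."""
    width = hi - lo + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"value {value:#x} does not fit in {width} bits")
    mask = ((1 << width) - 1) << lo
    return (word & ~mask) | (value << lo)


def get_field(word, hi, lo):
    """Extract bits [hi:lo] of word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def make_write_packet(chip_id, address, data, broadcast=False):
    """Build a WRITE PACKET; ECC bytes [183:160] are left as zero here."""
    word = set_field(0, 191, 190, 0b01)               # packet type: write
    word = set_field(word, 189, 189, int(broadcast))  # 0 = one chip, 1 = all chips
    # bit 188 is reserved and stays 0
    word = set_field(word, 187, 184, chip_id)         # target chip number
    word = set_field(word, 159, 128, address)         # 32-bit address
    word = set_field(word, 127, 0, data)              # 128-bit data
    return word


def make_read_request(chip_id, address):
    """Build a READ REQ PACKET; the data field is left as zero."""
    word = set_field(0, 191, 188, 0b0001)
    word = set_field(word, 187, 184, chip_id)
    word = set_field(word, 159, 128, address)
    return word


def packet_type(word):
    """Classify a packet by its leading type bits."""
    if get_field(word, 191, 190) == 0b01:
        return "WRITE"
    names = {0b0001: "READ_REQ", 0b1001: "READ_DATA", 0b1011: "ERROR"}
    return names.get(get_field(word, 191, 188), "UNKNOWN")
```

Note that the write type is checked on two bits first, because bits [189:188] of a WRITE PACKET carry the broadcast flag and the reserved bit rather than type information.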
Furthermore, the calculation amount proving system based on the stream processing chip adopts a system architecture in which a preset number of chips are interconnected. Each chip stores part of the data set in its on-chip SRAM, the total storage capacity of the preset number of chips exceeds the required storage space, and the chips are connected through high-speed SERDES buses.
Further, the error conditions in step (4) include:
i: inter-chip SERDES ECC error; ii: illegal address; iii: illegal packet type; iv: FIFO underflow/overflow; v: SERDES lane hang.
The beneficial effects of the invention are as follows: the invention adopts a whole-system structural design built on stream processing chips and uses the command and message formats between the chips and the main controller to run the Ethash algorithm across the whole system, which solves the problem of fast computation of the Ethereum proof of work.
In the invention, bits [183:176], [175:168] and [167:160] of the data packet carry ECC64 check bits that protect {[191:184], [159:112]}, [111:56] and [55:0] respectively, and all bits in the packet are interleaved so that any 3 consecutive bits belong to different ECC groups, so that burst errors of up to 3 bits are tolerated.
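The grouping arithmetic can be checked with a short sketch: each of the three groups covers 56 protected bits plus 8 check bits, giving three 64-bit codewords and 3 × 64 = 192 bits in total. The code below splits a packet into those codewords and then interleaves them. The interleaving order (wire bit i carries bit i // 3 of codeword i % 3) is an assumption chosen to satisfy the stated property; the text does not give the actual bit ordering, and the ECC code itself is not modeled. All function names are hypothetical.

```python
# Sketch under stated assumptions: the interleave order is hypothetical.

def get_bits(word, hi, lo):
    """Extract bits [hi:lo] of word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def split_codewords(packet):
    """Return three 64-bit codewords: 56 protected bits followed by 8 check bits."""
    group0 = (get_bits(packet, 191, 184) << 48) | get_bits(packet, 159, 112)
    group1 = get_bits(packet, 111, 56)
    group2 = get_bits(packet, 55, 0)
    checks = (
        get_bits(packet, 183, 176),  # protects group0
        get_bits(packet, 175, 168),  # protects group1
        get_bits(packet, 167, 160),  # protects group2
    )
    return [(data << 8) | check for data, check in zip((group0, group1, group2), checks)]


def interleave(codewords):
    """Assumed wire order: wire bit i carries bit i // 3 of codeword i % 3."""
    wire = 0
    for i in range(192):
        wire |= ((codewords[i % 3] >> (i // 3)) & 1) << i
    return wire


def deinterleave(wire):
    """Inverse of interleave: recover the three 64-bit codewords."""
    codewords = [0, 0, 0]
    for i in range(192):
        codewords[i % 3] |= ((wire >> i) & 1) << (i // 3)
    return codewords


def burst_hits(start, length):
    """Count how many bits of each codeword a burst of wire errors touches."""
    hits = [0, 0, 0]
    for i in range(start, start + length):
        hits[i % 3] += 1
    return hits
```

With this ordering, any burst of 3 adjacent wire bits touches each codeword exactly once. If each group's code can correct a single-bit error (an assumption, since the text only names ECC64), such a burst is therefore recoverable, whereas a 4-bit burst puts two errors into one group.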
Detailed description:
Specific details are set forth in the description to provide a thorough understanding of embodiments of the invention; it will be apparent to those skilled in the art that the invention is not limited to these details. In other instances, well-known structures and functions are not shown or described in detail, to avoid obscuring aspects of the embodiments. Those of ordinary skill in the art will understand the specific meaning of the above terms in the context of the invention.
Specific embodiments of the invention:
The working principle is as follows:
(1) WRITE PACKET: a write operation, sent by the main controller, to the SRAM and registers (REG) inside the chip identified by chipid. The address occupies bits [159:128] of the data packet, 32 bits in total. The target chip number (chipid) occupies bits [187:184], 4 bits in total. Bits [183:176], [175:168] and [167:160] carry ECC64 check bits that protect {[191:184], [159:112]}, [111:56] and [55:0] respectively, and all bits in the packet are interleaved so that any 3 consecutive bits belong to different ECC groups, tolerating burst errors of up to 3 bits. The data packet exchanged between the main controller and the chip is 192 bits wide.
(2) READ REQ PACKET: a request from the main controller to read the SRAM or registers inside the chip identified by chipid. It is used in two situations: after the SRAM is updated, to check that it was written correctly; and when the main controller needs to inspect the internal registers of the chip. Bits [187:184], 4 bits in total, give the number of the target chip of the packet initiated by the FPGA. The data packet exchanged between the main controller and the chip is 192 bits wide.
(3) READ DATA SEND BACK packet: the chip identified by chipid returns internal SRAM or REG data to the main controller. The address is laid out as in the READ PACKET. Bits [187:184], 4 bits in total, give the number of the chip that initiated the packet. The data packet exchanged between the main controller and the chip is 192 bits wide.
(4) Error packet: sent to the main controller by the chip identified by chipid after it detects an error. The error conditions include:
i: inter-chip SERDES ECC error; ii: illegal address; iii: illegal packet type; iv: FIFO underflow/overflow; v: SERDES lane hang.
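As an illustration of how a main controller might interpret an Error packet, the sketch below decodes the fields using the error-type codes listed for the Error packet in the technical scheme ([191:188] type, [187:184] chip number, [115:112] error type, [111:96] lane ID). The function and dictionary names are hypothetical, and the handling of unlisted codes is an assumption.

```python
# Illustrative decoder; names are hypothetical, codes are taken from the text.

ERROR_TYPES = {
    0b0000: "ECC corrected error",
    0b0001: "ECC parity error",
    0b0010: "ECC detected but uncorrected error",
    0b0100: "illegal read address",
    0b1000: "illegal write address",
    0b1100: "illegal packet type",
    0b1110: "FIFO underflow/overflow",
    0b1111: "lane hung",
}


def field(word, hi, lo):
    """Extract bits [hi:lo] of a 192-bit packet stored as an int."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_error_packet(packet):
    """Return the reporting chip, error description and lane ID of an Error packet."""
    if field(packet, 191, 188) != 0b1011:
        raise ValueError("not an Error packet")
    code = field(packet, 115, 112)
    return {
        "chip_id": field(packet, 187, 184),
        # Codes not listed in the text are reported as unknown (assumption).
        "error": ERROR_TYPES.get(code, f"unknown code {code:#06b}"),
        "lane_id": field(packet, 111, 96),
    }
```

For example, a packet with type 'b1011, chip number 7, error type 4'b1110 and lane ID 5 decodes to a FIFO underflow/overflow reported by chip 7 on lane 5.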
In use:
(1) To meet the requirement for more than 4 GB of storage, a system architecture of multiple interconnected chips is adopted. Each chip stores part of the data set in its on-chip SRAM, the total storage capacity of the chips exceeds the required storage space, and the chips are connected through multiple high-speed SERDES buses.
(2) The core of the Ethash algorithm is a 64-iteration loop of computation and memory access. To improve computing efficiency, the loop is split into a pipeline in which each pipeline stage handles the computation and memory access of one iteration.
(3) On the multi-chip stream processing architecture, each chip is responsible for one or more stages of the 64-stage pipeline, and each chip also responds to memory access and computation requests from other chips.
(4) To reduce the PCB design complexity of the whole system, the fully interconnected multi-chip topology is abandoned in favor of a ring interconnection. When chip 0 (chip numbers start from 0) needs to send data or commands to chip N (N > 1), the data and commands pass serially through chip 1, chip 2, and so on until they reach chip N. Each intermediate chip that receives such data and commands forwards them directly to the next chip.
(5) The main controller is implemented with a desktop or embedded SoC chip and is connected to each chip through a low-speed or high-speed bus (including but not limited to I2C, SPI, SERDES and other I/O interfaces), forming an out-of-band control network. The main controller is responsible for initializing all chips, starting and ending tasks, and error handling and recovery.
(6) Given the number of pipeline stages in the algorithm, and to reduce the latency overhead caused by overly long communication links, the scheme covers only system structures with the following numbers of chips: 8, 9, 10, 11, 12, 13, 16, 17, 32, 33.
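The ring forwarding described in item (4) can be sketched as a simple path computation. The sketch assumes a one-way ring in increasing chip order that wraps from the last chip back to chip 0; the text only describes the chip 0 to chip N case, so the wrap-around and the function names are assumptions.

```python
# Minimal sketch of ring forwarding, assuming a one-way ring with wrap-around.

def ring_path(src, dst, n_chips):
    """Chips visited, in order, when a message travels from src to dst."""
    if not (0 <= src < n_chips and 0 <= dst < n_chips):
        raise ValueError("chip number out of range")
    path = [src]
    while path[-1] != dst:
        path.append((path[-1] + 1) % n_chips)
    return path


def forwarding_chips(src, dst, n_chips):
    """Intermediate chips that only pass the message on to the next chip."""
    return ring_path(src, dst, n_chips)[1:-1]
```

In an 8-chip ring, a message from chip 0 to chip 3 is forwarded by chips 1 and 2, which matches the description; the number of hops grows with ring size, which is consistent with the concern about communication-link latency in item (6).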
Specific implementation:
Assume the system has 4 chips and the SRAM capacity inside a single chip exceeds 1 GB, so the total memory capacity of the system exceeds 4 GB.
The 64-stage computation is divided into two 32-stage halves; the steps of one 32-stage pipeline pass are described next.
Embodiment one:
Step one: chip 0 processes the 1st pipeline stage and then sends the intermediate result to chip 1 via SERDES.
Step two: chip 1 processes the 2nd pipeline stage and then sends the intermediate result to chip 2 via SERDES.
Step three: processing continues in sequence; over 8 rounds, the 4 chips compute 32 pipeline stages.
Step four: following steps one to three, a further 8 rounds are computed, so that the 4 chips have computed all 64 pipeline stages and the whole algorithm is complete.
Embodiment two:
Step one: chip 0 processes pipeline stages 1 to 8, 8 stages in total, and then sends the intermediate result to chip 1 via SERDES.
Step two: chip 1 processes pipeline stages 9 to 16, 8 stages in total, and then sends the intermediate result to chip 2 via SERDES.
Step three: processing continues in sequence; in 1 round, the 4 chips compute 32 pipeline stages.
Step four: following steps one to three, a further round is computed, so that the 4 chips have computed all 64 pipeline stages and the whole algorithm is complete.
The first scheme has a simpler chip microarchitecture.
The second scheme has better memory access locality and higher bandwidth utilization, similar to the burst operation of conventional memory.
As with the burst operation of conventional memory: the computation of 8 pipeline stages requires 8 memory accesses, and several of them are bound to target the SRAM of the same chip. The request commands and returned data that target the same chip can therefore be packed together, improving SERDES bandwidth utilization. However, this design needs more internal buffers to store the intermediate results of multiple pipeline stages.
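The difference between the two embodiments can be made concrete with a small sketch that maps each of the 64 pipeline stages to one of the 4 chips and counts how often the intermediate result crosses a SERDES link. Stages are 0-indexed here, and the count covers only stage-to-stage handoffs, not the data-set accesses that chips issue to each other; both the function names and this simplified cost measure are assumptions for illustration.

```python
# Sketch comparing the stage-to-chip mappings of the two embodiments.
from collections import Counter

N_CHIPS = 4
N_STAGES = 64


def chip_for_stage_one(stage):
    """Embodiment one: one stage per chip, rotating around the ring."""
    return stage % N_CHIPS


def chip_for_stage_two(stage, block=8):
    """Embodiment two: 8 consecutive stages per chip before moving on."""
    return (stage // block) % N_CHIPS


def handoffs(mapping):
    """Number of times the intermediate result moves to a different chip."""
    chips = [mapping(s) for s in range(N_STAGES)]
    return sum(1 for a, b in zip(chips, chips[1:]) if a != b)


def stages_per_chip(mapping):
    """How many of the 64 stages each chip executes."""
    return Counter(mapping(s) for s in range(N_STAGES))
```

Under these assumptions each chip executes 16 stages in both schemes, but embodiment one hands the intermediate result to another chip 63 times while embodiment two does so only 7 times, at the cost of buffering the results of 8 stages inside each chip.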
To sum up: (1) To meet the requirement for more than 4 GB of storage, a system architecture of multiple interconnected chips is adopted. Each chip stores part of the data set in its on-chip SRAM, the total storage capacity of the chips exceeds the required storage space, and the chips are connected through multiple high-speed SERDES buses.
(2) The core of the Ethash algorithm is a 64-iteration loop of computation and memory access. To improve computing efficiency, the loop is split into a pipeline in which each pipeline stage handles the computation and memory access of one iteration.
(3) On the multi-chip stream processing architecture, each chip is responsible for one or more stages of the 64-stage pipeline, and each chip also responds to memory access and computation requests from other chips.
(4) To reduce the PCB design complexity of the whole system, the fully interconnected multi-chip topology is abandoned in favor of a ring interconnection. When chip 0 (chip numbers start from 0) needs to send data or commands to chip N (N > 1), the data and commands pass serially through chip 1, chip 2, and so on until they reach chip N. Each intermediate chip that receives such data and commands forwards them directly to the next chip.
(5) The main controller is implemented with a desktop or embedded SoC chip and is connected to each chip through a low-speed or high-speed bus (including but not limited to I2C, SPI, SERDES and other I/O interfaces), forming an out-of-band control network. The main controller is responsible for initializing all chips, starting and ending tasks, and error handling and recovery.
(6) Given the number of pipeline stages in the algorithm, and to reduce the latency overhead caused by overly long communication links, the scheme covers only system structures with the following numbers of chips: 8, 9, 10, 11, 12, 13, 16, 17, 32, 33.
The invention adopts a whole-system structural design built on stream processing chips and uses the command and message formats between the chips and the main controller to run the Ethash algorithm across the whole system, which solves the problem of fast computation of the Ethereum proof of work.
In the invention, bits [183:176], [175:168] and [167:160] of the data packet carry ECC64 check bits that protect {[191:184], [159:112]}, [111:56] and [55:0] respectively, and all bits in the packet are interleaved so that any 3 consecutive bits belong to different ECC groups, so that burst errors of up to 3 bits are tolerated.

Claims (2)

1. A design method of a calculation amount proving system based on a stream processing chip, characterized in that the calculation amount proving system based on the stream processing chip is as follows:
a system architecture of multiple interconnected chips is adopted, in which each chip stores part of the data set in its on-chip SRAM, the total storage capacity of the chips exceeds the required storage space, and the chips are connected through multiple high-speed SERDES buses;
the core of the Ethash algorithm is a 64-iteration loop of computation and memory access; the loop is split into a pipeline in which each pipeline stage handles the computation and memory access of one iteration;
on the stream processing architecture formed by the multiple chips, each chip is responsible for one or more stages of the 64-stage pipeline, and each chip also responds to memory access and computation requests from other chips;
when chip 0 (chip numbers start from 0) needs to send data or commands to chip N (N > 1), the data and commands pass serially through chip 1, chip 2, and so on until they reach chip N, and each intermediate chip that receives the data and commands forwards them directly to the next chip;
the main controller is implemented with a desktop or embedded SoC chip, is connected to each chip through a low-speed or high-speed bus to form an out-of-band control network, and is responsible for initializing all chips, starting and ending tasks, and error handling and recovery;
the design method of the proving system comprises the following steps:
(1) WRITE PACKET: a write operation, sent by the main controller, to the SRAM and registers (REG) inside the chip identified by chipid; bits [191:190] of the data packet = 'b01 (binary 01) indicate that the message is a WRITE PACKET; for bit [189], 0 indicates writing only to the chip matching chip_id and 1 indicates broadcasting to all chips; bit [188] is reserved for future address expansion; the address occupies bits [159:128], 32 bits in total; the target chip number (chipid) occupies bits [187:184], 4 bits in total; bits [183:176], [175:168] and [167:160] carry ECC64 check bits that protect {[191:184], [159:112]}, [111:56] and [55:0] respectively, and all bits in the packet are interleaved so that any 3 consecutive bits belong to different ECC groups, tolerating burst errors of up to 3 bits; the data occupies bits [127:0]; the data packet exchanged between the main controller and the chip is 192 bits wide;
(2) READ REQ PACKET: a request from the main controller to read the SRAM or registers inside the chip identified by chipid, used in the following situations: after the SRAM is updated, to check that it was written correctly; and when the main controller needs to inspect the internal registers of the chip; bits [191:188] = 'b0001 (binary 0001) indicate that the message is a READ PACKET, that is, a read request; bits [187:184], 4 bits in total, give the number of the target chip of the packet initiated by the main controller; the address occupies bits [159:128], 32 bits in total; the data field occupies bits [127:0]; the data packet exchanged between the main controller and the chip is 192 bits wide;
(3) READ DATA SEND BACK packet: the chip identified by chipid returns internal SRAM or REG data to the main controller; bits [191:188] = 'b1001 (binary 1001) indicate that the message is a READ DATA SEND BACK packet; the address and data occupy the same positions as in the READ PACKET; bits [187:184], 4 bits in total, give the number of the chip that initiated the packet; the data packet exchanged between the main controller and the chip is 192 bits wide;
(4) Error packet: sent to the main controller by the chip identified by chipid after it detects an error, in the following situations: SERDES ECC error, illegal address, illegal packet type, FIFO underflow/overflow, and SERDES lane hang; bits [191:188] = 'b1011 (binary 1011) indicate that the message is an Error packet; the address and ECC fields occupy the same positions as in the WRITE PACKET; bits [187:184], 4 bits in total, give the number of the chip that initiated the packet; bits [127:116], 12 bits in total, are all 0; bits [115:112], 4 bits in total, give the error type, where 4'b0000 means the ECC corrected an error, 4'b0001 means an ECC parity error, 4'b0010 means the ECC detected an error but did not correct it, 4'b0100 means an illegal read address, 4'b1000 means an illegal write address, 4'b1100 means an illegal packet type, 4'b1110 means FIFO underflow/overflow, and 4'b1111 means a lane is hung; bits [111:96] give the ID of the lane on which the error occurred; the data packet exchanged between the main controller and the chip is 192 bits wide.
2. The design method of a calculation amount proving system based on a stream processing chip according to claim 1, wherein the error conditions in step (4) comprise:
i: inter-chip SERDES ECC error; ii: illegal address; iii: illegal packet type; iv: FIFO underflow/overflow; v: SERDES lane hang.
CN202110514566.6A 2021-05-12 2021-05-12 Design method of calculation amount proving system based on stream processing chip Active CN113220625B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110514566.6A CN113220625B (en) 2021-05-12 2021-05-12 Design method of calculation amount proving system based on stream processing chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110514566.6A CN113220625B (en) 2021-05-12 2021-05-12 Design method of calculation amount proving system based on stream processing chip

Publications (2)

Publication Number Publication Date
CN113220625A (en) 2021-08-06
CN113220625B (en) 2024-05-14

Family

ID=77095290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110514566.6A Active CN113220625B (en) 2021-05-12 2021-05-12 Design method of calculation amount proving system based on stream processing chip

Country Status (1)

Country Link
CN (1) CN113220625B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7203890B1 (en) * 2004-06-16 2007-04-10 Azul Systems, Inc. Address error detection by merging a polynomial-based CRC code of address bits with two nibbles of data or data ECC bits
CN106648968A (en) * 2016-10-19 2017-05-10 盛科网络(苏州)有限公司 Data recovery method and device when ECC correction failure occurs on chip
CN108259333A (en) * 2016-12-29 2018-07-06 华为技术有限公司 A kind of BUM flow control methods, relevant apparatus and system
CN110297779A (en) * 2018-03-23 2019-10-01 余晓鹏 A kind of solution of memory intractability algorithm
CN110493310A (en) * 2019-07-17 2019-11-22 中国人民解放军战略支援部队信息工程大学 A kind of protocol controller and method of software definition
CN111600872A (en) * 2020-05-13 2020-08-28 中国人民解放军国防科技大学 Access validity check controller, chip and device


Also Published As

Publication number Publication date
CN113220625A (en) 2021-08-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant