CN112214427B - Cache structure, workload proving operation chip circuit and data calling method thereof - Google Patents

Cache structure, workload proving operation chip circuit and data calling method thereof

Info

Publication number
CN112214427B
CN112214427B (application CN202011079281.6A)
Authority
CN
China
Prior art keywords
request
unit
cache
calculation
computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011079281.6A
Other languages
Chinese (zh)
Other versions
CN112214427A (en)
Inventor
汪福全 (Wang Fuquan)
刘明 (Liu Ming)
蔡凯 (Cai Kai)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenglong Singapore Pte Ltd
Original Assignee
Sunlune Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sunlune Technology Beijing Co Ltd filed Critical Sunlune Technology Beijing Co Ltd
Priority to CN202011079281.6A priority Critical patent/CN112214427B/en
Publication of CN112214427A publication Critical patent/CN112214427A/en
Application granted granted Critical
Publication of CN112214427B publication Critical patent/CN112214427B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/16Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/78Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G06F15/781On-chip cache; Off-chip memory

Abstract

The application relates to a cache structure, a workload proving operation chip circuit, and a data calling method thereof. The cache structure comprises a plurality of first cache regions, at least one request screening unit, and a plurality of second cache regions. The first cache regions cache received calculation requests; the request screening unit screens out, from the calculation requests, those whose requested access addresses differ; the second cache regions cache the calculation requests screened by the request screening unit. The technical scheme provided by the application improves the random access performance of calculation requests.

Description

Cache structure, workload proving operation chip circuit and data calling method thereof
Technical Field
The present invention relates to the field of integrated circuit technology, and more particularly, to a cache structure, a workload proving operation chip circuit, and a data calling method thereof.
Background
Proof of Work (PoW) is a consensus mechanism used by mainstream cryptocurrencies such as Ether (Ethereum). Its basic characteristic is that a large number of hash operations must be performed to find a hash value that satisfies a given difficulty target. Cryptocurrencies centered on the ETHASH algorithm, however, require a data set larger than 1 GB and frequent access to that data set during the proof-of-work computation.
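To make the hash-search workload concrete, the following is a minimal sketch of a proof-of-work loop. It is an illustration only, not the patent's method: ETHASH itself uses Keccak-based hashing plus reads from the large data set (exactly what makes memory access the bottleneck), whereas this sketch uses plain SHA-256 and invented names.

```python
import hashlib

def proof_of_work(block_header: bytes, difficulty_target: int) -> int:
    """Search for a nonce whose hash falls below the difficulty target."""
    nonce = 0
    while True:
        digest = hashlib.sha256(block_header + nonce.to_bytes(8, "little")).digest()
        if int.from_bytes(digest, "big") < difficulty_target:
            return nonce  # this hash satisfies the difficulty condition
        nonce += 1

# Example with a very easy target so the loop terminates quickly.
print("found nonce:", proof_of_work(b"example-header", 1 << 248))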
The traditional approach stores the data set in a separate external memory off the computing chip, but this is limited by memory bandwidth and performs poorly. To increase bandwidth, the conventional structure shown in fig. 1 connects each computing unit to a routing unit; the routing units connect to arbitration units through a crossbar switch, and each arbitration unit connects to a storage unit. Based on the structure shown in fig. 1, a computing unit issues a calculation request as follows:
as shown in fig. 1, since one computing unit is connected to one routing unit, the computing request sent by each computing unit is transmitted to the routing unit connected to the computing unit, and then transmitted to each arbitration unit by the routing unit through the crossbar switch, until the computing request sent by the computing unit is arbitrated by some arbitration unit, and then is sent to the corresponding storage unit to execute the request, the computing unit may send the next computing request to the routing unit connected to the computing unit.
With the structure of fig. 1 and this sending mechanism, when several computing units request the same access address, the arbitration unit accepts only one of them, resulting in low access efficiency.
Disclosure of Invention
In view of the above, the present invention provides a cache structure, a workload proving operation chip circuit and a data calling method thereof, so as to improve the access efficiency of the workload proving operation chip data.
The invention provides a cache structure, comprising:
a plurality of first cache regions, at least one request screening unit, and a plurality of second cache regions;
the first cache region is used for caching the received calculation request;
the request screening unit is used for screening out, from the calculation requests, calculation requests whose requested access addresses differ;
the second cache region is used for caching the calculation requests screened by the request screening unit.
Therefore, by arranging the first cache regions, multiple calculation requests from the same computing unit can be cached; these pass through the request screening unit, which yields calculation requests with distinct access addresses; and the screened requests are cached in the second cache regions.
As an implementation manner of the first aspect, the first cache region is specifically configured to: cache the received calculation requests according to a first-in, first-stored rule.
As an implementation manner of the first aspect, the second cache region is specifically configured to: cache the calculation requests screened by the request screening unit according to a first-in, first-stored rule.
Therefore, with the first-in, first-stored rule, the corresponding calculation requests can be cached in order in the first cache regions and the second cache regions.
A workload proving operation chip circuit comprising the above cache structure, the circuit comprising: a computing unit, a cache structure, a routing unit, an arbitration unit, and a storage unit, connected in sequence;
the computing unit is used for sending a computing request to the cache structure according to the received computing task;
the routing unit is used for determining an access path of the calculation request screened by the cache structure and sending the access path of the calculation request to the arbitration unit;
the arbitration unit is used for arbitrating the access path of the received calculation request, and if the access path of the calculation request meets arbitration conditions, the calculation request is sent to a corresponding storage unit to call related data through the access path;
the storage unit is used for storing the data set of the workload proving operation chip.
As an implementation manner of the second aspect, the computing unit, the cache structure, the routing unit, the arbitration unit, and the storage unit are respectively multiple, where:
each computing unit, each cache structure and each routing unit are connected in a one-to-one correspondence manner;
each arbitration unit is connected with each storage unit in a one-to-one correspondence manner;
each routing unit is fully connected with each arbitration unit.
As an implementation of the second aspect, the routing unit and the arbitration unit are connected by a crossbar.
Therefore, adding the cache structure to the workload proving operation chip circuit improves the efficiency of the whole crossbar switch and thereby the access performance of chip data.
A data calling method of the workload proving operation chip circuit comprises the following steps:
issuing calculation requests according to a calculation task;
screening out, from the calculation requests, those whose requested access addresses differ;
determining, according to a preset routing table, access paths for the screened calculation requests, and arbitrating the determined access paths;
respectively calling the data required by the calculation requests that pass arbitration.
In summary, the present invention improves data access efficiency.
Drawings
FIG. 1 is a diagram illustrating a workload proving computing chip according to the prior art;
fig. 2 is a schematic structural diagram of a cache structure according to an embodiment of the present disclosure;
fig. 3 is a circuit diagram of a workload verification operation chip according to an embodiment of the present application.
Detailed Description
To facilitate an understanding of the present application, the present application will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present application are shown in the drawings. This application may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
In the following description, the terms "first", "second", "third", etc., or module A, module B, module C, etc., are used solely to distinguish similar objects and do not denote a particular order or importance of the objects; where permissible, the specific order may be interchanged so that the embodiments of the application described herein can be practiced in an order other than that shown or described.
In the following description, reference numerals indicating steps, such as S100, S200, etc., do not necessarily indicate that the steps are performed in that order; where permissible, the order of the steps may be interchanged, or steps may be performed simultaneously.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
As shown in fig. 2, one embodiment of the present application provides a cache structure, which includes: the system comprises a plurality of first cache regions, at least one request screening unit and a plurality of second cache regions. The input end of the request screening unit is connected with the output end of each first cache region, and the output end of the request screening unit is connected with the input end of each second cache region.
The first cache region is used for caching a received calculation request and sending it to the request screening unit. Caching may follow a preset rule, for example caching requests in slots with addresses from small to large in order of arrival. In fig. 2, the cache address of first cache region a is 0, that of first cache region b is 1, and so on, up to first cache region M with cache address x. When the cache structure receives calculation requests, the first received request is cached in first cache region a (address 0), the second in first cache region b (address 1), and so on, until all calculation requests are cached or all first cache region slots are occupied.
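A minimal Python sketch of this slot-allocation rule follows; the class and field names are our own illustrative assumptions, not the patent's implementation.

```python
class FirstCacheRegions:
    """Toy model: M single-request slots with cache addresses 0..M-1.
    A newly received calculation request is cached in the lowest free slot."""

    def __init__(self, num_regions):
        self.slots = [None] * num_regions  # slot i models first cache region i

    def cache_request(self, request):
        for addr, occupant in enumerate(self.slots):
            if occupant is None:
                self.slots[addr] = request  # cached at lowest free cache address
                return True
        return False  # all first cache regions are occupied
```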
The request screening unit is used for screening out, from the calculation requests held in the first cache regions, those whose requested access addresses differ, and sending the screened requests to the second cache regions. For example, suppose the first calculation request received by the screening unit accesses address 1, the second accesses the same address as the first, and the third accesses address 2. The screening unit then selects the requests with distinct access addresses, namely the first and third calculation requests, and sends them to the second cache regions simultaneously; correspondingly, the two first cache regions holding those requests release their storage resources. When several requests target the same address, a first-in first-out policy applies: of the first and second requests both accessing address 1 in this example, the first was stored into the first cache region earlier and is therefore selected preferentially; the same policy governs the screening of subsequent calculation requests.
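The screening step then amounts to address deduplication with first-in priority. The sketch below continues the same toy model; the mechanism shown is an assumption, since the patent describes behaviour rather than an implementation.

```python
class RequestScreeningUnit:
    """Toy model: pick at most one request per distinct access address from the
    first cache regions and move the picks to the second cache regions."""

    def screen(self, first, second):
        seen_addresses = set()
        # Iterating slots in address order approximates first-in, first-out;
        # a faithful model would track arrival timestamps per request.
        for addr, request in enumerate(first.slots):
            if request is None:
                continue
            if request["address"] in seen_addresses:
                continue  # same address already picked: this request waits
            seen_addresses.add(request["address"])
            second.append(request)    # forward the screened request
            first.slots[addr] = None  # release the first cache region slot
```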
The second cache region is used for caching the calculation requests screened by the request screening unit. Its caching scheme is the same as that of the first cache region, so it is not described again.
In this embodiment, the numbers of first cache regions, request screening units, and second cache regions can each be adjusted to balance resources against efficiency: when the use case demands higher efficiency, more first cache regions, request screening units, and second cache regions can be provided; when efficiency requirements are modest or cost is constrained, fewer suffice.
As shown in fig. 3, another embodiment of the present application provides a workload proving operation chip circuit including the above-mentioned cache structure, the circuit including: the system comprises a computing unit, a cache structure, a routing unit, an arbitration unit and a storage unit which are connected in sequence.
The computing unit is used for sending a computing request to the cache structure according to the received computing task. The number of calculation requests is not limited here.
The cache structure is configured to cache and screen the computation requests sent by the computation unit, and the specific cache rule and the screening rule are the same as those in the previous embodiment, so that details thereof are not repeated here.
The routing unit is used for determining the access path of the calculation request screened by the cache structure and sending the access path of the calculation request to the arbitration unit;
the arbitration unit is used for arbitrating the access path of the received calculation request, and if the access path of the calculation request meets the arbitration condition, the calculation request is sent to the corresponding storage unit to call the related data through the access path.
The storage unit is used for storing the data set of the workload proving operation chip.
Here, taking as an example the first computing unit in fig. 3 issuing three calculation requests, the specific working process of the workload proving operation chip circuit in this embodiment is described:
the first step, the first computing unit sends a first computing request, a second computing request and a third computing request to a first cache structure connected with the first computing request according to the computing task;
In the second step, the first cache regions in the first cache structure cache the first, second, and third calculation requests according to the preset rule. The request screening unit in the first cache structure screens the requested access addresses of the three requests: if the first calculation request accesses address 1, the second also accesses address 1, and the third accesses address 2, the screening unit sends the first and third calculation requests simultaneously to the second cache regions in the first cache structure. The caching rule for the screened requests is the preset rule of the previous embodiment.
In the third step, the second cache regions in the first cache structure send the first and third calculation requests to the first routing unit connected to the first cache structure. The first routing unit determines the access paths of the two requests and sends them through the crossbar switch to the first through k-th arbitration units shown in fig. 3 for arbitration. Once the access path of, say, the first calculation request is granted by some arbitration unit, that arbitration unit sends the request to the storage unit connected to it, so as to call the related data.
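Using the toy classes sketched earlier, this three-request walk-through can be reproduced roughly as follows (again illustrative, with invented request records):

```python
first = FirstCacheRegions(num_regions=4)
second = []  # models the second cache regions

# The first computing unit issues three requests; two share access address 1.
for req in ({"id": 1, "address": 1},
            {"id": 2, "address": 1},
            {"id": 3, "address": 2}):
    first.cache_request(req)

RequestScreeningUnit().screen(first, second)
print([r["id"] for r in second])  # [1, 3]: these proceed to the routing unit,
                                  # while request 2 waits in the first regions
```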
It can be seen from the above that, since the workload proving operation chip has the cache structure, each routing unit can send out several calculation requests at the same time, improving the working efficiency of the whole chip.
With reference to fig. 2 and 3, the following provides an implementation of the present application:
First, the computing unit issues calculation requests (their number is not limited). Each request is first cached at a free cache address in the first cache regions of the cache structure: if cache address 0 is empty, the request is cached there; if a request already occupies address 0, the new request is cached at address 1, and so on. The computing unit can keep sending calculation requests until the first cache regions of the cache structure are full.
Secondly, the request screening unit in the cache structure selects, from all the calculation requests in the first cache regions, those with non-conflicting addresses and fills them into the second cache regions; the filling rule is the same as that of the first cache regions.
Then the calculation requests in the second cache regions are sent through the routing unit to participate simultaneously in the arbitration of the first through k-th arbitration units. If no calculation request satisfies the arbitration condition, none is forwarded to the storage units; if one request satisfies it, that request is forwarded to a storage unit through one of the arbitration units; and if several requests satisfy it, they are forwarded to the storage units through the corresponding arbitration units.
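The arbitration step can be sketched as each storage unit granting at most one request per cycle. The grant policy below (random choice among contenders) and the address-to-unit mapping are assumptions; the patent does not specify the arbitration condition.

```python
import random

def arbitrate(pending, num_storage_units):
    """Toy arbitration: map each request to a storage unit by address and let
    each arbitration/storage unit pair grant one request per cycle."""
    by_unit = {}
    for req in pending:
        by_unit.setdefault(req["address"] % num_storage_units, []).append(req)
    granted = [random.choice(contenders) for contenders in by_unit.values()]
    return granted  # at most one grant per storage unit this cycle
```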
The efficiency of the cache structure provided by the present application is verified as follows:
for example, the number of calculation units is 32, and the number of storage units is 32. If the chip circuit structure shown in fig. 1 is used, when the computing unit sends a computing request to the routing unit, the routing unit sends the computing request to the arbitration unit for arbitration, and if the computing request does not pass arbitration, the computing request occupies the position of the routing unit, so that the computing unit cannot continue to send a new computing request, and the computing unit cannot send a next computing request until the computing request passes arbitration of the arbitration unit. If the address of the computation request issued by the computation unit is completely random or close to random, the efficiency of the structure is:
[The efficiency formulas appear only as images in the source (BDA0002717460670000071 and BDA0002717460670000072) and are not recoverable from the text.]
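For orientation, a standard back-of-the-envelope estimate for this situation (our assumption, not the patent's own derivation) is that with N computing units, each holding one outstanding request to one of N storage units and addresses uniformly random, the fraction of requests granted per cycle is:

```latex
E(N) = 1 - \left(1 - \frac{1}{N}\right)^{N}
     \;\xrightarrow{\;N \to \infty\;}\; 1 - \frac{1}{e} \approx 63.2\%,
\qquad
E(32) = 1 - \left(\frac{31}{32}\right)^{32} \approx 63.8\%
```

This is consistent with the 78% to 95% figures reported below being a substantial improvement over the baseline.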
When the cache structure provided by the present application is used, a calculation request sent by a computing unit is first cached in the cache structure, and the computing unit can continue sending requests until the first cache regions of the cache structure are full. Meanwhile, one or more requests from the cache structure can be sent through the routing unit to the arbitration units for arbitration simultaneously. Under this configuration, again with 32 computing units and 32 storage units:
if the number of the first buffer areas in the buffer structure is 4 and the number of the second buffer areas in the buffer structure is 4, the efficiency of the structure can reach 78%.
If the number of the first buffer areas in the buffer structure is 4 and the number of the second buffer areas is 8, the efficiency of the structure can reach 90%.
If the number of the first buffer areas in the buffer structure is 8 and the number of the second buffer areas is 12, the efficiency of the structure can reach 94%.
If the number of the first buffer areas in the buffer structure is 12 and the number of the second buffer areas is 12, the efficiency of the structure can reach 95%.
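These figures can be sanity-checked in trend (not in exact value) with a rough Monte Carlo model. The model below assumes uniformly random addresses, one grant per storage unit per cycle, and a fixed number of in-flight requests per computing unit standing in for the cache depth; it does not model the patent's exact microarchitecture.

```python
import random

def modelled_efficiency(num_units=32, num_banks=32, outstanding=4,
                        cycles=20_000):
    """Grants per bank per cycle when each unit keeps `outstanding` random
    requests in flight (the cache structure's effect) and each bank serves
    one request per cycle; ungranted requests retry with the same address."""
    inflight = [random.randrange(num_banks)
                for _ in range(num_units * outstanding)]
    grants = 0
    for _ in range(cycles):
        targeted = set(inflight)  # banks with at least one pending request
        grants += len(targeted)
        for bank in targeted:
            served = inflight.index(bank)                   # serve one request
            inflight[served] = random.randrange(num_banks)  # issue a new one
    return grants / (cycles * num_banks)

for depth in (1, 4, 8, 12):
    print(f"outstanding={depth}: efficiency ~ {modelled_efficiency(outstanding=depth):.0%}")
```

With depth 1 this lands roughly at the ~60% baseline, and the modelled efficiency climbs as the depth grows, matching the qualitative claim that deeper caching raises crossbar utilisation.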
In practical applications, the numbers of computing units, storage units, and first and second cache regions in the cache structure can be set according to actual requirements.
In conclusion, the cache structure provided by the present application significantly improves access efficiency.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations are described; however, any combination of these technical features that contains no contradiction should be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention, and while their description is specific and detailed, they should not be construed as limiting the scope of the invention. A person skilled in the art can make several variations and improvements without departing from the inventive concept, and these fall within the protection scope of the invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (7)

1. A cache structure, the cache structure comprising:
a plurality of first cache regions, at least one request screening unit, and a plurality of second cache regions;
the first cache region is used for caching the received calculation request;
the request screening unit is used for screening out, from the calculation requests, calculation requests whose requested access addresses differ;
the second cache region is used for caching the calculation requests screened by the request screening unit.
2. The structure of claim 1, wherein the first cache region is specifically configured to: cache the received calculation requests according to a first-in, first-stored rule.
3. The structure of claim 1, wherein the second cache region is specifically configured to: cache the calculation requests screened by the request screening unit according to a first-in, first-stored rule.
4. A workload proving operation chip circuit comprising the cache structure of any one of claims 1 to 3, the circuit comprising: a computing unit, a cache structure, a routing unit, an arbitration unit, and a storage unit, connected in sequence;
the computing unit is used for sending a computing request to the cache structure according to the received computing task;
the routing unit is used for determining an access path of the calculation request screened by the cache structure and sending the access path of the calculation request to the arbitration unit;
the arbitration unit is used for arbitrating the access path of the received calculation request, and if the access path of the calculation request meets arbitration conditions, the calculation request is sent to a corresponding storage unit to call related data through the access path;
the storage unit is used for storing the data set of the workload proving operation chip.
5. The circuit according to claim 4, wherein the computing unit, the cache structure, the routing unit, the arbitration unit and the storage unit are respectively plural, and wherein:
each computing unit, each cache structure and each routing unit are connected in a one-to-one correspondence manner;
each arbitration unit is connected with each storage unit in a one-to-one correspondence manner;
each routing unit is fully connected with each arbitration unit.
6. The circuit of claim 5, wherein the routing unit and the arbitration unit are connected by a crossbar.
7. A data calling method based on the workload proving operation chip circuit as claimed in any one of claims 4 to 6, wherein the method comprises:
sending out a calculation request according to the calculation task;
screening out, from the calculation requests, calculation requests whose requested access addresses differ;
determining, according to a preset routing table, access paths for the screened calculation requests, and arbitrating the determined access paths;
respectively calling the data required by the calculation requests that satisfy the arbitration condition.
CN202011079281.6A 2020-10-10 2020-10-10 Cache structure, workload proving operation chip circuit and data calling method thereof Active CN112214427B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011079281.6A CN112214427B (en) 2020-10-10 2020-10-10 Cache structure, workload proving operation chip circuit and data calling method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011079281.6A CN112214427B (en) 2020-10-10 2020-10-10 Cache structure, workload proving operation chip circuit and data calling method thereof

Publications (2)

Publication Number Publication Date
CN112214427A CN112214427A (en) 2021-01-12
CN112214427B 2022-02-11

Family

ID=74053439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011079281.6A Active CN112214427B (en) 2020-10-10 2020-10-10 Cache structure, workload proving operation chip circuit and data calling method thereof

Country Status (1)

Country Link
CN (1) CN112214427B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113051194B (en) * 2021-03-02 2023-06-09 长沙景嘉微电子股份有限公司 Buffer memory, GPU, processing system and buffer access method
CN113435148B (en) * 2021-06-04 2022-11-08 上海天数智芯半导体有限公司 Parameterized cache digital circuit micro-architecture and design method thereof
CN114915594B (en) * 2022-07-14 2022-09-30 中科声龙科技发展(北京)有限公司 Method for balancing routing, network interconnection system, cross switch device and chip
CN115002050B (en) * 2022-07-18 2022-09-30 中科声龙科技发展(北京)有限公司 Workload proving chip
CN114928577B (en) * 2022-07-19 2022-10-21 中科声龙科技发展(北京)有限公司 Workload proving chip and processing method thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0384102A2 (en) * 1989-02-22 1990-08-29 International Business Machines Corporation Multi-processor caches with large granularity exclusivity locking
CA2920528A1 (en) * 2013-08-06 2015-02-12 Huawei Technologies Co., Ltd. Memory access processing method and apparatus, and system
CN106569727A (en) * 2015-10-08 2017-04-19 福州瑞芯微电子股份有限公司 Shared parallel data reading-writing apparatus of multi memories among multi controllers, and reading-writing method of the same
CN106886498A (en) * 2017-02-28 2017-06-23 华为技术有限公司 Data processing equipment and terminal
CN111666239A (en) * 2020-07-10 2020-09-15 深圳开立生物医疗科技股份有限公司 Master-slave equipment interconnection system and master-slave equipment access request processing method


Also Published As

Publication number Publication date
CN112214427A (en) 2021-01-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240315

Address after: 10 Jialeng Road, Singapore # 09-11

Patentee after: Shenglong (Singapore) Pte. Ltd.

Country or region after: Singapore

Address before: 100190 3a18-01, floor 3a, building 9, North Fourth Ring Road West, Haidian District, Beijing

Patentee before: SUNLUNE TECHNOLOGY DEVELOPMENT (BEIJING) Co.,Ltd.

Country or region before: China
