CN110716797A - DDR4 performance balance scheduling structure and method for multiple request sources - Google Patents


Info

Publication number
CN110716797A
Authority
CN
China
Prior art keywords
source
access request
memory access
request
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910852485.XA
Other languages
Chinese (zh)
Inventor
吕晖
石嵩
刘骁
吴铁彬
赵冠一
王迪
王吉军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Jiangnan Computing Technology Institute filed Critical Wuxi Jiangnan Computing Technology Institute
Priority to CN201910852485.XA priority Critical patent/CN110716797A/en
Publication of CN110716797A publication Critical patent/CN110716797A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

The invention relates to the technical field of computer architecture and processor microarchitecture, and in particular to a DDR4 performance-balanced scheduling structure and method for multiple request sources. The scheduling structure comprises a plurality of memory access request scheduling buffers, used to improve the memory access bandwidth of the corresponding request sources; a multi-source continuous arbitration component, used to select one memory access request to issue; and a DDR4 memory device, which receives the requests issued by the multi-source continuous arbitration component. The scheduling method comprises L1, setting up a memory access request scheduling buffer for the requests of each memory access request source; and L2, the multi-source continuous arbitration component selecting one memory access request to issue according to an arbitration policy. By providing a separate scheduling buffer for each request source, the invention improves memory access bandwidth while reducing the impact on memory access delay, improving the overall memory access performance of the system.

Description

DDR4 performance balance scheduling structure and method for multiple request sources
Technical Field
The invention relates to the technical field of computer system structures and processor microstructures, in particular to a DDR4 performance balance scheduling structure and method for multiple request sources.
Background
With continued progress in processor fabrication technology and the demands of practical applications, multi-core architectures have become the development trend of high-performance microprocessors. The resulting "memory wall" — the difficulty of matching the memory access bandwidth and delay of a many-core processor system to its computing performance — is a hot topic in current computer architecture research.
To improve memory access bandwidth, many-core processors employ large-scale memory access request scheduling buffers. However, large scheduling buffers greatly increase memory access delay. In a multi-source request sequence, some sources need higher access bandwidth while others need shorter access delay; that is, some sources are delay-sensitive and others are bandwidth-sensitive. Traditional scheduling mechanisms do not take these per-source characteristics into account: they can maximize bandwidth utilization, but this is not conducive to the overall performance of the chip.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a DDR4 performance balance scheduling structure and method facing multiple request sources.
The technical solution adopted by the invention to solve the above problems is as follows. A DDR4 performance-balanced scheduling structure for multiple request sources comprises:
a plurality of memory access request scheduling buffers, used to improve the memory access bandwidth of the corresponding memory access request sources;
a multi-source continuous arbitration component, used to select one memory access request to issue;
a DDR4 memory device, used to receive the memory access requests issued by the multi-source continuous arbitration component.
Preferably, the memory access request scheduling buffers comprise bandwidth-sensitive memory access scheduling buffers and delay-sensitive memory access scheduling buffers.
Preferably, the bandwidth-sensitive memory access scheduling buffer comprises:
storage entries, used to record the information of memory access requests;
an empty-entry queue, on which free storage entries are mounted in queue form;
a scheduling binary tree, used to organize occupied storage entries in the form of a binary tree.
Preferably, the information of a memory access request comprises the request information itself, a left child pointer of the entry, and a right child pointer of the entry.
A DDR4 performance-balanced scheduling method for multiple request sources comprises:
L1, setting up a memory access request scheduling buffer for the requests of each memory access request source;
L2, the multi-source continuous arbitration component selecting one memory access request to issue according to an arbitration policy;
L3, the DDR4 memory device receiving the memory access requests issued by the multi-source continuous arbitration component.
Preferably, the arbitration policy in L2 is specifically:
1) the highest priority rotates among the arbitration sources;
2) the arbitration source holding the highest priority releases it only after N of its memory access requests have been arbitrated consecutively; upon release, its own priority is set to the lowest and the priorities of all other arbitration sources are incremented by one.
Preferably, in L1 a bandwidth-sensitive memory access scheduling buffer is set up for the requests of each bandwidth-sensitive memory access request source,
and a delay-sensitive memory access scheduling buffer is set up for the requests of each delay-sensitive memory access request source.
Preferably, the bandwidth-sensitive memory access scheduling buffer comprises:
storage entries, used to record the information of memory access requests;
an empty-entry queue, on which free storage entries are mounted in queue form;
a scheduling binary tree, used to organize occupied storage entries in the form of a binary tree.
Preferably, the information of a memory access request comprises the request information itself, a left child pointer of the entry, and a right child pointer of the entry.
The invention has the advantage that, by providing a separate memory access request scheduling buffer for each of the multiple request sources, it improves memory access bandwidth while reducing the impact on memory access delay, improving the overall memory access performance of the system.
Drawings
Fig. 1 is a schematic structural diagram of a multi-request-source-oriented DDR4 performance balancing scheduling structure according to the present application.
Detailed Description
The technical solution of the invention is further explained below through specific embodiments in conjunction with the drawings.
As shown in FIG. 1, in a first embodiment, a DDR4 performance-balanced scheduling structure for multiple request sources comprises:
a plurality of memory access request scheduling buffers, used to improve the memory access bandwidth of the corresponding memory access request sources;
a multi-source continuous arbitration component, used to select one memory access request to issue;
a DDR4 memory device, used to receive the memory access requests issued by the multi-source continuous arbitration component.
By providing a separate memory access request scheduling buffer for each of the multiple request sources, the method and device reduce the mutual influence of memory access delay among the sources, yielding a scheduling structure that balances memory access delay and memory access bandwidth.
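As an illustration only (the patent describes the structure, not an implementation), the flow above can be sketched as one scheduling step in Python: each request source owns a buffer, an arbitration function picks one source with a pending request, and the winning request is handed to the memory device. The names `schedule_cycle`, `arbiter_pick`, and `ddr4_issue` are hypothetical placeholders.

```python
from collections import deque

def schedule_cycle(buffers, arbiter_pick, ddr4_issue):
    """One step of the multi-source scheduling structure (sketch).

    buffers      -- dict: source id -> deque of pending requests
    arbiter_pick -- stands in for the multi-source arbitration component;
                    given {source: head request}, returns the winning source
    ddr4_issue   -- stands in for the DDR4 memory device receiving a request
    """
    # Each buffer exposes only its oldest pending request to the arbiter.
    ready = {src: q[0] for src, q in buffers.items() if q}
    if not ready:
        return None  # nothing to schedule this cycle
    src = arbiter_pick(ready)     # arbitration selects exactly one source
    req = buffers[src].popleft()  # request leaves its scheduling buffer
    ddr4_issue(req)               # and is transmitted to the memory device
    return src, req
```

A real scheduling buffer would reorder requests internally (for example, for row-buffer locality) rather than expose a simple FIFO head; this sketch only shows the buffer-arbiter-device flow.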
A DDR4 performance-balanced scheduling method for multiple request sources comprises:
L1, setting up a memory access request scheduling buffer for the requests of each memory access request source;
L2, the multi-source continuous arbitration component selecting one memory access request to issue according to an arbitration policy;
L3, the DDR4 memory device receiving the memory access requests issued by the multi-source continuous arbitration component.
The arbitration policy is specifically:
1) the highest priority rotates among the arbitration sources;
2) the arbitration source holding the highest priority releases it only after N of its memory access requests have been arbitrated consecutively; upon release, its own priority is set to the lowest and the priorities of all other arbitration sources are incremented by one.
First, a memory access request scheduling buffer is set up for each memory access request source. The buffer exploits locality in the memory access sequence to improve memory access bandwidth.
Second, the multiple memory access request scheduling buffers present their requests to the arbitration component, which performs a many-to-one selection and issues one memory access request to the DDR4 memory device. The arbitration policy is:
(1) The highest priority rotates among the arbitration sources.
(2) The arbitration source holding the highest priority releases it only after N of its requests have been arbitrated consecutively. Upon release, its own priority is set to the lowest, and the priorities of all other arbitration sources are incremented by one.
For example, suppose the initial priorities of arbitration sources one through four are all 1 and N = 5. The highest priority is first given to source one; after source one has had 5 requests arbitrated consecutively, its priority drops to 0 and the priorities of sources two, three, and four all become 2. The highest priority then passes to source two; after its 5 consecutive grants, its priority drops to 0, source one's priority becomes 1, and sources three and four become 3. Next, source three receives the highest priority; after its 5 grants, its priority drops to 0, source one becomes 2, source two becomes 1, and source four becomes 4. Then source four receives the highest priority; after its 5 grants, its priority drops to 0, source one becomes 3, source two becomes 2, and source three becomes 1. The highest priority then returns to source one; after its 5 grants, its priority drops to 0, source two becomes 3, source three becomes 2, source four becomes 1, and so on.
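The rotating-priority walk-through above can be reproduced with a short Python sketch. The class and method names are illustrative (the patent describes only the policy, not an implementation), sources are numbered from 0 rather than 1, and ties between equal priorities are assumed to break toward the lower source index, which matches the order in the example.

```python
class RotatingPriorityArbiter:
    """Sketch of the rotating-highest-priority policy: the source with
    the highest priority wins until N of its requests have been granted
    consecutively; it then drops to the lowest priority and all other
    sources move up by one."""

    def __init__(self, num_sources: int, n_consecutive: int):
        self.priorities = [1] * num_sources  # all sources start equal
        self.n = n_consecutive
        self.grants = 0  # consecutive grants to the current holder

    def current_holder(self) -> int:
        # Highest priority wins; ties break toward the lower index
        # (an assumption, chosen to reproduce the worked example).
        return max(range(len(self.priorities)),
                   key=lambda i: (self.priorities[i], -i))

    def grant(self) -> int:
        holder = self.current_holder()
        self.grants += 1
        if self.grants == self.n:
            # Release: the holder drops to the lowest priority
            # and every other source is incremented by one.
            self.priorities = [0 if i == holder else p + 1
                               for i, p in enumerate(self.priorities)]
            self.grants = 0
        return holder
```

With four sources and N = 5, twenty successive grants go to sources 0, 1, 2, 3 in blocks of five, and the priority vector afterwards is [3, 2, 1, 0], matching the state described in the example.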
Here, each arbitration source corresponds to the scheduling buffer of a different memory access request source.
By providing a separate memory access request scheduling buffer for each of the multiple request sources, memory access bandwidth is improved while the impact on memory access delay is reduced, improving the overall memory access performance of the system.
In a second embodiment, building on the first embodiment, the memory access request scheduling buffers comprise bandwidth-sensitive memory access scheduling buffers and delay-sensitive memory access scheduling buffers.
The bandwidth-sensitive memory access scheduling buffer comprises:
storage entries, used to record the information of memory access requests; this information comprises the request information itself, a left child pointer of the entry, and a right child pointer of the entry;
an empty-entry queue, on which free storage entries are mounted in queue form;
a scheduling binary tree, used to organize occupied storage entries in the form of a binary tree.
First, each storage entry of the bandwidth-sensitive memory access scheduling buffer holds three pieces of information: the memory access request information, the entry's left child pointer, and the entry's right child pointer. The storage entries are organized into two structures: an empty-entry queue and a scheduling binary tree. In the initial state, all storage entries are on the empty-entry queue and the scheduling binary tree is empty.
Second, when a new memory access request arrives, a storage entry is taken from the empty-entry queue and filled with the request's information. The scheduling binary tree is then searched using that information: if a node with the same request information already exists in the tree, the new request is mounted on that node's left child pointer; if no such node exists, the new request is mounted on the right child pointer of the rightmost node of the tree.
Third, whenever the scheduling binary tree is not empty, its root node is selected for issue. At that point:
(1) If the root's left child pointer is not null, the left child becomes the new root of the tree, and the original root's right child is mounted on the new root's right child pointer.
(2) If the root's left child pointer is null, the root's right child becomes the new root.
Finally, the storage entry of the issued request is returned to the empty-entry queue.
Because memory access requests reaching the bandwidth-sensitive scheduling buffer are organized into a binary tree, only the root node needs to be examined at issue time, so large-scale request scheduling remains practical even when a large number of requests are pending.
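The entry pool, empty-entry queue, and scheduling binary tree described above can be sketched as follows. This is an interpretive reconstruction: the patent does not define what makes two requests "the same" (plain `==` on the request information stands in here, e.g. for matching DRAM row addresses), and the search walks only the right spine on the assumption that same-information requests always chain down left pointers. All class and method names are hypothetical.

```python
from collections import deque

class Entry:
    """One storage entry: request information plus left/right child pointers."""
    def __init__(self):
        self.info = None
        self.left = None
        self.right = None

class BandwidthSensitiveBuffer:
    """Sketch of the bandwidth-sensitive scheduling buffer: a fixed pool
    of entries (the empty-entry queue) organized into a scheduling
    binary tree whose right spine holds distinct request classes and
    whose left chains hold same-information requests in arrival order."""

    def __init__(self, capacity: int):
        self.free = deque(Entry() for _ in range(capacity))  # empty-entry queue
        self.root = None  # the scheduling binary tree starts empty

    def insert(self, info) -> bool:
        if not self.free:
            return False  # buffer full; caller must stall the request
        entry = self.free.popleft()
        entry.info, entry.left, entry.right = info, None, None
        if self.root is None:
            self.root = entry
            return True
        # Search the right spine for a node with the same request info.
        node, rightmost = self.root, self.root
        while node is not None:
            if node.info == info:
                # Same-information requests chain down the left pointers.
                while node.left is not None:
                    node = node.left
                node.left = entry
                return True
            rightmost = node
            node = node.right
        # No match: mount on the right child of the rightmost node.
        rightmost.right = entry
        return True

    def issue(self):
        if self.root is None:
            return None
        old = self.root
        if old.left is not None:
            # The left child becomes the new root and inherits the
            # old root's right subtree.
            new_root = old.left
            new_root.right = old.right
        else:
            new_root = old.right
        self.root = new_root
        info = old.info
        old.info = old.left = old.right = None
        self.free.append(old)  # return the entry to the empty-entry queue
        return info
```

For requests arriving in the order A, B, A, C (where the letter stands for the request information, e.g. a row address), the buffer issues A, A, B, C: the second A is chained under the first and drains before the tree moves on to B, which is the locality-mining behavior the description attributes to this structure.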
The above embodiments merely illustrate preferred embodiments of the present invention and do not limit its spirit and scope. Those skilled in the art may make various modifications and improvements to the technical solution of the present invention without departing from its design concept; the protection scope of the present invention is defined by the claims.

Claims (9)

1. A DDR4 performance-balanced scheduling structure for multiple request sources, characterized by comprising:
a plurality of memory access request scheduling buffers, used to improve the memory access bandwidth of the corresponding memory access request sources;
a multi-source continuous arbitration component, used to select one memory access request to issue;
a DDR4 memory device, used to receive the memory access requests issued by the multi-source continuous arbitration component.
2. The multi-request-source-oriented DDR4 performance-balanced scheduling structure of claim 1, characterized in that the memory access request scheduling buffers comprise bandwidth-sensitive memory access scheduling buffers and delay-sensitive memory access scheduling buffers.
3. The multi-request-source-oriented DDR4 performance-balanced scheduling structure of claim 2, characterized in that the bandwidth-sensitive memory access scheduling buffer comprises:
storage entries, used to record the information of memory access requests;
an empty-entry queue, on which free storage entries are mounted in queue form;
a scheduling binary tree, used to organize occupied storage entries in the form of a binary tree.
4. The multi-request-source-oriented DDR4 performance-balanced scheduling structure of claim 3, characterized in that the information of a memory access request comprises the request information itself, a left child pointer of the entry, and a right child pointer of the entry.
5. A DDR4 performance-balanced scheduling method for multiple request sources, characterized by comprising:
L1, setting up a memory access request scheduling buffer for the requests of each memory access request source;
L2, the multi-source continuous arbitration component selecting one memory access request to issue according to an arbitration policy;
L3, the DDR4 memory device receiving the memory access requests issued by the multi-source continuous arbitration component.
6. The multi-request-source-oriented DDR4 performance-balanced scheduling method of claim 5, characterized in that the arbitration policy in L2 is specifically:
1) the highest priority rotates among the arbitration sources;
2) the arbitration source holding the highest priority releases it only after N of its memory access requests have been arbitrated consecutively; upon release, its own priority is set to the lowest and the priorities of all other arbitration sources are incremented by one.
7. The multi-request-source-oriented DDR4 performance-balanced scheduling method of claim 5, characterized in that in L1 a bandwidth-sensitive memory access scheduling buffer is set up for the requests of each bandwidth-sensitive memory access request source,
and a delay-sensitive memory access scheduling buffer is set up for the requests of each delay-sensitive memory access request source.
8. The multi-request-source-oriented DDR4 performance-balanced scheduling method of claim 7, characterized in that the bandwidth-sensitive memory access scheduling buffer comprises:
storage entries, used to record the information of memory access requests;
an empty-entry queue, on which free storage entries are mounted in queue form;
a scheduling binary tree, used to organize occupied storage entries in the form of a binary tree.
9. The multi-request-source-oriented DDR4 performance-balanced scheduling method of claim 8, characterized in that the information of a memory access request comprises the request information itself, a left child pointer of the entry, and a right child pointer of the entry.
CN201910852485.XA 2019-09-10 2019-09-10 DDR4 performance balance scheduling structure and method for multiple request sources Pending CN110716797A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910852485.XA CN110716797A (en) 2019-09-10 2019-09-10 DDR4 performance balance scheduling structure and method for multiple request sources

Publications (1)

Publication Number Publication Date
CN110716797A true CN110716797A (en) 2020-01-21

Family

ID=69209755

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910852485.XA Pending CN110716797A (en) 2019-09-10 2019-09-10 DDR4 performance balance scheduling structure and method for multiple request sources

Country Status (1)

Country Link
CN (1) CN110716797A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103501498A (en) * 2013-08-29 2014-01-08 中国科学院声学研究所 Baseband processing resource allocation method and device thereof
CN104734991A (en) * 2013-12-19 2015-06-24 中国科学院沈阳自动化研究所 End-to-end time delay guarantee transmission scheduling method oriented to industrial backhaul network
US20170324677A1 (en) * 2016-05-04 2017-11-09 Radware, Ltd. Optimized stream management
CN107391243A (en) * 2017-06-30 2017-11-24 广东神马搜索科技有限公司 Thread task processing equipment, device and method
CN108833299A (en) * 2017-12-27 2018-11-16 北京时代民芯科技有限公司 A kind of large scale network data processing method based on restructural exchange chip framework
CN109831393A (en) * 2019-03-10 2019-05-31 西安电子科技大学 More granularity QoS control methods of network-oriented virtualization

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312323A (en) * 2021-06-03 2021-08-27 中国人民解放军国防科技大学 IO (input/output) request scheduling method and system for reducing access delay in parallel file system
CN113312323B (en) * 2021-06-03 2022-07-19 中国人民解放军国防科技大学 IO (input/output) request scheduling method and system for reducing access delay in parallel file system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200121