CN116361206A - Address conversion buffer marking controller with configurable capacity and application method thereof - Google Patents

Info

Publication number
CN116361206A
Authority
CN
China
Prior art keywords
tag
bit
array
mark
marking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310314768.5A
Other languages
Chinese (zh)
Inventor
张建民
黎铁军
许炜康
孙岩
马胜
刘路
陆平静
王子聪
吴利舟
刘津津
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202310314768.5A
Publication of CN116361206A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10: Address translation
    • G06F12/1027: Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses a capacity-configurable address translation buffer tag controller and an application method thereof. The controller comprises: a tag memory bank array, comprising m tag memory banks, for caching tag entries; a tag control module for controlling read and write access to the tag memory bank array and configuring its capacity; and a clock gating module for controlling the operating state of each tag memory bank in the array, so that banks that are not needed are shut down to save energy. The invention makes the capacity of the address translation buffer configurable and shuts down unneeded tag memory banks, which saves energy and addresses the high power consumption caused by running the address translation buffer at large capacity for long periods.

Description

Address conversion buffer marking controller with configurable capacity and application method thereof
Technical Field
The invention relates to the field of address translation in network interface chips for high-performance parallel computers and data centers, and in particular to a capacity-configurable address translation buffer tag controller and an application method thereof.
Background
In high-performance parallel computers and large-scale data centers, communication among thousands of nodes (microprocessors or accelerators) relies on high-speed communication protocols implemented by network interface chips and switch chips. To reduce the latency that each node's data processing adds to network transfers, user-level communication has become the main technique adopted in high-speed communication protocols. The most common form is Remote Direct Memory Access (RDMA), which allows a user-mode application to read or write the memory of a remote node directly, without kernel intervention and without memory copies. During a remote memory read or write, the remote memory address used by the operation is carried in the RDMA message, and apart from connection establishment, registration calls, and the like, the CPU of the remote node provides no service during the entire RDMA data transfer. RDMA therefore offers low latency, high throughput, and low CPU utilization.
Because RDMA enables memory access at the user level, accessing memory directly with physical addresses raises the following problems. First, the security of address access is hard to guarantee; for example, one program may maliciously write into memory used by another program. Second, memory is used inefficiently: even when a program releases its memory after execution, the "memory hole" it leaves is unlikely to match exactly the size required by another program, so hard-to-use "memory fragments" may result. Virtual addresses are widely used to solve these problems. A layer of translation is needed between virtual and physical addresses; physical addresses are managed by a driver or the operating system, and the security and high utilization of virtual addresses are ensured through a memory registration and management mechanism. A user program accesses memory through virtual addresses and cannot use physical addresses directly. The translation in between uses a dedicated structure, the Address Translation Table (ATT), which is stored in the CPU's memory and maintained by a driver or the operating system, as shown in Fig. 1.
As shown in Fig. 1, when the network interface card initiates an access to a remote virtual address, the physical address is looked up in the address translation table held in the CPU's memory. However, the network interface card reaches that memory through the CPU, and the latency is very large. Every RDMA operation must obtain the physical address of each message sent or received from its virtual address, so the CPU's memory has to be accessed frequently, the whole remote data copy becomes slow, and RDMA efficiency drops sharply. It is therefore necessary to implement an address translation buffer (Address Translation Cache, ATC) in hardware inside the network interface card to cache recently used address translation table entries, reducing stalls in message processing and the latency of address translation table accesses. However, the address translation buffer usually has a large capacity and consumes considerable energy in operation, which significantly increases the power consumption of the network interface chip.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the above problems in the prior art, the invention provides a capacity-configurable address translation buffer tag controller and an application method thereof. The capacity of the address translation buffer can be configured, and the operating state of each tag memory bank in the tag memory bank array is controlled so that unneeded banks are shut down, which saves energy and addresses the high power consumption caused by running the address translation buffer at large capacity for long periods.
In order to solve the technical problems, the invention adopts the following technical scheme:
a capacity-configurable address translation buffer tag controller, comprising:
a tag memory bank array, comprising m tag memory banks, for caching the tag entries corresponding to the address translation table;
a tag control module for controlling read and write access to the tag memory bank array and configuring the capacity of the tag memory bank array;
a clock gating module for controlling the operating state of each tag memory bank in the tag memory bank array, so that tag memory banks that are not needed are shut down to save energy;
wherein the tag memory bank array is connected to the tag control module, the control input of the clock gating module is connected to the output of the tag control module, and the output of the clock gating module is connected to the clock signal input of each tag memory bank in the tag memory bank array.
Optionally, the tag control module includes a user-configurable m-bit capacity configuration register. Each bit of the capacity configuration register is synchronized with the clock signal CLK through a latch to generate a clock enable signal, and the clock enable signal is ANDed with the clock signal CLK by an AND gate to produce the gated clock signal GCLK of the corresponding tag memory bank.
Optionally, each bit of the capacity configuration register is also ANDed, by an AND gate, with the original enable signal of the corresponding tag memory bank; the result serves as the final enable signal of that tag memory bank, which implements the capacity configuration of the tag memory bank array.
Optionally, the tag memory bank array is further connected to a replacement algorithm module which, when a lookup misses and the tag memory bank array is full, selects one tag memory bank from the array to store the tag entry corresponding to the address translation table entry that is read back.
Optionally, the tag control module is further connected to an error correction and detection module, which generates a check code for each tag entry written into the tag memory bank array and performs error detection and correction on each tag entry read from the tag memory bank array.
In addition, the invention provides an application method of the capacity-configurable address translation buffer tag controller, comprising:
S101, detecting incoming virtual address translation requests, and on detecting a virtual address translation request, going to step S102;
S102, matching the virtual address translation request against the m tag entries stored in the tag memory bank array; on a miss, going to step S103; on a hit, going to step S104;
S103, sending a request to read the address translation table through the PCIe interface; when the tag entry corresponding to the returned address translation table entry is received, writing it into the tag memory bank array and returning it in response to the virtual address translation request; then exiting;
S104, reading the hit tag entry, returning it in response to the virtual address translation request, and exiting.
Optionally, in step S101 the virtual address of the translation request contains p bits in total, divided from low order to high order as follows. The offset field is bits <a-1:0> of the virtual address, a bits in total, and selects which of the 2^a physical addresses contained in one buffer line is fetched. The index field is bits <a+b-1:a> of the virtual address, b bits in total, and selects which tag entry is read from the tag memory bank array of depth 2^b. The tag field is bits <a+b+c-1:a+b> of the virtual address, c bits in total, and is compared with the tag entries read from the tag memory bank array; equality indicates a hit. A tag entry stored in the tag memory bank array comprises a tag, a check code, and a valid bit. The tag field is bits <c-1:0> of the tag entry, c bits in total, and holds bits <a+b+c-1:a+b> of the virtual address. The check code field is bits <c+d-1:c> of the tag entry, d bits in total, and holds the error detection and correction code generated from the tag field. The valid field is bit <c+d> of the tag entry, 1 bit in total; 1 indicates that the tag entry is valid, and 0 indicates that it is currently invalid.
Optionally, matching the virtual address translation request against the m tag entries stored in the tag memory bank array in step S102 comprises:
S201, selecting, from the m tag entries stored in the tag memory bank array, those whose valid field is 1; if none are selected, determining that all m tag entries are invalid, the matching result being a miss, and ending; otherwise, going to S202;
S202, for each selected tag entry whose valid field is 1, performing error detection and correction on bits <c+d-1:0>, which comprise the tag field and the check code field, and comparing the tag field with the tag field of the virtual address translation request; if the two are exactly equal and the tag entry passes error detection and correction, determining a hit; otherwise, determining that the matching result is a miss; and ending.
Optionally, step S104 includes:
S301, checking the valid fields of the m tag entries stored in the tag memory bank array; if they are not all 1, going to step S302; otherwise, going to step S303;
S302, taking bits <a+b+c-1:a+b> of the virtual address translation request as bits <c-1:0> of a new tag entry, generating a d-bit check code through the error correction and detection module, and setting the most significant bit to 1 to form the new tag entry; scanning the tag memory banks in order starting from bank 0 and writing the new tag entry into the first bank whose valid bit is 0; then ending and exiting;
S303, selecting a tag memory bank from the tag memory bank array through the replacement algorithm module to store the tag entry corresponding to the address translation table entry that is read back; taking bits <a+b+c-1:a+b> of the virtual address translation request as bits <c-1:0> of a new tag entry, generating a d-bit check code through the error correction and detection module, and setting the most significant bit to 1 to form the new tag entry; writing the new tag entry into the selected tag memory bank; then ending and exiting.
Optionally, selecting a tag memory bank from the tag memory bank array through the replacement algorithm module in step S303 comprises: the replacement algorithm module generates a pseudo-random binary sequence using a specified 64-bit pseudo-random binary sequence algorithm, and bits <log2(m):0> of that sequence determine the tag memory bank selected from the tag memory bank array.
Compared with the prior art, the invention has the following advantages. The invention comprises a tag memory bank array, comprising m tag memory banks, for caching tag entries; a tag control module for controlling read and write access to the tag memory bank array and configuring its capacity; and a clock gating module for controlling the operating state of each tag memory bank in the array so that unneeded banks are shut down to save energy. When the system detects that requests for physical addresses are infrequent, the capacity of the address translation buffer can be configured as needed and the clocks of the unneeded tag memory banks are gated off. This effectively reduces the dynamic power caused by signal toggling, significantly lowers the power consumption of the address translation buffer, and addresses the high power consumption caused by running the address translation buffer at large capacity for long periods.
Drawings
FIG. 1 is a schematic diagram of an address translation table and address translation buffer in the prior art.
Fig. 2 is a schematic diagram of the address translation buffer tag controller according to an embodiment of the present invention.
Fig. 3 is a schematic circuit diagram of the gated clock signal of a tag memory bank according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of the capacity configuration register and the capacity configuration it controls according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of the bit-field partitioning of a virtual address in an embodiment of the present invention.
Fig. 6 is a schematic diagram of the bit-field partitioning of a tag entry in an embodiment of the present invention.
Detailed Description
As shown in Fig. 2, the capacity-configurable address translation buffer tag controller of this embodiment includes:
a tag memory bank array 1, comprising m tag memory banks, for caching the tag entries corresponding to the address translation table;
a tag control module 2 for controlling read and write access to the tag memory bank array 1 and configuring the capacity of the tag memory bank array 1;
and a clock gating module 3 for controlling the operating state of each tag memory bank in the tag memory bank array 1, so that tag memory banks that are not needed are shut down to save energy;
the tag memory bank array 1 is connected to the tag control module 2, the control input of the clock gating module 3 is connected to the output of the tag control module 2, and the output of the clock gating module 3 is connected to the clock signal input of each tag memory bank in the tag memory bank array 1.
The tag control module 2 is the core module of the address translation buffer. It receives virtual address requests from the input port and, by matching the tag field of the virtual address against the stored tag entries, determines whether the request hits, that is, whether the address translation buffer (ATC) contains the physical address requested by the virtual address. To configure the capacity of the tag memory bank array 1 and to shut down unneeded tag memory banks to save energy, as shown in Fig. 3, the tag control module 2 in this embodiment includes a user-configurable m-bit capacity configuration register. Each bit of the register (capacity configuration register <0> to capacity configuration register <m-1>) is synchronized with the clock signal CLK through a latch to generate a clock enable signal, and the clock enable signal is ANDed with CLK by an AND gate to produce the gated clock signal GCLK (GCLK<0> to GCLK<m-1>) of the corresponding tag memory bank. When a bit of the capacity configuration register is 1, the clock of the corresponding tag memory bank is generated normally; when the bit is 0, the clock of the corresponding tag memory bank is held at 0 and the signals of that memory no longer toggle, which saves power.
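The latch-plus-AND gating described above can be modelled behaviourally for a single bank. The sketch below is only an illustration: the function name `simulate_gated_clock` is hypothetical, and it assumes the latch is transparent while CLK is low, which is the usual arrangement for glitch-free clock gates but is not stated in the text.

```python
def simulate_gated_clock(clk_samples, cfg_samples):
    """Return GCLK samples for one tag memory bank.

    clk_samples: CLK levels (0/1) sampled over time.
    cfg_samples: that bank's capacity configuration bit at the same instants.
    """
    latched_enable = 0
    gclk = []
    for clk, cfg in zip(clk_samples, cfg_samples):
        if clk == 0:
            # Assumption: the latch is transparent while CLK is low, so the
            # enable only changes when GCLK is already 0 (no partial pulses).
            latched_enable = cfg
        gclk.append(clk & latched_enable)  # AND gate: GCLK = CLK & enable
    return gclk


clk = [0, 1, 0, 1, 0, 1, 0, 1]
cfg = [1, 1, 1, 0, 0, 0, 1, 1]  # bank is disabled mid-run, then re-enabled
print(simulate_gated_clock(clk, cfg))
```

In this model, clearing the configuration bit while CLK is high does not cut the current pulse short; the bank's clock simply stops from the next cycle on.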
As shown in Fig. 4, each bit of the capacity configuration register in this embodiment is also ANDed, by an AND gate, with the original enable signal of the corresponding tag memory bank; the result serves as the final enable signal of that tag memory bank, which implements the capacity configuration of the tag memory bank array 1. Referring to Fig. 4, each tag memory bank of the tag memory bank array 1 in this embodiment has width w, and the m columns of tag memory banks are denoted SRAM 0 to SRAM m-1.
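As a rough illustration of the enable masking, the register and the per-bank enables can be treated as m-bit integers. The names `final_enables` and `configured_entries` are hypothetical, and the capacity figure assumes each bank holds 2^b entries, which is one reading of the array depth given elsewhere in the description.

```python
def final_enables(capacity_cfg, raw_enables, m):
    """Bitwise AND of the capacity register with the banks' original enables."""
    mask = (1 << m) - 1
    return capacity_cfg & raw_enables & mask


def configured_entries(capacity_cfg, m, b):
    """Tag entries available when each enabled bank has depth 2**b (assumption)."""
    active_banks = bin(capacity_cfg & ((1 << m) - 1)).count("1")
    return active_banks * (1 << b)


# Four banks, banks 0 and 2 enabled by the user, all banks requested this cycle.
print(bin(final_enables(0b0101, 0b1111, 4)))
print(configured_entries(0b0101, m=4, b=3))
```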
As shown in Fig. 2, the tag memory bank array 1 in this embodiment is further connected to a replacement algorithm module 4 which, when a lookup misses and the tag memory bank array 1 is full, selects one tag memory bank from the array to store the tag entry corresponding to the address translation table entry that is read back. In this embodiment the replacement algorithm module 4 uses a random algorithm to pick the tag memory bank; other replacement algorithms may be used as required.
As shown in Fig. 2, the tag control module 2 in this embodiment is further connected to an error correction and detection module 5, which generates a check code for each tag entry written into the tag memory bank array 1 and performs error detection and correction on each tag entry read from the tag memory bank array 1. In this embodiment the error correction and detection module 5 uses an ECC (Error Correcting Code) algorithm for encoding and checking. ECC is a common class of error-correcting code, and a typical ECC algorithm is used in both the tag array and the data array. Balancing overall performance against implementation complexity, a "correct one, detect two" algorithm is implemented: if a single bit error occurs in the original data and check code, the algorithm corrects it automatically; if two bit errors occur in the original data and check code, the algorithm is guaranteed to detect them. The ECC check code generation module produces the check code as follows: each bit of the ECC check code is generated by XORing certain bits of the original data. When a tag entry is written into memory, the ECC check code is appended and stored together with the original tag entry data.
The ECC error correction and detection module checks as follows. When a tag entry is read out, a group of syndrome bits is generated by XORing the ECC check code with certain bits of the original tag entry data; the syndrome bits are then combined by NAND logic according to the code's rules, and the result is XORed bit by bit with the original data to produce the data with the single-bit error corrected. That is, if a one-bit error occurs, corrected data is generated automatically; if a two-bit error occurs, the tag entry is marked invalid and the error is reported to the global error register.
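One well-known code with the "correct one, detect two" property is an extended Hamming code, in which every check bit is an XOR over selected data bits. The sketch below uses that code purely as an illustration; the text does not say which ECC the embodiment uses, and the function names and bit layout here are assumptions.

```python
def secded_encode(data_bits):
    """Encode a list of 0/1 data bits with an extended Hamming code.

    Layout (illustrative): index 0 is the overall parity bit, indices that are
    powers of two hold Hamming check bits, all other indices hold data bits.
    """
    k = len(data_bits)
    r = 0
    while (1 << r) < k + r + 1:
        r += 1
    n = k + r
    code = [0] * (n + 1)
    data_iter = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position
            code[pos] = next(data_iter)
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]  # XOR of the bits this check bit covers
        code[p] = parity
    overall = 0
    for bit in code:
        overall ^= bit
    code[0] = overall
    return code


def secded_decode(code):
    """Return (status, data_bits); status is 'ok', 'corrected' or 'double_error'."""
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos
    overall = 0
    for bit in code:
        overall ^= bit
    fixed = list(code)
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= n:
        fixed[syndrome] ^= 1         # single error at index `syndrome`
        status = "corrected"
    else:
        status = "double_error"      # detected but not correctable
    data = [fixed[pos] for pos in range(1, n + 1) if pos & (pos - 1)]
    return status, data


word = secded_encode([1, 0, 1, 1, 0, 0, 1, 0])
word[5] ^= 1                         # inject a single-bit error
print(secded_decode(word))
```

In the controller described above, a `double_error` result would correspond to marking the tag entry invalid and reporting to the global error register.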
This embodiment also provides an application method of the capacity-configurable address translation buffer tag controller, comprising:
S101, detecting incoming virtual address translation requests, and on detecting a virtual address translation request, going to step S102;
S102, matching the virtual address translation request against the m tag entries stored in the tag memory bank array 1; on a miss, going to step S103; on a hit, going to step S104;
S103, sending a request to read the address translation table through the PCIe interface; when the tag entry corresponding to the returned address translation table entry is received, writing it into the tag memory bank array 1 and returning it in response to the virtual address translation request; then exiting;
S104, reading the hit tag entry, returning it in response to the virtual address translation request, and exiting.
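The control flow of steps S101 to S104 can be sketched with a toy software model. Everything here is a stand-in: `ToyAtc` is a hypothetical class, a dictionary plays the role of the tag memory bank array and data array together, and another dictionary plays the role of the address translation table reached over PCIe.

```python
class ToyAtc:
    """Toy model of the hit/miss flow; not a model of the hardware structure."""

    def __init__(self, att):
        self.att = att        # stands in for the ATT in the CPU's memory
        self.cache = {}       # stands in for the tag and data arrays
        self.pcie_reads = 0   # counts simulated ATT reads over PCIe

    def translate(self, vaddr):
        # S102: match the request against the cached entries.
        if vaddr in self.cache:
            # S104: hit, answer from the buffer.
            return "hit", self.cache[vaddr]
        # S103: miss, read the ATT, fill the buffer, then answer.
        self.pcie_reads += 1
        paddr = self.att[vaddr]
        self.cache[vaddr] = paddr
        return "miss", paddr


atc = ToyAtc({0x1000: 0x9000})
print(atc.translate(0x1000))  # first access goes to the ATT
print(atc.translate(0x1000))  # second access is served from the buffer
```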
As shown in Fig. 5, in step S101 the virtual address of the translation request is p bits wide and comprises three parts from low order to high order: offset, index, and tag. The offset field is bits <a-1:0> of the virtual address, a bits in total, and selects which of the 2^a physical addresses contained in one buffer line is fetched. The index field is bits <a+b-1:a> of the virtual address, b bits in total, and selects which tag entry is read from the tag memory bank array 1 of depth 2^b. The tag field is bits <a+b+c-1:a+b> of the virtual address, c bits in total, and is compared with the tag entries read from the tag memory bank array 1; equality indicates a hit. In this embodiment, the notation <b:a> denotes the binary string of (b-a+1) bits from bit a to bit b of the address.
In this embodiment, the tag memory bank array 1 is initialized as follows. When the chip is reset, a b-bit counter counts from 0, incrementing by 1 each time, up to 2^b-1. The tag memory bank array 1 is accessed sequentially using the counter value as the address, and all-zero values are written into the memory, so the valid bits of all tag entries in the tag memory bank array 1 are cleared to 0. During this time the tag control module 2 drives the signal valid_clear_finish to 0, which prevents the First-In-First-Out (FIFO) queue in the input port from sending address translation requests to the tag control module 2. Once the whole tag memory bank array 1 has been written with 0, valid_clear_finish is set to 1, and the FIFO queue in the input port may then send requests to the tag control module 2.
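A small sketch of the field extraction and the reset sweep may make the notation concrete. The helper names `bits`, `split_vaddr`, and `reset_tag_banks` are hypothetical, and the widths used in the example (a = 3, b = 4, c = 5) are arbitrary illustration values, not values from the text.

```python
def bits(value, hi, lo):
    """Return bits <hi:lo> of value, i.e. (hi - lo + 1) bits."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def split_vaddr(vaddr, a, b, c):
    """Split a p = a + b + c bit virtual address into (offset, index, tag)."""
    offset = bits(vaddr, a - 1, 0)
    index = bits(vaddr, a + b - 1, a)
    tag = bits(vaddr, a + b + c - 1, a + b)
    return offset, index, tag


def reset_tag_banks(m, b):
    """Zero every entry of m banks of depth 2**b, as the reset counter does."""
    banks = [[None] * (1 << b) for _ in range(m)]
    valid_clear_finish = 0            # requests are held off during the sweep
    for addr in range(1 << b):        # the b-bit counter: 0 .. 2**b - 1
        for bank in banks:
            bank[addr] = 0            # all-zero entry, so valid bit = 0
    valid_clear_finish = 1            # sweep done, requests may proceed
    return banks, valid_clear_finish


vaddr = (0b10110 << 7) | (0b1001 << 3) | 0b101
print(split_vaddr(vaddr, a=3, b=4, c=5))
```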
As shown in Fig. 6, a tag entry stored in the tag memory bank array 1 comprises three parts: tag, check code, and valid bit. The tag field is bits <c-1:0> of the tag entry, c bits in total, and holds bits <a+b+c-1:a+b> of the virtual address. The check code field is bits <c+d-1:c> of the tag entry, d bits in total, and holds the error detection and correction code generated from the tag field. The valid field is bit <c+d> of the tag entry, 1 bit in total; 1 indicates that the tag entry is valid, and 0 indicates that it is currently invalid. In this embodiment, weighing overall performance, power consumption, design complexity, and other factors, the error detection and correction code generated from the tag field uses "correct one, detect two" coding; the generation and checking logic is implemented in the error correction and detection module.
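The entry layout can be illustrated by packing and unpacking the three fields as an integer of q = c + d + 1 bits. The function names are hypothetical and the widths in the example are arbitrary.

```python
def pack_entry(tag, check, valid, c, d):
    """Build a tag entry: valid at bit c+d, check code at <c+d-1:c>, tag at <c-1:0>."""
    tag_part = tag & ((1 << c) - 1)
    check_part = (check & ((1 << d) - 1)) << c
    valid_part = (valid & 1) << (c + d)
    return valid_part | check_part | tag_part


def unpack_entry(entry, c, d):
    """Return (tag, check, valid) from a packed tag entry."""
    tag = entry & ((1 << c) - 1)
    check = (entry >> c) & ((1 << d) - 1)
    valid = (entry >> (c + d)) & 1
    return tag, check, valid


entry = pack_entry(tag=0b10110, check=0b011, valid=1, c=5, d=3)
print(bin(entry), unpack_entry(entry, c=5, d=3))
```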
In this embodiment, matching the virtual address translation request against the m tag entries stored in the tag memory bank array 1 in step S102 comprises:
S201, selecting, from the m tag entries stored in the tag memory bank array 1, those whose valid field is 1; if none are selected, determining that all m tag entries are invalid, the matching result being a miss, and ending; otherwise, going to S202;
S202, for each selected tag entry whose valid field is 1, performing error detection and correction on bits <c+d-1:0>, which comprise the tag field and the check code field, and comparing the tag field with the tag field of the virtual address translation request; if the two are exactly equal and the tag entry passes error detection and correction, determining a hit; otherwise, determining that the matching result is a miss; and ending.
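Steps S201 and S202 amount to a filter followed by a compare. In the sketch below the entries read from the m banks are represented as dictionaries, and `ecc_ok` is a hypothetical callback standing in for the error correction and detection module; both are assumptions made for illustration.

```python
def match_tag(entries, request_tag, ecc_ok):
    """Return the index of the hitting bank, or None on a miss.

    entries: one dict per bank with keys 'valid' and 'tag'.
    ecc_ok:  hypothetical callback, True if an entry passes the ECC check.
    """
    # S201: keep only entries whose valid bit is 1.
    candidates = [i for i, e in enumerate(entries) if e["valid"] == 1]
    if not candidates:
        return None  # all entries invalid: miss
    # S202: hit only if the tag is equal and the entry passes the ECC check.
    for i in candidates:
        if ecc_ok(entries[i]) and entries[i]["tag"] == request_tag:
            return i
    return None


banks = [
    {"valid": 0, "tag": 22},
    {"valid": 1, "tag": 7},
    {"valid": 1, "tag": 22},
]
print(match_tag(banks, 22, ecc_ok=lambda e: True))
```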
In this embodiment, step S104 includes:
S301, checking the valid fields of the m tag entries stored in the tag memory bank array 1; if they are not all 1, going to step S302; otherwise, going to step S303;
S302, taking bits <a+b+c-1:a+b> of the virtual address translation request as bits <c-1:0> of a new tag entry, generating a d-bit check code through the error correction and detection module 5, and setting the most significant bit to 1 to form the new tag entry; scanning the tag memory banks in order starting from bank 0 and writing the new tag entry into the first bank whose valid bit is 0; then ending and exiting;
S303, selecting a tag memory bank from the tag memory bank array 1 through the replacement algorithm module 4 to store the tag entry corresponding to the address translation table entry that is read back; taking bits <a+b+c-1:a+b> of the virtual address translation request as bits <c-1:0> of a new tag entry, generating a d-bit check code through the error correction and detection module 5, and setting the most significant bit to 1 to form the new tag entry; writing the new tag entry into the selected tag memory bank; then ending and exiting.
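The fill procedure in S301 to S303 can be sketched as two small functions. The names are hypothetical, and `make_check` and `pick_victim` are placeholder callbacks for the error correction and detection module and the replacement algorithm module.

```python
def new_tag_entry(vaddr, a, b, c, d, make_check):
    """Form a tag entry: tag from vaddr<a+b+c-1:a+b>, d-bit check code, valid = 1."""
    tag = (vaddr >> (a + b)) & ((1 << c) - 1)
    check = make_check(tag) & ((1 << d) - 1)   # placeholder for the ECC module
    return (1 << (c + d)) | (check << c) | tag


def choose_bank(valid_bits, pick_victim):
    """S301/S302: first bank with valid bit 0; S303: ask the replacement module."""
    for bank, valid in enumerate(valid_bits):
        if valid == 0:
            return bank
    return pick_victim(len(valid_bits))


vaddr = (0b10110 << 7) | (0b1001 << 3) | 0b101
entry = new_tag_entry(vaddr, a=3, b=4, c=5, d=3, make_check=lambda t: 0b011)
print(bin(entry), choose_bank([1, 0, 1, 0], pick_victim=lambda m: 3))
```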
In this embodiment, selecting a tag memory bank from the tag memory bank array 1 through the replacement algorithm module 4 in step S303 comprises: the replacement algorithm module 4 generates a pseudo-random binary sequence using a specified 64-bit pseudo-random binary sequence (PRBS) algorithm, and bits <log2(m):0> of that sequence determine the tag memory bank selected from the tag memory bank array 1.
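The text does not give the PRBS polynomial, so the sketch below substitutes a 64-bit Fibonacci LFSR with taps at 64, 63, 61, and 60, a commonly tabulated maximal-length choice. The tap set, the function names, and the final reduction modulo m (used to keep the result a valid bank number) are all assumptions for illustration.

```python
MASK64 = (1 << 64) - 1


def lfsr64_step(state):
    """Advance a 64-bit Fibonacci LFSR by one bit (taps 64, 63, 61, 60; assumed)."""
    feedback = ((state >> 63) ^ (state >> 62) ^ (state >> 60) ^ (state >> 59)) & 1
    return ((state << 1) | feedback) & MASK64


def pick_victim(state, m):
    """Return (next_state, bank) using the low bits <log2(m):0> of the sequence."""
    state = lfsr64_step(state)
    nbits = m.bit_length()            # for m a power of two this is log2(m) + 1
    low = state & ((1 << nbits) - 1)
    return state, low % m             # modulo keeps the index in 0..m-1 (assumption)


state = 0x1234_5678_9ABC_DEF0         # any nonzero seed
for _ in range(4):
    state, bank = pick_victim(state, m=8)
    print(bank)
```

A nonzero seed is required; an all-zero state would stay at zero.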
When an address translation request message arrives from the FIFO queue of an input port, a virtual address of p = (a+b+c) bits is parsed from the message. Bits <a+b-1:a> of the virtual address form the index, which serves as the read/write address of the tag memory bank array 1. Using the index, m tag entries of q = (c+d+1) bits each are read in parallel from the m memory banks; bit (c+d) of each entry is the valid bit, and the g entries (0 ≤ g ≤ m) whose valid bit is 1 are selected from the m tag entries. If g = 0, every entry read from the tag memory bank array is invalid, so no tag entry corresponds to the virtual address, i.e., a miss. Otherwise g ≥ 1, and an ECC check is performed on bits <c+d-1:0> of each of the g tag entries, which comprise the tag field and the check code field.
Meanwhile, the tag control module 2 contains an m-bit capacity configuration register set by the user, in which each bit corresponds to one tag memory bank of the tag memory bank array 1. When a bit is 1, the corresponding tag memory bank is available; when a bit is 0, the corresponding tag memory bank is disabled and its clock is gated, which makes the capacity configurable and reduces power consumption. Each bit of the capacity configuration register acts as the enable for reading tag entry data from its tag memory bank: when the bit is 1, bits <c-1:0> of the entry read out are compared in parallel with bits <a+b+c-1:a+b> of the virtual address. There are five cases:
1. If the ECC check finds no error and an exactly equal column exists, i.e., a match, the current virtual address request hits the tag memory bank array 1.
2. If the ECC check finds no error and no exactly equal column exists, i.e., a miss, the address translation buffer holds no tag entry corresponding to the virtual address.
3. If the ECC check finds an error and no exactly equal column exists, the current virtual address misses the tag memory bank array 1, and the error is reported to a global error register.
4. If the column with the ECC error is the same as the exactly equal column, the current virtual address misses the tag memory bank array 1, and the error is reported to the global error register.
5. If the column with the ECC error differs from the exactly equal column, the current virtual address hits the tag memory bank array 1.
When a hit in the tag memory bank array 1 occurs, the address translation buffer holds a physical address corresponding to the virtual address, and the tag control module 2 sends the hit column number together with the virtual address to the next-level data array module of the tag memory bank array 1. When a miss in the tag memory bank array 1 occurs, the virtual address request is sent to the invalidation buffer to be saved. The invalidation buffer generates a request from the virtual address, reads the required physical address from CPU memory through the PCIe interface, writes the physical address into the data array module of the address translation buffer, and synchronously updates the corresponding tag entry of the tag memory bank array 1. The m tag entries at the index given by bits <a+b-1:a> of the virtual address are read, and there are two cases:
1. The valid bits of the m tag entries are not all 1. Bits <a+b+c-1:a+b> of the virtual address are taken as bits <c-1:0> of the new tag entry, a d-bit check code is generated by the error correction and detection module, and the most significant bit is set to 1 to form the new tag entry, which is written to the first tag memory bank, in order from 0 to (m-1), whose valid bit is 0.
2. The valid bits of the m tag entries are all 1. One of the m entries is selected for replacement by the replacement algorithm module, which in this embodiment uses a random replacement algorithm: a standard 64-bit pseudo-random binary sequence (PRBS) algorithm is run, and after each operation bits <log2 m:0> of the PRBS-64 result are taken to obtain the entry to be replaced. A new tag entry is then constructed as in the first case and written to the entry to be replaced.
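The tag lookup described above, i.e. the field extraction and the five hit/miss cases, can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the embodiment's circuit: the names `split_va`, `make_entry`, `classify` and `Lookup` are hypothetical, `toy_code` is a stand-in check code (a truncated population count) because the text does not specify the ECC code, disabled banks are assumed not to take part in the lookup at all, and the text does not say whether an error is also reported in case 5, so the sketch reports none there.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Lookup(NamedTuple):
    hit: bool
    column: Optional[int]   # hit column number, None on a miss
    error_reported: bool    # True when the global error register is written


def split_va(va: int, a: int, b: int, c: int) -> Tuple[int, int, int]:
    """Split a p = a+b+c bit virtual address into (offset, index, tag)."""
    offset = va & ((1 << a) - 1)
    index = (va >> a) & ((1 << b) - 1)
    tag = (va >> (a + b)) & ((1 << c) - 1)
    return offset, index, tag


def toy_code(tag: int, d: int) -> int:
    """Stand-in d-bit check code; NOT the ECC used by the embodiment."""
    return bin(tag).count("1") & ((1 << d) - 1)


def make_entry(tag: int, c: int, d: int) -> int:
    """Pack valid bit <c+d>, check code <c+d-1:c> and tag <c-1:0>."""
    return (1 << (c + d)) | (toy_code(tag, d) << c) | tag


def classify(entries: List[int], enable_mask: int, va_tag: int,
             c: int, d: int, check_ok: Callable[[int, int], bool]) -> Lookup:
    """Apply the five cases to the m entries read at one index."""
    def tag_of(e: int) -> int:
        return e & ((1 << c) - 1)

    def code_of(e: int) -> int:
        return (e >> c) & ((1 << d) - 1)

    # Keep only enabled banks whose valid bit <c+d> is 1 (the g entries).
    valid = [col for col, e in enumerate(entries)
             if (enable_mask >> col) & 1 and (e >> (c + d)) & 1]
    if not valid:
        return Lookup(False, None, False)          # g = 0: miss
    bad = {col for col in valid
           if not check_ok(tag_of(entries[col]), code_of(entries[col]))}
    equal = [col for col in valid if tag_of(entries[col]) == va_tag]
    clean = [col for col in equal if col not in bad]
    if clean:
        return Lookup(True, clean[0], False)       # cases 1 and 5: hit
    return Lookup(False, None, bool(bad))          # cases 2, 3, 4: miss
```

For example, with a = 2, b = 3, c = 4 and d = 2, the virtual address 343 splits into offset 3, index 5 and tag 10, and a set whose column 0 holds `make_entry(10, 4, 2)` yields a hit in column 0.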
In summary, to address the high power consumption caused by an address translation buffer running at full capacity for long periods, this embodiment proposes a capacity-configurable address translation buffer technique. A capacity configuration function is implemented in the tag controller of the address translation buffer (ATC): when the system detects few requests for physical addresses, the tag memory bank array can be configured on demand to 1/n of its original capacity, where n = 2^i, 0 ≤ i ≤ log2 m, and m is the number of columns of the ATC, i.e., its set associativity. The clocks of the remaining m(1 - 1/n) columns of the tag memory bank array are all gated, so the clocks of those memories stop, the dynamic power caused by signal toggling is reduced, and the power consumption of the address translation buffer drops significantly.
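As a worked illustration of the capacity arithmetic, the following Python sketch derives a value for the m-bit capacity configuration register from m and i, lists the gated columns, and models a latch-based clock gate. The function and class names are hypothetical. The choice to keep the lowest-numbered banks enabled and the assumption that the latch is transparent while CLK is low (the usual arrangement for a glitch-free gate) are not stated in the text.

```python
from typing import List


def capacity_mask(m: int, i: int) -> int:
    """Register value that enables m / 2**i banks (lowest-numbered first)."""
    assert m > 0 and m & (m - 1) == 0, "m must be a power of two"
    assert 0 <= i <= m.bit_length() - 1, "i must satisfy 0 <= i <= log2(m)"
    enabled = m >> i                 # m / n with n = 2**i
    return (1 << enabled) - 1


def gated_banks(m: int, mask: int) -> List[int]:
    """Columns whose configuration bit is 0, i.e. whose clock is gated."""
    return [col for col in range(m) if not (mask >> col) & 1]


class ClockGate:
    """Latch plus AND gate producing GCLK for one tag memory bank."""

    def __init__(self) -> None:
        self.latched = 0

    def tick(self, clk: int, enable: int) -> int:
        if clk == 0:
            self.latched = enable    # latch follows enable while CLK is low
        return clk & self.latched    # GCLK = CLK AND latched enable
```

With m = 8 and i = 2, the register value is 0b00000011, two banks stay clocked, and the other 8 - 8/4 = 6 banks are gated.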
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention may occur to one skilled in the art without departing from the principles of the present invention and are intended to be within the scope of the present invention.

Claims (10)

1. An address translation buffer tag controller with configurable capacity, comprising:
a tag memory bank array (1), comprising m tag memory banks, for caching the tag entries corresponding to the address translation table;
a tag control module (2) for controlling read/write access to the tag memory bank array (1) and configuring the capacity of the tag memory bank array (1);
and a gated clock module (3) for controlling the operating state of each tag memory bank in the tag memory bank array (1), so that tag memory banks not in use are shut down to save power;
characterized in that the tag memory bank array (1) is connected to the tag control module (2), the control terminal of the gated clock module (3) is connected to the output of the tag control module (2), and the output of the gated clock module (3) is connected to the clock signal input of each tag memory bank in the tag memory bank array (1).
2. The address translation buffer tag controller with configurable capacity according to claim 1, wherein the tag control module (2) comprises a user-configurable m-bit capacity configuration register, each bit of the capacity configuration register is synchronized with the clock signal CLK through a latch to generate a clock enable signal, and the clock enable signal is ANDed with the clock signal CLK by an AND gate to produce the gated clock signal GCLK of the corresponding tag memory bank.
3. The address translation buffer tag controller with configurable capacity according to claim 2, wherein each bit of the capacity configuration register is further ANDed with the original enable signal of the corresponding tag memory bank, and the result serves as the final enable signal of that tag memory bank, thereby implementing capacity configuration of the tag memory bank array (1).
4. The address translation buffer tag controller with configurable capacity according to claim 3, wherein the tag memory bank array (1) is further connected to a replacement algorithm module (4) for selecting, when a miss occurs in the tag memory bank array (1) and the tag memory bank array (1) is full, one tag memory bank from the tag memory bank array (1) to store the tag entry corresponding to the address translation table that is read back.
5. The address translation buffer tag controller with configurable capacity according to claim 4, wherein the tag control module (2) is further connected to an error correction and detection module (5) for generating check codes for the tag entries written into the tag memory bank array (1) and performing error detection and correction on the tag entries read from the tag memory bank array (1).
6. A method of applying the address translation buffer tag controller with configurable capacity according to any one of claims 1 to 5, comprising:
S101, detecting an incoming virtual address translation request, and jumping to step S102 if a virtual address translation request is detected;
S102, matching the virtual address translation request against the m tag entries stored in the tag memory bank array (1); if a miss occurs, jumping to step S103; if a hit occurs, jumping to step S104;
S103, sending a request to read the address translation table through a PCIe interface and, upon receiving the tag entry corresponding to the returned address translation table, writing it into the tag memory bank array (1) and returning it to the virtual address translation request, then exiting;
S104, reading the hit tag entry, returning it to the virtual address translation request, and exiting.
7. The method of claim 6, wherein the virtual address translation request in step S101 has p bits in total and comprises, from low order to high order, three bit fields: offset, index and tag; the offset bit field is bits <a-1:0> of the virtual address, a bits in total, and indicates which of the 2^a physical address entries contained in one buffer line is fetched; the index bit field is bits <a+b-1:a> of the virtual address, b bits in total, and indicates which tag entry is fetched from the tag memory bank array (1) of depth 2^b; the tag bit field is bits <a+b+c-1:a+b> of the virtual address, c bits in total, and is compared with the tag entries fetched from the tag memory bank array (1), equality indicating a hit; each tag entry stored in the tag memory bank array (1) comprises a tag, a check code and a valid bit; the tag bit field is bits <c-1:0> of the tag entry, c bits in total, and holds bits <a+b+c-1:a+b> of the virtual address; the check code bit field is bits <c+d-1:c> of the tag entry, d bits in total, and is an error detection and correction code generated from the tag bit field; the valid bit field is bit <c+d> of the tag entry, 1 bit in total, where 1 indicates that the tag entry is currently valid and 0 indicates that it is currently invalid.
8. The method for applying the address translation buffer tag controller with configurable capacity according to claim 7, wherein matching the virtual address translation request against the m tag entries stored in the tag memory bank array (1) in step S102 comprises:
S201, selecting, from the m tag entries stored in the tag memory bank array (1), the tag entries whose valid bit field is 1; if the number of selected tag entries is 0, determining that the m tag entries stored in the tag memory bank array (1) are all invalid and that the matching result is a miss, and ending; otherwise, jumping to S202;
S202, for each selected tag entry whose valid bit field is 1, performing error detection and correction on bits <c+d-1:0>, which comprise the tag bit field and the check code bit field, and comparing the tag bit field with the tag bit field of the virtual address translation request; if the tag bit field is exactly equal to the tag bit field of the virtual address translation request and the error detection and correction of the tag entry passes, determining that the tag entry is hit; otherwise, determining that the matching result is a miss; and ending.
9. The method for applying the address translation buffer tag controller with configurable capacity according to claim 7, wherein step S104 includes:
S301, checking the valid bit fields of the m tag entries stored in the tag memory bank array (1); if the valid bit fields of the m tag entries are not all 1, jumping to step S302; otherwise, jumping to step S303;
S302, taking bits <a+b+c-1:a+b> of the virtual address translation request as bits <c-1:0> of a new tag entry, generating a d-bit check code through the error correction and detection module (5), setting the most significant bit to 1 to form the new tag entry, writing the new tag entry into the first column of tag memory banks, in order starting from 0, whose valid bit is 0, and exiting;
S303, selecting, through the replacement algorithm module (4), one tag memory bank from the tag memory bank array (1) to store the tag entry corresponding to the address translation table that is read back, taking bits <a+b+c-1:a+b> of the virtual address translation request as bits <c-1:0> of a new tag entry, generating a d-bit check code through the error correction and detection module (5), setting the most significant bit to 1 to form the new tag entry, writing the new tag entry into the selected tag memory bank, and exiting.
10. The method of claim 9, wherein selecting one tag memory bank from the tag memory bank array (1) through the replacement algorithm module (4) in step S303 comprises: the replacement algorithm module (4) generates a pseudo-random binary sequence using a specified 64-bit pseudo-random binary sequence algorithm, and bits <log2 m:0> of the pseudo-random binary sequence determine the tag memory bank selected from the tag memory bank array (1).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310314768.5A CN116361206A (en) 2023-03-28 2023-03-28 Address conversion buffer marking controller with configurable capacity and application method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310314768.5A CN116361206A (en) 2023-03-28 2023-03-28 Address conversion buffer marking controller with configurable capacity and application method thereof

Publications (1)

Publication Number Publication Date
CN116361206A true CN116361206A (en) 2023-06-30

Family

ID=86914015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310314768.5A Pending CN116361206A (en) 2023-03-28 2023-03-28 Address conversion buffer marking controller with configurable capacity and application method thereof

Country Status (1)

Country Link
CN (1) CN116361206A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination