WO2024087559A1 - Memory access method, apparatus, system and electronic device


Info

Publication number
WO2024087559A1
Authority
WO
WIPO (PCT)
Prior art keywords
address
memory
time period
access
preset time
Application number
PCT/CN2023/091023
Other languages
English (en)
French (fr)
Inventor
王克行
周峰
冯辉宇
Original Assignee
北京象帝先计算技术有限公司
Application filed by 北京象帝先计算技术有限公司
Publication of WO2024087559A1 publication Critical patent/WO2024087559A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/0223 User address space allocation, e.g. contiguous or non-contiguous base addressing
    • G06F 12/023 Free address space management
    • G06F 12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F 9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory

Definitions

  • the present application relates to the field of memory technology, and in particular, to a memory access method, device, system and electronic device.
  • GDDR SDRAM Graphics Double Data Rate Synchronous Dynamic Random Access Memory
  • GPU Graphics Processing Unit
  • AI Artificial Intelligence
  • AR Augmented Reality
  • VR Virtual Reality
  • the performance of memory access is closely related to the storage unit (bank), row, and column accessed by the previous and next transfers.
  • the traditional approach is to select, through simulation, the best-performing mapping between access addresses and memory physical addresses, and to use it as the subsequent address mapping relationship.
  • However, simulation consumes a great deal of manpower and time, and the stimulus applied during simulation may differ substantially from the actual traffic during system operation, so the selected address mapping relationship may not actually be the best one.
  • the purpose of the present disclosure is to provide a memory access method, device, system and electronic device, which solve the technical problems of difficulty in determining the address mapping relationship and low accuracy in the memory access process in the prior art.
  • a memory access method comprising:
  • the current address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed is optimized to obtain an optimized address mapping relationship
  • an optimized address mapping relationship is used to map the address of the burst access received within the second preset time period to the memory physical address corresponding to the memory to be accessed, so as to perform an access operation on the memory to be accessed.
  • before the step of optimizing the current address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed according to the number of flips of each bit in the address of the burst access within the first preset time period to obtain the optimized address mapping relationship, the above method further includes:
  • the number of flips of the column bits in the address of the burst access within the first preset time period is corrected; wherein the column bits are the bits mapped to the column address of the memory to be accessed according to the current address mapping relationship.
  • the number of flipping times of the column bits in the address of the burst access within the first preset time period is corrected, including the following steps:
  • t_colnew = t_col / [(Trp + Trcd) / (Tccdl - Tccds)]
  • t_colnew is the corrected value of the number of flips of the column bit within the first preset time period
  • t_col is the number of flips of the column bit within the first preset time period before correction
  • Trp is the pre-charge operation delay of the memory to be accessed
  • Trcd is the activation operation delay of the memory to be accessed
  • Tccdl is the time interval between two consecutive burst accesses when they access the same storage unit group
  • Tccds is the time interval between two consecutive burst accesses when they access different storage unit groups.
  • the current address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed is optimized to obtain the optimized address mapping relationship, including the following steps:
  • among the N bits with the most flips in the address of the burst access, the first M bits with the most flips are mapped to the storage unit group (bank group) address of the memory to be accessed, and the remaining N-M bits with the next-most flips are mapped to the storage unit (bank) address of the memory to be accessed, so as to obtain the optimized address mapping relationship.
  • the above method further includes:
  • the bits in the address of the burst access other than the first N bits with the most flips are divided into multiple groups according to preset rules, and the multiple groups of bits are respectively mapped to the corresponding types of addresses in the memory physical address of the memory to be accessed.
  • the corresponding type of address in the memory physical address of the memory to be accessed includes at least a column address and a row address.
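As a rough sketch of this bit-assignment rule, the Python model below picks the M most-flipped bits as bank-group bits, the next N-M as bank bits, and splits the remaining bits into column and row addresses. The function name, the concrete M/N/column-width values, and the "keep natural bit order" grouping rule are illustrative assumptions, not the patent's exact scheme:

```python
def build_mapping(flips, m=2, n=4, col_bits=6):
    """Assign address bits by flip count: the m most-flipped bits
    become bank-group bits, the next n-m become bank bits, and the
    remaining bits are split (here: in natural low-to-high order,
    an assumed 'preset rule') into column and row address bits."""
    order = sorted(range(len(flips)), key=lambda b: flips[b], reverse=True)
    bank_group = order[:m]          # M bits with the most flips
    bank = order[m:n]               # next N-M bits
    rest = sorted(order[n:])        # everything else, natural order
    column, row = rest[:col_bits], rest[col_bits:]
    return {"bank_group": bank_group, "bank": bank,
            "column": column, "row": row}
```

With flip counts concentrated on a few bits, those bits end up selecting the bank group, so consecutive burst accesses tend to land in different bank groups.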
  • determining whether the same bit in each of two consecutive burst access addresses flips within a first preset time period to count the number of flips of each bit in the burst access address within the first preset time period includes the following steps:
  • the flip times of the same bit in the address of the write operation burst access and the address of the read operation burst access within the first preset time period are added to obtain the flip times of each bit in the address of the burst access within the first preset time period.
  • determining whether the same bit in each of two consecutive burst access addresses flips within a first preset time period to count the number of flips of each bit in the burst access address within the first preset time period includes the following steps:
  • the count value corresponding to the bit is increased by 1, thereby obtaining the final count value corresponding to each bit in the burst access address within the first preset time period, and the final count value corresponding to each bit in the burst access address is used as the number of flips of each bit in the burst access address within the first preset time period.
  • an optimized address mapping relationship is used to map the address of the burst access received in the second preset time period to a memory physical address corresponding to the memory to be accessed, so as to perform an access operation on the memory to be accessed, including the following steps:
  • an optimized address mapping relationship is used to map the address of the burst access received within the second preset time period to the memory physical address corresponding to the memory to be accessed, so as to perform an access operation on the memory to be accessed.
  • the address of the burst access within the first preset time period is obtained by the following steps:
  • the multiple access requests are parsed to obtain the addresses of each burst access in the multiple access requests.
  • before the step of parsing the multiple access requests to obtain the address of each burst access in the multiple access requests, the above method further includes:
  • an address mapping device comprising:
  • a transmission monitoring module is configured to determine whether the same bit in each of two consecutive burst access addresses is flipped within a first preset time period, so as to count the number of flips of each bit in the burst access address within the first preset time period;
  • the mapping judgment module is connected to the transmission monitoring module and is configured to optimize the current address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed according to the number of flips of each bit in the address of the burst access monitored by the transmission monitoring module within a first preset time period, so as to obtain an optimized address mapping relationship.
  • the transmission monitoring module includes:
  • a write transfer monitoring module is configured to determine whether the same bit in the address of each of two consecutive write operation burst accesses in a first preset time period is flipped, so as to count the number of flips of each bit in the address of the write operation burst access in the first preset time period;
  • the read transfer monitoring module is configured to determine whether the same bit in the address of each two consecutive read operation burst accesses is flipped within a first preset time period, so as to count the number of flips of each bit in the address of the read operation burst access within the first preset time period.
  • the write transfer monitoring module includes a plurality of first counters configured to respectively count the number of flips of each bit in the address accessed by the write operation burst;
  • the read transmission monitoring module includes a plurality of second counters configured to respectively count the number of flippings of each bit in the address accessed by the read operation burst.
  • a memory controller connected between an upstream device and a memory to be accessed, comprising:
  • a first-in-first-out queue is connected to the address mapping device and is configured to process and output the burst access transmitted from the address mapping device across clock domains;
  • the address mapping module is connected between the FIFO queue and the memory to be accessed, and is configured to adopt a corresponding address mapping relationship to map the address of the burst access transmitted from the FIFO queue to the memory physical address corresponding to the memory to be accessed, so as to perform an access operation on the memory to be accessed.
  • the transmission monitoring module of the address mapping device includes a write transmission monitoring module and a read transmission monitoring module
  • the memory controller also includes a multiplexer connected between the address mapping device and the first-in-first-out queue, and is configured to select one of the write operation burst access transmitted from the write transfer monitoring module and the read operation burst access transmitted from the read transfer monitoring module to write into the first-in-first-out queue according to the write pointer of the first-in-first-out queue.
  • a memory access system comprising an upstream device and a memory to be accessed, and a memory controller according to any one of the above embodiments.
  • the above-mentioned memory access system further includes: a non-volatile memory, which is connected to the memory controller and is configured to store the address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed.
  • the above-mentioned memory access system further includes: a port physical layer chip, which is connected between the memory to be accessed and the memory controller and is configured to convert the burst access digital signal transmitted by the memory controller into an interface physical signal of the memory to be accessed.
  • an electronic device comprising the memory access system of any one of the above embodiments.
  • an electronic assembly comprising the electronic device of any one of the above embodiments.
  • FIG. 1 is a schematic flow chart of a memory access method provided by an embodiment of the present disclosure;
  • FIG. 2 is a schematic diagram of a current address mapping relationship between a burst access address and a memory physical address of a memory to be accessed provided by an embodiment of the present disclosure;
  • FIG. 3 is a schematic diagram of an optimized address mapping relationship between a burst access address and a memory physical address of a memory to be accessed provided by an embodiment of the present disclosure;
  • FIG. 4 is a schematic structural diagram of an address mapping device provided by an embodiment of the present disclosure;
  • FIG. 5 is a schematic structural diagram of a memory controller provided by an embodiment of the present disclosure;
  • FIG. 6 is a schematic structural diagram of a memory access system provided by an embodiment of the present disclosure;
  • FIG. 7 is a schematic structural diagram of a GPU SOC system provided by an embodiment of the present disclosure.
  • Unless it is explicitly stated that two components are directly connected or directly communicating, a connection or communication relationship between the two components can be understood as either a direct connection or communication, or an indirect connection or communication through an intermediate component.
  • After receiving an access request sent by an upstream device based on the AXI transmission protocol, the memory controller converts the address of the access request (the access address given by the upstream device) into a memory physical address (i.e., the physical address of the memory particle) through address mapping.
  • Taking a memory such as a GDDR particle with 16 storage cells (banks) as an example, these 16 storage cells (banks) are divided into 4 storage cell groups (bank groups), and 4 bits indicate the storage cell (bank) information, of which 2 bits indicate the storage cell group (bank group) address and 2 bits indicate the storage cell (bank) address.
  • Each storage cell (bank) is a storage array including multiple rows and columns.
  • the memory particle address can be divided into at least a storage cell group (bank group) address, a storage cell (bank) address, a row address (row) and a column address (column).
  • the storage unit group (bank group) address, storage unit (bank) address, row address (row) and column address (column) are used for addressing, and the corresponding position of the storage array can be accessed.
  • the row address (row) and column address (column) can share the address signal line, so continuous access to different rows of the same storage unit (bank) is usually not allowed. Instead, a set of switching actions must be inserted between two row accesses.
  • This set of switching actions includes precharging the previously operated row (row) and activating the row to be operated (active).
  • the operation sequence and time interval of this set of actions are correspondingly specified for different memories or memory chips.
  • This set of actions has a certain delay, which reduces the access efficiency.
  • A page hit means that the row to be accessed is already in the activated state, so the read or write access can be performed directly without additional operations.
  • A page miss means that no row is in the activated state in the storage unit (bank) to be accessed; an activation command must first be sent to activate the row to be accessed, and only then is the access to that row initiated.
  • A page conflict means that in the storage unit (bank) containing the row to be accessed, a row other than the row to be accessed is in the activated state.
  • In that case, a precharge command must first be sent to close the currently activated row, then an activation command is sent to activate the row to be accessed, and only then is the access to that row initiated.
  • Page hits are the most friendly to system performance, while page conflicts are the least friendly to system performance and need to be avoided as much as possible.
  • the performance of two consecutive accesses to particles is better when accessing different storage unit groups (bank groups) than when accessing the same storage unit group (bank group).
  • the purpose of the present disclosure is to provide a memory access method, device, system and electronic device, which aims to optimize the current address mapping relationship by counting the number of flips of each bit in the address of the burst access within the first preset time period, and obtain the optimized address mapping relationship; for the burst access received in the second preset time period after the first preset time period, the optimized address mapping relationship is used to map the address of the burst access received in the second preset time period to the memory physical address corresponding to the memory to be accessed, so as to perform the access operation on the memory to be accessed.
  • This method of adaptively optimizing the address mapping relationship based on statistics of actual operation can accurately map subsequent burst accesses to appropriate memory physical addresses, and can make two consecutive (adjacent) burst accesses fall on different storage unit groups or different storage units as much as possible, reducing the occurrence of page conflicts, thereby increasing the parallelism of memory access and improving bus bandwidth. Moreover, this adaptive optimization can be carried out while the memory access system is actually running, without consuming a great deal of manpower and time, which simplifies the optimization process of the address mapping relationship.
  • An embodiment of the present disclosure provides a memory access method, as shown in FIG1 , the method comprising:
  • Step S110 determining whether the same bit in each of two consecutive burst access addresses is flipped within a first preset time period, so as to count the number of flips of each bit in the burst access address within the first preset time period;
  • Step S120 optimizing the current address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed according to the number of flips of each bit in the address of the burst access within the first preset time period, to obtain an optimized address mapping relationship;
  • Step S130 For burst accesses received in a second preset time period after the first preset time period, an optimized address mapping relationship is used to map the address of the burst access received in the second preset time period to the memory physical address corresponding to the memory to be accessed, so as to perform an access operation on the memory to be accessed.
  • the memory to be accessed includes a plurality of storage units (Bank), each storage unit (Bank) is a storage array including a plurality of rows and a plurality of columns, and the plurality of storage units in the memory to be accessed are divided into a plurality of storage unit groups (bank groups).
  • the above-mentioned memory to be accessed includes but is not limited to synchronous dynamic random access memory (Synchronous Dynamic Random Access Memory, SDRAM), SDRAM includes but is not limited to double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), GDDR and low power double data rate synchronous dynamic random access memory (Low Power Double Data Rate SDRAM, LPDDR SDRAM).
  • SDRAM Synchronous Dynamic Random Access Memory
  • In some cases, the first preset time period consists of a single time period, i.e., a single time window: the addresses of the burst accesses to the memory to be accessed received within this longer time window are analyzed to count the number of flips of each bit in the burst access address within the first preset time period.
  • In other cases, the first preset time period consists of multiple time periods, that is, multiple time windows.
  • The addresses of the burst accesses to the memory to be accessed received within each time window are then analyzed to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period.
  • the number of bits in the address of each burst access is the same.
  • the address of the burst access within the first preset time period is obtained by the following steps:
  • the access request includes: a starting address, an access data bit width, and a burst length.
  • The burst type of an access request is INCR or WRAP.
  • The addresses of an INCR type burst access (burst transfer) increase monotonically, while the addresses of a WRAP type burst access (burst transfer) wrap around at an address boundary.
  • parsing the multiple access requests to obtain the address of each burst access in the multiple access requests includes the following steps:
  • the starting value of n is 2
  • Address_n is the address of the nth burst access in the corresponding access request
  • the maximum value of n is the burst length of the corresponding access request
  • AWSIZE is the access data bit width of the corresponding access request
  • INT is a round-down function
  • Start_address is the starting address of the corresponding access request.
  • the starting value of n is 1
  • Address_n is the address of the n+1th burst access in the corresponding access request
  • the maximum value of n is the burst length of the corresponding access request minus 1. That is, Address_1 is the address of the second burst access in the corresponding access request, Address_2 is the address of the third burst access in the corresponding access request, and so on.
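The INCR address sequence described above can be sketched in Python. This is a behavioral model based on the AXI conventions that Number_bytes = 2^AWSIZE and that every beat after the first uses the size-aligned start address; the function name is illustrative:

```python
def incr_burst_addresses(start_address, awsize, burst_length):
    """Addresses of an INCR burst: the first beat uses the start
    address as given; each later beat n uses the size-aligned start
    address plus (n-1) * Number_bytes, where Number_bytes = 2**AWSIZE."""
    number_bytes = 1 << awsize
    aligned = (start_address // number_bytes) * number_bytes
    addrs = [start_address]
    for n in range(2, burst_length + 1):
        addrs.append(aligned + (n - 1) * number_bytes)
    return addrs
```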
  • parsing the multiple access requests to obtain the address of each burst access in the multiple access requests includes the following steps:
  • Low_boundary_address is the low address boundary
  • INT is the round-down function
  • Start_address is the starting address of the corresponding access request
  • AWSIZE is the access data bit width of the corresponding access request
  • Burst_length is the burst length of the corresponding access request
  • High_boundary_address is the high address boundary
  • the starting value of n is 2
  • Address_n is the address of the nth burst access in the corresponding access request
  • the maximum value of n is the burst length of the corresponding access request
  • Address_n = Start_address + (n-1) * Number_bytes - (Number_bytes * Burst_length)
  • the starting value of n is 1
  • Address_n is the address of the n+1th burst access in the corresponding access request
  • the maximum value of n is the burst length of the corresponding access request minus 1. That is, Address_1 is the address of the second burst access in the corresponding access request, Address_2 is the address of the third burst access in the corresponding access request, and so on.
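The WRAP sequence, with its low and high address boundaries, can likewise be sketched. This assumes the AXI rule that the wrap boundary spans Number_bytes * Burst_length bytes; the function name is illustrative:

```python
def wrap_burst_addresses(start_address, awsize, burst_length):
    """Addresses of a WRAP burst: increment as for INCR, but wrap
    back to the low address boundary (the start address rounded down
    to a multiple of Number_bytes * Burst_length) when the high
    boundary is reached."""
    number_bytes = 1 << awsize
    total = number_bytes * burst_length
    low = (start_address // total) * total      # Low_boundary_address
    addrs, addr = [], start_address
    for _ in range(burst_length):
        addrs.append(addr)
        addr += number_bytes
        if addr >= low + total:                 # High_boundary_address hit
            addr = low                          # wrap around
    return addrs
```

The wrapped beats match the document's formula Address_n = Start_address + (n-1) * Number_bytes - (Number_bytes * Burst_length).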
  • the method before the step of parsing the plurality of access requests to obtain the address of each burst access in the plurality of access requests, the method further includes:
  • the starting addresses of the multiple access requests are aligned with the bus width of the memory to be accessed.
  • the address of the access request only needs to be aligned to 64B first, and the received start address is recorded as AWADDR[MSB:LSB], where MSB is the high address boundary and LSB is the low address boundary.
  • Since 64B requires 6 binary bits to encode, 64B alignment sets the lower 6 bits of the start address to 0, so the aligned start address is {AWADDR[MSB:6], 6'b0}, which is also the address of this single burst access.
  • the start address is first aligned to obtain the address of the first burst access, and then the addresses of each other burst access are parsed according to the burst type.
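The 64B alignment described above amounts to clearing the six low address bits; a one-line sketch (function name illustrative):

```python
def align_64b(awaddr):
    """64B-align an AXI start address by clearing its low 6 bits,
    i.e. forming {AWADDR[MSB:6], 6'b0}."""
    return awaddr & ~0x3F
```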
  • step S110 includes the following steps:
  • Step S112a performing XOR processing on the same bits in the addresses of every two consecutive burst accesses within the first preset time period;
  • Step S114a In response to the XOR result of any identical bit in two consecutive burst access addresses being 1, the count value corresponding to the bit is increased by 1, so as to obtain the final count value corresponding to each bit in the burst access address within the first preset time period, and the final count value corresponding to each bit in the burst access address is used as the number of flips of each bit in the burst access address within the first preset time period.
  • two consecutive burst accesses refer to two adjacent burst accesses in timing, that is, the two consecutive burst accesses will continuously access the above-mentioned memory to be accessed. Therefore, the number of flips of each bit in the address of the burst access obtained by the above statistical method within the first preset time period is equivalent to the frequency of flipping of the bits of the two burst accesses that continuously access the above-mentioned memory to be accessed within the first preset time period.
  • a burst access is a write operation burst access or a read operation burst access.
  • the burst access within the first preset time period may include a plurality of write operation burst accesses and a plurality of read operation burst accesses.
  • the number of flips of each bit in a burst access within a first preset time period is counted by crossing (or mixing) write operation burst accesses with read operation burst accesses, that is, when determining whether each bit in the address of each two consecutive burst accesses flips, it can be ignored whether each burst access in the two consecutive burst accesses is a write operation burst access or a read operation burst access.
  • the two consecutive burst accesses can be two consecutive write operation burst accesses, two consecutive read operation burst accesses, or one consecutive write operation burst access and one read operation burst access.
  • the number of flipping times of each bit in the address accessed by the write operation burst within the first preset time period and the number of flipping times of each bit in the address accessed by the read operation burst within the first preset time period are counted separately.
  • step S110 includes the following steps:
  • Step S112b determining whether the same bit in the addresses of every two consecutive write operation burst accesses in the first preset time period is flipped, and whether the same bit in the addresses of every two consecutive read operation burst accesses in the first preset time period is flipped, so as to respectively count the number of flips of each bit in the addresses of the write operation burst accesses in the first preset time period and the number of flips of each bit in the addresses of the read operation burst accesses in the first preset time period;
  • Step S114b The numbers of flips, within the first preset time period, of the same bit in the addresses of the write operation burst accesses and in the addresses of the read operation burst accesses are added together to obtain the number of flips of each bit in the address of the burst access within the first preset time period.
  • the number of flips of each bit in the address of a write operation burst access within the first preset time period and the number of flips of each bit in the address of a read operation burst access within the first preset time period can be counted in the write transfer monitoring module and the read transfer monitoring module of the memory controller, respectively, and then summarized in the mapping judgment module of the memory controller, and the number of flips of the same bit in the first preset time period are added to obtain the number of flips of each bit in the address of the burst access within the first preset time period.
  • The optimized address mapping relationship obtained by counting bit flips separately for reads and writes is better than the one obtained by counting reads and writes mixed together, because continuous read operations or continuous write operations perform better than interleaved reads and writes. Therefore, although read and write operation burst accesses are mixed together in the first-in-first-out (First In First Out, FIFO) queue of the memory controller, in order to achieve better performance, the read and write operation burst accesses can still be grouped together in the address mapping module after instruction scheduling.
  • FIFO First In First Out
  • the information identifiers of the write access request and the read access request are different, for example, each information of the write access request starts with AW, such as the starting address of the write access request is represented by AWADDR, the access data bit width is represented by AWSIZE, and the burst length is represented by AWLEN; each information of the read access request starts with AR, such as the starting address of the read access request is represented by ARADDR, the access data bit width is represented by ARSIZE, and the burst length is represented by ARLEN.
  • When the memory controller receives an access request, it can determine from the information identifier whether the received access request is a write access request or a read access request, so as to distribute write access requests and read access requests to the write transmission monitoring module and the read transmission monitoring module respectively for analysis, parse out the addresses of each write operation burst access and each read operation burst access, and then separately count the number of flips of each bit in the addresses of the write operation burst accesses and the read operation burst accesses within the first preset time period.
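A trivial sketch of this identifier-based dispatch, modeling a request as a dict keyed by AXI signal names (the monitor names are placeholders, not components defined by the patent):

```python
def dispatch(request):
    """Route an access request by its AXI channel prefix:
    AW* fields mark a write address channel request,
    AR* fields mark a read address channel request."""
    if "AWADDR" in request:
        return "write_monitor"
    if "ARADDR" in request:
        return "read_monitor"
    raise ValueError("not an address-channel request")
```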
  • step S120 includes the following steps:
  • the number of flips of the column bit in the address of the burst access within the first preset time period is corrected to obtain a corrected value of the number of flips of the column bit within the first preset time period; wherein, the column bit is the bit mapped to the column address of the memory to be accessed according to the current address mapping relationship, the precharge operation delay of the memory to be accessed is the shortest time from the precharge operation of the memory to be accessed to the execution of the access operation on the storage unit targeted by the precharge operation, and the activation operation delay of the memory to be accessed is the shortest time from the activation operation of the memory to be accessed to the execution of the access operation on the specified row of the storage unit targeted by the activation operation.
  • the precharge operation delay Trp of the memory to be accessed can be understood as follows: read and write operations on the same storage unit (the one targeted by the precharge operation) can be performed only at least Trp after the precharge operation.
  • similarly, the activation operation delay Trcd of the memory to be accessed means that read and write operations on the activated row (the designated row of the storage unit targeted by the activation operation) can be performed only at least Trcd after the activation operation. (Trp+Trcd) is therefore equivalent to the delay caused by one page conflict.
  • the second burst access is a page hit (the row to be accessed is just in the activated state).
  • Tccdl: the required time interval between two adjacent (consecutive) burst accesses when they access the same storage unit group
  • Tccds: the required time interval between two adjacent (consecutive) burst accesses when they access different storage unit groups.
  • the above-mentioned correction step includes the following steps:
  • t_colnew = t_col / [(Trp + Trcd) / (Tccdl - Tccds)]
  • t_colnew is the corrected number of flips of the column bit within the first preset time period
  • t_col is the number of flips of the column bit within the first preset time period before correction
  • Trp is the precharge operation delay of the memory to be accessed
  • Trcd is the activation operation delay of the memory to be accessed
  • Tccdl is the time interval between two consecutive burst accesses when they access the same storage unit group
  • Tccds is the time interval between two consecutive burst accesses when they access different storage unit groups.
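As an illustrative sketch (not part of the original disclosure; the function name and the use of floating-point division are assumptions), the correction formula above can be expressed as:

```python
def corrected_column_flips(t_col, trp, trcd, tccdl, tccds):
    """Correct a column bit's raw flip count t_col by the ratio of the
    page-conflict penalty (Trp + Trcd) to the extra latency of staying in
    the same storage unit group (Tccdl - Tccds), i.e.
    t_colnew = t_col / [(Trp + Trcd) / (Tccdl - Tccds)]."""
    return t_col / ((trp + trcd) / (tccdl - tccds))
```

For example, with hypothetical timings Trp = Trcd = 30 cycles, Tccdl = 6 and Tccds = 4, a raw column-bit count of 300 shrinks to 10, reflecting that flipping a column bit is far cheaper than incurring a page conflict.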
  • in hardware, a group of base-T counters can be placed on each column bit.
  • such a group can be built by cascading two counters: the first counter automatically resets to zero when it counts to T and outputs a trigger signal to the second counter.
  • the second counter increments its count value by 1 on each received trigger signal.
  • in some embodiments, the above correction step may further include the following step:
  • the number of flips of the column bits in the addresses of the write-operation burst accesses within the first preset time period and the number of flips of the column bits in the addresses of the read-operation burst accesses within the first preset time period are corrected separately, and the corrected numbers are then added to obtain the corrected number of flips of the column bits in the addresses of the burst accesses within the first preset time period.
  • the corrections for the write-operation and read-operation burst accesses follow the same form as the formula above: t_colnew_w = t_col_w / [(Trp + Trcd) / (Tccdl_w - Tccds_w)] and t_colnew_r = t_col_r / [(Trp + Trcd) / (Tccdl_r - Tccds_r)], where:
  • t_colnew_w is the corrected number of flips of the column bits in the addresses of the write-operation burst accesses within the first preset time period
  • t_col_w is the number of flips of the column bits in the addresses of the write-operation burst accesses within the first preset time period before correction
  • Tccdl_w is the time interval between two consecutive write-operation burst accesses when they access the same storage unit group
  • Tccds_w is the time interval between two consecutive write-operation burst accesses when they access different storage unit groups
  • t_colnew_r is the corrected number of flips of the column bits in the addresses of the read-operation burst accesses within the first preset time period
  • t_col_r is the number of flips of the column bits in the addresses of the read-operation burst accesses within the first preset time period before correction
  • Tccdl_r is the time interval between two consecutive read-operation burst accesses when they access the same storage unit group
  • Tccds_r is the time interval between two consecutive read-operation burst accesses when they access different storage unit groups.
  • in some embodiments, optimizing the current address mapping relationship in step S120 includes:
  • Step S122: from the flip counts of each bit in the addresses of the burst accesses within the first preset time period, selecting the first N bits with the largest number of flips;
  • Step S124: among the first N bits with the largest number of flips, mapping the M bits with the largest number of flips to the storage unit group address of the memory to be accessed, and mapping the other N-M bits with the next-largest number of flips to the storage unit address of the memory to be accessed, so as to obtain an optimized address mapping relationship.
  • that is, the M bits that flip most frequently are mapped to the storage unit group address, and the N-M bits that flip next most frequently are mapped to the storage unit address, so that two consecutive (adjacent) burst accesses tend to access different storage unit groups or different storage units.
  • the new burst access can be mapped to the appropriate memory physical address more accurately, reducing the occurrence of page conflicts, thereby improving the parallelism of memory access and increasing the bus bandwidth.
  • M is the number of storage unit group bits in the address of the burst access, and the storage unit group bits are the bits mapped to the storage unit group address of the memory to be accessed according to the current address mapping relationship
  • N-M is the number of storage unit bits in the address of the burst access, and the storage unit bits are the bits mapped to the storage unit address of the memory to be accessed according to the current address mapping relationship
  • N is the total number of storage unit bits and storage unit group bits in the address of the burst access.
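Steps S122 and S124 above can be sketched as follows (an illustrative model only; the function name and the dictionary representation of the flip statistics are assumptions, not part of the disclosure):

```python
def optimize_mapping(flip_counts, n, m):
    """flip_counts maps a bit index to its flip count over the first preset
    time period. Returns (bank_group_bits, bank_bits): the M most-flipped
    bits are mapped to the storage unit group (bank group) address, and the
    next N-M most-flipped bits to the storage unit (bank) address."""
    top_n = sorted(flip_counts, key=flip_counts.get, reverse=True)[:n]
    return top_n[:m], top_n[m:]
```

With flip statistics resembling the later example in this disclosure (bits 10, 18, 14 and 15 flipping most often), `optimize_mapping(counts, 4, 2)` would yield `([10, 18], [14, 15])`.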
  • the address of the burst access has a total of 34 bits, where M and N are 2 and 4 respectively.
  • step S120 further includes:
  • Step S126 For multiple bits other than the first N bits with the most flip times in the address of the burst access, divide them into multiple groups of bits according to a preset rule, and map the multiple groups of bits to corresponding types of addresses in the memory physical address of the memory to be accessed.
  • the corresponding type of address in the memory physical address of the memory to be accessed includes at least a column address and a row address.
  • the corresponding types of addresses in the memory physical address of the memory to be accessed include a column address, a row address, a data channel address, and a chip select address.
  • in some embodiments, the preset rule is to divide the bits from low to high according to the number of bits required by each group. That is, the multiple bits of the burst access address other than the first N bits with the most flips are divided into four groups of bits in order from low to high, and mapped respectively to the data channel address, column address, row address, and chip select address of the memory to be accessed.
  • the burst access address has a total of 34 bits, wherein the number of bits in the four groups of bits respectively mapped to the data channel address, column address, row address and chip select address of the memory to be accessed are 2, 11, 16 and 1 respectively.
  • the data channels are two channels of the memory particles, which can store four bytes of data, corresponding to a 2-bit address, so the lowest two bits of AXI can be fixedly mapped to the data channel.
  • the memory to be accessed includes multiple memory particles (such as X8 type memory)
  • a chip select signal is needed to select which memory particle to access, and the highest available bit can be fixedly mapped as the chip select signal.
  • the bits corresponding to the chip select address do not need to be divided; when there is only one data channel in the memory cell, the bits corresponding to the data channel do not need to be divided.
  • the number of row and column addresses varies depending on the size and type of the selected memory particles.
  • the addresses of access requests sent by upstream devices based on the AXI transmission protocol are continuous addresses, so the lower-order addresses can be mapped to the column addresses and the higher-order addresses can be mapped to the row addresses. In this way, continuous addresses will be mapped to different columns, which is performance-friendly.
  • the above four groups of bits can be divided from low to high, and mapped to the data channel address, column address, row address and chip select address of the memory to be accessed respectively.
  • the current address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed can be as shown in FIG. 2.
  • the address of the burst access has a total of 34 bits, among which the 0th to 1st bits of the burst access address are mapped to the data channel address of the memory to be accessed, the 2nd to 12th bits are mapped to the column address of the memory to be accessed, the 13th to 14th bits are mapped to the storage unit group address of the memory to be accessed, the 15th to 16th bits are mapped to the storage unit address of the memory to be accessed, the 17th to 32nd bits are mapped to the row address of the memory to be accessed, and the 33rd bit is mapped to the chip select address of the memory to be accessed.
  • the memory access system can use the address mapping relationship shown in Figure 2 as the internal default or initial address mapping relationship, that is, when the memory access system is initially operated, the address mapping relationship shown in Figure 2 can be used for address mapping.
  • the 4-bit (4bits) storage unit group (Bank group) address and storage unit (Bank) address are scattered between the row address and the column address.
  • continuous burst access can be allowed to access different storage units as much as possible to a certain extent, which can solve the address mapping problem before the address mapping relationship is optimized to a certain extent.
  • the positions of the storage unit group (Bank group) address and the storage unit (Bank) address are relatively flexible, and also need to be focused on during the adaptive optimization process.
  • suppose the top 4 bits with the most flips within the first preset time period are the 10th bit (currently a column bit, whose flip count has been corrected), the 18th bit, the 14th bit and the 15th bit, as shown in FIG. 3.
  • the 10th bit and the 18th bit are mapped to the storage unit group (bank group) address of the memory to be accessed
  • the 14th bit and the 15th bit are mapped to the storage unit (bank) address of the memory to be accessed
  • the remaining 0th to 9th bits, 11th to 13th bits, 16th to 17th bits, and 19th to 33rd bits are divided into a first group of bits (2 bits), a second group of bits (11 bits), a third group of bits (16 bits) and a fourth group of bits (1 bit) in order from low to high, and then these four groups of bits are mapped to the data channel address, the column address, the row address and the chip select address respectively.
  • bits 0 to 1 are mapped to the data channel address of the memory to be accessed
  • bits 2 to 9 and bits 11 to 13 are mapped to the column address of the memory to be accessed
  • bits 16 to 17 and bits 19 to 32 are mapped to the row address of the memory to be accessed
  • bit 33 is mapped to the chip select address of the memory to be accessed.
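The optimized layout enumerated above can be expressed as a bit-gathering table and applied to an incoming 34-bit burst address. This is only an illustrative sketch; the function `map_address` and the field names are assumed, not taken from the disclosure:

```python
def map_address(addr, field_bits):
    """Split a 34-bit burst address into memory physical address fields.
    field_bits maps a field name to its source bit positions (LSB first)."""
    def gather(bits):
        # Collect the selected source bits into a compact field value.
        return sum(((addr >> b) & 1) << i for i, b in enumerate(bits))
    return {name: gather(bits) for name, bits in field_bits.items()}

# Optimized mapping from this example: bits 10/18 -> bank group,
# bits 14/15 -> bank, the rest split low-to-high into
# channel (2 bits) / column (11) / row (16) / chip select (1).
OPTIMIZED = {
    "channel": [0, 1],
    "column": list(range(2, 10)) + [11, 12, 13],
    "bank_group": [10, 18],
    "bank": [14, 15],
    "row": [16, 17] + list(range(19, 33)),
    "chip_select": [33],
}
```

A flip on bit 10 or bit 18 now changes the bank-group field, so two consecutive burst accesses differing in those bits are steered into different storage unit groups.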
  • in step S130, the optimized address mapping relationship obtained in step S120 maps, among the first N bits with the largest number of flips, the M most-flipped bits to the storage unit group address and the other N-M next-most-flipped bits to the storage unit address of the memory to be accessed. Therefore, when burst accesses received within the second preset time period are mapped to memory physical addresses using the optimized relationship, two consecutive burst accesses whose addresses differ in at least one of those M most-flipped bits are mapped to different storage unit group addresses and thus access different storage unit groups of the memory to be accessed; likewise, two consecutive burst accesses whose addresses differ in at least one of the other N-M bits are mapped to different storage unit addresses and thus access different storage units of the memory to be accessed. This reduces page conflicts, improves the parallelism of memory access, and increases the bus bandwidth.
  • the address mapping relationship is stored in a non-volatile memory of a memory access system, and the optimized address mapping relationship obtained according to the burst access within the first preset time period will be uploaded to the non-volatile memory at a corresponding time after the first preset time period, and after the upload, the optimized address mapping relationship can be used in a subsequent second preset time period.
  • the second preset time period can be a time period from after the optimized address mapping relationship is uploaded to before the next optimized address mapping relationship is uploaded.
  • the corresponding time window may be selected as the first preset time period, and steps S110 to S130 may be repeated to perform the next optimization operation.
  • step S130 may further include the following steps:
  • an optimized address mapping relationship is used to map the address of the burst access received within the second preset time period to the memory physical address corresponding to the memory to be accessed, so as to perform an access operation on the memory to be accessed.
  • an address mapping device including:
  • the transfer monitoring module is configured to determine whether the same bit in the addresses of every two consecutive burst accesses within a first preset time period is flipped, so as to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period;
  • the mapping judgment module is connected to the transmission monitoring module and is configured to optimize the current address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed according to the number of flips of each bit in the address of the burst access monitored by the transmission monitoring module within a first preset time period, so as to obtain an optimized address mapping relationship.
  • the transmission monitoring module includes:
  • a write transfer monitoring module is configured to determine whether the same bit in the address of each of two consecutive write operation burst accesses in a first preset time period is flipped, so as to count the number of flips of each bit in the address of the write operation burst access in the first preset time period;
  • the read transfer monitoring module is configured to determine whether the same bit in the address of each two consecutive read operation burst accesses is flipped within a first preset time period, so as to count the number of flips of each bit in the address of the read operation burst access within the first preset time period.
  • the flipping times of each bit in the address accessed by the write operation burst within the first preset time period and the flipping times of each bit in the address accessed by the read operation burst within the first preset time period can be counted in the write transfer monitoring module and the read transfer monitoring module respectively, and then summarized in the mapping judgment module, and the flipping times of the same bit in the first preset time period are added to obtain the flipping times of each bit in the address accessed by the burst within the first preset time period.
  • the write transfer monitoring module includes a plurality of first counters configured to respectively count the number of flips of each bit in the address accessed by the write operation burst;
  • the read transmission monitoring module includes a plurality of second counters configured to respectively count the number of flippings of each bit in the address accessed by the read operation burst.
  • the embodiment of the present disclosure further provides a memory controller connected between an upstream device and a memory to be accessed, including:
  • a first-in-first-out queue, connected to the address mapping device, is configured to perform cross-clock-domain processing, between the upstream device and the memory to be accessed, on the burst accesses transmitted from the address mapping device, and to output them;
  • the address mapping module is connected between the FIFO queue and the memory to be accessed, and is configured to adopt a corresponding address mapping relationship to map the address of the burst access transmitted from the FIFO queue to the memory physical address corresponding to the memory to be accessed, so as to perform an access operation on the memory to be accessed.
  • the first-in-first-out queue (FIFO) is used to implement cross-clock domain processing between the upstream device and the memory to be accessed, so that the memory to be accessed performs corresponding read and write operations according to the time sequence of the received access instructions.
  • the transfer monitoring module of the address mapping device includes a write transfer monitoring module and a read transfer monitoring module
  • the memory controller also includes a multiplexer (Mux) connected between the address mapping device and the first-in-first-out queue, and is configured to select one of the write operation burst access transmitted from the write transfer monitoring module and the read operation burst access transmitted from the read transfer monitoring module to write into the first-in-first-out queue according to the write pointer of the first-in-first-out queue.
  • each module of the above-mentioned memory controller can refer to any embodiment of the above-mentioned memory access method, which will not be repeated here.
  • an embodiment of the present disclosure further provides a memory access system, including an upstream device, a memory to be accessed, and a memory controller of any of the above embodiments.
  • the upstream device is connected to the memory to be accessed through the memory controller to access the memory to be accessed through the memory controller.
  • the memory to be accessed includes SDRAM, and SDRAM includes but is not limited to DDR, GDDR and LPDDR.
  • the above system also includes: a non-volatile memory (not shown in the figure), which is connected to the memory controller and configured to store the address mapping relationship between the address of the burst access and the memory physical address of the memory to be accessed.
  • Non-volatile memory can be flash memory, read-only memory (ROM) and other memories.
  • the above system also includes: a port physical layer chip (PHY), which is connected between the memory to be accessed and the memory controller, and is configured to convert the digital signals of the burst accesses transmitted by the memory controller into interface physical signals of the memory to be accessed.
  • the memory controller and the port physical layer are connected through the DFI (DDR PHY Interface) protocol.
  • the product form of the memory access system is a GPU SOC system.
  • the GPU SOC system includes a GPU core and other upstream devices (such as encoders, decoders, displays, etc.), a memory to be accessed (such as GDDR), a CPU core and a flash chip (non-volatile memory), which initiate read and write transmissions to the memory to be accessed.
  • the memory controller determines the address mapping relationship in an adaptive manner through the memory access method of any of the above embodiments, and the steps are as follows:
  • Step 1: While the system is running, the memory controller operates with the current address mapping relationship.
  • on first operation, the controller can run with the initial (default) address mapping relationship.
  • Step 2: Each upstream device in the system initiates accesses to the memory to be accessed, and the write transfer monitoring module and the read transfer monitoring module in the memory controller monitor the read/write transfers (read/write access requests) within the set time window (the first preset time period).
  • for example, the addresses of four successive burst accesses are parsed as 0x200_0000, 0x200_0040, 0x200_0080, and 0x200_00C0.
  • the same bits of two consecutive addresses are XORed; for the pair (0x300_0000 ^ 0x200_0000), the XOR result of the 24th bit (bit[24]) is 1, i.e., bit[24] flipped once, and its corresponding counter is incremented by 1.
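The per-bit XOR monitoring can be sketched as follows (a simplified software model of the hardware counters; names are assumptions, not from the disclosure):

```python
def count_bit_flips(addresses, width=34):
    """For each bit position, count how often it differs between two
    consecutive burst addresses (a 1 in the XOR marks a flipped bit)."""
    counters = [0] * width
    for prev, cur in zip(addresses, addresses[1:]):
        diff = prev ^ cur
        for bit in range(width):
            counters[bit] += (diff >> bit) & 1
    return counters
```

For the four example addresses, bit 6 flips three times (the 0x40 steps) and bit 7 once; XORing 0x300_0000 with 0x200_0000 increments the bit[24] counter once.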
  • Step 3 The read/write monitoring module in the memory controller aggregates the statistical information of the address flip to the mapping judgment module, and the mapping judgment module obtains the address mapping relationship.
  • Step 4: From the number of flips of each bit in the addresses of the multiple burst accesses within the first preset time period (comprising one time window or multiple time windows), adaptively optimize the current address mapping relationship, obtain the optimized address mapping relationship, and save it in the flash chip.
  • Step 5 After power is turned on again, the address mapping module in the memory controller selects the address mapping relationship obtained by hardware adaptation to perform address mapping.
  • the embodiment of the present disclosure also provides an electronic apparatus, which includes the memory access system of any of the above embodiments.
  • in some usage scenarios, the product form of the electronic apparatus is embodied as a graphics card; in other usage scenarios, the product form of the electronic apparatus is embodied as a CPU motherboard.
  • the disclosed embodiment also provides an electronic device, which includes the above-mentioned electronic apparatus.
  • the product form of the electronic device is a portable electronic device, such as a smart phone, a tablet computer, a VR device, etc.; in some usage scenarios, the product form of the electronic device is a personal computer, a game console, etc.


Abstract

The present disclosure provides a memory access method, apparatus, system, and electronic device. The method optimizes the initial address mapping relationship according to the counted number of flips of each bit, within a first preset time period, in the addresses of burst accesses, to obtain an optimized address mapping relationship; for burst accesses received within a second preset time period after the first preset time period, the optimized address mapping relationship is used to map the addresses of the burst accesses received within the second preset time period onto the corresponding memory physical addresses of the memory to be accessed, so as to perform access operations on the memory to be accessed. This method of adaptively optimizing the address mapping relationship from statistics of actual operation can map subsequent burst accesses to appropriate memory physical addresses relatively accurately and minimize the occurrence of page conflicts, thereby improving the parallelism of memory access and increasing the bus bandwidth.

Description

Memory Access Method, Apparatus, System, and Electronic Device
Cross-Reference to Related Applications
This application is based on and claims priority to Chinese patent application No. 202211321996.7, filed on October 27, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of memory technology, and in particular to a memory access method, apparatus, system, and electronic device.
Background
With the development of science and technology, System on a Chip (SOC) chips place ever higher requirements on memory access speed. Graphics Double Data Rate Synchronous Dynamic Random Access Memory (GDDR SDRAM) is a newer memory design that achieves higher data bandwidth, with a greatly increased data throughput rate. GDDR is currently widely used in Graphics Processing Unit (GPU), Artificial Intelligence (AI), and Augmented Reality (AR)/Virtual Reality (VR) devices.
In an SOC that uses GDDR as memory, memory access performance is strongly related to the storage units (banks), rows, and columns accessed by two successive transfers. The traditional approach is to select, through simulation, the best-performing mapping between addresses and memory physical addresses as the subsequent address mapping relationship. However, simulation consumes a great deal of labor and time, and the stimuli applied during simulation may differ considerably from the actual transfers when the system is running, so the selected address mapping relationship may not be the best one.
Summary
The purpose of the present disclosure is to provide a memory access method, apparatus, system, and electronic device, solving the technical problems in the prior art that the address mapping relationship during memory access is difficult to determine and has low accuracy.
According to one aspect of the present disclosure, a memory access method is provided, including:
determining whether the same bit in the addresses of every two consecutive burst accesses within a first preset time period is flipped, so as to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period;
optimizing, according to the number of flips of each bit in the addresses of the burst accesses within the first preset time period, the current address mapping relationship between the addresses of the burst accesses and the memory physical addresses of the memory to be accessed, to obtain an optimized address mapping relationship;
for burst accesses received within a second preset time period after the first preset time period, using the optimized address mapping relationship to map the addresses of the burst accesses received within the second preset time period onto the corresponding memory physical addresses of the memory to be accessed, so as to perform access operations on the memory to be accessed.
In some embodiments of the above memory access method, before the step of optimizing the current address mapping relationship between the addresses of the burst accesses and the memory physical addresses of the memory to be accessed according to the number of flips of each bit in the addresses of the burst accesses within the first preset time period to obtain an optimized address mapping relationship, the method further includes:
correcting, based on the precharge operation delay and the activation operation delay of the memory to be accessed, the number of flips of the column bits in the addresses of the burst accesses within the first preset time period; wherein a column bit is a bit mapped to the column address of the memory to be accessed according to the current address mapping relationship.
In some embodiments of the above memory access method, correcting the number of flips of the column bits in the addresses of the burst accesses within the first preset time period based on the precharge operation delay and the activation operation delay of the memory to be accessed includes the following step:
based on the precharge operation delay and the activation operation delay of the memory to be accessed, correcting the number of flips of the column bits in the addresses of the burst accesses within the first preset time period through the following formula:
t_colnew = t_col / [(Trp + Trcd) / (Tccdl - Tccds)]
where t_colnew is the corrected number of flips of the column bits within the first preset time period, t_col is the number of flips of the column bits within the first preset time period before correction, Trp is the precharge operation delay of the memory to be accessed, Trcd is the activation operation delay of the memory to be accessed, Tccdl is the time interval between two consecutive burst accesses when they access the same storage unit group, and Tccds is the time interval between two consecutive burst accesses when they access different storage unit groups.
In some embodiments of the above memory access method, optimizing the current address mapping relationship between the addresses of the burst accesses and the memory physical addresses of the memory to be accessed according to the number of flips of each bit in the addresses of the burst accesses within the first preset time period to obtain an optimized address mapping relationship includes the following steps:
selecting, from the flip counts of each bit in the addresses of the burst accesses within the first preset time period, the first N bits with the largest number of flips;
among the first N bits with the largest number of flips, mapping the M bits with the largest number of flips to the storage unit group address of the memory to be accessed, and mapping the other N-M bits with the next-largest number of flips to the storage unit address of the memory to be accessed, to obtain the optimized address mapping relationship.
In some embodiments of the above memory access method, after the step of selecting the first N bits with the largest number of flips from the flip counts of each bit in the addresses of the burst accesses within the first preset time period, the method further includes:
dividing the multiple bits in the addresses of the burst accesses other than the first N bits with the largest number of flips into multiple groups of bits according to a preset rule, and mapping the multiple groups of bits respectively to the corresponding types of addresses in the memory physical addresses of the memory to be accessed.
In some embodiments of the above memory access method, the corresponding types of addresses in the memory physical addresses of the memory to be accessed include at least a column address and a row address.
In some embodiments of the above memory access method, determining whether the same bit in the addresses of every two consecutive burst accesses within the first preset time period is flipped, so as to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period, includes the following steps:
determining whether the same bit in the addresses of every two consecutive write-operation burst accesses within the first preset time period is flipped, and whether the same bit in the addresses of every two consecutive read-operation burst accesses within the first preset time period is flipped, so as to separately count the number of flips of each bit in the addresses of the write-operation burst accesses and the number of flips of each bit in the addresses of the read-operation burst accesses within the first preset time period;
adding the numbers of flips of the same bits in the addresses of the write-operation burst accesses and the read-operation burst accesses within the first preset time period, to obtain the number of flips of each bit in the addresses of the burst accesses within the first preset time period.
In some embodiments of the above memory access method, determining whether the same bit in the addresses of every two consecutive burst accesses within the first preset time period is flipped, so as to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period, includes the following steps:
performing an XOR operation on the same bits of the addresses of every two consecutive burst accesses within the first preset time period;
in response to the XOR result of any same bit of the addresses of two consecutive burst accesses being 1, incrementing the count value corresponding to that bit by 1, thereby obtaining the final count value corresponding to each bit of the addresses of the burst accesses within the first preset time period, and taking the final count value corresponding to each bit as the number of flips of that bit within the first preset time period.
In some embodiments of the above memory access method, for burst accesses received within the second preset time period after the first preset time period, using the optimized address mapping relationship to map the addresses of the burst accesses received within the second preset time period onto the corresponding memory physical addresses of the memory to be accessed so as to perform access operations on the memory to be accessed includes the following step:
in response to the memory to be accessed being powered on again, for burst accesses received within the second preset time period after re-power-on, using the optimized address mapping relationship to map the addresses of the burst accesses received within the second preset time period onto the corresponding memory physical addresses of the memory to be accessed, so as to perform access operations on the memory to be accessed.
In some embodiments of the above memory access method, the addresses of the burst accesses within the first preset time period are obtained through the following steps:
acquiring multiple access requests for the memory to be accessed received within the first preset time period;
parsing the multiple access requests to obtain the address of each burst access in the multiple access requests.
In some embodiments of the above memory access method, before the step of parsing the multiple access requests to obtain the address of each burst access in the multiple access requests, the method further includes:
aligning the starting addresses of the multiple access requests with the bus bit width of the memory to be accessed.
According to another aspect of the present disclosure, an address mapping device is provided, including:
a transfer monitoring module configured to determine whether the same bit in the addresses of every two consecutive burst accesses within a first preset time period is flipped, so as to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period;
a mapping judgment module, connected to the transfer monitoring module, configured to optimize, according to the number of flips of each bit in the addresses of the burst accesses within the first preset time period monitored by the transfer monitoring module, the current address mapping relationship between the addresses of the burst accesses and the memory physical addresses of the memory to be accessed, to obtain an optimized address mapping relationship.
In some embodiments of the above address mapping device, the transfer monitoring module includes:
a write transfer monitoring module configured to determine whether the same bit in the addresses of every two consecutive write-operation burst accesses within the first preset time period is flipped, so as to count the number of flips of each bit in the addresses of the write-operation burst accesses within the first preset time period;
a read transfer monitoring module configured to determine whether the same bit in the addresses of every two consecutive read-operation burst accesses within the first preset time period is flipped, so as to count the number of flips of each bit in the addresses of the read-operation burst accesses within the first preset time period.
In some embodiments of the above address mapping device, the write transfer monitoring module includes multiple first counters configured to respectively count the number of flips of each bit in the addresses of the write-operation burst accesses;
the read transfer monitoring module includes multiple second counters configured to respectively count the number of flips of each bit in the addresses of the read-operation burst accesses.
According to another aspect of the present disclosure, a memory controller connected between an upstream device and a memory to be accessed is provided, including:
the address mapping device of any of the above embodiments;
a first-in-first-out queue, connected to the address mapping device, configured to perform cross-clock-domain processing on the burst accesses transmitted from the address mapping device and output them;
an address mapping module, connected between the first-in-first-out queue and the memory to be accessed, configured to use the corresponding address mapping relationship to map the addresses of the burst accesses transmitted from the first-in-first-out queue onto the corresponding memory physical addresses of the memory to be accessed, so as to perform access operations on the memory to be accessed.
In some embodiments of the above memory controller, the transfer monitoring module of the address mapping device includes a write transfer monitoring module and a read transfer monitoring module;
the memory controller further includes a multiplexer, connected between the address mapping device and the first-in-first-out queue, configured to select, according to the write pointer of the first-in-first-out queue, one of the write-operation burst accesses transmitted from the write transfer monitoring module and the read-operation burst accesses transmitted from the read transfer monitoring module to write into the first-in-first-out queue.
According to another aspect of the present disclosure, a memory access system is provided, including an upstream device, a memory to be accessed, and the memory controller of any of the above embodiments.
In some embodiments, the above memory access system further includes: a non-volatile memory, connected to the memory controller, configured to store the address mapping relationship between the addresses of the burst accesses and the memory physical addresses of the memory to be accessed.
In some embodiments, the above memory access system further includes: a port physical layer chip, connected between the memory to be accessed and the memory controller, configured to convert the digital signals of the burst accesses transmitted from the memory controller into interface physical signals of the memory to be accessed.
According to another aspect of the present disclosure, an electronic apparatus is provided, including the memory access system of any of the above embodiments.
According to another aspect of the present disclosure, an electronic device is provided, including the electronic apparatus of any of the above embodiments.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a memory access method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the current address mapping relationship between the addresses of burst accesses and the memory physical addresses of the memory to be accessed, provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the optimized mapping relationship between the addresses of burst accesses and the memory physical addresses of the memory to be accessed, provided by an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of an address mapping device provided by an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a memory controller provided by an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of a memory access system provided by an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a GPU SOC system provided by an embodiment of the present disclosure.
Detailed Description
Before introducing the embodiments of the present disclosure, it should be noted that:
Some embodiments of the present disclosure are described as process flows. Although the operation steps of a flow may be given sequential step numbers, the steps may be performed in parallel, concurrently, or simultaneously.
The terms "first", "second", etc. may be used in the embodiments of the present disclosure to describe various features, but these features should not be limited by these terms. These terms are used only to distinguish one feature from another.
The term "and/or" may be used in the embodiments of the present disclosure; "and/or" includes any and all combinations of one or more of the associated listed features.
It should be understood that, when describing a connection or communication relationship between two components, unless it is explicitly specified that the two components are directly connected or directly communicate, their connection or communication may be understood as either direct, or indirect through intermediate components.
To make the technical solutions and advantages of the embodiments of the present disclosure clearer, exemplary embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present disclosure, not an exhaustive list of all embodiments. It should be noted that, where there is no conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with one another.
In a memory access system, upon receiving an access request sent by an upstream device based on the AXI transfer protocol, the memory controller converts the address of the access request (the access address given by the upstream device) into a memory physical address (i.e., the physical address of the memory particles) through address mapping. For example, one type of memory (such as a GDDR particle) has 16 storage units (banks) in total, which are divided into 4 storage unit groups (bank groups); 4 bits indicate the bank information, of which 2 bits indicate the bank group address information and 2 bits indicate the bank address information. Each bank is a storage array with multiple rows and columns; the numbers of rows and columns differ depending on the size and type of the memory particles. A memory particle address can be divided into at least a bank group address, a bank address, a row address, and a column address. Inside the memory particle, addressing with the bank group address, bank address, row address, and column address accesses the corresponding position of the storage array. The row address and the column address may share address signal lines, so consecutive accesses to different rows of the same bank are generally not allowed; instead, a group of switching actions must be inserted between two row-changing accesses. These switching actions include precharging the previously operated row and activating the row about to be operated. The order and time intervals of these actions are specified for each memory or storage chip; the actions incur a certain delay and therefore reduce access efficiency.
Depending on the address of the particle access, transfers are classified as page hits, page misses, and page conflicts. A page hit means the row to be accessed is already in the activated state, so read and write accesses can proceed directly without extra operations. A page miss means that no row of the bank to be accessed is in the activated state; an activation command must first be issued to activate the target row before accessing it. A page conflict means that in the bank containing the target row, a row other than the target row is in the activated state; a precharge command must be issued to close that activated row, then an activation command issued to activate the target row, and only then can the target row be accessed. Page hits are the friendliest to system performance, while page conflicts are the least friendly and should be avoided as much as possible. In addition, for two consecutive accesses to the particles, accessing different bank groups performs better than accessing the same bank group.
The purpose of the present disclosure is to provide a memory access method, apparatus, system, and electronic device that optimize the current address mapping relationship according to the counted number of flips of each bit, within a first preset time period, in the addresses of burst accesses, to obtain an optimized address mapping relationship; and, for burst accesses received within a second preset time period after the first preset time period, use the optimized address mapping relationship to map the addresses of those burst accesses onto the corresponding memory physical addresses of the memory to be accessed, so as to perform access operations on the memory to be accessed. This method of adaptively optimizing the address mapping relationship from statistics of actual operation can map subsequent burst accesses to appropriate memory physical addresses relatively accurately, make consecutive (adjacent) burst accesses tend to access different storage unit groups or different storage units, and reduce the occurrence of page conflicts, thereby improving the parallelism of memory access and increasing the bus bandwidth. Moreover, this adaptive optimization can proceed while the memory access system actually runs, without consuming a great deal of labor and time, which simplifies the optimization of the address mapping relationship.
An embodiment of the present disclosure provides a memory access method. As shown in FIG. 1, the method includes:
Step S110: determining whether the same bit in the addresses of every two consecutive burst accesses within a first preset time period is flipped, so as to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period;
Step S120: optimizing, according to the number of flips of each bit in the addresses of the burst accesses within the first preset time period, the current address mapping relationship between the addresses of the burst accesses and the memory physical addresses of the memory to be accessed, to obtain an optimized address mapping relationship;
Step S130: for burst accesses received within a second preset time period after the first preset time period, using the optimized address mapping relationship to map the addresses of the burst accesses received within the second preset time period onto the corresponding memory physical addresses of the memory to be accessed, so as to perform access operations on the memory to be accessed.
The memory to be accessed includes multiple storage units (banks), each of which is a storage array with multiple rows and columns; the multiple storage units of the memory to be accessed are divided into multiple storage unit groups (bank groups).
In some embodiments, the memory to be accessed includes, without limitation, Synchronous Dynamic Random Access Memory (SDRAM); SDRAM includes but is not limited to Double Data Rate SDRAM (DDR SDRAM), GDDR, and Low Power Double Data Rate SDRAM (LPDDR SDRAM).
In some embodiments, the memory to be accessed may have 16 storage units (banks) divided into 4 storage unit groups (bank groups), with 4 bits indicating the bank information, of which 2 bits indicate the bank group address information and 2 bits indicate the bank address information.
In some embodiments, the first preset time period comprises one time period, i.e., one time window. That is, the addresses of burst accesses to the memory to be accessed received within one relatively long time window can be analyzed to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period.
In other embodiments, the first preset time period comprises multiple time periods, i.e., multiple time windows. That is, the addresses of burst accesses to the memory to be accessed received within multiple time windows can be analyzed to count the number of flips of each bit in the addresses of the burst accesses within the first preset time period.
In some embodiments, the addresses of all burst accesses have the same number of bits.
In some embodiments, the addresses of the burst accesses within the first preset time period are obtained through the following steps:
(1) determining multiple access requests for the memory to be accessed received within the first preset time period;
(2) parsing the multiple access requests to obtain the address of each burst access in the multiple access requests.
In some embodiments, an access request includes a starting address, an access data bit width, and a burst length.
The burst type of an access request is INCR or WRAP. The addresses of an INCR-type burst access (burst transfer) increment; the addresses of a WRAP-type burst access (burst transfer) wrap around.
In some embodiments, when the burst type of the access request is INCR, parsing the multiple access requests to obtain the address of each burst access in the multiple access requests includes the following steps:
(a) taking the starting address of each access request as the address of the first burst access of that access request;
(b) according to the address of the first burst access of each access request, together with the access data bit width and burst length of that request, obtaining the address of every other burst access in each access request through the following formulas:
Address_n = Aligned_address + (n-1)*Number_bytes;
Number_bytes = 2^AWSIZE;
Aligned_address = INT(Start_address/Number_bytes)*Number_bytes;
where n starts at 2, Address_n is the address of the n-th burst access of the corresponding access request, the maximum value of n is the burst length of the corresponding access request, AWSIZE is the access data bit width of the corresponding access request, INT is the floor function, and Start_address is the starting address of the corresponding access request.
In other embodiments, in the above step (b) the address of every other burst access in each access request may also be obtained through the following formula:
Address_n = Aligned_address + n*Number_bytes
where n starts at 1, Address_n is the address of the (n+1)-th burst access of the corresponding access request, and the maximum value of n is the burst length of the corresponding access request minus 1. That is, Address_1 is the address of the second burst access of the corresponding access request, Address_2 is the address of the third burst access, and so on.
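The INCR parsing in steps (a) and (b) can be sketched as follows (an illustrative model only; the function and variable names are assumptions, not from the disclosure):

```python
def incr_burst_addresses(start_address, awsize, burst_length):
    """Per the formulas above: the first beat uses the raw starting
    address; beats n >= 2 use Aligned_address + (n-1)*Number_bytes."""
    number_bytes = 1 << awsize                     # Number_bytes = 2^AWSIZE
    aligned = (start_address // number_bytes) * number_bytes
    return [start_address] + [aligned + (n - 1) * number_bytes
                              for n in range(2, burst_length + 1)]
```

For instance, `incr_burst_addresses(0x200_0000, 6, 4)` (64-byte beats, burst length 4) yields 0x200_0000, 0x200_0040, 0x200_0080, 0x200_00C0.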
在一些实施例中,当访问请求的突发类型为WRAP时,对上述多个访问请求进行解析,以得到上述多个访问请求中的各个突发访问的地址,包括以下步骤:
(a)将每个访问请求的起始地址作为每个访问请求中第一个突发访问的地址;
(b)根据每个访问请求中第一个突发访问的地址,以及每个访问请求的数据位宽和突发长度,通过如下计算式计算每个访问请求的低位地址边界:
Low_boundary_address=INT((Start_address)/(Number_bytes*
Burst_length))*(Number_bytes*Burst_length);
Number_bytes=2^AWSIZE;
其中,Low_boundary_address为低位地址边界,INT为向下取整函数,Start_address为对应的访问请求的起始地址,AWSIZE为对应的访问请求的访问数据位宽,Burst_length为对应的访问请求的突发长度;
(c)根据每个访问请求的低位地址边界,以及每个访问请求的突发长度,通过如下计算式计算每个访问请求的高位地址边界:
High_boundary_address=Low_boundary_address+
(Number_bytes*Burst_length);
其中,High_boundary_address为高位地址边界;
(d)通过如下计算式得到每个访问请求中其它每个突发访问的地址:
Address_n=Start_address+(n-1)*Number_bytes;
其中,n起始值为2,Address_n为对应的访问请求中第n个突发访问的地址,n的最大值为对应的访问请求的突发长度;
(e)将计算得到的第n个突发访问的地址Address_n与高位地址边界进行比较,响应于访问请求中第n个突发访问的地址跨越高位地址边界,通过如下计算式对该访问请求中第n个突发访问的地址进行修正:
Address_n=Start_address+(n-1)*Number_bytes-(Number_bytes*Burst_length)。
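上述WRAP型突发访问地址的解析步骤(a)~(e),可以用如下Python代码加以示意(同样是按正文计算式给出的假设性示例):

```python
def wrap_burst_addresses(start_address: int, awsize: int, burst_length: int) -> list:
    """按正文步骤(a)~(e)解析 WRAP 型访问请求中各个突发访问的地址。"""
    number_bytes = 1 << awsize                       # Number_bytes = 2^AWSIZE
    wrap_size = number_bytes * burst_length
    # Low_boundary_address 与 High_boundary_address
    low_boundary = (start_address // wrap_size) * wrap_size
    high_boundary = low_boundary + wrap_size
    addresses = [start_address]                      # 第一个突发访问的地址为起始地址
    for n in range(2, burst_length + 1):
        addr = start_address + (n - 1) * number_bytes
        if addr >= high_boundary:                    # 跨越高位地址边界时进行修正(轮回)
            addr -= wrap_size
        addresses.append(addr)
    return addresses
```

例如,起始地址为0x88、AWSIZE为2(每个突发4字节)、突发长度为4时,地址在高位地址边界0x90处轮回,依次为0x88、0x8C、0x80、0x84。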
在另一些实施例中,上述步骤(d)中还可以通过如下计算式得到每个访问请求中其它每个突发访问的地址:
Address_n=Start_address+n*Number_bytes;
上述步骤(e)中通过如下计算式对该访问请求中第n个突发访问的地址进行修正:
Address_n=Start_address+n*Number_bytes-(Number_bytes*Burst_length);
其中,n起始值为1,Address_n为对应的访问请求中第n+1个突发访问的地址,n的最大值为对应的访问请求的突发长度减1。即Address_1为对应的访问请求中第二个突发访问的地址,Address_2为对应的访问请求中第三个突发访问的地址,以此类推。
需要说明的是,上述计算每个突发访问的地址相关的计算式的其它变型,均属于本公开的保护范围之内。
在一些实施例中,对上述多个访问请求进行解析,以得到上述多个访问请求中的各个突发访问的地址的步骤之前,上述方法还包括:
将上述多个访问请求的起始地址与待访问内存的总线位宽对齐。
可以理解为,上述待访问内存(如GDDR颗粒)的每次访问请求可以最多包含16个突发访问(突发传输),每个突发访问最多包含两个数据通道(channel)一共32比特数据(4Byte),即上述待访问内存(如GDDR颗粒)的总线位宽为16*4Byte=64B(64Byte),因此每次向上述待访问内存发起的访问请求需要是64B(64Byte)对齐的。
对于接收的上游设备发送的写访问请求(或读访问请求),如果突发长度为1(即突发访问的数量为1),则只需先把这笔访问请求的地址做64B对齐操作,收到的起始地址记为AWADDR[MSB:LSB],MSB为高位地址边界,LSB为低位地址边界,其中,64B至少需要6位二进制数进行编码,64B对齐是将起始地址的较低6比特位置0,所以对齐后的起始地址为{AWADDR[MSB:6],6’b0},该对齐后的起始地址也为该唯一的突发访问的地址。如果接收的上游设备发送的写访问请求(或读访问请求),突发长度大于1(即突发访问的数量大于1),首先对起始地址进行对齐得到第一个突发访问的地址,然后按照突发类型解析出其它每个突发访问的地址。
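上述64B对齐操作本质上是将起始地址的低6个比特位清零,可以示意如下(假设性示例):

```python
def align_64b(awaddr: int) -> int:
    """64B 对齐:将起始地址的低 6 比特位置 0,即 {AWADDR[MSB:6], 6'b0}。"""
    return awaddr & ~0x3F
```

例如,起始地址0x200_0073对齐后为0x200_0040。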
在一些实施例中,步骤S110包括以下步骤:
步骤S112a:对第一预设时间段内每连续两个突发访问的地址中相同比特位进行异或处理;
步骤S114a:响应于有连续两个突发访问的地址中任一相同比特位的异或结果为1,将该比特位对应的计数值加1,从而得到突发访问的地址中每个比特位在第一预设时间段内对应的最终的计数值,并将突发访问的地址中每个比特位对应的最终的计数值分别作为突发访问的地址中每个比特位在第一预设时间段内的翻转次数。
其中,连续两个突发访问是指在时序上的相邻的两个突发访问,即该连续两个突发访问会连续访问上述待访问内存。所以,通过上述统计方式得到的突发访问的地址中每个比特位在第一预设时间段内的翻转次数,相当于连续访问上述待访问内存的两个突发访问的比特位在第一预设时间段内发生翻转的频率。
可以理解为,对每个突发访问的地址解析出来之后,对每连续两个突发访问的地址按照比特位进行异或操作,(Address_n^Address_(n-1)),当某一相同比特位的异或结果为1时,对应的计数器(counter)的计数值加1,这样来监控每个比特位的翻转。
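上述按比特位异或并计数的监控方式,可以用如下Python代码加以示意(地址宽度按正文示例取34bits,属功能性示意,非实际硬件实现):

```python
def count_bit_toggles(burst_addresses: list, addr_bits: int = 34) -> list:
    """统计连续两个突发访问的地址中每个比特位的翻转次数。"""
    counters = [0] * addr_bits                       # 每个比特位对应一个计数器
    for prev, curr in zip(burst_addresses, burst_addresses[1:]):
        diff = prev ^ curr                           # Address_n ^ Address_(n-1)
        for bit in range(addr_bits):
            if (diff >> bit) & 1:                    # 该比特位的异或结果为 1,即发生一次翻转
                counters[bit] += 1
    return counters
```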
在一些实施例中,一个突发访问为一个写操作突发访问或一个读操作突发访问。
对应的,第一预设时间段内的突发访问可以包括多个写操作突发访问和多个读操作突发访问。
在一些实施例中,突发访问中每个比特位在第一预设时间段内的翻转次数是将写操作突发访问与读操作突发访问交叉(或混合)进行统计的,即在确定每连续两个突发访问的地址中每个比特位是否发生翻转时,可忽略该连续两个突发访问中每个突发访问是写操作突发访问还是读操作突发访问,此时,该连续两个突发访问可以为连续两个写操作突发访问、连续两个读操作突发访问或连续的一个写操作突发访问和一个读操作突发访问。
在另一些实施例中,写操作突发访问的地址中各个比特位在第一预设时间段内的翻转次数,与,读操作突发访问的地址中各个比特位在第一预设时间段内的翻转次数是分开统计的。
对应的,步骤S110包括以下步骤:
步骤S112b:确定第一预设时间段内每连续两个写操作突发访问的地址中相同比特位是否发生翻转,以及第一预设时间段内每连续两个读操作突发访问的地址中相同比特位是否发生翻转,以分别统计写操作突发访问的地址中每个比特位在第一预设时间段内的翻转次数以及读操作突发访问的地址中每个比特位在第一预设时间段内的翻转次数;
步骤S114b:将写操作突发访问的地址与读操作突发访问的地址中相同比特位在第一预设时间段内的翻转次数进行加和,以得到突发访问的地址中每个比特位在第一预设时间段内的翻转次数。
对应的,在一些实施例中,写操作突发访问的地址中每个比特位在第一预设时间段内的翻转次数,与读操作突发访问的地址中每个比特位在第一预设时间段内的翻转次数,可以分别在内存控制器的写传输监测模块和读传输监测模块中进行统计,随后再在内存控制器的映射判断模块中进行汇总,将相同比特位在第一预设时间段内的翻转次数进行加和,以得到突发访问的地址中每个比特位在第一预设时间段内的翻转次数。
而其中,相对于读写混合统计比特位翻转的方式得到的优化的地址映射关系,通过读写分开统计比特位翻转的方式得到的优化的地址映射关系的效果更好,这是因为连续的读操作或连续的写操作相较于读写交叉进行的性能更好。也因此,读写操作突发访问虽然会在内存控制器的先进先出队列(First In First Out,FIFO)中混在一起,但是为了实现更好的性能,到指令调度后在地址映射模块中还是可以将读写操作突发访问各自集中在一起。
在一些实施例中,根据AXI传输协议,写访问请求和读访问请求的信息标识是不同的,比如写访问请求的各个信息是AW开头的,如,写访问请求的起始地址用AWADDR表示,访问数据位宽用AWSIZE表示,突发长度用AWLEN表示;读访问请求的各个信息是AR开头的,如,读访问请求的起始地址用ARADDR表示,访问数据位宽用ARSIZE表示,突发长度用ARLEN表示。所以,当内存控制器收到访问请求时,可以根据信息标识确定收到的访问请求是写访问请求还是读访问请求,从而将写访问请求和读访问请求分别分给写传输监测模块和读传输监测模块进行分析,以解析出对应的每笔写操作突发访问和每笔读操作突发访问的地址,进而分别统计写操作突发访问和读操作突发访问的地址中每个比特位在第一预设时间段内的翻转次数。
在一些实施例中,步骤S120之前,包括以下步骤:
基于待访问内存的预充电操作延时和激活操作延时,对突发访问的地址中的列比特位在第一预设时间段内的翻转次数进行修正,以得到列比特位在第一预设时间段内的翻转次数的修正值;其中,列比特位为根据当前的地址映射关系映射到待访问内存的列地址的比特位,待访问内存的预充电操作延时为待访问内存的预充电操作后至对预充电操作针对的存储单元执行访问操作的最短时间,待访问内存的激活操作延时为待访问内存的激活操作后至对激活操作针对的存储单元的指定行执行访问操作的最短时间。
其中,待访问内存的预充电操作延时Trp可以理解为预充电操作后至少要经过Trp时间才能对相同的存储单元(预充电操作针对的存储单元)执行读写操作。待访问内存的激活操作延时Trcd为激活操作后至少要经过Trcd的时间才能对这一行(激活操作针对的存储单元的指定行)执行读写操作。(Trp+Trcd)相当于一次页冲突带来的延时。
可以理解为,当连续的(相邻的)两个突发访问去访问待访问内存的同一行的不同列时,第二个突发访问属于页命中(当前要访问的行刚好处于激活状态),第二个突发访问虽然不会引入额外的预充电操作和激活操作,但是相较于访问不同存储单元组(bank组)还是会增加(Tccdl-Tccds)的延时,其中,Tccdl为相邻(连续)的两个突发访问去访问同一个存储单元组时该两个突发访问之间的时间间隔要求,Tccds为相邻(连续)的两个突发访问去访问不同的存储单元组时该两个突发访问之间的时间间隔。引入的(Tccdl-Tccds)延时相较于预充电操作和激活操作引入的延时有如下关系:T=(Trp+Trcd)/(Tccdl-Tccds),也即T次连续访问同一个存储单元(Bank)的不同列引入的延时等于一次页冲突引入的延时,因此,按照当前的地址映射关系,将当前的列比特位经过T次翻转记为一次翻转,以对该列比特位在第一预设时间段内的翻转次数进行修正。
在一些实施例中,上述修正步骤,包括以下步骤:
基于待访问内存的预充电操作延时和激活操作延时,通过如下计算式对突发访问的地址中的列比特位在第一预设时间段内的翻转次数进行修正:
tcolnew=tcol/[(Trp+Trcd)/(Tccdl-Tccds)]
其中,tcolnew为列比特位在第一预设时间段内的翻转次数的修正值,tcol为修正之前的列比特位在第一预设时间段内的翻转次数,Trp为待访问内存的预充电操作延时,Trcd为待访问内存的激活操作延时,Tccdl为连续两个突发访问去访问同一个存储单元组时该连续两个突发访问之间的时间间隔,Tccds为连续两个突发访问去访问不同的存储单元组时该连续两个突发访问之间的时间间隔。
进一步的,为了实现对列比特位在第一预设时间段内的翻转次数的修正,可以在每个列比特位上设置一组T进制计数器,该组T进制计数器通过两个计数器的级联,可实现第一个计数器计数到T时自动归零并输出触发信号给第二个计数器,第二个计数器根据接收到的触发信号,计数值加1。
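上述一组T进制级联计数器的行为,可以用如下Python代码加以示意(功能性示意,非RTL实现):

```python
class CascadedCounter:
    """两级级联计数器:第一级计满 T 次自动归零并触发第二级加 1。"""
    def __init__(self, t: int):
        self.t = t            # T = (Trp + Trcd) / (Tccdl - Tccds)
        self.first = 0        # 第一级计数器
        self.second = 0       # 第二级计数器,即修正后的列比特位翻转次数
    def on_toggle(self):
        """列比特位每发生一次翻转调用一次。"""
        self.first += 1
        if self.first == self.t:
            self.first = 0    # 第一级计数到 T 时自动归零
            self.second += 1  # 并输出触发信号,第二级计数值加 1
```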
在一些实施例中,在读写分开统计比特位翻转的方式中,上述修正步骤,还可以包括以下步骤:
基于待访问内存的预充电操作延时和激活操作延时,分别对写操作突发访问的地址中列比特位在第一预设时间段内的翻转次数和读操作突发访问的地址中列比特位在第一预设时间段内的翻转次数进行修正,再进行加和,以得到突发访问的地址中列比特位在第一预设时间段内的翻转次数的修正值。
对应的,基于待访问内存的预充电操作延时和激活操作延时,可以通过如下计算式分别对突发访问中写操作突发访问的地址中列比特位在第一预设时间段内的翻转次数和突发访问中读操作突发访问的地址中列比特位在第一预设时间段内的翻转次数进行修正:
tcolnew_w=tcol_w/[(Trp+Trcd)/(Tccdl_w-Tccds_w)];
tcolnew_r=tcol_r/[(Trp+Trcd)/(Tccdl_r-Tccds_r)];
其中,tcolnew_w为写操作突发访问的地址中列比特位在第一预设时间段内的翻转次数的修正值,tcol_w为修正之前的写操作突发访问的地址中列比特位在第一预设时间段内的翻转次数,Tccdl_w为连续的两个写操作突发访问去访问同一个存储单元组时该连续的两个写操作突发访问之间的时间间隔,Tccds_w为连续的两个写操作突发访问访问不同的存储单元组时该连续的两个写操作突发访问之间的时间间隔,tcolnew_r为读操作突发访问的地址中列比特位在第一预设时间段内的翻转次数的修正值,tcol_r为修正之前的读操作突发访问的地址中列比特位在第一预设时间段内的翻转次数,Tccdl_r为连续的两个读操作突发访问去访问同一个存储单元组时该连续的两个读操作突发访问之间的时间间隔,Tccds_r为连续的两个读操作突发访问访问不同的存储单元组时该连续的两个读操作突发访问之间的时间间隔。
在一些实施例中,步骤S120中对当前的地址映射关系进行优化的方法包括:
步骤S122:从突发访问的地址中每个比特位在第一预设时间段内的翻转次数中,选取翻转次数最多的前N个比特位;
步骤S124:针对翻转次数最多的前N个比特位,将翻转次数最多的前M个比特位映射为待访问内存的存储单元组地址,翻转次数次之的另外N-M个比特位映射为待访问内存的存储单元地址,以得到优化后的地址映射关系。
通过这种自适应的优化方式,可以将翻转频率最大的M个比特位映射为存储单元组地址,翻转频率次之的N-M个比特位映射为存储单元地址,以尽量使得连续的(相邻的)两个突发访问去访问不同的存储单元组或不同的存储单元,可以较为准确地将新来的突发访问映射到恰当的内存物理地址上,减少页冲突的发生,从而提高内存访问的并行程度,提高总线带宽。
在一些实施例中,M为突发访问的地址中的存储单元组比特位的数量,存储单元组比特位为根据当前的地址映射关系映射到待访问内存的存储单元组地址的比特位;N-M为突发访问的地址中的存储单元比特位的数量,存储单元比特位为根据当前的地址映射关系映射到待访问内存的存储单元地址的比特位;N为突发访问的地址中的存储单元比特位与存储单元组比特位的总数量。在一些实施例中,突发访问的地址一共有34bits,其中,M和N分别为2和4。
进一步的,步骤S120中对当前的地址映射关系进行优化的方法还包括:
步骤S126:针对突发访问的地址中翻转次数最多的前N个比特位以外的多个比特位,按照预设规则划分为多组比特位,并将多组比特位分别映射为待访问内存的内存物理地址中对应类型的地址。
在一些实施例中,待访问内存的内存物理地址中对应类型的地址至少包括列地址和行地址。
进一步的,在一些实施例中,待访问内存的内存物理地址中对应类型的地址包括列地址、行地址、数据通道地址和片选地址。
在一些实施例中,上述预设规则为从低位到高位,按照每组比特位中比特位的数量依次划分。在一些实施例中,针对突发访问的地址中翻转次数最多的前N个比特位以外的多个比特位,按照从低位到高位的顺序划分为四组比特位,分别映射为待访问内存的数据通道地址、列地址、行地址和片选地址。
在一些实施例中,突发访问的地址一共有34bits,其中,分别映射为待访问内存的数据通道地址、列地址、行地址和片选地址的四组比特位中比特位的数量分别为2、11、16和1。
其中,数据通道为内存颗粒的两个通道(channel),可存储四个字节的数据,对应2个比特位的地址,故可以将AXI的最低两个比特位固定映射给数据通道。
当待访问内存包括多个内存颗粒(如X8类型的内存)时,需要片选信号选择访问哪个内存颗粒,可以固定将可用的最高1个比特位映射为片选信号。
需要说明的是,当待访问内存中只有一个内存颗粒时,可以不用划分片选地址对应的比特位;当内存颗粒中只有一个数据通道时,可以不用划分数据通道对应的比特位。
根据选用的内存颗粒的大小和类型不同,行列地址的数目也不相同,因为通常来说上游设备基于AXI传输协议发送过来的访问请求的地址为连续地址,所以可以将较低位地址映射给列地址,将较高位地址映射给行地址,这样连续的地址会映射到不同列,对性能友好。
所以,可以选择按照从低位到高位划分上述四组比特位,并分别映射为待访问内存的数据通道地址、列地址、行地址和片选地址。
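步骤S122~S126的优化流程可以用如下Python代码加以示意(其中34bits地址以及2/11/16/1的分组数量沿用正文示例,属于示意性假设):

```python
def optimize_mapping(toggle_counts: list, m: int = 2, n: int = 4) -> dict:
    """根据各比特位的翻转次数生成优化后的地址映射关系。"""
    # 按翻转次数从多到少排序,选出翻转次数最多的前 N 个比特位
    order = sorted(range(len(toggle_counts)),
                   key=lambda bit: toggle_counts[bit], reverse=True)
    mapping = {
        "bank_group": order[:m],        # 翻转次数最多的前 M 位 -> 存储单元组地址
        "bank": order[m:n],             # 翻转次数次之的 N-M 位 -> 存储单元地址
    }
    # 其余比特位按从低位到高位的顺序划分为四组
    rest = sorted(bit for bit in range(len(toggle_counts)) if bit not in order[:n])
    mapping["channel"] = rest[:2]       # 2 位 -> 数据通道地址
    mapping["column"] = rest[2:13]      # 11 位 -> 列地址
    mapping["row"] = rest[13:29]        # 16 位 -> 行地址
    mapping["chip_select"] = rest[29:]  # 1 位 -> 片选地址
    return mapping
```

以正文的示例核对:当第10、18、14、15bit的翻转次数依次最多时,该函数给出的划分结果与正文一致。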
示例性的,突发访问的地址与待访问内存的内存物理地址之间当前的地址映射关系,可以如图2所示,突发访问的地址一共有34bits,其中,突发访问的地址的第0~1bit映射为待访问内存的数据通道地址,第2~12bit映射为待访问内存的列地址,第13~14bit映射为待访问内存的存储单元组地址,第15~16bit映射为待访问内存的存储单元地址,第17~32bit映射为待访问内存的行地址,第33bit映射为待访问内存的片选地址。
其中,内存访问系统可以将如图2所示的地址映射关系作为默认或初始的地址映射关系,即当内存访问系统初次运行时,可以采用如图2所示的地址映射关系进行地址映射。这是由于在如图2所示的地址映射关系中,4个比特位(4bits)的存储单元组(Bank组)地址和存储单元(Bank)地址分散在行地址和列地址中间,在地址映射过程中,一定程度上可以让连续的突发访问尽可能去访问不同的存储单元,一定程度上可解决地址映射关系被优化之前的地址映射问题。
而地址映射关系中,存储单元组(Bank组)地址和存储单元(Bank)地址的位置比较灵活,也是在自适应优化过程中需要着重处理的。
在一些实施例中,与图2所示的当前的地址映射关系对应的,若突发访问的地址的34bits中,按照步骤S110的统计方式得到,在第一预设时间段内翻转次数最多的前4个比特位依次是第10bit(当前的列比特位,翻转次数已进行修正)、第18bit、第14bit和第15bit,如图3所示,按照步骤S120中的优化方式,将第10bit、第18bit映射为待访问内存的存储单元组(bank组)地址,第14bit和第15bit映射为待访问内存的存储单元(bank)地址,并将剩下的第0~9bit、第11~13bit、第16~17bit、第19~33bit按照从低位到高位的顺序划分为第一组比特位(2bits)、第二组比特位(11bits)、第三组比特位(16bits)和第四组比特位(1bit),然后将这四组比特位分别映射为数据通道地址、列地址、行地址和片选地址。
也即,第0~1bit映射为待访问内存的数据通道地址,第2~9bit和第11~13bit映射为待访问内存的列地址,第16~17bit和第19~32bit映射为待访问内存的行地址,第33bit映射为待访问内存的片选地址。
在一些实施例中,步骤S130中,由于在步骤S120得到的优化后的地址映射关系中,翻转次数最多的前N个比特位中,翻转次数最多的前M个比特位,和翻转次数次之另外N-M个比特位分别映射为待访问内存的存储单元组地址和存储单元地址,所以,在第二预设时间段内接收到的突发访问采用优化后的地址映射关系,映射到待访问内存对应的内存物理地址上的情况下,当第二预设时间段内接收到的连续两个突发访问的地址在上述翻转次数最多的前M个比特位中的至少一个比特位上发生翻转,该连续两个突发访问的地址映射到的存储单元组地址是不同的(存储单元组地址不同),从而访问待访问内存的不同存储单元组;当第二预设时间段内接收到的连续两个突发访问的地址在上述翻转次数次之的另外N-M个比特位中的至少一个比特位上发生翻转,该连续两个突发访问的地址映射到的存储单元地址是不同的(存储单元地址不同),从而访问待访问内存的不同存储单元。
而,由于上述翻转次数最多的前M个比特位是翻转频率(翻转概率)最大的比特位,所以通过上述方式可以最大概率上将连续两个突发访问映射到不同存储单元组地址上,实现并行访问。
且,由于上述翻转次数次之的另外N-M个比特位是翻转频率(翻转概率)次之的比特位,所以通过上述方式,可以在连续两个突发访问无法映射到不同存储单元组地址上时,尽量将连续两个突发访问映射到不同存储单元地址上,进一步实现并行访问。
其中,在一些实施例中,地址映射关系是存储在内存访问系统中的非易失性存储器中的,根据第一预设时间段内的突发访问得到的优化后的地址映射关系会在第一预设时间段后的对应时机上传至非易失性存储器中,上传之后,后续的第二预设时间段内都可以使用该优化后的地址映射关系。在一些实施例中,第二预设时间段可以为该优化后的地址映射关系上传之后至下一次优化得到的地址映射关系上传之前的时间段。
在此过程中,可以选取对应的时间窗口作为第一预设时间段,重复步骤S110至S130,以进行下一次的优化操作。
在一些实施例中,第一预设时间段之后的第二预设时间段是指第一预设时间段之后待访问内存第一次重新上电后至第二次重新上电前的时间段。即步骤S130还可以包括以下步骤:
响应于待访问内存重新上电,针对待访问内存重新上电后的第二预设时间段内接收到的突发访问,采用优化后的地址映射关系,将第二预设时间段内接收到的突发访问的地址映射到待访问内存对应的内存物理地址上,以对待访问内存执行访问操作。
可以理解为,由于内存访问系统掉电之后,重新上电过程(重新启动过程)中,为了保证数据的一致性,系统需要重构地址映射关系表。所以,本公开的上述实施例中,可以选择在内存访问系统掉电之后,重新上电过程(重新启动过程)中,以优化后的地址映射关系替换非易失性存储器中当前的地址映射关系,以降低地址映射关系的更新替换带来的功耗。
基于相同的发明构思,本公开实施例还提供一种地址映射装置,包括:
传输监测模块,被配置为确定第一预设时间段内每连续两个突发访问的地址中相同比特位是否发生翻转,以统计突发访问的地址中每个比特位在第一预设时间段内的翻转次数;
在一些实施例中,如图4所示,传输监测模块包括:
写传输监测模块,被配置为确定第一预设时间段内每连续两个写操作突发访问的地址中相同比特位是否发生翻转,以统计写操作突发访问的地址中每个比特位在第一预设时间段内的翻转次数;
读传输监测模块,被配置为确定第一预设时间段内每连续两个读操作突发访问的地址中相同比特位是否发生翻转,以统计读操作突发访问的地址中每个比特位在第一预设时间段内的翻转次数。
即,为了实现读写分开统计比特位在第一预设时间段内的翻转次数,写操作突发访问的地址中各个比特位在第一预设时间段内的翻转次数,与读操作突发访问的地址中各个比特位在第一预设时间段内的翻转次数,可以分别在写传输监测模块和读传输监测模块中进行统计,随后再在映射判断模块中进行汇总,将相同比特位在第一预设时间段内的翻转次数进行加和,以得到突发访问的地址中每个比特位在第一预设时间段内的翻转次数。
在一些实施例中,写传输监测模块包括多个第一计数器,被配置为分别对写操作突发访问的地址中每个比特位的翻转次数进行统计;
读传输监测模块包括多个第二计数器,被配置为分别对读操作突发访问的地址中每个比特位的翻转次数进行统计。
上述地址映射装置的各个模块的具体实施过程可参见上述内存访问方法的任一实施例,此处不再赘述。
基于相同的发明构思,如图5所示,本公开实施例还提供一种内存控制器,连接于上游设备与待访问内存之间,包括:
上述任一实施例的地址映射装置;
先进先出队列(FIFO),与地址映射装置连接,被配置为对地址映射装置传输过来的突发访问,进行在上游设备与待访问内存之间的跨时钟域处理并输出;
地址映射模块,连接于先进先出队列与待访问内存之间,被配置为采用对应的地址映射关系,将先进先出队列传输过来的突发访问的地址映射到待访问内存对应的内存物理地址上,以对待访问内存执行访问操作。
其中,先进先出队列(FIFO)用于实现上游设备与待访问内存之间的跨时钟域处理,以使得待访问内存根据接收到的访问指令的时间先后顺序执行对应的读写操作。
在一些实施例中,地址映射装置的传输监测模块包括写传输监测模块和读传输监测模块;
内存控制器还包括多路选择器(Mux),连接于地址映射装置与先进先出队列之间,被配置为根据先进先出队列的写指针,从写传输监测模块传输过来的写操作突发访问和读传输监测模块传输过来的读操作突发访问中选择其一写入先进先出队列中。
上述内存控制器的各个模块的具体实施过程可参见上述内存访问方法的任一实施例,此处不再赘述。
如图6所示,本公开实施例还提供一种内存访问系统,包括上游设备、待访问内存和上述任一实施例的内存控制器。
上游设备通过内存控制器连接待访问内存,以通过内存控制器访问待访问内存。
在一些实施例中,上述待访问内存包括SDRAM,SDRAM包括但不限于DDR、GDDR和LPDDR。
在一些实施例中,上述系统还包括:非易失性存储器(图中未示出),其与内存控制器连接,被配置为对突发访问的地址与待访问内存的内存物理地址之间的地址映射关系进行存储。
非易失性存储器可以为闪存(flash)、只读存储器(Read Only Memory,ROM)等存储器。
在一些实施例中,上述系统还包括:端口物理层芯片(Physical,PHY),其连接于待访问内存与内存控制器之间,被配置为将内存控制器传输过来的突发访问的数字信号转换为待访问内存的接口物理信号。
在一些实施例中,内存控制器与端口物理层之间通过DFI(DDR PHY Interface)协议连接。
在一些使用场景下,内存访问系统的产品形式为GPU SOC系统。
如图7所示,GPU SOC系统包括GPU核(GPU core)及其它上游设备(如编码器、解码器、显示等多个上游设备),待访问内存(如GDDR),以及CPU核(CPU core)和flash芯片(非易失性存储器),它们会向待访问内存发起读写传输。在系统运行的过程中,内存控制器通过上述任一实施例的内存访问方法,以自适应方式确定地址映射关系,步骤如下:
步骤1:系统当前运行时内存控制器采用当前的地址映射关系运行。其中,系统初次运行时内存控制器可以采用初始的(默认的)地址映射关系运行。
步骤2:系统中各个上游设备对待访问内存发起访问,内存控制器中的写传输监测模块和读传输监测模块在设定的时间窗口(第一预设时间段)内监测读/写传输(读/写访问请求)。以写传输为例,假设时刻0收到一笔写访问请求,起始地址为0x300_0000,突发长度为1。随后收到一笔写访问请求,起始地址为0x200_0000,访问数据位宽(AWSIZE)为6,突发长度为4,突发类型为INCR类型,解析出其中每一个突发访问的地址为0x200_0000,0x200_0040,0x200_0080,0x200_00C0。将(0x300_0000^0x200_0000)中相同比特位进行异或,得出第24bit(bit[24])的异或结果为1,即第24bit(bit[24])翻转了一次,其对应的计数器加1。将(0x200_0000^0x200_0040)中相同比特位进行异或,得出第6bit(bit[6])的异或结果为1,其对应的计数器加1。分别将(0x200_0040^0x200_0080)以及(0x200_0080^0x200_00C0)中相同比特位进行异或,第6bit(bit[6])的异或结果均为1,对应的计数器累加(其中(0x200_0040^0x200_0080)的第7bit(bit[7])的异或结果也为1,bit[7]对应的计数器同样加1)。但是因为按照当前的地址映射关系,第6bit(bit[6])为列比特位,需要进行修正,翻转(Trp+Trcd)/(Tccdl_w-Tccds_w)次才记为一次真正的翻转。后续统计以此类推,读传输监测模块跟写传输监测模块类似。
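步骤2中的异或统计结果,可以用几行Python进行核对(示意性复现,地址宽度按正文取34bits):

```python
# 两笔写访问请求解析出的突发访问地址,按时序排列
addrs = [0x300_0000,                        # 第一笔:突发长度为 1
         0x200_0000, 0x200_0040,            # 第二笔:INCR,AWSIZE=6,突发长度为 4
         0x200_0080, 0x200_00C0]
# 对时序上相邻的两个地址按比特位异或,记录异或结果为 1 的比特位
toggled = [[bit for bit in range(34) if ((a ^ b) >> bit) & 1]
           for a, b in zip(addrs, addrs[1:])]
# toggled == [[24], [6], [6, 7], [6]]:bit[24] 翻转一次,bit[6] 翻转三次
```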
步骤3:内存控制器中的读写监测模块将对于地址翻转的统计信息汇总到映射判断模块,映射判断模块得出地址映射关系。
步骤4:统计第一预设时间段(包括一个时间窗口或多个时间窗口)内的多个突发访问的地址中每个比特位在第一预设时间段内的翻转次数,自适应地对当前的地址映射关系进行优化,得到优化后的地址映射关系,保存到flash芯片中。
步骤5:重新上电后,内存控制器中的地址映射模块选取硬件自适应得到的地址映射关系来进行地址映射。
基于相同的发明构思,本公开实施例还提供一种电子装置,该电子装置包括上述任一实施例中的内存访问系统。在一些使用场景下,该电子装置的产品形式体现为显卡;在另一些使用场景下,该电子装置的产品形式体现为CPU主板。
基于相同的发明构思,本公开实施例还提供一种电子设备,该电子设备包括上述的电子装置。在一些使用场景下,该电子设备的产品形式是便携式电子设备,例如智能手机、平板电脑、VR设备等;在一些使用场景下,该电子设备的产品形式是个人电脑、游戏主机等。

Claims (21)

  1. 一种内存访问方法,所述方法包括:
    确定第一预设时间段内每连续两个突发访问的地址中相同比特位是否发生翻转,以统计突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数;
    根据突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数,对突发访问的地址与待访问内存的内存物理地址之间当前的地址映射关系进行优化,得到优化后的地址映射关系;
    针对在所述第一预设时间段之后的第二预设时间段内接收到的突发访问,采用所述优化后的地址映射关系,将所述第二预设时间段内接收到的突发访问的地址映射到所述待访问内存对应的内存物理地址上,以对所述待访问内存执行访问操作。
  2. 根据权利要求1所述的方法,根据突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数,对突发访问的地址与待访问内存的内存物理地址之间当前的地址映射关系进行优化,得到优化后的地址映射关系的步骤之前,所述方法还包括:
    基于所述待访问内存的预充电操作延时和激活操作延时,对突发访问的地址中的列比特位在所述第一预设时间段内的翻转次数进行修正;其中,所述列比特位为根据当前的地址映射关系映射到所述待访问内存的列地址的比特位。
  3. 根据权利要求2所述的方法,基于所述待访问内存的预充电操作延时和激活操作延时,对突发访问的地址中的列比特位在所述第一预设时间段内的翻转次数进行修正,包括以下步骤:
    基于所述待访问内存的预充电操作延时和激活操作延时,通过如下计算式对突发访问的地址中的列比特位在所述第一预设时间段内的翻转次数进行修正:
    tcolnew=tcol/[(Trp+Trcd)/(Tccdl-Tccds)]
    其中,tcolnew为所述列比特位在所述第一预设时间段内的翻转次数的修正值,tcol为修正之前的所述列比特位在所述第一预设时间段内的翻转次数,Trp为所述待访问内存的预充电操作延时,Trcd为所述待访问内存的激活操作延时,Tccdl为连续两个突发访问去访问同一个存储单元组时该连续两个突发访问之间的时间间隔,Tccds为连续两个突发访问去访问不同的存储单元组时该连续两个突发访问之间的时间间隔。
  4. 根据权利要求1所述的方法,根据突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数,对突发访问的地址与待访问内存的内存物理地址之间当前的地址映射关系进行优化,得到优化后的地址映射关系,包括以下步骤:
    从突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数中,选取翻转次数最多的前N个比特位;
    针对翻转次数最多的前N个比特位,将其中翻转次数最多的前M个比特位映射为所述待访问内存的存储单元组地址,翻转次数次之的另外N-M个比特位映射为所述待访问内存的存储单元地址,以得到优化后的地址映射关系。
  5. 根据权利要求4所述的方法,从突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数中,选取翻转次数最多的前N个比特位的步骤之后,所述方法还包括:
    针对突发访问的地址中翻转次数最多的前N个比特位以外的多个比特位,按照预设规则划分为多组比特位,并将所述多组比特位分别映射为所述待访问内存的内存物理地址中对应类型的地址。
  6. 根据权利要求5所述的方法,所述待访问内存的内存物理地址中对应类型的地址至少包括列地址和行地址。
  7. 根据权利要求1所述的方法,确定第一预设时间段内每连续两个突发访问的地址中相同比特位是否发生翻转,以统计突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数,包括以下步骤:
    确定第一预设时间段内每连续两个写操作突发访问的地址中相同比特位是否发生翻转,以及所述第一预设时间段内每连续两个读操作突发访问的地址中相同比特位是否发生翻转,以分别统计写操作突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数以及读操作突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数;
    将写操作突发访问的地址与读操作突发访问的地址中相同比特位在所述第一预设时间段内的翻转次数进行加和,以得到突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数。
  8. 根据权利要求1所述的方法,确定第一预设时间段内每连续两个突发访问的地址中相同比特位是否发生翻转,以统计突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数,包括以下步骤:
    对第一预设时间段内每连续两个突发访问的地址中相同比特位进行异或处理;
    响应于有连续两个突发访问的地址中任一相同比特位的异或结果为1,将该比特位对应的计数值加1,从而得到突发访问的地址中每个比特位在所述第一预设时间段内对应的最终的计数值,并将突发访问的地址中每个比特位对应的最终的计数值分别作为突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数。
  9. 根据权利要求1所述的方法,针对在所述第一预设时间段之后的第二预设时间段内接收到的突发访问,采用所述优化后的地址映射关系,将所述第二预设时间段内接收到的突发访问的地址映射到所述待访问内存对应的内存物理地址上,以对所述待访问内存执行访问操作,包括以下步骤:
    响应于所述待访问内存重新上电,针对所述待访问内存重新上电后的第二预设时间段内接收到的突发访问,采用所述优化后的地址映射关系,将所述第二预设时间段内接收到的突发访问的地址映射到所述待访问内存对应的内存物理地址上,以对所述待访问内存执行访问操作。
  10. 根据权利要求1所述的方法,所述第一预设时间段内的突发访问的地址,通过以下步骤获得:
    获取在所述第一预设时间段内接收到的且针对所述待访问内存的多个访问请求;
    对所述多个访问请求进行解析,以得到所述多个访问请求中各个突发访问的地址。
  11. 根据权利要求10所述的方法,对所述多个访问请求进行解析,以得到所述多个访问请求中各个突发访问的地址的步骤之前,所述方法还包括:
    将所述多个访问请求的起始地址与所述待访问内存的总线位宽对齐。
  12. 一种地址映射装置,包括:
    传输监测模块,被配置为确定第一预设时间段内每连续两个突发访问的地址中相同比特位是否发生翻转,以统计突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数;
    映射判断模块,与所述传输监测模块连接,被配置为根据所述传输监测模块监测到的突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数,对突发访问的地址与待访问内存的内存物理地址之间当前的地址映射关系进行优化,得到优化后的地址映射关系。
  13. 根据权利要求12所述的地址映射装置,所述传输监测模块包括:
    写传输监测模块,被配置为确定第一预设时间段内每连续两个写操作突发访问的地址中相同比特位是否发生翻转,以统计写操作突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数;
    读传输监测模块,被配置为确定所述第一预设时间段内每连续两个读操作突发访问的地址中相同比特位是否发生翻转,以统计读操作突发访问的地址中每个比特位在所述第一预设时间段内的翻转次数。
  14. 根据权利要求13所述的地址映射装置,所述写传输监测模块包括多个第一计数器,被配置为分别对写操作突发访问的地址中每个比特位的翻转次数进行统计;
    所述读传输监测模块包括多个第二计数器,被配置为分别对读操作突发访问的地址中每个比特位的翻转次数进行统计。
  15. 一种内存控制器,连接于上游设备与待访问内存之间,包括:
    如权利要求12至14中任意一项所述的地址映射装置;
    先进先出队列,与所述地址映射装置连接,被配置为对所述地址映射装置传输过来的突发访问,进行跨时钟域处理并输出;
    地址映射模块,连接于所述先进先出队列与所述待访问内存之间,被配置为采用对应的地址映射关系,将所述先进先出队列传输过来的突发访问的地址映射到所述待访问内存对应的内存物理地址上,以对所述待访问内存执行访问操作。
  16. 根据权利要求15所述的内存控制器,所述地址映射装置的所述传输监测模块包括写传输监测模块和读传输监测模块;
    所述内存控制器还包括多路选择器,连接于所述地址映射装置与所述先进先出队列之间,被配置为根据所述先进先出队列的写指针,从所述写传输监测模块传输过来的写操作突发访问和所述读传输监测模块传输过来的读操作突发访问中选择其一写入所述先进先出队列中。
  17. 一种内存访问系统,包括上游设备和待访问内存,以及如权利要求15或16所述的内存控制器。
  18. 根据权利要求17所述的系统,还包括:非易失性存储器,其与所述内存控制器连接,被配置为对突发访问的地址与所述待访问内存的内存物理地址之间的地址映射关系进行存储。
  19. 根据权利要求17所述的系统,还包括:端口物理层芯片,其连接于所述内存控制器与所述待访问内存之间,被配置为将所述内存控制器传输过来的突发访问的数字信号转换为所述待访问内存的接口物理信号。
  20. 一种电子装置,包括如权利要求17至19中任意一项所述的内存访问系统。
  21. 一种电子设备,包括如权利要求20中所述的电子装置。
PCT/CN2023/091023 2022-10-27 2023-04-27 内存访问方法、装置、系统及电子设备 WO2024087559A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211321996.7 2022-10-27
CN202211321996.7A CN115374022B (zh) 2022-10-27 2022-10-27 内存访问方法、装置、系统及电子设备

Publications (1)

Publication Number Publication Date
WO2024087559A1 (zh)



Also Published As

Publication number Publication date
CN115374022B (zh) 2023-02-07
CN115374022A (zh) 2022-11-22

