WO2020024113A1 - Memory interleaving method and apparatus - Google Patents

Memory interleaving method and apparatus

Info

Publication number
WO2020024113A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
configuration information
capacity
access
channels
Prior art date
Application number
PCT/CN2018/097807
Other languages
English (en)
French (fr)
Inventor
信恒超
夏晶
曾红义
陈挚睿
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to CN201880096144.8A priority Critical patent/CN112513824B/zh
Priority to EP18928427.6A priority patent/EP3822796B1/en
Priority to PCT/CN2018/097807 priority patent/WO2020024113A1/zh
Publication of WO2020024113A1 publication Critical patent/WO2020024113A1/zh
Priority to US17/162,287 priority patent/US20210149804A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0844 Multiple simultaneous or quasi-simultaneous cache accessing
    • G06F12/0846 Cache with multiple tag or data arrays being simultaneously accessible
    • G06F12/0851 Cache with interleaved addressing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F12/0646 Configuration or reconfiguration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F12/0607 Interleaved addressing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668 Details of memory controller
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement

Definitions

  • Embodiments of the present application relate to the field of computers, and in particular, to a memory interleaving method and device.
  • the memory controller controls the CPU's access to the memory.
  • the channel between the memory controller and the memory can be called a memory channel.
  • each memory controller controls a part of the memory in the memory system, and the corresponding memory capacity of each memory channel is equal.
  • memory interleaving technology can be used to uniformly interleave access to all memory channels, and all memory channels can use the same interleaving window and interleaving algorithm.
  • the memory capacity corresponding to each memory channel may not be exactly the same.
  • the access may be concentrated on a certain memory channel, and the remaining memory channels are idle. If two interleaving windows and interleaving algorithms are used to interleave accesses, different address spaces will exhibit different memory access performance.
  • the embodiments of the present application provide a memory interleaving method and device, which solve the problem that, when the memory capacities corresponding to the memory channels are not equal, interleaving accesses with two interleaving windows and two interleaving algorithms results in different memory access performance in different address spaces.
  • an embodiment of the present application provides a memory interleaving method.
  • the method can be applied to a CPU, and / or the method can be applied to a communication device that can support the CPU to implement the method.
  • the method includes: dividing the access capacity into P partial access capacities according to N configuration information, and mapping the P partial access capacities to N memory channels according to a configuration mapping table.
  • N represents the total number of memory channels
  • N is an integer greater than or equal to 2
  • N configuration information is the configuration information of N memory channels
  • one configuration information corresponds to one memory channel
  • the access capacity of P shares is the same.
  • the configuration mapping table is used to indicate the mapping relationship between capacity and memory channels.
  • the memory interleaving method provided in the embodiment of the present application divides the access capacity according to the number of capacity shares that each memory channel maps, as indicated by the configuration information, and then maps the resulting partial access capacities to the memory channels according to the configuration mapping table, so that at least one of the N memory channels maps two shares of capacity. In this way, memory interleaving of the access capacity is realized through a single interleaving window, avoiding differences in memory access performance between address spaces.
  • before dividing the access capacity into P partial access capacities according to the N configuration information, the method further includes: generating the N configuration information and the configuration mapping table. The N configuration information includes M first configuration information and N-M second configuration information. The first configuration information includes a memory channel identifier and a first indication identifier, where the first indication identifier indicates that the memory channel corresponding to the memory channel identifier maps two shares of capacity. The second configuration information includes a memory channel identifier and a second indication identifier, where the second indication identifier indicates that the memory channel corresponding to the memory channel identifier maps one share of capacity. M is an integer greater than or equal to 1 and less than N.
  • before mapping the P partial access capacities to the N memory channels according to the configuration mapping table, the method further includes: making contiguous the address spaces corresponding to non-contiguous partial access capacities that are mapped to the same memory channel.
  • an identification bit of an address space of each partial access capacity is set at a lower bit of an address requesting access to a memory.
  • the method further includes: restoring the addresses to the original addresses according to the N configuration information and identification bits of the address space of each partial access capacity.
  • an embodiment of the present application further provides a communication apparatus for implementing the method described in the first aspect.
  • the communication device is a CPU or a communication device supporting the CPU to implement the method described in the first aspect, for example, the communication device includes a chip system.
  • the communication device includes a processing unit.
  • the processing unit is configured to divide the access capacity into P partial access capacities according to N configuration information, and map the P partial access capacities to N memory channels according to a configuration mapping table.
  • N represents the total number of memory channels
  • N is an integer greater than or equal to 2
  • N configuration information is the configuration information of N memory channels
  • one configuration information corresponds to one memory channel
  • the access capacity of P shares is the same.
  • the configuration mapping table is used to indicate the mapping relationship between capacity and memory channels.
  • the specific method of memory interleaving is the same as that described in the first aspect, and is not repeated here.
  • the communication device may further include a communication interface for sending or receiving data.
  • the functional modules in the second aspect may be implemented by hardware, or may be implemented by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • the transceiver is used to complete the functions of the receiving unit and the sending unit
  • the processor is used to complete the functions of the processing unit
  • the memory is used by the processor to process the program instructions of the method in the embodiment of the present application.
  • the processor, the transceiver, and the memory are connected and communicate with each other through a bus. Specifically, reference may be made to the function of the behavior of the CPU in the method described in the first aspect.
  • an embodiment of the present application further provides a communication apparatus for implementing the method described in the first aspect.
  • the communication device is a CPU or a communication device supporting the CPU to implement the method described in the first aspect, for example, the communication device includes a chip system.
  • the communication device includes a processor for implementing the functions of the method described in the first aspect.
  • the communication device may further include a memory for storing program instructions and data.
  • the memory is coupled to the processor, and the processor may call and execute program instructions stored in the memory to implement functions in the method described in the first aspect.
  • the communication device may further include a communication interface, where the communication interface is used for the communication device to communicate with other devices. Exemplarily, if the communication device is a CPU, the other device is a memory.
  • the communication device includes: a communication interface, where the communication interface is used for the communication device to communicate with other devices.
  • the communication interface may be a transceiver for transmitting or receiving data.
  • Memory for storing program instructions.
  • the processor is configured to divide the access capacity into P partial access capacities according to the N configuration information, and map the P partial access capacities to the N memory channels according to the configuration mapping table.
  • N represents the total number of memory channels, N is an integer greater than or equal to 2
  • N configuration information is the configuration information of N memory channels, one configuration information corresponds to one memory channel, and the access capacity of P shares is the same.
  • the configuration mapping table is used to indicate the mapping relationship between capacity and memory channels.
  • the method of memory interleaving is the same as that described in the first aspect, and is not repeated here.
  • an embodiment of the present application further provides a computer-readable storage medium, including: computer software instructions; when the computer software instructions are executed in a communication device, the communication device is caused to execute the method described in the first aspect.
  • an embodiment of the present application further provides a computer program product including instructions.
  • when the computer program product runs in a communication device, the communication device is caused to execute the method described in the first aspect.
  • an embodiment of the present application provides a chip system.
  • the chip system includes a processor and may further include a memory, which is configured to implement a function of a CPU in the foregoing method.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • the names of the CPU and the communication device do not constitute a limitation on the device itself. In actual implementation, these devices may appear under other names. As long as the functions of each device are similar to the embodiments of the present application, they belong to the scope of the claims of the present application and their equivalent technologies.
  • FIG. 1 is a composition example diagram of a computer device according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a memory interleaving method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another memory interleaving method according to an embodiment of the present application.
  • FIG. 4 is a composition example diagram of a communication device according to an embodiment of the present application.
  • FIG. 5 is a composition example diagram of another communication device according to an embodiment of the present application.
  • Memory is a kind of storage and one of the important components of a host. It is the bridge between the central processing unit (CPU) and other devices, is mainly used to store data temporarily, and works with the CPU to keep pace with the CPU's processing speed and improve the performance of the whole machine.
  • Memory can also be called main memory or internal memory.
  • the memory generally adopts the structure of a memory module. According to the package and pin types of the memory module, it can be divided into a single in-line memory module (SIMM) and a dual in-line memory module (DIMM). A memory module can be inserted into a slot on the motherboard.
  • the memory module is implemented by multiple identical memory chips mounted on the same printed circuit board (PCB) substrate, that is, the memory module is composed of multiple memory chips.
  • a memory chip is commonly called a memory particle.
  • a memory chip is the basic unit of a memory module.
  • ROM: read-only memory.
  • RAM: random access memory.
  • Data can only be read from ROM and cannot be written arbitrarily. Its advantage is that the data remains unchanged after power-off. It is generally used to store data that does not change, such as the basic input output system (BIOS).
  • the contents stored in the RAM can be accessed by random read and write instructions. The data in the RAM is lost when the power is turned off, so data can only be stored while the device is running.
  • RAM can be divided into static RAM (SRAM) and dynamic RAM (DRAM).
  • SDRAM: synchronous dynamic random access memory.
  • Synchronous means that memory operation requires a synchronous clock, on which internal command sending and data transmission are based. Dynamic means that the storage array needs to be refreshed constantly to ensure that data is not lost. Random access means that data is not stored sequentially in a linear manner; instead, any address can be freely designated for reading and writing.
  • DDR SDRAM: double data rate synchronous dynamic random access memory.
  • a memory controller can also be set on the motherboard.
  • the memory controller can be set in the CPU, or the memory controller can be set in the memory.
  • the memory controller controls the CPU's access to the memory.
  • the channel between the memory controller and the memory can be called a memory channel.
  • a memory channel can also be understood as a memory controller and a memory medium corresponding to the memory controller.
  • more than two memory controllers can be set on the motherboard, and each memory controller controls a part of the memory on the motherboard.
  • the motherboard includes more than two memory channels. These memory channels are generally identical and independent.
  • the so-called dual-channel memory essentially means that the CPU has two completely independent memory controllers.
  • a memory channel can correspond to one or more memory slots.
  • the data width of the memory controller is 32 bits or 64 bits.
  • the memory chip bit width of a single memory chip in the memory module is 4bit, 8bit, or 16bit.
  • In order to meet the data width requirements of the memory controller, multiple memory chips need to be combined to perform data interaction with the memory controller.
  • the combination of multiple memory chips is called a rank, and it can also be called a physical bank (P-BANK).
  • A DIMM is a unit larger than a rank. Currently, a DIMM has 1 to 4 ranks. Different ranks can receive different chip select signals from the memory controller.
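  • The relationship between controller data width and chip bit width described above can be illustrated with a small calculation. The following is a minimal sketch, not part of the application; the function name `chips_per_rank` is hypothetical, and it assumes that a rank is formed by exactly enough identical chips to fill the controller's data width.

```python
def chips_per_rank(controller_width_bits: int, chip_width_bits: int) -> int:
    """Number of identical memory chips combined into one rank.

    Assumption (not stated in the source): the controller data width is an
    exact multiple of the chip bit width.
    """
    if controller_width_bits % chip_width_bits != 0:
        raise ValueError("controller width must be a multiple of chip width")
    return controller_width_bits // chip_width_bits


if __name__ == "__main__":
    # Chip widths named in the text: 4, 8, or 16 bits; controller: 64 bits.
    for chip_width in (4, 8, 16):
        print(f"x{chip_width} chips per 64-bit rank:",
              chips_per_rank(64, chip_width))
```

  • Under this assumption, narrower chips mean more chips per rank for the same controller width.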
  • FIG. 1 is a composition example diagram of a computer device according to an embodiment of the present application.
  • the computer device may include at least one processor 101, a memory 102, a communication interface 103, a communication bus 104, and a DIMM 105.
  • the processor 101 is a control center of a computer device, and may be a processor or a collective name of a plurality of processing elements.
  • the processor 101 is a CPU, and may also be an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, such as one or more microprocessors.
  • the processor 101 may execute various functions of the computer device by running or executing a software program stored in the memory 102 and calling data stored in the memory 102.
  • the processor 101 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 1.
  • the processor 101 may further include two memory controllers, and each memory controller is connected to two DIMMs 105.
  • the computer device may include multiple processors, such as the processor 101 and the processor 106 shown in FIG. 1.
  • each of these processors can be a single-core processor (single-CPU) or a multi-core processor (multi-CPU).
  • a processor herein may refer to one or more devices, circuits, and / or processing cores for processing data (e.g., computer program instructions).
  • the memory 102 may be a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto.
  • the memory 102 may exist independently, and is connected to
  • the memory 102 is configured to store a software program that executes the solution of the present application, and is controlled and executed by the processor 101.
  • the communication interface 103 is used to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), a wireless local area network (WLAN), and the like.
  • the communication interface 103 may include a receiving unit to implement a receiving function, and a transmitting unit to implement a transmitting function.
  • the communication bus 104 may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, or an extended industry standard architecture (EISA) bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in FIG. 1, but it does not mean that there is only one bus or one type of bus.
  • the device structure shown in FIG. 1 does not constitute a limitation on the computer device, and may include more or fewer components than shown, or some components may be combined, or different component arrangements may be included.
  • the addresses accessed by the software program in a short time may be concentrated in a small range.
  • When accesses are concentrated in a certain address range, if only the upper address bits are used to distinguish accesses, the accesses may be concentrated on a certain memory channel while the remaining memory channels are idle.
  • Address interleaving is usually used to uniformly interleave the address space to be accessed to all memory channels (usually low-order interleaving), and all memory channels are used to process the addresses to be accessed.
  • all memory channels can use the same interleaving window and interleaving algorithm.
  • the addresses in the memory system can be interleaved onto the two memory channels in an average interleaving manner. For example, interleave odd addresses onto the first memory channel and interleave even addresses onto the second memory channel.
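  • Uniform low-order interleaving of this kind can be sketched in a few lines. This is an illustrative sketch only; the function name `uniform_channel` and the `granule` parameter (the interleaving granularity in address units) are assumptions introduced here, and which parity lands on which channel is merely a labeling choice.

```python
def uniform_channel(address: int, num_channels: int, granule: int = 1) -> int:
    """Pick a channel from the low-order bits just above the granule.

    Consecutive granule-sized blocks rotate across all channels, so a
    burst of nearby addresses is spread over every channel.
    """
    return (address // granule) % num_channels


if __name__ == "__main__":
    # Two channels, granule of one address: even and odd addresses alternate.
    print([uniform_channel(a, 2) for a in range(8)])
    # Four channels with a hypothetical 64-byte granule.
    print([uniform_channel(a, 4, granule=64) for a in range(0, 512, 64)])
```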
  • the memory capacity corresponding to each memory channel may not be exactly the same, for example, it is limited by the number of DIMM slots on a single board or cost factors.
  • When the memory capacities corresponding to the memory channels are not equal, if all the memory channels still use the same interleaving window and interleaving algorithm, the accesses may be concentrated on a certain memory channel while the remaining memory channels are idle. If two interleaving windows and interleaving algorithms are used to interleave accesses, different address spaces will exhibit different memory access performance.
  • the address range of the memory system is 0 to 6G, and the memory capacity is 6G.
  • the memory system includes 4 memory channels. If the interleaving algorithm corresponding to interleaving window 1 (uniform interleaving) is used first, 4G of the 6G can be evenly mapped to the four memory channels, memory channel 0 to memory channel 3; the interleaving algorithm corresponding to interleaving window 2 then maps the remaining 2G to memory channel 2 and memory channel 3 of the four memory channels. Of course, the remaining 2G can also be mapped to memory channel 0 and memory channel 1 of the four memory channels. As can be seen from Table 1, the memory capacity is mapped to the four memory channels through two interleaving windows, and the memory capacity cannot be evenly mapped to the memory channels.
  • Memory channel 0 and memory channel 1 have the same interleaved capacity, and memory channel 2 and memory channel 3 have the same interleaved capacity, but the interleaved capacity of memory channel 0 and memory channel 1 differs from that of memory channel 2 and memory channel 3.
  • the interleaving algorithms corresponding to interleaving window 1 and interleaving window 2 are different.
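  • The two-window layout in the 6G example can be reproduced with a short calculation. This is a hedged sketch: the per-channel capacities `[1, 1, 2, 2]` (in GB) are an assumption chosen to match the example (6G in total, with the remaining 2G falling on memory channels 2 and 3), and the function name `two_window_layout` is hypothetical.

```python
def two_window_layout(channel_capacity_gb):
    """Split unequal channel capacities into two interleaving windows.

    Window 1 interleaves uniformly over all channels up to the smallest
    channel capacity; window 2 covers whatever is left on the larger
    channels. Assumes the leftovers are equal on those channels.
    """
    n = len(channel_capacity_gb)
    base = min(channel_capacity_gb)
    leftover = [cap - base for cap in channel_capacity_gb]
    window1 = {"size_gb": base * n, "channels": list(range(n))}
    window2 = {
        "size_gb": sum(leftover),
        "channels": [ch for ch, rest in enumerate(leftover) if rest > 0],
    }
    return window1, window2


if __name__ == "__main__":
    w1, w2 = two_window_layout([1, 1, 2, 2])
    print("window 1:", w1)  # uniform over all four channels
    print("window 2:", w2)  # only the two larger channels
```

  • Addresses that fall in window 2 are served by only two channels, which is why the two address ranges show different access performance.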
  • an embodiment of the present application provides a memory interleaving method.
  • the basic principle is that the access capacity is divided into P partial access capacities according to N configuration information, and the P partial access capacities are mapped to N memory channels according to the configuration mapping table.
  • N represents the total number of memory channels, N is an integer greater than or equal to 2, N configuration information is the configuration information of N memory channels, one configuration information corresponds to one memory channel, and the access capacity of P shares is the same.
  • the configuration mapping table is used to indicate the mapping relationship between capacity and memory channels.
  • the memory interleaving method provided in the embodiment of the present application divides the access capacity according to the number of capacity shares that each memory channel maps, as indicated by the configuration information, and then maps the resulting partial access capacities to the memory channels according to the configuration mapping table, so that at least one of the N memory channels maps two shares of capacity. In this way, memory interleaving of the access capacity is realized through a single interleaving window, avoiding differences in memory access performance between address spaces.
  • FIG. 2 is a schematic flowchart of a memory interleaving method according to an embodiment of the present application. As shown in FIG. 2, the method may include:
  • the N configuration information is the configuration information of N memory channels, and one configuration information corresponds to one memory channel.
  • the configuration information is used to indicate the number of capacity copies of the memory channel mapping, and at least one of the N memory channels is mapped to two capacities.
  • N represents the total number of memory channels, and N is an integer greater than or equal to two.
  • the capacity of the memory channel mapping can be determined according to the corresponding memory capacity of all the memory channels in the memory system.
  • the ratio of the memory capacities corresponding to the two memory channels is 1:2, that is, the memory capacity corresponding to one memory channel is 2 times the memory capacity corresponding to the other memory channel.
  • one memory channel can map two capacities and the other memory channel can map one capacity.
  • the N configuration information includes M first configuration information and N-M second configuration information.
  • the first configuration information includes a memory channel identifier and a first indication identifier, and the first indication identifier is used to instruct the memory channel corresponding to the memory channel identifier to map two capacities.
  • the second configuration information includes a memory channel identifier and a second indication identifier. The second indication identifier is used to instruct the memory channel corresponding to the memory channel identifier to map a capacity.
  • M is an integer, M is greater than or equal to 1, and less than N.
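  • One way to picture the N configuration information is as a small record per memory channel. The sketch below is illustrative only: the names `ChannelConfig` and `build_configs` are hypothetical, and the share count (2 or 1) stands in for the first and second indication identifiers, whose actual encoding the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelConfig:
    channel_id: int  # memory channel identifier
    shares: int      # 2 = "first" configuration information, 1 = "second"


def build_configs(two_share_channels, num_channels):
    """Build N configuration records, M of which map two shares."""
    m = len(two_share_channels)
    if num_channels < 2:
        raise ValueError("N must be at least 2")
    if not 1 <= m < num_channels:
        raise ValueError("M must satisfy 1 <= M < N")
    return [
        ChannelConfig(ch, 2 if ch in two_share_channels else 1)
        for ch in range(num_channels)
    ]


if __name__ == "__main__":
    # Hypothetical setup: 4 channels, channels 2 and 3 map two shares (M = 2).
    for cfg in build_configs({2, 3}, 4):
        print(cfg)
```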
  • the identifier of memory channel 0 can be 0.
  • the identifier of memory channel 1 can be 1.
  • the identifier of memory channel 2 may be 2.
  • the identifier of the memory channel 3 may be 3.
  • Memory channel 2 and memory channel 3 have the same memory capacity.
  • Memory channel 0 and memory channel 1 have the same memory capacity.
  • the memory capacity corresponding to each of memory channel 2 and memory channel 3 is twice the memory capacity corresponding to each of memory channel 0 and memory channel 1.
  • In this case, M = 2.
  • the memory capacity can be divided into 6 shares.
  • the size of one share is 1 GB.
  • Memory channel 0 maps one share of capacity.
  • Memory channel 1 maps one share of capacity.
  • Memory channel 2 maps two shares of capacity.
  • Memory channel 3 maps two shares of capacity.
  • the so-called access capacity can be the capacity that users need to read and write.
  • the access capacity can be smaller than or equal to the memory capacity.
  • the access capacity can be divided according to the number of capacity shares mapped by all the memory channels, as indicated by the N configuration information, to obtain the P partial access capacities.
  • the access capacity of P shares is the same.
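  • The division into P equal shares and the lookup through a configuration mapping table can be sketched as follows. This is a minimal sketch under stated assumptions: the share counts `[1, 1, 2, 2]` correspond to a setup in which memory channels 2 and 3 have twice the capacity of memory channels 0 and 1, the function names are hypothetical, and the order of entries in the table is an arbitrary choice made here, since the text does not prescribe one.

```python
def split_access_capacity(access_capacity, shares_per_channel):
    """Return (P, size of one share); P is the total number of shares."""
    p = sum(shares_per_channel)
    if access_capacity % p != 0:
        raise ValueError("access capacity must divide evenly into P shares")
    return p, access_capacity // p


def build_mapping_table(shares_per_channel):
    """Index = share number within one division, value = channel id."""
    table = []
    for channel_id, shares in enumerate(shares_per_channel):
        table.extend([channel_id] * shares)
    return table


if __name__ == "__main__":
    shares = [1, 1, 2, 2]                      # per-channel share counts
    p, share_gb = split_access_capacity(6, shares)
    print("P =", p, "share size =", share_gb, "GB")
    print("mapping table:", build_mapping_table(shares))
```

  • Because every share has the same size and one table covers the whole access capacity, a single interleaving window suffices.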
  • partial access capacity 0 and partial access capacity 1 can be mapped to memory channel 0; partial access capacity 2 and partial access capacity 3 can be mapped to memory channel 1; partial access capacity 4 can be mapped to memory channel 2; partial access capacity 5 can be mapped to memory channel 3.
  • the memory interleaving method provided in the embodiment of the present application divides the access capacity according to the number of memory channel mapping capacity indicated by the configuration information, and then maps the divided partial access capacity to the memory channel according to the configuration mapping table, so that at least One memory channel maps two capacities, so that the memory interleaving processing of the access capacity is realized through one interleaving window, avoiding differences in memory access performance in different address spaces.
  • the embodiment of the present application may further include the following steps.
  • Before dividing the access capacity into P partial access capacities according to the N configuration information, S203 may also be performed.
  • S203: Generate the N configuration information and the configuration mapping table, and store them so that the access capacity can be divided and mapped.
  • the address space corresponding to the access capacity may be a plurality of non-contiguous address spaces.
  • the access capacity corresponding to each segment of contiguous address space is divided according to the configuration information of the N memory channels, where the segments of contiguous address space correspond to equal access capacities, and the resulting partial access capacities are also equal.
  • the address spaces corresponding to the non-contiguous partial access capacities mapped to the same memory channel are made contiguous.
  • the so-called non-contiguous partial access capacities are the partial access capacities obtained by dividing the access capacity corresponding to non-contiguous address spaces.
  • the division of the access capacity corresponding to one contiguous address space can be understood as one division cycle.
  • making contiguous the address spaces corresponding to non-contiguous partial access capacities mapped to the same memory channel can be understood as making the partial access capacities of different division cycles contiguous.
  • the identification bit of the address space of each partial access capacity is set at the lower bit of the address requesting access to the memory.
  • the lower bits of the address requesting access to the memory may be a bit higher than the interleaving granularity.
  • the identification bits of each address space can also be placed in other locations.
  • S205 When reading the capacity of the address space, S205 may be executed to restore the original address and obtain the original data.
  • S205 Restore the addresses to the original addresses respectively according to the N configuration information and the identification bits of the address space of each partial access capacity.
  • The foregoing embodiments describe the method from the perspective of the CPU. It can be understood that, to implement the functions in the method provided in the embodiments of the present application, each network element, for example the CPU, includes a hardware structure and/or a software module corresponding to each function.
  • A person skilled in the art should readily appreciate that, in combination with the algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by hardware driven by computer software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such an implementation should not be considered beyond the scope of this application.
  • The CPU may be divided into functional modules according to the foregoing method examples.
  • For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module.
  • The integrated modules may be implemented in the form of hardware or of software functional modules. It should be noted that the division into modules in the embodiments of the present application is schematic and is only a division by logical function. In actual implementation, there may be other divisions.
  • When functional modules are divided according to the corresponding functions, FIG. 4 shows a possible composition of the communication device involved in the foregoing embodiments. The communication device can perform the steps performed by the CPU in any of the method embodiments of the present application.
  • As shown in FIG. 4, the communication device is a CPU or a communication device that supports a CPU in implementing the method provided in the embodiments.
  • For example, the communication device may be a chip system.
  • The communication device may include a processing unit 401.
  • The processing unit 401 is configured to support the communication device in executing the method described in the embodiments of the present application.
  • For example, the processing unit 401 is configured to perform, or to support the communication device in performing, S201 and S202 in the memory interleaving method shown in FIG. 2, and S201 and S205 in the memory interleaving method shown in FIG. 3.
  • The communication device provided in the embodiments of the present application is configured to execute the method in any of the foregoing embodiments, and can therefore achieve the same effect as the method in the foregoing embodiments.
  • As shown in FIG. 5, a communication device 500 provided in an embodiment of the present application is used to implement the functions of the CPU in the foregoing method.
  • The communication device 500 may be a CPU or a device in a CPU.
  • The communication device 500 may be a chip system.
  • In the embodiments of the present application, a chip system may consist of a chip, or may include a chip and other discrete devices.
  • The communication device 500 includes at least one processor 501, configured to implement the functions of the CPU in the method provided in the embodiments of the present application.
  • For example, the processor 501 may be used to perform S201 to S205. For details, refer to the detailed description in the method example; details are not repeated here.
  • The communication device 500 may further include at least one memory 502 for storing program instructions and/or data.
  • The memory 502 and the processor 501 are coupled.
  • Coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units, or modules. It may be electrical, mechanical, or in another form, and is used for information exchange between the devices, units, or modules.
  • The processor 501 may operate in cooperation with the memory 502.
  • The processor 501 may execute program instructions stored in the memory 502. At least one of the at least one memory may be included in the processor.
  • The communication device 500 may further include a communication interface 503 for communicating with other devices through a transmission medium, so that the devices in the communication device 500 can communicate with other devices.
  • For example, if the communication device is a CPU, the other device is a memory.
  • The processor 501 uses the communication interface 503 to send and receive data, and is used to implement the method executed by the CPU described in the embodiments corresponding to FIG. 2 and FIG. 3.
  • The embodiments of the present application do not limit the specific connection medium between the communication interface 503, the processor 501, and the memory 502.
  • In FIG. 5, the communication interface 503, the processor 501, and the memory 502 are connected by a bus 504, which is drawn as a thick line. The connections between other components are shown only schematically and are not limiting.
  • The bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in FIG. 5, but this does not mean that there is only one bus or one type of bus.
  • The processor may be a general-purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application.
  • A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
  • The memory may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or a volatile memory, such as a random-access memory (RAM).
  • The memory is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • The memory in the embodiments of the present application may also be a circuit or any other device capable of implementing a storage function, for storing program instructions and/or data.
  • The CPU involved in the embodiments of the present application may be the communication device shown in FIG. 4.
  • In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways.
  • For example, the device embodiments described above are merely illustrative.
  • The division into modules or units is only a division by logical function; in actual implementation there may be other divisions.
  • For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not performed.
  • In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • The units described as separate components may or may not be physically separate, and the components displayed as units may be one physical unit or multiple physical units; that is, they may be located in one place or distributed over multiple places. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • The functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
  • The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
  • The methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions.
  • When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are wholly or partially generated.
  • The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a terminal, or another programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (such as by infrared, radio, or microwave).
  • The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System (AREA)
  • Storage Device Security (AREA)
  • Communication Control (AREA)

Abstract

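A memory interleaving method and apparatus relate to the computer field and solve the problem that different address spaces show different memory access performance when accesses are interleaved using two interleaving windows and interleaving algorithms. The specific solution is: dividing an access capacity into P partial access capacities according to N pieces of configuration information, where the P partial access capacities are of the same size, the N pieces of configuration information are the configuration information of N memory channels, one piece of configuration information corresponds to one memory channel, the configuration information indicates the number of capacity portions that the memory channel maps, at least one of the N memory channels maps two portions of capacity, N denotes the total number of memory channels, and N is an integer greater than or equal to 2; and mapping the P partial access capacities to the N memory channels according to a configuration mapping table, where the configuration mapping table indicates the mapping relationship between capacities and memory channels. The method and apparatus are used in memory interleaving.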

Description

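Memory Interleaving Method and Apparatus
Technical Field
Embodiments of the present application relate to the computer field, and in particular to a memory interleaving method and apparatus.
Background
When a central processing unit (CPU) accesses memory, a memory controller (MC) controls the CPU's access to the memory. The channel between a memory controller and the memory may be called a memory channel. When two or more memory controllers are provided on a motherboard, each memory controller controls a part of the memory in the memory system, and the memory capacity corresponding to each memory channel is equal.
To improve system performance, memory interleaving can be used to interleave accesses evenly across all memory channels, and all memory channels can use the same interleaving window and interleaving algorithm. However, because the number of memory modules controlled by each memory controller may differ, and because of factors such as cost, the memory capacities corresponding to the memory channels may not all be the same. When these capacities are unequal, if all memory channels still use the same interleaving window and interleaving algorithm, accesses may concentrate on one memory channel while the other memory channels stay idle. If two interleaving windows and interleaving algorithms are used instead, different address spaces show different memory access performance.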
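Summary
Embodiments of the present application provide a memory interleaving method and apparatus that solve the problem that, when the memory capacities corresponding to the memory channels are unequal, interleaving accesses with two interleaving windows and interleaving algorithms causes different address spaces to show different memory access performance.
To achieve the foregoing objective, the embodiments of the present application adopt the following technical solutions:
According to a first aspect, an embodiment of the present application provides a memory interleaving method. The method can be applied to a CPU, and/or to a communication device that can support a CPU in implementing the method, for example a communication device that includes a chip system. The method includes: dividing an access capacity into P partial access capacities according to N pieces of configuration information, and mapping the P partial access capacities to N memory channels according to a configuration mapping table. N denotes the total number of memory channels and is an integer greater than or equal to 2; the N pieces of configuration information are the configuration information of the N memory channels; one piece of configuration information corresponds to one memory channel; and the P partial access capacities are of the same size. The configuration mapping table indicates the mapping relationship between capacities and memory channels.
The memory interleaving method provided in the embodiments of the present application divides the access capacity according to the number of capacity portions that the configuration information indicates each memory channel maps, and then maps the resulting partial access capacities to the memory channels according to the configuration mapping table, so that at least one of the N memory channels maps two portions of capacity. Memory interleaving of the access capacity is thus achieved with a single interleaving window, which avoids differences in memory access performance between address spaces.
With reference to the first aspect, in a possible implementation, before the access capacity is divided into P partial access capacities according to the N pieces of configuration information, the method further includes: generating the N pieces of configuration information and the configuration mapping table. The N pieces of configuration information include M pieces of first configuration information and N-M pieces of second configuration information. The first configuration information includes a memory channel identifier and a first indication identifier, where the first indication identifier indicates that the memory channel corresponding to the memory channel identifier maps two portions of capacity. The second configuration information includes a memory channel identifier and a second indication identifier, where the second indication identifier indicates that the memory channel corresponding to the memory channel identifier maps one portion of capacity. M is an integer greater than or equal to 1 and less than N.
With reference to the first aspect and the foregoing possible implementation, in another possible implementation, before the P partial access capacities are mapped to the N memory channels according to the configuration mapping table, the method further includes: making contiguous the address spaces corresponding to the non-contiguous partial access capacities that are mapped to the same memory channel.
With reference to the foregoing possible implementations, in another possible implementation, the identification bit of the address space of each partial access capacity is set in the low-order bits of the address of the memory access request.
With reference to the foregoing possible implementations, in another possible implementation, the method further includes: restoring each address to its original address according to the N pieces of configuration information and the identification bit of the address space of each partial access capacity.
According to a second aspect, an embodiment of the present application further provides a communication device for implementing the method described in the first aspect. The communication device is a CPU or a communication device that supports a CPU in implementing the method described in the first aspect, for example a communication device that includes a chip system. For example, the communication device includes a processing unit. The processing unit is configured to divide an access capacity into P partial access capacities according to N pieces of configuration information, and to map the P partial access capacities to N memory channels according to a configuration mapping table. N denotes the total number of memory channels and is an integer greater than or equal to 2; the N pieces of configuration information are the configuration information of the N memory channels; one piece of configuration information corresponds to one memory channel; and the P partial access capacities are of the same size. The configuration mapping table indicates the mapping relationship between capacities and memory channels.
Optionally, the specific memory interleaving method is the same as the corresponding description in the first aspect and is not repeated here.
Optionally, the communication device may further include a communication interface for sending or receiving data.
It should be noted that the functional modules of the second aspect may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the foregoing functions: for example, a transceiver that performs the functions of the receiving unit and the sending unit, a processor that performs the functions of the processing unit, and a memory that stores the program instructions with which the processor carries out the method of the embodiments of the present application. The processor, the transceiver, and the memory are connected by a bus and communicate with one another. For details, refer to the behavior of the CPU in the method of the first aspect.
According to a third aspect, an embodiment of the present application further provides a communication device for implementing the method described in the first aspect. The communication device is a CPU or a communication device that supports a CPU in implementing the method described in the first aspect, for example a communication device that includes a chip system. For example, the communication device includes a processor configured to implement the functions of the method described in the first aspect. The communication device may further include a memory for storing program instructions and data. The memory is coupled to the processor, and the processor can call and execute the program instructions stored in the memory to implement the functions of the method described in the first aspect. The communication device may further include a communication interface through which the communication device communicates with other devices. For example, if the communication device is a CPU, the other device is a memory.
In one possible device, the communication device includes a communication interface through which the communication device communicates with other apparatuses. For example, the communication interface may be a transceiver used to send or receive data. The communication device also includes a memory for storing program instructions, and a processor configured to divide an access capacity into P partial access capacities according to N pieces of configuration information, and to map the P partial access capacities to N memory channels according to a configuration mapping table. N denotes the total number of memory channels and is an integer greater than or equal to 2; the N pieces of configuration information are the configuration information of the N memory channels; one piece of configuration information corresponds to one memory channel; and the P partial access capacities are of the same size. The configuration mapping table indicates the mapping relationship between capacities and memory channels.
Optionally, the memory interleaving method is the same as the corresponding description in the first aspect and is not repeated here.
According to a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium that includes computer software instructions. When the computer software instructions run in a communication device, the communication device performs the method described in the first aspect.
According to a fifth aspect, an embodiment of the present application further provides a computer program product containing instructions. When the computer program product runs in a communication device, the communication device performs the method described in the first aspect.
According to a sixth aspect, an embodiment of the present application provides a chip system. The chip system includes a processor, and may further include a memory, for implementing the functions of the CPU in the foregoing method. The chip system may consist of a chip, or may include a chip and other discrete devices.
In addition, for the technical effects of the designs of any of the foregoing aspects, refer to the technical effects of the different designs of the first aspect; they are not repeated here.
In the embodiments of the present application, the names of the CPU and the communication device do not limit the devices themselves; in an actual implementation these devices may appear under other names. As long as the functions of each device are similar to those in the embodiments of the present application, the devices fall within the scope of the claims of the present application and their equivalents.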
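Brief Description of the Drawings
FIG. 1 is an example composition diagram of a computer device according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a memory interleaving method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of another memory interleaving method according to an embodiment of the present application;
FIG. 4 is an example composition diagram of a communication device according to an embodiment of the present application;
FIG. 5 is an example composition diagram of another communication device according to an embodiment of the present application.
Detailed Description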
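Memory is one kind of storage and one of the important components of a host. It is the bridge through which the central processing unit (CPU) communicates with other devices. It is mainly used to hold data temporarily and to work with the CPU and match its processing speed, thereby improving the performance of the whole machine. Memory may also be called main memory or internal memory.
Memory generally takes the form of memory modules. Depending on the packaging and pin layout, memory modules are divided into single inline memory modules (SIMMs) and dual in-line memory modules (DIMMs). One memory module can be inserted into one slot on the motherboard. A memory module is built by mounting multiple identical memory chips on the same printed circuit board (PCB) substrate; that is, a memory module consists of multiple memory chips. The memory chip, commonly called a memory granule, is the basic unit of a memory module.
By operating principle, memory is mainly divided into read-only memory (ROM) and random access memory (RAM). Data can only be read from a ROM and cannot be written arbitrarily. A ROM retains its data after power-off, so it is generally used to store data that cannot be changed, such as the basic input output system (BIOS). The contents of a RAM can be read and written randomly by instructions. Data in a RAM is lost at power-off, so a RAM can store data only while the machine is running. The term "memory" usually refers to RAM.
By structure and operating principle, RAM is divided into static RAM and dynamic RAM. The memory in general use in the industry today is synchronous dynamic random access memory (SDRAM). "Synchronous" means that the memory needs a synchronization clock, on which internal command issue and data transfer are based. "Dynamic" means that the storage array must be refreshed continually so that data is not lost. "Random" means that data is not stored linearly in sequence; reads and writes go to freely specified addresses. To increase SDRAM speed, the industry also uses double data rate SDRAM (DDR SDRAM), which doubles the speed of SDRAM without raising the clock frequency and provides twice the transfer rate and memory bandwidth of SDRAM.
A memory controller (MC) may also be provided on the motherboard. For example, the memory controller may be disposed in the CPU or in the memory. When the CPU accesses memory, the memory controller controls the CPU's access to the memory. The channel between a memory controller and the memory may be called a memory channel. A memory channel can also be understood as one memory controller together with the memory medium corresponding to that controller. To increase the speed at which the CPU processes data, two or more memory controllers may be provided on the motherboard, each controlling a part of the memory on the motherboard. In this scenario the motherboard includes two or more memory channels, which are generally identical and independent. So-called dual-channel memory essentially means that the CPU has two completely independent memory controllers. One memory channel may correspond to one or more memory module slots.
Typically, the data width of a memory controller is 32 bits or 64 bits, and the width of a single memory chip in a memory module is 4, 8, or 16 bits. To meet the data width of the memory controller, multiple memory chips must be combined to exchange data with it. Such a combination of memory chips is called a rank, also known as a physical bank (P-BANK). For example, if the data width of the memory controller is 64 bits and the chip width is 8 bits, 8 memory chips form one rank. Likewise, if the data width of the memory controller is 64 bits and the chip width is 16 bits, 4 memory chips form one rank. A DIMM is a larger unit than a rank; at present one DIMM has 1 to 4 ranks. Different ranks can be connected to different chip select signals of the memory controller.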
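The rank arithmetic in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `chips_per_rank` is a hypothetical helper and does not appear in the application.

```python
def chips_per_rank(controller_width_bits: int, chip_width_bits: int) -> int:
    """Return how many memory chips must be combined to fill the controller's data width."""
    if chip_width_bits <= 0 or controller_width_bits % chip_width_bits != 0:
        raise ValueError("controller width must be a positive multiple of the chip width")
    return controller_width_bits // chip_width_bits


print(chips_per_rank(64, 8))   # 64-bit controller, 8-bit chips: 8 chips per rank
print(chips_per_rank(64, 16))  # 64-bit controller, 16-bit chips: 4 chips per rank
```

The two calls reproduce the two cases given in the text (8 chips and 4 chips per rank).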
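FIG. 1 is an example composition diagram of a computer device according to an embodiment of the present application. As shown in FIG. 1, the computer device may include at least one processor 101, a memory 102, a communication interface 103, a communication bus 104, and DIMMs 105.
The components of the computer device are described below with reference to FIG. 1:
The processor 101 is the control center of the computer device. It may be a single processor or a collective term for multiple processing elements. For example, the processor 101 is a CPU, or it may be an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, for example one or more digital signal processors (DSPs) or one or more field programmable gate arrays (FPGAs).
The processor 101 can perform the various functions of the computer device by running or executing software programs stored in the memory 102 and calling data stored in the memory 102.
In a specific implementation, as an embodiment, the processor 101 may include one or more CPUs, for example CPU0 and CPU1 shown in FIG. 1.
In the embodiments of the present application, the processor 101 may further include two memory controllers, each connected to two DIMMs 105.
In a specific implementation, as an embodiment, the computer device may include multiple processors, for example the processor 101 and the processor 106 shown in FIG. 1. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
The memory 102 may be a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other compact disc storage, optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 102 may exist independently and be connected to the processor 101 through the communication bus 104. The memory 102 may also be integrated with the processor 101.
The memory 102 is used to store the software programs that execute the solution of the present application, and their execution is controlled by the processor 101.
The communication interface 103 is used to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN). The communication interface 103 may include a receiving unit that implements the receiving function and a sending unit that implements the sending function.
The communication bus 104 may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in FIG. 1, but this does not mean that there is only one bus or one type of bus.
The device structure shown in FIG. 1 does not limit the computer device, which may include more or fewer components than shown, combine some components, or arrange the components differently.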
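Generally, software programs exhibit spatial locality: the addresses a program accesses within a short time tend to concentrate in a small range. When accesses concentrate on one address range, distinguishing accesses only by the high-order address bits may cause them to concentrate on one memory channel while the other memory channels stay idle. To make full use of the system's memory bandwidth, keep the bandwidth of the memory channels as balanced as possible, and improve DDR SDRAM utilization, accesses to the DDR SDRAM need to be interleaved. Address interleaving is usually used to spread the address space to be accessed evenly over all memory channels (generally by low-order interleaving), so that all memory channels serve the addresses being accessed.
When every memory channel in the memory system corresponds to the same memory capacity, all memory channels can use the same interleaving window and interleaving algorithm. For example, when the memory system has two memory channels, the addresses in the memory system can be interleaved evenly onto the two channels: odd addresses are interleaved onto the first memory channel and even addresses onto the second.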
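Uniform low-order interleaving over equal-capacity channels can be sketched as follows. The 64-byte granularity and the function name `uniform_channel` are assumptions made for illustration; the application does not fix a granularity, and which parity lands on which channel is only a labeling choice.

```python
def uniform_channel(addr: int, num_channels: int, granularity: int = 64) -> int:
    """Pick a channel from the low-order address bits just above the granularity.

    granularity is the number of consecutive bytes kept on one channel
    (an assumed value; the text does not specify one).
    """
    if addr < 0 or num_channels < 1 or granularity < 1:
        raise ValueError("invalid argument")
    return (addr // granularity) % num_channels


# Consecutive 64-byte blocks alternate between the two channels.
for addr in range(0, 256, 64):
    print(hex(addr), "-> channel", uniform_channel(addr, 2))
```

Because the channel is chosen from low-order bits, a program that touches a small address range still spreads its accesses over every channel.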
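In practice, however, the memory capacities corresponding to the memory channels may not all be the same, for example because of limits on the number of DIMM slots on a board or because of cost. When these capacities are unequal, if all memory channels still use the same interleaving window and interleaving algorithm, accesses may concentrate on one memory channel while the other memory channels stay idle. If two interleaving windows and interleaving algorithms are used instead, different address spaces show different memory access performance.
For example, as shown in Table 1, assume that the memory system address range is 0 to 6G and the memory capacity is 6G. The memory system includes 4 memory channels. Using the interleaving algorithm of interleaving window 1 (uniform interleaving), 4G of the 6G can first be mapped evenly onto the four memory channels, memory channel 0 to memory channel 3. Using the interleaving algorithm of interleaving window 2, the remaining 2G can be mapped onto memory channel 2 and memory channel 3 among the four memory channels. The remaining 2G can of course instead be mapped onto memory channel 0 and memory channel 1. Table 1 shows that the memory capacity is mapped onto the four memory channels through two interleaving windows and cannot be mapped onto them evenly. Memory channel 0 and memory channel 1 are interleaved with the same capacity, and memory channel 2 and memory channel 3 are interleaved with the same capacity, but the capacity interleaved onto memory channels 0 and 1 differs from the capacity interleaved onto memory channels 2 and 3. Interleaving window 1 and interleaving window 2 use different interleaving algorithms.
Table 1
[Table 1 is provided as an image in the original: Figure PCTCN2018097807-appb-000001]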
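The two-window scheme described above can be modelled with a small lookup, which makes the performance asymmetry visible: addresses in the first window are served by four channels, addresses in the second by only two. The window boundaries follow the 6G example in the text; the names `WINDOWS` and `channels_for` are hypothetical.

```python
GIB = 1 << 30

# (start, end, channels that the window interleaves across), per the 6G example.
WINDOWS = [
    (0 * GIB, 4 * GIB, (0, 1, 2, 3)),  # interleaving window 1: uniform over all four
    (4 * GIB, 6 * GIB, (2, 3)),        # interleaving window 2: remaining 2G on two channels
]


def channels_for(addr: int) -> tuple:
    """Return the channels that can serve an address under the two-window scheme."""
    for start, end, channels in WINDOWS:
        if start <= addr < end:
            return channels
    raise ValueError("address outside the 0 to 6G range")


print(len(channels_for(1 * GIB)))  # address in window 1: spread over 4 channels
print(len(channels_for(5 * GIB)))  # address in window 2: spread over 2 channels
```

An access stream confined to the upper 2G therefore sees roughly half the channel parallelism of a stream in the lower 4G, which is the difference the application sets out to avoid.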
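To solve the foregoing problem, the embodiments of the present application provide a memory interleaving method whose basic principle is: dividing an access capacity into P partial access capacities according to N pieces of configuration information, and mapping the P partial access capacities to N memory channels according to a configuration mapping table. N denotes the total number of memory channels and is an integer greater than or equal to 2; the N pieces of configuration information are the configuration information of the N memory channels; one piece of configuration information corresponds to one memory channel; and the P partial access capacities are of the same size. The configuration mapping table indicates the mapping relationship between capacities and memory channels.
The memory interleaving method provided in the embodiments of the present application divides the access capacity according to the number of capacity portions that the configuration information indicates each memory channel maps, and then maps the resulting partial access capacities to the memory channels according to the configuration mapping table, so that at least one of the N memory channels maps two portions of capacity. Memory interleaving of the access capacity is thus achieved with a single interleaving window, which avoids differences in memory access performance between address spaces.
The implementations of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
FIG. 2 is a schematic flowchart of a memory interleaving method according to an embodiment of the present application. As shown in FIG. 2, the method may include:
S201: Divide the access capacity into P partial access capacities according to N pieces of configuration information.
The N pieces of configuration information are the configuration information of N memory channels, and one piece of configuration information corresponds to one memory channel. The configuration information indicates the number of capacity portions that the memory channel maps, and at least one of the N memory channels maps two portions of capacity. N denotes the total number of memory channels and is an integer greater than or equal to 2.
Understandably, how many portions of capacity a memory channel maps can be determined from the memory capacities corresponding to all memory channels in the memory system. Usually, when two memory channels correspond to different memory capacities, the ratio of those capacities is 1:2; that is, the memory capacity of one memory channel is twice that of the other. In this case, one memory channel can map two portions of capacity and the other can map one portion.
The N pieces of configuration information include M pieces of first configuration information and N-M pieces of second configuration information. The first configuration information includes a memory channel identifier and a first indication identifier, where the first indication identifier indicates that the memory channel corresponding to the memory channel identifier maps two portions of capacity. The second configuration information includes a memory channel identifier and a second indication identifier, where the second indication identifier indicates that the memory channel corresponding to the memory channel identifier maps one portion of capacity. M is an integer greater than or equal to 1 and less than N.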
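One way to picture the configuration information is as a small record per channel, derived from the channel capacities. The sketch below is an assumption-laden illustration: the encodings `TWO_PORTIONS` and `ONE_PORTION`, the class `ChannelConfig`, and the function `generate_configs` are all hypothetical, and the sketch only handles the 1:2 capacity ratio that the text describes.

```python
from dataclasses import dataclass
from typing import List

TWO_PORTIONS = 1  # hypothetical encoding of the "first indication identifier"
ONE_PORTION = 0   # hypothetical encoding of the "second indication identifier"


@dataclass(frozen=True)
class ChannelConfig:
    channel_id: int  # memory channel identifier
    indicator: int   # TWO_PORTIONS or ONE_PORTION

    @property
    def portions(self) -> int:
        """Number of capacity portions this channel maps."""
        return 2 if self.indicator == TWO_PORTIONS else 1


def generate_configs(channel_capacities: List[int]) -> List[ChannelConfig]:
    """Build one configuration record per channel from per-channel capacities."""
    smallest = min(channel_capacities)
    configs = []
    for channel_id, capacity in enumerate(channel_capacities):
        if capacity == smallest:
            indicator = ONE_PORTION
        elif capacity == 2 * smallest:
            indicator = TWO_PORTIONS
        else:
            raise ValueError("this sketch only handles capacity ratios of 1:2")
        configs.append(ChannelConfig(channel_id, indicator))
    m = sum(1 for c in configs if c.indicator == TWO_PORTIONS)
    if not 1 <= m < len(configs):
        raise ValueError("expected 1 <= M < N")
    return configs


# Illustrative input: four channels with capacities in a 1:1:2:2 ratio.
configs = generate_configs([1, 1, 2, 2])
print([c.portions for c in configs], "P =", sum(c.portions for c in configs))
```

The total number of portions P is simply the sum of the per-channel portion counts, which is what S201 uses to divide the access capacity.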
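For example, as shown in Table 2, assume that the memory system address range is 0 to 6G and the memory capacity is 6G. The memory system includes 4 memory channels, that is, N=4. The identifier of memory channel 0 may be 0, the identifier of memory channel 1 may be 1, the identifier of memory channel 2 may be 2, and the identifier of memory channel 3 may be 3. Memory channel 2 and memory channel 3 correspond to the same memory capacity. Memory channel 0 and memory channel 1 correspond to the same memory capacity, and the memory capacity of memory channels 2 and 3 is twice that of memory channels 0 and 1. M=2.
Table 2
[Table 2 is provided as an image in the original: Figure PCTCN2018097807-appb-000002]
Table 2 shows that the memory capacity can be divided into 6 portions, each 1 GB in size. Memory channel 1 maps one portion of capacity. Memory channel 2 maps one portion of capacity. Memory channel 3 maps two portions of capacity. Memory channel 4 maps two portions of capacity.
The access capacity is the capacity that the user needs to read or write. The access capacity may be less than or equal to the memory capacity.
After the access capacity is obtained, it can be divided according to the number of capacity portions that the N pieces of configuration information indicate for all memory channels, giving P partial access capacities. The P partial access capacities are of the same size.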
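The division step in S201 can be sketched as follows, under the assumption that P is the sum of the portion counts indicated for all channels and that the access capacity divides evenly. The function name `split_access_capacity` is hypothetical.

```python
from typing import List, Tuple


def split_access_capacity(access_bytes: int, portions_per_channel: List[int]) -> List[Tuple[int, int]]:
    """Divide an access capacity into P equal parts and return (offset, size) pairs.

    P is the total number of portions indicated by the per-channel configuration.
    """
    p = sum(portions_per_channel)
    if p == 0 or access_bytes % p != 0:
        raise ValueError("access capacity must divide evenly into P portions")
    size = access_bytes // p
    return [(index * size, size) for index in range(p)]


GIB = 1 << 30
parts = split_access_capacity(6 * GIB, [1, 1, 2, 2])
print(len(parts), "portions of", parts[0][1] // GIB, "GiB each")
```

With a 6G access capacity and portion counts of 1, 1, 2, and 2, this gives six portions of 1G each, matching the example in the text.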
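S202: Map the P partial access capacities to the N memory channels according to the configuration mapping table.
The configuration mapping table indicates the mapping relationship between capacities and memory channels. For example, assume that the memory system includes 4 memory channels and that dividing the access capacity gives 6 partial access capacities, that is, P=6, as shown in Table 3.
Table 3
Capacity name               Memory channel name
Partial access capacity 5   Memory channel 3
Partial access capacity 4   Memory channel 2
Partial access capacity 3   Memory channel 1
Partial access capacity 2   Memory channel 1
Partial access capacity 1   Memory channel 0
Partial access capacity 0   Memory channel 0
Table 3 shows that partial access capacity 0 and partial access capacity 1 can be mapped to memory channel 0; partial access capacity 2 and partial access capacity 3 can be mapped to memory channel 1; partial access capacity 4 can be mapped to memory channel 2; and partial access capacity 5 can be mapped to memory channel 3.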
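The mapping in Table 3 can be written as a plain dictionary from portion index to channel index. The sketch below groups portions by channel so the "at least one channel maps two portions" property is easy to check; the names `MAPPING_TABLE` and `portions_by_channel` are hypothetical, and the dictionary is only one possible in-memory representation of the configuration mapping table.

```python
from collections import defaultdict
from typing import Dict, List

# Table 3: partial access capacity index -> memory channel index.
MAPPING_TABLE: Dict[int, int] = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 3}


def portions_by_channel(table: Dict[int, int]) -> Dict[int, List[int]]:
    """Group portion indices by the memory channel they are mapped to."""
    grouped = defaultdict(list)
    for portion in sorted(table):
        grouped[table[portion]].append(portion)
    return dict(grouped)


grouped = portions_by_channel(MAPPING_TABLE)
print(grouped)
print(any(len(portions) == 2 for portions in grouped.values()))
```

Every portion is covered by the same single table, so one interleaving window serves the whole access capacity.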
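The memory interleaving method provided in the embodiments of the present application divides the access capacity according to the number of capacity portions that the configuration information indicates each memory channel maps, and then maps the resulting partial access capacities to the memory channels according to the configuration mapping table, so that at least one of the N memory channels maps two portions of capacity. Memory interleaving of the access capacity is thus achieved with a single interleaving window, which avoids differences in memory access performance between address spaces.
Further, as shown in FIG. 3, the embodiments of the present application may also include the following steps.
Before the access capacity is divided into P partial access capacities according to the N pieces of configuration information, S203 may also be performed.
S203: Generate the N pieces of configuration information and the configuration mapping table.
After the N pieces of configuration information and the configuration mapping table are generated, they are stored so that the access capacity can be divided and mapped.
In addition, the address space corresponding to the access capacity may include multiple non-contiguous address segments. Before the P partial access capacities are mapped to the N memory channels according to the configuration mapping table, S204 may also be performed.
S204: Make contiguous the address spaces corresponding to the non-contiguous partial access capacities that are mapped to the same memory channel.
When the address space corresponding to the access capacity includes multiple non-contiguous segments, the access capacity corresponding to each contiguous segment is divided according to the configuration information of the N memory channels. Each contiguous segment corresponds to the same access capacity, and the resulting partial access capacities are also the same size. The address spaces corresponding to the non-contiguous partial access capacities mapped to the same memory channel are then made contiguous. Here, a non-contiguous partial access capacity is a partial access capacity obtained by dividing the access capacity that corresponds to a non-contiguous address segment. Understandably, dividing the access capacity corresponding to one contiguous address segment can be understood as one division cycle. Making contiguous the address spaces of the non-contiguous partial access capacities mapped to the same memory channel can be understood as making the partial access capacities of different division cycles contiguous.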
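One possible reading of S204 is sketched below. It is an interpretation, not the application's stated algorithm: each division cycle is assumed to be P consecutive blocks of one interleaving granularity, the slot-to-channel list follows Table 3, and the 64-byte granularity and the name `to_channel_address` are hypothetical. Blocks that land on the same channel in successive cycles are packed back to back in that channel's local address space.

```python
from typing import List, Tuple


def to_channel_address(addr: int, slot_to_channel: List[int], granularity: int = 64) -> Tuple[int, int]:
    """Map a system address to (channel, contiguous channel-local address)."""
    p = len(slot_to_channel)                    # portions per division cycle
    block, offset = divmod(addr, granularity)   # which block, and byte within it
    cycle, slot = divmod(block, p)              # which cycle, and slot within it
    channel = slot_to_channel[slot]
    # Slots of one cycle that belong to this channel, in ascending order.
    owned = [s for s, c in enumerate(slot_to_channel) if c == channel]
    local_block = cycle * len(owned) + owned.index(slot)
    return channel, local_block * granularity + offset


SLOTS = [0, 0, 1, 1, 2, 3]  # Table 3: portions 0 and 1 -> channel 0, 2 and 3 -> channel 1, ...
for addr in (0, 64, 384, 448):
    print(addr, "->", to_channel_address(addr, SLOTS))
```

System addresses 0, 64, 384, and 448 are not contiguous as a set, yet under this sketch they occupy the contiguous local addresses 0, 64, 128, and 192 on channel 0.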
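Further, the identification bit of the address space of each partial access capacity is set in the low-order bits of the address of the memory access request. The low-order position may be the bit one position above the interleaving granularity. The identification bit of each address space may of course also be placed at another position.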
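For a channel that maps two portions, placing the identification bit one position above the interleaving granularity could look like the sketch below. The field layout (offset, then one identification bit, then the cycle index), the 6-bit granularity (64 bytes), and the names `pack_local_address` and `unpack_local_address` are all assumptions made for illustration.

```python
from typing import Tuple


def pack_local_address(cycle: int, ident: int, offset: int, granularity_bits: int = 6) -> int:
    """Build a channel-local address with the identification bit just above the granularity."""
    if ident not in (0, 1) or not 0 <= offset < (1 << granularity_bits):
        raise ValueError("ident must be 0 or 1 and offset must fit in the granularity")
    return (cycle << (granularity_bits + 1)) | (ident << granularity_bits) | offset


def unpack_local_address(local_addr: int, granularity_bits: int = 6) -> Tuple[int, int, int]:
    """Split a channel-local address back into (cycle, ident, offset)."""
    offset = local_addr & ((1 << granularity_bits) - 1)
    ident = (local_addr >> granularity_bits) & 1
    cycle = local_addr >> (granularity_bits + 1)
    return cycle, ident, offset


addr = pack_local_address(cycle=0, ident=1, offset=5)
print(addr, unpack_local_address(addr))
```

Keeping the identification bit low in the address means the two portions of a channel alternate at the granularity boundary rather than occupying two distant regions.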
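When the capacity of the address space is read, S205 may be performed to restore the original address and obtain the original data.
S205: Restore each address to its original address according to the N pieces of configuration information and the identification bit of the address space of each partial access capacity.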
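Address restoration can be sketched as the inverse of a channel-local packing. The sketch assumes the same hypothetical layout as a cycle-based packing: each division cycle is P blocks of one granularity, the slot-to-channel list follows Table 3, and the index of a block among its channel's slots plays the role of the identification bit. The name `to_system_address` and the 64-byte granularity are assumptions.

```python
from typing import List


def to_system_address(channel: int, local_addr: int, slot_to_channel: List[int], granularity: int = 64) -> int:
    """Recover the original system address from a channel and its local address."""
    p = len(slot_to_channel)
    owned = [s for s, c in enumerate(slot_to_channel) if c == channel]
    if not owned:
        raise ValueError("channel has no portions in the mapping")
    local_block, offset = divmod(local_addr, granularity)
    # ident selects which of the channel's portions the block belongs to.
    cycle, ident = divmod(local_block, len(owned))
    slot = owned[ident]
    return (cycle * p + slot) * granularity + offset


SLOTS = [0, 0, 1, 1, 2, 3]  # Table 3 mapping, one entry per portion of a cycle
print(to_system_address(0, 128, SLOTS))
print(to_system_address(2, 64, SLOTS))
```

Under these assumptions, local address 128 on channel 0 restores to system address 384, and local address 64 on channel 2 restores to system address 640.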
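The foregoing embodiments describe the method from the perspective of the CPU. It can be understood that, to implement the functions in the method provided in the embodiments of the present application, each network element, for example the CPU, includes a hardware structure and/or a software module corresponding to each function. A person skilled in the art should readily appreciate that, in combination with the algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by hardware driven by computer software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such an implementation should not be considered beyond the scope of this application.
In the embodiments of the present application, the CPU may be divided into functional modules according to the foregoing method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated modules may be implemented in the form of hardware or of software functional modules. It should be noted that the division into modules in the embodiments of the present application is schematic and is only a division by logical function. In actual implementation, there may be other divisions.
When functional modules are divided according to the corresponding functions, FIG. 4 shows a possible composition of the communication device involved in the foregoing embodiments. The communication device can perform the steps performed by the CPU in any of the method embodiments of the present application. As shown in FIG. 4, the communication device is a CPU or a communication device that supports a CPU in implementing the method provided in the embodiments; for example, the communication device may be a chip system. The communication device may include a processing unit 401.
The processing unit 401 is configured to support the communication device in executing the method described in the embodiments of the present application. For example, the processing unit 401 is configured to perform, or to support the communication device in performing, S201 and S202 in the memory interleaving method shown in FIG. 2, and S201 and S205 in the memory interleaving method shown in FIG. 3.
It should be noted that all related content of the steps in the foregoing method embodiments can be cited in the functional description of the corresponding functional module; details are not repeated here.
The communication device provided in the embodiments of the present application is configured to execute the method in any of the foregoing embodiments, and can therefore achieve the same effect as the method in the foregoing embodiments.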
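FIG. 5 shows a communication device 500 according to an embodiment of the present application, which is used to implement the functions of the CPU in the foregoing method. The communication device 500 may be a CPU or a device in a CPU. The communication device 500 may be a chip system. In the embodiments of the present application, a chip system may consist of a chip, or may include a chip and other discrete devices.
The communication device 500 includes at least one processor 501, configured to implement the functions of the CPU in the method provided in the embodiments of the present application. For example, the processor 501 may be used to perform S201 to S205. For details, refer to the detailed description in the method example; details are not repeated here.
The communication device 500 may further include at least one memory 502 for storing program instructions and/or data. The memory 502 and the processor 501 are coupled. Coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units, or modules. It may be electrical, mechanical, or in another form, and is used for information exchange between the devices, units, or modules. The processor 501 may operate in cooperation with the memory 502. The processor 501 may execute program instructions stored in the memory 502. At least one of the at least one memory may be included in the processor.
The communication device 500 may further include a communication interface 503 for communicating with other devices through a transmission medium, so that the devices in the communication device 500 can communicate with other devices. For example, if the communication device is a CPU, the other device is a memory. The processor 501 uses the communication interface 503 to send and receive data, and is used to implement the method executed by the CPU described in the embodiments corresponding to FIG. 2 and FIG. 3.
The embodiments of the present application do not limit the specific connection medium between the communication interface 503, the processor 501, and the memory 502. In FIG. 5, the communication interface 503, the processor 501, and the memory 502 are connected by a bus 504, which is drawn as a thick line. The connections between other components are shown only schematically and are not limiting. The bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in FIG. 5, but this does not mean that there is only one bus or one type of bus.
In the embodiments of the present application, the processor may be a general-purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
In the embodiments of the present application, the memory may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or a volatile memory, such as a random-access memory (RAM). The memory is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory in the embodiments of the present application may also be a circuit or any other device capable of implementing a storage function, for storing program instructions and/or data. The CPU involved in the embodiments of the present application may be the communication device shown in FIG. 4.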
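From the foregoing description of the implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, only the foregoing division into functional modules is used as an example. In practical applications, the foregoing functions can be assigned to different functional modules as required; that is, the internal structure of the apparatus is divided into different functional modules to complete all or part of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into modules or units is only a division by logical function; in actual implementation there may be other divisions. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may be one physical unit or multiple physical units; that is, they may be located in one place or distributed over multiple places. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The methods provided in the embodiments of the present application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are wholly or partially generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a terminal, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (such as by infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, an SSD).
The foregoing is only the specific implementation of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.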

Claims (10)

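  1. A memory interleaving method, comprising:
    dividing an access capacity into P partial access capacities according to N pieces of configuration information, wherein the P partial access capacities are of the same size, the N pieces of configuration information are configuration information of N memory channels, one piece of configuration information corresponds to one memory channel, the configuration information indicates the number of capacity portions that the memory channel maps, at least one of the N memory channels maps two portions of capacity, N denotes the total number of memory channels, and N is an integer greater than or equal to 2; and
    mapping the P partial access capacities to the N memory channels according to a configuration mapping table, wherein the configuration mapping table indicates a mapping relationship between capacities and memory channels.
  2. The memory interleaving method according to claim 1, wherein before the dividing an access capacity into P partial access capacities according to N pieces of configuration information, the method further comprises:
    generating the N pieces of configuration information and the configuration mapping table, wherein the N pieces of configuration information comprise M pieces of first configuration information and N-M pieces of second configuration information, the first configuration information comprises a memory channel identifier and a first indication identifier, the first indication identifier indicates that the memory channel corresponding to the memory channel identifier maps two portions of capacity, the second configuration information comprises a memory channel identifier and a second indication identifier, the second indication identifier indicates that the memory channel corresponding to the memory channel identifier maps one portion of capacity, and M is an integer greater than or equal to 1 and less than N.
  3. The memory interleaving method according to claim 1 or 2, wherein before the mapping the P partial access capacities to the N memory channels according to a configuration mapping table, the method further comprises:
    making contiguous the address spaces corresponding to non-contiguous partial access capacities that are mapped to the same memory channel.
  4. The memory interleaving method according to claim 3, wherein an identification bit of the address space of each partial access capacity is set in the low-order bits of the address of a memory access request.
  5. The memory interleaving method according to claim 4, wherein the method further comprises:
    restoring each address to its original address according to the N pieces of configuration information and the identification bit of the address space of each partial access capacity.
  6. A communication device, comprising a processing unit, wherein the processing unit is configured to implement the memory interleaving method according to any one of claims 1 to 5.
  7. A communication device, comprising at least one processor, a memory, a bus, and a transceiver, wherein the memory is configured to store a computer program, so that when the computer program is executed by the at least one processor, the memory interleaving method according to any one of claims 1 to 5 is implemented.
  8. A computer-readable storage medium, comprising computer software instructions, wherein when the computer software instructions run in a computer, the computer performs the memory interleaving method according to any one of claims 1 to 5.
  9. A computer program product containing instructions, wherein when the computer program product runs in a computer, the computer performs the memory interleaving method according to any one of claims 1 to 5.
  10. A chip system, comprising a processor configured to implement the memory interleaving method according to any one of claims 1 to 5.
PCT/CN2018/097807 2018-07-31 2018-07-31 Memory interleaving method and apparatus WO2020024113A1 (zh)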
PCT/CN2018/097807 2018-07-31 2018-07-31 一种内存交织方法及装置 WO2020024113A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201880096144.8A CN112513824B (zh) 2018-07-31 2018-07-31 Memory interleaving method and apparatus
EP18928427.6A EP3822796B1 (en) 2018-07-31 2018-07-31 Memory interleaving method and device
PCT/CN2018/097807 WO2020024113A1 (zh) 2018-07-31 2018-07-31 Memory interleaving method and apparatus
US17/162,287 US20210149804A1 (en) 2018-07-31 2021-01-29 Memory Interleaving Method and Apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/097807 WO2020024113A1 (zh) 2018-07-31 2018-07-31 Memory interleaving method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/162,287 Continuation US20210149804A1 (en) 2018-07-31 2021-01-29 Memory Interleaving Method and Apparatus

Publications (1)

Publication Number Publication Date
WO2020024113A1 true WO2020024113A1 (zh) 2020-02-06

Family

ID=69230979

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/097807 WO2020024113A1 (zh) 2018-07-31 2018-07-31 Memory interleaving method and apparatus

Country Status (4)

Country Link
US (1) US20210149804A1 (zh)
EP (1) EP3822796B1 (zh)
CN (1) CN112513824B (zh)
WO (1) WO2020024113A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113791822A (zh) * 2021-11-15 2021-12-14 沐曦集成电路(上海)有限公司 Memory access apparatus and method for multiple memory channels, and data processing device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115344506B (zh) * 2022-10-19 2023-06-16 瀚博半导体(上海)有限公司 Memory address mapping method, memory access method and apparatus, chip, and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
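CN104750557A (zh) * 2013-12-27 2015-07-01 华为技术有限公司 Memory management method and memory management apparatus
CN105446911A (zh) * 2014-05-29 2016-03-30 展讯通信(上海)有限公司 Memory access control method and apparatus for a terminal device
CN105452986A (zh) * 2013-08-08 2016-03-30 高通股份有限公司 System and method for memory channel interleaving with selective power or performance optimization
CN105518632A (zh) * 2013-09-27 2016-04-20 高通股份有限公司 Configurable spreading function for memory interleaving
CN106155912A (zh) * 2015-04-14 2016-11-23 扬智科技股份有限公司 Multi-channel memory and memory access method thereof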

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9495290B2 (en) * 2007-06-25 2016-11-15 Sonics, Inc. Various methods and apparatus to support outstanding requests to multiple targets while maintaining transaction ordering
US8661200B2 (en) * 2010-02-05 2014-02-25 Nokia Corporation Channel controller for multi-channel cache
US9110795B2 (en) * 2012-12-10 2015-08-18 Qualcomm Incorporated System and method for dynamically allocating memory in a memory subsystem having asymmetric memory components
US9424209B2 (en) * 2013-09-19 2016-08-23 Intel Corporation Dynamic heterogeneous hashing functions in ranges of system memory addressing space
US9141541B2 (en) * 2013-09-20 2015-09-22 Advanced Micro Devices, Inc. Nested channel address interleaving
US9465735B2 (en) * 2013-10-03 2016-10-11 Qualcomm Incorporated System and method for uniform interleaving of data across a multiple-channel memory architecture with asymmetric storage capacity
US10140223B2 (en) * 2016-06-27 2018-11-27 Qualcomm Incorporated System and method for odd modulus memory channel interleaving

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3822796A4 *

Also Published As

Publication number Publication date
EP3822796A4 (en) 2021-07-21
EP3822796B1 (en) 2023-01-18
CN112513824A (zh) 2021-03-16
US20210149804A1 (en) 2021-05-20
CN112513824B (zh) 2024-04-09
EP3822796A1 (en) 2021-05-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18928427

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018928427

Country of ref document: EP

Effective date: 20210212