WO2023217255A1 - Data processing method and device, processor and computer system - Google Patents

Data processing method and device, processor and computer system

Info

Publication number
WO2023217255A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
memory space
address
processor core
processor
Prior art date
Application number
PCT/CN2023/093749
Other languages
French (fr)
Chinese (zh)
Inventor
王官皓
杨瑞
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2023217255A1 publication Critical patent/WO2023217255A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system

Definitions

  • the present application relates to the field of computers, and in particular, to a data processing method, device, processor and computer system.
  • Computer systems include traditional memory (such as dynamic random-access memory (DRAM)) and extended memory (such as storage-class memory (SCM) and solid-state disks (SSD)); both the traditional memory and the extended memory are used as the main memory of the computing system to expand the storage capacity of the traditional memory.
  • The processor (Central Processing Unit, CPU) uses virtual memory technology to access the memory; that is, the processor accesses the memory according to the physical address (Physical Address) corresponding to the virtual address (Virtual Memory Address).
  • the same CPU can include multiple processor cores (cores), which are used to process different memory access requests.
  • When a processor core performs the data processing of a memory access request on an address of the memory space, it operates on the metadata that describes the attributes of the data in memory. Multiple processor cores can access the memory storage space indicated by the physical address associated with the same virtual address and operate on the same metadata. Metadata is therefore globally shared in memory, and multiple processor cores cannot operate on it at the same time; otherwise the memory seen by different processor cores would become inconsistent.
  • When the first processor core operates on the metadata, the processor cores other than the first processor core need to wait for the first processor core to complete the operation of the above access request. This access conflict lengthens memory access time and ultimately reduces the processing performance of the processor. Therefore, how to provide a more efficient memory access method has become an urgent technical problem to be solved.
  • the present application provides a data processing method, device, processor and computer system, thereby improving the memory access efficiency of the processor.
  • a data processing method is provided, which is suitable for a computer system including a processor and a memory, and is executed by a first processor core of the processor.
  • the above data processing method includes: the first processor core obtains a first memory access request.
  • the first memory access request is used to instruct the first processor core to perform data processing on the data stored in the first memory space.
  • If the first memory space is storage space of the memory associated with the first attraction domain (Attraction Zone) of the first processor core, the first processor core determines the address of the first memory space and performs the data processing of the first memory access request according to that address.
  • The first processor core performs the data processing of the first memory access request on the address of the first memory space, and the first processor core is associated with the first memory space through the attraction domain, so that processor cores other than the first processor core cannot perform data processing on the address of the first memory space. Therefore, different processor cores perform data processing on addresses in different memory spaces, and different processor cores operate on different metadata when performing data processing.
  • the operations of metadata between multiple processor cores do not affect each other, which is equivalent to the metadata in the memory being divided into multiple parts that are isolated from each other. Multiple processor cores can simultaneously operate metadata in their respective attraction domains.
  • When the first processor core performs the data processing of a memory access request on an address of the memory space, the processor cores other than the first processor core do not need to wait for the first processor core to complete its memory access and metadata operations; they can operate on metadata outside the first attraction domain to complete their own memory accesses. This avoids access conflicts when multiple processor cores perform memory access in parallel, reduces the overall duration of memory access, and improves the efficiency of memory access.
  • the first attraction domain includes an association relationship between a memory space address and an attraction domain identifier.
  • the attraction domain identifier is used to indicate the attraction domain to which the association relationship belongs.
  • The memory space address is the address of the memory space associated with the attraction domain to which the association relationship belongs.
  • The processor core determines whether a memory space is storage space of the memory associated with the attraction domain of the processor core based on the attraction domain identifier and the memory space address contained in the association relationship, which allows the processor core to judge the association between a memory space and an attraction domain efficiently.
  • the association is a page table entry
  • the attraction domain identifier is set in a reserved bit of the page table entry.
  • the processor can identify the attraction domain identifier without modifying the microarchitecture.
  • the attraction domain identifier includes a processor core identifier.
  • the processor core identifier in the association relationship can indicate that the association relationship belongs to the attraction domain of the processor core represented by the processor core identifier.
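  • The idea of carrying the attraction domain identifier in the reserved bits of a page table entry can be sketched as follows. This is a minimal illustration only: the 64-bit layout, the bit positions (`PFN_SHIFT`, `AZ_SHIFT`), the field widths and the `NO_DOMAIN` sentinel are assumptions, since the application does not specify them.

```python
# Hypothetical 64-bit page table entry layout; all bit positions are assumptions.
VALID_BIT = 1 << 0            # entry is present in memory
PFN_SHIFT = 12                # physical frame number starts at bit 12
PFN_MASK = (1 << 40) - 1      # 40-bit physical frame number
AZ_SHIFT = 52                 # reserved bits 52..59 carry the attraction domain id
AZ_MASK = (1 << 8) - 1
NO_DOMAIN = AZ_MASK           # sentinel: entry belongs to no attraction domain


def make_pte(pfn: int, az_id: int = NO_DOMAIN, valid: bool = True) -> int:
    """Pack a frame number and an attraction domain id into one entry."""
    pte = (pfn & PFN_MASK) << PFN_SHIFT
    pte |= (az_id & AZ_MASK) << AZ_SHIFT
    if valid:
        pte |= VALID_BIT
    return pte


def pte_pfn(pte: int) -> int:
    """Extract the physical frame number."""
    return (pte >> PFN_SHIFT) & PFN_MASK


def pte_az_id(pte: int) -> int:
    """Extract the attraction domain id from the reserved bits."""
    return (pte >> AZ_SHIFT) & AZ_MASK


def set_az_id(pte: int, az_id: int) -> int:
    """Return a copy of the entry with its attraction domain id replaced."""
    cleared = pte & ~(AZ_MASK << AZ_SHIFT)
    return cleared | ((az_id & AZ_MASK) << AZ_SHIFT)


def belongs_to(pte: int, core_id: int) -> bool:
    """Here the attraction domain id is simply the owning core's id."""
    return pte_az_id(pte) == core_id
```

  • Because the identifier occupies bits that the hardware otherwise ignores, only software that interprets the entry needs to know about it, which matches the statement that the microarchitecture does not have to change.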
  • the first memory access request may be generated by the first processor core when a thread executed by the first processor core needs to read and write data in the first memory space.
  • Alternatively, the first memory access request may be generated by a processor core other than the first processor core, when a thread executed by that processor core needs to read or write data in the first memory space, and sent to the first processor core. After the first processor core performs the data processing of the first memory access request according to the address of the first memory space, it sends the obtained data to the processor core that sent the first memory access request. This prevents a processor core from performing data processing on memory space that is not associated with its own attraction domain, which strengthens the isolation between the memory spaces associated with different attraction domains.
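  • The forwarding behaviour described above can be sketched as a small dispatch routine. The names `dispatch`, `read`, `owner_of` and `memory` are hypothetical, pages are identified by plain integers, and the rule that an unowned page is handled by the requesting core is an assumption made for this illustration.

```python
from typing import Dict, Tuple


def dispatch(owner_of: Dict[int, int], requester: int, page: int) -> int:
    """Return the id of the core that should execute a request for `page`.

    `owner_of` maps a page to the core whose attraction domain contains it.
    Pages absent from the map belong to no attraction domain; in this sketch
    the requesting core handles them itself (an assumption).
    """
    owner = owner_of.get(page)
    if owner is None or owner == requester:
        return requester
    return owner  # forward to the core that owns the page


def read(memory: Dict[int, str], owner_of: Dict[int, int],
         requester: int, page: int) -> Tuple[int, str]:
    """Execute a read on the owning core and hand the data back to the requester."""
    executor = dispatch(owner_of, requester, page)
    data = memory[page]  # performed by `executor`, never by a foreign core
    return executor, data
```

  • For example, with `owner_of = {10: 1}`, a read of page 10 issued by core 0 is executed by core 1, and the data is returned to core 0.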
  • the processor core can also dynamically adjust the associations contained in the attraction domain. For example, the first processor core adds an association relationship that does not belong to any attraction domain to the first attraction domain.
  • The first processor core may add an association relationship that does not belong to any attraction domain to the first attraction domain as follows: the first processor core obtains a second memory access request, which is used to instruct the first processor core to perform data processing on the data stored in the second memory space. When the second memory space is not associated with any attraction domain and the cumulative number of second memory access requests is greater than a preset threshold, the first processor core adds the association relationship containing the address of the second memory space to the first attraction domain.
  • Adding the association relationship containing the address of the second memory space to the first attraction domain may include: setting the attraction domain identifier of that association relationship to the attraction domain identifier of the first attraction domain.
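  • A minimal sketch of the expansion rule, assuming a per-core counter of accesses to unowned pages. The class name `ExpansionPolicy`, the `owner_of` map and the choice to reset the counter after adoption are hypothetical details not taken from the application.

```python
from collections import Counter
from typing import Dict


class ExpansionPolicy:
    """Adopt an unowned page once this core has accessed it often enough."""

    def __init__(self, core_id: int, threshold: int) -> None:
        self.core_id = core_id
        self.threshold = threshold
        self.hits: Counter = Counter()  # cumulative requests per unowned page

    def on_access(self, owner_of: Dict[int, int], page: int) -> bool:
        """Record one access; return True if the page was just adopted."""
        if page in owner_of:
            return False  # already belongs to some attraction domain
        self.hits[page] += 1
        if self.hits[page] > self.threshold:
            # Equivalent to writing this core's attraction domain
            # identifier into the association relationship.
            owner_of[page] = self.core_id
            del self.hits[page]
            return True
        return False
```

  • With `threshold=2`, the first two accesses to an unowned page leave it unowned and the third access adds it to the core's attraction domain.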
  • As another example, the first processor core migrates an association relationship belonging to the first attraction domain to an attraction domain outside the first attraction domain.
  • The first processor core may migrate an association relationship belonging to the first attraction domain to an attraction domain outside the first attraction domain as follows: the first processor core obtains a third memory access request, which is used to instruct the first processor core to perform data processing on the data stored in the third memory space; the third memory access request is generated by a processor core other than the first processor core and sent to the first processor core.
  • The first processor core adds the association relationship containing the address of the third memory space to the attraction domain of the processor core that has sent the third memory access request the most times.
  • Adding the association relationship containing the address of the third memory space to the attraction domain of the processor core that has sent the third memory access request the most times includes: setting the attraction domain identifier of that association relationship to the attraction domain identifier of the attraction domain of that processor core.
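  • The migration rule can be sketched with a per-page tally of remote senders. The helper names `record_remote_request` and `migrate` and the `senders` structure are hypothetical, and the application does not state when migration is triggered, so in this sketch the caller decides.

```python
from collections import Counter
from typing import Dict, Optional


def record_remote_request(senders: Dict[int, Counter], page: int,
                          sender_id: int) -> None:
    """Count one forwarded request for `page` from core `sender_id`."""
    senders.setdefault(page, Counter())[sender_id] += 1


def migrate(owner_of: Dict[int, int], senders: Dict[int, Counter],
            page: int) -> Optional[int]:
    """Hand `page` to the core that sent the most requests for it.

    Returns the owner after the call. If no remote requests were
    recorded, ownership is left unchanged.
    """
    counts = senders.get(page)
    if not counts:
        return owner_of.get(page)
    new_owner, _ = counts.most_common(1)[0]
    owner_of[page] = new_owner  # rewrite the attraction domain identifier
    del senders[page]
    return new_owner
```

  • For example, if page 9 is owned by core 0 and receives two requests from core 2 and one from core 1, `migrate` reassigns page 9 to core 2.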
  • The processor core thus expands and shrinks its attraction domain, so that, according to how frequently the processor core accesses different memory spaces, the memory spaces that the processor core is more likely to access are associated with its attraction domain. This reduces the probability that an obtained memory access request instructs the processor core to access memory space associated with an attraction domain that is not its own, thereby improving the utilization of the processing resources of the processor core.
  • the association relationship of the first attraction domain may be stored in a storage unit connected to the first processor core, and the storage unit is used to store and query the association relationship contained in the first attraction domain.
  • The first memory access request includes the virtual address of the first memory space, and the association relationship includes the physical address of the first memory space. The first processor core may determine the address of the first memory space as follows: the first processor core queries the storage unit for an association relationship according to the virtual address of the first memory space, obtains the first association relationship, and determines the physical address of the first memory space based on the first association relationship.
  • the storage unit may be an address translation unit.
  • The first processor core may also delete, at preset intervals, the association relationships in the storage unit that do not belong to the first attraction domain.
  • When the first processor core determines the physical address of a memory space based on an association relationship, regardless of whether the association relationship belongs to no attraction domain, to the first attraction domain, or to an attraction domain outside the first attraction domain, no processor core needs to stop its threads to update the storage unit, which improves memory access performance.
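  • The periodic cleanup of the storage unit can be sketched as below. The storage unit is modelled as a plain dictionary from virtual address to a `(physical address, attraction domain id)` pair, with `None` standing for "no attraction domain"; this representation and the function name `purge_foreign_entries` are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

# virtual address -> (physical address, attraction domain id or None)
Tlb = Dict[int, Tuple[int, Optional[int]]]


def purge_foreign_entries(tlb: Tlb, core_id: int) -> int:
    """Drop every cached entry that is not in this core's attraction domain.

    Intended to be called once per preset period. Returns the number of
    entries removed.
    """
    foreign = [va for va, (_pa, az_id) in tlb.items() if az_id != core_id]
    for va in foreign:
        del tlb[va]
    return len(foreign)
```

  • Each core only touches its own storage unit here, which is consistent with the statement that other cores do not have to stop their threads.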
  • a second aspect provides a data processing device, which includes various modules for executing the data processing method in the first aspect or any possible implementation of the first aspect.
  • A third aspect provides a processor configured to execute the operation steps of the data processing method in the first aspect or any possible implementation of the first aspect.
  • In a fourth aspect, a computer system is provided, which includes a memory and the processor of the third aspect. The processor is configured to execute the operation steps of the data processing method in the first aspect or any possible implementation of the first aspect, so as to perform the data processing of the memory access request on the address of the memory storage space.
  • A computer-readable storage medium is provided, including computer software instructions; when the computer software instructions are run on a computing device, the computing device is caused to execute the operation steps of the method described in the first aspect or any possible implementation of the first aspect.
  • A computer program product is provided. When the computer program product is run on a computing device, it causes the computing device to perform the operation steps of the method described in the first aspect or any possible implementation of the first aspect.
  • Figure 1 is a schematic diagram of a multi-layer storage system provided by an embodiment of the present application.
  • Figure 2 is an architectural schematic diagram of a computer system provided by an embodiment of the present application.
  • Figure 3 is a schematic flow chart of a data processing method provided by an embodiment of the present application.
  • Figure 4 is a schematic structural diagram of a page table entry provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of data processing steps provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of another data processing step provided by the embodiment of the present application.
  • Figure 7 is a schematic structural diagram of an attraction domain provided by an embodiment of the present application.
  • Figure 8 is a schematic flowchart of an attraction domain expansion step provided by an embodiment of the present application.
  • Figure 9 is a schematic flowchart of steps for reducing an attraction domain provided by an embodiment of the present application.
  • Figure 10 is a schematic structural diagram of a data processing device provided by an embodiment of the present application.
  • the processor's cache can be divided into Level 1 cache, Level 2 cache, and Level 3 cache.
  • The computer system can also be configured with main memory (or memory), for example, using random access memory (Random Access Memory, RAM), dynamic random access memory (Dynamic Random Access Memory, DRAM), a solid-state drive, or a mechanical hard disk (hard disk drive, HDD) as the main memory of the computer system.
  • Each type of hard disk has different performance.
  • For example, the data storage speed of an SSD is higher than that of a mechanical hard disk. If data that is accessed frequently and has high performance requirements is placed on a hard disk with high read and write performance, and data that is accessed infrequently or has low performance requirements but was originally stored on a high-performance hard disk is moved to a low-performance hard disk, the overall performance of the hard disks can be improved.
  • FIG. 1 is a schematic diagram of a multi-layer storage system provided by the present application. From the first layer to the third layer, the storage capacity increases step by step, the access speed decreases step by step, and the cost decreases step by step.
  • the first level includes a register 111 , a first-level cache 112 , a second-level cache 113 and a third-level cache 114 located in the processor 210 .
  • The memory contained in the second level can be used as the main memory of the computer system, that is, memory; for example, dynamic random access memory 121, double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM) 122, and storage-class memory (SCM) 123.
  • Main memory may be referred to simply as memory or internal memory; it is the memory that exchanges information with the CPU.
  • The memory contained in the third level can be used as auxiliary memory of the computer system, that is, external memory; for example, network storage 131, solid-state drive (Solid State Disk, SSD) 132, and hard disk drive 133.
  • Auxiliary memory may be referred to simply as auxiliary storage or external storage, that is, the hard disk in this embodiment.
  • Compared with main memory, hard disks have large storage capacity and slow access speeds. It can be seen that the closer a memory is to the CPU, the smaller its capacity, the faster its access speed, the greater its bandwidth, and the lower its latency. Therefore, the memory contained in the third level stores data that is not frequently accessed by the CPU, improving data reliability.
  • the memory contained in the second level can be used as a cache device to store data frequently accessed by the CPU, significantly improving the access performance of the system.
  • the operating system of the computer system uses virtual memory technology, which uses part of the hard disk space as memory storage space.
  • the operating system provides users with a virtual memory (Virtual Memory) through virtual memory technology.
  • the operating system stores part of the program data in the memory, while leaving the remaining data on the hard disk, and then the program can be started.
  • the operating system swaps the required part of the data into the memory, and then continues to execute the program.
  • the operating system swaps out the data in the memory that is not temporarily accessed by the program to the hard disk, thereby freeing up space to store the data that will be transferred into the memory.
  • the system uses virtual memory technology to provide a memory much larger than the physical memory, that is, virtual memory, which logically implements memory expansion.
  • The operating system uses virtual memory technology to set up a "contiguous" virtual storage space, divides the virtual storage space into multiple pages (Page) with contiguous address ranges, and maps these pages to physical memory. While the program runs, pages of the virtual storage space are moved into physical memory.
  • When a program accesses an address space that is in physical memory, the hardware maps the virtual address to the physical address; when a program references an address space that is not in physical memory, the operating system is responsible for loading the missing part into physical memory and re-executing the failed instruction.
  • Data is stored in units of bytes in storage (for example, memory or a hard disk).
  • A physical address (Physical Address) is the address used for data access.
  • Logical address (Logical Address): the address given by a memory access instruction (also called an internal-memory access instruction) is called the logical address, also called the relative address. That is, it is the part of a machine language instruction that specifies the address of an operand or of an instruction.
  • the logical address can be composed of a segment selector plus an offset address relative to the specified segment. The logical address needs to be calculated or transformed by the addressing mode to obtain the actual effective address in the internal memory, that is, the physical address.
  • A linear address (Linear Address) is the intermediate layer in the conversion from logical address to physical address.
  • the program code will generate a logical address, or an offset address in the segment, which is added to the base address of the corresponding segment to generate a linear address.
  • the virtual address is the intra-segment offset address in the logical address.
  • Pages correspond to page frames in physical memory.
  • Page frames are the blocks into which physical memory is divided. Pages and page frames are generally the same size, such as 4 KB, 8 KB, 16 KB or even larger. In practice, page sizes in computer systems generally range from 512 bytes to 1 GB. This is the paging technique in virtual memory technology.
  • the page table is a one-to-one relationship table between pages and page frames, and plays an indexing role when querying the page frame corresponding to a page.
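  • The indexing role of the page table can be illustrated with a single-level lookup. This is a simplified sketch: the 4 KB page size is one of the sizes mentioned above, while the dictionary-based `page_table` and the function name `translate` are assumptions for illustration.

```python
from typing import Dict

PAGE_SIZE = 4096  # 4 KB pages


def translate(page_table: Dict[int, int], vaddr: int) -> int:
    """Translate a virtual address using a page -> page frame table.

    Raises KeyError when the page has no frame, which stands in for a
    page fault in this sketch.
    """
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[page]
    return frame * PAGE_SIZE + offset
```

  • For example, with `page_table = {2: 5}`, the virtual address `2 * 4096 + 16` lies in page 2 at offset 16 and translates to `5 * 4096 + 16`.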
  • The correspondence between a page and a page frame in the page table is called a page table entry.
  • the page table consists of multiple page table entries. The structure of the page table entries depends on the machine architecture, but they are basically the same.
  • A page table entry contains an identifier, data and auxiliary information. The identifier is the virtual address, and the data is the physical address to which the virtual address maps.
  • the auxiliary information includes valid bits, modification bits, access bits and reserved bits. The valid bit indicates whether the page currently exists in memory. When a thread executed by the processor or processor core attempts to access a page that does not exist in memory, a page fault interrupt will be caused.
  • the protection bit indicates the type of access allowed to the page, such as read-write or read-only.
  • The modification bit and the access bit are introduced to record page usage and are used in page replacement. For example, when a memory page is modified by the program, the hardware automatically sets the modification bit. If the program later encounters a page fault and the page replacement algorithm needs to swap out this page to make room for the page being swapped in, it first checks the modification bit. If the bit is set, the page has been modified, that is, it is a dirty page, and the latest page content needs to be written back to the hard disk. Otherwise, the copies in memory and on the hard disk are already synchronized.
  • The access bit is likewise set automatically by the system when the program accesses the page, and it is also used by the page replacement algorithm. The operating system decides whether to evict a page based on whether the page is being accessed; generally speaking, pages that are no longer used are more suitable for eviction.
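  • How a replacement routine might consult the two bits can be sketched as follows. The `Page` type, the `evict` function and the simple "first page that was not accessed" victim choice are assumptions for illustration; real replacement algorithms are more elaborate.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    number: int
    accessed: bool = False  # access bit
    modified: bool = False  # modification (dirty) bit


def evict(pages: List[Page], write_back: Callable[[Page], None]) -> Page:
    """Pick a victim page, write it back if dirty, and remove it.

    Prefers a page whose access bit is clear; falls back to the first
    page if every page has been accessed.
    """
    victim = next((p for p in pages if not p.accessed), pages[0])
    if victim.modified:
        write_back(victim)  # dirty page: hard disk copy is stale
    pages.remove(victim)
    return victim
```

  • A clean victim is simply dropped, because its copy on the hard disk is already up to date.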
  • the reserved bits are used to store other auxiliary information, such as the attraction domain identification in this embodiment.
  • Embodiments of the present application provide a data processing method.
  • the first processor core obtains a first memory access request.
  • The first memory access request is used to instruct the first processor core to perform data processing on the data stored in the first memory space.
  • If the first memory space is storage space of the memory associated with the first attraction domain of the first processor core, then after determining the address of the first memory space, the first processor core performs the data processing of the first memory access request according to that address.
  • A processor core is associated, through its attraction domain, with the memory space associated with that attraction domain, and in turn with the metadata used to describe the attributes of the data stored in that memory space, so that multiple processor cores can operate in parallel on the metadata associated with their respective attraction domains and process the data in the corresponding memory spaces. This avoids access conflicts when multiple processor cores perform memory access in parallel, reduces the overall duration of memory access, and improves the efficiency of memory access.
  • the data processing method in the embodiment of the present application may also be executed by a processor. If the data processing method is executed by the processor, the difference from the method executed by the first processor core is that the first memory space is the storage space of the memory associated with the first attraction domain of the processor.
  • Figure 2 is a schematic architectural diagram of a computer system provided by an embodiment of the present application. As shown in FIG. 2 , computer system 200 includes processor 210 , memory 220 and bus 230 .
  • The processor 210 can be a graphics processing unit (GPU), a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor can be a microprocessor or any conventional processor, etc.
  • Processor 210 includes at least one processor core.
  • the processor core also known as the core, is the most important component of the processor.
  • Various processor cores have fixed logical structures, such as first-level cache, second-level cache, execution unit, instruction level unit, bus interface and other logical units.
  • Each processor core of the processor 210 has a unique processor core identifier (identifier, id) in the operating system.
  • the processor 210 includes a processor core 211 , a memory management unit (Memory Management Unit, MMU) 212 and a processor core 215 .
  • the memory management unit 212 is connected to the memory 220 and the hard disk through the bus 230.
  • The hard disk in this embodiment can also be called external memory relative to the memory 220.
  • the memory management unit 212 is a hardware unit in the processor or processor core.
  • The memory management unit 212 includes an address translation unit 213 and a page table traversal unit (Table Walk Unit, TWU) 214.
  • the memory management unit 212 is mainly used to manage the control lines of virtual memory and physical memory, and map virtual addresses to physical addresses. Typically there is one memory management unit per processor core.
  • the processor core 211 may be connected to one or more storage units.
  • the first storage unit is the address translation unit 213, and the second storage unit is the memory 220.
  • the memory management unit 212 may be provided within the processor core 211 .
  • the address translation unit 213 in this embodiment may also be called an address translation cache (Translation Lookaside Buffer, TLB).
  • The address translation unit 213 is used to store the correspondence between virtual addresses and physical addresses, that is, page table entries. If the virtual address requested by the CPU exists in the correspondences stored in the address translation unit 213, the address translation unit 213 quickly matches the page table entry containing the virtual address of the access request and sends that page table entry to the CPU, and the CPU accesses the memory 220 according to the physical address contained in the obtained page table entry. Without the memory management unit 212, the processor core 211 would need to obtain the page table entry from the memory 220 and then obtain the physical address from it. With the address translation unit 213, the memory management unit 212 can obtain the physical address without accessing the memory 220, which speeds up the translation from virtual address to physical address.
  • The process by which the address translation unit 213 matches the page table entry of a virtual address includes: the memory management unit 212 extracts the virtual address from the access request and sends it to the address translation unit 213; the address translation unit 213 queries the storage space that it can access, in which page table entries are stored, for a page table entry matching the virtual address. If one exists, the address translation unit hits (TLB hit); if not, the address translation unit misses (TLB miss).
  • if the page table entry queried by the memory management unit 212 in the address translation unit 213 is invalid, for example because the valid bit of the page table entry indicates that the page table entry is invalid, this may also be treated as an address translation unit miss.
  • when the address translation unit misses, the page table traversal unit 214 queries and obtains the page table entry in the memory 220.
  • the process in which the page table traversal unit 214 queries and obtains page table entries in the memory 220 includes: the page table traversal unit 214 performs a page table traversal (Translation Table Walk) step, that is, it accesses the memory 220 multiple times according to multi-level page tables such as the first-level page table and the second-level page table to obtain the address of the page table entry, and then obtains the page table entry in the memory 220 according to the address of the page table entry.
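  • the hit/miss behaviour of an address translation unit described above can be illustrated with a minimal Python sketch. The class name `Tlb`, the `walk` callback (standing in for the page table traversal unit) and the 4 KiB page size are assumptions made for illustration and do not come from this application.

```python
from typing import Callable, Dict

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class Tlb:
    """Toy address translation unit: virtual page number -> physical frame number."""

    def __init__(self, walk: Callable[[int], int]) -> None:
        self.entries: Dict[int, int] = {}
        self.walk = walk      # hypothetical stand-in for the page table traversal unit
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr: int) -> int:
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.entries:           # TLB hit: no access to the page tables
            self.hits += 1
        else:                             # TLB miss: walk the page tables in memory
            self.misses += 1
            self.entries[vpn] = self.walk(vpn)
        return (self.entries[vpn] << PAGE_SHIFT) | offset


page_table = {0x10: 0x2A}                 # hypothetical mapping held in memory
tlb = Tlb(walk=lambda vpn: page_table[vpn])
first = tlb.translate(0x10123)            # miss, followed by a fill
second = tlb.translate(0x10456)           # hit on the same page
```

  • in this sketch every walked entry is cached; the embodiment described later loads an entry only after checking its attraction domain.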
  • the memory management unit 212 of the processor 210 may be an integrated memory controller (IMC).
  • the processor core 211 can obtain the first memory access request when the program needs to access the first data.
  • the first memory access request is used to instruct the processor core 211 to perform data processing on the data stored in the first memory space.
  • the processor core 211 executes the data processing of the first memory access request according to the address of the first memory space.
  • the above-mentioned first memory access request may be generated by the processor core 211 when a thread executed by the processor core 211 needs to access the first data.
  • the first memory access request may be a data access instruction.
  • the data access instruction is a Load/Store instruction.
  • the Load/Store instruction is used to instruct data transfer between the register and the memory.
  • the Load/Store instruction includes the virtual address of the data that the processor core 211 wants to access.
  • the above-mentioned first memory access request may also be generated by a processor core other than the processor core 211 (for example, the processor core 215) when a thread executed by that processor core needs to access the first data, and then sent by that processor core to the processor core 211.
  • the first memory access request may be a proxy instruction, used to instruct the processor core 211 to perform data processing on the data stored in the first memory space, and send the obtained data to the processor core that generated the first memory access request.
  • the memory 220 and the hard disk are memories connected to the processor 210 .
  • Memory is a memory device used to store programs and various data. The larger the memory capacity, the slower the access speed. On the contrary, the smaller the memory capacity, the faster the access speed.
  • Access speed refers to the data transfer speed when writing data or reading data to the memory. Access speed can also be called read and write speed. Memory can be divided into different levels based on storage capacity and access speed.
  • the memory 220 can store multi-level page tables.
  • the first-level page table is used to store the first-level page table index
  • the second-level page table is used to store the second-level page table index
  • the third-level page table is used to store the third-level page table index
  • the fourth-level page table is used to store page table entries.
  • the dotted line between the dotted box and the memory 220 indicates that level one to level four page tables are stored in the memory 220 .
  • the first-level page table can be Page Map Level 4 (PML4)
  • the second-level page table can be Page Directory Pointer Table (PDPT)
  • the third-level page table can be a page directory (Page Directory, PD)
  • the fourth-level page table can be a page table (Page Table, PT)
  • the page table stores one or more page table entries.
  • when the processor core 211 queries the page table entries in the multi-level page table according to the virtual address, it sequentially queries the first-level page table, the second-level page table, the third-level page table and the fourth-level page table according to the first-level page table index, the second-level page table index and the third-level page table index, obtains the address of the page table entry in the memory 220, and obtains the page table entry in the memory 220 according to that address.
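  • as a hedged illustration of the four-level query, the following Python sketch splits a virtual address into the four table indexes and follows them. The 9-bit index width and 4 KiB page size follow conventional x86-64 four-level paging, which matches the PML4/PDPT/PD/PT names used here but is not stated by this application; the nested dictionaries are a stand-in for page tables held in memory.

```python
from typing import Dict, Tuple

INDEX_BITS = 9    # assumption: 512 entries per table, as in x86-64 4-level paging
PAGE_SHIFT = 12   # assumption: 4 KiB pages


def split_virtual_address(vaddr: int) -> Tuple[int, int, int, int, int]:
    """Return (pml4_index, pdpt_index, pd_index, pt_index, page_offset)."""
    mask = (1 << INDEX_BITS) - 1
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    pt = (vaddr >> PAGE_SHIFT) & mask
    pd = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & mask
    pdpt = (vaddr >> (PAGE_SHIFT + 2 * INDEX_BITS)) & mask
    pml4 = (vaddr >> (PAGE_SHIFT + 3 * INDEX_BITS)) & mask
    return pml4, pdpt, pd, pt, offset


def walk(pml4_table: Dict[int, dict], vaddr: int) -> int:
    """Follow the four levels and return the physical address."""
    pml4, pdpt, pd, pt, offset = split_virtual_address(vaddr)
    frame = pml4_table[pml4][pdpt][pd][pt]   # one lookup per level
    return (frame << PAGE_SHIFT) | offset


vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
tables = {1: {2: {3: {4: 0x77}}}}            # hypothetical page table contents
paddr = walk(tables, vaddr)
```

  • each dictionary lookup corresponds to one memory access during the traversal, which is why a hit in the address translation unit saves several memory accesses.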
  • the memory 220 may also store metadata, which is information used to describe the data attributes (properties) of the physical pages of the memory space in the memory 220 .
  • active list is used to maintain active physical pages
  • free list is used to maintain idle physical pages.
  • the LRU list is used to maintain how recently the active physical pages have been accessed.
  • the least recently used strategy (Least Recently Used, LRU) is a commonly used page replacement algorithm, which selects the least recently used pages for eviction. LRU uses a linked list to maintain the access status of each piece of data in the cache and adjusts the position of the data in the linked list based on real-time accesses. The position of the data in the linked list then indicates whether the data has been accessed recently or has not been accessed for some time.
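  • the LRU strategy can be sketched in a few lines of Python. This is a generic illustration built on the standard library's `OrderedDict`, not the list structure of this application; the class name `LruList` and the capacity of two pages are assumptions.

```python
from collections import OrderedDict
from typing import Optional


class LruList:
    """Tracks pages in access order; the first item is the least recently used."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.pages: "OrderedDict[int, None]" = OrderedDict()

    def touch(self, page: int) -> Optional[int]:
        """Record an access and return the evicted page, if any."""
        if page in self.pages:
            self.pages.move_to_end(page)     # most recently used moves to the end
            return None
        self.pages[page] = None
        if len(self.pages) > self.capacity:
            victim, _ = self.pages.popitem(last=False)   # evict least recently used
            return victim
        return None


lru = LruList(capacity=2)
lru.touch(1)
lru.touch(2)
lru.touch(1)              # page 1 becomes the most recently used
evicted = lru.touch(3)    # page 2 is now the least recently used
```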
  • the bus 230 can be different types of buses for different connection objects.
  • the bus 230 between the processor 210 and the memory 220 may be a DDR3 interface standard bus
  • the bus 230 between the processor 210 and the external memory may be a Peripheral Component Interconnect Express (PCIe) interface standard bus.
  • the external memory can be a PCIe SSD. A PCIe SSD (PCIe solid state drive) is a high-speed expansion card that connects the computer system 200 to its peripheral devices.
  • PCIe SSD takes the form of a plug-in card built into the computer.
  • PCIe uses a multi-queue method in data transmission, which can achieve the purpose of concurrent data transmission on a single disk and improve the efficiency of the data interface.
  • the data processing method provided in this embodiment can be applied in memory expansion scenarios, such as memory expansion scenarios implemented through virtual memory technology.
  • the data processing method in the embodiment of the present application can be applied in scenarios such as memory expansion of data center servers and memory expansion of cloud computing servers.
  • the server's operating system assigns the thread that processes a request, generated by a user terminal (such as a mobile phone, computer or tablet computer) or by the server itself, to the processor core 211, and the processor core 211 generates a first memory access request according to the thread.
  • the first memory access request is used to instruct the processor core 211 to perform data processing on the data stored in the first memory space.
  • the first memory space is the storage space of the memory associated with the first attraction domain of the processor core 211, wherein each processor core is associated with at least one attraction domain, and each attraction domain is associated with only one processor core.
  • the processor core 211 determines the address of the first memory space according to the first memory access request, and then performs the data processing of the first memory access request according to the address of the first memory space.
  • the processor core is associated with a memory space through the attraction domain, and the attraction domain is in turn associated with metadata, so that when multiple processor cores of the server perform memory access in parallel, each of them operates on the metadata associated with its own attraction domain and processes the data in the corresponding memory space.
  • this avoids access conflicts when multiple processor cores access memory in parallel and reduces the overall duration of memory access. Therefore, when the data processing method provided by this embodiment is applied to memory expansion of data center servers and cloud computing servers with heavy concurrent access, it greatly improves the efficiency of concurrent memory access and reduces the overall latency of concurrent access.
  • FIG. 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application.
  • the positional relationship between the devices, modules, etc. shown in FIG. 2 does not constitute any limitation.
  • the following takes as an example the case where the first processor core is the processor core 211 in FIG. 2 and the processor core 211 accesses the first data.
  • Step 310 The processor core 211 obtains the first memory access request.
  • the first memory access request may be generated when the processor core 211, or a processor core other than the processor core 211, needs to read/write the first memory space in the memory 220 during the execution of a thread.
  • the first memory access request includes the virtual address of the first storage space.
  • the first memory access request is used to instruct the processor core 211 to perform data processing on the data stored in the first memory space.
  • the first memory space is the storage space of the memory 220 associated with the first attraction domain of the processor core 211 .
  • the first attraction domain includes an association relationship between a memory space address associated with the processor core 211 and an attraction domain identifier.
  • the attraction domain identifier of an association relationship in the first attraction domain is used to globally and uniquely indicate the first attraction domain to which the association relationship belongs, and the memory space address is the address of the memory space associated with the first attraction domain to which the association relationship belongs.
  • in virtual memory technology, the processor core usually uses page table entries to convert between virtual addresses and physical addresses. Therefore, this embodiment can use page table entries as the association relationship.
  • the memory space address (virtual address and physical address) contained in a page table entry is associated with the attraction domain represented by the attraction domain identifier.
  • this embodiment can also set the attraction domain identifier in the reserved bit of the page table entry, so that the page table entry can carry the attraction domain identifier without modifying the microarchitecture of the processor.
  • the attraction domain identifier in the page table entry may be the processor core identifier.
  • the attraction domain identifier of the first attraction domain is the processor core identifier of the processor core 211, so that the page table entry represents the association among the memory space, the attraction domain and the processor core.
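  • a minimal Python sketch of carrying an attraction domain identifier in otherwise unused bits of a 64-bit page table entry follows. The bit position (bit 52), the 7-bit field width and the use of 0 as the "no domain" value are assumptions chosen for illustration; this application only states that reserved bits are used.

```python
DOMAIN_SHIFT = 52                 # assumption: start of the bits used for the identifier
DOMAIN_BITS = 7                   # assumption: width of the identifier field
DOMAIN_MASK = ((1 << DOMAIN_BITS) - 1) << DOMAIN_SHIFT
NO_DOMAIN = 0                     # assumption: value meaning "not in any attraction domain"


def set_domain(pte: int, domain_id: int) -> int:
    """Return a copy of pte whose identifier field holds domain_id."""
    if not 0 <= domain_id < (1 << DOMAIN_BITS):
        raise ValueError("domain identifier does not fit in the reserved field")
    return (pte & ~DOMAIN_MASK) | (domain_id << DOMAIN_SHIFT)


def get_domain(pte: int) -> int:
    """Extract the attraction domain identifier from pte."""
    return (pte & DOMAIN_MASK) >> DOMAIN_SHIFT


pte = (0x1234 << 12) | 0x1        # hypothetical entry: frame number plus a present bit
tagged = set_domain(pte, 5)       # tag the entry with the identifier of core 5
```

  • because only the reserved field changes, the address and flag bits of the entry read back unchanged after tagging.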
  • Step 320 The processor core 211 determines the address of the first memory space.
  • the above-mentioned address of the first memory space refers to the physical address of the first memory space.
  • the processor core 211 queries the address translation unit 213, according to the virtual address of the first memory space, for a first page table entry that matches the virtual address. If there is one, the address translation unit hits (TLB hit), and step 330 is performed; if not, the address translation unit misses (TLB miss), and the processor core 211 uses the page table traversal unit 214 to obtain the first page table entry from the memory 220. Then, the processor core 211 determines the physical address of the first memory space according to the correspondence between the virtual address and the physical address in the first page table entry.
  • after determining that the first page table entry belongs to the first attraction domain of the processor core 211, the memory management unit 212 performs the address translation unit loading (Install) step, that is, it copies the first page table entry and stores the copy in the address translation unit 213 of the processor core 211.
  • after the processor core 211 stores the first page table entry in the address translation unit 213, if the processor core 211 needs to perform memory access based on the first page table entry again, it can hit the first page table entry directly in the address translation unit 213. This avoids performing the page table traversal again to obtain the first page table entry from the memory 220 and reduces the computing overhead and time overhead of the traversal, thereby improving memory access efficiency.
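  • the lookup and loading (Install) behaviour of step 320 can be sketched as follows. The names `Pte` and `Core` are hypothetical, and the sketch assumes that an entry belonging to another core's attraction domain is simply not loaded; this application does not spell out that case here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Pte:
    vpn: int          # virtual page number
    pfn: int          # physical frame number
    domain_id: int    # attraction domain identifier (here: the owning core's identifier)


class Core:
    def __init__(self, core_id: int, page_table: Dict[int, Pte]) -> None:
        self.core_id = core_id
        self.page_table = page_table      # shared page table held in memory
        self.tlb: Dict[int, Pte] = {}     # private address translation unit
        self.walks = 0

    def lookup(self, vpn: int) -> Pte:
        pte = self.tlb.get(vpn)
        if pte is not None:               # TLB hit
            return pte
        self.walks += 1                   # TLB miss: page table traversal
        pte = self.page_table[vpn]
        if pte.domain_id == self.core_id: # Install only entries of this core's domain
            self.tlb[vpn] = pte
        return pte


shared = {7: Pte(7, 0x40, domain_id=0), 8: Pte(8, 0x41, domain_id=1)}
core0 = Core(0, shared)
core0.lookup(7)
core0.lookup(7)       # second lookup hits in the address translation unit
core0.lookup(8)
core0.lookup(8)       # entry of another domain is walked again
```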
  • Step 330 The processor core 211 executes the data processing of the first memory access request according to the address of the first memory space.
  • the processor core 211 uses the memory management unit 212 to perform data processing of the first memory access request according to the physical address of the first memory space.
  • the memory management unit 212 performs the steps of data processing of the first memory access request on the physical address of the first memory space as follows:
  • Case 1: The first memory access request is generated by the processor core 211.
  • the memory management unit 212 sends the physical address of the first memory space to the memory 220, receives the first data stored at the physical address of the first memory space returned by the memory 220, and sends the first data to the register of the processor core 211.
  • Case 2: The first memory access request is generated by the processor core 215.
  • the memory management unit 212 sends the physical address of the first memory space to the memory 220, receives the first data stored at the physical address of the first memory space returned by the memory 220, and sends the first data to the register of the processor core 215.
  • the processor core and the memory space are associated through the attraction domain, and each processor core performs data processing on the memory space associated with its own attraction domain, so that different processor cores perform data processing on the addresses of different memory spaces and, when doing so, operate on the metadata that describes the data in the memory space of their respective attraction domains.
  • the processor core 211 is configured to determine the address of the memory space according to the page table entry belonging to the first attraction domain, and perform data processing on the address of the memory space.
  • the processor core 211 only performs data processing on the data at the addresses of the memory space associated with the first attraction domain, so the processor core 211 only needs to operate on the first metadata, thereby associating the processor core 211, the first attraction domain and the first metadata. Therefore, different processor cores operate on different portions of the metadata in the memory 220, and different processor cores do not operate on the same portion of the metadata. When one of the multiple processor cores is performing data processing on a memory space, the other processor cores do not need to wait for that processor core to complete its data processing and metadata operations before performing data processing on a new memory space. Multiple processor cores can therefore process data concurrently, which avoids access conflicts when multiple processor cores perform memory access in parallel, reduces the overall duration of memory access, and improves the efficiency of memory access.
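  • the effect of partitioning the metadata by attraction domain can be illustrated with a small Python sketch in which each thread plays the role of one processor core. The class `DomainMetadata`, the page numbers and the use of threads are assumptions for illustration; the point is only that no shared lock is needed because each partition has a single owner.

```python
import threading
from typing import Dict, Iterable, List


class DomainMetadata:
    """Metadata partition (free list and active list) owned by exactly one core."""

    def __init__(self, free_pages: Iterable[int]) -> None:
        self.free_list: List[int] = list(free_pages)
        self.active_list: List[int] = []

    def activate(self) -> int:
        """Move one page from the free list to the active list."""
        page = self.free_list.pop()
        self.active_list.append(page)
        return page


def run_core(meta: DomainMetadata, accesses: int) -> None:
    for _ in range(accesses):
        meta.activate()      # no lock: no other core touches this partition


domains: Dict[int, DomainMetadata] = {
    0: DomainMetadata(range(0, 100)),      # hypothetical pages of core 0's domain
    1: DomainMetadata(range(100, 200)),    # hypothetical pages of core 1's domain
}
threads = [threading.Thread(target=run_core, args=(domains[c], 50)) for c in domains]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

  • with a single shared free list and active list, the two threads would instead have to serialise on a lock for every `activate` call.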
  • the attraction domain of the present application will be introduced in detail below with reference to Figures 7-9.
  • the first attraction domain associated with the processor core 211 will be taken as an example.
  • FIG. 7 is a schematic structural diagram of an attraction domain provided by an embodiment of the present application.
  • the first attraction domain 700 includes one or more page table entries, and the attraction domain identifier in the one or more page table entries is the processor core identifier of the processor core 211 .
  • since the metadata in the memory is divided into multiple isolated parts by the attraction domains, when performing the data processing of the first memory access request according to the physical address of the first memory space, the processor core 211 operates only on the first metadata, which describes the data in the memory space associated with the first attraction domain 700. Therefore, the first metadata may be regarded as data included in the first attraction domain 700.
  • the first metadata may include active list, free list, etc.
  • the page table entries included in the first attraction domain 700 may be page table entries stored in the address translation unit 213 , or may be page table entries stored in the memory 220 .
  • the page table entries included in the first attraction domain 700 can be dynamically adjusted.
  • the processor core 211 implements domain entry of a page table entry according to the access frequency of the memory space, that is, it adds the page table entry to the first attraction domain 700, thereby expanding the first attraction domain 700.
  • Step 810 The processor core 211 queries the second page table entry in the address translation unit 213 according to the second memory access request, and a TLB miss occurs.
  • the second memory access request is used to instruct the processor core 211 to perform data processing on the second data stored in the second memory space corresponding to the virtual address of the second page table entry.
  • Step 820 The processor core 211 obtains the second page table entry from the memory 220.
  • the processor core 211 uses the page table traversal unit 214 to perform the page table traversal step to obtain the second page table entry from the memory 220.
  • the attraction domain identifier in the second page table entry is an initial value, indicating that the second memory space is not associated with any attraction domain.
  • Step 830 When the cumulative number of times of obtaining the second page table entry is greater than the preset threshold, the processor core 211 performs the data processing of the second memory access request according to the address of the second memory space in the second page table entry, and adds the second page table entry to the first attraction domain.
  • each time the processor core 211 receives the second memory access request, it uses the page table traversal unit 214 to obtain the second page table entry from the multi-level page table of the memory 220.
  • PML4E, PDPTE, PDE and PTE in Figure 8 respectively represent the data queried by the second memory access request in PML4, PDPT, PD and PT.
  • the processor core 211 may use the first counter to record the cumulative number of times the page table traversal unit 214 obtains the second page table entry from the memory 220 .
  • a register is used as the first counter, and the value of the first counter is incremented by one each time the second page table entry is obtained from the memory 220.
  • the preset threshold (the first threshold) may be 5 times, 8 times, 15 times, etc.
  • adding the second page table entry to the first attraction domain 700 by the processor core 211 also includes: the processor core 211 uses the page table traversal unit 214 to copy the second page table entry and stores the copied second page table entry in the address translation unit 213.
  • the processor core 211 can also add the virtual page corresponding to the second page table entry to the active list of the first attraction domain 700, that is, the processor core 211 adds the virtual page corresponding to the second page table entry in the processor core footprints to the active list.
  • the processor core footprints record the virtual pages whose addresses the processor core performs data processing on.
  • the processor core 211 may add the second page table entry to the first attraction domain by modifying the attraction domain identifier of the second page table entry to the attraction domain identifier of the first attraction domain.
  • after the processor core 211 stores the second page table entry in the address translation unit 213, if the processor core 211 needs to perform memory access based on the second page table entry again, it can query the second page table entry directly in the address translation unit 213. This avoids performing the page table traversal again to obtain the second page table entry from the memory 220 and reduces the computing overhead and time overhead of the traversal, thereby improving memory access efficiency and the utilization of the storage resources of the address translation unit 213.
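  • the domain entry procedure of steps 810 to 830 can be sketched as follows. The names `Core`, `domain_of` and `NO_DOMAIN`, the sentinel value -1 and the threshold value 5 are assumptions for illustration; the comparison is "greater than" as in step 830.

```python
from collections import defaultdict
from typing import Dict, List, Set

NO_DOMAIN = -1         # assumption: initial identifier value meaning "no attraction domain"
ENTRY_THRESHOLD = 5    # assumption: one of the example threshold values


class Core:
    def __init__(self, core_id: int, domain_of: Dict[int, int]) -> None:
        self.core_id = core_id
        self.domain_of = domain_of            # identifier stored in each page table entry
        self.walk_count: Dict[int, int] = defaultdict(int)   # the "first counter" per page
        self.tlb: Set[int] = set()
        self.active_list: List[int] = []

    def access(self, vpn: int) -> None:
        if vpn in self.tlb:
            return                            # TLB hit: no traversal needed
        self.walk_count[vpn] += 1             # one more page table traversal
        unowned = self.domain_of[vpn] == NO_DOMAIN
        if unowned and self.walk_count[vpn] > ENTRY_THRESHOLD:
            self.domain_of[vpn] = self.core_id   # rewrite the identifier in the entry
            self.tlb.add(vpn)                    # store the copy in the TLB
            self.active_list.append(vpn)         # track the page in the active list


domain_of = {42: NO_DOMAIN}
core = Core(0, domain_of)
for _ in range(ENTRY_THRESHOLD):
    core.access(42)
before = domain_of[42]
core.access(42)        # the sixth traversal exceeds the threshold
after = domain_of[42]
```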
  • the dynamic adjustment of the page table entries contained in the first attraction domain 700 by the processor core 211 may also include the reduction of the first attraction domain 700 .
  • the steps for reducing the first attraction domain 700 will be described in detail below with reference to Figure 9 . 1-6 in Figure 9 represent the execution order of the steps.
  • Step 910 The processor core 211 queries the third page table entry in the address translation unit 213 according to the third memory access request, and a TLB miss occurs.
  • the third memory access request is a proxy instruction sent by a processor core other than the processor core 211 to instruct the processor core 211 to perform data processing on the third data stored in the third memory space corresponding to the virtual address of the third page table entry.
  • the processor core 215 obtains a memory access request, and the third page table entry containing the virtual address of the memory access request is not found in the address translation unit of the processor core 215, that is, a TLB miss occurs. The processor core 215 then uses its own page table traversal unit to obtain the third page table entry from the memory 220. When the attraction domain identifier of the third page table entry is the attraction domain identifier of the first attraction domain 700, the processor core 215 sends a proxy instruction to the processor core 211, where the proxy instruction includes the virtual address of the third memory space.
  • Step 920 The processor core 211 obtains the third page table entry from the memory 220.
  • the processor core 211 uses the page table traversal unit 214 to perform the page table traversal step to obtain the third page table entry from the multi-level page table in the memory 220.
  • PML4E, PDPTE, PDE and PTE in Figure 9 respectively represent the data queried by the third memory access request in PML4, PDPT, PD and PT.
  • the attraction domain identifier in the third page table entry is the attraction domain identifier of the processor core 211.
  • Step 930 The processor core 211 migrates the third page table entry to the attraction domain of the processor core that has sent the third memory access request the most.
  • the processor core 215 is the processor core that sends the third memory access request to the processor core 211 the most.
  • the processor core 211 may migrate the third page table entry to the attraction domain of the processor core 215 by modifying the attraction domain identifier of the third page table entry to the attraction domain identifier of the attraction domain of the processor core 215.
  • each time the processor core 211 receives the proxy instruction, it uses the page table traversal unit 214 to obtain the third page table entry from the memory 220. For each processor core that sends a proxy instruction to the processor core 211, the processor core 211 may use a counter to record the number of times that processor core has instructed the processor core 211 to perform data processing on the third data stored in the third memory space.
  • the processor core 211 may record the number of times the proxy instruction sent by the processor core 215 is received through a second counter. For example, when the processor core 211 receives the proxy instruction sent by the processor core 215 for the first time, a register is used as the second counter, and each time the processor core 211 receives the proxy instruction sent by the processor core 215, the value of the second counter is incremented by one.
  • the second threshold may be 5 times, 8 times, 15 times, etc.
  • the migration of the third page table entry to the attraction domain of the processor core 215 by the processor core 211 also includes: the processor core 211 uses the page table traversal unit 214 to copy the third page table entry and stores the copied third page table entry in the address translation unit 213.
  • the processor core 211 can also add the virtual page corresponding to the third page table entry to the active list of the attraction domain of the processor core 215, that is, the virtual page corresponding to the third page table entry in the processor core footprints of the processor core 215 is added to the active list.
  • the processor core 211 realizes the migration of the third page table entry between different attraction domains.
  • when the processor core 215 performs data processing in the third memory space again, it does not need to send a proxy instruction to the processor core 211. Instead, the processor core 215 performs data processing in the third memory space according to the third page table entry in the address translation unit connected to itself, which improves the efficiency of memory access and reduces the overhead of communication resources between processor cores.
  • the combination of migration and domain entry realizes the dynamic adjustment of the attraction domain, so that the page table entries in the attraction domain are page table entries with higher access frequency by the processor core, thereby improving the utilization of the storage resources of the address translation unit.
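  • the migration of steps 910 to 930 can be sketched in Python as below. The class `OwnerCore`, the per-page `Counter` of requesting cores and the threshold value 5 are assumptions for illustration; the sketch migrates a page to the core that has sent the most proxy instructions once that count exceeds the threshold.

```python
from collections import Counter
from typing import Dict

MIGRATION_THRESHOLD = 5   # assumption: one of the example threshold values


class OwnerCore:
    def __init__(self, core_id: int, domain_of: Dict[int, int]) -> None:
        self.core_id = core_id
        self.domain_of = domain_of                  # identifier stored in each page table entry
        self.proxy_count: Dict[int, Counter] = {}   # per page: requests per requesting core

    def handle_proxy(self, vpn: int, requester: int) -> int:
        """Serve one proxy instruction and return the page's owner afterwards."""
        counts = self.proxy_count.setdefault(vpn, Counter())
        counts[requester] += 1
        top_core, top_count = counts.most_common(1)[0]
        if top_count > MIGRATION_THRESHOLD:
            self.domain_of[vpn] = top_core          # rewrite the attraction domain identifier
            del self.proxy_count[vpn]               # counters are no longer needed here
        return self.domain_of[vpn]


domain_of = {9: 0}                                  # page 9 starts in core 0's domain
owner = OwnerCore(0, domain_of)
owners = [owner.handle_proxy(9, requester=1) for _ in range(6)]
```

  • after the sixth proxy instruction the page belongs to core 1, so core 1 can translate the address itself instead of sending further proxy instructions.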
  • the processor core 211 uses the page table traversal unit 214 to periodically update the address translation unit 213 .
  • the page table traversal unit 214 determines the attraction domain identifiers of all page table entries in the address translation unit 213, retains the page table entries whose attraction domain identifier is the attraction domain identifier of the first attraction domain 700, and deletes the page table entries whose attraction domain identifier is not the attraction domain identifier of the first attraction domain 700.
  • the lengths of any two adjacent periods of updating the address translation unit 213 may be equal or unequal.
  • the length of the period can be 10 seconds, 20 seconds, 30 seconds, or any other length.
  • by using the page table traversal unit 214 to refresh the page table entries, the processor core 211 prevents the address translation unit 213 from returning page table entries that do not belong to the first attraction domain and reduces the number of TLB shootdowns triggered, thereby improving the utilization of the computing resources of the processor core and the memory access performance of the processor 210.
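  • the periodic update of the address translation unit amounts to a filter over its entries, which the following Python sketch illustrates. The function name `flush_foreign_entries` and the dictionary representation (virtual page number mapped to attraction domain identifier) are assumptions; scheduling the call every period is left out.

```python
from typing import Dict


def flush_foreign_entries(tlb: Dict[int, int], own_domain: int) -> int:
    """Delete TLB entries whose domain identifier differs from own_domain.

    tlb maps a virtual page number to the attraction domain identifier of its entry.
    Returns the number of entries removed.
    """
    foreign = [vpn for vpn, domain in tlb.items() if domain != own_domain]
    for vpn in foreign:
        del tlb[vpn]
    return len(foreign)


tlb = {1: 0, 2: 1, 3: 0, 4: 2}     # hypothetical contents of core 0's TLB
removed = flush_foreign_entries(tlb, own_domain=0)
```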
  • the data processing method provided according to this embodiment is described in detail above with reference to FIGS. 1 to 9 .
  • the data processing device provided according to this embodiment will be described below with reference to FIG. 10 .
  • Figure 10 is a schematic structural diagram of a possible data processing device provided by this embodiment. These data processing devices can be used to implement the functions of the processor or processor core in the above method embodiments, and therefore can also achieve the beneficial effects of the above method embodiments.
  • the data processing device 1000 includes a transceiver module 1010 and a processing module 1020.
  • the data processing device 1000 is used to implement the functions of the processor core 211 in the above method embodiment shown in Figure 3, Figure 8 or Figure 9.
  • the transceiver module 1010 is used to obtain a first memory access request.
  • the first memory access request is used to instruct the first processor core to perform data processing on the data stored in the first memory space.
  • the first memory space is the storage space of the memory associated with the first attraction domain of the first processor core.
  • the processing module 1020 is configured to determine the address of the first memory space, and perform data processing of the first memory access request according to the address of the first memory space.
  • the first attraction domain includes an association relationship between a memory space address and an attraction domain identifier.
  • the attraction domain identifier is used to indicate the attraction domain to which the association relationship belongs.
  • the memory space address is the address of the memory space associated with the attraction domain to which the association relationship belongs.
  • the association is a Page Table Entry (PTE)
  • the attraction domain identifier is set in the reserved bit of the page table entry.
  • the processor can identify the attraction domain identifier without modifying the microarchitecture.
  • the attraction domain identifier includes a processor core identifier.
  • the processor core identifier in the association relationship can indicate that the association relationship belongs to the attraction domain of the processor core represented by the processor core identifier.
  • the first memory access request may be generated by the first processor core when a thread executed by the first processor core needs to read and write data in the first memory space.
  • the first memory access request may be generated by a processor core other than the first processor core, when a thread executed by that processor core needs to read and write data in the first memory space, and sent to the first processor core.
  • the data processing device 1000 can also dynamically adjust the association relationships included in the attraction domain.
  • the transceiver module 1010 is also used to obtain the second memory access request.
  • the processing module 1020 is used to perform steps 810 to 830 in Figure 8.
  • the transceiver module 1010 is also used to obtain the third memory access request, and the processing module 1020 is used to execute steps 910 to 930 in Figure 9.
  • the association relationship of the first attraction domain may be stored in a storage unit connected to the first processor core, and the storage unit is used to store and query the association relationship contained in the first attraction domain.
  • the first memory access request includes the virtual address of the first memory space.
  • the memory space address of the association relationship includes the physical address of the first memory space.
  • the processing module 1020 is specifically configured to: query the association relationships in the storage unit according to the virtual address of the first memory space, obtain the first association relationship, and determine the physical address of the first memory space according to the first association relationship.
  • the storage unit may be an address translation unit.
  • the processing module 1020 is also configured to delete, once every preset period, the association relationships in the storage unit that do not belong to the first attraction domain.
  • the data processing device 1000 in the embodiment of the present application can be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
  • the above PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or generic array logic (GAL).
  • the data processing device 1000 may correspond to performing the method described in the embodiment of the present application, and the above and other operations and/or functions of the units in the data processing device 1000 respectively implement the corresponding processes of the methods in FIG. 3, FIG. 8 or FIG. 9; for brevity, they are not repeated here.
  • the method steps in this embodiment can be implemented by hardware or by a processor executing software instructions.
  • Software instructions can be composed of corresponding software modules, and the software modules can be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from the storage medium and write information to the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and storage media may be located in an ASIC. Additionally, the ASIC can be located in a computing device. Of course, the processor and storage medium may also exist as discrete components in a computing device.
  • the computer program product includes one or more computer programs or instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a user equipment, or other programmable device.
  • the computer program or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another.
  • the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center that integrates one or more available media.
  • the available media may be magnetic media, such as floppy disks, hard disks, and magnetic tapes; optical media, such as digital video discs (DVDs); or semiconductor media, such as solid state drives (SSDs).

Abstract

The present invention relates to the field of computers. Disclosed are a data processing method and device, a processor and a computer system. The method comprises: a first processor core obtains a first memory access request, the first memory access request being used for instructing the first processor core to perform data processing on data stored in a first memory space; when the first memory space is a memory space of a memory associated with a first attraction zone of the first processor core, the first processor core determines an address of the first memory space and performs data processing of the first memory access request according to the address of the first memory space. Therefore, access conflicts generated when a plurality of processor cores perform memory access in parallel are avoided, the overall duration of memory access is reduced, and the memory access efficiency is improved.

Description

Data processing method, device, processor and computer system
This application claims priority to Chinese patent application No. 202210514855.0, entitled "Data processing method, device, processor and computer system", filed with the China Patent Office on May 12, 2022, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of computers, and in particular to a data processing method, a device, a processor and a computer system.
Background
At present, a computer system includes conventional memory (for example, dynamic random-access memory (DRAM)) and extended memory (for example, storage-class memory (SCM) or a solid state disk (SSD)). Both are used as the main memory of the computer system, so that the extended memory expands the storage capacity of the conventional memory. Further, the processor (central processing unit, CPU) in the computer system accesses memory using virtual memory technology; that is, the processor accesses memory according to the physical address that corresponds to a virtual address (virtual memory address). One CPU may include multiple processor cores, each of which handles different memory access requests. When a processor core performs the data processing of an access request on the address of a memory space according to a memory access request, it operates on metadata in memory, which describes the attributes of the data in memory. Multiple processor cores can access the memory storage space indicated by the physical address associated with the same virtual address, and therefore operate on the same metadata. The metadata is thus globally shared in memory, and multiple processor cores cannot operate on it at the same time; otherwise the memory seen by the different processor cores would become inconsistent.
When a first processor core among the multiple processor cores is performing the data processing of an access request on the address of a memory space according to a memory access request, the other processor cores must wait until the first processor core completes its memory access before they can perform the data processing of their own access requests. These access operations therefore conflict with one another, which lengthens memory access time and ultimately degrades the processing performance of the processor. How to provide a more efficient memory access method has thus become a technical problem that urgently needs to be solved.
Summary of the invention
This application provides a data processing method, a device, a processor and a computer system that improve the memory access efficiency of the processor.
In a first aspect, a data processing method is provided. The method is applicable to a computer system that includes a processor and a memory, and is executed by a first processor core of the processor. The method includes: the first processor core obtains a first memory access request, which instructs the first processor core to perform data processing on data stored in a first memory space; when the first memory space is a storage space of the memory associated with a first attraction domain (Attraction Zone) of the first processor core, the first processor core determines the address of the first memory space and performs the data processing of the first memory access request according to that address.
Thus, when the first memory space that the first memory access request needs to access is a storage space of the memory associated with the first attraction domain of the first processor core, the first processor core performs the data processing of the first memory access request on the address of the first memory space. The attraction domain associates the first processor core with the first memory space, so that processor cores other than the first processor core cannot perform data processing on the address of the first memory space. Different processor cores therefore perform data processing on the addresses of different memory spaces and operate on different metadata when doing so. The metadata operations of the processor cores do not affect one another, which is equivalent to dividing the metadata in memory into multiple mutually isolated parts according to attraction domain. Multiple processor cores can operate on the metadata in their respective attraction domains at the same time. While the first processor core performs the data processing of an access request on the address of a memory space, the other processor cores do not need to wait for it to complete its memory access and metadata operations; they can operate on metadata outside the first attraction domain and complete their own memory accesses. This avoids access conflicts when multiple processor cores access memory in parallel, reduces the overall duration of memory access, and improves memory access efficiency.
In a possible implementation, the first attraction domain includes association relationships between memory space addresses and an attraction domain identifier. The attraction domain identifier indicates the attraction domain to which an association relationship belongs, and the memory space address is the address of the memory space associated with that attraction domain.
In this implementation, the processor core determines, based on the attraction domain identifier and the memory space address contained in an association relationship, whether a memory space is a storage space of the memory associated with the processor core's attraction domain, which keeps this check efficient.
For example, the association relationship is a page table entry, and the attraction domain identifier is set in reserved bits of the page table entry. Because the identifier is recorded in reserved bits that already exist in the page table entry, the processor can recognize the attraction domain identifier without modifying its microarchitecture.
As another example, the attraction domain identifier includes a processor core identifier. The processor core identifier in an association relationship indicates that the association relationship belongs to the attraction domain of the processor core that the identifier represents.
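The reserved-bit encoding described above can be illustrated with a minimal sketch. The bit positions, the field width, and the value used for "no attraction domain" are assumptions chosen for illustration only; the actual page table entry layout is the one shown in Figure 4 and depends on the processor architecture.

```python
# Hypothetical layout: a 64-bit page table entry whose bits 52..58 are treated
# as the reserved field that holds the attraction domain identifier.
AZ_SHIFT = 52
AZ_BITS = 7
AZ_MASK = ((1 << AZ_BITS) - 1) << AZ_SHIFT
NO_DOMAIN = 0  # assumed encoding for "belongs to no attraction domain"


def set_domain_id(pte: int, domain_id: int) -> int:
    """Return a copy of pte with the attraction domain identifier written."""
    if not 0 <= domain_id < (1 << AZ_BITS):
        raise ValueError("domain_id does not fit in the reserved field")
    return (pte & ~AZ_MASK) | (domain_id << AZ_SHIFT)


def get_domain_id(pte: int) -> int:
    """Read the attraction domain identifier back out of the reserved field."""
    return (pte & AZ_MASK) >> AZ_SHIFT


def belongs_to(pte: int, domain_id: int) -> bool:
    """Check whether the entry belongs to the given attraction domain."""
    return get_domain_id(pte) == domain_id
```

Because only the reserved field is modified, the remaining bits of the entry (frame number and permission flags) are left untouched, which is consistent with the claim that existing hardware can keep interpreting the entry as before.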
In a possible implementation, the first memory access request may be generated by the first processor core when a thread executed by the first processor core needs to read or write data in the first memory space.
In another possible implementation, the first memory access request may be generated by a processor core other than the first processor core, when a thread executed by that core needs to read or write data in the first memory space, and sent to the first processor core. After performing the data processing of the first memory access request according to the address of the first memory space, the first processor core sends the obtained data to the processor core that sent the request. This prevents a processor core from performing data processing on memory space that is not associated with its own attraction domain, and strengthens the isolation between memory spaces associated with different attraction domains.
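The two ways a request can reach the owning core can be sketched as a toy software model. The `Core` class, the `owner_of` map, and the `log` list are hypothetical names introduced for illustration. The sketch also assumes that a page belonging to no attraction domain is handled by the requesting core itself.

```python
class Core:
    """Toy model of a processor core that only touches pages in its own domain."""

    def __init__(self, core_id, memory, owner_of, log):
        self.core_id = core_id
        self.memory = memory      # page -> data, shared backing store
        self.owner_of = owner_of  # page -> owning core id, or None
        self.log = log            # records which core actually read each page

    def _read_local(self, page):
        self.log.append((self.core_id, page))
        return self.memory[page]

    def access(self, page, cores):
        owner = self.owner_of.get(page)
        if owner is None or owner == self.core_id:
            # The request was generated by this core for its own domain
            # (or, by assumption, for a page that has no domain yet).
            return self._read_local(page)
        # Forward the request; the owning core does the work and the
        # result is handed back to the requester.
        return cores[owner]._read_local(page)
```

In this model a core never reads a page owned by another core directly, so each page's metadata is only ever touched by one core.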
A processor core can also dynamically adjust the association relationships contained in an attraction domain. For example, the first processor core adds an association relationship that does not belong to any attraction domain to the first attraction domain.
The first processor core may add an association relationship that does not belong to any attraction domain to the first attraction domain as follows: the first processor core obtains a second memory access request, which instructs the first processor core to perform data processing on data stored in a second memory space. When the second memory space is not associated with any attraction domain and the cumulative number of times the second memory access request has been obtained is greater than a preset threshold, the first processor core performs data processing on the data stored in the second memory space and adds the association relationship of the address of the second memory space to the first attraction domain.
Optionally, adding the association relationship of the address of the second memory space to the first attraction domain may include: setting the attraction domain identifier of the association relationship that contains the address of the second memory space to the attraction domain identifier of the first attraction domain.
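The expansion step above amounts to a counter compared against a threshold. The following sketch assumes that the count is kept per page and that `None` marks a page with no attraction domain; the function and parameter names are hypothetical.

```python
from collections import Counter


def handle_unowned_access(core_domain_id, page, domain_of, counts, threshold):
    """Count accesses to a page with no domain; adopt it once the count exceeds threshold.

    Returns True if the page was added to the core's attraction domain on this call.
    """
    if domain_of.get(page) is not None:
        return False  # already belongs to some attraction domain
    counts[page] += 1
    if counts[page] > threshold:
        # Adding the association relationship is modeled as rewriting its identifier.
        domain_of[page] = core_domain_id
        return True
    return False
```

With a threshold of 2, the first two accesses only increment the counter and the third access adds the page to the core's attraction domain.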
As another example, the first processor core migrates an association relationship that belongs to the first attraction domain to an attraction domain other than the first attraction domain.
The first processor core may migrate an association relationship that belongs to the first attraction domain to another attraction domain as follows: the first processor core obtains a third memory access request, which instructs the first processor core to perform data processing on data stored in a third memory space; the third memory access request is generated by a processor core other than the first processor core and sent to the first processor core. The first processor core adds the association relationship that contains the address of the third memory space to the attraction domain of the processor core that has sent the third memory access request the most times.
Optionally, this step includes: setting the attraction domain identifier of the association relationship that contains the address of the third memory space to the attraction domain identifier of the attraction domain of the processor core that has sent the third memory access request the most times.
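The migration step can be sketched by counting remote requests per sender and handing the page to the most frequent one. The sketch assumes that each attraction domain identifier equals the identifier of its processor core. It leaves the decision of when to migrate to the caller, since this passage does not state the trigger; all names are hypothetical.

```python
from collections import Counter, defaultdict


def record_remote_request(remote_counts, page, sender_core):
    """Remember that sender_core asked the owning core to access page."""
    remote_counts[page][sender_core] += 1


def migrate_to_top_requester(domain_of, remote_counts, page):
    """Move page into the domain of the core that requested it most often."""
    if not remote_counts[page]:
        return domain_of.get(page)  # no remote requests recorded, nothing to do
    top_sender, _ = remote_counts[page].most_common(1)[0]
    domain_of[page] = top_sender
    return top_sender
```

Here `remote_counts` is expected to be a `defaultdict(Counter)`, so that a page with no recorded requests yields an empty counter.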
In this way, the processor core expands and shrinks attraction domains. Based on how often processor cores access different memory spaces, a memory space that a processor core is likely to access is associated with that core's attraction domain. This reduces the probability that a memory access request obtained by a processor core targets memory space not associated with its own attraction domain, and improves the utilization of the processor core's processing resources.
As an optional implementation, the association relationships of the first attraction domain may be stored in a storage unit connected to the first processor core; the storage unit is used to store and query the association relationships contained in the first attraction domain. The first memory access request includes the virtual address of the first memory space, and the memory space address of the association relationship includes the physical address of the first memory space. The first processor core may then determine the address of the first memory space as follows: it queries the association relationships in the storage unit according to the virtual address of the first memory space to obtain a first association relationship, and determines the physical address of the first memory space according to the first association relationship.
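The lookup described above can be sketched with the storage unit modeled as a dictionary keyed by virtual page number. The 4 KiB page size and the tuple layout `(frame, domain_id)` are assumptions for illustration; a miss returns `None` and the fallback of fetching the entry from memory is not modeled.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(storage_unit, virtual_address):
    """Look up the association for a virtual address and build the physical address."""
    vpn = virtual_address >> PAGE_SHIFT          # virtual page number
    entry = storage_unit.get(vpn)
    if entry is None:
        return None                              # miss in the storage unit
    frame, _domain_id = entry
    # Physical address = frame number combined with the in-page offset.
    return (frame << PAGE_SHIFT) | (virtual_address & PAGE_MASK)
```

For example, with virtual page 0x5 mapped to frame 0x9A, the virtual address 0x5123 keeps its offset 0x123 and resolves to physical address 0x9A123.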
Optionally, the storage unit may be an address translation unit.
In addition, the first processor core may delete, once every preset period, the association relationships in the storage unit that do not belong to the first attraction domain.
This reduces the probability that an association relationship the processor core looks up in the connected storage unit is missing or invalid, and reduces the processing overhead of fetching the association relationship from memory when the lookup in the storage unit misses, thereby improving memory access performance.
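The periodic cleanup can be sketched as a filter over the storage unit that runs every `period` ticks. The tick counter and the entry layout `(frame, domain_id)` are assumptions; entries with no attraction domain are treated as not belonging to the first attraction domain and are removed as well.

```python
def purge_foreign_entries(storage_unit, own_domain_id):
    """Delete entries that do not belong to own_domain_id; return how many were removed."""
    foreign = [vpn for vpn, (_frame, dom) in storage_unit.items()
               if dom != own_domain_id]
    for vpn in foreign:
        del storage_unit[vpn]
    return len(foreign)


def on_tick(tick, period, storage_unit, own_domain_id):
    """Run the purge once every `period` ticks (hypothetical timer hook)."""
    if tick % period == 0:
        return purge_foreign_entries(storage_unit, own_domain_id)
    return 0
```

Collecting the keys first and deleting afterwards avoids mutating the dictionary while iterating over it.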
It should be noted that, when the first processor core determines the physical address of a memory space according to an association relationship, no processor core needs to stop its threads to update the storage unit, whether the association relationship belongs to no attraction domain, to the first attraction domain, or to an attraction domain other than the first attraction domain. This improves memory access performance.
In a second aspect, a data processing device is provided. The device includes modules for executing the data processing method in the first aspect or in any possible implementation of the first aspect.
In a third aspect, a processor is provided. A processor core of the processor is configured to perform the operation steps of the data processing method in the first aspect or in any possible implementation of the first aspect.
In a fourth aspect, a computer system is provided. The computer system includes a memory and the processor of the third aspect. The processor is configured to perform the operation steps of the data processing method in the first aspect or in any possible implementation of the first aspect, so as to perform the data processing of memory access requests on addresses of the storage space of the memory.
In a fifth aspect, a computer-readable storage medium is provided, including computer software instructions. When the computer software instructions run in a computing device, the computing device performs the operation steps of the method described in the first aspect or in any possible implementation of the first aspect.
In a sixth aspect, a computer program product is provided. When the computer program product runs on a computer system, the computing device performs the operation steps of the method described in the first aspect or in any possible implementation of the first aspect.
On the basis of the implementations provided in the above aspects, this application may combine them further to provide more implementations.
Brief description of the drawings
Figure 1 is a schematic diagram of a multi-layer storage system provided by an embodiment of this application;
Figure 2 is a schematic architectural diagram of a computer system provided by an embodiment of this application;
Figure 3 is a schematic flowchart of a data processing method provided by an embodiment of this application;
Figure 4 is a schematic structural diagram of a page table entry provided by an embodiment of this application;
Figure 5 is a schematic flowchart of a data processing step provided by an embodiment of this application;
Figure 6 is a schematic flowchart of another data processing step provided by an embodiment of this application;
Figure 7 is a schematic structural diagram of an attraction domain provided by an embodiment of this application;
Figure 8 is a schematic flowchart of an attraction domain expansion step provided by an embodiment of this application;
Figure 9 is a schematic flowchart of an attraction domain reduction step provided by an embodiment of this application;
Figure 10 is a schematic structural diagram of a data processing device provided by an embodiment of this application.
Detailed description
For ease of understanding, the terms and concepts related to the embodiments of this application, such as virtual memory, are introduced first.
(1) Hierarchical storage
A computer system may be configured with different types of storage media, such as caches, main memory, and hard disks. During data processing, factors such as the communication protocol, access path, and bandwidth between the CPU and a storage medium cause the CPU's access speed to differ from one storage medium to another, so data is usually stored and accessed in a hierarchical manner.
By the speed at which the CPU accesses data, the processor's cache can be divided into level 1 cache, level 2 cache, and level 3 cache. A computer system may also be configured with main memory (also called memory); for example, random access memory (RAM), dynamic random access memory (DRAM), a solid state disk, or a mechanical hard disk drive (HDD) may serve as the main memory of the computer system.
Hard disks also come in many types with different performance. For example, an SSD stores data faster than a mechanical hard disk. Data that is accessed frequently and has high performance requirements can be placed on a hard disk with high read/write performance, and data that was stored on the high-performance hard disk but is rarely accessed or has low performance requirements can be moved to a lower-performance hard disk.
For example, Figure 1 is a schematic diagram of a multi-layer storage system provided by this application. From the first level to the third level, storage capacity increases, access speed decreases, and cost decreases level by level. As shown in Figure 1, the first level includes the registers 111, level 1 cache 112, level 2 cache 113, and level 3 cache 114 located in the processor 210. The memories in the second level can serve as the main memory of the computer system, for example dynamic random access memory 121, double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM) 122, and storage-class memory (SCM) 123. Main memory, also called memory for short, is the memory that exchanges information with the CPU. The memories in the third level can serve as the auxiliary storage of the computer system, that is, external storage, for example network storage 131, solid state disk (SSD) 132, and hard disk drive 133. Auxiliary storage, also called external storage, is the hard disk in this embodiment. Compared with main memory, a hard disk has a larger storage capacity and a slower access speed. It can be seen that the closer a memory is to the CPU, the smaller its capacity, the faster its access speed, the greater its bandwidth, and the lower its latency. The memories in the third level therefore store data that the CPU does not access frequently, which improves data reliability. The memories in the second level can serve as a cache device that stores data the CPU accesses frequently, which significantly improves the access performance of the system.
(2)虚拟内存(2)Virtual memory
Every program running in a computer system needs memory as storage space. If a running program consumes a large amount of memory, the memory may be exhausted. To address this, the operating system uses virtual memory technology, which uses part of the hard disk space as storage space for memory. Through virtual memory technology the operating system provides the user with a virtual memory. To start a program, the operating system loads only part of the program's data into memory and leaves the rest on the hard disk. During execution, when the data the program needs is not in memory, the operating system swaps the required data into memory and then resumes the program. Conversely, the operating system swaps data that the program is not currently accessing out to the hard disk, freeing space for data that is about to be brought into memory. In this way, virtual memory technology provides a memory much larger than the physical memory, namely the virtual memory, and logically extends the memory.
From a technical perspective, the operating system uses virtual memory technology to give each program a "contiguous" virtual storage space, divides that space into multiple pages, each with a contiguous address range, and maps these pages to physical memory; while the program runs, pages of the virtual storage space are loaded into physical memory. When a program accesses an address range that is present in physical memory, hardware performs the virtual-to-physical address mapping. When a program references an address range that is not in physical memory, the operating system loads the missing part into physical memory and re-executes the failed instruction.
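The swap-in and swap-out behaviour described above can be sketched as a toy simulation. This is only an illustration: the names `DemandPager`, `frames` and `disk` are hypothetical, and first-in-first-out eviction is assumed purely for brevity, since the text does not name a replacement policy at this point.

```python
from collections import OrderedDict


class DemandPager:
    """Toy model: a few physical frames backed by a larger 'disk'."""

    def __init__(self, num_frames, disk):
        self.num_frames = num_frames
        self.disk = dict(disk)        # page number -> contents (backing store)
        self.frames = OrderedDict()   # resident pages, oldest first
        self.page_faults = 0

    def read(self, page):
        if page not in self.frames:   # page not resident: page fault
            self.page_faults += 1
            if len(self.frames) >= self.num_frames:
                # Evict the oldest resident page (FIFO is an assumption).
                victim, data = self.frames.popitem(last=False)
                self.disk[victim] = data              # swap out
            self.frames[page] = self.disk[page]       # swap in
        return self.frames[page]


# Three pages of program data, but only two physical frames.
pager = DemandPager(num_frames=2, disk={0: "a", 1: "b", 2: "c"})
values = [pager.read(p) for p in (0, 1, 0, 2, 0)]
```

The program sees all three pages even though only two fit in memory at once, which is the logical memory extension the paragraph describes.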
(3) Page table entry
Storage devices (for example, main memory and hard disks) store data in units of bytes. To store and retrieve data correctly, each byte unit is given a unique memory address, called a physical address. In virtual memory technology, the physical address is the address used for data access. In a computer system with address translation, the address given by a memory access instruction is called a logical address, also known as a relative address; it is the address used in a machine language instruction to specify an operand or an instruction. A logical address may consist of a segment selector plus an offset that specifies a relative address within the segment. A logical address must be calculated or translated through the addressing mode to obtain the actual effective address in memory, that is, the physical address. A linear address is the intermediate layer in the translation from logical address to physical address: program code generates a logical address, or an offset within a segment, and adding the base address of the corresponding segment yields a linear address. The virtual address is the intra-segment offset in the logical address.
The virtual address space is divided into fixed-size units called pages. Each page corresponds to a page frame in physical memory, where page frames are the blocks into which physical memory is divided. A page and a page frame are generally the same size, for example 4 KB, 8 KB, 16 KB or larger; in practice, page sizes in computer systems generally range from 512 bytes to 1 GB. This is the paging technique of virtual memory technology.
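As a small worked example of paging, a virtual address can be split into a page number and an offset within the page, and the page number can then be replaced by a page frame number. The sketch below assumes 4 KB pages and uses a plain dictionary as a stand-in for the page table; the function names and the sample mapping are hypothetical.

```python
PAGE_SIZE = 4096  # 4 KB, one of the page sizes mentioned in the text


def split_virtual_address(vaddr, page_size=PAGE_SIZE):
    """Return (page number, offset within the page)."""
    return divmod(vaddr, page_size)


def to_physical(vaddr, page_table, page_size=PAGE_SIZE):
    """Translate using a dict that maps page number -> page frame number."""
    page, offset = split_virtual_address(vaddr, page_size)
    frame = page_table[page]   # a KeyError here would model a page fault
    return frame * page_size + offset


page_table = {0: 5, 1: 2}      # hypothetical page -> page frame mapping
paddr = to_physical(0x1234, page_table)   # page 1, offset 0x234 -> frame 2
```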
The page table is a table of one-to-one correspondences between pages and page frames, and serves as an index when looking up the page frame corresponding to a page. Each correspondence between a page and a page frame in the page table is called a page table entry. A page table consists of multiple page table entries. The structure of a page table entry depends on the machine architecture, but is broadly similar across architectures. For example, a page table entry contains a tag, data, and auxiliary information: the tag is the virtual address, the data is the physical address to which the virtual address is mapped, and the auxiliary information includes a valid bit, a modified bit, an accessed bit, and reserved bits. The valid bit indicates whether the page is currently present in memory; when a thread executed by a processor or processor core attempts to access a page that is not present in memory, a page fault is raised. The protection bit indicates the types of access permitted to the page, such as read-write or read-only. The modified bit and the accessed bit record page usage and are used in page replacement. For example, after a program modifies a memory page, the hardware automatically sets the modified bit. If a later page fault requires the page replacement algorithm to evict that page to make room for an incoming page, the algorithm first checks the modified bit. If the bit is set, the page is a dirty page and its latest contents must be written back to the hard disk; otherwise the copies in memory and on the hard disk are in sync and no write-back is needed. The accessed bit is likewise set automatically by the system when a program accesses the page, and is also used by the page replacement algorithm: the operating system decides whether to evict a page based on whether it is being accessed, and pages that are no longer in use are generally better candidates for eviction. The reserved bits store other auxiliary information, such as the attraction domain identifier in this embodiment.
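The role of the valid, modified and accessed bits during eviction can be illustrated with a minimal sketch. The `PageTableEntry` class and the `write` and `evict` helpers are hypothetical and model only the write-back decision described above, not any real architecture's entry layout.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    virtual_page: int
    physical_frame: int
    valid: bool = True      # page is present in memory
    dirty: bool = False     # modified bit
    accessed: bool = False  # accessed bit


def write(pte, memory, value):
    """Model a store: hardware would set the accessed and modified bits."""
    memory[pte.physical_frame] = value
    pte.accessed = True
    pte.dirty = True


def evict(pte, memory, disk):
    """Evict the page; return True only if a write-back was needed."""
    wrote_back = False
    if pte.dirty:                       # dirty page: disk copy is stale
        disk[pte.virtual_page] = memory[pte.physical_frame]
        wrote_back = True
    pte.valid = False
    pte.dirty = False
    return wrote_back


memory = {3: "old", 4: "clean"}
disk = {10: "old", 11: "clean"}
modified = PageTableEntry(virtual_page=10, physical_frame=3)
untouched = PageTableEntry(virtual_page=11, physical_frame=4)
write(modified, memory, "new")
results = (evict(modified, memory, disk), evict(untouched, memory, disk))
```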
An embodiment of this application provides a data processing method. A first processor core obtains a first memory access request, which instructs the first processor core to perform data processing on data stored in a first memory space. If the first memory space is storage space of the memory associated with a first attraction domain of the first processor core, the first processor core determines the address of the first memory space and then performs the data processing of the first memory access request according to that address. The attraction domain associates a processor core with the memory space associated with that attraction domain, and in turn associates the attraction domain with the metadata that describes the attributes of the data stored in that memory space. As a result, when multiple processor cores access memory in parallel, each core processes only the metadata and memory-space data associated with its own attraction domain. This avoids access conflicts among processor cores accessing memory in parallel, reduces the overall duration of memory access, and improves memory access efficiency.
It should be noted that the data processing method in the embodiments of this application may be executed by a processor as well as by the first processor core. When the method is executed by a processor, the difference from execution by the first processor core is that the first memory space is storage space of the memory associated with a first attraction domain of the processor.
The implementation of the embodiments of this application is described in detail below with reference to the accompanying drawings.
FIG. 2 is a schematic architectural diagram of a computer system provided by an embodiment of this application. As shown in FIG. 2, the computer system 200 includes a processor 210, a memory 220, and a bus 230.
The processor 210 may be a graphics processing unit (GPU), a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The processor 210 includes at least one processor core. The processor core, also called a core, is the most important component of the processor. Each processor core has a fixed logical structure comprising logical units such as a level-1 cache, a level-2 cache, an execution unit, an instruction-level unit, and a bus interface. Each processor core of the processor 210 has a unique processor core identifier (ID) in the operating system.
For example, as shown in FIG. 2, the processor 210 includes a processor core 211, a memory management unit (MMU) 212, and a processor core 215. The memory management unit 212 is connected to the memory 220 and the hard disk through the bus 230. In this embodiment, the hard disk may also be referred to as external storage relative to the memory 220.
The memory management unit 212 is a hardware unit in the processor or processor core, and includes an address translation unit 213 and a page table walk unit (Table Walk Unit, TWU) 214. The memory management unit 212 mainly manages the control circuitry of virtual memory and physical memory and maps virtual addresses to physical addresses. Typically, each processor core has one memory management unit. In this embodiment, the processor core 211 may be connected to one or more storage units; for example, a first storage unit is the address translation unit 213 and a second storage unit is the memory 220. In some possible embodiments, the memory management unit 212 may be located within the processor core 211. Optionally, the address translation unit 213 in this embodiment may also be called a translation lookaside buffer (TLB).
The address translation unit 213 stores correspondences between virtual addresses and physical addresses, that is, page table entries. If the virtual address requested by the CPU is present among these correspondences, the address translation unit 213 quickly matches a page table entry containing the virtual address of the access request and sends the matched page table entry to the CPU, which then accesses the memory 220 using the physical address contained in that entry. By contrast, a processor core 211 performing a memory access without the memory management unit 212 would have to fetch the page table entry from the memory 220 and then read the physical address from it. With the address translation unit 213, the memory management unit 212 obtains the physical address without accessing the memory 220, which speeds up virtual-to-physical address translation.
For example, the address translation unit 213 matches the page table entry for a virtual address as follows. The memory management unit 212 extracts the virtual address from the access request and sends it to the address translation unit 213. The address translation unit 213 searches the storage space that it can access, and in which page table entries are stored, for a page table entry whose tag matches the virtual address. If such an entry exists, the result is a TLB hit; if not, the result is a TLB miss. Optionally, if the page table entry that the memory management unit 212 finds in the address translation unit 213 is invalid, for example because its valid bit indicates that the entry has been invalidated, this may also be treated as a TLB miss.
In this embodiment, if the virtual address requested by the CPU is not present among the virtual-to-physical correspondences in the address translation unit 213, the page table walk unit 214 looks up and obtains the page table entry in the memory 220. Specifically, the page table walk unit 214 performs a page table walk (translation table walk): it accesses the memory 220 multiple times through the multi-level page tables, such as the first-level and second-level page tables, to obtain the address of the page table entry, and then fetches the page table entry from the memory 220 at that address.
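The hit and miss handling described above can be sketched as follows. This is a simplified model under stated assumptions: the `TLB` class and `translate` function are hypothetical, the page table in memory is reduced to a flat dictionary, and an entry whose valid flag is cleared is counted as a miss, as the text allows.

```python
class TLB:
    """Toy address translation unit: virtual page -> (physical frame, valid)."""

    def __init__(self):
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, vpage):
        entry = self.entries.get(vpage)
        if entry is None or not entry[1]:   # absent or invalid: TLB miss
            self.misses += 1
            return None
        self.hits += 1                      # TLB hit
        return entry[0]


def translate(vpage, tlb, page_table_in_memory):
    """Try the TLB first; on a miss fall back to the in-memory page table."""
    frame = tlb.lookup(vpage)
    if frame is None:
        frame = page_table_in_memory[vpage]  # stands in for the table walk
    return frame


tlb = TLB()
tlb.entries[1] = (7, True)    # valid cached translation
tlb.entries[2] = (8, False)   # cached but invalidated
page_table_in_memory = {1: 7, 2: 9, 3: 4}
frames = [translate(p, tlb, page_table_in_memory) for p in (1, 2, 3)]
```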
Optionally, the memory management unit 212 of the processor 210 may be an integrated memory controller (IMC).
In the embodiments of this application, the processor core 211 may obtain a first memory access request when a program needs to access first data. The first memory access request instructs the processor core 211 to perform data processing on data stored in the first memory space. After determining the address of the first memory space, the processor core 211 performs the data processing of the first memory access request according to that address.
Optionally, the first memory access request may be generated by the processor core 211 when a thread executed by the processor core 211 needs to access the first data. In this case the first memory access request may be a data access instruction, namely a Load/Store instruction. A Load/Store instruction directs a data transfer between a register and memory, and contains the virtual address of the data that the processor core 211 is to access.
Alternatively, the first memory access request may be generated by a processor core other than the processor core 211 (for example, the processor core 215) when a thread executed by that core needs to access the first data, and is then sent by the generating core to the processor core 211. In this case the first memory access request may be a proxy instruction, which instructs the processor core 211 to perform data processing on the data stored in the first memory space and to send the obtained data to the processor core that generated the request.
The memory 220 and the hard disk are storage devices connected to the processor 210. A storage device holds programs and data. In general, the larger the capacity of a storage device, the slower its access speed; conversely, the smaller the capacity, the faster the access speed. Access speed is the data transfer speed when writing data to or reading data from the storage device, and may also be called read/write speed. Storage devices can be divided into levels according to storage capacity and access speed.
In this embodiment, the memory 220 may store multi-level page tables. For example, the first-level page table stores first-level page table indexes, the second-level page table stores second-level page table indexes, the third-level page table stores third-level page table indexes, and the fourth-level page table stores page table entries. Optionally, as indicated by the dashed box in FIG. 2, the dashed line between the dashed box and the memory 220 denotes that the first-level to fourth-level page tables are stored in the memory 220. The first-level page table may be a page map level 4 table (PML4), the second-level page table may be a page directory pointer table (PDPT), the third-level page table may be a page directory (PD), and the fourth-level page table may be a page table (PT) that stores one or more page table entries. When the processor core 211 looks up a page table entry in the multi-level page tables according to a virtual address, it uses the first-level, second-level and third-level page table indexes to query the first-level, second-level, third-level and fourth-level page tables in turn, obtains the address of the page table entry in the memory 220, and fetches the page table entry from the memory 220 at that address.
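A four-level walk of this kind can be sketched as below. The text names the table levels but not the bit widths, so the layout used here (a 48-bit virtual address with four 9-bit indexes and a 12-bit page offset, as in conventional x86-64 four-level paging) is an assumption, and nested dictionaries stand in for the tables held in memory.

```python
def split_indices(vaddr):
    """Split a 48-bit virtual address into four 9-bit indexes and an offset."""
    offset = vaddr & 0xFFF
    pt = (vaddr >> 12) & 0x1FF
    pd = (vaddr >> 21) & 0x1FF
    pdpt = (vaddr >> 30) & 0x1FF
    pml4 = (vaddr >> 39) & 0x1FF
    return pml4, pdpt, pd, pt, offset


def walk(root, vaddr):
    """Follow PML4 -> PDPT -> PD -> PT and return the physical address."""
    pml4, pdpt, pd, pt, offset = split_indices(vaddr)
    frame = root[pml4][pdpt][pd][pt]   # one lookup per level
    return (frame << 12) | offset


# Hypothetical tables containing a single mapping.
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x56
root = {1: {2: {3: {4: 0x789}}}}
paddr = walk(root, vaddr)
```

Each dictionary lookup corresponds to one access to the memory, which is why caching the final entry in the address translation unit saves several accesses per translation.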
The memory 220 may also store metadata, which is information describing the data properties of the physical pages of the memory space in the memory 220. Examples include an active list, a free list, and an LRU list. The active list tracks active physical pages, the free list tracks free physical pages, and the LRU list tracks how recently active physical pages were used. Least recently used (LRU) is a common page replacement algorithm that evicts the page that has gone unused for the longest time. LRU maintains the access state of each item in a linked list, adjusts an item's position in the list as it is accessed, and uses that position to indicate whether the item was accessed recently or has not been accessed for some time.
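The LRU list can be illustrated with a short sketch. The `LRUList` class is hypothetical, and Python's `OrderedDict` is used in place of the linked list mentioned above: touching a page moves it to the most recently used end, and eviction takes the page at the opposite end.

```python
from collections import OrderedDict


class LRUList:
    """Tracks page recency; the least recently used page sits at the front."""

    def __init__(self):
        self._pages = OrderedDict()

    def touch(self, page):
        """Record an access by moving the page to the most recent end."""
        self._pages[page] = True
        self._pages.move_to_end(page)

    def evict(self):
        """Remove and return the least recently used page."""
        page, _ = self._pages.popitem(last=False)
        return page


lru = LRUList()
for page in (1, 2, 3, 1):   # page 1 is touched again, so page 2 is the oldest
    lru.touch(page)
victim = lru.evict()
```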
Still referring to FIG. 2, the bus 230 may be of different types for different connected components. For example, the bus 230 between the processor 210 and the memory 220 may be a DDR3 interface standard bus, and the bus 230 between the processor 210 and the external storage may be a Peripheral Component Interconnect Express (PCIe) interface standard bus. With a PCIe bus, the external storage may be a PCIe SSD (PCIe solid-state drive), which is a high-speed expansion card that connects the computer system 200 to its peripheral devices. In other words, a PCIe SSD takes the form of a card installed inside the computer. Compared with interfaces such as SATA and SAS, PCIe uses multiple queues for data transfer, which allows concurrent data transfers on a single drive and improves the efficiency of the data interface.
In combination with the computer system 200, the data processing method provided in this embodiment can be applied to memory expansion scenarios, for example memory expansion implemented through virtual memory technology. Specifically, the method can be applied to scenarios such as memory expansion of data center servers and memory expansion of cloud computing servers.
In these scenarios, the server's operating system, in response to a request from a user terminal (for example, a mobile phone, computer, or tablet) or a request that the server itself generates, assigns the thread that handles the request to the processor core 211. The processor core 211 generates a first memory access request according to the thread. The first memory access request instructs the processor core 211 to perform data processing on data stored in the first memory space, which is storage space of the memory associated with the first attraction domain of the processor core 211. Each processor core is associated with at least one attraction domain, and each attraction domain is associated with only one processor core. The processor core 211 determines the address of the first memory space according to the first memory access request and then performs the data processing of the request according to that address. The attraction domain associates the processor core with the memory space associated with that attraction domain, and in turn associates the attraction domain with metadata. As a result, when the server's processor cores access memory in parallel, each core processes only the metadata and memory-space data associated with its own attraction domain, which avoids access conflicts among the cores and reduces the overall duration of memory access. Applied to memory expansion of data center servers and cloud computing servers, where concurrent access volumes are high, the data processing method of this embodiment therefore greatly improves the efficiency of concurrent memory access and reduces the overall latency of concurrent access.
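One way to picture the per-domain partitioning of metadata is the sketch below. It is only an illustration under assumptions: the `DomainMetadata` class, the frame ranges, and the use of core identifiers as dictionary keys are hypothetical and are not taken from the text. The point is that each core allocates from its own free list, so no list is shared between cores.

```python
class DomainMetadata:
    """Free and active lists belonging to a single attraction domain."""

    def __init__(self, free_frames):
        self.free_list = list(free_frames)
        self.active_list = []

    def allocate(self):
        """Move one frame from this domain's free list to its active list."""
        frame = self.free_list.pop(0)
        self.active_list.append(frame)
        return frame


# One attraction domain per core; the frame ranges are hypothetical.
metadata = {
    211: DomainMetadata(range(0, 4)),
    215: DomainMetadata(range(4, 8)),
}


def allocate_for_core(core_id):
    """Each core touches only the metadata of its own domain."""
    return metadata[core_id].allocate()


a = allocate_for_core(211)
b = allocate_for_core(215)
```

Because the two cores never operate on the same list, neither has to wait for the other to finish updating metadata.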
FIG. 2 is only a schematic diagram of a system architecture provided by an embodiment of this application. The positional relationships between the devices, components, modules and the like shown in FIG. 2 do not constitute any limitation.
Next, the data processing method is described in detail with reference to FIG. 3. In this embodiment, the description takes as an example the case in which the first processor core is the processor core 211 in FIG. 2 and the processor core 211 accesses first data.
Step 310: The processor core 211 obtains a first memory access request.
As shown by the solid and dashed lines at step S310 in FIG. 3, the first memory access request may be initiated when the processor core 211, or a processor core other than the processor core 211, needs to read or write data in the first memory space of the memory 220 while executing a thread. The first memory access request contains the virtual address of the first memory space, and instructs the processor core 211 to perform data processing on the data stored in the first memory space. The first memory space is storage space of the memory 220 associated with the first attraction domain of the processor core 211.
In one possible implementation, the first attraction domain includes an association between a memory space address associated with the processor core 211 and an attraction domain identifier. The attraction domain identifier in the association globally and uniquely indicates the first attraction domain to which the association belongs, and the memory space address is the address of the memory space associated with that first attraction domain. The structure of an attraction domain is described in detail after step 330 with reference to FIG. 7, taking the first attraction domain as an example, and is not elaborated here.
In virtual memory technology, a processor core usually uses page table entries to translate between virtual addresses and physical addresses. This embodiment can therefore use a page table entry as the association: the memory space addresses contained in a page table entry (the virtual address and the physical address) are associated with the attraction domain indicated by the attraction domain identifier.
For example, as shown in FIG. 4, this embodiment may place the attraction domain identifier in the reserved bits of the page table entry, so that the page table entry can carry the attraction domain identifier without any change to the processor's microarchitecture. The attraction domain identifier in the page table entry may be a processor core identifier; for example, the attraction domain identifier of the first attraction domain is the processor core identifier of the processor core 211. The page table entry thus expresses the association among the memory space, the attraction domain, and the processor core.
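Carrying an identifier in the reserved bits of a page table entry amounts to simple bit packing, sketched below. The bit position and field width (`DOMAIN_SHIFT`, `DOMAIN_MASK`) and the sample entry value are hypothetical; the text does not specify which reserved bits are used or how wide the identifier is.

```python
DOMAIN_SHIFT = 52    # hypothetical position of the reserved field
DOMAIN_MASK = 0x7F   # hypothetical width: 7 bits


def set_domain(pte, domain_id):
    """Return the entry with the attraction domain ID written into the field."""
    if not 0 <= domain_id <= DOMAIN_MASK:
        raise ValueError("domain_id does not fit in the reserved field")
    cleared = pte & ~(DOMAIN_MASK << DOMAIN_SHIFT)
    return cleared | (domain_id << DOMAIN_SHIFT)


def get_domain(pte):
    """Read the attraction domain ID back out of the entry."""
    return (pte >> DOMAIN_SHIFT) & DOMAIN_MASK


pte = 0x0000_0000_0012_3007   # hypothetical entry: frame bits plus low flag bits
tagged = set_domain(pte, 5)   # tag the entry with domain (core) ID 5
```

Since only the reserved field changes, the address and flag bits that the hardware interprets are left untouched.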
Step 320: The processor core 211 determines the address of the first memory space.
The address of the first memory space here is its physical address. Taking the case in which the association is a page table entry as an example, the processor core 211 first uses the virtual address of the first memory space to check whether the address translation unit 213 holds a first page table entry whose tag matches the virtual address. If so, the result is a TLB hit and step 330 is performed. If not, the result is a TLB miss and the processor core 211 uses the page table walk unit 214 to obtain the first page table entry from the memory 220. The processor core 211 then determines the physical address of the first memory space from the correspondence between the virtual address and the physical address in the first page table entry.
Optionally, when the first memory space is storage space of the memory 220 associated with the first attraction domain of the processor core 211, the memory management unit 212, after determining that the first page table entry belongs to the first attraction domain of the processor core 211, performs a TLB install step: it copies the first page table entry and stores the copy in the address translation unit 213 of the processor core 211. If the processor core 211 later needs to perform another memory access based on the first page table entry, it can hit the entry directly in the address translation unit 213 rather than walking the page tables again to fetch it from the memory 220. This reduces the computation, memory and time overhead of page table walks and thereby improves memory access efficiency.
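The conditional install step can be sketched as follows. The function `handle_tlb_miss`, the tuple layout of the page table, and the choice to simply skip the install for entries of another domain are all assumptions made for illustration; the text above only states that the entry is installed when it belongs to the core's own attraction domain.

```python
def handle_tlb_miss(core_id, vpage, page_table, tlb):
    """Fetch an entry after a miss and install it only for the owning core.

    page_table maps virtual page -> (physical frame, attraction domain ID).
    Returns (physical frame, whether the entry was installed in the TLB).
    """
    frame, domain_id = page_table[vpage]   # stands in for the page table walk
    installed = domain_id == core_id       # entry belongs to this core's domain
    if installed:
        tlb[vpage] = frame                 # copy the entry into the TLB
    return frame, installed


tlb_211 = {}
page_table = {0x10: (0x99, 211), 0x20: (0x77, 215)}   # hypothetical entries
own = handle_tlb_miss(211, 0x10, page_table, tlb_211)
other = handle_tlb_miss(211, 0x20, page_table, tlb_211)
```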
Step 330: The processor core 211 performs the data processing of the first memory access request according to the address of the first memory space.
处理器核211利用内存管理单元212根据第一内存空间的物理地址执行第一内存访问请求的数据处理。The processor core 211 uses the memory management unit 212 to perform data processing of the first memory access request according to the physical address of the first memory space.
根据第一内存访问请求为处理器核211或处理器核211外的处理器核生成的,内存管理单元212对第一内存空间的物理地址执行第一内存访问请求的数据处理的步骤可以如下:According to the first memory access request generated by the processor core 211 or a processor core other than the processor core 211, the memory management unit 212 performs the steps of data processing of the first memory access request on the physical address of the first memory space as follows:
情况一、第一内存访问请求由处理器核211生成。Scenario 1: The first memory access request is generated by the processor core 211.
如图5所示,内存管理单元212将第一内存空间的物理地址发送至内存220,接收内存220返回的第一内存空间的物理地址的存储空间的第一数据,并向处理器核211的寄存器111发送第一数据。As shown in FIG. 5 , the memory management unit 212 sends the physical address of the first memory space to the memory 220 , receives the first data of the storage space of the physical address of the first memory space returned by the memory 220 , and sends the physical address of the first memory space to the processor core 211 . Register 111 sends the first data.
情况二、第一内存访问请求由处理器核215生成。Case 2: The first memory access request is generated by the processor core 215 .
如图6所示,图6中的数字1-4表示数据传输顺序,内存管理单元212将第一内存空间的物理地址发送至内存220,接收内存220返回的第一内存空间的物理地址的存储空间的第一数据,并向处理器核215的寄存器发送第一数据。As shown in Figure 6, numbers 1-4 in Figure 6 represent the data transmission sequence. The memory management unit 212 sends the physical address of the first memory space to the memory 220, and receives the storage of the physical address of the first memory space returned by the memory 220. first data in the space, and sends the first data to the register of the processor core 215.
In this way, attraction domains associate processor cores with memory spaces. Each processor core performs data processing on the memory space associated with its own attraction domain, so different processor cores perform data processing on the addresses of different memory spaces, and each core, when performing data processing, operates only on the metadata that describes the data in the memory space belonging to its own attraction domain. For example, the processor core 211 determines the address of a memory space according to a page table entry belonging to the first attraction domain and performs data processing on that address. Suppose the first metadata is the part of the metadata in the memory 220 that describes the data at the addresses of the memory space associated with the first attraction domain. Because the processor core 211 performs data processing only on that data, it only needs to operate on the first metadata, which ties together the processor core 211, the first attraction domain, and the first metadata. Different processor cores therefore operate on different parts of the metadata in the memory 220, and no two processor cores operate on the same part. While one of the processor cores is performing data processing on a memory space, the other processor cores do not need to wait for it to finish its data processing and metadata operations before starting data processing on another memory space. Multiple processor cores can thus process data concurrently, which avoids the access conflicts that arise when multiple processor cores access memory in parallel, shortens the overall memory access time, and improves memory access efficiency.
The attraction domain of this application is described in detail below with reference to Figures 7-9. For ease of description, the first attraction domain associated with the processor core 211 is used as an example.
First, the structure of the first attraction domain is described. Refer to Figure 7, which is a schematic structural diagram of an attraction domain provided by an embodiment of this application.
The first attraction domain 700 includes one or more page table entries, and the attraction domain identifier in each of these page table entries is the processor core identifier of the processor core 211.
Because the attraction domains divide the metadata in memory into multiple mutually isolated parts, the processor core 211, when performing the data processing of the first memory access request according to the physical address of the first memory space, operates only on the first metadata, that is, the metadata of the data in the memory space associated with the first attraction domain 700. The first metadata can therefore be regarded as data included in the first attraction domain 700.
Optionally, the first metadata may include an active list, a free list, and the like.
In a possible implementation, the page table entries included in the first attraction domain 700 may be page table entries stored in the address translation unit 213 or page table entries stored in the memory 220.
It should be noted that the page table entries included in the first attraction domain 700 can be adjusted dynamically. For example, the processor core 211 admits page table entries into the domain according to how frequently the memory space is accessed, that is, it adds page table entries to the first attraction domain 700, thereby expanding the first attraction domain 700.
Next, taking the first attraction domain 700 as an example, the steps for expanding an attraction domain are described in detail with reference to Figure 8.
Step 810: The processor core 211 looks up the second page table entry in the address translation unit 213 according to the second memory access request, and a TLB miss occurs.
The second memory access request instructs the processor core 211 to perform data processing on the second data stored in the second memory space that corresponds to the virtual address of the second page table entry.
Step 820: The processor core 211 obtains the second page table entry from the memory 220.
The processor core 211 uses the page table traversal unit 214 to perform a page table walk and obtain the second page table entry from the memory 220. The attraction domain identifier in the second page table entry is the initial value, which indicates that the second memory space is not associated with any attraction domain.
Step 830: When the cumulative number of times the second page table entry has been obtained is greater than a preset threshold, the processor core 211 performs the data processing of the second memory access request according to the address of the second memory space given by the second page table entry, and adds the second page table entry to the first attraction domain.
Each time the processor core 211 receives the second memory access request, it uses the page table traversal unit 214 to obtain the second page table entry from the multi-level page table in the memory 220. PML4E, PDPTE, PDE, and PTE in Figure 8 denote the data that the second memory access request finds in the PML4, PDPT, PD, and PT, respectively. The processor core 211 may use a first counter to record the cumulative number of times the page table traversal unit 214 has obtained the second page table entry from the memory 220. For example, when the page table traversal unit 214 obtains the second page table entry from the memory 220 for the first time, a register is used as the first counter, and the value of the first counter is incremented by one each time the second page table entry is obtained from the memory 220. Optionally, the preset threshold (also referred to as the first threshold) may be 5, 8, 15, or another number of times.
Optionally, adding the second page table entry to the first attraction domain 700 further includes: the processor core 211 uses the page table traversal unit 214 to copy the second page table entry and stores the copy in the address translation unit 213.
In addition, the processor core 211 may add the virtual page corresponding to the second page table entry to the active list of the first attraction domain 700. That is, the processor core 211 adds to the active list the virtual page that corresponds to the second page table entry in the processor core footprints, which record, in order, the virtual pages whose addresses the processor core has performed data processing on.
The processor core 211 may add the second page table entry to the first attraction domain by changing the attraction domain identifier of the second page table entry to the attraction domain identifier of the first attraction domain. Once the second page table entry is stored in the address translation unit 213, a later memory access by the processor core 211 that uses the same entry finds it directly in the address translation unit 213. This avoids another page table walk to fetch the second page table entry from the memory 220, reduces the computation, memory, and time overhead of page table walks, and thereby improves memory access efficiency and the utilization of the storage resources of the address translation unit 213.
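Steps 810 to 830 can be sketched as a small model. The class name `ExpandingCore`, the dictionary-based page table, the `NO_DOMAIN` initial value, and the threshold of 5 are assumptions chosen for illustration; the text only states that the identifier has an initial value and that the threshold is preset.

```python
from collections import defaultdict

NO_DOMAIN = -1  # assumption: initial value meaning "not in any attraction domain"
THRESHOLD = 5   # assumption: one of the example threshold values


class ExpandingCore:
    def __init__(self, core_id, page_table):
        self.core_id = core_id               # doubles as the attraction domain identifier
        self.page_table = page_table         # vpn -> {"pfn": int, "domain": int}
        self.tlb = {}                        # stands in for the address translation unit
        self.walk_count = defaultdict(int)   # per-page "first counter"

    def access(self, vpn):
        if vpn in self.tlb:                  # TLB hit: no walk needed
            return self.tlb[vpn]["pfn"]
        pte = self.page_table[vpn]           # step 820: page table walk
        if pte["domain"] == NO_DOMAIN:
            self.walk_count[vpn] += 1
            if self.walk_count[vpn] > THRESHOLD:  # step 830: admit into the domain
                pte["domain"] = self.core_id      # rewrite the domain identifier
                self.tlb[vpn] = dict(pte)         # install a copy in the TLB
        return pte["pfn"]
```

With these values the first five accesses to an unowned page only count walks, and the sixth access rewrites the domain identifier and installs the entry, so later accesses hit in the TLB.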
Besides expanding the first attraction domain 700, the processor core 211 may also dynamically adjust the page table entries it contains by shrinking the first attraction domain 700. The steps for shrinking the first attraction domain 700 are described in detail below with reference to Figure 9, in which ①-⑥ indicate the order in which the steps are performed.
Step 910: The processor core 211 looks up the third page table entry in the address translation unit 213 according to the third memory access request, and a TLB miss occurs.
The third memory access request is a proxy instruction sent by a processor core other than the processor core 211. It instructs the processor core 211 to perform data processing on the third data stored in the third memory space that corresponds to the virtual address of the third page table entry.
Take the processor core 215 as the core that sends the proxy instruction to the processor core 211. The processor core 215 obtains a memory access request and, using the virtual address contained in that request, fails to find the third page table entry in its own address translation unit, so a TLB miss occurs. The processor core 215 then uses its own page table traversal unit to obtain the third page table entry from the memory 220. When the attraction domain identifier of the third page table entry is the attraction domain identifier of the first attraction domain 700, the processor core 215 sends a proxy instruction to the processor core 211, and the proxy instruction contains the virtual address of the third memory space.
Step 920: The processor core 211 obtains the third page table entry from the memory 220.
The processor core 211 uses the page table traversal unit 214 to perform a page table walk and obtain the third page table entry from the multi-level page table in the memory 220. PML4E, PDPTE, PDE, and PTE in Figure 9 denote the data that the third memory access request finds in the PML4, PDPT, PD, and PT, respectively. The attraction domain identifier in the third page table entry is the attraction domain identifier of the processor core 211.
Step 930: The processor core 211 migrates the third page table entry to the attraction domain of the processor core that has sent the third memory access request the most times.
Suppose the processor core 215 is the core that has sent the third memory access request to the processor core 211 the most times. The processor core 211 may migrate the third page table entry to the attraction domain of the processor core 215 by changing the attraction domain identifier of the third page table entry to the attraction domain identifier of the attraction domain of the processor core 215.
Each time the processor core 211 receives a proxy instruction, it uses the page table traversal unit 214 to obtain the third page table entry from the memory 220. For each processor core that sends proxy instructions to the processor core 211, the processor core 211 may use a separate counter to record the number of times that core has instructed the processor core 211 to perform data processing on the third data stored in the third memory space.
The processor core 211 may use a second counter to record the number of proxy instructions received from the processor core 215. For example, when the processor core 211 receives a proxy instruction from the processor core 215 for the first time, a register is used as the second counter, and the value of the second counter is incremented by one each time a proxy instruction from the processor core 215 is received. Optionally, the second threshold may be 5, 8, 15, or another number of times.
Optionally, migrating the third page table entry to the attraction domain of the processor core 215 further includes: the processor core 211 uses the page table traversal unit 214 to copy the third page table entry and stores the copy in the address translation unit 213.
In addition, the processor core 211 may add the virtual page corresponding to the third page table entry to the active list of the attraction domain of the processor core 215. That is, the processor core 215 adds to the active list the virtual page that corresponds to the third page table entry in the processor core footprints.
In this way, the processor core 211 migrates the third page table entry between attraction domains. When the processor core 215 next performs data processing on the third memory space, it does not need to send a proxy instruction to the processor core 211. Instead, it performs the data processing according to the third page table entry in its own address translation unit, which improves memory access efficiency and reduces the inter-core communication overhead. Furthermore, migration and domain entry together adjust the attraction domains dynamically, so that the page table entries in an attraction domain are those its processor core accesses most frequently, which improves the utilization of the storage resources of the address translation unit.
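The migration of step 930 can be sketched as below. The text does not spell out exactly when the migration fires, so this sketch assumes it fires once the most frequent requester's proxy count exceeds the second threshold; the function name `handle_proxy`, the dictionary entry layout, and the threshold of 5 are likewise assumptions.

```python
from collections import Counter

SECOND_THRESHOLD = 5  # assumption: one of the example threshold values


def handle_proxy(pte, owner_id, requester_id, counters):
    """Model of the owner core handling one proxy instruction for `pte`.

    `counters` is a Counter with one count per requesting core, as in the
    per-core counters described in the text. Returns the identifier of the
    domain that owns the entry after this proxy instruction.
    """
    if pte["domain"] != owner_id:
        raise ValueError("entry does not belong to the owner's attraction domain")
    counters[requester_id] += 1
    top_core, top_count = counters.most_common(1)[0]
    if top_count > SECOND_THRESHOLD:  # assumed trigger condition
        pte["domain"] = top_core      # migrate by rewriting the domain identifier
        counters.clear()
        return top_core
    return owner_id
```

After the migration the requesting core finds its own identifier in the entry on its next page table walk, so it can install the entry locally and stop sending proxy instructions.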
Besides the fact that a large number of page table entries occupy the storage resources of the address translation unit, there is a further problem: after data is swapped in or out between the memory 220 and the hard disk, all processor cores update the page table entries in their connected address translation units, which triggers a TLB shootdown that suspends the threads running on all processor cores, so the computing resources of the processor cores are poorly utilized. To improve that utilization, in this embodiment each processor core periodically updates its own address translation unit.
Taking the processor core 211 as an example, the processor core 211 uses the page table traversal unit 214 to update the address translation unit 213 periodically. The page table traversal unit 214 determines the attraction domain identifiers of all page table entries in the address translation unit 213, keeps the entries whose attraction domain identifier is that of the first attraction domain 700, and deletes the entries whose attraction domain identifier is not that of the first attraction domain 700.
Any two adjacent update periods of the address translation unit 213 may be equal or unequal. A period may last 10 seconds, 20 seconds, 30 seconds, or any other length of time.
In this way, the processor core 211 uses the page table traversal unit 214 to refresh the page table entries, which prevents lookups in the address translation unit 213 from returning page table entries that do not belong to the first attraction domain and reduces the number of TLB shootdowns triggered. This improves the utilization of the computing resources of the processor cores and the memory access performance of the processor 210.
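The periodic refresh amounts to a filter over the cached entries. A minimal sketch, assuming the address translation unit is modeled as a dictionary from virtual page number to an entry with a `"domain"` field (both names are illustrative):

```python
def refresh_tlb(tlb, own_domain_id):
    """Delete cached entries that do not belong to own_domain_id.

    Returns the number of entries removed. The caller decides how often to
    invoke this; per the text, the period need not be constant.
    """
    stale = [vpn for vpn, pte in tlb.items() if pte["domain"] != own_domain_id]
    for vpn in stale:
        del tlb[vpn]
    return len(stale)
```

Collecting the stale keys first avoids mutating the dictionary while iterating over it.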
It should be noted that in this embodiment neither the processor core 211 nor the processor core 215 initiates a TLB shootdown when a lookup in its address translation unit results in a TLB miss, which further improves memory access efficiency.
The data processing method provided by this embodiment is described in detail above with reference to Figures 1 to 9. The data processing device provided by this embodiment is described below with reference to Figure 10.
Figure 10 is a schematic structural diagram of a possible data processing device provided by this embodiment. Such a data processing device can implement the functions of the processor or processor core in the foregoing method embodiments and can therefore also achieve the beneficial effects of those embodiments.
As shown in Figure 10, the data processing device 1000 includes a transceiver module 1010 and a processing module 1020. The data processing device 1000 implements the functions of the processor core 211 in the method embodiments shown in Figure 3, Figure 8, or Figure 9.
The transceiver module 1010 is configured to obtain a first memory access request. The first memory access request instructs the first processor core to perform data processing on data stored in a first memory space, and the first memory space is storage space of the memory associated with the first attraction domain of the first processor core.
The processing module 1020 is configured to determine the address of the first memory space and to perform the data processing of the first memory access request according to the address of the first memory space.
In a possible implementation, the first attraction domain includes an association relationship between a memory space address and an attraction domain identifier. The attraction domain identifier indicates the attraction domain to which the association relationship belongs, and the memory space address is the address of the memory space associated with that attraction domain.
For example, the association relationship is a page table entry (PTE), and the attraction domain identifier is placed in reserved bits of the page table entry. Because the attraction domain identifier is recorded in reserved bits that the page table entry already has, the processor can recognize the attraction domain identifier without any change to its microarchitecture.
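Storing the identifier in otherwise unused bits of a page table entry can be sketched with plain bit masking. The text does not say which bits are used, so the field position (`DOMAIN_SHIFT = 52`) and width (`DOMAIN_BITS = 7`) below are hypothetical values picked only to make the example concrete.

```python
DOMAIN_SHIFT = 52  # hypothetical position of the identifier field
DOMAIN_BITS = 7    # hypothetical width of the identifier field
DOMAIN_MASK = ((1 << DOMAIN_BITS) - 1) << DOMAIN_SHIFT


def set_domain(pte, domain_id):
    """Return `pte` with its identifier field replaced by `domain_id`."""
    if not 0 <= domain_id < (1 << DOMAIN_BITS):
        raise ValueError("domain_id does not fit in the identifier field")
    return (pte & ~DOMAIN_MASK) | (domain_id << DOMAIN_SHIFT)


def get_domain(pte):
    """Extract the attraction domain identifier from `pte`."""
    return (pte & DOMAIN_MASK) >> DOMAIN_SHIFT
```

Only the identifier field changes, so the frame number and flag bits of the entry read back exactly as before.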
As another example, the attraction domain identifier includes a processor core identifier. The processor core identifier in an association relationship indicates that the association relationship belongs to the attraction domain of the processor core that the identifier denotes.
In one possible implementation, the first memory access request is generated by the first processor core when a thread executed by the first processor core needs to read or write data in the first memory space.
In another possible implementation, the first memory access request is generated by a processor core other than the first processor core, and sent to the first processor core, when a thread executed by that other processor core needs to read or write data in the first memory space.
The data processing device 1000 can also dynamically adjust the association relationships contained in an attraction domain.
For example, the transceiver module 1010 is further configured to obtain a second memory access request, and the processing module 1020 is configured to perform steps 810 to 830 in Figure 8.
As another example, the transceiver module 1010 is further configured to obtain a third memory access request, and the processing module 1020 is configured to perform steps 910 to 930 in Figure 9.
In a possible implementation, the association relationships of the first attraction domain may be stored in a storage unit connected to the first processor core, and the storage unit is used to store and look up the association relationships contained in the first attraction domain. The first memory access request includes the virtual address of the first memory space, and the memory space address of the association relationship includes the physical address of the first memory space. When determining the address of the first memory space, the processing module 1020 is specifically configured to: look up the association relationships in the storage unit according to the virtual address of the first memory space to obtain a first association relationship, and determine the physical address of the first memory space according to the first association relationship.
Optionally, the storage unit may be an address translation unit.
In addition, the processing module 1020 is further configured to delete, at every preset period, the association relationships in the storage unit that do not belong to the first attraction domain.
It should be understood that the data processing device 1000 in the embodiments of this application may be implemented by an application-specific integrated circuit (ASIC) or by a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
The data processing device 1000 according to the embodiments of this application may correspondingly perform the methods described in the embodiments of this application, and the foregoing and other operations and/or functions of the units in the data processing device 1000 implement the corresponding procedures of the methods in Figure 3, Figure 8, or Figure 9. For brevity, details are not repeated here.
The method steps in this embodiment may be implemented in hardware or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from and write information to the storage medium. The storage medium may also be a component of the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a computing device. The processor and the storage medium may also exist as discrete components in a computing device.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable device. The computer programs or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless manner. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium, for example a floppy disk, a hard disk, or a magnetic tape; an optical medium, for example a digital video disc (DVD); or a semiconductor medium, for example a solid state drive (SSD).
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and such modifications or replacements shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (14)

  1. A data processing method, characterized in that the method is applicable to a computer system comprising a processor and a memory, and the method is performed by a first processor core of the processor and comprises:
    obtaining a first memory access request, wherein the first memory access request instructs the first processor core to perform data processing on data stored in a first memory space, and the first memory space is storage space of the memory associated with a first attraction domain of the first processor core;
    determining an address of the first memory space; and
    performing the data processing of the first memory access request according to the address of the first memory space.
  2. The method according to claim 1, characterized in that the first attraction domain comprises association relationships between memory space addresses and attraction domain identifiers, wherein an attraction domain identifier indicates the attraction domain to which an association relationship belongs, and a memory space address is the address of the memory space associated with the attraction domain to which the association relationship belongs.
  3. The method according to claim 2, characterized in that the method further comprises:
    obtaining a second memory access request, wherein the second memory access request instructs the first processor core to perform data processing on data stored in a second memory space, and the second memory space is not associated with any attraction domain; and
    if the cumulative number of times the first processor core has obtained the second memory access request is greater than a preset threshold, adding an association relationship for the address of the second memory space to the first attraction domain.
  4. The method according to claim 2, characterized in that a storage unit is connected to the first processor core, the storage unit is configured to store and query the association relationships contained in the first attraction domain, the first memory access request comprises a virtual address of the first memory space, the memory space address in the association relationship comprises a physical address of the first memory space, and determining the address of the first memory space comprises:
    querying the storage unit for an association relationship according to the virtual address of the first memory space, to obtain a first association relationship; and
    determining the physical address of the first memory space according to the first association relationship.
  5. The method according to claim 4, characterized in that the method further comprises:
    at intervals of a preset period, deleting from the storage unit the association relationships that do not belong to the first attraction domain.
  6. The method according to any one of claims 2 to 5, characterized in that the association relationship is a page table entry, and the attraction domain identifier is set in reserved bits of the page table entry.
  7. A data processing apparatus, characterized by comprising:
    a transceiver module, configured to obtain a first memory access request, wherein the first memory access request instructs the first processor core to perform data processing on data stored in a first memory space, and the first memory space is storage space of the memory associated with a first attraction domain of the first processor core; and
    a processing module, configured to determine an address of the first memory space;
    wherein the processing module is further configured to perform the data processing of the first memory access request according to the address of the first memory space.
  8. The apparatus according to claim 7, characterized in that the first attraction domain comprises association relationships between memory space addresses and attraction domain identifiers, wherein an attraction domain identifier indicates the attraction domain to which an association relationship belongs, and a memory space address is the address of the memory space associated with the attraction domain to which the association relationship belongs.
  9. The apparatus according to claim 8, characterized in that the transceiver module is further configured to obtain a second memory access request, wherein the second memory access request instructs the first processor core to perform data processing on data stored in a second memory space, and the second memory space is not associated with any attraction domain; and
    the processing module is further configured to add an association relationship for the address of the second memory space to the first attraction domain when the cumulative number of times the first processor core has obtained the second memory access request is greater than a preset threshold.
  10. The apparatus according to claim 8, characterized in that a storage unit is connected to the first processor core, the storage unit is configured to store and query the association relationships contained in the first attraction domain, the first memory access request comprises a virtual address of the first memory space, the memory space address in the association relationship comprises a physical address of the first memory space, and the processing module is further configured to:
    query the storage unit for an association relationship according to the virtual address of the first memory space, to obtain a first association relationship; and
    determine the physical address of the first memory space according to the first association relationship.
  11. The apparatus according to claim 10, characterized in that the processing module is further configured to:
    at intervals of a preset period, delete from the storage unit the association relationships that do not belong to the first attraction domain.
  12. The apparatus according to any one of claims 8 to 11, characterized in that the association relationship is a page table entry, and the attraction domain identifier is set in reserved bits of the page table entry.
  13. A processor, characterized in that the processor comprises a processor core, and the processor core is configured to perform the operation steps of the method according to any one of claims 1 to 6.
  14. A computing system, characterized in that the computer system comprises a memory and the processor according to claim 13, wherein the processor is configured to perform the operation steps of the method according to any one of claims 1 to 6, so as to perform the data processing of a memory access request on an address of storage space of the memory.
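
The method claims above can be illustrated with a minimal software sketch. It is not the applicant's implementation: the bit positions, the 4 KiB page size, the use of identifier 0 to mean "not associated with any attraction domain", and every name below (`make_entry`, `Core`, `translate`, `periodic_cleanup`, and so on) are hypothetical choices made only for illustration. The sketch models the page table entry that carries an attraction domain identifier in reserved bits (claim 6), the lookup from virtual to physical address through a per-core storage unit (claim 4), the threshold-based addition of an unassociated memory space to the core's domain (claim 3), and the periodic deletion of entries that do not belong to that domain (claim 5).

```python
from dataclasses import dataclass, field
from typing import Dict

# All constants are illustrative assumptions, not values taken from the application.
PAGE_SHIFT = 12       # assume 4 KiB pages
DOMAIN_SHIFT = 52     # hypothetical reserved bits of a 64-bit page table entry
DOMAIN_MASK = 0x7F    # hypothetical 7-bit attraction domain identifier
NO_DOMAIN = 0         # assumption: 0 means "not associated with any domain"


def make_entry(pfn: int, domain_id: int = NO_DOMAIN) -> int:
    """Pack a physical frame number and a domain identifier into one entry."""
    return (pfn << PAGE_SHIFT) | ((domain_id & DOMAIN_MASK) << DOMAIN_SHIFT)


def entry_domain(entry: int) -> int:
    """Read the attraction domain identifier from the reserved bits."""
    return (entry >> DOMAIN_SHIFT) & DOMAIN_MASK


def entry_pfn(entry: int) -> int:
    """Read the physical frame number, ignoring the reserved bits."""
    return (entry & ((1 << DOMAIN_SHIFT) - 1)) >> PAGE_SHIFT


@dataclass
class Core:
    domain_id: int                       # attraction domain of this core
    page_table: Dict[int, int]           # virtual page number -> entry
    threshold: int = 3                   # preset threshold of claim 3
    storage_unit: Dict[int, int] = field(default_factory=dict)
    access_counts: Dict[int, int] = field(default_factory=dict)

    def translate(self, vaddr: int) -> int:
        """Resolve a virtual address to a physical address (claim 4)."""
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        entry = self.storage_unit.get(vpn)
        if entry is None:
            entry = self.page_table[vpn]
        if entry_domain(entry) == NO_DOMAIN:
            # Count accesses to memory not associated with any domain (claim 3).
            self.access_counts[vpn] = self.access_counts.get(vpn, 0) + 1
            if self.access_counts[vpn] > self.threshold:
                # Add the association to this core's attraction domain.
                entry = make_entry(entry_pfn(entry), self.domain_id)
                self.page_table[vpn] = entry
        self.storage_unit[vpn] = entry
        return (entry_pfn(entry) << PAGE_SHIFT) | offset

    def periodic_cleanup(self) -> None:
        """Delete stored entries outside this core's domain (claim 5)."""
        self.storage_unit = {
            vpn: e for vpn, e in self.storage_unit.items()
            if entry_domain(e) == self.domain_id
        }
```

In this sketch a caller would build a page table with `make_entry`, construct one `Core` per processor core, call `translate` for each memory access request, and invoke `periodic_cleanup` from a timer at the preset period. A dictionary stands in for the storage unit purely for readability; in hardware the corresponding structure would more plausibly resemble a translation lookaside buffer.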
PCT/CN2023/093749 2022-05-12 2023-05-12 Data processing method and device, processor and computer system WO2023217255A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210514855.0 2022-05-12
CN202210514855.0A CN117093132A (en) 2022-05-12 2022-05-12 Data processing method, device, processor and computer system

Publications (1)

Publication Number Publication Date
WO2023217255A1 true WO2023217255A1 (en) 2023-11-16

Family

ID=88729756

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/093749 WO2023217255A1 (en) 2022-05-12 2023-05-12 Data processing method and device, processor and computer system

Country Status (2)

Country Link
CN (1) CN117093132A (en)
WO (1) WO2023217255A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105579977A (en) * 2014-09-01 2016-05-11 华为技术有限公司 File access method, device and storage system
US20200204356A1 (en) * 2018-12-20 2020-06-25 Ido Ouziel Restricting usage of encryption keys by untrusted software
CN111353157A (en) * 2018-12-20 2020-06-30 英特尔公司 Restricting use of encryption keys by untrusted software

Also Published As

Publication number Publication date
CN117093132A (en) 2023-11-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23803024

Country of ref document: EP

Kind code of ref document: A1