WO2021072721A1 - Address translation method and device - Google Patents

Address translation method and device

Info

Publication number
WO2021072721A1
Authority
WO
WIPO (PCT)
Prior art keywords
address
storage space
subpage
virtual
tables
Prior art date
Application number
PCT/CN2019/111777
Other languages
English (en)
French (fr)
Inventor
刘君龙
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201980101414.4A (patent CN114556881B)
Priority to EP19949260.4A (patent EP4036741A1)
Priority to PCT/CN2019/111777 (patent WO2021072721A1)
Publication of WO2021072721A1
Priority to US17/720,858 (patent US20220245067A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1009 Address translation using page tables, e.g. page table structures
    • G06F12/109 Address translation for multiple virtual address spaces, e.g. segmentation
    • G06F12/1072 Decentralised address translation, e.g. in distributed shared memory systems
    • G06F12/1081 Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1016 Performance improvement
    • G06F2212/1024 Latency reduction
    • G06F2212/15 Use in a specific computing environment
    • G06F2212/154 Networked environment
    • G06F2212/65 Details of virtual memory and virtual address translation
    • G06F2212/657 Virtual address space management

Definitions

  • the embodiments of the present application relate to the computer field, and in particular to an address translation method and device.
  • PCIe: peripheral component interconnect express, a high-speed serial computer expansion bus standard
  • DMA: Direct Memory Access
  • ATS: Address Translation Service
  • the page table is used to indicate the mapping relationship between virtual addresses and physical addresses. Therefore, it is convenient for the PCIe device to determine the physical address corresponding to the virtual address of the first virtual storage space according to the information of the page table, and access the memory of the processor according to the physical address.
  • PCIe Base Specification Revision 4.0 and PCIe Base Specification Revision 5.0 both make the following provisions for how the processor handles address translation requests: 1) one address translation request message can be answered with one address translation response message or with two address translation response messages; 2) when one address translation response message contains two or more page tables, the page tables are the same size. Address translation here refers to the process of converting a virtual address in a virtual storage space into a physical address in memory.
  • The processor feeds back information about one page table to the PCIe device (as shown in Figure 1), or the processor feeds back information about some of the page tables (more than one page table but fewer than all the page tables determined by the processor) to the PCIe device (as shown in Figure 2). Therefore, the sum of the sizes of the page tables fed back is smaller than the size of the first virtual storage space.
  • the processor can only feed back the physical addresses corresponding to part of the virtual addresses in the first virtual storage space, and cannot feed back all the information of the multiple page tables to the PCIe device.
  • the PCIe device needs to send at least two address translation request messages to obtain the information of all page tables in the first virtual storage space. As a result, the delay of the PCIe device requesting the translation address is relatively large, and the bandwidth occupancy rate between the PCIe device and the processor is relatively large.
  • the embodiments of the present application provide an address translation method and device, which solves the problem of a large delay in obtaining a page table by a PCIe device.
  • an address translation device includes an interface and an address translation unit.
  • The interface is used to receive an address translation request message from the PCIe device. The address translation request message includes the first virtual address and the size of the first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space.
  • The address translation unit is used to determine P sub-page tables of the same size according to the first virtual storage space. Each sub-page table is used to indicate the mapping relationship between virtual addresses and physical addresses, the sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, P is an integer, and P ≥ 1.
  • The interface is also used to send one address translation response message to the PCIe device. The address translation response message includes the physical start addresses of the P sub-page tables and the size of the P sub-page tables.
  • The address translation device provided by the embodiment of the present application can determine P sub-page tables of the same size according to the first virtual storage space, where the sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, and feeds back the physical start addresses of the P sub-page tables and the size of the P sub-page tables through one address translation response message. Therefore, the PCIe device only needs to send one address translation request message, and can obtain the physical addresses corresponding to all virtual addresses in the first virtual storage space through one address translation response message. While complying with the provisions of the PCIe Base Specification, this effectively reduces both the delay of the PCIe device's address translation request and the bandwidth occupied between the PCIe device and the processor.
  • the address translation unit is used to determine P subpage tables of the same size according to the first virtual storage space and the smallest translation unit.
  • The address translation unit is configured to: determine N page tables according to the first virtual address, where the first virtual storage space at least partially overlaps the virtual storage space indicated by each of the N page tables, N is an integer, and N ≥ 1; divide the N page tables according to the minimum translation unit to obtain M sub-page tables, where each of the M sub-page tables has the same size, M is an integer, M ≥ 1, and the first virtual storage space includes at least one minimum translation unit; and determine the P sub-page tables from the M sub-page tables, where 1 ≤ P ≤ M. Because the virtual start address of the first virtual storage space and the virtual start addresses of the page tables are aligned with the minimum translation unit, dividing the first virtual storage space and the page tables according to the minimum translation unit yields sub-page tables of the same size that completely overlap the first virtual storage space, so that the processor can feed back the physical addresses corresponding to all virtual addresses in the first virtual storage space to the PCIe device.
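  • The three steps above (find the N page tables, cut them into M sub-page tables of one minimum translation unit each, keep the P that fall inside the requested space) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `PageTable` record, the function name, and the flat list of page tables are all hypothetical, and every size is assumed to be a multiple of the STU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageTable:
    v_addr: int  # virtual start address (assumed STU-aligned)
    p_addr: int  # physical start address
    size: int    # size in bytes (assumed to be a multiple of the STU)


def select_sub_page_tables(page_tables, s_addr, size, stu):
    """Return the physical start address of every STU-sized sub-page table
    lying inside [s_addr, s_addr + size), in virtual address order."""
    if s_addr % stu or size % stu:
        raise ValueError("request must be aligned to the minimum translation unit")
    selected = {}
    for table in page_tables:                   # the N page tables
        for off in range(0, table.size, stu):   # cut into the M sub-page tables
            va = table.v_addr + off
            if s_addr <= va < s_addr + size:    # keep the P overlapping ones
                selected[va] = table.p_addr + off
    if len(selected) * stu != size:
        raise LookupError("part of the requested space is not mapped")
    return [selected[va] for va in sorted(selected)]


# Two page tables of different sizes; the request straddles both.
tables = [PageTable(0x10000, 0x80000, 0x4000), PageTable(0x14000, 0x40000, 0x2000)]
starts = select_sub_page_tables(tables, s_addr=0x12000, size=0x4000, stu=0x1000)
print([hex(a) for a in starts])
```

  • In this example P = 4 and the four sub-page tables together cover exactly the 16 KB that was requested, even though the two underlying page tables differ in size and are not physically contiguous.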
  • The address translation unit is configured to determine the N page tables that overlap the first virtual storage space according to the first virtual address and the size of the page table.
  • The part of the virtual storage space indicated by the i-th page table that overlaps the first virtual storage space is expressed by the following formula: (p_addr+(s_addr+D*STU)-v_addr) ~ (p_addr+min(Xi,U)), where p_addr represents the physical start address of the i-th page table, s_addr represents the first virtual address, and D represents the number of minimum translation units contained in the part where the first through (i-1)-th page tables intersect the first virtual storage space.
  • STU represents the smallest translation unit
  • v_addr represents the virtual start address of the i-th page table
  • Xi represents the size of the i-th page table
  • U represents the untranslated space size in the first virtual storage space
  • i is an integer, i ∈ [1,N].
  • The physical start addresses of the P sub-page tables included in the one address translation response message are sorted. This prevents the sub-page tables from being out of order, which would cause the PCIe device to determine a wrong physical address and access the wrong physical storage space.
  • the physical start addresses of the P subpage tables may be sorted according to the virtual address sequence of the first virtual storage space.
  • the virtual address of the first virtual storage space is a virtual address obtained by dividing the first virtual storage space according to the smallest translation unit.
  • the attributes of the subpage table are the same as the attributes of the page table to which the subpage table belongs.
  • The attributes of a sub-page table or page table include the attributes, in the system, of the memory space indicated by that sub-page table or page table.
  • an address translation method is provided.
  • the method can be applied to a processor, or the method can be applied to a communication device that can support the processor to implement the method, for example, the communication device includes a chip system.
  • The method includes: receiving one address translation request message, determining P sub-page tables of the same size according to the first virtual storage space, and sending one address translation response message, where the address translation response message includes the physical start addresses of the P sub-page tables and the size of the P sub-page tables.
  • the one address translation request message includes the first virtual address and the size of the first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space.
  • Each subpage table is used to indicate the mapping relationship between virtual addresses and physical addresses.
  • The sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, P is an integer, and P ≥ 1.
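  • The contents of the two messages and the size constraint between them can be modelled roughly as below. The class and field names are hypothetical and chosen only for illustration; they are not the PCIe wire format, which this text does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationRequest:
    first_virtual_address: int  # virtual start address of the first virtual storage space
    size: int                   # size of the first virtual storage space, in bytes


@dataclass(frozen=True)
class TranslationResponse:
    physical_starts: tuple      # one physical start address per sub-page table
    sub_page_size: int          # common size of the P sub-page tables, in bytes


def covers(request: TranslationRequest, response: TranslationResponse) -> bool:
    """True if P >= 1 and the P equal-size sub-page tables sum to the requested size."""
    p = len(response.physical_starts)
    return p >= 1 and p * response.sub_page_size == request.size


request = TranslationRequest(first_virtual_address=0x12000, size=0x4000)
full = TranslationResponse((0x82000, 0x83000, 0x40000, 0x41000), 0x1000)
partial = TranslationResponse((0x82000, 0x83000), 0x1000)
print(covers(request, full), covers(request, partial))
```

  • A response that fails `covers` corresponds to the partial feedback described for the prior art, where the device would have to send a further request.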
  • The processor can determine P sub-page tables of the same size according to the first virtual storage space, where the sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, and feed back the physical start addresses of the P sub-page tables and the size of the P sub-page tables through one address translation response message. Therefore, the PCIe device only needs to send one address translation request message, and can obtain the physical addresses corresponding to all virtual addresses in the first virtual storage space through one address translation response message. While complying with the provisions of the PCIe Base Specification, this effectively reduces both the delay of the PCIe device's address translation request and the bandwidth occupied between the PCIe device and the processor.
  • determining P subpage tables with the same size according to the first virtual storage space includes: determining P subpage tables with the same size according to the first virtual storage space and the smallest translation unit.
  • Determining P sub-page tables of the same size according to the first virtual storage space and the minimum translation unit includes: determining N page tables according to the first virtual address, where the first virtual storage space at least partially overlaps the virtual storage space indicated by each of the N page tables, N is an integer, and N ≥ 1; dividing the N page tables according to the minimum translation unit to obtain M sub-page tables, where each of the M sub-page tables has the same size, M is an integer, M ≥ 1, and the first virtual storage space includes at least one minimum translation unit; and determining the P sub-page tables from the M sub-page tables, where 1 ≤ P ≤ M.
  • Dividing the first virtual storage space and the page tables according to the minimum translation unit yields sub-page tables of the same size that completely overlap the first virtual storage space, so that the processor can feed back the physical addresses corresponding to all virtual addresses in the first virtual storage space to the PCIe device.
  • determining the N page tables according to the first virtual address includes: determining the N page tables that overlap the first virtual storage space according to the first virtual address and the size of the page table.
  • The part of the virtual storage space indicated by the i-th page table that overlaps the first virtual storage space is expressed by the following formula: (p_addr+(s_addr+D*STU)-v_addr) ~ (p_addr+min(Xi,U)), where p_addr represents the physical start address of the i-th page table, s_addr represents the first virtual address, and D represents the number of minimum translation units contained in the part where the first through (i-1)-th page tables intersect the first virtual storage space.
  • STU represents the smallest translation unit
  • v_addr represents the virtual start address of the i-th page table
  • Xi represents the size of the i-th page table
  • U represents the untranslated space size in the first virtual storage space
  • i is an integer, i ∈ [1,N].
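  • As a worked illustration, the expression can be transcribed literally and evaluated with concrete numbers. The sketch assumes that D is a count of minimum translation units already covered by earlier page tables (so that D*STU is a byte offset) and that the two sides of the expression are the start and end of a physical address range. The function name and the sample addresses are hypothetical.

```python
def overlap_physical_range(p_addr, v_addr, x_i, s_addr, d, stu, u):
    """Evaluate (p_addr + (s_addr + D*STU) - v_addr) ~ (p_addr + min(Xi, U))
    exactly as written, returning (start, end) of the physical range."""
    start = p_addr + (s_addr + d * stu) - v_addr
    end = p_addr + min(x_i, u)
    return start, end


STU = 0x1000
# i = 1: a 16 KB page table at virtual 0x10000 / physical 0x80000; the request
# starts at 0x11000, nothing is translated yet (D = 0), and U = 0x5000 remains.
first = overlap_physical_range(0x80000, 0x10000, 0x4000, 0x11000, 0, STU, 0x5000)
# i = 2: an 8 KB page table at virtual 0x14000 / physical 0x40000; the first
# table contributed 3 STUs (D = 3), so U = 0x2000 remains.
second = overlap_physical_range(0x40000, 0x14000, 0x2000, 0x11000, 3, STU, 0x2000)
print([hex(a) for a in first], [hex(a) for a in second])
```

  • With these inputs the first page table contributes three minimum translation units and the second contributes two, which together account for the 0x5000 bytes requested.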
  • The physical start addresses of the P sub-page tables included in the one address translation response message are sorted. This prevents the sub-page tables from being out of order, which would cause the PCIe device to determine a wrong physical address and access the wrong physical storage space.
  • the physical start addresses of the P subpage tables may be sorted according to the virtual address sequence of the first virtual storage space.
  • the virtual address of the first virtual storage space is a virtual address obtained by dividing the first virtual storage space according to the smallest translation unit.
  • the attributes of the subpage table are the same as the attributes of the page table to which the subpage table belongs.
  • The attributes of a sub-page table or page table include the attributes, in the system, of the memory space indicated by that sub-page table or page table.
  • FIG. 1 is a schematic diagram of a feedback page table provided by the prior art
  • Fig. 2 is a schematic diagram of another feedback page table provided by the prior art
  • FIG. 3 is a schematic diagram of the composition of a PCIe system provided by an embodiment of the application.
  • FIG. 5 is a flowchart of an address translation method provided by an embodiment of this application.
  • FIG. 6 is a flowchart of another address translation method provided by an embodiment of the application.
  • FIG. 7 is an example diagram of the correspondence between virtual storage space and page table provided by an embodiment of the application.
  • FIG. 8 is a flowchart of yet another address translation method provided by an embodiment of this application.
  • FIG. 9 is an example diagram of a page table division provided by an embodiment of this application.
  • FIG. 10 is a flowchart of a method for merging P subpage tables according to an embodiment of the application.
  • FIG. 11 is a schematic diagram of an address translation process provided by an embodiment of this application.
  • Figure 12a is a schematic diagram of another address translation process provided by an embodiment of this application.
  • Figure 12b is a schematic diagram of yet another address translation process provided by an embodiment of this application.
  • FIG. 13 is a schematic diagram of yet another address translation process provided by an embodiment of this application.
  • FIG. 14 is a schematic diagram of yet another address translation process provided by an embodiment of this application.
  • Figures 15 to 17 are schematic diagrams of a process of merging subpage tables provided by an embodiment of the application.
  • Words such as “exemplary” or “for example” are used to present examples, illustrations, or explanations. Any embodiment or design solution described as “exemplary” or “for example” in the embodiments of the present application should not be construed as being more preferable or advantageous than other embodiments or design solutions. Rather, words such as “exemplary” or “for example” are used to present related concepts in a specific manner.
  • the virtual storage space may refer to a collection of virtual addresses generated by the processor.
  • the virtual storage space may also be referred to as a virtual address space.
  • Virtual addresses can also be called logical addresses.
  • the virtual address may be an address generated by the processor.
  • the virtual address generated by the processor includes the page number and the page offset.
  • the page table contains the base address of each page in physical memory.
  • the page number can be used as the index of the page table.
  • the base address found through the page number, combined with the page offset, can be used to determine the physical address of the memory.
  • the size of the virtual storage space can be expressed as 2 to the power of m (that is, 2^m).
  • the size of the page table can be expressed as 2 to the power of n (that is, 2^n).
  • the high (m-n) bits of the virtual address represent the page number, and the low n bits represent the page offset.
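  • The bit split described above can be shown with a short sketch. The function name and the sample values (m = 32, n = 12) are assumptions chosen for illustration only.

```python
def split_virtual_address(vaddr: int, m: int, n: int):
    """Split an m-bit virtual address into (page number, page offset)."""
    if not 0 <= vaddr < (1 << m):
        raise ValueError("address lies outside the 2**m virtual storage space")
    page_number = vaddr >> n               # high (m - n) bits
    page_offset = vaddr & ((1 << n) - 1)   # low n bits
    return page_number, page_offset


page_number, page_offset = split_virtual_address(0x12345, m=32, n=12)
print(hex(page_number), hex(page_offset))
```

  • With n = 12 the offset spans 4 KB, which matches the 4 KB example size given elsewhere in this document for the minimum translation unit.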
  • the physical address space may refer to a collection of physical addresses in the memory corresponding to the virtual addresses.
  • the physical address can be the address of the memory.
  • the processor only generates virtual addresses, and considers the virtual storage space of the process to be 0 to the maximum value.
  • the physical address space corresponding to the virtual addresses of the virtual storage space may be composed of multiple smaller sub-physical address spaces. These sub-physical address spaces are not necessarily contiguous, and the sum of the sizes of the corresponding sub-physical address spaces is equal to the size of the virtual storage space.
  • the physical address range corresponding to each sub-physical address space is R+0 to R+size, and R is the base address corresponding to this physical address space. The R of different sub-physical address spaces is different.
  • the physical address space may also be referred to as physical storage space.
  • Address mapping is the process of converting virtual addresses in the virtual storage space into physical addresses in the memory.
  • the address mapping may be performed by a memory management unit (Memory Management Unit, MMU) in the processor.
  • the so-called PCIe device may refer to a device that communicates with other devices through the PCIe bus according to the PCIe protocol.
  • IOMMU I/O Memory Management Unit
  • the I/O memory management unit is also called the System Memory Management Unit (SMMU).
  • the page table is a special data structure stored in the page table area of the system space.
  • the page table is a conversion relationship table used to convert a virtual address to a physical address. When accessing a virtual address, the computer finds the corresponding physical address through the page table for access. Therefore, the page table indicates the correspondence between the virtual address and the physical address.
  • an embodiment of the present application provides an address translation method, which includes: after receiving one address translation request message, determining P sub-page tables of the same size according to the first virtual storage space, and sending one address translation response message, which includes the physical start addresses of the P sub-page tables and the size of the P sub-page tables.
  • the one address translation request message includes the first virtual address and the size of the first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space.
  • Each sub-page table is used to indicate the mapping relationship between the virtual address and the physical address, the sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, P is an integer, and P ≥ 1.
  • P is a positive integer greater than 1, that is, multiple sub-page tables of the same size are determined; of course, this application does not exclude the case where P is equal to 1.
  • The processor can determine P sub-page tables of the same size according to the first virtual storage space, where the sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, and feed back the physical start addresses of the P sub-page tables and the size of the P sub-page tables through one address translation response message. Therefore, the PCIe device only needs to send one address translation request message, and can obtain the physical addresses corresponding to all virtual addresses in the first virtual storage space through one address translation response message. While complying with the provisions of the PCIe Base Specification, this effectively reduces both the delay of the PCIe device's address translation request and the bandwidth occupied between the PCIe device and the processor.
  • FIG. 3 is a schematic diagram of the composition of a PCIe system provided by an embodiment of the application.
  • the PCIe system 300 may include a processor 301, a root complex 302, a switch 303, endpoints (Endpoint) 304, and a bridge (PCIe bridge) 305.
  • the root complex 302 is used to connect the processor 301 and input/output I/O devices.
  • the switch 303 supports peer-to-peer communication between different endpoints 304.
  • the bridge 305 is used to connect PCIe with other PCI bus standards (such as PCI/PCI-X).
  • the endpoint 304 may be a PCIe endpoint device or a PCIe device, such as a PCIe interface network card device, a serial port card device, and a memory card device.
  • Figure 3 is only a schematic diagram, and the structure of the PCIe system shown does not constitute a limitation on the PCIe system.
  • the PCIe system may also include more or fewer components than shown in the figure, or a combination of certain components, or different component arrangements.
  • memory 306 and PCIe bus 307 may also be included.
  • the embodiment of the present application does not limit the number of endpoints and processors included in the PCIe system.
  • the processor 301 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • the general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application can be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
  • the processor 301 is configured to, after receiving one address translation request message, determine P sub-page tables of the same size according to the first virtual storage space, and send one address translation response message, where the address translation response message includes the physical start addresses of the P sub-page tables and the size of the P sub-page tables.
  • the one address translation request message includes the first virtual address and the size of the first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space.
  • Each sub-page table is used to indicate the mapping relationship between the virtual address and the physical address, the sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, P is an integer, and P ≥ 1.
  • the PCIe device 304 is used to send an address translation request message and receive an address translation response message, and access the physical storage space corresponding to the first virtual storage space according to the physical start address of the P subpage tables and the size of the P subpage tables (For example: memory 306).
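  • On the device side, turning the response into a physical address for a given virtual address can be sketched as follows. The sketch assumes that the physical start addresses arrive sorted in virtual address order, as this document states elsewhere, and that each sub-page table is physically contiguous. The function and parameter names are hypothetical.

```python
def translate(vaddr, s_addr, physical_starts, sub_page_size):
    """Map a virtual address inside the first virtual storage space to a
    physical address, using the contents of one translation response."""
    offset = vaddr - s_addr
    if not 0 <= offset < len(physical_starts) * sub_page_size:
        raise ValueError("address lies outside the translated space")
    index, within = divmod(offset, sub_page_size)  # which sub-page table, and where in it
    return physical_starts[index] + within


starts = [0x82000, 0x83000, 0x40000, 0x41000]  # from one response, P = 4
physical = translate(0x14010, s_addr=0x12000, physical_starts=starts, sub_page_size=0x1000)
print(hex(physical))
```

  • Because the lookup is a plain index into the list, a response whose entries were out of order would silently yield a wrong physical address, which is why the ordering requirement matters.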
  • the PCIe device may be a network card or a graphics processing unit (GPU).
  • FIG. 4 is a schematic diagram of another composition of the foregoing PCIe system 300 provided by an embodiment of the application.
  • the PCIe system 300 may include a processor 301 and a PCIe device 304.
  • the processor 301 is connected to the PCIe device 304.
  • the processor 301 includes an address translation circuit 411.
  • the address translation circuit 411 includes an address translation unit 4111 and an interface 4112. Among them, the interface 4112 is used to implement the communication between the processor 301 and the PCIe device. For example, receiving an address translation request message and sending an address translation response message.
  • the address translation unit 4111 is used to implement the translation from a virtual address to a physical address. In fact, the address translation unit 4111 is also a circuit module.
  • Fig. 5 is a flowchart of an address translation method provided by an embodiment of the application.
  • The following description takes as an example a PCIe device requesting, from the processor, the page table corresponding to the first virtual storage space.
  • the page table corresponding to the first virtual storage space may be used to indicate the physical address of the physical space corresponding to the first virtual storage space.
  • the method may include:
  • S501 The PCIe device sends an address translation request message to the processor.
  • the one address translation request is used to request the physical address of the physical space corresponding to the first virtual storage space.
  • the physical space corresponding to the first virtual storage space may be the storage space of the internal memory.
  • the first virtual storage space is a continuous segment of virtual storage space.
  • the one address translation request message includes the first virtual address and the size of the first virtual storage space.
  • the first virtual address is a virtual start address of the first virtual storage space. It should be understood that the first virtual address is an untranslated start address in the first virtual storage space requested by the PCIe device for translation.
  • the first virtual storage space includes at least one Smallest Translation Unit (STU).
  • the size of the first virtual storage space can be represented by the smallest translation unit.
  • the virtual start address of the first virtual storage space is aligned with the smallest translation unit.
  • the size of the first virtual storage space is K minimum translation units.
  • the range of the first virtual storage space can be expressed as s_addr ~ s_addr+K*STU, where s_addr represents the virtual start address of the first virtual storage space, s_addr+K*STU represents the virtual end address of the first virtual storage space, STU represents the size of the smallest translation unit, and * represents a multiplication operation.
  • the size of the minimum translation unit may be 4 kilobytes (KB).
  • the minimum translation unit is defined by the PCIe protocol to indicate the minimum space size for address translation conversion.
  • the unit of the smallest translation unit is Byte.
  • the specific value of the minimum translation unit can be set by the host, and the PCIe device can be notified by configuring the bit corresponding to the register of the configuration space of the PCIe device. In a computer system, this value is determined by the architecture of the system. Generally speaking, the system will set it to the minimum granularity of the page table of the system.
  • Address translation between the host and the PCIe device therefore has a defined minimum translation space, and the translation request length corresponds to the size of the address space to be translated. The PCI Express protocol gives a detailed formula for determining this value.
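  • The range s_addr ~ s_addr+K*STU and its alignment requirement can be checked with a few lines. This is a sketch only: the function name is hypothetical, and the 4 KB default is simply the example value mentioned in the text, not a fixed property of every system.

```python
STU = 4 * 1024  # 4 KB, the example size of the minimum translation unit


def first_virtual_storage_space(s_addr, k, stu=STU):
    """Return (start, end) of the range s_addr ~ s_addr + K*STU."""
    if k < 1:
        raise ValueError("the space includes at least one minimum translation unit")
    if s_addr % stu != 0:
        raise ValueError("the virtual start address must be aligned with the STU")
    return s_addr, s_addr + k * stu


start, end = first_virtual_storage_space(0x12000, k=4)
print(hex(start), hex(end))
```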
  • S502 The processor receives an address translation request message.
  • the address translation request message includes the first virtual address and the size of the first virtual storage space. For specific explanation, please refer to the explanation of S501, which will not be repeated.
  • the processor determines P subpage tables with the same size according to the first virtual storage space.
  • the processor may first obtain N page tables corresponding to the first virtual storage space, and then divide the N page tables according to the smallest translation unit to obtain P sub-page tables of the same size, where N is an integer, and N ≥ 1.
  • the size of the N page tables is not necessarily the same.
  • the N page tables are not necessarily consecutive.
  • the process of obtaining P subpage tables of the same size from N page tables is described in detail below. As shown in FIG. 6, the process of obtaining P sub-page tables with the same size from N page tables specifically includes S5031 to S5033.
  • the processor determines N page tables according to the first virtual address.
  • the address translation request is used to request information corresponding to the N page tables corresponding to the first virtual storage space.
  • the first virtual storage space at least partially overlaps the virtual storage space indicated by each of the N page tables.
  • equivalently, at least part of the first virtual storage space intersects the virtual storage space indicated by each of the N page tables.
  • FIG. 7 is an exemplary diagram of the correspondence between the virtual storage space and the page table provided in an embodiment.
  • the virtual storage space indicated by a page table partially overlaps the first virtual storage space.
  • the physical start address of the page table is aligned with the page size (page size).
  • the range of the virtual storage space indicated by the page table can be expressed as v_addr ~ v_addr+X, where v_addr represents the virtual start address of the page table, which is aligned with the size of the page table, v_addr+X represents the virtual end address of the page table, and X represents the size of the page table.
  • the processor may perform a page table walk (PTW), and determine N page tables corresponding to the first virtual storage space according to the first virtual address and the size of the page table.
  • the size of the N page tables is not necessarily the same.
  • the following takes the i-th page table as an example for detailed description. i is an integer, i ∈ [1,N], that is, the i-th page table is any one of the N page tables.
  • the processor determines a second virtual address according to the first virtual address and the size of the i-th page table.
  • the following formula (1) is used to determine the second virtual address:
  • v_addr' = (s_addr+D*STU) & (~(Xi-1))    (1)
  • v_addr' represents the second virtual address
  • s_addr represents the first virtual address
  • D represents the number of STUs contained in the portion where the first page table to the (i-1)-th page table intersect the first virtual storage space
  • Xi represents the size of the i-th page table
  • & represents bitwise logical AND
  • ~ represents bitwise negation.
  • the processor determines whether the second virtual address is the same as the virtual start address of the i-th page table.
  • if the second virtual address is the same as the virtual start address of the i-th page table, the processor determines that the first virtual storage space at least partially overlaps the virtual storage space indicated by the i-th page table.
  • if the processor determines that the first virtual storage space and the virtual storage space indicated by the i-th page table at least partially overlap, it needs to feed back the part of the i-th page table that overlaps the first virtual storage space.
  • the overlapping portion of the virtual storage space indicated by the i-th page table and the first virtual storage space can be expressed by the following formula (2):
  • p_addr represents the physical start address of the i-th page table
  • v_addr represents the virtual start address of the i-th page table
  • Xi represents the size of the i-th page table
  • U represents the size of the untranslated space in the first virtual storage space.
  • if the processor determines that the first virtual storage space completely overlaps the virtual storage space indicated by the i-th page table, it needs to feed back the entire i-th page table.
  • if the second virtual address is different from the virtual start address of the i-th page table, the processor determines that the first virtual storage space does not overlap the virtual storage space indicated by the i-th page table at all.
  • in this case, the processor can determine that the i-th page table is not a page table corresponding to the first virtual storage space, and there is no need to feed back the information of the i-th page table.
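  • the check based on formula (1) can be sketched in Python as follows. This is a minimal sketch that assumes the size of the i-th page table is a power of two; the function names second_virtual_address and overlaps_page_table are hypothetical and are used here only for illustration.

```python
def second_virtual_address(s_addr: int, d: int, stu: int, x_i: int) -> int:
    """Formula (1): v_addr' = (s_addr + D*STU) & ~(X_i - 1).

    x_i is assumed to be a power of two, so ~(x_i - 1) clears the
    offset bits inside a page table of size x_i.
    """
    if x_i <= 0 or x_i & (x_i - 1):
        raise ValueError("page table size must be a power of two")
    return (s_addr + d * stu) & ~(x_i - 1)


def overlaps_page_table(s_addr: int, d: int, stu: int,
                        v_addr_i: int, x_i: int) -> bool:
    """True if v_addr' equals the virtual start address of the i-th page table."""
    return second_virtual_address(s_addr, d, stu, x_i) == v_addr_i
```

  • with hypothetical values STU = 0x1000, s_addr = 0x11000 and D = 0, a page table of size 0x2000 starting at 0x10000 is matched, whereas a page table of the same size starting at 0x14000 is not.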
  • after the processor determines the N page tables corresponding to the first virtual storage space according to the first virtual address, the processor determines P sub-page tables from the N page tables.
  • Each of the P sub-page tables has the same size.
  • the sum of the sizes of the P subpage tables is equal to the size of the first virtual storage space.
  • One of the sub-page tables is used to indicate the mapping relationship of part or all of the virtual addresses to the physical addresses in the page table. It should be understood that the virtual storage space indicated by the P subpage tables completely overlaps the first virtual storage space.
  • the P subpage tables are obtained from the N page tables. A subpage table may not exist as a page table in the MMU/IOMMU (in this application it is generated from the N actual physical page tables), but its physical address space must belong to the physical address space corresponding to the N page tables.
  • P is an integer, P ≥ 1.
  • the size of the page table may be a multiple of the smallest translation unit.
  • the size of the page table is a power of two.
  • the processor may divide N page tables according to the minimum translation unit, and determine P sub-page tables.
  • the processor determining P sub-page tables from the N page tables specifically includes the following steps.
  • the processor divides N page tables according to the minimum translation unit to obtain M sub-page tables.
  • the processor may divide the page table according to the minimum translation unit.
  • the processor equally divides N page tables according to the minimum translation unit to obtain M sub-page tables, where M is an integer, and M ≥ 1.
  • the virtual start address of the s-th subpage table can be expressed as v_addr+(s-1)*STU, where v_addr represents the virtual start address of the page table to which the s-th subpage table belongs, s is an integer, and s ∈ [1,M].
  • the physical start address of the s-th subpage table can be expressed as p_addr+(s-1)*STU, where p_addr represents the physical start address of the page table to which the s-th subpage table belongs, s is an integer, and s ∈ [1,M].
  • the virtual start address corresponding to the first subpage table is v_addr+(1-1)*STU.
  • the physical start address corresponding to the first subpage table is p_addr+(1-1)*STU.
  • the virtual start address corresponding to the second subpage table is v_addr+(2-1)*STU.
  • the physical start address corresponding to the second subpage table is p_addr+(2-1)*STU.
  • the virtual start address corresponding to the third subpage table is v_addr+(3-1)*STU.
  • the physical start address corresponding to the third subpage table is p_addr+(3-1)*STU.
  • FIG. 9 is an example diagram of page table division provided by an embodiment. Assume that the page table includes 4 minimum translation units. The page table is divided according to the minimum translation unit to obtain 4 sub-page tables.
  • the virtual start address of the first subpage table is the virtual start address of the corresponding page table, which can be expressed as v_addr.
  • the physical start address of the first subpage table is the physical start address of the page table to which it belongs, which can be expressed as p_addr.
  • the virtual start address of the second subpage table is the virtual end address of the first subpage table, which can be expressed as v_addr+1*STU.
  • the physical start address of the second subpage table is the physical end address of the first subpage table, which can be expressed as p_addr+1*STU.
  • the virtual start address of the third subpage table is the virtual end address of the second subpage table, which can be expressed as v_addr+2*STU.
  • the physical start address of the third subpage table is the physical end address of the second subpage table, which can be expressed as p_addr+2*STU.
  • the virtual start address of the fourth subpage table is the virtual end address of the third subpage table, which can be expressed as v_addr+3*STU.
  • the physical start address of the fourth subpage table is the physical end address of the third subpage table, which can be expressed as p_addr+3*STU.
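  • the division illustrated in FIG. 9 can be sketched in Python as follows. The SubPage type and the function name divide_page_table are assumptions introduced for this sketch; the address arithmetic follows the expressions v_addr+(s-1)*STU and p_addr+(s-1)*STU given above.

```python
from typing import List, NamedTuple


class SubPage(NamedTuple):
    v_start: int  # virtual start address of the sub-page table
    p_start: int  # physical start address of the sub-page table
    size: int     # size in bytes (always one STU here)


def divide_page_table(v_addr: int, p_addr: int, size: int, stu: int) -> List[SubPage]:
    """Split one page table into size/STU sub-page tables of one STU each."""
    if size % stu != 0:
        raise ValueError("page table size must be a multiple of STU")
    count = size // stu
    # The s-th sub-page table (s = 1..count) starts (s-1)*STU after the page table start.
    return [SubPage(v_addr + (s - 1) * stu, p_addr + (s - 1) * stu, stu)
            for s in range(1, count + 1)]
```

  • for a page table of 4 minimum translation units, the fourth sub-page table returned by this sketch starts at v_addr+3*STU and p_addr+3*STU, which matches the description of FIG. 9.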
  • the processor determines P subpage tables from the M subpage tables.
  • the first virtual storage space includes at least one minimal translation unit.
  • the virtual start address of the first virtual storage space is aligned with the smallest translation unit.
  • the processor may divide the first virtual storage space according to the minimum translation unit, and determine P address translation units. Compare the virtual start address of each sub-page table in the M sub-page tables with the virtual start address corresponding to each address translation unit in the P address translation units, if the virtual start address of the sub-page table and the virtual start address corresponding to the address translation unit If the starting address is the same, it is determined that the virtual storage space indicated by the subpage table and the virtual storage space corresponding to the address translation unit completely overlap, and it is determined that the subpage table is a subpage table that needs to be fed back. Therefore, by traversing the M subpage tables, the subpage tables overlapping with the first virtual storage space are determined from the M subpage tables, until the sum of the determined subpage table sizes is equal to the size of the first virtual storage space.
  • the virtual start address corresponding to the p-th address translation unit may be expressed as s_addr+(p-1)*STU, where s_addr represents the virtual start address of the first virtual storage space, p is an integer, and p ∈ [1,P].
  • for example, if the virtual start address of the first subpage table is the same as the virtual start address corresponding to the first address translation unit, it is determined that the virtual storage space indicated by the first subpage table completely overlaps the virtual storage space corresponding to the first address translation unit, and the first subpage table is a subpage table that needs to be fed back.
  • if the virtual start address of the first virtual storage space is not aligned with the smallest translation unit, the virtual start address is first aligned with the smallest translation unit, and then the first virtual storage space is divided according to the smallest translation unit.
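  • the comparison of sub-page tables with address translation units can be sketched in Python as follows. The function name select_subpages and the representation of a sub-page table as a (virtual start, physical start) pair are assumptions for this sketch. The text does not state the direction of alignment, so aligning the start address downward is also an assumption.

```python
from typing import List, Tuple


def select_subpages(subpages: List[Tuple[int, int]], s_addr: int,
                    k: int, stu: int) -> List[Tuple[int, int]]:
    """Pick the sub-page tables whose virtual start matches an address translation unit.

    subpages: (virtual start, physical start) pairs, each one STU in size.
    Returns the matching pairs in the virtual-address order of the units.
    """
    s_addr &= ~(stu - 1)  # assumption: align the start address down to STU
    by_virtual = {v: p for v, p in subpages}
    selected = []
    for n in range(k):
        unit_start = s_addr + n * stu  # start of the (n+1)-th address translation unit
        if unit_start in by_virtual:
            selected.append((unit_start, by_virtual[unit_start]))
    return selected
```

  • if fewer than k pairs are returned, part of the first virtual storage space has no sub-page table available, so the result covers only part of the requested range.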
  • the page table attributes such as read and write permissions of the sub-page tables belonging to the same page table are the same as the page table to which they belong.
  • in some cases, because the processor has not obtained part of the page tables corresponding to the first virtual storage space, the processor cannot obtain the physical addresses corresponding to part of the virtual addresses of the first virtual storage space. In this case, the sum of the sizes of the P subpage tables is smaller than the size of the first virtual storage space. It should be understood that the virtual storage space indicated by the P subpage tables then only partially overlaps the first virtual storage space.
  • S504 The processor sends an address translation response message to the PCIe device.
  • the address translation response message includes the physical start addresses of the P subpage tables and the sizes of the P subpage tables. Each of the physical start addresses of the P sub-page tables is used to indicate the physical start address of one sub-page table.
  • the address translation response message may also include the attributes of the P subpage tables.
  • the attributes of the subpage table are the same as the attributes of the page table to which the subpage table belongs.
  • the attributes of the subpage table or the page table include the attributes of the memory space indicated by the subpage table or the page table in the system.
  • the attributes of the page table include, but are not limited to, a read and write permission attribute, a global page table attribute, and a physical-address-unusable attribute.
  • the read and write permission attribute is used to indicate the read permission and write permission of the memory space indicated by the page table in the system.
  • the global page table attribute is used to indicate that the memory space indicated by the page table is globally available in the system.
  • the physical-address-unusable attribute is used to indicate that untrusted devices cannot access the memory space indicated by the page table.
  • other attributes may also be included, which will not be described in detail.
  • PCIe basic specification version 4.0 and PCIe basic specification version 5.0 stipulate that when an address translation response message contains multiple page tables, the order of the multiple page tables is determined according to the virtual address order of the virtual storage space requested for translation. Therefore, the physical start addresses of the P subpage tables included in the address translation response message are sorted. For example, the physical start addresses of the P subpage tables are sorted according to the virtual address order of the first virtual storage space, which prevents the subpage tables from being out of order and prevents the PCIe device from determining a wrong physical address and accessing the wrong physical storage space.
  • the processor may divide the first virtual storage space according to the minimum translation unit to obtain P address translation request units. Since the first virtual storage space has a logical sequence relationship, the P address translation units also have the virtual address sequence relationship of the first virtual storage space. The so-called address order relationship can refer to the relationship of virtual addresses from large to small.
  • the P address translation units may be sorted according to the virtual address of the first virtual storage space in descending order.
  • the processor may sequentially set the identifiers of the P address translation request units according to the virtual address of the first virtual storage space.
  • the sequence of the identifiers of the P address translation request units may be sorted according to the virtual address of the first virtual storage space from large to small. Therefore, the processor can sort the physical start addresses of the P sub-page tables according to the order of the identifiers of the P address translation request units.
  • the identifiers of the P address translation request units may be set in the header of the address translation response message.
  • the processor determines 3 sub-page tables.
  • the first physical address is the physical start address of the first subpage table.
  • the second physical address is the physical start address of the second subpage table.
  • the third physical address is the physical start address of the third subpage table.
  • the processor sorts the first physical address, the second physical address, and the third physical address according to the virtual address sequence of the first virtual storage space.
  • the physical addresses may be sorted first, and then the size of the subpage table may be sorted.
  • the first physical address, the second physical address, the third physical address, the size of the first subpage table, the size of the second subpage table, and the size of the third subpage table are sorted in order.
  • the size of the first subpage table is the size of the subpage table corresponding to the first physical address.
  • the size of the second subpage table is the size of the subpage table corresponding to the second physical address.
  • the size of the third subpage table is the size of the subpage table corresponding to the third physical address.
  • alternatively, the physical addresses and the sizes of the subpage tables can be interleaved.
  • the first physical address, the size of the first subpage table, the second physical address, the size of the second subpage table, the third physical address, and the size of the third subpage table are sorted in order.
  • the size of the first subpage table is the size of the subpage table corresponding to the first physical address.
  • the size of the second subpage table is the size of the subpage table corresponding to the second physical address.
  • the size of the third subpage table is the size of the subpage table corresponding to the third physical address.
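  • the two arrangements described above can be illustrated with the following Python sketch. It only shows the ordering of the fields as flat lists and is not the PCIe message format; the function names grouped_layout and interleaved_layout and the sample addresses are assumptions made for illustration.

```python
from typing import List, Tuple


def grouped_layout(entries: List[Tuple[int, int]]) -> List[int]:
    """All physical start addresses first, then all sizes, in the same order."""
    addrs = [p for p, _ in entries]
    sizes = [s for _, s in entries]
    return addrs + sizes


def interleaved_layout(entries: List[Tuple[int, int]]) -> List[int]:
    """Each physical start address immediately followed by its size."""
    out: List[int] = []
    for p, s in entries:
        out.extend((p, s))
    return out
```

  • in both arrangements, the entries are assumed to be already sorted according to the virtual address order of the first virtual storage space, so the k-th address and the k-th size describe the same sub-page table.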
  • S505 The PCIe device receives an address translation response message.
  • the address translation response message includes the translation result.
  • the translation result includes the physical start addresses of the P subpage tables, the size of the P subpage tables, and the attributes of the P subpage tables.
  • the size of each sub-page table in the P sub-page tables is the same, the sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, P is an integer, and P ≥ 1.
  • the PCIe device accesses the physical storage space corresponding to the first virtual storage space according to the physical start addresses of the P subpage tables and the size of the P subpage tables.
  • the PCIe device establishes a one-to-one mapping relationship between virtual storage space and physical storage space according to the received translation result. The mapping records, for example, the first virtual storage space and its corresponding physical storage space, the read and write access permissions of that physical storage space, the attributes of the corresponding space, and other information.
  • when the PCIe device initiates an access to the memory of the processor, it can convert the virtual address to the corresponding physical address using the locally maintained mapping, and mark the address in the PCIe message that it sends to the memory as translated.
  • the attribute corresponding to the address is placed in the prefix or other reserved fields of the message.
  • the PCIe device determines the physical address of the memory according to the physical address of the subpage table and the offset address in the virtual address corresponding to the first virtual storage space, and accesses the physical storage space corresponding to the first virtual storage space according to the physical address of the memory and the size of the subpage table.
  • the specific method can refer to the prior art and will not be repeated.
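  • the device-side calculation of a physical address from a sub-page table start address and an offset can be sketched in Python as follows. This is an illustrative sketch only: the function name translate is hypothetical, and the list of physical start addresses is assumed to be ordered from the virtual start address to the virtual end address of the first virtual storage space.

```python
from typing import List


def translate(v_addr: int, s_addr: int, phys_starts: List[int], stu: int) -> int:
    """Map a virtual address inside the first virtual storage space to a physical address.

    phys_starts[k] is the physical start address of the sub-page table that
    backs the k-th STU-sized unit starting at s_addr + k*STU.
    """
    offset_in_space = v_addr - s_addr
    if not 0 <= offset_in_space < len(phys_starts) * stu:
        raise ValueError("virtual address is outside the translated range")
    index, offset = divmod(offset_in_space, stu)  # which sub-page table, offset inside it
    return phys_starts[index] + offset
```

  • with hypothetical values STU = 0x1000 and s_addr = 0x11000, the virtual address 0x12234 falls in the second unit at offset 0x234, so it maps to the second physical start address plus 0x234.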
  • the PCIe device receives the address translation response message, parses it to obtain the physical start addresses of the P subpage tables and the sizes of the P subpage tables, and can directly store the physical start addresses and the sizes of the P subpage tables in the local address translation cache (ATC).
  • the order of the physical start addresses of the P subpage tables and the sizes of the P subpage tables corresponds to the order of the first virtual storage space from the virtual start address to the virtual end address.
  • when the PCIe device needs to access the physical storage space corresponding to the first virtual storage space, it accesses that physical storage space according to the physical start addresses of the P subpage tables and the sizes of the P subpage tables.
  • the PCIe device can determine, according to the order of the physical start addresses of the P sub-page tables and the sizes of the P sub-page tables, which sub-page tables can be merged into a contiguous page table whose size is a power of two. If sub-page tables can be merged, the physical start address of the merged page table and the size of the merged page table are cached in the address translation cache; otherwise, the physical start address of the subpage table and the size of the subpage table are cached directly in the address translation cache.
  • the merged page table includes one sub-page table or multiple sub-page tables.
  • S506, S507, or S508 and S509 may be executed.
  • the PCIe device stores the physical start addresses of the P subpage tables and the size of the P subpage tables.
  • the PCIe device merges the P subpage tables according to the physical start addresses of the P subpage tables and the size of the P subpage tables to obtain the physical start addresses of the N page tables and the size of the N page tables.
  • the PCIe device can merge the subpage tables belonging to the same page table according to the physical start addresses of the P subpage tables and the sizes of the P subpage tables to obtain the physical start addresses of the N page tables and the sizes of the N page tables, where N is an integer, and 1 ≤ N ≤ P.
  • the PCIe device stores the physical start addresses of the N page tables and the size of the N page tables.
  • the P sub-page tables are traversed, and the P sub-page tables are merged according to the following method.
  • the following takes the j-th subpage table as an example for description.
  • j is an integer, j ∈ [1,P], and the j-th subpage table represents any one of the P subpage tables.
  • the specific method for merging P subpage tables is as follows.
  • the PCIe device judges whether the physical start address of the j-th subpage table is equal to the merge address.
  • the merge address is used to indicate the physical end address of the merged page table.
  • the merged page table may include at least one sub-page table.
  • the PCIe device caches the physical start address of the jth subpage table and the size of the jth subpage table.
  • the PCIe device may cache the physical start address of the jth subpage table and the size of the jth subpage table in the address translation cache.
  • the PCIe device judges whether the attribute of the j-th subpage table is the same as the attribute of the merged page table.
  • if the attribute of the j-th subpage table is not the same as the attribute of the merged page table, it indicates that the j-th subpage table and the (j-1)-th subpage table do not belong to the same page table, and S1002 is executed; if the attribute of the j-th subpage table is the same as the attribute of the merged page table, it indicates that the j-th subpage table and the (j-1)-th subpage table belong to the same page table, and S1004 is executed.
  • the PCIe device judges whether the size of the j-th subpage table is the same as the size of the merged page table.
  • if the size of the j-th subpage table is not the same as the size of the merged page table, it indicates that although the j-th subpage table and the merged page table belong to the same page table, merging them would not yield a size that is a power of 2, and S1005 is executed; if the size of the j-th subpage table is the same as the size of the merged page table, it indicates that the j-th subpage table and the (j-1)-th subpage table belong to the same page table, and S1006 is executed.
  • the PCIe device caches the physical end address of the jth subpage table and the size of the jth subpage table.
  • the PCIe device may first cache the j-th subpage table in another cache area outside the address translation cache, traverse the (j+1)-th subpage table, and determine whether the (j+1)-th subpage table can be merged with the j-th subpage table. For details, please refer to the description of S1001 to S1006, which will not be repeated.
  • the PCIe device merges the j-th subpage table with the merged page table.
  • the PCIe device may cache the physical start address of the merged page table and the size of the merged page table in the address translation cache.
  • after the PCIe device merges the j-th subpage table with the merged page table, it updates the merge address and the size of the merged page table, so as to further determine whether other subpage tables can be merged.
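  • a simplified Python sketch of the three checks described above (merge address, attribute, and size) is given below. It is not the complete S1001 to S1006 flow: in particular, it does not model the separate cache area used when the sizes differ, and it does not check the alignment of the merged page table. The Entry type, the function name merge_subpages, and the use of a string for the attributes are assumptions for this sketch.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    p_start: int  # physical start address
    size: int     # size in bytes
    attrs: str    # page table attributes, e.g. "rw" (illustrative encoding)


def merge_subpages(subpages: List[Entry]) -> List[Entry]:
    """Greedily merge neighbouring sub-page tables; the returned list stands in for the ATC."""
    cache: List[Entry] = []
    merged: Optional[Entry] = None
    for sub in subpages:
        if merged is None:
            merged = sub
            continue
        merge_addr = merged.p_start + merged.size  # physical end address of the merged page table
        if (sub.p_start == merge_addr          # physically contiguous
                and sub.attrs == merged.attrs  # same attributes
                and sub.size == merged.size):  # equal sizes, so the merged size doubles
            merged = Entry(merged.p_start, merged.size + sub.size, merged.attrs)
        else:
            cache.append(merged)  # cannot merge: cache what has been merged so far
            merged = sub
    if merged is not None:
        cache.append(merged)
    return cache
```

  • because a sub-page table is merged only when its size equals the size of the merged page table, every merged size produced by this sketch is a power-of-two multiple of the sub-page table size.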
  • the processor can determine P subpage tables with the same size according to the first virtual storage space, where the sum of the sizes of the P subpage tables is equal to the size of the first virtual storage space, and feed back the physical start addresses of the P subpage tables and the sizes of the P subpage tables through one address translation response message. Therefore, the PCIe device only needs to send one address translation request message, and can obtain the physical addresses corresponding to all virtual addresses in the first virtual storage space through one address translation response message. Under the condition of complying with the provisions of the PCIe basic specifications, this effectively reduces the delay of the PCIe device requesting address translation and the bandwidth occupied between the PCIe device and the processor.
  • the address translation method will be illustrated below with reference to FIG. 11 to FIG. 14. As shown in Figure 11, assume that the size of the first virtual storage space is 8 minimum translation units. s_addr represents the first virtual start address of the first virtual storage space.
  • the PCIe device sends an address translation request message to the processor, where the address translation request message includes the first virtual address and the size of the first virtual storage space.
  • the processor determines 4 page tables according to the first virtual address.
  • the physical start address of the first page table is p_addr0, and the first page table includes 2 STUs.
  • the physical start address of the second page table is p_addr1, and the second page table includes 2 STUs.
  • the physical start address of the third page table is p_addr2, and the third page table includes 1 STU.
  • the physical start address of the fourth page table is p_addr3, and the fourth page table includes 4 STUs.
  • the method of determining the page table according to the first virtual address is as described in S5031, and will not be repeated.
  • the processor divides 4 page tables according to the minimum translation unit to obtain 9 sub-page tables.
  • the first page table is divided into two sub-page tables, the physical starting addresses are p_addr0 and p_addr0+STU respectively, and the virtual starting addresses are v_addr0 and v_addr0+STU respectively.
  • the second page table is divided into 2 sub-page tables, the physical starting addresses are p_addr1 and p_addr1+STU respectively, and the virtual starting addresses are v_addr1 and v_addr1+STU respectively.
  • the third page table is divided into 1 sub-page table, whose physical start address is p_addr2 and whose virtual start address is v_addr2.
  • the fourth page table is divided into 4 sub-page tables, the physical starting addresses are p_addr3, p_addr3+STU, p_addr3+2*STU and p_addr3+3*STU respectively, and the virtual starting addresses are v_addr3, v_addr3+STU, v_addr3+2*STU and v_addr3+3*STU respectively.
  • the processor can divide the first virtual storage space according to the minimum translation unit to determine 8 address translation units, and compare the virtual start address of each of the 9 subpage tables with the virtual start address corresponding to each of the 8 address translation units. If the virtual start address of a subpage table is the same as the virtual start address corresponding to an address translation unit, it is determined that the virtual storage space indicated by the subpage table and the virtual storage space corresponding to the address translation unit completely overlap, and that the subpage table is a subpage table that needs to be fed back. Therefore, by traversing the M subpage tables, the subpage tables overlapping with the first virtual storage space are determined from the M subpage tables, until the sum of the determined subpage table sizes is equal to the size of the first virtual storage space.
  • the virtual storage space indicated by 8 of the 9 subpage tables completely overlaps the first virtual storage space, and the physical start addresses of the 8 subpage tables are p_addr0+STU, p_addr1, p_addr1+STU, p_addr2, p_addr3, p_addr3+STU, p_addr3+2*STU, and p_addr3+3*STU, respectively.
  • the processor sends an address translation response message.
  • the address translation response message includes the physical start addresses of the 8 subpage tables and the size of the 8 subpage tables.
  • the physical start addresses of the eight sub-page tables and the sizes of the eight sub-page tables may be sorted according to the order of the virtual start addresses corresponding to the address translation unit.
  • the PCIe device After the PCIe device receives the address translation response message, it can establish a one-to-one mapping relationship from the first virtual storage space to the physical storage space.
  • the virtual start address s_addr of the first virtual storage space corresponds to the physical start address p_addr0+STU of the page table.
  • the virtual address s_addr+STU of the first virtual storage space corresponds to the physical address p_addr1 of the page table.
  • the virtual address s_addr+3*STU of the first virtual storage space corresponds to the physical address p_addr2 of the page table.
  • the virtual address s_addr+4*STU of the first virtual storage space corresponds to the physical address p_addr3 of the page table.
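  • the example of FIG. 11 can be reproduced with the following Python sketch. The numeric values chosen for p_addr0 to p_addr3 and for STU are hypothetical, as is the function name response_addresses; only the page table sizes (2, 2, 1 and 4 STUs) and the fact that the first virtual storage space starts at the second STU of the first page table are taken from the example.

```python
from typing import List, Tuple


def response_addresses(page_tables: List[Tuple[int, int]], first_skip: int,
                       k: int, stu: int) -> List[int]:
    """Physical start addresses of the K sub-page tables fed back to the device.

    page_tables: (physical start, number of STUs), in virtual-address order.
    first_skip:  number of STUs of the first page table that lie before s_addr.
    """
    subs = [p + n * stu for p, count in page_tables for n in range(count)]
    return subs[first_skip:first_skip + k]


STU = 0x1000
P_ADDR0, P_ADDR1, P_ADDR2, P_ADDR3 = 0xA0000, 0xC0000, 0xE0000, 0x100000
PAGE_TABLES = [(P_ADDR0, 2), (P_ADDR1, 2), (P_ADDR2, 1), (P_ADDR3, 4)]

# Entry n of the result is the physical address that s_addr + n*STU maps to.
MAPPING = response_addresses(PAGE_TABLES, first_skip=1, k=8, stu=STU)
```

  • in this sketch, MAPPING[0] equals p_addr0+STU, MAPPING[1] equals p_addr1, MAPPING[3] equals p_addr2 and MAPPING[4] equals p_addr3, which agrees with the correspondence listed above.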
  • the PCIe device may use the following formula (3) to first align the virtual start address of the first virtual storage space with the STU, and then send an address translation request message to the processor, so that the processor can divide the first virtual storage space according to the minimum translation unit.
  • the processor may divide the first virtual storage space according to the minimum translation unit to obtain 8 address translation request units.
  • the virtual start address corresponding to the first address translation unit is s_addr
  • the virtual start address corresponding to the second address translation unit is s_addr+STU
  • the virtual start address corresponding to the third address translation unit is s_addr+2*STU
  • the virtual start address corresponding to the fourth address translation unit is s_addr+3*STU
  • the virtual start address corresponding to the fifth address translation unit is s_addr+4*STU
  • the virtual start address corresponding to the sixth address translation unit is s_addr+5*STU
  • the virtual start address corresponding to the seventh address translation unit is s_addr+6*STU
  • the virtual start address corresponding to the eighth address translation unit is s_addr+7*STU.
  • the processor sequentially sets the identifiers of the eight address translation request units, that is, P0 to P7, according to the virtual address of the first virtual storage space.
  • the processor can translate 8 address translation request units through the page table, that is, determine the sub-page tables corresponding to the 8 address translation request units.
  • formula (1) may be used to determine the page table corresponding to the address translation request unit, and then to determine the overlapping sub-page table in the page table corresponding to the address translation request unit. For example, according to formula (4), the overlapping sub-page table in the page table corresponding to the first address translation request unit is determined.
  • trans_addr=s_addr&(~(X1-1))+p_addr0   (4)
  • trans_addr represents the translation address.
  • s_addr represents the virtual start address of the first virtual storage space.
  • X1 represents the size of the first page table.
  • p_addr0 represents the physical start address of the page table.
  • the translation address is the same as the virtual start address of the address translation request unit.
  • the size of the translation is one STU.
  • the processor sorts the physical start addresses of the 8 subpage tables and the sizes of the 8 subpage tables according to the order of the identifiers of the 8 address translation request units, and assembles them into an address translation response message that is returned to the peer.
  • parallel processing may be used to determine the sub-page tables of the virtual storage space corresponding to the eight address translation units according to the foregoing method.
  • the processor does not uniformly divide the first virtual storage space into multiple address translation request units of size STU; instead, it processes the first virtual storage space serially. That is, for a single ATS request, it first attempts one translation and, after obtaining the result, checks whether the address range requested by the PCIe device has been fully translated. If not, it continues to translate the remaining untranslated address space. Multiple different ATS requests are still translated in parallel, as described in the foregoing embodiment.
  • the processor first aligns the virtual start address of the first virtual storage space with the STU, and then determines a page table corresponding to the first virtual storage space.
  • the physical start address of the page table is p_addr
  • the size of the page table is 2 STUs. Then, according to the following formulas (5) to (9), the sub-page table that overlaps the first virtual storage space in the page table is determined.
  • X represents the size of the page table.
  • U represents the size of the untranslated space in the first virtual storage space.
  • min(a,b) is the smaller of a and b.
  • st_ovl refers to the offset of the intersection space between the current translation result and the first virtual storage space in the physical address space corresponding to the translation result.
  • the & in formula (6) represents a bitwise logical AND.
  • ovl_size represents the size of the space corresponding to the portion overlapping with the first virtual storage space.
  • all of the above sizes are expressed in units of STU, so the above logical operations can be performed on STU-aligned values, which reduces the bit width of the signals involved. Since the STU is at least 4KB, the bit width is reduced by at least 12 bits.
  • based on the first translation result, the processor determines how large the intersection between the current translation result and the first virtual storage space is (the size of the intersection must be a multiple of the STU) and performs the related logical operations, such as calculating the physical start address of the intersection. It then cuts the intersection into STU-sized pieces, outputs one or more translation results of size STU, and buffers them in order.
  • the processor will also calculate how much address space has not yet been translated, and if its size is not 0, it will continue to initiate translation.
  • the processor repeats the above serial steps using the second translation result and the start address of the virtual address space recorded after the previous round, until U is finally 0 and all the translation results are obtained; all the translation results are then assembled, in accordance with the protocol, into an address translation response message and returned to the PCIe device.
  • the processor can dynamically decide whether an address translation request is processed in parallel or serially according to the current translation delay and the predicted average page table size (for example, predicted per class using the request identifier in the address translation request message).
  • the processor can also dynamically choose between parallel and serial processing according to a translation delay threshold (for example, if the threshold is exceeded, the translation delay is considered too large), or return only part of the translation result to the PCIe device so that the PCIe device can use the translation result as soon as possible without waiting too long (for example, when the delay is too large, one translation result is returned directly to the PCIe device, so the translation delay seen by the PCIe device may be smaller because a translation result becomes available within a short time).
  • since the processor has not established the page table corresponding to the first virtual storage space, the processor cannot obtain the subpage table corresponding to the address translation unit. In this case, the processor feeds back only the subpage tables it has obtained for the address translation units (for example, P0 to P2).
  • the PCIe device accesses the physical storage space corresponding to the first virtual storage space according to the physical start addresses of the eight subpage tables and the size of the eight subpage tables.
  • the physical address corresponding to the virtual address s_addr of the first virtual storage space is p_addr0+STU.
  • the physical address corresponding to the virtual address s_addr+STU of the first virtual storage space is p_addr1.
  • the physical address corresponding to the virtual address s_addr+3*STU of the first virtual storage space is p_addr2.
  • the physical address corresponding to the virtual address s_addr+4*STU of the first virtual storage space is p_addr3.
  • the PCIe device first caches the physical start addresses of the 8 subpage tables and the size of the 8 subpage tables.
  • the PCIe device merges the 8 subpage tables according to the physical start addresses of the 8 subpage tables and the sizes of the 8 subpage tables to obtain 4 physical addresses and the sizes of 4 page tables.
  • the PCIe device can first store the physical start address of the first subpage table and the size of the subpage table in a cache (L_c), and update the merge address (m_addr) and the size of the merged page table.
  • the merge address is the physical end address of the first subpage table, and the physical end address of the first subpage table is p_addr0+2*STU.
  • the size of the merged page table is 1 STU.
  • L_c can be a part of the register
  • L_c can be deep enough to store the physical start addresses of the 8 subpage tables and the sizes of the 8 subpage tables.
  • the storage depth of L_c can be 16. Since the physical start address of the second subpage table (p_addr1) is not equal to the merge address (p_addr0+2*STU), the physical start address of the first subpage table in L_c and the size of the subpage table are stored in the address translation cache, and the physical start address of the second subpage table and the size of the subpage table are stored in L_c. The merge address (m_addr) and the size of the merged page table are updated. The merge address is the physical end address of the second subpage table, which is p_addr1+1*STU. The size of the merged page table is 1 STU. L_wr indicates the position within L_c that holds the merge address and the size of the merged page table, that is, the entry for the current merged page table.
  • since the physical start address of the third subpage table (p_addr1+1*STU) is equal to the merge address (p_addr1+1*STU), the attributes of the third subpage table are the same as those of the merged page table, and the size of the third subpage table is equal to the size of the merged page table, the physical start address of the third subpage table and the size of the subpage table are stored in L_c, and the second subpage table and the third subpage table are merged.
  • the merge address is the physical end address of the third subpage table, and the physical end address of the third subpage table is p_addr1+2*STU.
  • the size of the merged page table is the size of 2 STUs.
  • the physical start address of the merged page table is the physical start address of the second subpage table (p_addr1).
  • since the physical start address (p_addr2) of the fourth subpage table is not equal to the merge address (p_addr1+2*STU), the physical start address of the merged page table (p_addr1) in L_c and the size of the merged page table (2*STU) are stored in the address translation cache, and the physical start address of the fourth subpage table and the size of the subpage table are stored in L_c.
  • the merge address is the physical end address p_addr2+1*STU of the fourth subpage table.
  • the size of the merged page table is 1 STU.
  • the merge address is the physical end address (p_addr3+2*STU) of the sixth subpage table.
  • the size of the merged page table is the size of 2 STUs.
  • since the physical start address of the seventh subpage table (p_addr3+2*STU) is equal to the merge address (p_addr3+2*STU) and the attributes of the seventh subpage table are the same as those of the merged page table, but the size of the seventh subpage table is not equal to the size of the merged page table, the physical start address of the seventh subpage table and the size of the subpage table are stored in L_c, and the seventh subpage table is not merged.
  • the merge address is the physical end address of the seventh subpage table (p_addr3+3*STU).
  • the size of the merged page table is 1 STU.
  • since the physical start address of the eighth subpage table (p_addr3+3*STU) is equal to the merge address (p_addr3+3*STU), the attributes of the eighth subpage table are the same as those of the merged page table, and the size of the eighth subpage table is equal to the size of the merged page table, the physical start address of the eighth subpage table and the size of the subpage table are stored in L_c, and the fifth subpage table to the eighth subpage table are merged.
  • the merge address is the physical end address of the eighth subpage table (p_addr3+4*STU).
  • the size of the merged page table is 4 STUs.
  • the physical start address of the merged page table is the physical start address of the fifth subpage table (p_addr3), and the size of the merged page table is 4*STU.
  • L_wr indicates the position within L_c that holds the merge address and the size of the merged page table, that is, the entry for the current merged page table. For example, when the physical start address of the jth subpage table (the current translation result) equals the value in the merged-page-table register pointed to by L_wr, but the size of the jth subpage table is not equal to the size of the merged page table currently pointed to by L_wr, the physical end address of the jth subpage table and the size of the subpage table are cached in L_c, and the value of L_wr is refreshed (for example, incremented by 1) so that it points to the physical end address and the size of the jth subpage table.
  • when the physical start address of the (j+1)th subpage table equals the value in the merged-page-table register pointed to by L_wr, and the size of the (j+1)th subpage table equals the size of the jth subpage table pointed to by L_wr, the jth subpage table and the (j+1)th subpage table are merged, the value of L_wr is refreshed to L_wr-1, and L_wr points to the physical start address of the jth subpage table and the size of the merged page table.
  • the storage space occupied by storing the information of the page table is reduced, and the utilization rate of ATC and the efficiency of address translation are improved.
  • the processor cuts all of the translated physical address space that intersects (overlaps) the virtual address space into translation results that are each one STU in size, and returns them to the peer PCIe device according to the PCIe protocol.
  • software and PCIe devices need no adaptation; the solution is compatible with all existing protocols and software architectures and with all PCIe devices. All translation results can be returned without violating the PCIe protocol standard.
  • when the peer PCIe device initiates an address translation request, it does not need to consider whether the page tables corresponding to the virtual address space requested for translation are all the same size (otherwise not all the translation results could be returned). The PCIe device simply treats the virtual address space to be translated as a contiguous virtual address space described by a start address and a size, and then initiates the address translation request, which simplifies the implementation of the address translation request function in the peer PCIe device.
  • the processor and the PCIe device include corresponding hardware structures and/or software modules that perform each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software-driven hardware depends on the specific application scenarios and design constraints of the technical solution.
  • “at least one” refers to one or more, and “multiple” refers to two or more.
  • “And/or” describes the association relationship of the associated objects, indicating that there can be three relationships, for example, A and/or B, which can mean: A alone exists, A and B exist at the same time, and B exists alone, where A, B can be singular or plural.
  • the character “/” generally indicates an “or” relationship between the associated objects before and after it; in the formulas of this application, the character “/” indicates a “division” relationship between the associated objects before and after it.


Abstract

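An address translation method and apparatus are disclosed, relating to the computer field and solving the problem that a PCIe device experiences a large delay in obtaining page tables. The address translation apparatus includes: an interface, configured to receive one address translation request message from a PCIe device, where the address translation request message includes a first virtual address and the size of a first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space; and an address translation unit, configured to determine P subpage tables of the same size according to the first virtual storage space, where each subpage table indicates a mapping from virtual addresses to physical addresses, the sum of the sizes of the P subpage tables equals the size of the first virtual storage space, P is an integer, and P≥1. The interface is further configured to send one address translation response message to the PCIe device, where the address translation response message includes the physical start addresses of the P subpage tables and the sizes of the P subpage tables.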

Description

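Address Translation Method and Apparatus
Technical Field
The embodiments of this application relate to the computer field, and in particular to an address translation method and apparatus.
Background
Before a Peripheral Component Interconnect Express (PCIe) device accesses the processor's memory based on the direct memory access (DMA) protocol, it can send an address translation request (Address Translation Service, ATS) message to the processor, requesting the processor to feed back information about the page tables corresponding to a first virtual storage space. A page table indicates the mapping from virtual addresses to physical addresses. The PCIe device can then determine, from the page table information, the physical addresses corresponding to the virtual addresses of the first virtual storage space, and access the processor's memory using those physical addresses.
PCIe Base Specification Revision 4.0 and PCIe Base Specification Revision 5.0 both specify the following for how a processor handles address translation requests: 1) one address translation request message may be answered with one or two address translation response messages; 2) when one address translation response message contains two or more page tables, the page tables are the same size. Here, address translation refers to the process of converting a virtual address in a virtual storage space into a physical address in memory.
When the first virtual storage space that the PCIe device asks to translate is relatively large and the page tables corresponding to it differ in size, the protocol requires the processor to feed back to the PCIe device either the information of one page table (as shown in FIG. 1) or the information of some of the page tables (more than one page table but fewer than all the page tables determined by the processor, as shown in FIG. 2). The sum of the sizes of the fed-back page tables is therefore smaller than the size of the first virtual storage space. The processor can feed back only the physical addresses corresponding to part of the virtual addresses of the first virtual storage space and cannot feed back the information of all the page tables to the PCIe device. The PCIe device has to send at least two address translation request messages to obtain the information of all page tables of the first virtual storage space. This results in a large delay for the PCIe device's address translation requests and a high bandwidth occupancy between the PCIe device and the processor.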
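Summary
The embodiments of this application provide an address translation method and apparatus, which solve the problem that a PCIe device experiences a large delay in obtaining page tables.
To achieve the foregoing objective, the embodiments of this application adopt the following technical solutions:
In a first aspect, an address translation apparatus is provided, including an interface and an address translation unit. The interface is configured to receive one address translation request message from a PCIe device, where the address translation request message includes a first virtual address and the size of a first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space. The address translation unit is configured to determine P subpage tables of the same size according to the first virtual storage space, where each subpage table indicates a mapping from virtual addresses to physical addresses, the sum of the sizes of the P subpage tables equals the size of the first virtual storage space, P is an integer, and P≥1. The interface is further configured to send one address translation response message to the PCIe device, where the address translation response message includes the physical start addresses of the P subpage tables and the sizes of the P subpage tables. The address translation apparatus provided in the embodiments of this application can determine P subpage tables of the same size according to the first virtual storage space, with the sum of their sizes equal to the size of the first virtual storage space, and feed back the physical start addresses and sizes of the P subpage tables in one address translation response message. The PCIe device therefore needs to send only one address translation request message to obtain, through one address translation response message, the physical addresses corresponding to all virtual addresses of the first virtual storage space. This effectively reduces the delay of the PCIe device's address translation requests and the bandwidth occupancy between the PCIe device and the processor, while complying with the PCIe Base Specification.
In a possible design, the address translation unit is configured to determine the P subpage tables of the same size according to the first virtual storage space and the minimum translation unit.
Specifically, the address translation unit is configured to: determine N page tables according to the first virtual address, where the first virtual storage space at least partially overlaps the virtual storage space indicated by each of the N page tables, N is an integer, and N≥1; divide the N page tables according to the minimum translation unit to obtain M subpage tables, where each of the M subpage tables is the same size, M is an integer, M≥1, and the first virtual storage space includes at least one minimum translation unit; and determine the P subpage tables from the M subpage tables, where 1≤P≤M. Because the virtual start address of the first virtual storage space and the virtual start addresses of the page tables are all aligned with the minimum translation unit, dividing the first virtual storage space and the page tables according to the minimum translation unit yields subpage tables of the same size that completely overlap the first virtual storage space, so that the processor can feed back to the PCIe device the physical addresses corresponding to all virtual addresses of the first virtual storage space.
In some embodiments, the address translation unit is configured to determine the N page tables that overlap the first virtual storage space according to the first virtual address and the sizes of the page tables.
For example, the part of the virtual storage space indicated by the i-th page table that overlaps the first virtual storage space is expressed by the following formula: (p_addr+(s_addr+D*STU)-v_addr)~(p_addr+min(Xi,U)), where p_addr is the physical start address of the i-th page table, s_addr is the first virtual address, D is the number of minimum translation units contained in the parts of the 1st to (i-1)-th page tables that intersect the first virtual storage space, STU is the minimum translation unit, v_addr is the virtual start address of the i-th page table, Xi is the size of the i-th page table, U is the size of the untranslated space in the first virtual storage space, i is an integer, and i∈[1,N].
In another possible design, the physical start addresses of the P subpage tables included in the one address translation response message are sorted. This prevents the subpage tables from being out of order, which would cause the PCIe device to determine wrong physical addresses and access the wrong physical storage space.
For example, the physical start addresses of the P subpage tables can be sorted according to the virtual address order of the first virtual storage space. The virtual addresses of the first virtual storage space are the virtual addresses obtained after the first virtual storage space is divided according to the minimum translation unit.
In another possible design, the attributes of a subpage table are the same as those of the page table to which it belongs, and the attributes of a subpage table or page table include the attributes, in the system, of the memory space indicated by the subpage table or page table.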
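In a second aspect, an address translation method is provided. The method can be applied to a processor, or to a communication apparatus that can support a processor in implementing the method, for example a communication apparatus that includes a chip system. The method includes: receiving one address translation request message, determining P subpage tables of the same size according to the first virtual storage space, and sending one address translation response message, where the address translation response message includes the physical start addresses of the P subpage tables and the sizes of the P subpage tables. The address translation request message includes a first virtual address and the size of the first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space. Each subpage table indicates a mapping from virtual addresses to physical addresses, the sum of the sizes of the P subpage tables equals the size of the first virtual storage space, P is an integer, and P≥1. With the address translation method provided in the embodiments of this application, the processor can determine P subpage tables of the same size according to the first virtual storage space, with the sum of their sizes equal to the size of the first virtual storage space, and feed back the physical start addresses and sizes of the P subpage tables in one address translation response message. The PCIe device therefore needs to send only one address translation request message to obtain, through one address translation response message, the physical addresses corresponding to all virtual addresses of the first virtual storage space. This effectively reduces the delay of the PCIe device's address translation requests and the bandwidth occupancy between the PCIe device and the processor, while complying with the PCIe Base Specification.
In a possible design, determining P subpage tables of the same size according to the first virtual storage space includes: determining the P subpage tables of the same size according to the first virtual storage space and the minimum translation unit.
Specifically, determining the P subpage tables of the same size according to the first virtual storage space and the minimum translation unit includes: determining N page tables according to the first virtual address, where the first virtual storage space at least partially overlaps the virtual storage space indicated by each of the N page tables, N is an integer, and N≥1; dividing the N page tables according to the minimum translation unit to obtain M subpage tables, where each of the M subpage tables is the same size, M is an integer, M≥1, and the first virtual storage space includes at least one minimum translation unit; and determining the P subpage tables from the M subpage tables, where 1≤P≤M. Because the virtual start address of the first virtual storage space and the virtual start addresses of the page tables are all aligned with the minimum translation unit, dividing the first virtual storage space and the page tables according to the minimum translation unit yields subpage tables of the same size that completely overlap the first virtual storage space, so that the processor can feed back to the PCIe device the physical addresses corresponding to all virtual addresses of the first virtual storage space.
In some embodiments, determining N page tables according to the first virtual address includes: determining the N page tables that overlap the first virtual storage space according to the first virtual address and the sizes of the page tables.
For example, the part of the virtual storage space indicated by the i-th page table that overlaps the first virtual storage space is expressed by the following formula: (p_addr+(s_addr+D*STU)-v_addr)~(p_addr+min(Xi,U)), where p_addr is the physical start address of the i-th page table, s_addr is the first virtual address, D is the number of minimum translation units contained in the parts of the 1st to (i-1)-th page tables that intersect the first virtual storage space, STU is the minimum translation unit, v_addr is the virtual start address of the i-th page table, Xi is the size of the i-th page table, U is the size of the untranslated space in the first virtual storage space, i is an integer, and i∈[1,N].
In another possible design, the physical start addresses of the P subpage tables included in the one address translation response message are sorted. This prevents the subpage tables from being out of order, which would cause the PCIe device to determine wrong physical addresses and access the wrong physical storage space.
For example, the physical start addresses of the P subpage tables can be sorted according to the virtual address order of the first virtual storage space. The virtual addresses of the first virtual storage space are the virtual addresses obtained after the first virtual storage space is divided according to the minimum translation unit.
In another possible design, the attributes of a subpage table are the same as those of the page table to which it belongs, and the attributes of a subpage table or page table include the attributes, in the system, of the memory space indicated by the subpage table or page table.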
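Brief Description of Drawings
FIG. 1 is a schematic diagram of page table feedback in the prior art;
FIG. 2 is another schematic diagram of page table feedback in the prior art;
FIG. 3 is a schematic diagram of the composition of a PCIe system according to an embodiment of this application;
FIG. 4 is a schematic diagram of the composition of another PCIe system according to an embodiment of this application;
FIG. 5 is a flowchart of an address translation method according to an embodiment of this application;
FIG. 6 is a flowchart of another address translation method according to an embodiment of this application;
FIG. 7 is an example diagram of the correspondence between a virtual storage space and page tables according to an embodiment of this application;
FIG. 8 is a flowchart of yet another address translation method according to an embodiment of this application;
FIG. 9 is an example diagram of page table division according to an embodiment of this application;
FIG. 10 is a flowchart of a method for merging P subpage tables according to an embodiment of this application;
FIG. 11 is a schematic diagram of an address translation process according to an embodiment of this application;
FIG. 12a is a schematic diagram of another address translation process according to an embodiment of this application;
FIG. 12b is a schematic diagram of yet another address translation process according to an embodiment of this application;
FIG. 13 is a schematic diagram of still another address translation process according to an embodiment of this application;
FIG. 14 is a schematic diagram of still another address translation process according to an embodiment of this application;
FIG. 15 to FIG. 17 are schematic diagrams of a process of merging subpage tables according to an embodiment of this application.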
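Detailed Description
The terms "first", "second", and "third" in the specification, claims, and drawings of this application are used to distinguish different objects, not to define a particular order.
In the embodiments of this application, words such as "exemplary" or "for example" are used to present an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as more preferable or advantageous than other embodiments or designs. Rather, such words are intended to present a related concept in a concrete manner.
To keep the description of the following embodiments clear and concise, a brief introduction to the related technologies is given first:
A virtual storage space may refer to the set of virtual addresses generated by the processor. A virtual storage space may also be called a virtual address space. A virtual address may also be called a logical address. A virtual address may be an address generated by the processor. A virtual address generated by the processor includes a page number and a page offset. The page number can be used as an index into the page table, which contains the base address of each page in physical memory. The page offset, combined with the page number, can be used to determine the physical address in memory.
The size of the virtual storage space can be expressed as 2 to the power of m (that is, 2^m). The size of a page table can be expressed as 2 to the power of n (that is, 2^n). The upper (m-n) bits of a virtual address represent the page number, and the lower n bits represent the page offset.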
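The split of a virtual address into page number and page offset can be illustrated with a few lines of Python. This is only a sketch: the function name `split_virtual_address` and the sample values (m=32, n=12) are assumptions chosen for illustration and do not come from the application.

```python
def split_virtual_address(vaddr: int, m: int, n: int) -> tuple[int, int]:
    """Split an m-bit virtual address into (page number, page offset).

    The page size is assumed to be 2**n bytes, so the upper (m - n) bits
    form the page number and the lower n bits form the page offset.
    """
    if not 0 <= vaddr < (1 << m):
        raise ValueError("address does not fit in m bits")
    page_number = vaddr >> n               # upper (m - n) bits
    page_offset = vaddr & ((1 << n) - 1)   # lower n bits
    return page_number, page_offset


# Example with a 32-bit address space and 4 KB pages (assumed values).
number, offset = split_virtual_address(0x12345, m=32, n=12)
print(hex(number), hex(offset))  # 0x12 0x345
```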
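A physical address space may refer to the set of physical addresses in memory that correspond to the virtual addresses. A physical address may be an address of the memory.
It should be noted that the processor generates only virtual addresses and regards the virtual storage space of a process as ranging from 0 to a maximum value. The physical address space corresponding to the virtual addresses of a virtual storage space may consist of multiple smaller sub physical address spaces; these are not necessarily contiguous, and the sum of their sizes equals the size of the virtual storage space. The physical address range of each sub physical address space is R+0 to R+size, where R is the base address of that space. Different sub physical address spaces have different values of R. A physical address space may also be called a physical storage space.
Address mapping is the process of converting a virtual address in a virtual storage space into a physical address in memory. In some embodiments, for a PCIe device's memory accesses, address mapping can be performed by the memory management unit (MMU) in the processor. A PCIe device is a device that communicates with other devices over a PCIe bus according to the PCIe protocol. In other embodiments, for memory accesses by the processor's input/output (I/O) devices, address mapping can be performed by an I/O memory management unit (IOMMU). In the ARM architecture, the I/O memory management unit is also called the system memory management unit (SMMU).
A page table is a special data structure stored in the page table area of the system space. A page table is a translation table used to convert virtual addresses into physical addresses: when a virtual address is accessed, the computer uses the page table to find the corresponding physical address to access. The page table therefore indicates the correspondence between virtual addresses and physical addresses.
To solve the problems of the large delay of a PCIe device's address translation requests and the high bandwidth occupancy between the PCIe device and the processor, the embodiments of this application provide an address translation method. The method includes: after receiving one address translation request message, determining P subpage tables of the same size according to the first virtual storage space, and sending one address translation response message that includes the physical start addresses of the P subpage tables and the sizes of the P subpage tables. The address translation request message includes a first virtual address and the size of the first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space. Each subpage table indicates a mapping from virtual addresses to physical addresses, the sum of the sizes of the P subpage tables equals the size of the first virtual storage space, P is an integer, and P≥1.
Typically, P is a positive integer greater than 1, that is, multiple subpage tables of the same size are determined; this application does not, however, exclude the case where P equals 1. With the address translation method provided in the embodiments of this application, the processor can determine P subpage tables of the same size according to the first virtual storage space, with the sum of their sizes equal to the size of the first virtual storage space, and feed back the physical start addresses and sizes of the P subpage tables in one address translation response message. The PCIe device therefore needs to send only one address translation request message to obtain, through one address translation response message, the physical addresses corresponding to all virtual addresses of the first virtual storage space. This effectively reduces the delay of the PCIe device's address translation requests and the bandwidth occupancy between the PCIe device and the processor, while complying with the PCIe Base Specification.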
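The implementations of the embodiments of this application are described in detail below with reference to the drawings.
FIG. 3 is a schematic diagram of the composition of a PCIe system according to an embodiment of this application. As shown in FIG. 3, the PCIe system 300 may include a processor 301, a root complex 302, a switch 303, an endpoint 304, and a PCIe bridge 305.
The root complex 302 connects the processor 301 to input/output (I/O) devices. The switch 303 supports peer-to-peer communication between different endpoints 304. The bridge 305 connects PCIe to other PCI bus standards (such as PCI/PCI-X). The endpoint 304 may be a PCIe endpoint device or a PCIe device, for example a PCIe network interface card, a serial port card, or a storage card. FIG. 3 is only a schematic diagram; the structure shown does not limit the PCIe system, which may include more or fewer components than shown, combine some components, or arrange components differently. For example, it may also include a memory 306 and a PCIe bus 307. The embodiments of this application do not limit the number of endpoints and processors in the PCIe system.
The processor 301 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
In this embodiment, the processor 301 is configured to, after receiving one address translation request message, determine P subpage tables of the same size according to the first virtual storage space and send one address translation response message that includes the physical start addresses of the P subpage tables and the sizes of the P subpage tables. The address translation request message includes a first virtual address and the size of the first virtual storage space, and the first virtual address is the virtual start address of the first virtual storage space. Each subpage table indicates a mapping from virtual addresses to physical addresses, the sum of the sizes of the P subpage tables equals the size of the first virtual storage space, P is an integer, and P≥1.
The PCIe device 304 is configured to send one address translation request message, receive one address translation response message, and access the physical storage space (for example, the memory 306) corresponding to the first virtual storage space according to the physical start addresses of the P subpage tables and the sizes of the P subpage tables. In some embodiments, the PCIe device may be a network interface card or a graphics processing unit (GPU).
FIG. 4 is another schematic diagram of the composition of the PCIe system 300 according to an embodiment of this application. As shown in FIG. 4, the PCIe system 300 may include a processor 301 and a PCIe device 304, which are connected to each other. The processor 301 includes an address translation circuit 411. The address translation circuit 411 includes an address translation unit 4111 and an interface 4112. The interface 4112 implements communication between the processor 301 and the PCIe device, for example receiving address translation request messages and sending address translation response messages. The address translation unit 4111 implements the translation from virtual addresses to physical addresses. In practice, the address translation unit 4111 is also a circuit module.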
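Next, the address translation process is described in detail. FIG. 5 is a flowchart of an address translation method according to an embodiment of this application. The description takes as an example a PCIe device requesting from the processor the page tables corresponding to a first virtual storage space. The page tables corresponding to the first virtual storage space can indicate the physical addresses of the physical space corresponding to the first virtual storage space. As shown in FIG. 5, the method may include:
S501. The PCIe device sends one address translation request message to the processor.
The address translation request is used to request the physical addresses of the physical space corresponding to the first virtual storage space. For example, the physical space corresponding to the first virtual storage space may be storage space in memory. The first virtual storage space is a contiguous range of virtual storage space.
In some embodiments, the address translation request message includes a first virtual address and the size of the first virtual storage space. The first virtual address is the virtual start address of the first virtual storage space. It should be understood that the first virtual address is the start address of the part of the first virtual storage space, requested for translation by the PCIe device, that has not yet been translated.
In other embodiments, the first virtual storage space includes at least one minimum translation unit (Smallest Translation Unit, STU). The size of the first virtual storage space can be expressed in minimum translation units. The virtual start address of the first virtual storage space is aligned with the minimum translation unit. For example, the size of the first virtual storage space is K minimum translation units. The range of the first virtual storage space can be expressed as s_addr~s_addr+K*STU, where s_addr is the virtual start address of the first virtual storage space, s_addr+K*STU is the virtual end address of the first virtual storage space, STU is the size of the minimum translation unit, and * denotes multiplication. For example, the size of the minimum translation unit may be 4 kilobytes (KB).
The minimum translation unit is defined by the PCIe protocol and indicates the size of the smallest space for address translation. Its unit is the byte. The host can set its specific value and inform the PCIe device by configuring the corresponding bits of a register in the PCIe device's configuration space. In a computer system this value is determined by the system architecture; in general, the system sets it to the smallest granularity of the system's page tables. With this agreement on the minimum translation unit, the smallest translation space for address translation between the host and the PCIe device on the PCI Express bus is determined, as is the size of the translated address space corresponding to the translation request length. The PCI Express protocol specifies a detailed formula for determining the specific value.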
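S502. The processor receives one address translation request message.
The address translation request message includes the first virtual address and the size of the first virtual storage space. For details, see the description of S501, which is not repeated here.
S503. The processor determines P subpage tables of the same size according to the first virtual storage space.
In some embodiments, the processor may first obtain the N page tables corresponding to the first virtual storage space and then divide the N page tables according to the minimum translation unit to obtain P subpage tables of the same size, where N is an integer and N≥1. The N page tables are not necessarily the same size and are not necessarily contiguous.
The process of obtaining P subpage tables of the same size from the N page tables is described in detail below. As shown in FIG. 6, the process includes S5031 to S5033.
S5031. The processor determines N page tables according to the first virtual address.
It can be understood that the address translation request is used to request the information of the N page tables corresponding to the first virtual storage space. The first virtual storage space at least partially overlaps the virtual storage space indicated by each of the N page tables; this can equivalently be described as the first virtual storage space at least partially intersecting (overlapping) the virtual storage space indicated by each of the N page tables. For example, FIG. 7 is an example diagram of the correspondence between a virtual storage space and page tables according to an embodiment. The virtual storage space indicated by one page table partially overlaps the virtual storage space.
The physical start address of a page table is aligned with the page table size (page size). The range of the virtual storage space indicated by a page table can be expressed as v_addr~v_addr+X, where v_addr is the virtual start address of the page table and is aligned with the page table size, v_addr+X is the virtual end address of the page table, and X is the size of the page table.
In some embodiments, the processor may perform a page table walk (PTW) and determine the N page tables corresponding to the first virtual storage space according to the first virtual address and the sizes of the page tables. The N page tables are not necessarily the same size. FIG. 8 shows the specific method for determining a page table. The i-th page table is used as an example below, where i is an integer and i∈[1,N], that is, the i-th page table is any one of the N page tables.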
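S5031a. The processor determines a second virtual address according to the first virtual address and the size of the i-th page table.
In some embodiments, the second virtual address is determined using the following formula (1):
v_addr’=(s_addr+D*STU)&(~(Xi-1))   (1)
Here, v_addr’ is the second virtual address, s_addr is the first virtual address, D is the number of STUs contained in the parts of the 1st to (i-1)-th page tables that intersect the first virtual storage space, Xi is the size of the i-th page table, & denotes bitwise logical AND, and ~ denotes bitwise negation.
S5031b. The processor determines whether the second virtual address is the same as the virtual start address of the i-th page table.
If the second virtual address is the same as the virtual start address of the i-th page table, S5031c is performed; if the second virtual address differs from the virtual start address of the i-th page table, S5031d is performed.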
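Formula (1) and the comparison in S5031b can be sketched in Python as follows. The function names, the 4 KB STU, and the sample addresses are assumptions made for illustration; the sketch also assumes that every page table size is a power of two so that the mask `~(Xi-1)` is meaningful.

```python
STU = 4096  # assumed 4 KB minimum translation unit


def second_virtual_address(s_addr: int, d: int, page_size: int, stu: int = STU) -> int:
    """Formula (1): v_addr' = (s_addr + D*STU) & ~(Xi - 1).

    Rounds the first untranslated virtual address down to a multiple of
    the i-th page table size (page_size must be a power of two).
    """
    return (s_addr + d * stu) & ~(page_size - 1)


def overlaps(s_addr: int, d: int, page_v_addr: int, page_size: int, stu: int = STU) -> bool:
    """S5031b: the i-th page table overlaps the untranslated part of the
    request if v_addr' equals the page table's virtual start address."""
    return second_virtual_address(s_addr, d, page_size, stu) == page_v_addr


# A request starting at 0x7000 falls inside an 8 KB page table at 0x6000.
print(overlaps(0x7000, 0, page_v_addr=0x6000, page_size=0x2000))  # True
```

With D=1 (one STU already covered), the same request continues into a 16 KB page table that starts at 0x8000, while a page table at 0xC000 is rejected.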
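S5031c. The processor determines that the first virtual storage space at least partially overlaps the virtual storage space indicated by the i-th page table.
In this embodiment, after determining that the first virtual storage space at least partially overlaps the virtual storage space indicated by the i-th page table, the processor needs to feed back the part of the i-th page table that overlaps the first virtual storage space.
In some embodiments, the part of the virtual storage space indicated by the i-th page table that overlaps the first virtual storage space can be expressed by the following formula (2):
(p_addr+(s_addr+D*STU)-v_addr)~(p_addr+min(Xi,U))   (2)
Here, p_addr is the physical start address of the i-th page table, v_addr is the virtual start address of the i-th page table, Xi is the size of the i-th page table, and U is the size of the untranslated space in the first virtual storage space.
If the processor determines that the first virtual storage space completely overlaps the virtual storage space indicated by the i-th page table, it needs to feed back the i-th page table.
S5031d. The processor determines that the first virtual storage space does not overlap the virtual storage space indicated by the i-th page table at all.
In this embodiment, after determining that the first virtual storage space does not overlap the virtual storage space indicated by the i-th page table at all, the processor can determine that the i-th page table is not a page table corresponding to the first virtual storage space and does not need to feed back the information of the i-th page table.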
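Because the sum of the sizes of the physical address spaces corresponding to the N page tables is greater than or equal to the size of the first virtual storage space, the sub physical address spaces that intersect the first virtual storage space need to be selected from the N page tables. After determining, according to the first virtual address, the N page tables corresponding to the first virtual storage space, the processor determines P subpage tables from the N page tables.
Each of the P subpage tables is the same size. The sum of the sizes of the P subpage tables equals the size of the first virtual storage space. A subpage table indicates the mapping from virtual addresses to physical addresses for part or all of a page table. It should be understood that the virtual storage space indicated by the P subpage tables completely overlaps the first virtual storage space. The P subpage tables are obtained from the N page tables; they may not exist as page tables in the MMU/IOMMU (they are generated by this application from the N real physical page tables), but the physical address space they correspond to must belong to the physical address space corresponding to the N page tables. P is an integer, and P≥1.
In some embodiments, the size of a page table may be a multiple of the minimum translation unit. For example, the size of a page table is a power of 2. The processor may divide the N page tables according to the minimum translation unit to determine the P subpage tables. Specifically, determining the P subpage tables from the N page tables includes the following steps.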
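S5032. The processor divides the N page tables according to the minimum translation unit to obtain M subpage tables.
In some embodiments, for each of the N page tables, the processor may divide the page table equally according to the minimum translation unit. After dividing the N page tables equally according to the minimum translation unit, the processor obtains M subpage tables, where M is an integer and M≥1. Each of the M subpage tables is the same size. For example, the size of a subpage table equals the size of the minimum translation unit.
In some embodiments, the virtual start address of the s-th subpage table can be expressed as v_addr+(s-1)*STU, where v_addr is the virtual start address of the page table to which the s-th subpage table belongs, s is an integer, and s∈[1,M].
The physical start address of the s-th subpage table can be expressed as p_addr+(s-1)*STU, where p_addr is the physical start address of the page table to which the s-th subpage table belongs, s is an integer, and s∈[1,M].
For example, s=1 denotes the first subpage table, whose virtual start address is v_addr+(1-1)*STU and whose physical start address is p_addr+(1-1)*STU.
As another example, s=2 denotes the second subpage table, whose virtual start address is v_addr+(2-1)*STU and whose physical start address is p_addr+(2-1)*STU.
As another example, s=3 denotes the third subpage table, whose virtual start address is v_addr+(3-1)*STU and whose physical start address is p_addr+(3-1)*STU.
For example, FIG. 9 is an example diagram of page table division according to an embodiment. Assume that a page table includes 4 minimum translation units. Dividing the page table equally according to the minimum translation unit yields 4 subpage tables.
The virtual start address of the first subpage table is the virtual start address of the page table to which it belongs and can be expressed as v_addr. The physical start address of the first subpage table is the physical start address of the page table to which it belongs and can be expressed as p_addr.
The virtual start address of the second subpage table is the virtual end address of the first subpage table and can be expressed as v_addr+1*STU. The physical start address of the second subpage table is the physical end address of the first subpage table and can be expressed as p_addr+1*STU.
The virtual start address of the third subpage table is the virtual end address of the second subpage table and can be expressed as v_addr+2*STU. The physical start address of the third subpage table is the physical end address of the second subpage table and can be expressed as p_addr+2*STU.
The virtual start address of the fourth subpage table is the virtual end address of the third subpage table and can be expressed as v_addr+3*STU. The physical start address of the fourth subpage table is the physical end address of the third subpage table and can be expressed as p_addr+3*STU.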
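The division in S5032 can be sketched as a short Python function. The `SubPage` type, the function name `split_page`, and the sample addresses are hypothetical names introduced for illustration; the sketch assumes a 4 KB STU and a page table size that is a whole multiple of the STU.

```python
from typing import NamedTuple


class SubPage(NamedTuple):
    v_addr: int  # virtual start address of the subpage table
    p_addr: int  # physical start address of the subpage table
    size: int    # always one STU


def split_page(v_addr: int, p_addr: int, page_size: int, stu: int = 4096) -> list[SubPage]:
    """Divide one page table into STU-sized subpage tables.

    The s-th subpage table (s counted from 1) starts at
    v_addr + (s-1)*STU and p_addr + (s-1)*STU.
    """
    if page_size % stu != 0:
        raise ValueError("page size must be a multiple of the STU")
    return [SubPage(v_addr + k * stu, p_addr + k * stu, stu)
            for k in range(page_size // stu)]


# A page table covering 4 STUs yields 4 subpage tables, as in FIG. 9.
for sub in split_page(v_addr=0x8000, p_addr=0x40000, page_size=4 * 4096):
    print(hex(sub.v_addr), hex(sub.p_addr))
```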
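S5033. The processor determines P subpage tables from the M subpage tables.
In some embodiments, the first virtual storage space includes at least one minimum translation unit, and the virtual start address of the first virtual storage space is aligned with the minimum translation unit. The processor may divide the first virtual storage space according to the minimum translation unit to determine P address translation units. It compares the virtual start address of each of the M subpage tables with the virtual start address corresponding to each of the P address translation units; if the virtual start address of a subpage table is the same as the virtual start address corresponding to an address translation unit, it determines that the virtual storage space indicated by that subpage table completely overlaps the virtual storage space corresponding to the address translation unit and that the subpage table needs to be fed back. By traversing the M subpage tables in this way, the processor determines the subpage tables that overlap the first virtual storage space, until the sum of the sizes of the determined subpage tables equals the size of the first virtual storage space.
In some embodiments, the virtual start address corresponding to the P-th address translation unit can be expressed as s_addr+(P-1)*STU, where s_addr is the virtual start address of the first virtual storage space.
For example, P=1 denotes the first address translation unit, whose virtual start address is s_addr+(1-1)*STU.
As another example, P=2 denotes the second address translation unit, whose virtual start address is s_addr+(2-1)*STU.
As another example, P=3 denotes the third address translation unit, whose virtual start address is s_addr+(3-1)*STU.
For example, if the virtual start address v_addr+(1-1)*STU of the first subpage table equals the virtual start address s_addr+(1-1)*STU of the first address translation unit, it is determined that the virtual storage space indicated by the first subpage table completely overlaps the virtual storage space corresponding to the first address translation unit, and the first subpage table needs to be fed back.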
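The comparison in S5033 can be sketched as a lookup keyed on virtual start addresses. This is a minimal illustration, not the processor's actual hardware logic: the function name `select_subpages`, the `(v_addr, p_addr)` tuple representation, and the sample addresses are assumptions. The request is assumed to be STU-aligned already.

```python
def select_subpages(s_addr: int, k: int, subpages, stu: int = 4096):
    """Pick the subpage tables that match the K address translation units.

    subpages is an iterable of (v_addr, p_addr) pairs for the M subpage
    tables. The result lists (unit index, physical start address) in
    virtual address order; units with no matching subpage table are
    skipped, which models a partial translation.
    """
    by_virtual_start = {v: p for v, p in subpages}
    selected = []
    for idx in range(k):
        unit_v_addr = s_addr + idx * stu  # start of unit idx (0-based)
        if unit_v_addr in by_virtual_start:
            selected.append((idx, by_virtual_start[unit_v_addr]))
    return selected


# M = 4 subpage tables from two 8 KB page tables; the request covers 3 STUs.
m_subpages = [(0x6000, 0x100000), (0x7000, 0x101000),
              (0x8000, 0x200000), (0x9000, 0x201000)]
print(select_subpages(0x7000, 3, m_subpages))
```

The subpage table at 0x6000 lies before the requested range, so only three of the four subpage tables are selected.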
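In other embodiments, if the virtual start address of the first virtual storage space is not aligned with the minimum translation unit, the virtual start address of the first virtual storage space is first aligned with the minimum translation unit, and the first virtual storage space is then divided according to the minimum translation unit.
In other embodiments, subpage tables that belong to the same page table have the same page table attributes, such as read and write permissions, as the page table to which they belong.
In other embodiments, the processor has not obtained some of the page tables corresponding to the first virtual storage space and therefore cannot obtain some of the physical addresses corresponding to the virtual addresses of the first virtual storage space. In this case, the sum of the sizes of the P subpage tables is smaller than the size of the first virtual storage space. It should be understood that the virtual storage space indicated by the P subpage tables then partially overlaps the first virtual storage space.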
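S504. The processor sends one address translation response message to the PCIe device.
The address translation response message includes the physical start addresses of the P subpage tables and the sizes of the P subpage tables. Each of these physical start addresses indicates the physical start address of one subpage table. In some embodiments, the address translation response message may also include the attributes of the P subpage tables. The attributes of a subpage table are the same as those of the page table to which it belongs. The attributes of a subpage table or page table include the attributes, in the system, of the memory space indicated by the subpage table or page table. The attributes of a page table include but are not limited to a read/write permission attribute, a global page table attribute, and a cannot-use-physical-address attribute. The read/write permission attribute indicates the read and write permissions, in the system, of the memory space indicated by the page table. The global page table attribute indicates that the memory space indicated by the page table is globally available in the system. The cannot-use-physical-address attribute indicates that untrusted devices cannot access the memory space indicated by the page table. In other embodiments, other attributes may also be included, which are not described here.
It should be noted that PCIe Base Specification Revision 4.0 and PCIe Base Specification Revision 5.0 also specify that when one address translation response message contains multiple page tables, the order of the page tables is determined by the virtual address order of the virtual storage space requested for translation. The physical start addresses of the P subpage tables contained in the address translation response message are therefore sorted. For example, the physical start addresses of the P subpage tables are sorted according to the virtual address order of the first virtual storage space. This prevents the subpage tables from being out of order, which would cause the PCIe device to determine wrong physical addresses and access the wrong physical storage space.
In some embodiments, after receiving the address translation request message, the processor may divide the first virtual storage space according to the minimum translation unit to obtain P address translation request units. Because the first virtual storage space has a logical order, the P address translation units also follow the virtual address order of the first virtual storage space. The address order may refer to the order of virtual addresses from largest to smallest. The P address translation units may be sorted according to the virtual addresses of the first virtual storage space from largest to smallest.
The processor may set the identifiers of the P address translation request units according to the virtual address order of the first virtual storage space. The identifiers of the P address translation request units may be ordered according to the virtual addresses of the first virtual storage space from largest to smallest. The processor can then sort the physical start addresses of the P subpage tables according to the order of the identifiers of the P address translation request units. In some embodiments, the identifiers of the P address translation request units may be placed in the header of the address translation response message.
For example, assume P=3, meaning that the processor determines 3 subpage tables. The first physical address is the physical start address of the first subpage table. The second physical address is the physical start address of the second subpage table. The third physical address is the physical start address of the third subpage table. The processor sorts the first, second, and third physical addresses according to the virtual address order of the first virtual storage space.
In some embodiments, the physical addresses may be ordered first, followed by the sizes of the subpage tables. For example, the order is: the first physical address, the second physical address, the third physical address, the size of the first subpage table, the size of the second subpage table, and the size of the third subpage table. Here, the size of the first subpage table is the size of the subpage table corresponding to the first physical address, the size of the second subpage table is the size of the subpage table corresponding to the second physical address, and the size of the third subpage table is the size of the subpage table corresponding to the third physical address.
In other embodiments, the physical addresses and the sizes of the subpage tables may be interleaved. For example, the order is: the first physical address, the size of the first subpage table, the second physical address, the size of the second subpage table, the third physical address, and the size of the third subpage table. Here, the size of the first subpage table is the size of the subpage table corresponding to the first physical address, the size of the second subpage table is the size of the subpage table corresponding to the second physical address, and the size of the third subpage table is the size of the subpage table corresponding to the third physical address.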
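The two orderings described above (addresses first, or addresses interleaved with sizes) can be illustrated with plain Python lists. This sketch only shows the ordering of the fields; it does not model the actual PCIe translation completion format, and the function names and sample addresses are assumptions.

```python
def pack_sequential(entries):
    """All physical start addresses first, then all sizes, in the same order."""
    return [p_addr for p_addr, _ in entries] + [size for _, size in entries]


def pack_interleaved(entries):
    """Each physical start address followed directly by its size."""
    return [field for entry in entries for field in entry]


# Three subpage tables (P = 3), already sorted by virtual address.
entries = [(0x101000, 4096), (0x200000, 4096), (0x201000, 4096)]
print(pack_sequential(entries))
print(pack_interleaved(entries))
```

In both layouts the i-th size belongs to the i-th physical address, so the receiver can pair them back up as long as it knows which layout is used.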
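S505. The PCIe device receives one address translation response message.
The address translation response message includes the translation result. The translation result contains the physical start addresses of the P subpage tables, the sizes of the P subpage tables, the attributes of the P subpage tables, and so on. Each of the P subpage tables is the same size, the sum of the sizes of the P subpage tables equals the size of the first virtual storage space, P is an integer, and P≥1. For details, see the description of S504, which is not repeated here.
S506. The PCIe device accesses the physical storage space corresponding to the first virtual storage space according to the physical start addresses of the P subpage tables and the sizes of the P subpage tables.
After receiving the translation result, the PCIe device establishes a one-to-one mapping from the virtual storage space to the physical storage space, for example information such as the first virtual storage space and its corresponding physical storage space, the read and write access permissions of the related physical storage space, and the attributes of the corresponding space. In this way, when the PCIe device initiates an access to the processor's memory, it can convert the virtual address into the corresponding locally maintained physical address to access the processor, and mark the address in the PCIe message sent to the memory as translated; the attributes corresponding to the address are placed in the header prefix of the message or in other reserved fields.
The PCIe device determines the physical address in memory according to the physical address of the subpage table and the offset address within the virtual address of the first virtual storage space, and accesses the physical storage space corresponding to the first virtual storage space according to the physical address in memory and the size of the subpage table. For the specific method, refer to the prior art; it is not described here.
In some embodiments, after receiving the address translation response message and parsing it to obtain the physical start addresses of the P subpage tables and the sizes of the P subpage tables, the PCIe device may store them directly in the local address translation cache (ATC). The order of the physical start addresses and sizes of the P subpage tables corresponds to the order of the first virtual storage space from its virtual start address to its virtual end address. When the PCIe device needs to access the physical storage space corresponding to the first virtual storage space, it does so according to the physical start addresses of the P subpage tables and the sizes of the P subpage tables.
In other embodiments, the PCIe device may, following the order of the physical start addresses and sizes of the P subpage tables, determine in turn which subpage tables can be merged into one contiguous page table whose size is a power of 2. If subpage tables can be merged, the physical start address and size of the merged page table are cached in the address translation cache; otherwise the physical start address and size of the subpage table are cached directly in the address translation cache. A merged page table includes one or more subpage tables. When the PCIe device needs to access the physical storage space corresponding to the first virtual storage space, it does so according to the physical start address and size of the merged page table.
For example, as shown in FIG. 8, after S506, S507 may be performed, or S508 and S509 may be performed.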
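S507. The PCIe device stores the physical start addresses of the P subpage tables and the sizes of the P subpage tables.
S508. The PCIe device merges the P subpage tables according to the physical start addresses of the P subpage tables and the sizes of the P subpage tables to obtain the physical start addresses of N page tables and the sizes of the N page tables.
Because the P subpage tables are obtained by dividing at least one page table, the PCIe device can merge the subpage tables that belong to the same page table according to the physical start addresses and sizes of the P subpage tables, obtaining the physical start addresses of N page tables and the sizes of the N page tables, where N is an integer and 1≤N≤P.
S509. The PCIe device stores the physical start addresses of the N page tables and the sizes of the N page tables.
In some embodiments, the P subpage tables are traversed and merged according to the following method. The j-th subpage table is used as an example below, where j is an integer, j∈[1,P], and the j-th subpage table is any one of the P subpage tables. As shown in FIG. 10, the specific method for merging the P subpage tables is as follows.
S1001. The PCIe device determines whether the physical start address of the j-th subpage table equals the merge address.
The merge address indicates the physical end address of the merged page table. The merged page table may include at least one subpage table.
If the physical address of the j-th subpage table is not equal to the merge address, the j-th subpage table and the (j-1)-th subpage table do not belong to the same page table, and S1002 is performed; if the physical address of the j-th subpage table equals the merge address, S1003 is performed.
S1002. The PCIe device caches the physical start address of the j-th subpage table and the size of the j-th subpage table.
In some embodiments, the PCIe device may cache the physical start address of the j-th subpage table and the size of the j-th subpage table in the address translation cache.
S1003. The PCIe device determines whether the attributes of the j-th subpage table are the same as the attributes of the merged page table.
If the attributes of the j-th subpage table differ from those of the merged page table, the j-th subpage table and the (j-1)-th subpage table do not belong to the same page table, and S1002 is performed; if the attributes are the same, the j-th subpage table and the (j-1)-th subpage table belong to the same page table, and S1004 is performed.
S1004. The PCIe device determines whether the size of the j-th subpage table is the same as the size of the merged page table.
If the size of the j-th subpage table differs from the size of the merged page table, then although the j-th subpage table and the merged page table belong to the same page table, merging them would not yield a size that is a power of 2, and S1005 is performed; if the sizes are the same, the j-th subpage table and the (j-1)-th subpage table belong to the same page table, and S1006 is performed.
S1005. The PCIe device caches the physical end address of the j-th subpage table and the size of the j-th subpage table.
In some embodiments, the PCIe device may first cache the j-th subpage table in a cache area other than the address translation cache, move on to the (j+1)-th subpage table, and determine whether the (j+1)-th subpage table can be merged with the j-th subpage table. For details, see the description of S1001 to S1006, which is not repeated here.
S1006. The PCIe device merges the j-th subpage table with the merged page table.
In some embodiments, if the (j+1)-th subpage table cannot be merged with the merged page table, the merged page table is final, and the PCIe device may cache the physical start address and size of the final merged page table in the address translation cache.
In some embodiments, after merging the j-th subpage table with the merged page table, the PCIe device updates the merge address and the size of the merged page table so that it can further determine whether other subpage tables can be merged.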
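The merge flow S1001 to S1006 can be approximated in software with a small stack that plays the role of the temporary buffer. The sketch below is an interpretation, not the application's hardware design: the function name `merge_subpages`, the `(start, size, attr)` tuples, and the concrete addresses are assumptions. It also adds one check that the flow above does not state explicitly, namely that a merged block must start on a multiple of its new size, so that every merged page table stays naturally aligned.

```python
def merge_subpages(entries):
    """Merge STU-sized translation results into power-of-two blocks.

    entries is a list of (physical start, size, attributes) in virtual
    address order. Two neighbouring blocks are merged when the second
    starts where the first ends (S1001), their attributes match (S1003),
    their sizes are equal (S1004), and the merged block would be aligned
    to its doubled size (an assumption added in this sketch).
    """
    merged = []  # acts as the temporary buffer; the top is the newest block
    for entry in entries:
        merged.append(entry)
        while len(merged) >= 2:
            (s0, z0, a0), (s1, z1, a1) = merged[-2], merged[-1]
            contiguous = s0 + z0 == s1
            if contiguous and a0 == a1 and z0 == z1 and s0 % (2 * z0) == 0:
                merged[-2:] = [(s0, 2 * z0, a0)]  # S1006: merge the pair
            else:
                break
    return merged


STU = 0x1000
# Physical layout modelled on the example of 8 subpage tables from page
# tables of 2, 2, 1, and 4 STUs; the base addresses are assumed values.
p0, p1, p2, p3 = 0x10000, 0x40000, 0x83000, 0xC0000
subpages = [(p0 + STU, STU, "rw"), (p1, STU, "rw"), (p1 + STU, STU, "rw"),
            (p2, STU, "rw"), (p3, STU, "rw"), (p3 + STU, STU, "rw"),
            (p3 + 2 * STU, STU, "rw"), (p3 + 3 * STU, STU, "rw")]
for start, size, _ in merge_subpages(subpages):
    print(hex(start), size // STU, "STU")
```

Eight results collapse into four blocks of 1, 2, 1, and 4 STUs. The seventh result is held back because its size differs from the 2-STU block before it, and once the eighth arrives the two pairs combine into a single 4-STU block.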
本申请的实施例提供的地址翻译方法,处理器可以根据第一虚拟存储空间确定P个大小相同的子页表,且P个子页表的大小之和等于第一虚拟存储空间的大小,并通过一个地址翻译响应报文反馈该P个子页表的物理起始地址和P个子页表的大小。因 此,PCIe设备只需要发送一条地址翻译请求报文,便可以通过一个地址翻译响应报文获取到第一虚拟存储空间的所有虚拟地址对应的物理地址。从而,在符合PCIe基本规范的规定的情况下,能够有效地降低PCIe设备请求翻译地址的时延和PCIe设备与处理器间的带宽占用率。
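The following describes the address translation method by way of example with reference to FIG. 11 to FIG. 14. As shown in FIG. 11, assume that the size of the first virtual storage space is 8 smallest translation units (STUs). s_addr denotes the first virtual start address of the first virtual storage space. The PCIe device sends an address translation request packet to the processor. The packet includes the first virtual address and the size of the first virtual storage space.
The processor determines 4 page tables based on the first virtual address. The first page table has the physical start address p_addr0 and contains 2 STUs; the second page table has the physical start address p_addr1 and contains 2 STUs; the third page table has the physical start address p_addr2 and contains 1 STU; the fourth page table has the physical start address p_addr3 and contains 4 STUs. The method of determining page tables based on the first virtual address is as described in S5031 and is not repeated here. The processor divides the 4 page tables by the STU to obtain 9 sub-page tables. The first page table is divided into 2 sub-page tables, with physical start addresses p_addr0 and p_addr0+STU and virtual start addresses v_addr0 and v_addr0+STU. The second page table is divided into 2 sub-page tables, with physical start addresses p_addr1 and p_addr1+STU and virtual start addresses v_addr1 and v_addr1+STU. The third page table is divided into 1 sub-page table, with the physical start address p_addr2 and the virtual start address v_addr2. The fourth page table is divided into 4 sub-page tables, with physical start addresses p_addr3, p_addr3+STU, p_addr3+2*STU, and p_addr3+3*STU and virtual start addresses v_addr3, v_addr3+STU, v_addr3+2*STU, and v_addr3+3*STU.
The processor may divide the first virtual storage space by the STU to determine 8 address translation units, and compare the virtual start address of each of the 9 sub-page tables with the virtual start address of each of the 8 address translation units. If the virtual start address of a sub-page table equals the virtual start address of an address translation unit, the virtual storage space indicated by the sub-page table fully overlaps the virtual storage space of that address translation unit, and the sub-page table is one that needs to be returned. In this way, the processor traverses the M sub-page tables and determines the sub-page tables that overlap the first virtual storage space, until the sum of the sizes of the determined sub-page tables equals the size of the first virtual storage space. In this embodiment, assume that the virtual storage space indicated by 8 of the 9 sub-page tables fully overlaps the first virtual storage space, and that the physical start addresses of these 8 sub-page tables are p_addr0+STU, p_addr1, p_addr1+STU, p_addr2, p_addr3, p_addr3+STU, p_addr3+2*STU, and p_addr3+3*STU.
The processor sends an address translation response packet that includes the physical start addresses and the sizes of the 8 sub-page tables. In some embodiments, the physical start addresses and sizes of the 8 sub-page tables may be ordered by the virtual start addresses of the corresponding address translation units.
After receiving the address translation response packet, the PCIe device can establish a one-to-one mapping from the first virtual storage space to the physical storage space. For example, the virtual start address s_addr of the first virtual storage space corresponds to the physical start address p_addr0+STU. The virtual address s_addr+STU corresponds to the physical address p_addr1. The virtual address s_addr+3*STU corresponds to the physical address p_addr2. The virtual address s_addr+4*STU corresponds to the physical address p_addr3.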
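The FIG. 11 walk-through can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the concrete addresses, the 4 KB STU, and the names `Page`, `split_into_subpages`, and `select_subpages` are hypothetical and do not come from the application. The sketch splits each page table into STU-sized sub-page tables and then picks, in virtual-address order, the sub-page table whose virtual start address matches each address translation unit.

```python
from typing import NamedTuple

STU = 0x1000  # assumed smallest translation unit: 4 KB


class Page(NamedTuple):
    v_addr: int  # virtual start address
    p_addr: int  # physical start address
    n_stu: int   # page table size, in STUs


def split_into_subpages(pages):
    """Split every page table into STU-sized (virtual start, physical start) pairs."""
    return [(pg.v_addr + k * STU, pg.p_addr + k * STU)
            for pg in pages for k in range(pg.n_stu)]


def select_subpages(pages, s_addr, n_units):
    """Return the physical start address backing each STU-sized unit of the request.

    Results are in virtual-address order. The loop stops at the first unit
    that has no sub-page table, so a partial result is possible.
    """
    by_vaddr = dict(split_into_subpages(pages))
    selected = []
    for k in range(n_units):
        unit_vaddr = s_addr + k * STU
        if unit_vaddr not in by_vaddr:
            break
        selected.append(by_vaddr[unit_vaddr])
    return selected


# Hypothetical layout mirroring the example: page tables of 2, 2, 1 and 4 STUs.
pages = [
    Page(0x10000, 0x80000, 2),  # v_addr0, p_addr0
    Page(0x12000, 0xA0000, 2),  # v_addr1, p_addr1
    Page(0x14000, 0xC0000, 1),  # v_addr2, p_addr2
    Page(0x15000, 0xE0000, 4),  # v_addr3, p_addr3
]
s_addr = 0x11000  # the request starts one STU into the first page table
selected = select_subpages(pages, s_addr, 8)
```

With this layout, the first selected entry is p_addr0+STU and the fifth is p_addr3, which matches the mapping listed above.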
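In some embodiments, as shown in FIG. 12a, the PCIe device may first align the virtual start address of the first virtual storage space to the STU using formula (3) below, and then send the address translation request packet to the processor, so that the processor can divide the first virtual storage space by the STU.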
s_addr=s_addr&(~(STU-1))   (3)
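As a minimal sketch of formula (3), assuming an STU of 4 KB (the function name `align_down_to_stu` is hypothetical), the mask clears the low bits of the address so that the result is a multiple of the STU:

```python
STU = 4096  # assumed smallest translation unit in bytes; must be a power of two


def align_down_to_stu(s_addr: int, stu: int = STU) -> int:
    """Formula (3): round s_addr down to the nearest multiple of stu."""
    if stu <= 0 or stu & (stu - 1):
        raise ValueError("stu must be a power of two")
    return s_addr & ~(stu - 1)


aligned = align_down_to_stu(0x12345)  # the low 12 bits are cleared
```

The mask only works when the STU is a power of two, which is why the sketch rejects other values.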
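In this embodiment, the processor may divide the first virtual storage space by the STU to obtain 8 address translation request units. The virtual start addresses of the first to eighth address translation request units are s_addr, s_addr+STU, s_addr+2*STU, s_addr+3*STU, s_addr+4*STU, s_addr+5*STU, s_addr+6*STU, and s_addr+7*STU, respectively. The processor then assigns the identifiers P0 to P7 to the 8 address translation request units in the order of the virtual addresses of the first virtual storage space. Through page table walks, the processor can translate the 8 address translation request units, that is, determine the sub-page table corresponding to each of them. In some embodiments, formula (1) may be used to determine the page table corresponding to an address translation request unit, and the overlapping sub-page table within that page table is then determined. For example, the overlapping sub-page table in the page table corresponding to the first address translation request unit is determined according to formula (4).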
trans_addr=s_addr&(~(X1-1))+p_addr0   (4)
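Here, trans_addr denotes the translated address, s_addr denotes the virtual start address of the first virtual storage space, X1 denotes the size of the first page table, and p_addr0 denotes the physical start address of the page table. The translated address corresponds to the virtual start address of the address translation request unit. The translation size is one STU.
After obtaining the sub-page tables corresponding to the 8 address translation request units, the processor sorts the physical start addresses and sizes of the 8 sub-page tables in the order of the identifiers of the 8 address translation request units, assembles them into an address translation response packet, and returns the packet to the peer.
In some embodiments, the sub-page tables for the virtual storage space of the 8 address translation units may be determined in parallel using the method above. In other embodiments, the processor does not split the first virtual storage space uniformly into multiple STU-sized address translation request units. Instead, it processes the first virtual storage space serially: for a single ATS request, it first attempts one translation and, after obtaining the result, determines whether the entire address range requested by the PCIe device has been translated. If not, it continues to translate the remaining untranslated address space. Multiple different ATS requests are still processed in parallel, each following the translation procedure described in the foregoing embodiments.
For example, as shown in FIG. 12b, the processor first aligns the virtual start address of the first virtual storage space to the STU and then determines one page table corresponding to the first virtual storage space. For example, the physical start address of the page table is p_addr and the size of the page table is 2 STUs. The sub-page tables in this page table that overlap the first virtual storage space are then determined according to formulas (5) to (9) below.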
X=min(X,U)   (5)
st_ovl=s_addr&(X-1)   (6)
ovl_size=X-st_ovl   (7)
s_addr=s_addr+ovl_size   (8)
U=U-ovl_size   (9)
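Here, X denotes the size of the page table, and U denotes the size of the space in the first virtual storage space that has not yet been translated. min(a,b) returns the smaller of a and b. st_ovl is the offset, within the physical address space of the current translation result, of the intersection between the current translation result and the first virtual storage space. The & in formula (6) denotes bitwise AND. ovl_size denotes the size of the portion that overlaps the first virtual storage space.
All of the sizes above are in units of the STU, so the logic operations can be carried out on STU-aligned values, which reduces the bit width of the signals involved. Because the STU is at least 4 KB, the bit width is reduced by at least 12 bits.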
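The bit-width saving can be illustrated with a short Python sketch, assuming an STU of exactly 4 KB. The helper name `to_stu_units` and the sample values are hypothetical. Because every address and size is STU-aligned, the low 12 bits are always zero and can be dropped before evaluating formula (6):

```python
STU_SHIFT = 12  # assumes STU = 4 KB = 2**12 bytes


def to_stu_units(value_in_bytes: int) -> int:
    """Drop the 12 always-zero low bits of an STU-aligned address or size."""
    assert value_in_bytes % (1 << STU_SHIFT) == 0, "value must be STU-aligned"
    return value_in_bytes >> STU_SHIFT


# Formula (6) evaluated once on byte values and once on STU-unit values.
s_addr, x = 0x11000, 0x2000  # hypothetical address and page table size
st_ovl_bytes = s_addr & (x - 1)
st_ovl_units = to_stu_units(s_addr) & (to_stu_units(x) - 1)
```

Both computations describe the same offset; the second one simply operates on operands that are 12 bits narrower.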
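From the first translation result, the processor determines how large the intersection between the current translation result and the first virtual storage space is (the size of this intersection is always a multiple of the STU) and calculates the physical start address of the intersection. The processor then cuts the intersection into STU-sized pieces, outputs one or more STU-sized translation results, and caches them in order. The processor also calculates how much address space remains untranslated. If the remaining size is not 0, it initiates another translation.
Based on the second translation result and the information saved from earlier processing, such as the physical start address of the virtual address space, the processor repeats the serial steps above until U reaches 0 and all translation results have been obtained. It then assembles all the translation results into an address translation response packet according to the protocol and returns the packet to the PCIe device.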
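The serial procedure can be sketched as the loop below. It is a variant under stated assumptions, not the exact sequence of formulas (5) to (9): instead of clamping X first as in formula (5), it clamps the overlap size to the remaining space U, which keeps the mask in formula (6) a power of two. The page table contents, the 4 KB STU, and the names `PAGES`, `lookup`, and `translate_serial` are hypothetical. `lookup` stands in for a page table walk, and each page table is assumed to be naturally aligned to its power-of-two size.

```python
STU = 0x1000  # assumed smallest translation unit: 4 KB

# Hypothetical page tables: (virtual start, physical start, size in bytes).
# Each size is a power of two and each page table is naturally aligned,
# which is what lets the mask in formula (6) yield the offset inside it.
PAGES = [
    (0x10000, 0x80000, 2 * STU),
    (0x12000, 0xA0000, 2 * STU),
    (0x14000, 0xC0000, 4 * STU),
    (0x18000, 0xE0000, 2 * STU),
]


def lookup(v_addr):
    """Stand-in for a page table walk: return the page table covering v_addr."""
    for v_start, p_start, size in PAGES:
        if v_start <= v_addr < v_start + size:
            return v_start, p_start, size
    return None


def translate_serial(s_addr, u):
    """Translate u bytes starting at STU-aligned s_addr, one page table per pass."""
    results = []  # STU-sized (physical start, size) pairs, in virtual order
    while u > 0:
        page = lookup(s_addr)
        if page is None:  # unmapped space: return what has been translated
            break
        _, p_start, x = page
        st_ovl = s_addr & (x - 1)      # formula (6): offset inside the page table
        ovl_size = min(x - st_ovl, u)  # formula (7), clamped to the remaining space
        for off in range(st_ovl, st_ovl + ovl_size, STU):
            results.append((p_start + off, STU))
        s_addr += ovl_size             # formula (8)
        u -= ovl_size                  # formula (9)
    return results


results = translate_serial(0x11000, 8 * STU)
```

In this run the loop makes four passes, one per page table, and emits eight STU-sized results. The last page table is only partly consumed because U reaches 0 first.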
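The processor may dynamically decide whether an address translation request is processed in parallel or serially based on values such as the current translation latency and the predicted average page table size (for example, predicted per class according to the requester ID in the address translation request packet). The processor may also use a translation latency threshold (for example, latency above the threshold is considered too high) to dynamically choose between parallel processing, serial processing, or returning only part of the translation results to the PCIe device. Returning partial results lets the PCIe device start using translation results as soon as possible instead of waiting a long time for them. For example, when the latency is too high, the processor returns a single translation result directly, so the translation latency seen by the PCIe device may be smaller because a usable result arrives within a short time.
In some embodiments, as shown in FIG. 13, because the processor has not established page tables for the entire first virtual storage space, the processor cannot obtain the sub-page tables corresponding to some of the address translation units. In this case, the processor returns only the sub-page tables for the address translation units that it was able to obtain (for example, P0 to P2).
After receiving the address translation response packet, the PCIe device accesses the physical storage space corresponding to the first virtual storage space based on the physical start addresses and sizes of the 8 sub-page tables. The virtual address s_addr of the first virtual storage space corresponds to the physical address p_addr0+STU. The virtual address s_addr+STU corresponds to the physical address p_addr1. The virtual address s_addr+3*STU corresponds to the physical address p_addr2. The virtual address s_addr+4*STU corresponds to the physical address p_addr3.
In some embodiments, as shown in (a) of FIG. 14, the PCIe device first caches the physical start addresses and sizes of the 8 sub-page tables.
In other embodiments, as shown in (b) of FIG. 14, the PCIe device merges the 8 sub-page tables based on their physical start addresses and sizes to obtain 4 physical addresses and 4 page table sizes.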
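The following describes the method of merging sub-page tables by way of example with reference to FIG. 15 to FIG. 17. As shown in FIG. 15, the PCIe device may first store the physical start address and size of the first sub-page table in a buffer (L_c) and update the merge address (m_addr) and the size of the merged page table. The merge address is the physical end address of the first sub-page table, which is p_addr0+2*STU. The size of the merged page table is 1 STU. Note that L_c may be part of a register, and its storage depth may be the physical start addresses and sizes of 8 sub-page tables. According to the PCIe protocol, the storage depth of L_c may be 16. Because the physical start address of the second sub-page table (p_addr1) is not equal to the merge address (p_addr0+2*STU), the physical start address and size of the first sub-page table held in L_c are stored in the address translation cache, and the physical start address and size of the second sub-page table are stored in L_c. The merge address (m_addr) and the size of the merged page table are updated. The merge address is the physical end address of the second sub-page table, which is p_addr1+1*STU. The size of the merged page table is 1 STU. L_wr indicates the position in L_c that holds the merge address and the size of the merged page table, that is, the values for the currently merged page table.
As shown in FIG. 16, because the physical start address of the third sub-page table (p_addr1+1*STU) equals the merge address (p_addr1+1*STU), the attribute of the third sub-page table is the same as that of the merged page table, and the size of the third sub-page table equals the size of the merged page table, the physical start address and size of the third sub-page table are stored in L_c, and the second and third sub-page tables are merged. The merge address (m_addr) and the size of the merged page table are updated. The merge address is the physical end address of the third sub-page table, which is p_addr1+2*STU. The size of the merged page table is 2 STUs. The physical start address of the merged page table is the physical start address of the second sub-page table (p_addr1).
Because the physical start address of the fourth sub-page table (p_addr2) is not equal to the merge address (p_addr1+2*STU), the physical start address (p_addr1) and size (2*STU) of the merged page table held in L_c are stored in the address translation cache, and the physical start address and size of the fourth sub-page table are stored in L_c. The merge address (m_addr) and the size of the merged page table are updated. The merge address is the physical end address of the fourth sub-page table, p_addr2+1*STU. The size of the merged page table is 1 STU.
As shown in FIG. 17, similarly, after the fifth and sixth sub-page tables are merged, the merge address is the physical end address of the sixth sub-page table (p_addr3+2*STU). The size of the merged page table is 2 STUs.
Because the physical start address of the seventh sub-page table (p_addr3+2*STU) equals the merge address (p_addr3+2*STU) and the attribute of the seventh sub-page table is the same as that of the merged page table, but the size of the seventh sub-page table is not equal to the size of the merged page table, the physical start address and size of the seventh sub-page table are first stored in L_c and the seventh sub-page table is not merged. The merge address (m_addr) and the size of the merged page table are updated. The merge address is the physical end address of the seventh sub-page table (p_addr3+3*STU). The size of the merged page table is 1 STU.
Because the physical start address of the eighth sub-page table (p_addr3+3*STU) equals the merge address (p_addr3+3*STU), the attribute of the eighth sub-page table is the same as that of the merged page table, and the size of the eighth sub-page table equals the size of the merged page table, the physical start address and size of the eighth sub-page table are stored in L_c, and the fifth to eighth sub-page tables are merged. The merge address (m_addr) and the size of the merged page table are updated. The merge address is the physical end address of the eighth sub-page table (p_addr3+4*STU). The physical start address of the merged page table is the physical start address of the fifth sub-page table (p_addr3), and the size of the merged page table is 4*STU.
L_wr indicates the position in L_c that holds the merge address and the size of the merged page table, that is, the values for the currently merged page table. For example, when the physical start address of the j-th sub-page table (the current translation result) equals the merge address register value that L_wr points to, but the size of the j-th sub-page table is not equal to the merged size register value that L_wr currently points to, the physical end address and size of the j-th sub-page table are cached in L_c, and L_wr is updated (for example, incremented by 1) so that it points to the physical end address and size of the j-th sub-page table. When the physical start address of the (j+1)-th sub-page table then equals the register value that L_wr points to, and the size of the (j+1)-th sub-page table equals the size of the j-th sub-page table that L_wr points to, the j-th and (j+1)-th sub-page tables are merged and L_wr is updated to L_wr-1, so that L_wr points to the physical start address of the j-th sub-page table and the size of the merged page table.
Merging sub-page tables in this way reduces the storage space occupied by page table information and improves ATC utilization and address translation efficiency.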
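The merging walk-through of FIG. 15 to FIG. 17 can be reproduced with a small stack-based sketch. It is an illustration under stated assumptions rather than the register-level design: the list `pending` plays the role of L_c, the returned list plays the role of the address translation cache, and the addresses, the attribute string "rw", and the name `merge_subpages` are hypothetical. Two neighbouring entries are merged only when they are physically contiguous, share the same attribute, and have the same size, so every merged size stays a power of two.

```python
STU = 0x1000  # assumed smallest translation unit: 4 KB


def merge_subpages(subpages):
    """Merge (physical start, size, attribute) entries received in virtual order."""
    atc = []      # entries that can no longer grow (written to the ATC)
    pending = []  # entries that may still merge (the role of L_c)
    for start, size, attr in subpages:
        if pending:
            last_start, last_size, last_attr = pending[-1]
            # The merge address is the physical end address of the newest entry.
            if last_start + last_size != start or last_attr != attr:
                atc.extend(pending)  # not contiguous or different attribute
                pending.clear()
        pending.append((start, size, attr))
        # Cascade: merge the two newest entries while their sizes are equal.
        while len(pending) >= 2 and pending[-2][1] == pending[-1][1]:
            _, size_b, _ = pending.pop()
            start_a, size_a, attr_a = pending.pop()
            pending.append((start_a, size_a + size_b, attr_a))
    atc.extend(pending)
    return atc


# The 8 sub-page tables of the example, with hypothetical physical addresses:
# p_addr0 = 0x80000, p_addr1 = 0xA0000, p_addr2 = 0xC0000, p_addr3 = 0xE0000.
starts = [0x81000, 0xA0000, 0xA1000, 0xC0000, 0xE0000, 0xE1000, 0xE2000, 0xE3000]
merged = merge_subpages([(p, STU, "rw") for p in starts])
```

For this input the sketch yields four entries with sizes of 1, 2, 1, and 4 STUs, matching (b) of FIG. 14. The seventh sub-page table is held back until the eighth arrives, after which the two merge with each other and then with the already merged fifth and sixth. The sketch does not check whether a merged entry is naturally aligned in physical memory, because the description only compares addresses, attributes, and sizes.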
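In the address translation method provided in the embodiments of this application, for any address translation request, regardless of how many page tables correspond to its virtual address space and whether those page tables all have the same size, the processor cuts all the translated physical address space that overlaps the virtual address space into STU-sized translation results and returns them to the PCIe peer according to the PCIe protocol. Software and PCIe devices require no adaptation, so the method is compatible with all existing protocols and software architectures and with all PCIe devices. All translation results can be returned without violating the PCIe protocol standard.
In addition, when initiating an address translation request, the peer PCIe device does not need to consider whether the page tables corresponding to the requested virtual address space are all the same (a condition that would otherwise prevent all translation results from being returned). The PCIe device simply treats the virtual address space it wants translated as one contiguous range defined by a start address and a size, and initiates the address translation request. This simplifies the implementation of the address translation request function in the peer PCIe device.
It can be understood that, to implement the functions in the foregoing embodiments, the processor and the PCIe device include corresponding hardware structures and/or software modules for performing each function. A person skilled in the art should readily appreciate that, in combination with the units and method steps of the examples described in the embodiments disclosed in this application, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application scenario and design constraints of the technical solution.
In the embodiments of this application, unless otherwise specified or logically conflicting, the terms and/or descriptions in different embodiments are consistent and may be mutually referenced, and the technical features in different embodiments may be combined according to their inherent logical relationships to form new embodiments.
In this application, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate A alone, both A and B, or B alone, where A and B may be singular or plural. In the text of this application, the character "/" generally indicates an "or" relationship between the associated objects. In the formulas of this application, the character "/" indicates a "division" relationship between the associated objects.
It can be understood that the various numerals in the embodiments of this application are used only for ease of distinction in the description and are not intended to limit the scope of the embodiments of this application. The sequence numbers of the foregoing processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and inherent logic.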

Claims (14)

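  1. An address translation apparatus, comprising:
    an interface, configured to receive an address translation request packet from a PCIe device, wherein the address translation request packet comprises a first virtual address and a size of a first virtual storage space, and the first virtual address is a virtual start address of the first virtual storage space;
    an address translation unit, configured to determine P sub-page tables of a same size based on the first virtual storage space, wherein each sub-page table indicates a mapping from a virtual address to a physical address, a sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, and P is an integer with P≥1;
    wherein the interface is further configured to send an address translation response packet to the PCIe device, and the address translation response packet comprises physical start addresses of the P sub-page tables and the sizes of the P sub-page tables.
  2. The apparatus according to claim 1, wherein the address translation unit is configured to:
    determine the P sub-page tables of the same size based on the first virtual storage space and a smallest translation unit.
  3. The apparatus according to claim 2, wherein the address translation unit is configured to:
    determine N page tables based on the first virtual address, wherein the first virtual storage space at least partially overlaps the virtual storage space indicated by each of the N page tables, and N is an integer with N≥1;
    divide the N page tables by the smallest translation unit to obtain M sub-page tables, wherein each of the M sub-page tables has the same size, M is an integer with M≥1, and the first virtual storage space comprises at least one smallest translation unit;
    determine the P sub-page tables from the M sub-page tables, wherein 1≤P≤M.
  4. The apparatus according to claim 3, wherein the address translation unit is configured to:
    determine, based on the first virtual address and the sizes of page tables, the N page tables that overlap the first virtual storage space.
  5. The apparatus according to claim 4, wherein the portion of the virtual storage space indicated by the i-th page table that overlaps the first virtual storage space is expressed by the following formula:
    (p_addr+(s_addr+D*STU)-v_addr)~(p_addr+min(Xi,U)),
    wherein p_addr denotes the physical start address of the i-th page table, s_addr denotes the first virtual address, D denotes the number of smallest translation units contained in the portions of the 1st to (i-1)-th page tables that intersect the first virtual storage space, STU denotes the smallest translation unit, v_addr denotes the virtual start address of the i-th page table, Xi denotes the size of the i-th page table, U denotes the size of the untranslated space in the first virtual storage space, and i is an integer with i∈[1,N].
  6. The apparatus according to any one of claims 1 to 5, wherein the physical start addresses of the P sub-page tables comprised in the address translation response packet are sorted.
  7. The apparatus according to any one of claims 1 to 6, wherein an attribute of a sub-page table is the same as an attribute of the page table to which the sub-page table belongs, and the attribute of the sub-page table or the page table comprises an attribute, in a system, of the memory space indicated by the sub-page table or the page table.
  8. An address translation method, comprising:
    receiving an address translation request packet, wherein the address translation request packet comprises a first virtual address and a size of a first virtual storage space, and the first virtual address is a virtual start address of the first virtual storage space;
    determining P sub-page tables of a same size based on the first virtual storage space, wherein each sub-page table indicates a mapping from a virtual address to a physical address, a sum of the sizes of the P sub-page tables is equal to the size of the first virtual storage space, and P is an integer with P≥1;
    sending an address translation response packet, wherein the address translation response packet comprises physical start addresses of the P sub-page tables and the sizes of the P sub-page tables.
  9. The method according to claim 8, wherein the determining P sub-page tables of a same size based on the first virtual storage space comprises:
    determining the P sub-page tables of the same size based on the first virtual storage space and a smallest translation unit.
  10. The method according to claim 9, wherein the determining the P sub-page tables of the same size based on the first virtual storage space and a smallest translation unit comprises:
    determining N page tables based on the first virtual address, wherein the first virtual storage space at least partially overlaps the virtual storage space indicated by each of the N page tables, and N is an integer with N≥1;
    dividing the N page tables by the smallest translation unit to obtain M sub-page tables, wherein each of the M sub-page tables has the same size, M is an integer with M≥1, and the first virtual storage space comprises at least one smallest translation unit;
    determining the P sub-page tables from the M sub-page tables, wherein 1≤P≤M.
  11. The method according to claim 10, wherein the determining N page tables based on the first virtual address comprises:
    determining, based on the first virtual address and the sizes of page tables, the N page tables that overlap the first virtual storage space.
  12. The method according to claim 11, wherein the portion of the virtual storage space indicated by the i-th page table that overlaps the first virtual storage space is expressed by the following formula:
    (p_addr+(s_addr+D*STU)-v_addr)~(p_addr+min(Xi,U)),
    wherein p_addr denotes the physical start address of the i-th page table, s_addr denotes the first virtual address, D denotes the number of smallest translation units contained in the portions of the 1st to (i-1)-th page tables that intersect the first virtual storage space, STU denotes the smallest translation unit, v_addr denotes the virtual start address of the i-th page table, Xi denotes the size of the i-th page table, U denotes the size of the untranslated space in the first virtual storage space, and i is an integer with i∈[1,N].
  13. The method according to any one of claims 8 to 12, wherein the physical start addresses of the P sub-page tables comprised in the address translation response packet are sorted.
  14. The method according to any one of claims 8 to 13, wherein an attribute of a sub-page table is the same as an attribute of the page table to which the sub-page table belongs, and the attribute of the sub-page table or the page table comprises an attribute, in a system, of the memory space indicated by the sub-page table or the page table.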
PCT/CN2019/111777 2019-10-17 2019-10-17 Address translation method and apparatus WO2021072721A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201980101414.4A CN114556881B (zh) 2019-10-17 2019-10-17 Address translation method and apparatus
EP19949260.4A EP4036741A1 (en) 2019-10-17 2019-10-17 Address translation method and apparatus
PCT/CN2019/111777 WO2021072721A1 (zh) 2019-10-17 2019-10-17 Address translation method and apparatus
US17/720,858 US20220245067A1 (en) 2019-10-17 2022-04-14 Address translation method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/111777 WO2021072721A1 (zh) 2019-10-17 2019-10-17 Address translation method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/720,858 Continuation US20220245067A1 (en) 2019-10-17 2022-04-14 Address translation method and apparatus

Publications (1)

Publication Number Publication Date
WO2021072721A1 (zh) 2021-04-22

Family

ID=75537341

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/111777 WO2021072721A1 (zh) 2019-10-17 2019-10-17 Address translation method and apparatus

Country Status (4)

Country Link
US (1) US20220245067A1 (zh)
EP (1) EP4036741A1 (zh)
CN (1) CN114556881B (zh)
WO (1) WO2021072721A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023040464A1 (zh) * 2021-09-14 2023-03-23 华为技术有限公司 Bus communication method and related device

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
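CN114780466B (zh) * 2022-06-24 2022-09-02 沐曦科技(北京)有限公司 DMA-based method for optimizing data copy latency
CN116107935B (zh) * 2022-12-30 2023-08-22 芯动微电子科技(武汉)有限公司 ATC implementation method based on the PCIe address translation service mechanism
CN116680202B (zh) * 2023-08-03 2024-01-02 深流微智能科技(深圳)有限公司 Management method, system, and storage medium for debugging-processing memory areas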

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050283602A1 (en) * 2004-06-21 2005-12-22 Balaji Vembu Apparatus and method for protected execution of graphics applications
US20110047261A1 (en) * 2006-10-10 2011-02-24 Panasonic Corporation Information communication apparatus, information communication method, and program
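CN103116555A (zh) * 2013-03-05 2013-05-22 中国人民解放军国防科学技术大学 Data access method based on a multi-bank parallel cache structure
CN104508641A (zh) * 2012-08-02 2015-04-08 高通股份有限公司 Multiple sets of attribute fields within a single page table entry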

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7707385B2 (en) * 2004-12-14 2010-04-27 Sony Computer Entertainment Inc. Methods and apparatus for address translation from an external device to a memory of a processor
US7653803B2 (en) * 2006-01-17 2010-01-26 Globalfoundries Inc. Address translation for input/output (I/O) devices and interrupt remapping for I/O devices in an I/O memory management unit (IOMMU)
US9092365B2 (en) * 2013-08-22 2015-07-28 International Business Machines Corporation Splitting direct memory access windows
CN105989758B (zh) * 2015-02-05 2019-03-19 龙芯中科技术有限公司 Address translation method and apparatus
US10048881B2 (en) * 2016-07-11 2018-08-14 Intel Corporation Restricted address translation to protect against device-TLB vulnerabilities
US10997083B2 (en) * 2018-09-04 2021-05-04 Arm Limited Parallel page table entry access when performing address translations
GB2582362B (en) * 2019-03-21 2021-08-04 Advanced Risc Mach Ltd Page table structure


* Cited by examiner, † Cited by third party
Title
See also references of EP4036741A4 *

Also Published As

Publication number Publication date
EP4036741A4 (en) 2022-08-03
CN114556881A (zh) 2022-05-27
CN114556881B (zh) 2022-12-13
US20220245067A1 (en) 2022-08-04
EP4036741A1 (en) 2022-08-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19949260

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019949260

Country of ref document: EP

Effective date: 20220426