WO2024082232A1 - Memory access control method and apparatus - Google Patents

Memory access control method and apparatus

Info

Publication number
WO2024082232A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
domain
tag
instruction
memory domain
Prior art date
Application number
PCT/CN2022/126519
Other languages
English (en)
French (fr)
Inventor
曹建龙
陶喆
崔爱国
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202280011459.4A (CN118235122A)
Priority to PCT/CN2022/126519 (WO2024082232A1)
Publication of WO2024082232A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/14 Protection against unauthorised use of memory or access to memory

Definitions

  • the present application relates to the field of computer security, and in particular to a memory access control method and device.
  • the present application provides a memory access control method and device for implementing memory partition isolation, reducing software overhead, and lowering memory isolation granularity.
  • a memory access control method is provided, which is applied to a processor, such as an ARM or Cortex-A processor. The method includes: configuring a memory tag for at least one memory domain, wherein a memory domain configured with a memory tag may be accessed only by an instruction that is associated with an address tag matching that memory tag; before running a first memory domain of the at least one memory domain, releasing the memory tag of the first memory domain if the first memory domain is configured with one, wherein the first memory domain is any one of the at least one memory domain; and, after determining that the first memory domain has no memory tag, running the first memory domain.
  • the above scheme first allocates a memory tag to at least one memory domain and, before running the first memory domain, ensures that it has no memory tag. The first memory domain can therefore be accessed by the components of its own domain. Because the embodiments of the present application configure only memory tags and not address tags, the component corresponding to the first memory domain cannot directly access other memory domains, which realizes inter-domain access isolation.
  • This scheme requires no intrusive modification of the application, no recompilation of the program code, and no increase in application size, which reduces implementation cost. Compared with other memory isolation schemes implemented in software or hardware, it reduces performance overhead, such as the overhead of context switches between user mode and kernel mode. Memory partitioning can reach 16-byte granularity, enabling fine-grained memory access control.
  • when a memory tag is configured for at least one memory domain, the memory tag may be randomly generated.
  • This design method can reduce the probability of third-party code guessing memory tags and improve the security of memory isolation.
  • when a memory tag is configured for at least one memory domain, the memory tag may be configured only for those memory domains, among the at least one memory domain, that store sensitive assets.
  • This design approach can ensure the security of sensitive assets while avoiding waste of memory tags and improving the utilization of memory tag resources.
  • a first instruction from the first memory domain can also be detected, where the first instruction is used to access a second memory domain; a memory tag is configured for the first memory domain; the memory tag of the second memory domain is released; the first instruction is executed; and the execution result of the first instruction is returned to the first memory domain.
  • This design method realizes cross-domain access through a tag reconfiguration mechanism, which can improve the security of cross-domain access.
  • the memory tag configured in the first memory domain is a randomly generated memory tag.
  • This design method can prevent third-party code from bypassing the tag reconfiguration process and directly accessing the second memory domain, which can further improve the security of cross-domain access.
  • the address of the first memory domain may be saved.
  • the execution result of the first instruction may be returned to the first memory domain according to the saved address.
  • a memory tag may be configured for the second memory domain; and the memory tag of the first memory domain may be released.
  • This design method can ensure that the execution result is correctly returned to the first memory domain, thereby improving the reliability of cross-domain access.
  • the MTE instruction includes one or more of the Load Allocation Tag (LDG) instruction, the Load Tag Multiple (LDGM) instruction, the Store Allocation Tag (STG) instruction, the Store Allocation Tag and Zeroing (STZG) instruction, the Store Allocation Tag and Pair register (STGP) instruction, and the Store Tag Multiple (STGM) instruction.
  • This design method can prevent third-party code from reading legal memory tags or illegally setting memory tags through MTE instructions, and then obtaining address tags through memory tags to achieve illegal access to memory domains, which can further improve the security and reliability of memory access.
  • a control device comprising modules/units/technical means for executing the method described in the first aspect or any possible design of the first aspect.
  • the device may include:
  • a configuration module, used to: configure a memory tag for at least one memory domain, wherein a memory domain configured with a memory tag may be accessed only by an instruction that is associated with an address tag matching that memory tag; before running a first memory domain of the at least one memory domain, release the memory tag of the first memory domain if the first memory domain is configured with one, wherein the first memory domain is any one of the at least one memory domain; and, after determining that the first memory domain has no memory tag, run the first memory domain.
  • the configuration module can be used to: randomly generate memory tags.
  • the configuration module may be used to configure a memory tag for at least one memory domain storing sensitive assets.
  • the device may further include:
  • the gateway module is used to detect a first instruction from a first memory domain, the first instruction is used to access a second memory domain; configure a memory tag for the first memory domain; release the memory tag of the second memory domain; execute the first instruction; and return the execution result of the first instruction to the first memory domain.
  • the gateway module may also be used to: before executing the first instruction, determine that the memory tag configured in the first memory domain is a randomly generated memory tag.
  • the gateway module can also be used to: save the address of the first memory domain before executing the first instruction; when returning the execution result of the first instruction to the first memory domain, return the execution result of the first instruction to the first memory domain according to the saved address.
  • the gateway module can also be used to: configure a memory tag for the second memory domain after executing the first instruction; and release the memory tag of the first memory domain.
  • the configuration module can also be used to: before running the first memory domain, detect through offline or online binary scanning whether a Memory Tag Extension (MTE) instruction exists in the first memory domain; if so, clear the MTE instruction; wherein the MTE instruction includes one or more of the Load Allocation Tag (LDG) instruction, the Load Tag Multiple (LDGM) instruction, the Store Allocation Tag (STG) instruction, the Store Allocation Tag and Zeroing (STZG) instruction, the Store Allocation Tag and Pair register (STGP) instruction, and the Store Tag Multiple (STGM) instruction.
  • a control device comprising: at least one processor and an interface circuit; the interface circuit is used to receive signals from other devices outside the device and transmit them to the processor or send signals from the processor to other devices outside the device, and the processor is used to implement the method described in the first aspect or any possible design of the first aspect through logic circuits or execution code instructions.
  • a computer-readable storage medium is provided, wherein the computer-readable storage medium is used to store instructions. When the instructions are executed, the method described in the first aspect or any possible design of the first aspect is implemented.
  • a computer program product wherein instructions are stored in the computer program product, and when the computer program product is run on a computer, the computer is caused to execute the method described in the first aspect or any possible design of the first aspect.
  • FIG. 1 is a schematic diagram of memory access in a software reuse scenario.
  • FIG. 2 is a flowchart of a memory access control method provided in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the memory access mechanism of MTE.
  • FIG. 4 is a schematic diagram of memory tags.
  • FIG. 5 is a schematic diagram of a tag allocator allocating and releasing memory tags.
  • FIG. 6 is a flowchart of a cross-domain access method provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an example of cross-domain access.
  • FIG. 8 is a flowchart of an example of cross-domain access.
  • FIG. 9 is a schematic diagram of a control device provided in an embodiment of the present application.
  • FIG. 10 is a schematic diagram of another control device provided in an embodiment of the present application.
  • trusted components and untrusted components use the same address space of memory.
  • Figure 1 is a schematic diagram of memory access in a software reuse scenario.
  • the sensitive asset library stores the data and code of trusted components
  • the third-party library stores the data and code of untrusted components.
  • Any user can send requests through the program entry to access the third-party library and the sensitive asset library. A user can also maliciously cause memory corruption, for example by modifying data in legitimate memory or by out-of-bounds access (accessing a block of memory beyond the legitimate access range).
  • SFI Software-Based Fault Isolation
  • This method does not rely on hardware and supports both the ARM (Advanced RISC Machines) architecture and Cortex-A cores. Memory partitioning supports a fine granularity of 16 B to 64 B, and the memory overhead for the application is small.
  • this method has a great impact on performance due to instruction translation.
  • This method uses a memory protection unit (MPU) to isolate memory, which has less impact on performance than SFI. It supports the ARM architecture, memory partitioning supports a fine granularity of 16 B to 64 B, and its memory overhead is low. However, this method depends on a hardware implementation and does not support Cortex-A.
  • ERIM: Secure, Efficient In-process Isolation with Protection Keys (MPK)
  • ERIM is an x86-based isolation technology that relies on the x86 instruction set architecture (ISA) extension.
  • This scheme uses memory protection keys (MPK) to mark each virtual page with a 4-bit domain identifier (ID), thereby dividing the address space of the process into up to 16 non-overlapping domains.
  • A special register local to each logical core, the user-mode Protection Key Register User (PKRU), determines which domains the core can read or write. Switching domain permissions requires writing the PKRU in user space, which takes only 11 to 260 cycles on current Intel CPUs, corresponding to an overhead of 0.07% to 1.0% per 100,000 switches/second on a 2.6 GHz CPU. This equates to an overhead of up to 4.8% on NGINX throughput.
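As a quick sanity check on the figures quoted above, the per-switch cost can be converted into a share of CPU time. The sketch below is illustrative only; the function name is hypothetical, and it assumes the overhead is simply cycles per switch times switches per second divided by the clock frequency. Under that assumption, a cost of 260 cycles per switch corresponds to the quoted 1.0% upper bound.

```python
def switch_overhead_percent(cycles_per_switch: float,
                            switches_per_second: float,
                            cpu_hz: float) -> float:
    """Share of CPU cycles spent on domain switches, as a percentage.

    Assumption: overhead = cycles_per_switch * switches_per_second / cpu_hz.
    """
    return 100.0 * cycles_per_switch * switches_per_second / cpu_hz


# 260 cycles per PKRU write, 100,000 switches per second, 2.6 GHz clock.
upper_bound = switch_overhead_percent(260, 100_000, 2.6e9)
print(f"{upper_bound:.2f}%")
```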
  • MPK technical content mainly includes:
  • MPK technology provides 16 protection keys (hereinafter referred to as Key) for each process, so the address space of the process can be divided into 16 protection domains at most.
  • the PKRU register stores the specific access rights of each Key.
  • the untrusted domain is always allowed to be accessed, and the trusted domain can never be accessed directly by the untrusted domain.
  • the untrusted domain can obtain access rights by jumping to the call gates of the trusted domain.
  • the problem ERIM needs to solve is preventing the untrusted domain from using the WRPKRU instruction to modify memory domain access rights. ERIM relies on binary inspection to ensure that only safe WRPKRU instructions appear in executable pages.
  • MPU technology mainly includes:
  • the MPU unit on the microcontroller unit provides 16 protection region sets (global), so 16 protection regions can be divided for tasks.
  • the PKRU register stores the specific access rights of each key.
  • PEKM technology mainly includes:
  • the task is executed in kernel state, so there is a problem of isolation in the same process space.
  • the problem PEKM needs to solve is preventing tasks from using sensitive instructions such as mtcr to modify protection domain permissions. PEKM relies on binary inspection to ensure that system instructions do not appear in tasks.
  • PEKM technology relies on MPU hardware.
  • the memory domain is set through MPU to switch the memory of untrusted components and trusted components.
  • the memory granularity of the MPU can be as fine as 64 bytes, but the MPU is only available in some ARM architectures (such as ARMv8-R or ARMv8-M).
  • MMU memory management unit
  • PEKM also proposes the use of memory management unit (MMU) for protection, but its granularity is still page granularity, that is, 4KB.
  • the present application provides a memory access control solution.
  • the present application implements memory partitioning based on the Memory Tag Extension (MTE) technology, which can reduce software overhead, reduce dependence on fixed hardware, and reduce memory isolation granularity while preventing untrusted components from accessing the memory of trusted components.
  • MTE Memory Tag Extension
  • FIG2 is a flowchart of a memory access control method provided in an embodiment of the present application.
  • the method can be executed by a processor, such as an ARM or Cortex-A processor, and the present application does not limit it.
  • the method includes:
  • the processor configures a memory tag for at least one memory domain; wherein the memory domain configured with the memory tag is only allowed to be accessed by instructions associated with an address tag and the address tag matches the memory tag.
  • the processor can use memory tag extension (MTE) technology to configure a memory tag for at least one memory domain.
  • MTE memory tag extension
  • MTE is an instruction set extension introduced in the ARMv8.5-A architecture.
  • MTE implements a memory access lock & key mechanism that can provide fine-grained physical memory access control, that is, memory access is only authorized when the key value in the pointer is the same as the lock value in the physical memory, otherwise an exception is triggered.
  • Figure 3 is a schematic diagram of the memory access mechanism of MTE.
  • MTE key is also called address tag or logical tag or pointer tag, and lock is also called memory tag.
  • Each memory tag is 4 bits in size and is associated with 16 bytes of physical memory. In specific implementation, multiple different 16-byte physical memories can be associated with the same or different memory tags, and this application does not limit this.
  • FIG. 4 is a schematic diagram of memory tags, in which the physical memory [-16, 0] corresponds to memory tag 0xA, the physical memory [32, 48] corresponds to memory tag 0x3, and the physical memory [0, 16] and [16, 32] correspond to the same memory tag, 0x2.
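The granule-to-tag association in the FIG. 4 example can be sketched as a small lookup. This is a minimal illustration, not part of the application: the dictionary, the function names, and the treatment of each range as half-open (so that offset 16 belongs to the [16, 32] granule) are all assumptions made for the sketch.

```python
GRANULE = 16  # bytes of physical memory associated with one 4-bit memory tag

# Memory tags from the FIG. 4 example, keyed by granule start offset.
memory_tags = {-16: 0xA, 0: 0x2, 16: 0x2, 32: 0x3}


def granule_start(addr: int) -> int:
    """Round an address down to the start of its 16-byte granule."""
    return (addr // GRANULE) * GRANULE


def memory_tag_of(addr: int) -> int:
    """Return the memory tag (lock) of the granule containing addr."""
    return memory_tags[granule_start(addr)]


# Two different granules may share a tag, as [0, 16] and [16, 32] do here.
print(hex(memory_tag_of(5)), hex(memory_tag_of(20)), hex(memory_tag_of(40)))
```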
  • MTE provides dedicated instructions for reading and updating memory tags, such as Load Allocation Tag (LDG) and Store Allocation Tag (STG).
  • MTE utilizes the Top Byte Ignore (TBI) feature of ARM and uses bits [59:56] of the access address as the address tag that identifies the pointer. There can be up to 16 different address tags.
  • TBI Top Byte Ignore
  • the access instruction (such as a read instruction or a write instruction) will carry an access address, and the access address is associated with an address tag, such as the high bits [59:56] of the access address are the address tag.
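The lock and key comparison can be sketched with plain integer arithmetic on a 64-bit address. The helper names below are hypothetical and the code only models the bit layout described above (address tag in bits [59:56]); it is not how the hardware check is implemented.

```python
TAG_SHIFT = 56          # the address tag occupies bits [59:56]
TAG_MASK = 0xF          # 4 bits, so up to 16 distinct address tags
ADDR_MASK_64 = 0xFFFF_FFFF_FFFF_FFFF


def address_tag(pointer: int) -> int:
    """Extract the 4-bit address tag (key) from a 64-bit access address."""
    return (pointer >> TAG_SHIFT) & TAG_MASK


def with_address_tag(pointer: int, tag: int) -> int:
    """Return the pointer with bits [59:56] replaced by the given tag."""
    cleared = pointer & ~(TAG_MASK << TAG_SHIFT) & ADDR_MASK_64
    return cleared | ((tag & TAG_MASK) << TAG_SHIFT)


def access_allowed(pointer: int, memory_tag: int) -> bool:
    """Model of the check: access is authorized only when key equals lock."""
    return address_tag(pointer) == memory_tag


tagged = with_address_tag(0x0000_7FFF_1234_5670, 0xA)
print(hex(tagged), access_allowed(tagged, 0xA), access_allowed(tagged, 0x3))
```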
  • Memory domain: also called a memory area.
  • the processor can divide the memory into multiple different memory domains according to the information input by the user (such as memory domain configuration file). Memory domain is used to mark memory intervals such as third-party libraries or sensitive assets.
  • the processor may use a tag allocator to allocate a memory tag to at least one memory domain.
  • the tag allocator is an allocator defined by the MTE technology. Unlike common memory allocators, it allocates tags for memory, that is, it taints a given memory region (allocates a memory tag to it). The tag allocator can also use MTE instructions to modify memory tags.
  • the tag allocator is essentially program code.
  • when the processor runs this program code, it implements functions such as allocating memory tags and modifying memory tags.
  • the processor may randomly generate a memory tag, thereby reducing the probability of third-party code guessing the memory tag and improving the security of memory isolation.
  • the third-party code in this article refers to: from the perspective of the application developer, the code is not developed by the application developer (for example, the code developed by an illegal user (commonly known as a "hacker")), but does not include the system trusted base (such as the system kernel code, the code implementing the method of this application, etc.).
  • when configuring a memory tag for at least one memory domain, the processor may configure memory tags only for those memory domains that store sensitive assets.
  • For example, if the at least one memory domain includes a first memory domain that stores sensitive assets and a second memory domain that stores non-sensitive assets, a memory tag may be configured for the first memory domain and not for the second memory domain. Because the second memory domain stores non-sensitive assets, access to it is always allowed, so it does not need a memory tag.
  • assets include but are not limited to code and/or data.
  • running an application essentially means running (or executing) code to process data.
  • in applications developed based on software reuse, an application can be composed of multiple components, and the code and data of each component can be stored in a different memory domain. Running a component of an application can therefore be understood as running (or executing) the code in the memory domain corresponding to the component, or as running that memory domain.
  • running the first memory domain can be understood as running the code in the first memory domain. "Before running the first memory domain" means before the first piece of code in the first memory domain is run.
  • MTE technology by itself allows access when both the address tag (key) and the memory tag (lock) are configured and match. Conventionally, therefore, the application code must be modified so that access instructions carry the address tag when accessing a memory domain.
  • a memory tag is first configured for at least one memory domain (i.e., the domain is locked). Before any one of the memory domains (such as the first memory domain) is run, it is ensured that this memory domain has no memory tag (i.e., it is unlocked), so that once it runs, the component corresponding to the memory domain can access (read and/or write) the data in its own domain. When that memory domain tries to access other domains, the other domains are configured with memory tags (locked) while the running domain has no address tag (the tag allocator does not configure address tags), so the access triggers an exception due to tag mismatch. Other memory domains therefore cannot be accessed directly.
  • the tag allocator configures memory tag 0, memory tag 1, and memory tag 2 for memory domain A, memory domain B, and memory domain C, respectively.
  • Before running memory domain A, the tag allocator first releases memory tag 0 (the lock) of memory domain A, so that when memory domain A is running, it can only access its own domain and cannot directly access other memory domains (as shown in FIG. 5, memory domain A fails to access memory domain B).
  • to ensure that the first memory domain has no memory tag while it is running, it is necessary to determine, before running it, whether the first memory domain has a memory tag. If it does, S203 is executed after the memory tag of the first memory domain is released; if it does not (for example, the first memory domain stores non-sensitive assets, and only the memory domains storing sensitive assets were configured with memory tags in step S21), S203 can be executed directly.
  • the processor may release the memory tag of the first memory domain through hardware implementation or through setting of an STG instruction.
  • Running the first memory domain means that the processor starts to read and execute the code in the first memory domain to implement the function of the component corresponding to the first memory domain. It can be understood that when the code in the first memory domain is executed, instructions can be generated, and these instructions can be read and/or write instructions, or other instructions, which are not limited in this application.
  • the processor first assigns a memory tag to at least one memory domain. Before running the first memory domain of the at least one memory domain, it ensures that the first memory domain has no memory tag, and then runs it. This ensures that the first memory domain can be accessed by the components of its own domain. Moreover, since the embodiments of the present application configure only memory tags and not address tags, the component corresponding to the first memory domain cannot directly access other memory domains, thereby realizing inter-domain access isolation.
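The lock, release, then run sequence summarized above can be modeled in a few lines. This is a simplified software simulation under stated assumptions, not the application's implementation: the class and function names are hypothetical, a released domain is represented by `None`, random tags are drawn from 1 to 15 (excluding 0 is an assumption of this sketch), and code running in a domain is assumed to carry no address tag, so it can reach only domains that currently have no memory tag.

```python
import secrets


class TagAllocator:
    """Toy model of a tag allocator that locks and unlocks memory domains."""

    def __init__(self, domains):
        # None means "no memory tag" (unlocked).
        self.tags = {name: None for name in domains}

    def configure(self, name):
        """Lock a domain with a randomly generated 4-bit memory tag (1..15)."""
        self.tags[name] = secrets.randbelow(15) + 1

    def release(self, name):
        """Unlock a domain by removing its memory tag."""
        self.tags[name] = None

    def can_access(self, target):
        """Untagged code can access a domain only if it has no memory tag."""
        return self.tags[target] is None


def run_domain(allocator, name):
    """Release the domain's tag if it has one, then treat it as running."""
    if allocator.tags[name] is not None:
        allocator.release(name)
    assert allocator.tags[name] is None  # determined to have no memory tag
    return name


allocator = TagAllocator(["A", "B", "C"])
for domain in ("A", "B", "C"):
    allocator.configure(domain)

running = run_domain(allocator, "A")
print(running, allocator.can_access("A"), allocator.can_access("B"))
```

In this model, domain A can reach its own memory while a direct access to B or C is rejected, which mirrors the failed access from memory domain A to memory domain B described for FIG. 5.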
  • this solution adopts a combination of software and hardware (for example, the tag distributor, API gateway, etc. rely on software implementation, while MTE instructions, memory tags, etc. involve hardware), which can reduce performance overhead, such as reducing the user state and kernel context switching overhead;
  • Memory partitioning based on MTE technology can reach 16-byte granularity and implement fine-grained memory access control.
  • the processor can also achieve cross-domain access by reconfiguring memory tags.
  • an embodiment of the present application further provides a cross-domain access method, including:
  • S601 After running a first memory domain, the processor detects a first instruction from the first memory domain, where the first instruction is used to access a second memory domain.
  • S602 The processor reconfigures a memory tag for the first memory domain (because the memory tag of the first memory domain was released before the first memory domain was run).
  • the processor may randomly generate the memory tag for the first memory domain.
  • S603 The processor releases the memory tag of the second memory domain, so that the second memory domain runs normally (can be accessed).
  • S604 The processor executes the first instruction in the second memory domain.
  • S605 The processor returns the execution result of the first instruction to the first memory domain.
  • when accessing across domains, before the accessed domain (such as the second memory domain) executes read or write instructions from the accessing domain (such as the first memory domain), the accessing domain needs to be re-locked and the accessed domain unlocked, so that the accessed domain can be accessed and only the running memory domain can access it.
  • the processor can realize the control process of cross-domain access by configuring the Application Program Interface (API) Gateway.
  • API Application Program Interface
  • the API Gateway can also be software code in nature.
  • When the API Gateway is running, it implements cross-domain access functions, such as detecting the first instruction, controlling the tag allocator to reconfigure memory tags (for example, reconfiguring the tag of the first memory domain and releasing the tag of the second memory domain), and controlling the execution of the first instruction. Any attempt to bypass the API Gateway for cross-domain access triggers an exception due to tag mismatch.
  • the API Gateway can cooperate with the tag allocator to reconfigure the memory tags so that the first memory domain can access the second memory domain.
  • This design achieves cross-domain access through the cooperation of the API Gateway and the tag allocator, thereby improving the security of cross-domain access.
  • the processor may also verify whether the memory tag of the first memory domain is a randomly generated memory tag. When the memory tag configured in the first memory domain is a randomly generated memory tag, the processor executes the first instruction; otherwise, the first instruction is not executed.
  • the processor may further save the address of the first memory domain before executing the first instruction.
  • when the processor returns the execution result of the first instruction to the first memory domain, the execution result is returned according to the saved address.
  • the processor can also reconfigure the memory tag for the second memory domain, that is, re-lock the second memory domain and release the memory tag of the first memory domain, so that the execution result can be returned to the first memory domain.
  • when the processor reconfigures the memory tag for the second memory domain, the memory tag can be randomly generated.
  • the processor can also acquire an atomic lock, which is used to ensure that the API gateway has only one switching subject at the same time (i.e., one memory domain that needs to be locked and one memory domain that needs to be unlocked); after executing the first instruction, the processor releases the atomic lock. In this way, concurrency problems can be avoided during cross-domain access.
  • acquiring an atomic lock refers to the process of executing the instruction "Acquire_lock".
  • the meaning of the instruction "Acquire_lock" is: this step acquires a lock on a specified value. Locks can be used to prevent concurrent modification of resources.
  • S801 The processor acquires an atomic lock and determines the memory domain currently being executed, such as the first memory domain (which is running and therefore has no memory tag).
  • S802 The processor uses a random memory tag to mark (i.e., lock) the first memory domain.
  • S803 The processor determines a memory domain to be executed, such as a second memory domain, and releases the memory tag of the second memory domain.
  • S804 The processor verifies whether the memory tag of the first memory domain is a randomly generated memory tag, and continues the subsequent process after the verification passes.
  • S805 The processor saves the return address (i.e., the address of the first memory domain) and executes the cross-domain access instruction (such as the first instruction).
  • S806 The processor re-marks (i.e., locks) the second memory domain using a random memory tag.
  • S807 The processor releases the memory tag of the first memory domain to enable it to operate normally.
  • S808 The processor releases the atomic lock and returns the execution result to the first memory domain.
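Steps S801 to S808 can be sketched as a small gateway routine. This is an illustrative simulation only: the `Gateway` class, its method names, and the use of `threading.Lock` as the atomic lock are assumptions of the sketch. Memory tags are modeled as dictionary entries (`None` meaning released), and the S804 check is reduced to confirming that the caller holds a tag in the range 1 to 15, since randomness itself cannot be verified after the fact.

```python
import secrets
import threading


class Gateway:
    """Toy model of the cross-domain switching flow S801 to S808."""

    def __init__(self, tags):
        self.tags = tags                  # domain name -> tag, None = released
        self._lock = threading.Lock()     # stands in for the atomic lock

    @staticmethod
    def _random_tag():
        return secrets.randbelow(15) + 1  # random 4-bit tag in 1..15

    def cross_domain_call(self, caller, callee, instruction):
        with self._lock:                             # S801 (released at S808)
            self.tags[caller] = self._random_tag()   # S802: lock the caller
            self.tags[callee] = None                 # S803: unlock the callee
            if self.tags[caller] not in range(1, 16):    # S804: verify
                raise RuntimeError("caller domain is not locked")
            return_to = caller                       # S805: save return address
            result = instruction()                   #        and execute
            self.tags[callee] = self._random_tag()   # S806: re-lock the callee
            self.tags[caller] = None                 # S807: unlock the caller
        return return_to, result                     # S808: return the result


tags = {"first": None, "second": 7}
snapshot = {}


def first_instruction():
    snapshot.update(tags)   # record the tag state seen during execution
    return 42


gateway = Gateway(tags)
returned_to, value = gateway.cross_domain_call("first", "second", first_instruction)
print(returned_to, value)
```

While the instruction executes, only the second memory domain is unlocked; afterwards the original state (first unlocked, second locked) is restored. Error handling for an instruction that raises is omitted for brevity.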
  • PAC pointer authentication
  • the above process can be realized by running code.
  • the processor can clear MTE instructions before running any memory domain, to prevent third-party code from illegally obtaining the pointer tag by means of MTE instructions.
  • the processor detects whether there is an MTE instruction in the first memory domain through offline or online binary scanning; if so, the MTE instruction is cleared.
  • MTE instructions include the following two categories:
  • Instructions for reading memory tags, such as Load Allocation Tag (LDG) and Load Tag Multiple (LDGM).
  • Instructions for setting memory tags, for example one or more of the following: Store Allocation Tag (STG), Store Allocation Tag and Zeroing (STZG), Store Allocation Tag and Pair register (STGP), and Store Tag Multiple (STGM).
  • STG Store Allocation Tag
  • STZG Store Allocation Tag and Zeroing
  • STGP Store Allocation Tag and Pair register
  • STGM Store Tag Multiple
  • third-party code can be prevented from reading legal memory tags or illegally setting memory tags through MTE instructions, and then obtaining address tags through memory tags to achieve illegal access to memory domains, which can further improve the security and reliability of memory access.
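An offline binary scan of this kind can be sketched as a pass over 32-bit instruction words. The sketch below makes several assumptions that are not taken from the application: the function name is hypothetical, the caller supplies the instruction patterns as (mask, value) pairs, "clearing" is modeled as overwriting a matching word with the AArch64 NOP encoding, and the pattern used in the demonstration is a placeholder rather than a real MTE encoding. Real LDG/STG-family encodings should be taken from the Arm architecture reference manual.

```python
import struct

AARCH64_NOP = 0xD503201F  # AArch64 NOP instruction encoding


def scan_and_clear(code: bytes, patterns):
    """Scan little-endian 32-bit words and overwrite matches with NOP.

    patterns: iterable of (mask, value); a word matches when
    word & mask == value. Returns (cleaned_code, list_of_hit_offsets).
    """
    cleaned = bytearray(code)
    hits = []
    usable = len(code) - len(code) % 4   # ignore a trailing partial word
    for offset in range(0, usable, 4):
        (word,) = struct.unpack_from("<I", code, offset)
        if any(word & mask == value for mask, value in patterns):
            hits.append(offset)
            struct.pack_into("<I", cleaned, offset, AARCH64_NOP)
    return bytes(cleaned), hits


# Placeholder pattern for demonstration only, NOT a real MTE encoding.
demo_patterns = [(0xFF000000, 0xAB000000)]
demo_code = struct.pack("<3I", 0x11111111, 0xAB123456, 0x22222222)

cleaned_code, hit_offsets = scan_and_clear(demo_code, demo_patterns)
print(hit_offsets, [hex(w) for w in struct.unpack("<3I", cleaned_code)])
```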
  • the processor can enable Address Space Layout Randomization (ASLR) when configuring the addresses of each memory domain, and then randomly configure the addresses of each memory domain.
  • ASLR Address Space Layout Randomization
  • control device includes: a configuration module 901 and a gateway module 902 .
  • the configuration module is used to initialize the system, such as assigning memory tags to all memory domains (or memory domains storing sensitive assets), and scanning and eliminating MTE sensitive instructions before running any memory domain.
  • the configuration module can be further subdivided into multiple modules such as a tag allocator and an instruction elimination module, wherein the tag allocator is used to assign memory tags to memory domains, modify memory tags, etc.; the instruction elimination module is used to perform offline or online binary scanning on the memory domain, and to eliminate MTE instructions when they are found in the memory domain.
  • the gateway module is used to implement the function of the above-mentioned API gateway, that is, to control the cross-domain access process between memory domains.
  • the gateway module can be further subdivided into modules such as reconfiguration of memory tags, post-verification, and exception handling.
  • the function of post-verification includes but is not limited to: checking whether the return address is normal; after setting the memory tag, checking whether the memory tag is set correctly (for example, whether it is randomly generated).
  • the exception handling module is used to handle exceptions when an exception occurs during verification, such as stopping the service, reporting an exception, etc.
  • the "next PC" in FIG. 9 refers to the next instruction (PC stands for program counter), which means that after one instruction is executed, execution continues with the next instruction.
  • the first instruction mentioned above is one of the instructions.
  • an embodiment of the present application also provides a control device, which includes at least one processor 1001 and an interface circuit 1002; the interface circuit 1002 is used to receive signals from other devices outside the device and transmit them to the processor 1001 or send signals from the processor 1001 to other devices outside the device, and the processor 1001 is used to implement the method described in the above method embodiment through logic circuits or execution code instructions.
  • the processor mentioned in the embodiments of the present application can be implemented by hardware or by software.
  • the processor can be a logic circuit, an integrated circuit, etc.
  • the processor can be a general-purpose processor implemented by reading software code stored in a memory.
  • the processor may be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field programmable gate arrays (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • CPU central processing unit
  • DSP digital signal processors
  • ASIC application-specific integrated circuits
  • FPGA field programmable gate arrays
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
  • the memory mentioned in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • the volatile memory may be a random access memory (RAM), which is used as an external cache.
  • SRAM static RAM
  • DRAM dynamic RAM
  • SDRAM synchronous DRAM
  • DDR SDRAM double data rate synchronous dynamic random access memory
  • ESDRAM enhanced synchronous dynamic random access memory (Enhanced SDRAM)
  • SLDRAM synchronous link dynamic random access memory (Synchlink DRAM)
  • DR RAM Direct Rambus RAM
  • when the processor is a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component, the memory (storage module) can be integrated into the processor.
  • memory described herein is intended to include, without being limited to, these and any other suitable types of memory.
  • an embodiment of the present application further provides a computer-readable storage medium, including a program or an instruction.
  • when the program or the instruction is executed on a computer, the method described in the above method embodiment is executed.
  • an embodiment of the present application further provides a computer program product comprising instructions, wherein the computer program product stores instructions, and when the computer program product is run on a computer, the method described in the above method embodiment is executed.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment in combination with software and hardware. Moreover, the present application may adopt the form of a computer program product implemented in one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Storage Device Security (AREA)

Abstract

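The present application discloses a memory access control method and device for implementing memory partition isolation, reducing software overhead, and lowering the memory isolation granularity. The method includes: configuring a memory tag for at least one memory domain; before running any one of the at least one memory domain, such as a first memory domain, if the first memory domain is configured with a memory tag, releasing the memory tag of the first memory domain; and after determining that the first memory domain does not have a memory tag, running the first memory domain. The solution makes no intrusive modification to the application, does not require recompiling program code, and does not increase the application size, which can reduce implementation cost. Compared with other memory isolation solutions implemented by software or hardware, it can reduce performance overhead, for example the overhead of context switching between user mode and the kernel. It can also implement fine-grained memory access control.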

Description

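A memory access control method and device
Technical Field
The present application relates to the field of computer security, and in particular to a memory access control method and device.
Background
When developing applications, software reuse is a common technique: developers write their own components while reusing open-source or third-party software libraries wherever possible, that is, they "do not reinvent the wheel". This reduces developers' workload and improves development efficiency. Although software reuse is convenient for developers, it also carries security risks. In an application built through software reuse, every component can see the complete application address space, that is, it can access the memory of other components, which leads to problems such as malicious access, vulnerability exploitation, and leakage of sensitive information. To avoid these risks, the prior art can partition sensitive assets (including data, code, and so on), that is, isolate these sensitive assets so that faults and vulnerabilities in other components are kept outside the partition, ensuring the confidentiality and integrity of the partition.
However, the prior art generally partitions memory by software or by hardware. The software approach adds extra overhead, while the hardware approach depends on fixed hardware and suffers from an overly coarse memory isolation granularity.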
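Summary
The present application provides a memory access control method and device for implementing memory partition isolation, reducing software overhead, and lowering the memory isolation granularity.
In a first aspect, a memory access control method is provided, applied to a processor such as an ARM or A-core processor. The method includes: configuring a memory tag for at least one memory domain, where a memory domain configured with a memory tag may be accessed only by an instruction that is associated with an address tag matching the memory tag; before running a first memory domain of the at least one memory domain, if the first memory domain is configured with a memory tag, releasing the memory tag of the first memory domain, where the first memory domain is any one of the at least one memory domain; and after determining that the first memory domain does not have a memory tag, running the first memory domain.
The above solution first allocates memory tags to at least one memory domain and, before running a first memory domain of the at least one memory domain, first ensures that it has no memory tag and only then runs it. This ensures that the first memory domain can be accessed by the components of its own domain. Moreover, because the embodiments of the present application configure only memory tags and no address tags, the component corresponding to the first memory domain cannot directly access other memory domains, which achieves inter-domain access isolation. The solution makes no intrusive modification to the application, does not require recompiling program code, and does not increase the application size, which can reduce implementation cost. Compared with other memory isolation solutions implemented by software or hardware, it can reduce performance overhead, for example the overhead of context switching between user mode and the kernel. Memory partitions can reach a granularity of 16 bytes, enabling fine-grained memory access control.
In a possible design, configuring a memory tag for at least one memory domain may, in a specific implementation, be randomly generating the memory tag.
This design reduces the probability that third-party code guesses the memory tag and improves the security of memory isolation.
In a possible design, configuring a memory tag for at least one memory domain may, in a specific implementation, be configuring memory tags for those memory domains, among the at least one memory domain, that store sensitive assets.
This design ensures the security of sensitive assets while avoiding wasted memory tags and improving the utilization of memory tag resources.
In a possible design, after the first memory domain is run, a first instruction from the first memory domain may also be detected, the first instruction being used to access a second memory domain; a memory tag is configured for the first memory domain; the memory tag of the second memory domain is released; the first instruction is executed; and the execution result of the first instruction is returned to the first memory domain.
This design implements cross-domain access through a tag reconfiguration mechanism, which improves the security of cross-domain access.
In a possible design, before the first instruction is executed, it may also be determined that the memory tag configured for the first memory domain is a randomly generated memory tag.
This design prevents third-party code from bypassing the tag reconfiguration process and directly accessing the second memory domain, which further improves the security of cross-domain access.
In a possible design, before the first instruction is executed, the address of the first memory domain may also be saved. When the execution result of the first instruction is returned to the first memory domain, it may be returned according to the saved address.
In a possible design, after the first instruction is executed, a memory tag may also be configured for the second memory domain, and the memory tag of the first memory domain released.
This design ensures that the execution result is correctly returned to the first memory domain, improving the reliability of cross-domain access.
In a possible design, before the first memory domain is run, offline or online binary scanning may also be used to detect whether Memory Tag Extension (MTE) instructions exist in the first memory domain; if so, the MTE instructions are cleared. The MTE instructions include one or more of the following: the Load Allocation Tag (LDG) instruction, the Load Tag Multiple (LDGM) instruction, the Store Allocation Tag (STG) instruction, the Store Allocation Tag and Zeroing (STZG) instruction, the Store Allocation Tag and Pair register (STGP) instruction, and the Store Tag Multiple (STGM) instruction.
This design prevents third-party code from using MTE instructions to read legitimate memory tags or illegally set memory tags and then deriving address tags from the memory tags to gain illegal access to memory domains, which further improves the security and reliability of memory access.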
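In a second aspect, a control device is provided, including modules, units, or technical means for executing the method described in the first aspect or in any possible design of the first aspect.
For example, the device may include:
a configuration module, configured to configure a memory tag for at least one memory domain, where a memory domain configured with a memory tag may be accessed only by an instruction that is associated with an address tag matching the memory tag; before running a first memory domain of the at least one memory domain, if the first memory domain is configured with a memory tag, release the memory tag of the first memory domain, where the first memory domain is any one of the at least one memory domain; and after determining that the first memory domain does not have a memory tag, run the first memory domain.
In a possible design, the configuration module may be configured to randomly generate the memory tag.
In a possible design, the configuration module may be configured to configure memory tags for those memory domains, among the at least one memory domain, that store sensitive assets.
In a possible design, the device may further include:
a gateway module, configured to detect a first instruction from the first memory domain, the first instruction being used to access a second memory domain; configure a memory tag for the first memory domain; release the memory tag of the second memory domain; execute the first instruction; and return the execution result of the first instruction to the first memory domain.
In a possible design, the gateway module may be further configured to: before executing the first instruction, determine that the memory tag configured for the first memory domain is a randomly generated memory tag.
In a possible design, the gateway module may be further configured to: before executing the first instruction, save the address of the first memory domain; and when returning the execution result of the first instruction to the first memory domain, return it according to the saved address.
In a possible design, the gateway module may be further configured to: after executing the first instruction, configure a memory tag for the second memory domain and release the memory tag of the first memory domain.
In a possible design, the configuration module may be further configured to: before the first memory domain is run, detect through offline or online binary scanning whether Memory Tag Extension (MTE) instructions exist in the first memory domain; and if so, clear the MTE instructions. The MTE instructions include one or more of the following: the Load Allocation Tag (LDG) instruction, the Load Tag Multiple (LDGM) instruction, the Store Allocation Tag (STG) instruction, the Store Allocation Tag and Zeroing (STZG) instruction, the Store Allocation Tag and Pair register (STGP) instruction, and the Store Tag Multiple (STGM) instruction.
In a third aspect, a control device is provided, including at least one processor and an interface circuit. The interface circuit is used to receive signals from other devices outside the device and transmit them to the processor, or to send signals from the processor to other devices outside the device. The processor is used to implement, through logic circuits or by executing code instructions, the method described in the first aspect or in any possible design of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided. The readable storage medium is used to store instructions which, when executed, cause the method described in the first aspect or in any possible design of the first aspect to be implemented.
In a fifth aspect, a computer program product is provided. The computer program product stores instructions which, when run on a computer, cause the computer to execute the method described in the first aspect or in any possible design of the first aspect.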
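Brief Description of the Drawings
FIG. 1 is a schematic diagram of memory access in a software reuse scenario;
FIG. 2 is a flowchart of a memory access control method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the MTE memory access mechanism;
FIG. 4 is a schematic diagram of memory tags;
FIG. 5 is a schematic diagram of a tag allocator allocating and releasing memory tags;
FIG. 6 is a flowchart of a cross-domain access method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a cross-domain access example;
FIG. 8 is a flowchart of a cross-domain access example;
FIG. 9 is a schematic diagram of a control device provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of another control device provided by an embodiment of the present application.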
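Detailed Description
In a software reuse scenario, trusted components and untrusted components use the same memory address space. For example, FIG. 1 is a schematic diagram of memory access in a software reuse scenario. The sensitive asset library stores the data and code of trusted components, and the third-party library stores the data and code of untrusted components. Any user (including an illegitimate user) can send a request through the program entry to access the third-party library and the sensitive asset library, and can also maliciously cause memory corruption, for example by maliciously modifying data in a legitimate block of memory or by out-of-bounds access (accessing a block of memory beyond the legitimate access range).
Good software engineering partitions sensitive data and code, that is, isolates these sensitive assets so that faults and vulnerabilities in other components are kept outside the partition, ensuring the confidentiality and integrity of the partition.
Several memory isolation techniques are listed below:
I. Software-Based Fault Isolation (SFI) performs boundary checks on memory accesses by untrusted components. A check is added before every memory access (for example, before a read or write instruction is executed, the address in the instruction is checked for legitimacy) so that the memory of other components is not accessed. These boundary checks can be inserted by the compiler or by binary rewriting. Inserting boundary checks introduces extra execution overhead in the untrusted components, and further overhead is needed to prevent control-flow hijacking that could bypass the boundary checks.
This approach does not depend on hardware, supports the ARM (Advanced RISC Machines) architecture and the Cortex-A (A-core) architecture, supports fine-grained memory partitions of 16 B to 64 B, and adds little memory overhead to the application. However, it requires instruction translation, which has a large performance impact.
II. Hardware fault isolation uses a hardware security mechanism such as a hardware Memory Protection Unit (MPU). Memory accesses are translated by hardware and illegal operations are caught by hardware, so no extra performance overhead arises inside the untrusted components. Although no extra overhead arises inside a component, switching across component boundaries through virtualization or kernel traps still incurs extra overhead, which depends on the hardware architecture.
This approach isolates memory through the MPU and has a smaller performance impact than SFI. It supports the ARM architecture and fine-grained memory partitions of 16 B to 64 B, with little memory overhead. However, it depends on a hardware implementation and does not support Cortex-A.
III. ERIM (Secure, Efficient In-process Isolation with Protection Keys (MPK)) is an isolation technique based on the x86 architecture that relies on an x86 Instruction Set Architecture (ISA) extension. The scheme uses Memory Protection Keys (MPK) to label each virtual page with a 4-bit domain identifier (ID), dividing the address space of a process into at most 16 disjoint domains. A special register local to each logical core, the user-mode protection key register (Protection Key Register User, PKRU), determines which domains the core can read or write. Switching domain permissions requires writing the PKRU in user space. On current Intel central processing units (CPUs) this takes only 11 to 260 cycles, corresponding to an overhead of 0.07% to 1.0% per 100,000 switches per second on a 2.6 GHz CPU. This amounts to an overhead of at most 4.8% on the throughput of NGINX (NGINX is the name of a server).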
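The MPK technique mainly includes the following:
1. MPK provides each process with 16 protection keys (Protection Key, or Key for short), so the address space of a process can be divided into at most 16 protection domains. The PKRU register stores the specific access permissions of each Key.
2. To change the access permission of a domain, a single user-mode instruction suffices (for example Write Data to User Page Key Register, WRPKRU, which takes 11 to 260 cycles). No system call, page table modification, or Translation Lookaside Buffer (TLB) flush is needed.
The ERIM technique is introduced below:
1. The user-mode process is divided into two parts, a trusted domain and an untrusted domain.
Access to the untrusted domain is always allowed, and the trusted domain can never be accessed by the untrusted domain. However, the untrusted domain can obtain access permission by jumping to the call gates of the trusted domain.
2. One difficulty that ERIM must solve is preventing the untrusted domain from using the WRPKRU instruction to modify the access permissions of memory domains. ERIM relies on binary inspection to ensure that only safe WRPKRU occurrences appear in executable pages.
With this approach, the x86 MPK technique has very little impact on the runtime performance of untrusted or trusted components, but it depends on the x86 hardware architecture, so ERIM cannot be used on the ARM architecture. In addition, because it works in units of pages, its granularity can only reach 4 KB and it cannot provide finer-grained memory isolation.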
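IV. A method based on a hardware Memory Protection Unit (MPU) that uses the Protected Environment Keystone for MCU (PEKM) technique to isolate components within the same process.
The MPU technique mainly includes the following:
1. The MPU on a Microcontroller Unit (MCU) provides a (global) set of 16 protection regions, so 16 protection domains can be assigned to tasks. The PKRU register stores the specific access permissions of each Key.
2. To change the access permission of a domain, a kernel-mode instruction (such as mtcr) is used, which requires a system call and causes a context switch.
The PEKM technique mainly includes the following:
1. Tasks are executed in kernel mode, which gives rise to the problem of isolation within the same process space.
2. Tasks (untrusted) and the kernel (trusted) are isolated from each other in both directions, but a task can obtain access permission by jumping to the kernel's call gateway.
3. One difficulty that PEKM must solve is preventing tasks from using sensitive instructions such as mtcr to modify protection domain permissions. PEKM relies on binary inspection to ensure that system instructions do not appear in tasks.
With this approach, PEKM depends on MPU hardware. Memory domains are configured through the MPU so that memory can be switched between untrusted and trusted components, and the MPU memory granularity can reach 64 bytes. However, the MPU is available only on some ARM architectures (such as ARMv8-R or ARMv8-M). For Cortex-A, PEKM also proposes using the Memory Management Unit (MMU) for protection, but its granularity is still the page granularity, that is, 4 KB.
Table 1 below summarizes the advantages and disadvantages of the above four memory isolation techniques:
Table 1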
Figure PCTCN2022126519-appb-000001
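To solve one or more of the above technical problems, the memory access control solution of the present application is provided. The present application implements memory partitioning based on the Memory Tag Extension (MTE) technique. While preventing untrusted components from accessing the memory of trusted components, it can reduce software overhead, reduce the dependence on fixed hardware, and lower the memory isolation granularity.
The embodiments of the present application are described in detail below with reference to the drawings. It can be understood that the technical solutions provided by the embodiments can be applied to architectures such as ARM and Cortex-A, or to other types of processors, which is not limited in the present application.
FIG. 2 is a flowchart of a memory access control method provided by an embodiment of the present application. The method can be executed by a processor, for example an ARM or Cortex-A processor, which is not limited in the present application. The method includes:
S201. The processor configures a memory tag for at least one memory domain, where a memory domain configured with a memory tag may be accessed only by an instruction that is associated with an address tag matching the memory tag.
In the embodiments of the present application, the processor may use the Memory Tag Extension (MTE) technique to configure memory tags for at least one memory domain.
For ease of understanding, the MTE technique is introduced first:
MTE is an instruction set extension introduced in the ARMv8.5-A architecture. MTE implements a lock-and-key mechanism for memory access and can provide fine-grained access control over physical memory: a memory access is authorized only if the key value in the pointer is the same as the lock value of the physical memory; otherwise an exception is triggered. FIG. 3 is a schematic diagram of the MTE memory access mechanism.
In MTE, the key is also called the address tag, logical tag, or pointer tag, and the lock is also called the memory tag. Each memory tag is 4 bits in size and is associated with 16 bytes of physical memory. In a specific implementation, multiple different 16-byte blocks of physical memory may be associated with the same or different memory tags, which is not limited in the present application. FIG. 4 is a schematic diagram of memory tags, in which physical memory [-16, 0] corresponds to memory tag 0xA, physical memory [32, 48] corresponds to memory tag 0x3, and physical memory [0, 16] and [16, 32] correspond to the same memory tag, 0x2. It can be understood that memory tags are stored in a special memory region that ordinary memory read and write instructions cannot access. MTE therefore provides special instructions to update this region, such as Load Allocation Tag (LDG) and Store Allocation Tag (STG).
MTE uses ARM's Top Byte Ignore (TBI) feature and uses the high bits [59:56] of the access address as the address tag that identifies the pointer, so there can be at most 16 different address tags.
It can be understood that when the processor accesses memory, the access instruction (such as a read or write instruction) carries an access address, and the access address is associated with an address tag; for example, bits [59:56] of the access address are the address tag.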
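The lock-and-key check described above can be illustrated with a small software model. This is only a sketch for intuition, not how the hardware is implemented: the function names, the `BASE` address standing in for offset 0 of FIG. 4, and the choice of 0 as the tag of an unlisted granule are all assumptions made for the example.

```python
GRANULE_BYTES = 16   # one memory tag covers 16 bytes of physical memory
TAG_SHIFT = 56       # the address tag occupies bits [59:56] of the pointer
TAG_MASK = 0xF       # 4-bit tag, so at most 16 distinct values


def with_address_tag(address: int, tag: int) -> int:
    """Return the pointer with `tag` placed in bits [59:56]."""
    cleared = address & ~(TAG_MASK << TAG_SHIFT)
    return cleared | ((tag & TAG_MASK) << TAG_SHIFT)


def address_tag(pointer: int) -> int:
    """Extract the 4-bit address tag (the key) from a pointer."""
    return (pointer >> TAG_SHIFT) & TAG_MASK


def check_access(pointer: int, memory_tags: dict) -> bool:
    """True when the pointer's key equals the lock of the granule it targets."""
    untagged = pointer & ((1 << TAG_SHIFT) - 1)   # drop the ignored top byte
    granule = untagged // GRANULE_BYTES
    lock = memory_tags.get(granule, 0)            # assumption: unlisted granules use 0
    return address_tag(pointer) == lock


BASE = 0x1010  # hypothetical address standing in for offset 0 of FIG. 4
FIG4_TAGS = {
    (BASE - 16) // GRANULE_BYTES: 0xA,   # [-16, 0]
    (BASE + 0) // GRANULE_BYTES: 0x2,    # [0, 16]
    (BASE + 16) // GRANULE_BYTES: 0x2,   # [16, 32]
    (BASE + 32) // GRANULE_BYTES: 0x3,   # [32, 48]
}
```

For instance, `check_access(with_address_tag(BASE + 8, 0x2), FIG4_TAGS)` succeeds because the key 0x2 matches the lock of that granule, while the same address carrying key 0x3 fails the check.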
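Memory domain: also called a memory region. In a specific implementation, the processor may divide memory into multiple different memory domains according to information entered by the user (such as a memory domain configuration file). A memory domain is used to mark a memory range such as that of a third-party library or of sensitive assets.
In the embodiments of the present application, the processor may use a tag allocator to allocate memory tags to at least one memory domain. The tag allocator is an allocator defined by the MTE technique. Unlike a common memory allocator, it provides tag allocation for memory and is used to taint a given block of memory with a tag (that is, to allocate a memory tag to the given memory). The MTE allocator can also provide MTE instructions for modifying memory tags.
It can be understood that the MTE allocator is essentially program code; when the processor runs this code, functions such as allocating and modifying memory tags are implemented.
In a possible design, when configuring memory tags for at least one memory domain, the processor may generate the memory tags randomly. This reduces the probability that third-party code guesses a memory tag and improves the security of memory isolation.
It can be understood that third-party code in this document means code that, from the application developer's point of view, was not developed by the application developer (for example, code developed by an illegitimate user, commonly called a "hacker"), but it does not include the system's trusted base (such as system kernel code or the code that implements the method of the present application).
In a possible design, when configuring memory tags for at least one memory domain, the processor may configure memory tags only for those memory domains, among the at least one memory domain, that store sensitive assets.
For example, the at least one memory domain includes a first memory domain and a second memory domain, where the first memory domain stores sensitive assets and the second memory domain stores non-sensitive assets. A memory tag can then be configured for the first memory domain and not for the second memory domain. Because the second memory domain stores non-sensitive assets, access to it is always allowed, so there is no need to configure a memory tag for it.
Assets include but are not limited to code and/or data.
This avoids wasting memory tags and improves the utilization of memory tag resources.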
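S202. Before a first memory domain of the at least one memory domain is run, if the first memory domain is configured with a memory tag, the memory tag of the first memory domain is released, where the first memory domain is any one of the at least one memory domain.
It can be understood that running an application essentially means running (or executing) code to process data. As described above, an application developed through software reuse can consist of multiple components, and the code and data of each component can be stored in a different memory domain. Running a component of the application can therefore be understood as running (or executing) the code in the memory domain corresponding to that component, or in other words as running the memory domain corresponding to that component. Accordingly, in the embodiments of the present application, running the first memory domain can be understood as running the code in the first memory domain, and "before the first memory domain is run" means before the first line of code of the first memory domain is run.
It should be emphasized that the memory isolation solution based on MTE in the embodiments of the present application differs from the functionality defined by MTE itself.
As described in the introduction to MTE above, MTE itself allows access when the address tag (key) matches the memory tag (lock), so the application code has to be modified to carry an address tag in its access instructions when accessing a memory domain.
In the embodiments of the present application, by contrast, the application code does not need to be modified and no address tag (key) needs to be configured. The tag allocator does not need to be visible to the application and uses only the parts of MTE related to memory tags. Specifically, memory tags are first configured for at least one memory domain (that is, the domains are locked). Before any one of the at least one memory domain (such as the first memory domain) is run, it is ensured that this memory domain has no memory tag (that is, it is unlocked). After this memory domain starts running, its component can access (read and/or write) the data in its own domain's memory. When this memory domain accesses another domain, the other memory domain is configured with a memory tag (that is, it is locked) and this memory domain has no address tag (because the tag allocator has not configured one), so a tag mismatch triggers an exception and the other memory domain cannot be accessed directly.
For example, referring to FIG. 5, the tag allocator configures memory tag 0, memory tag 1, and memory tag 2 for memory domain A, memory domain B, and memory domain C respectively. Before memory domain A is run, its memory tag (memory tag 0, the lock) is released first, so that when memory domain A runs it can access only its own domain and cannot directly access other memory domains (as shown in FIG. 5, the access from memory domain A to memory domain B fails).
Accordingly, to ensure that the first memory domain has no memory tag when it runs, it must be determined before the first memory domain is run whether it has a memory tag. If the first memory domain has a memory tag, the memory tag is released and then S203 is executed. If the first memory domain has no memory tag (for example, the first memory domain stores non-sensitive assets and in step S201 memory tags were configured only for memory domains storing sensitive assets), S203 can be executed directly.
Optionally, the processor may release the memory tag of the first memory domain through hardware or by setting it with the STG instruction.
S203. After determining that the first memory domain does not have a memory tag, the processor runs the first memory domain.
Running the first memory domain means that the processor starts to read and execute the code in the first memory domain, implementing the functions of the component corresponding to the first memory domain. It can be understood that the code in the first memory domain can generate instructions when executed. These instructions may be read and/or write instructions or other instructions, which is not limited in the present application.
In the above solution, the processor first allocates memory tags to at least one memory domain and, before running a first memory domain of the at least one memory domain, first ensures that it has no memory tag and only then runs it. This ensures that the first memory domain can be accessed by the components of its own domain. Moreover, because the embodiments of the present application configure only memory tags and no address tags, the component corresponding to the first memory domain cannot directly access other memory domains, which achieves inter-domain access isolation.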
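Steps S201 to S203 can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions, not the applicant's implementation: `Domain`, `configure_tags`, `run_domain`, `access` and `TagMismatch` are hypothetical names, `None` stands for "no memory tag", and drawing tags from 1 to 15 is a modelling choice so that a lock is never 0.

```python
import secrets
from dataclasses import dataclass
from typing import Iterable, Optional


class TagMismatch(Exception):
    """Raised when an untagged access hits a domain that still holds a lock."""


@dataclass
class Domain:
    name: str
    sensitive: bool = True
    memory_tag: Optional[int] = None  # None models "no memory tag" (unlocked)


def configure_tags(domains: Iterable[Domain], only_sensitive: bool = False) -> None:
    """S201: give each (optionally only each sensitive) domain a random tag."""
    for domain in domains:
        if only_sensitive and not domain.sensitive:
            continue
        # Assumption: tags are drawn from 1..15 so that a lock is never 0.
        domain.memory_tag = 1 + secrets.randbelow(15)


def run_domain(domain: Domain) -> Domain:
    """S202 and S203: release the domain's tag if present, then run it."""
    if domain.memory_tag is not None:
        domain.memory_tag = None          # S202: release the lock
    assert domain.memory_tag is None      # S203 precondition
    return domain


def access(running: Domain, target: Domain) -> str:
    """Model an access made by code that carries no address tag."""
    if target.memory_tag is not None:
        raise TagMismatch(f"{running.name} cannot access {target.name}")
    return f"{running.name} accessed {target.name}"
```

In this model, after `configure_tags([a, b])` and `run_domain(a)`, the call `access(a, a)` succeeds while `access(a, b)` raises `TagMismatch`, mirroring the failed access from memory domain A to memory domain B in FIG. 5.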
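The above solution has at least the following advantages:
(1) It makes no intrusive modification to the application, does not require recompiling program code, and does not increase the application size, which can reduce implementation cost;
(2) Compared with other memory isolation solutions implemented by software or hardware, this solution combines software and hardware (for example, the tag allocator and the API gateway rely on software, while MTE instructions and memory tags involve hardware), which can reduce performance overhead, for example the overhead of context switching between user mode and the kernel;
(3) Memory partitioning based on MTE can reach a granularity of 16 bytes, enabling fine-grained memory access control.
In a possible design, the processor may also implement cross-domain access by reconfiguring memory tags.
Referring to FIG. 6, taking access from the first memory domain to the second memory domain as an example (that is, after the first memory domain is run, an instruction that accesses the second memory domain is generated in the first memory domain), an embodiment of the present application further provides a cross-domain access method, including:
S601. After running the first memory domain, the processor detects a first instruction from the first memory domain, the first instruction being used to access the second memory domain;
S602. The processor reconfigures a memory tag for the first memory domain (because the tag of the first memory domain was released before the first memory domain was run);
Optionally, the processor randomly generates the memory tag for the first memory domain.
S603. The processor releases the memory tag of the second memory domain so that the second memory domain can run normally (can be accessed);
S604. The processor executes the first instruction in the second memory domain;
S605. The processor returns the execution result of the first instruction to the first memory domain.
In other words, during cross-domain access, before the accessed domain (such as the second memory domain) executes the read or write instruction from the accessing domain (such as the first memory domain), the accessing domain must be locked again and the accessed domain unlocked, so that the accessed domain can be accessed and only the currently running memory domain can be accessed.
In a specific implementation, the processor may control cross-domain access by configuring an Application Program Interface (API) gateway. It can be understood that the API gateway is essentially also software code. When it runs, it implements the functions related to cross-domain access, such as detecting the first instruction, controlling the tag allocator to reconfigure memory tags (for example reconfiguring a tag for the first memory domain and releasing the tag of the second memory domain), and controlling the execution of the first instruction. If a cross-domain access bypasses the API gateway, an exception is triggered because of a tag mismatch.
For example, referring to FIG. 7, when the first memory domain directly accesses the second memory domain or the third memory domain, the cross-domain access fails because those domains have memory tags and the first memory domain cannot provide a matching address tag. When the first memory domain accesses the second memory domain through the API gateway, the API gateway can work with the tag allocator to reconfigure the memory tags so that the first memory domain can access the second memory domain.
In this design, the API gateway and the tag allocator cooperate to implement cross-domain access, which improves the security of cross-domain access.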
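In a possible design, before executing the first instruction, the processor may also verify whether the memory tag of the first memory domain is a randomly generated memory tag. If the memory tag configured for the first memory domain is a randomly generated memory tag, the processor executes the first instruction; otherwise, it does not execute the first instruction.
This prevents third-party code from bypassing step S602 and directly accessing the second memory domain, which further improves the security of cross-domain access.
In a possible design, before executing the first instruction, the processor may also save the address of the first memory domain. When the processor returns the execution result of the first instruction to the first memory domain, it returns the result according to the saved address.
This ensures that the execution result returns to the first memory domain, improving the reliability of cross-domain access.
In a possible design, after executing the first instruction, the processor may also reconfigure a memory tag for the second memory domain, that is, lock the second memory domain again, and release the memory tag of the first memory domain so that the execution result can return to the first memory domain. Optionally, when reconfiguring the memory tag for the second memory domain, the processor may generate the memory tag randomly.
This ensures that the execution result returns to the first memory domain while preventing other memory domains or third-party code from taking the opportunity to access the second memory domain, improving the reliability of cross-domain access.
In a possible design, after detecting the first instruction from the first memory domain and before configuring a memory tag for the first memory domain, the processor may also acquire an atomic lock. The atomic lock ensures that the API gateway has only one switching subject at any moment (that is, one memory domain to be locked and one memory domain to be unlocked). The processor releases the atomic lock after executing the first instruction. This avoids concurrency problems during cross-domain access.
It can be understood that acquiring the atomic lock means executing the instruction "Acquire_lock", whose meaning is: this step acquires a lock on a specified value; locks can be used to prevent concurrent modification of resources.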
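For ease of understanding, a detailed cross-domain access example is given here. Referring to FIG. 8, it includes:
S801. Acquire the atomic lock and determine the memory domain currently being executed, for example the first memory domain (it is running, so it has no memory tag);
S802. The processor marks (that is, locks) the first memory domain with a random memory tag;
S803. The processor determines the memory domain to be executed, for example the second memory domain, and releases the memory tag of the second memory domain;
S804. The processor verifies whether the memory tag of the first memory domain is a random memory tag, and continues with the subsequent steps after the verification passes;
S805. The processor saves the return address (that is, the address of the first memory domain) and executes the cross-domain access instruction (such as the first instruction);
S806. The processor marks (that is, locks) the second memory domain again with a random memory tag;
S807. The processor releases the memory tag of the first memory domain so that it can run normally;
S808. The processor releases the atomic lock and returns the execution result to the first memory domain.
It can be understood that when the hardware (such as ARM) supports pointer authentication (PAC), PAC needs to be enabled to further guarantee control-flow integrity.
In a specific implementation, the above flow can be implemented by running code.
An example of the code corresponding to steps S801 to S808 is as follows: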
Figure PCTCN2022126519-appb-000002
Figure PCTCN2022126519-appb-000003
Through the above flow, cross-domain access can be implemented.
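The application's own listing for steps S801 to S808 is available here only as figure placeholders, so the following is an independent Python sketch of the same flow rather than a reproduction of that listing. All names (`Gateway`, `cross_domain_call`, `random_tag`) are hypothetical, a `threading.Lock` stands in for the atomic lock, and the S804 check (comparing the caller's tag with the value the gateway just generated) is an assumed stand-in, since the text does not say how randomness is verified.

```python
import secrets
import threading


class TagMismatch(Exception):
    """Raised when the post-configuration check of the caller's tag fails."""


def random_tag() -> int:
    # Assumption: tags are drawn from 1..15 so that a lock is never 0.
    return 1 + secrets.randbelow(15)


class Gateway:
    """Toy API gateway: tags[name] is None while that domain is unlocked."""

    def __init__(self, domain_names, running):
        self._lock = threading.Lock()     # stands in for the atomic lock
        self.tags = {name: random_tag() for name in domain_names}
        self.tags[running] = None         # the running domain holds no tag
        self.running = running

    def cross_domain_call(self, target, func, *args):
        # Note: the lock is not reentrant, so func must not call the gateway.
        with self._lock:                            # S801, released at S808
            caller = self.running                   # S801: current domain
            issued = random_tag()
            self.tags[caller] = issued              # S802: lock the caller
            self.tags[target] = None                # S803: unlock the target
            if self.tags[caller] != issued:         # S804: assumed check
                raise TagMismatch(caller)
            return_to = caller                      # S805: save return address
            self.running = target
            try:
                result = func(*args)                # S805: execute the access
            finally:
                self.tags[target] = random_tag()    # S806: re-lock the target
                self.tags[return_to] = None         # S807: unlock the caller
                self.running = return_to
        return result                               # S808: return the result
```

With `gw = Gateway(["first", "second"], running="first")`, a call such as `gw.cross_domain_call("second", lambda: dict(gw.tags))` runs while only the second domain is unlocked, and afterwards the first domain is unlocked again and the second carries a fresh random tag.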
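In a possible design, before running any memory domain, the processor may perform MTE instruction clearing, to prevent third-party code from using illegally obtained MTE instructions to obtain pointer tags and gain illegal access.
For example, before running the first memory domain, the processor detects through offline or online binary scanning whether MTE instructions exist in the first memory domain; if so, it clears the MTE instructions.
The MTE instructions include the following two types:
1. Instructions for reading memory tags, for example: Load Allocation Tag (LDG), Load Tag Multiple (LDGM), and so on.
2. Instructions for setting memory tags, for example one or more of: Store Allocation Tag (STG), Store Allocation Tag and Zeroing (STZG), Store Allocation Tag and Pair register (STGP), Store Tag Multiple (STGM), and so on.
This prevents third-party code from using MTE instructions to read legitimate memory tags or illegally set memory tags and then deriving address tags from the memory tags to gain illegal access to memory domains, which further improves the security and reliability of memory access.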
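The scanning step can be sketched as a pass over the fixed-width 32-bit little-endian instruction words of an A64 code region. This is a hedged illustration, not the application's scanner: `scan_and_clear` is a hypothetical name, the `(mask, value)` patterns are supplied by the caller and must be taken from the Arm architecture reference for the real LDG/STG-family encodings, the pattern in `EXAMPLE_PATTERNS` is made up and is not a real MTE encoding, and overwriting a hit with a NOP is an assumption about what "clearing" means.

```python
import struct
from typing import List, Sequence, Tuple

A64_NOP = 0xD503201F  # encoding of the A64 NOP instruction

# Made-up placeholder pattern, NOT a real MTE instruction encoding.
EXAMPLE_PATTERNS = [(0xFF000000, 0xAB000000)]


def scan_and_clear(code: bytes,
                   patterns: Sequence[Tuple[int, int]]) -> Tuple[bytes, List[int]]:
    """Replace every word matching a (mask, value) pattern with a NOP.

    Returns the patched code and the byte offsets of the words that matched.
    """
    if len(code) % 4 != 0:
        raise ValueError("A64 code must be a whole number of 32-bit words")
    words = list(struct.unpack(f"<{len(code) // 4}I", code))
    hits = []
    for index, word in enumerate(words):
        if any((word & mask) == value for mask, value in patterns):
            hits.append(index * 4)      # byte offset of the matching word
            words[index] = A64_NOP      # assumption: clear by overwriting
    return struct.pack(f"<{len(words)}I", *words), hits
```

An offline scan would run this over the binary before loading, while an online scan would run it over the memory domain before its first instruction executes; either way the returned offsets can be logged for the exception handling module.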
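In a possible design, when configuring the addresses of the memory domains, the processor may enable Address Space Layout Randomization (ASLR) and thereby configure the address of each memory domain randomly.
It can be understood that when ASLR is enabled, each time the program runs, the loaded executable file and shared libraries are mapped to different addresses in the virtual address space.
This reduces the probability that third-party code guesses a memory address, which in turn lowers the risk that the application's process is compromised and further improves the reliability of the solution.
It can be understood that the above embodiments can be combined with one another to achieve different technical effects.
FIG. 9 is a schematic diagram of a control device provided by an embodiment of the present application, including a configuration module 901 and a gateway module 902.
The configuration module is used to initialize the system, for example by allocating memory tags to all memory domains (or to the memory domains storing sensitive assets) and by scanning for and eliminating MTE-sensitive instructions before any memory domain is run. In a possible example, as shown in FIG. 9, the configuration module can be further subdivided into multiple modules such as a tag allocator and an instruction elimination module. The tag allocator is used to allocate memory tags to memory domains, modify memory tags, and so on. The instruction elimination module is used to perform offline or online binary scanning on a memory domain and to eliminate MTE instructions when they are found in the memory domain.
The gateway module is used to implement the functions of the API gateway described above, that is, to control the cross-domain access process between memory domains. In a possible example, as shown in FIG. 9, the gateway module can be further subdivided into modules such as memory tag reconfiguration, post-verification, and exception handling. The functions of post-verification include but are not limited to: checking whether the return address is normal; and, after a memory tag is set, checking whether the memory tag was set correctly (for example, whether it was randomly generated). The exception handling module is used to handle exceptions that occur during verification, for example by stopping the service or reporting the exception.
The "next PC" in FIG. 9 refers to the next instruction (PC stands for program counter), meaning that after one instruction is executed, execution continues with the next instruction. For example, the first instruction described above is one such instruction.
For the specific implementation of the above modules, refer to the relevant description in the method embodiments above, which is not repeated here.
It should be understood that all relevant content of the steps involved in the above method embodiments can be cited in the functional descriptions of the corresponding functional modules, which is not repeated here.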
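Based on the same technical concept, and referring to FIG. 10, an embodiment of the present application further provides a control device. The device includes at least one processor 1001 and an interface circuit 1002. The interface circuit 1002 is used to receive signals from other devices outside the device and transmit them to the processor 1001, or to send signals from the processor 1001 to other devices outside the device. The processor 1001 is used to implement, through logic circuits or by executing code instructions, the method described in the above method embodiments.
It should be understood that the processor mentioned in the embodiments of the present application can be implemented by hardware or by software. When implemented by hardware, the processor can be a logic circuit, an integrated circuit, or the like. When implemented by software, the processor can be a general-purpose processor that is implemented by reading software code stored in a memory.
For example, the processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
It should be understood that the memory mentioned in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically EPROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).
It should be noted that when the processor is a general-purpose processor, DSP, ASIC, FPGA or other programmable logic device, discrete gate or transistor logic device, or discrete hardware component, the memory (storage module) can be integrated into the processor.
It should be noted that the memory described herein is intended to include, without being limited to, these and any other suitable types of memory.
Based on the same technical concept, an embodiment of the present application further provides a computer-readable storage medium including a program or instructions which, when run on a computer, cause the method described in the above method embodiments to be executed.
Based on the same technical concept, an embodiment of the present application further provides a computer program product containing instructions. The computer program product stores instructions which, when run on a computer, cause the method described in the above method embodiments to be executed.
Those skilled in the art should understand that the embodiments of the present application may be provided as methods, systems, or computer program products. The present application may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and so on) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the methods, devices (systems), and computer program products according to the present application. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and variations to the present application without departing from its scope of protection. If these modifications and variations fall within the scope of the claims of the present application and their equivalents, the present application is also intended to include them.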

Claims (19)

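  1. A memory access control method, characterized by comprising:
    configuring a memory tag for at least one memory domain, wherein a memory domain configured with a memory tag may be accessed only by an instruction that is associated with an address tag, the address tag matching the memory tag;
    before running a first memory domain of the at least one memory domain, if the first memory domain is configured with a memory tag, releasing the memory tag of the first memory domain, wherein the first memory domain is any one of the at least one memory domain;
    after determining that the first memory domain does not have a memory tag, running the first memory domain.
  2. The method according to claim 1, characterized in that configuring a memory tag for at least one memory domain comprises:
    randomly generating the memory tag.
  3. The method according to claim 1 or 2, characterized in that configuring a memory tag for at least one memory domain comprises:
    configuring a memory tag for a memory domain, among the at least one memory domain, that stores sensitive assets.
  4. The method according to any one of claims 1-3, characterized in that, after running the first memory domain, the method further comprises:
    detecting a first instruction from the first memory domain, the first instruction being used to access a second memory domain;
    configuring a memory tag for the first memory domain;
    releasing the memory tag of the second memory domain;
    executing the first instruction;
    returning the execution result of the first instruction to the first memory domain.
  5. The method according to claim 4, characterized in that, before executing the first instruction, the method further comprises:
    determining that the memory tag configured for the first memory domain is a randomly generated memory tag.
  6. The method according to claim 4 or 5, characterized in that, before executing the first instruction, the method further comprises:
    saving the address of the first memory domain;
    wherein returning the execution result of the first instruction to the first memory domain comprises:
    returning the execution result of the first instruction to the first memory domain according to the saved address.
  7. The method according to any one of claims 4-6, characterized in that, after executing the first instruction, the method further comprises:
    configuring a memory tag for the second memory domain;
    releasing the memory tag of the first memory domain.
  8. The method according to any one of claims 1-7, characterized in that, before running the first memory domain, the method further comprises:
    detecting, through offline or online binary scanning, whether Memory Tag Extension (MTE) instructions exist in the first memory domain;
    if so, clearing the MTE instructions;
    wherein the MTE instructions include one or more of: a Load Allocation Tag (LDG) instruction, a Load Tag Multiple (LDGM) instruction, a Store Allocation Tag (STG) instruction, a Store Allocation Tag and Zeroing (STZG) instruction, a Store Allocation Tag and Pair register (STGP) instruction, and a Store Tag Multiple (STGM) instruction.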
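  9. A control device, characterized by comprising:
    a configuration module, configured to configure a memory tag for at least one memory domain, wherein a memory domain configured with a memory tag may be accessed only by an instruction that is associated with an address tag, the address tag matching the memory tag; before running a first memory domain of the at least one memory domain, if the first memory domain is configured with a memory tag, release the memory tag of the first memory domain, wherein the first memory domain is any one of the at least one memory domain; and after determining that the first memory domain does not have a memory tag, run the first memory domain.
  10. The device according to claim 9, characterized in that the configuration module is configured to randomly generate the memory tag.
  11. The device according to claim 9 or 10, characterized in that the configuration module is configured to configure a memory tag for a memory domain, among the at least one memory domain, that stores sensitive assets.
  12. The device according to any one of claims 9-11, characterized in that the device further comprises:
    a gateway module, configured to detect a first instruction from the first memory domain, the first instruction being used to access a second memory domain; configure a memory tag for the first memory domain; release the memory tag of the second memory domain; execute the first instruction; and return the execution result of the first instruction to the first memory domain.
  13. The device according to claim 12, characterized in that the gateway module is further configured to: before executing the first instruction, determine that the memory tag configured for the first memory domain is a randomly generated memory tag.
  14. The device according to claim 12 or 13, characterized in that the gateway module is further configured to: before executing the first instruction, save the address of the first memory domain; and when returning the execution result of the first instruction to the first memory domain, return the execution result of the first instruction to the first memory domain according to the saved address.
  15. The device according to any one of claims 12-14, characterized in that the gateway module is further configured to: after executing the first instruction, configure a memory tag for the second memory domain; and release the memory tag of the first memory domain.
  16. The device according to any one of claims 9-15, characterized in that the configuration module is further configured to: before running the first memory domain, detect, through offline or online binary scanning, whether Memory Tag Extension (MTE) instructions exist in the first memory domain; and if so, clear the MTE instructions; wherein the MTE instructions include one or more of: a Load Allocation Tag (LDG) instruction, a Load Tag Multiple (LDGM) instruction, a Store Allocation Tag (STG) instruction, a Store Allocation Tag and Zeroing (STZG) instruction, a Store Allocation Tag and Pair register (STGP) instruction, and a Store Tag Multiple (STGM) instruction.
  17. A control device, characterized by comprising: at least one processor and an interface circuit;
    wherein the interface circuit is used to receive signals from other devices outside the device and transmit them to the processor, or to send signals from the processor to other devices outside the device, and the processor is used to implement, through logic circuits or by executing code instructions, the method according to any one of claims 1-8.
  18. A computer-readable storage medium, characterized in that the readable storage medium is used to store instructions which, when executed, cause the method according to any one of claims 1-8 to be implemented.
  19. A computer program product, characterized in that the computer program product stores instructions which, when run on a computer, cause the computer to execute the method according to any one of claims 1-8.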
PCT/CN2022/126519 2022-10-20 2022-10-20 A memory access control method and device WO2024082232A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202280011459.4A CN118235122A (zh) 2022-10-20 2022-10-20 A memory access control method and device
PCT/CN2022/126519 WO2024082232A1 (zh) 2022-10-20 2022-10-20 A memory access control method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/126519 WO2024082232A1 (zh) 2022-10-20 2022-10-20 A memory access control method and device

Publications (1)

Publication Number Publication Date
WO2024082232A1 (zh) 2024-04-25

Family

ID=90736532

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/126519 WO2024082232A1 (zh) 2022-10-20 2022-10-20 A memory access control method and device

Country Status (2)

Country Link
CN (1) CN118235122A (zh)
WO (1) WO2024082232A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190196977A1 (en) * 2019-02-28 2019-06-27 Intel Corporation Technology For Managing Memory Tags
CN110554911A (zh) * 2018-05-30 2019-12-10 阿里巴巴集团控股有限公司 Memory access and allocation method, storage controller and system
CN112970019A (zh) * 2019-03-12 2021-06-15 华为技术有限公司 Apparatus and method for enhancing hardware-assisted memory safety
US20210200684A1 (en) * 2019-12-27 2021-07-01 Intel Corporation Memory tagging apparatus and method
WO2021240160A1 (en) * 2020-05-29 2021-12-02 Arm Limited Tag checking apparatus and method

Also Published As

Publication number Publication date
CN118235122A (zh) 2024-06-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22962419

Country of ref document: EP

Kind code of ref document: A1