WO2022218517A1 - Method and device for verifying execution of a program code - Google Patents

Method and device for verifying execution of a program code

Info

Publication number
WO2022218517A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
tags
program code
data memory
group
Prior art date
Application number
PCT/EP2021/059611
Other languages
French (fr)
Inventor
Rémi Robert Michel DENIS-COURMONT
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to PCT/EP2021/059611 priority Critical patent/WO2022218517A1/en
Publication of WO2022218517A1 publication Critical patent/WO2022218517A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/54 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by adding security routines or objects to programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/3624 Software debugging by performing operations on the source code, e.g. via a compiler
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/3636 Software debugging by tracing the execution of the program
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/073 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a memory management context, e.g. virtual memory or cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring

Definitions

  • the disclosure relates to a method for verifying execution of a program code; more particularly, the disclosure relates to a device including a data processor coupled to a data memory for verifying the execution of the program code.
  • Software applications may contain flawed logic or faults.
  • a carefully crafted malicious input or program may exploit faults in a given software application in a manner that causes the given software application to deviate from its intended behaviour. Such a deviation may have potentially dangerous consequences for the given software application's user and a system on which the software application is running.
  • Such faults are referred to as vulnerabilities, for example as software vulnerabilities.
  • Memory-corruption vulnerabilities are an important class of software vulnerabilities that lead to corruption of data in a computer memory accessed by the software application.
  • When a read or write occurs within the legitimate boundaries of an object that no longer logically exists in the program execution sequence, either stray data of the former object or new data of a new object allocated later in the same place may be present. This may lead to tampering with data or to a breach of its confidentiality.
  • boundary checks are implemented at run-time entirely in the software application for testing and debugging purposes, for example by using a run-time memory access checking system such as the LLVM AddressSanitizer.
  • The AddressSanitizer requires a special compilation of programs, causes considerable degradation of execution speed, and requires prohibitive amounts of memory.
  • In the CHERI architecture, a mechanism is proposed to enforce spatial safety with changes to the processor and instruction set architecture (ISA) extensions.
  • the known approach represents memory references using capabilities, which describe not only the memory address, but also the boundaries and permissions associated with the memory reference that allows the processor to enforce boundaries and permissions at run-time.
  • the known approach (CHERI architecture) uses extensions that have not currently been implemented in any real product and does not provide any means for temporal safety.
  • Another known approach, ARMv8.5-MTE (the Memory Tagging Extension introduced by ARM, hereafter MTE), is a form of memory coloring that associates a 4-bit tag (the “color”) with each 16-byte “granule” of physical memory.
  • the processor verifies that an expected tag, specified as part of the memory address, matches the actual tag of the accessed memory granule.
  • An out-of-bounds access made without further information about the system state has only a 1 in 16 chance of carrying the correct tag; otherwise, it is rejected by the processor.
  • This known approach provides memory safety, both temporal and spatial, but only in a statistical or non-deterministic fashion. Furthermore, the known approach can ensure that two adjacent objects have distinct tags so that adjacent out-of-bounds accesses fail deterministically. In the general case, however, protection is only statistical: a 15 in 16 chance of detection, which gives a safety of about 94%.
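  • The tag comparison described above can be modelled in software. The following is a minimal sketch, not the hardware behaviour itself: the tag store is a plain byte array indexed by granule, and the names `pointer_tag`, `granule_index`, `mte_access_ok` and `rejection_probability` are hypothetical. It assumes the MTE convention that the logical tag sits in bits 59:56 of a 64-bit address.

```c
#include <stdbool.h>
#include <stdint.h>

#define GRANULE_SIZE 16u /* bytes covered by one tag */
#define TAG_BITS     4u  /* 2^4 = 16 possible tag values */

/* Expected tag carried in bits 59:56 of the address. */
static inline uint8_t pointer_tag(uint64_t addr) {
    return (uint8_t)((addr >> 56) & 0xFu);
}

/* Index of the 16-byte granule, ignoring the top byte of the address. */
static inline uint64_t granule_index(uint64_t addr) {
    return (addr & 0x00FFFFFFFFFFFFFFull) / GRANULE_SIZE;
}

/* Software model of the check: the access is allowed only when the
 * pointer tag equals the tag stored for the accessed granule. */
static inline bool mte_access_ok(uint64_t addr, const uint8_t *tag_store) {
    return pointer_tag(addr) == (tag_store[granule_index(addr)] & 0xFu);
}

/* Chance that an access carrying a random tag is rejected: 15/16. */
static inline double rejection_probability(void) {
    return (double)((1u << TAG_BITS) - 1u) / (double)(1u << TAG_BITS);
}
```

  • With uniformly random tags, `rejection_probability()` evaluates to 0.9375, which is the "about 94%" figure quoted above.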
  • the disclosure provides a method for verifying execution of a program code and a device including a data processor coupled to a data memory for verifying the execution of the program code.
  • A method is provided for verifying execution of a program code on a device that includes a data processor coupled to a data memory.
  • the program code when executed on the data processor accesses the data memory.
  • the method includes allocating a first group of tags to respective granules of the data memory.
  • the method includes allocating a second group of predetermined tags to respective pointers and capabilities used by the program code when executed.
  • the pointers and capabilities occupy an integer number of granules.
  • the method includes allocating a third group of tags.
  • a tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule.
  • the method includes, during run time of the program code, using a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
  • the method is of advantage in that the method can track exact object boundaries and provides deterministic spatial memory safety rather than depend on ad-hoc and new ISA extensions.
  • the method provides deterministic spatial safety, and a very highly probable temporal safety as it uses ARMv8.5-Memory Tagging Extension. Specifically, the method radically changes the logic for allocating tags for granules of memory and changes the representation of pointers so that the tags carry non-fakeable boundaries.
  • The method assigns distinct fixed predetermined tags for (i) pointers or capabilities, which always occupy exactly one granule (16 bytes) of the data memory, (ii) any other data constituent of a valid/live in-memory object, (iii) unallocated data memory, including memory space of former objects that are not yet reallocated, and (iv) a per-object descriptor, which, like a capability, must occupy exactly one granule.
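  • The four categories can be written down as an enumeration. This is a sketch only: the source fixes the categories but not their numeric values, so the values and all identifiers below are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag values; only the four categories come from the text. */
enum granule_tag {
    TAG_UNALLOCATED = 0x0, /* free memory, including former objects */
    TAG_DATA        = 0x1, /* any other data of a valid/live object */
    TAG_POINTER     = 0x2, /* granule holding exactly one pointer/capability */
    TAG_DESCRIPTOR  = 0x3  /* granule holding exactly one object descriptor */
};

/* True only for a granule that holds a valid pointer or capability. */
static inline bool is_pointer_granule(uint8_t tag) {
    return tag == TAG_POINTER;
}

/* True for any granule that belongs to a live object or its descriptor. */
static inline bool is_live_granule(uint8_t tag) {
    return tag != TAG_UNALLOCATED;
}
```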
  • The method leverages a tag storage memory as a means to distinguish valid pointers stored in memory from other data in memory and unused memory. This is effectively used by the compiler when translating a source code of a suitable programming language (such as C or C++) into ARMv8 assembler or a machine code.
  • a device including a data processor coupled to a data memory.
  • the data processor is configured to verify execution of a program code.
  • the program code when executed on the data processor accesses the data memory.
  • the data processor is configured to allocate a first group of tags to respective granules of the data memory.
  • the data processor is configured to allocate a second group of predetermined tags to respective pointers and capabilities used by the program code when executed.
  • the pointers and capabilities occupy an integer number of granules.
  • the data processor is configured to allocate a third group of tags.
  • a given tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule.
  • the data processor is configured to use a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
  • the advantage of the device is that the device tracks exact object boundaries and provides deterministic spatial memory safety rather than depending on ad-hoc and new ISA extensions.
  • the device provides deterministic spatial safety, and a very highly probable temporal safety as it uses ARMv8.5-MTE.
  • the device radically changes the logic for allocating tags for granules of memory, and changes the representation of pointers so that the tags carry non-fakeable boundaries.
  • the device leverages the tag storage memory as a means for distinguishing valid pointers stored in memory from other data in memory and unused memory. This is effectively used by the compiler when translating a source code of a suitable programming language (C or C++) into ARMv8 assembler or a machine code.
  • a computer program product including a non-transitory computer-readable storage medium having computer-readable instructions stored thereon, the computer-readable instructions being executable by a computerized device comprising processing hardware to execute a method of any one of claims.
  • the method tracks exact object boundaries and provides deterministic spatial memory safety.
  • the method according to the disclosure can provide deterministic spatial safety and very highly probable temporal safety.
  • The method may radically change the logic for allocating tags for granules of memory and change the representation of pointers so that the tags carry non-fakeable boundaries.
  • FIG. 1 is a block diagram of a device for verifying execution of a program code in accordance with an implementation of the disclosure.
  • FIG. 2 is a process flow diagram of a pointer during verification of execution of a program code in accordance with an implementation of the disclosure.
  • FIG. 3 is a process flow diagram of a pointer during verification of execution of a program code in accordance with an implementation of the disclosure.
  • FIG. 4 is a block diagram of a pointer block of a data processor in accordance with an implementation of the disclosure.
  • FIG. 5 is a block diagram of an object descriptor in accordance with an implementation of the disclosure.
  • FIG. 6 is a flow diagram that provides an illustration of steps of a method for verifying execution of a program code on a device in accordance with an implementation of the disclosure.
  • Implementations of the disclosure provide a method for verifying execution of a program code and a device including a data processor coupled to a data memory for verifying the execution of the program code.
  • a process, a method, a system, a product, or a device that includes a series of steps or units is not necessarily limited to expressly listed steps or units but may include other steps or units that are not expressly listed or that are inherent to such process, method, product, or device.
  • FIG. 1 is a block diagram of a device 102 for verifying execution of a program code in accordance with an implementation of the disclosure.
  • The device 102 includes a data processor 104 that is coupled to a data memory 106.
  • The program code, when executed on the data processor 104, accesses the data memory 106.
  • The data processor 104 is configured to allocate a first group of tags to respective granules of the data memory 106.
  • The data processor 104 is configured to allocate a second group of predetermined tags to respective pointers and capabilities used by the program code when executed.
  • The pointers and capabilities occupy an integer number of granules.
  • The data processor 104 is configured to allocate a third group of tags.
  • A given tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule.
  • The data processor 104 is configured, during run time of the program code, to use a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match, to check whether or not a violation of the data memory 106 has occurred.
  • the compiler is configured to generate or add an additional instruction ahead of run-time.
  • executable code is compiled ahead of run-time.
  • added instructions cause the data processor 104 to set and verify memory tags, for example as defined by an ARMv8.5-MTE extension, to achieve advantages pursuant to the disclosure.
  • Advantages are provided by the compiler when translating computer source code into program code.
  • the program code is distinguished, for example when implementing ARMv8-A ISA and ARMv8.5-MTE extensions, by benefits that it provides at run-time.
  • FIG. 2 is a process flow diagram of a pointer 202 during verification of execution of a program code in accordance with an implementation of the disclosure.
  • a tag associated with a given granule may be set to a predetermined tag value when the pointer 202 is written to a given granule.
  • the pointer 202 accesses a data memory 204 using a tag associated with a given granule.
  • the tag is from a first group of tags 206, a second group of predetermined tags 208, or a third group of tags 210.
  • Pointer arithmetic is defined as follows: (i) the result of an addition or a subtraction of the pointer 202 and an integer is a pointer, (ii) the result of a subtraction of two pointers is an integer equal to the result of the subtraction of the two pointers’ target addresses, and (iii) other arithmetic operations involving pointers are not defined.
  • The result of the addition or the subtraction of the pointer 202 and the integer may be defined as follows: (i) the resulting target address is the result of the addition or the subtraction of the pointer’s target address and the integer, and (ii) the resulting descriptor address is the object descriptor’s address.
  • In variants without object descriptors, a 32-bit offset value is adjusted by adding or subtracting the same integral quantity.
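  • The arithmetic rules for the variant with object descriptors can be sketched as follows. The type `fat_ptr` and the functions `fat_ptr_add` and `fat_ptr_diff` are hypothetical names, and the integer is treated as a byte offset for simplicity; scaling by element size, as a C compiler would do, is left out.

```c
#include <stdint.h>

/* Hypothetical 128-bit pointer: target address plus descriptor address. */
typedef struct {
    uint64_t target;     /* address the pointer refers to */
    uint64_t descriptor; /* address of the object descriptor */
} fat_ptr;

/* pointer +/- integer: only the target address moves; the descriptor
 * address is carried over unchanged. */
static inline fat_ptr fat_ptr_add(fat_ptr p, int64_t n) {
    fat_ptr r = { p.target + (uint64_t)n, p.descriptor };
    return r;
}

/* pointer - pointer: the difference of the two target addresses. */
static inline int64_t fat_ptr_diff(fat_ptr a, fat_ptr b) {
    return (int64_t)(a.target - b.target);
}
```

  • Because the descriptor address never changes under arithmetic, a pointer derived from an object keeps referring to that object's boundaries.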
  • Optionally, conversions between the pointer 202 and integer types are not defined.
  • Optionally, conversions between the pointer 202 and integer types are unrestricted: (a) a conversion of the pointer 202 to an integer results in a 128-bit integral value equal to the 128-bit pointer representation, and (b) a conversion of an integer to a pointer results in the pointer 202 whose 128-bit representation equals the integer value.
  • Optionally, conversions between the pointer 202 and integer types track pointer validity. The conversion of a smaller-than-128-bit integer to a pointer 202 may be defined as invalid.
  • The conversion of a 128-bit integer value to a pointer results in (i) the pointer 202 with the corresponding 128-bit representation, if the integer value is known to be the result of converting a valid pointer to an integer and performing only legitimate changes to the integer value thereafter, or (ii) an invalid pointer in any other case.
  • a compiler optionally needs to track the validity of 128-bit integer values as potentially valid pointer representations that include at least one of (a) the conversion of the valid pointer to the integer that is deemed valid, (b) a bit-wise conjunction between a valid pointer value and a value less than 2 to the power 60 that is valid, (c) a bit-wise inclusive disjunction between the valid pointer value and a value no less than 2 to the power 128 minus 2 to the power 60, that is valid, (d) a bit-wise exclusive disjunction between the valid pointer value and a value less than 2 to the power 60 that is valid, (e) an addition of the valid pointer value and a given value that is valid if its highest order 68 bits are the same as that of the valid pointer value, (f) a subtraction of the valid pointer value and the given value that is valid if its highest order 60 bits are the same as that of the valid pointer value, (g) other operations, including multiplication, division, bit shift or logical operands that do
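  • Rule (e) above can be illustrated with a small sketch of validity tracking for addition. The 128-bit value is modelled as two 64-bit halves, so its highest-order 68 bits are the whole high half plus the top 4 bits of the low half. All names (`u128`, `tracked_int`, `tracked_add`) are hypothetical, and the sketch covers only rule (e) as stated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable model of a 128-bit integer as two 64-bit halves. */
typedef struct { uint64_t hi, lo; } u128;

/* A 128-bit integer together with its tracked pointer validity. */
typedef struct { u128 value; bool valid; } tracked_int;

/* 128-bit addition of a 64-bit quantity, with carry into the high half. */
static inline u128 u128_add(u128 a, uint64_t n) {
    u128 r;
    r.lo = a.lo + n;
    r.hi = a.hi + (r.lo < a.lo ? 1u : 0u);
    return r;
}

/* Highest-order 68 bits = all 64 bits of hi plus the top 4 bits of lo. */
static inline bool same_top68(u128 a, u128 b) {
    return a.hi == b.hi && (a.lo >> 60) == (b.lo >> 60);
}

/* The sum stays a valid pointer representation only if the input was
 * valid and the addition left the highest-order 68 bits untouched. */
static inline tracked_int tracked_add(tracked_int p, uint64_t n) {
    tracked_int r;
    r.value = u128_add(p.value, n);
    r.valid = p.valid && same_top68(p.value, r.value);
    return r;
}
```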
  • FIG. 3 is a process flow diagram of a pointer 302 during a verification of execution of a program code in accordance with an implementation of the disclosure.
  • A tag in a first group of tags 306 is set to a corresponding predetermined tag value in an event of the pointer 302 being written to the data memory 304.
  • A memory granule holding a valid pointer or capability carries that same tag value, so that valid pointers or capabilities can be unambiguously distinguished from scalar or other data in memory and from unallocated memory.
  • granules for which a reserved tag value is defined have the reserved tag as the tag value in a tag storage memory.
  • FIG. 4 is a block diagram of a pointer 402 of a data processor in accordance with an implementation of the disclosure.
  • the pointer 402 includes a first portion that conveys a memory address in a data memory and a second portion that conveys an address of an object descriptor.
  • the object descriptor defines boundaries of an object in the data memory.
  • a representation of the pointer 402 is changed.
  • a size and alignment of the pointer 402 may increase up to 128 bits.
  • Multiple variants of the pointer representation are possible. In preferred variants, (i) the low-order 64 bits of the pointer 402 optionally convey the representation of the target address of the pointer 402, as defined in the Virtual Memory System Architecture, and (ii) the high-order 64 bits of the pointer 402 optionally convey the representation of a Virtual Memory System Architecture address of the object descriptor.
  • the pointer 402 is represented as a 128-bit (16 byte) value as (i) a first half (64 bits) that conveys the memory address as normal and (ii) a second half (remaining 64 bits) that conveys an address of the object descriptor.
  • When an object is allocated from the data memory, a 128-bit descriptor is allocated.
  • The 128-bit descriptor conveys the boundaries of the object.
  • When the 128-bit descriptor is invalidated, the boundaries describe an object of size 0 (or less), for example by setting the start and end addresses to the same value.
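  • The 128-bit pointer and the 128-bit descriptor can be sketched as two C structures. The field names, the exclusive end address and the helper functions are assumptions made for illustration; the source only states that the descriptor conveys the boundaries and that invalidation yields an object of size 0.

```c
#include <stdint.h>

/* Hypothetical 128-bit descriptor: object boundaries, 16-byte aligned. */
typedef struct {
    _Alignas(16) uint64_t start; /* first byte of the object */
    uint64_t end;                /* one past the last byte (assumption) */
} object_descriptor;

/* Hypothetical 128-bit pointer: target address plus descriptor address. */
typedef struct {
    _Alignas(16) uint64_t address;
    uint64_t descriptor;
} pointer128;

_Static_assert(sizeof(object_descriptor) == 16, "descriptor is one granule");
_Static_assert(sizeof(pointer128) == 16, "pointer is one granule");

static inline void descriptor_init(object_descriptor *d,
                                   uint64_t start, uint64_t size) {
    d->start = start;
    d->end = start + size;
}

/* Invalidate by making start and end equal: an object of size 0. */
static inline void descriptor_invalidate(object_descriptor *d) {
    d->end = d->start;
}

static inline uint64_t descriptor_size(const object_descriptor *d) {
    return d->end > d->start ? d->end - d->start : 0;
}

static inline pointer128 pointer_make(uint64_t address,
                                      const object_descriptor *d) {
    pointer128 p = { address, (uint64_t)(uintptr_t)d };
    return p;
}
```

  • After `descriptor_invalidate`, every pointer that still refers to the descriptor sees an empty object, so any later bounds check against it fails.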
  • an order of pointer bits is reversed.
  • the high-order 64-bits convey a pair of 32-bit values.
  • the first 32-bit value may represent an offset from a target address to a start address of the object or vice versa.
  • The second 32-bit value may represent the byte size of the object, which limits object sizes to 4 gibibytes minus one byte.
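  • The variant without object descriptors can be sketched as a pointer whose second half packs two 32-bit values. The layout below assumes that the offset is measured from the object start to the target address; the source allows either direction, and every identifier here is hypothetical.

```c
#include <stdint.h>

/* Hypothetical descriptor-free pointer: 64-bit target address plus a
 * pair of 32-bit values carrying the boundaries. */
typedef struct {
    uint64_t target; /* address the pointer refers to */
    uint32_t offset; /* target - object start (assumed direction) */
    uint32_t size;   /* object size in bytes, at most 2^32 - 1 */
} compact_ptr;

/* Boundaries are computed from the pointer representation itself. */
static inline uint64_t compact_start(compact_ptr p) {
    return p.target - p.offset;
}

static inline uint64_t compact_end(compact_ptr p) {
    return compact_start(p) + p.size;
}

/* Arithmetic adjusts the target and the offset by the same quantity,
 * so the computed start address stays fixed. */
static inline compact_ptr compact_add(compact_ptr p, int32_t n) {
    p.target += (uint64_t)(int64_t)n;
    p.offset += (uint32_t)n;
    return p;
}
```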
  • the order of bits in the second alternative is swapped.
  • an object size is used instead of one of the two offsets in the second alternative.
  • FIG. 5 is a block diagram of an object descriptor 504 in accordance with an implementation of the disclosure. Boundaries of the granules may be stored as in-memory objects in the object descriptor 504.
  • the object descriptor 504 is stored in a data memory 502.
  • A new object descriptor may be stored in the data memory 502, and the size and the alignment of the object descriptor 504 are 128 bits.
  • a compiler emits necessary machine code instructions to allocate a space for the object descriptor 504 on the stack.
  • the execution run-time allocates the space for the object descriptor 504.
  • the run-time may initialize the object descriptor 504, and the tag of each object descriptor.
  • the run-time may invalidate the object descriptor 504 by substituting descriptor values with that for a zero sized object.
  • The procedure by which storage space for object descriptors is allocated endeavours to minimize occurrences whereby new object descriptors are allocated in the place of former object descriptors.
  • The object descriptor 504 is allocated within a read-only data section, as the object descriptor 504 is not expected to change while the program image remains loaded in a program memory.
  • a compiler and a link editor define suitable relocations to initialize an address of the object descriptor 504 within any statically initialized pointers.
  • The compiler and the link editor include in the program image file (i) a first table of all statically allocated pointers and (ii) a second table of all statically allocated object descriptors within the corresponding program image.
  • a run-time loader browses the first table and the second table and writes a corresponding tag for the corresponding memory granules before the program is executed.
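  • The loader step can be sketched as a walk over the two tables. The tag store is modelled as a byte array indexed by granule, the tables as arrays of addresses, and the tag values and function names are hypothetical; a real loader would write tags through the hardware rather than into an array.

```c
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 16u

/* Hypothetical reserved tag values. */
enum { TAG_POINTER = 0x2, TAG_DESCRIPTOR = 0x3 };

/* Write one tag value for every granule address listed in a table. */
static void tag_table(uint8_t *tag_store, const uint64_t *table,
                      size_t count, uint8_t tag) {
    for (size_t i = 0; i < count; i++)
        tag_store[table[i] / GRANULE_SIZE] = tag;
}

/* Before the program runs: tag all statically allocated pointers and
 * all statically allocated object descriptors. */
static void loader_apply_tags(uint8_t *tag_store,
                              const uint64_t *pointer_table, size_t n_pointers,
                              const uint64_t *descriptor_table, size_t n_descriptors) {
    tag_table(tag_store, pointer_table, n_pointers, TAG_POINTER);
    tag_table(tag_store, descriptor_table, n_descriptors, TAG_DESCRIPTOR);
}
```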
  • FIG. 6 is a flow diagram that illustrates a method for verifying execution of a program code on a device including a data processor coupled to a data memory in accordance with an implementation of the disclosure.
  • the program code when executed on the data processor accesses the data memory.
  • a first group of tags is allocated to respective granules of the data memory.
  • a second group of predetermined tags is allocated to respective pointers and capabilities used by the program code when executed. The pointers and capabilities occupy an integer number of granules.
  • a third group of tags is allocated. A tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule.
  • a compiler is used to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
  • the compiler is configured to generate or add an additional instruction ahead of run-time.
  • executable code is compiled ahead of run-time.
  • added instructions cause the data processor to set and verify memory tags, for example as defined by an ARMv8.5-MTE extension, to achieve advantages pursuant to the disclosure.
  • Advantages are provided by the compiler when translating computer source code into program code.
  • the program code is distinguished, for example when implementing ARMv8-A ISA and ARMv8.5-MTE extensions, by benefits that it provides at run-time.
  • the method for verifying the execution of the program code tracks exact object boundaries and provides deterministic spatial memory safety rather than depending on ad-hoc and new ISA extensions.
  • the method provides deterministic spatial safety, and a very highly probable temporal safety as it uses ARMv8.5-MTE. Specifically, the method radically changes the logic for allocating tags for granules of memory and changes the representation of pointers so that the tags carry non-fakeable boundaries.
  • The method assigns distinct fixed predetermined tags for (i) pointers or capabilities, which always occupy exactly one granule (16 bytes) of the data memory, (ii) any other data constituent of a valid/live in-memory object, (iii) unallocated data memory, including memory space of former objects that are not yet reallocated, and (iv) a per-object descriptor, which, like a capability, must occupy exactly one granule. Optionally, there is one descriptor per object.
  • the method leverages the tag storage memory as a means for distinguishing valid pointers stored in memory from other data in memory and unused memory. This is effectively used by the compiler when translating a source code of a suitable programming language (C or C++) into ARMv8 assembler or a machine code.
  • When a given pointer is written to a given granule, the method includes setting a tag associated with the given granule to a predetermined tag value.
  • The tag for the underlying memory granule may be set to the corresponding predetermined tag value. A memory granule holding a valid pointer or capability carries that same tag value, so that valid pointers/capabilities can be unambiguously distinguished from scalar or other data in memory and from unallocated memory.
  • the pointer is represented as a 128-bit (16 byte) value as (i) a first half (64 bits) that conveys the memory address as normal and (ii) a second half (remaining 64 bits) that conveys an address of an object descriptor.
  • When allocating the first group of tags, the method includes arranging for the compiler to insert one or more ad-hoc memory tag extension (MTE) instructions for provisioning a tag to a granule in the data memory when the allocation of the memory granule switches between the groups.
  • the compiler writes the corresponding tags to any object allocated on the stack and its object descriptor.
  • the compiler initializes that descriptor, and reclaims the descriptor when the object goes out of scope.
  • the compiler reserves a space for the object descriptor in a data section.
  • the compiler for the programming language in use may need to insert ad-hoc MTE instructions to provide the intended tag to a memory granule whenever the allocation of a memory granule switches from one of the 4 categories above to any other.
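  • The retagging that the compiler arranges on a category switch can be modelled as below. This is a software sketch under assumptions: the tag store is a byte array, the tag values are hypothetical, and `set_tags` stands in for whatever tag-setting instructions (for example the MTE `STG` instruction) the compiler would actually emit.

```c
#include <stdint.h>

#define GRANULE_SIZE 16u

/* Hypothetical tag values for two of the four categories. */
enum { TAG_UNALLOCATED = 0x0, TAG_DATA = 0x1 };

/* Set the tag of every granule overlapped by [addr, addr + size). */
static void set_tags(uint8_t *tag_store, uint64_t addr, uint64_t size,
                     uint8_t tag) {
    uint64_t first = addr / GRANULE_SIZE;
    uint64_t last = (addr + size + GRANULE_SIZE - 1) / GRANULE_SIZE;
    for (uint64_t g = first; g < last; g++)
        tag_store[g] = tag;
}

/* Unallocated -> live data: the granules switch category on allocation. */
static void on_allocate(uint8_t *tag_store, uint64_t addr, uint64_t size) {
    set_tags(tag_store, addr, size, TAG_DATA);
}

/* Live data -> unallocated: the granules switch back on release. */
static void on_free(uint8_t *tag_store, uint64_t addr, uint64_t size) {
    set_tags(tag_store, addr, size, TAG_UNALLOCATED);
}
```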
  • In an event of a pointer being written to the data memory, the method includes setting a tag in the first group of tags to a corresponding predetermined tag value.
  • the pointer includes a first portion that conveys a memory address in the data memory, and a second portion that conveys an address of an object descriptor.
  • the object descriptor defines boundaries in the data memory.
  • the method includes storing the boundaries of the granules as in-memory objects in an object descriptor.
  • When a pointer is dereferenced, the method includes retrieving the boundaries from the object descriptor and comparing the boundaries with accessed memory addresses in the data memory to determine whether or not the data memory violation has occurred during execution of the program code.
  • the method includes (i) using the compiler to retrieve boundaries of granules used by the program code, as represented by the third group of tags, and (ii) using the data processor to check the boundaries of the granules at run-time to raise a run-time exception warning in an event of the boundaries of the granules, as represented by the first and second groups of tags, being violated by the program code.
  • the method includes using the compiler to insert instructions for the data processor to retrieve the boundaries, to check them at run-time, and to raise a run-time exception warning when a boundary violation occurs.
  • the compiler for the programming language may insert necessary instructions for the data processor to retrieve the boundaries, check the boundaries at run-time, and raise a run time exception in case of boundary violation.
  • The method may retrieve the boundaries from the object descriptor and compare the accessed address against the boundaries when the pointer is dereferenced. For example, if S is the start address and E is the end address, an access of X bytes at address A must satisfy (i) A is greater than or equal to S, and (ii) the sum of A and X is smaller than or equal to E.
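  • The two conditions translate directly into a predicate. The sketch below uses the symbols S, E, A and X from the example; the function name `access_in_bounds` is hypothetical, and the comparison is rearranged so that the sum of A and X cannot overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* An access of x bytes at address a is in bounds for an object with
 * start address s and end address e when a >= s and a + x <= e.
 * The second condition is tested as x <= e - a to avoid overflow. */
static inline bool access_in_bounds(uint64_t s, uint64_t e,
                                    uint64_t a, uint64_t x) {
    return a >= s && a <= e && x <= e - a;
}
```

  • For an object spanning 0x1000 to 0x1040, an 8-byte access at 0x1038 passes, while the same access at 0x1039 fails because it would end one byte past E.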
  • The method includes using the compiler to issue machine code such that, when the pointer is dereferenced, the boundaries of the object are loaded from the object descriptor, or computed from the pointer representation in variants without object descriptors.
  • If the boundaries for the same object were already retrieved for a previous dereference in program order, they can be reused.
  • The method includes using the compiler to issue machine code such that the bytes of memory logically accessed by the dereference are validated against the boundaries when the pointer is dereferenced.
  • The object start address may be smaller than or equal to the first or lowest dereferenced address.
  • The object end address may be strictly larger than the last or highest dereferenced address.
  • If either condition is not met, an exception may be raised (for example, in a POSIX environment, the system ‘raise (SIGSEGV)’ may be invoked), and further processing stops. Otherwise, the value may be dereferenced.
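  • The check that the compiler would insert around a dereference can be sketched as follows. The type `bounds` and the functions `deref_allowed` and `checked_load_u32` are hypothetical; the end address is treated as exclusive, matching the "strictly larger" condition above, and `abort()` is added as an assumption so that processing stops even if a SIGSEGV handler returns.

```c
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical boundaries as retrieved from the object descriptor. */
typedef struct {
    uintptr_t start; /* lowest valid address */
    uintptr_t end;   /* one past the highest valid address */
} bounds;

/* start <= first dereferenced address, and end > last dereferenced address. */
static inline bool deref_allowed(bounds b, uintptr_t first, uintptr_t last) {
    return b.start <= first && first <= last && last < b.end;
}

/* Validate all bytes touched by the load, then dereference. */
static inline uint32_t checked_load_u32(const uint32_t *p, bounds b) {
    uintptr_t first = (uintptr_t)p;
    uintptr_t last = first + sizeof *p - 1;
    if (!deref_allowed(b, first, last)) {
        raise(SIGSEGV); /* run-time exception on boundary violation */
        abort();        /* ensure further processing stops */
    }
    return *p;
}
```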
  • The dereference necessarily accesses the entirety of an MTE granule.
  • The tag for the load source granule may be loaded from a tag storage. If the loaded tag value does not match the tag value for pointers, the pointer is not valid. The address of the object descriptor in the pointer representation is invalidated, for example, by setting the low-order 64 bits all to zero. Otherwise, the value is a valid pointer, and no further processing is necessary.
  • the tag for the store destination granule may be set to the tag value reserved for pointer representation if the written value is a valid pointer value or to an unreserved tag value otherwise.
  • the dereferenced value is of an integer type with at least 128 bits.
  • conversions between integer and pointer types may not be allowed. Otherwise, if the access is a store, the same processing must occur as if the type was pointer type. Otherwise, if the access is a load, either conversion is relaxed or conversion is allowed with validity tracking. For the purpose of tracking validity, the tag of the loaded granule may be loaded from the tag store. The integer value is a valid pointer value if, and only if, the loaded granule value equals the tag value reserved for pointer representation.
  • the method includes eliding (namely, omitting, suppressing or altering) a check using the compiler through any reliable static analysis when a given boundary of the object is proved as valid.
  • the method includes eliding the loading of the boundaries if both boundaries are provably valid and the object is provably allocated.
  • the method includes eliding the loading and matching of the tag if a pointer value is being loaded, and its validity may be proven statically.
  • the method defines a new mapping of source code (namely, a new Application Binary Interface (ABI)).
  • the method exhibits the following essential differences.
  • a subset of the 16 possible tag values may be reserved for special use
  • one value is reserved for granules holding a pointer representation
  • another value is reserved for granules holding an object descriptor
  • optionally another value is reserved for unallocated memory.
  • the compiler optionally uses the reserved values for the reservation.
  • a computer program product including a non-transitory computer-readable storage medium having computer-readable instructions stored thereon.
  • the computer-readable instructions are executable by a computerized device that includes processing hardware to execute the above method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

There is provided a device (102) that includes a data processor (104) that is coupled to a data memory (106, 204, 304, 502). The data processor (104) is configured to verify execution of a program code. The data processor is configured to (i) allocate a first group of tags (206, 306) to respective granules of the data memory, (ii) allocate a second group of predetermined tags (208) to respective pointers and capabilities used by the program code when executed, (iii) allocate a third group of tags (210), wherein a given tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule and (iv) during run time of the program code, to use a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.

Description

METHOD AND DEVICE FOR VERIFYING EXECUTION OF A PROGRAM CODE
TECHNICAL FIELD
The disclosure relates to a method for verifying execution of a program code; more particularly, the disclosure relates to a device including a data processor coupled to a data memory for verifying the execution of the program code.
BACKGROUND
Software applications may contain flawed logic or faults. A carefully crafted malicious input or program may exploit faults in a given software application in a manner that causes the given software application to deviate from its intended behaviour. Such a deviation may have potentially dangerous consequences for the given software application's user and a system on which the software application is running. Such faults are referred to as vulnerabilities, for example as software vulnerabilities. Memory-corruption vulnerabilities are an important class of software vulnerabilities that lead to corruption of data in a computer memory accessed by the software application.
For example, software programs written in an unsafe native programming language such as C or C++ allow reading or writing beyond or across the legitimate boundaries of in-memory objects (for example, data). When an out-of-bounds write or read occurs, the content of other objects elsewhere in computer memory, usually adjacent by address, is tampered with or exposed. Such out-of-bounds reading or writing breaches the confidentiality of the data in the computer memory.
In another example, either stray data of the former object, or new data of a new object that is allocated later in a same place, may be present when the reading or writing within the legitimate boundaries of an object that no longer logically exists in program execution sequence occurs. This may lead to tampering or breaching of the confidentiality of data.
Currently, boundary checks are implemented at run-time entirely in the software application for testing and debugging purposes, for example by using a run-time memory access checking system such as the LLVM AddressSanitizer. The AddressSanitizer requires a special compilation of programs, causes considerable degradation of execution speed, and requires prohibitive amounts of memory. In known approaches such as the CHERI architecture from Cambridge University, a mechanism is proposed to enforce spatial safety with changes to a processor and instruction set architecture (ISA) extensions. Instead of traditional memory addresses/pointers, the known approach represents memory references using capabilities, which describe not only the memory address, but also the boundaries and permissions associated with the memory reference, which allows the processor to enforce boundaries and permissions at run-time. The known approach (CHERI architecture) uses extensions that have not currently been implemented in any real product and does not provide any means for temporal safety.
Another known approach is the Memory Tagging Extension introduced by ARM (ARMv8.5-MTE, hereafter MTE). MTE is a form of memory coloring, which associates a 4-bit tag (the “color”) with each 16-byte “granule” of physical memory. When accessing memory, the processor verifies that an expected tag, specified as part of the memory address, matches the actual tag of the accessed memory granule. By assigning a pseudo-random tag to the granules of each in-memory object, it is possible to partition the set of objects into 2 to the power 4 = 16 separate partitions. An out-of-bounds access, without further information about the system state, has only a 1 in 16 chance of carrying the correct tag, and is otherwise rejected by the processor. This known approach provides memory safety, both temporal and spatial, but only in a statistical or non-deterministic fashion. Furthermore, the known approach can ensure that two adjacent objects have distinct tags so that adjacent out-of-bounds accesses fail deterministically. But in the general case, protection is only statistical, with a 15 in 16 chance of detection, which provides a safety of about 94%.
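By way of illustration only, the figure of about 94% can be reproduced with the following arithmetic, under the assumption (not stated in the source) that the tag carried by a stray access and the tag of the accessed granule are independent and uniformly distributed over the 16 possible values:

```latex
\[
P(\text{tags match}) = \frac{1}{2^{4}} = \frac{1}{16},
\qquad
P(\text{violation detected}) = 1 - \frac{1}{16} = \frac{15}{16} = 0.9375 \approx 94\%
\]
```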
Therefore, there arises a need to address the aforementioned technical drawbacks in existing systems or technologies for verifying execution of program code on computing devices.
SUMMARY
It is an object of the disclosure to provide a method for (namely, a method of) verifying execution of a program code, and a device including a data processor coupled to a data memory for verifying execution of the program code while avoiding one or more disadvantages of prior art approaches. This object is achieved by the features of the independent claims. Further implementations are apparent from the dependent claims, the description, and the figures.
The disclosure provides a method for verifying execution of a program code and a device including a data processor coupled to a data memory for verifying the execution of the program code.
According to a first aspect, there is provided a method for verifying execution of a program code on a device. The device includes a data processor that is coupled to a data memory. The program code when executed on the data processor accesses the data memory. The method includes allocating a first group of tags to respective granules of the data memory. The method includes allocating a second group of predetermined tags to respective pointers and capabilities used by the program code when executed. The pointers and capabilities occupy an integer number of granules. The method includes allocating a third group of tags. A tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule. The method includes, during run time of the program code, using a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
The method is of advantage in that it tracks exact object boundaries and provides deterministic spatial memory safety rather than depending on ad-hoc and new ISA extensions. The method provides deterministic spatial safety, and a very highly probable temporal safety as it uses the ARMv8.5 Memory Tagging Extension. Specifically, the method radically changes the logic for allocating tags for granules of memory and changes the representation of pointers so that the tags carry non-fakeable boundaries. The method assigns distinct fixed predetermined tags for (i) pointers or capabilities, which always occupy exactly one granule (16 bytes) of the data memory, (ii) any other data constituent of a valid/live in-memory object, (iii) unallocated data memory including memory space of former objects that are not yet reallocated, and (iv) a per-object descriptor, which like a capability, must occupy exactly one granule. The method leverages a tag storage memory as a means to distinguish valid pointers stored in memory from other data in memory and unused memory. This is effectively used by the compiler when translating a source code of a suitable programming language (such as C or C++) into ARMv8 assembler or a machine code.
According to a second aspect, there is provided a device including a data processor coupled to a data memory. The data processor is configured to verify execution of a program code. The program code when executed on the data processor accesses the data memory. The data processor is configured to allocate a first group of tags to respective granules of the data memory. The data processor is configured to allocate a second group of predetermined tags to respective pointers and capabilities used by the program code when executed. The pointers and capabilities occupy an integer number of granules. The data processor is configured to allocate a third group of tags. A given tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule. During run time of the program code, the data processor is configured to use a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
The advantage of the device is that the device tracks exact object boundaries and provides deterministic spatial memory safety rather than depending on ad-hoc and new ISA extensions. The device provides deterministic spatial safety, and a very highly probable temporal safety as it uses ARMv8.5-MTE. Specifically, the device radically changes the logic for allocating tags for granules of memory, and changes the representation of pointers so that the tags carry non-fakeable boundaries. The device leverages the tag storage memory as a means for distinguishing valid pointers stored in memory from other data in memory and unused memory. This is effectively used by the compiler when translating a source code of a suitable programming language (C or C++) into ARMv8 assembler or a machine code.
According to a third aspect, there is provided a computer program product including a non-transitory computer-readable storage medium having computer-readable instructions stored thereon, the computer-readable instructions being executable by a computerized device comprising processing hardware to execute the method of any one of the claims.
Technical problems in the prior art are resolved, where the technical problems include that reading or writing beyond or across legitimate boundaries of in-memory objects tampers with, and breaches the confidentiality of, the data in the computer memory. Therefore, in contradistinction to the prior art, according to the method and the device for verifying execution of a program code of the disclosure, the method tracks exact object boundaries and provides deterministic spatial memory safety. The method according to the disclosure can provide deterministic spatial safety and very highly probable temporal safety. In addition, the method may radically change the logic for allocating tags for granules of memory and change the representation of pointers so that the tags carry non-fakeable boundaries.
These and other aspects of the disclosure will be apparent from the implementations described below.
BRIEF DESCRIPTION OF DRAWINGS
Implementations of the disclosure will now be described, by way of example only, with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of a device for verifying execution of a program code in accordance with an implementation of the disclosure;
FIG. 2 is a process flow diagram of a pointer during verification of execution of a program code in accordance with an implementation of the disclosure;
FIG. 3 is a process flow diagram of a pointer during verification of execution of a program code in accordance with an implementation of the disclosure;
FIG. 4 is a block diagram of a pointer block of a data processor in accordance with an implementation of the disclosure;
FIG. 5 is a block diagram of an object descriptor in accordance with an implementation of the disclosure; and
FIG. 6 is a flow diagram that provides an illustration of steps of a method for verifying execution of a program code on a device in accordance with an implementation of the disclosure.
DETAILED DESCRIPTION OF THE DRAWINGS
Implementations of the disclosure provide a method for verifying execution of a program code and a device including a data processor coupled to a data memory for verifying the execution of the program code.
To make solutions of the disclosure more comprehensible for a person skilled in the art, the following implementations of the disclosure are described with reference to the accompanying drawings.
Terms such as "a first", "a second", "a third", and "a fourth" (if any) in the summary, claims, and foregoing accompanying drawings of the disclosure are used to distinguish between similar objects and are not necessarily used to describe a specific sequence or order. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the implementations of the disclosure described herein are, for example, capable of being implemented in sequences other than the sequences illustrated or described herein. Furthermore, the terms "include" and "have" and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, a method, a system, a product, or a device that includes a series of steps or units, is not necessarily limited to expressly listed steps or units but may include other steps or units that are not expressly listed or that are inherent to such process, method, product, or device.
FIG. 1 is a block diagram of a device 102 for verifying execution of a program code in accordance with an implementation of the disclosure. The device 102 includes a data processor 104 that is coupled to a data memory 106. The program code when executed on the data processor 104 accesses the data memory 106. The data processor 104 is configured to allocate a first group of tags to respective granules of the data memory 106. The data processor 104 is configured to allocate a second group of predetermined tags to respective pointers and capabilities used by the program code when executed. The pointers and capabilities occupy an integer number of granules. The data processor 104 is configured to allocate a third group of tags. A given tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule. The data processor 104 is configured, during run time of the program code, to use a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
In FIG. 1, the compiler is configured to generate or add an additional instruction ahead of run-time. For example, for computer languages such as C, C++ and similar, executable code is compiled ahead of run-time. At run-time of compiled code, added instructions cause the data processor 104 to set and verify memory tags, for example as defined by an ARMv8.5-MTE extension, to achieve advantages pursuant to the disclosure. Advantages are provided by the compiler when translating computer source code into program code. The program code is distinguished, for example when implementing ARMv8-A ISA and ARMv8.5-MTE extensions, by benefits that it provides at run-time.
FIG. 2 is a process flow diagram of a pointer 202 during verification of execution of a program code in accordance with an implementation of the disclosure. A tag associated with a given granule may be set to a predetermined tag value when the pointer 202 is written to a given granule. Optionally, the pointer 202 accesses a data memory 204 using a tag associated with a given granule. Optionally, the tag is from a first group of tags 206, a second group of predetermined tags 208, or a third group of tags 210.
Optionally, pointer arithmetic is defined as follows: (i) the result of an addition or a subtraction of the pointer 202 and an integer is defined as set out in the next sentence, (ii) the result of a subtraction of two pointers is an integer equal to the result of the subtraction of the two pointers’ target addresses, and (iii) other arithmetic operations involving pointers are not defined. The result of the addition or the subtraction of the pointer 202 and the integer may be defined such that (i) the resulting target address is the result of the addition or the subtraction of the pointer’s target address and the integer, (ii) the resulting descriptor address is the object descriptor’s address, and (iii) in variants without object descriptors, optionally, a 32-bit offset value is adjusted by adding or subtracting the same integral quantity.
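By way of illustration only, the pointer arithmetic described above may be sketched in C as follows. The type `fat_pointer` and the functions `ptr_add` and `ptr_diff` are hypothetical names introduced for this sketch and are not taken from the disclosure; the sketch works in bytes and ignores element-size scaling.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit pointer (names are assumptions). */
typedef struct {
    uint64_t target;     /* address that the pointer refers to */
    uint64_t descriptor; /* address of the object descriptor */
} fat_pointer;

/* Pointer plus (or minus) an integer: only the target address moves;
 * the descriptor address is carried over unchanged. */
static fat_pointer ptr_add(fat_pointer p, int64_t n)
{
    fat_pointer r = { p.target + (uint64_t)n, p.descriptor };
    return r;
}

/* Pointer minus pointer: an integer equal to the difference of the
 * two target addresses. */
static int64_t ptr_diff(fat_pointer a, fat_pointer b)
{
    return (int64_t)(a.target - b.target);
}
```

Because the descriptor address survives the arithmetic, a later dereference can still be checked against the boundaries of the original object.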
Optionally, conversions between the pointer 202 and integer types include a conversion between the pointers and integers that may not be defined.
Optionally, the conversions between the pointer 202 and integer types are unrestricted, such that (a) a conversion of the pointer 202 to an integer results in a 128-bit integral value equal to the 128-bit pointer representation, and (b) a conversion of an integer to a pointer results in the pointer 202 whose 128-bit representation equals the integer value. Optionally, the conversions between the pointer 202 and the integer types track pointer validity. The conversion of a smaller-than-128-bit integer to a pointer 202 may be defined as invalid.
Optionally, the conversion of the 128-bit integer value is (i) the pointer 202 with the corresponding 128-bit representation if the integer value is known to be a result of converting a valid pointer to the integer and performing only legitimate changes to the integer value thereafter or (ii) an invalid pointer in any other case.
A compiler optionally needs to track the validity of 128-bit integer values as potentially valid pointer representations according to rules that include at least one of (a) the conversion of the valid pointer to the integer that is deemed valid, (b) a bit-wise conjunction between a valid pointer value and a value less than 2 to a power 60 that is valid, (c) a bit-wise inclusive disjunction between the valid pointer value and a value no less than 2 to the power 128 minus 2 to the power 60, that is valid, (d) a bit-wise exclusive disjunction between the valid pointer value and a value less than 2 to the power 60 that is valid, (e) an addition of the valid pointer value and a given value that is valid if its highest order 68 bits are the same as that of the valid pointer value, (f) a subtraction of the valid pointer value and the given value that is valid if its highest order 60 bits are the same as that of the valid pointer value, (g) other operations, including multiplication, division, bit shift or logical operands that do not result in valid pointer values, (h) operations between non-valid pointer values that are not valid, (i) operations between two valid pointer values that are not valid, and (j) the above-mentioned rules that assume a first variant representation of pointers. Equivalent rules may be defined for other representation variants, but with much greater complexity due to the different ordering and signification of the pointer representation bits.
FIG. 3 is a process flow diagram of a pointer 302 during a verification of execution of a program code in accordance with an implementation of the disclosure. A tag in a first group of tags 306 is set to a corresponding predetermined tag value in an event of the pointer 302 being written to the data memory 304.
In particular, every memory granule that holds a valid pointer or capability may obtain the same tag value, so that valid pointers or capabilities may be unambiguously distinguished from scalar or other data in memory and from unallocated memory. Optionally, granules for which a reserved tag value is defined have the reserved tag as the tag value in a tag storage memory.
FIG. 4 is a block diagram of a pointer 402 of a data processor in accordance with an implementation of the disclosure. The pointer 402 includes a first portion that conveys a memory address in a data memory and a second portion that conveys an address of an object descriptor. The object descriptor defines boundaries of an object in the data memory.
Optionally, a representation of the pointer 402 is changed. A size and alignment of the pointer 402 may increase up to 128 bits. Multiple variants for a pointer representation may be possible and preferred variants are (i) optionally the low-order 64 bits of the pointer 402 that convey the representation of the target address of the pointer 402, as defined in the Virtual Memory System Architecture, (ii) optionally the high-order 64 bits of the pointer 402 that convey the representation of a Virtual Memory System Architecture address of the object descriptor.
Optionally, the pointer 402 is represented as a 128-bit (16 byte) value as (i) a first half (64 bits) that conveys the memory address as normal and (ii) a second half (remaining 64 bits) that conveys an address of the object descriptor. Optionally, when an object is allocated from the data memory, a 128-bit descriptor is allocated. The 128-bit descriptor conveys the boundaries of the object. Optionally, when the object is freed, the 128-bit descriptor is invalidated, and the boundaries describe an object of size 0 (or less), for example by setting a start address and an end address to the same value. Optionally, as a first alternative, an order of pointer bits is reversed. Optionally, as a second alternative, the high-order 64 bits convey a pair of 32-bit values. The first 32-bit value may represent an offset from a target address to a start address of the object or vice versa. The second 32-bit value may represent a byte size of the object, which limits object sizes to 4 gibibytes minus one byte. Optionally, as a third alternative, the order of bits in the second alternative is swapped. Optionally, as a fourth alternative, an object size is used instead of one of the two offsets in the second alternative.
FIG. 5 is a block diagram of an object descriptor 504 in accordance with an implementation of the disclosure. Boundaries of the granules may be stored as in-memory objects in the object descriptor 504. Optionally, the object descriptor 504 is stored in a data memory 502. Optionally, for each currently allocated memory object, as defined by the language specification, a new object descriptor may be stored in the data memory 502, and a size and an alignment of the object descriptor 504 is 128 bits. Optionally, if the object is allocated “automatically”, namely on the “stack”, a compiler emits necessary machine code instructions to allocate a space for the object descriptor 504 on the stack.
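By way of illustration only, the 128-bit representations described above may be laid out in C as follows. The type and field names are hypothetical, a little-endian target is assumed so that the first member occupies the low-order 64 bits, and the ordering of the two 32-bit fields in the second alternative is an assumption of this sketch.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Preferred variant: target address plus descriptor address. */
typedef struct {
    alignas(16) uint64_t target; /* low-order 64 bits: target address */
    uint64_t descriptor;         /* high-order 64 bits: descriptor address */
} fat_pointer;

/* Second alternative: no descriptor, a pair of 32-bit values instead. */
typedef struct {
    alignas(16) uint64_t target; /* low-order 64 bits: target address */
    uint32_t offset;             /* offset between target and object start */
    uint32_t size;               /* byte size of the object */
} compact_pointer;

/* Per-object descriptor conveying the boundaries of the object. */
typedef struct {
    alignas(16) uint64_t start;  /* first byte of the object */
    uint64_t end;                /* one past the last byte of the object */
} object_descriptor;

/* Each representation must fill exactly one 16-byte granule. */
static_assert(sizeof(fat_pointer) == 16, "pointer must be one granule");
static_assert(sizeof(compact_pointer) == 16, "pointer must be one granule");
static_assert(sizeof(object_descriptor) == 16, "descriptor must be one granule");
static_assert(alignof(fat_pointer) == 16, "pointer must be granule-aligned");
```

The static assertions make the one-granule size and alignment requirement a compile-time property, so that a single tag always covers a whole pointer or descriptor.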
Optionally, if the object is allocated “dynamically”, that is on the heap or through a memory mapping, the execution run-time allocates the space for the object descriptor 504. The run-time may initialize the object descriptor 504, and the tag of each object descriptor. When a dynamically allocated object is freed or destroyed, the run-time may invalidate the object descriptor 504 by substituting descriptor values with that for a zero sized object.
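By way of illustration only, the life cycle of a descriptor for a dynamically allocated object may be sketched in C as follows. The names `descriptor_init`, `descriptor_invalidate` and `descriptor_is_live` are hypothetical, and the writing of the descriptor tag by the run-time is omitted from this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor: boundaries of one object. */
typedef struct {
    uint64_t start; /* first byte of the object */
    uint64_t end;   /* one past the last byte of the object */
} object_descriptor;

/* Called by the run-time when the object is allocated. */
static void descriptor_init(object_descriptor *d, uint64_t start, uint64_t size)
{
    d->start = start;
    d->end = start + size;
}

/* Called by the run-time when the object is freed or destroyed:
 * the descriptor now describes a zero-sized object. */
static void descriptor_invalidate(object_descriptor *d)
{
    d->end = d->start;
}

/* A descriptor describes a live object only while it spans at least one byte. */
static bool descriptor_is_live(const object_descriptor *d)
{
    return d->end > d->start;
}
```

After invalidation, any pointer that still refers to the descriptor fails every non-empty boundary check, which is how a stale pointer to a freed object is caught.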
Optionally, the procedure by which storage space for object descriptors is allocated endeavours to minimize occurrences whereby new object descriptors are allocated in the place of former object descriptors. Optionally, regardless of the access permission for a section in which the object is allocated, the object descriptor 504 is allocated within a read-only data section, as the object descriptor 504 is not expected to change while the program image remains loaded in a program memory.
Optionally, if the object is itself of a pointer type, its initial value is a valid pointer, and a program image is relocatable or position-independent (PIC), a compiler and a link editor define suitable relocations to initialize an address of the object descriptor 504 within any statically initialized pointers.
Optionally, the compiler and the link editor include in the program image file (i) a first table of all statically allocated pointers and (ii) a second table of all statically allocated object descriptors of the corresponding program image. Optionally, a run-time loader walks the first table and the second table and writes a corresponding tag for the corresponding memory granules before the program is executed.
FIG. 6 is a flow diagram that illustrates a method for verifying execution of a program code on a device including a data processor coupled to a data memory in accordance with an implementation of the disclosure. The program code when executed on the data processor accesses the data memory. At a step 602, a first group of tags is allocated to respective granules of the data memory. At a step 604, a second group of predetermined tags is allocated to respective pointers and capabilities used by the program code when executed. The pointers and capabilities occupy an integer number of granules. At a step 606, a third group of tags is allocated. A tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule. At a step 608, during run time of the program code, a compiler is used to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
In FIG. 6, the compiler is configured to generate or add an additional instruction ahead of run-time. For example, for computer languages such as C, C++ and similar, executable code is compiled ahead of run-time. At run-time of compiled code, added instructions cause the data processor to set and verify memory tags, for example as defined by an ARMv8.5-MTE extension, to achieve advantages pursuant to the disclosure. Advantages are provided by the compiler when translating computer source code into program code. The program code is distinguished, for example when implementing ARMv8-A ISA and ARMv8.5-MTE extensions, by benefits that it provides at run-time.
The method for verifying the execution of the program code tracks exact object boundaries and provides deterministic spatial memory safety rather than depending on ad-hoc and new ISA extensions. The method provides deterministic spatial safety, and a very highly probable temporal safety as it uses ARMv8.5-MTE. Specifically, the method radically changes the logic for allocating tags for granules of memory and changes the representation of pointers so that the tags carry non-fakeable boundaries. The method assigns distinct fixed predetermined tags for (i) pointers or capabilities, which always occupy exactly one granule (16 bytes) of the data memory, (ii) any other data constituent of a valid/live in-memory object, (iii) unallocated data memory including memory space of former objects that are yet not reallocated, (iv) a per-object descriptor, which like a capability, must occupy exactly one granule. Optionally, there is one descriptor per object.
The method leverages the tag storage memory as a means for distinguishing valid pointers stored in memory from other data in memory and unused memory. This is effectively used by the compiler when translating a source code of a suitable programming language (C or C++) into ARMv8 assembler or a machine code.
Optionally, when a given pointer is written to a given granule, the method includes setting a tag associated with the given granule to a predetermined tag value.
In particular, when a pointer is written to the data memory, the tag for the underlying memory granule may be set to the corresponding predetermined tag value. Every memory granule that holds a valid pointer or capability may get that same tag value, so that valid pointers/capabilities may be unambiguously distinguished from scalar or other data in memory, and from unallocated memory. Optionally, the pointer is represented as a 128-bit (16 byte) value as (i) a first half (64 bits) that conveys the memory address as normal and (ii) a second half (remaining 64 bits) that conveys an address of an object descriptor.
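By way of illustration only, the tagging of a granule on a pointer store, together with the validity check on a pointer load, may be simulated in plain C as follows. The tag store is modelled as an ordinary array rather than MTE hardware, and the tag values `TAG_DATA` and `TAG_POINTER` as well as all function names are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE_SIZE  16u
#define GRANULE_COUNT 8u
#define TAG_DATA      0x1u /* hypothetical unreserved tag value */
#define TAG_POINTER   0xEu /* hypothetical tag reserved for pointers */

typedef struct { uint64_t target; uint64_t descriptor; } fat_pointer;

static unsigned char memory[GRANULE_COUNT * GRANULE_SIZE];
static uint8_t tag_store[GRANULE_COUNT]; /* one tag per granule */

/* Storing a pointer sets the granule tag to the reserved pointer tag. */
static void store_pointer(size_t granule, fat_pointer p)
{
    assert(granule < GRANULE_COUNT);
    memcpy(&memory[granule * GRANULE_SIZE], &p, sizeof p);
    tag_store[granule] = TAG_POINTER;
}

/* Storing any other data sets an unreserved tag on the granule. */
static void store_data(size_t granule, const void *bytes, size_t n)
{
    assert(granule < GRANULE_COUNT && n <= GRANULE_SIZE);
    memcpy(&memory[granule * GRANULE_SIZE], bytes, n);
    tag_store[granule] = TAG_DATA;
}

/* Loading a pointer: if the granule does not carry the pointer tag,
 * the descriptor address is zeroed, which marks the pointer invalid. */
static fat_pointer load_pointer(size_t granule)
{
    fat_pointer p;
    assert(granule < GRANULE_COUNT);
    memcpy(&p, &memory[granule * GRANULE_SIZE], sizeof p);
    if (tag_store[granule] != TAG_POINTER) {
        p.descriptor = 0;
    }
    return p;
}
```

In this model, bytes that merely look like a pointer but were written as ordinary data cannot be loaded back as a valid pointer, because the granule no longer carries the reserved tag.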
Optionally, when allocating the first group of tags, the method includes arranging for the compiler to insert one or more ad-hoc memory tag extension, MTE, instructions for provisioning a tag to a granule in data memory, when an allocation of the memory granule switches between the groups.
Optionally, the compiler writes the corresponding tags to any object allocated on the stack and its object descriptor. Optionally, the compiler initializes that descriptor, and reclaims the descriptor when the object goes out of scope. Optionally, if the object is allocated “statically”, namely in global or stack data sections of the program image, the compiler reserves a space for the object descriptor in a data section.
When using MTE, the compiler for the programming language in use may need to insert ad-hoc MTE instructions to provide the intended tag to a memory granule whenever the allocation of a memory granule switches from one of the four categories described above to any other.
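By way of illustration only, the re-tagging of granules on a category switch may be simulated in plain C as follows. The numeric tag values, the size of the simulated memory and the function names are assumptions of this sketch; on real hardware the array write would correspond to an MTE tag-store instruction inserted by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE  16u
#define GRANULE_COUNT 64u

/* Four categories of granule; the numeric values are hypothetical. */
enum granule_tag {
    TAG_UNALLOCATED = 0x0, /* unused memory, including freed objects */
    TAG_DATA        = 0x1, /* ordinary data of a live object */
    TAG_POINTER     = 0xE, /* granule holding a pointer or capability */
    TAG_DESCRIPTOR  = 0xF  /* granule holding an object descriptor */
};

/* Simulated tag storage; zero-initialized, so everything starts unallocated. */
static uint8_t tag_store[GRANULE_COUNT];

/* Re-tag every granule touched by [offset, offset + length). */
static void set_tags(size_t offset, size_t length, enum granule_tag tag)
{
    assert(offset % GRANULE_SIZE == 0);
    assert(offset + length <= GRANULE_COUNT * GRANULE_SIZE);
    size_t first = offset / GRANULE_SIZE;
    size_t last = (offset + length + GRANULE_SIZE - 1) / GRANULE_SIZE;
    for (size_t g = first; g < last; g++) {
        tag_store[g] = (uint8_t)tag;
    }
}

/* Read back the tag of the granule that contains the given byte offset. */
static enum granule_tag get_tag(size_t offset)
{
    assert(offset < GRANULE_COUNT * GRANULE_SIZE);
    return (enum granule_tag)tag_store[offset / GRANULE_SIZE];
}
```

For example, allocating a 40-byte object at offset 32 would call `set_tags(32, 40, TAG_DATA)`, and freeing it would call `set_tags(32, 40, TAG_UNALLOCATED)`.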
Optionally, in an event of a pointer being written to the data memory, the method includes setting a tag in the first group of tags to a corresponding predetermined tag value. Optionally, the pointer includes a first portion that conveys a memory address in the data memory, and a second portion that conveys an address of an object descriptor. The object descriptor defines boundaries in the data memory. Optionally, the method includes storing the boundaries of the granules as in-memory objects in an object descriptor. Optionally, when a pointer is dereferenced, the method includes retrieving the boundaries from the object descriptor and comparing the boundaries with accessed memory addresses in the data memory to determine whether or not the data memory violation has occurred during execution of the program code.
Optionally, the method includes (i) using the compiler to retrieve boundaries of granules used by the program code, as represented by the third group of tags, and (ii) using the data processor to check the boundaries of the granules at run-time to raise a run-time exception warning in an event of the boundaries of the granules, as represented by the first and second groups of tags, being violated by the program code. Optionally, the method includes using the compiler to insert instructions for the data processor to retrieve the boundaries, to check them at run-time, and to raise a run-time exception warning when a boundary violation occurs.
The compiler for the programming language may insert the necessary instructions for the data processor to retrieve the boundaries, check the boundaries at run-time, and raise a run-time exception in case of a boundary violation.
The method may retrieve the boundaries from the object descriptor, and compare the accessed address against the boundaries when the pointer is dereferenced. For example, if S is the start address and E is the end address, an access of X bytes at address A must satisfy (i) A is greater than or equal to S, and (ii) the sum of A and X is smaller than or equal to E.
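The two conditions can be written as a small C predicate. This is a minimal sketch, not the instruction sequence a compiler would emit. The function name `access_in_bounds` is hypothetical, and condition (ii) is rearranged as X <= E - A, an assumption made here so that the addition cannot wrap around in 64-bit arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when an access of x bytes at address a stays inside the
 * object whose start address is s and whose end address is e.
 */
static bool access_in_bounds(uint64_t s, uint64_t e, uint64_t a, uint64_t x)
{
    if (a < s)
        return false;  /* condition (i) fails: A is below S */
    if (a > e)
        return false;  /* A is already past E, so E - A would wrap */
    return x <= e - a; /* condition (ii): A + X <= E, overflow-free */
}
```

For a 16-byte object starting at 0x1000, an 8-byte access at 0x1008 passes, while a 16-byte access at the same address fails condition (ii).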
Optionally, the method includes using the compiler to issue machine code such that, when the pointer is dereferenced, the boundaries of the object are loaded from the object descriptor, or computed from the pointer representation in variants without object descriptors. Optionally, as an exception, if the boundaries for the same object have already been retrieved for a previous dereference in program order, those can be reused.
Optionally, the method includes using the compiler to issue machine code such that the bytes of memory logically accessed by the dereference are validated against the boundaries when the pointer is dereferenced. Optionally, the object start address may be smaller than or equal to the first or lowest dereferenced address. Optionally, the object end address may be strictly larger than the last or highest dereferenced address. Optionally, if either boundary check of the object fails, an exception may be raised (for example, in a POSIX environment, the system ‘raise (SIGSEGV)’ may be invoked), and further processing stops. Otherwise, the value may be dereferenced.
Optionally, if the dereferenced value has a pointer type, the dereference may necessarily access the entirety of an MTE granule. Optionally, if the access is a load, the tag for the load source granule may be loaded from a tag storage. If the loaded tag value does not match the tag value for pointers, the pointer is not valid. The address of the object descriptor in the pointer representation is invalidated, for example, by setting the low-order 64 bits all to zero. Otherwise, the value is a valid pointer, and no further processing is necessary. If the access is a store, then the tag for the store destination granule may be set to the tag value reserved for pointer representation if the written value is a valid pointer value or to an unreserved tag value otherwise.
Optionally, if the dereferenced value is of an integer type with at least 128 bits, conversions between integer and pointer types may not be allowed. Otherwise, if the access is a store, the same processing must occur as if the type was a pointer type. Otherwise, if the access is a load, either conversion is relaxed or conversion is allowed with validity tracking. For the purpose of tracking validity, the tag of the loaded granule may be loaded from the tag store. The integer value is a valid pointer value if, and only if, the loaded tag value equals the tag value reserved for pointer representation.
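The load and store rules for pointer-typed values can be modelled in plain C. The sketch below is a software simulation only: it keeps the tag storage in an ordinary array instead of using MTE instructions. The tag values `TAG_POINTER` and `TAG_DATA`, the array sizes, and the names `store_ptr` and `load_ptr` are all assumptions made for illustration. The descriptor address is modelled as a separate `desc` field, which is zeroed to invalidate the pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GRANULE_SIZE 16u /* MTE granule size in bytes */
#define NUM_GRANULES 8u  /* size of the simulated data memory (assumed) */

/* Assumed tag values; the text only says that one value is reserved. */
enum { TAG_DATA = 0x0, TAG_POINTER = 0xA };

struct fat_ptr {
    uint64_t addr; /* memory address */
    uint64_t desc; /* address of the object descriptor */
};

static unsigned char data_mem[NUM_GRANULES * GRANULE_SIZE];
static uint8_t tag_store[NUM_GRANULES]; /* one 4-bit tag per granule */

/* Store: write the value, then tag the destination granule accordingly. */
static void store_ptr(unsigned granule, struct fat_ptr p, int is_valid)
{
    memcpy(&data_mem[granule * GRANULE_SIZE], &p, sizeof p);
    tag_store[granule] = is_valid ? TAG_POINTER : TAG_DATA;
}

/* Load: read the value; on a tag mismatch, invalidate the descriptor. */
static struct fat_ptr load_ptr(unsigned granule)
{
    struct fat_ptr p;
    memcpy(&p, &data_mem[granule * GRANULE_SIZE], sizeof p);
    if (tag_store[granule] != TAG_POINTER)
        p.desc = 0; /* not a valid pointer: drop the descriptor address */
    return p;
}
```

A value stored with the pointer tag round-trips unchanged, whereas the same bit pattern stored as ordinary data comes back with its descriptor address cleared, so a later bounds check has no boundaries to use.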
Optionally, the method includes eliding (namely, omitting, suppressing or altering) a check using the compiler through any reliable static analysis when a given boundary of the object is proved as valid.
Optionally, the method includes eliding the loading of the boundaries if both boundaries are provably valid and the object is provably allocated.
Optionally, the method includes eliding the loading and matching of the tag if a pointer value is being loaded, and its validity may be proven statically.
Optionally, the method defines a new mapping of source code (namely, a new Application Binary Interface (ABI)). Compared to the normal ARMv8 ABI, the method exhibits the following essential differences. For example, a subset of the 16 possible tag values may be reserved for special use. For example, one value is reserved for granules holding a pointer representation, another value is reserved for granules holding an object descriptor, and optionally another value is reserved for unallocated memory. The compiler optionally uses these reserved values when tagging the corresponding granules.
Optionally, a computer program product including a non-transitory computer-readable storage medium having computer-readable instructions stored thereon is provided. The computer-readable instructions are executable by a computerized device including processing hardware to execute the above method.
It should be understood that the arrangement of components illustrated in the figures described is exemplary and that other arrangements may be possible. It should also be understood that the various system components (and means) defined by the claims, described below, and illustrated in the various block diagrams represent components in some systems configured according to the subject matter disclosed herein. For example, one or more of these system components (and means) may be realized, in whole or in part, by at least some of the components illustrated in the arrangements illustrated in the described figures.
In addition, while at least one of these components is implemented at least partially as an electronic hardware component, and therefore constitutes a machine, the other components may be implemented in software that when included in an execution environment constitutes a machine, hardware, or a combination of software and hardware. Although the disclosure and its advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made herein without departing from the scope of the disclosure as defined by the appended claims.

Claims

1. A method for verifying execution of a program code on a device (102) including a data processor (104) coupled to a data memory (106, 204, 304, 502), wherein the program code when executed on the data processor (104) accesses the data memory (106, 204, 304, 502), wherein the method includes:
(i) allocating a first group of tags (206, 306) to respective granules of the data memory (106, 204, 304, 502);
(ii) allocating a second group of predetermined tags (208) to respective pointers and capabilities used by the program code when executed, wherein the pointers and capabilities occupy an integer number of granules;
(iii) allocating a third group of tags (210), wherein a tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule; and
(iv) during run time of the program code, using a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
2. The method of claim 1, wherein, when a given pointer (202, 302, 402) is written to a given granule, the method includes setting a tag associated with the given granule to a predetermined tag value.
3. The method of claim 1, wherein, when allocating the first group of tags (206, 306), the method includes arranging for the compiler to insert one or more ad-hoc memory tag extension, MTE, instructions for provisioning a tag to a granule in data memory (106, 204, 304, 502), when an allocation of the memory granule switches between the groups.
4. The method of claim 3, wherein, in an event of a pointer (202, 302, 402) being written to the data memory (106, 204, 304, 502), the method includes setting a tag in the first group of tags (206, 306) to a corresponding predetermined tag value.
5. The method of claim 3 or 4, wherein a pointer (202, 302, 402) includes a first portion that conveys a memory address in the data memory (106, 204, 304, 502), and a second portion that conveys an address of an object descriptor (504), wherein the object descriptor (504) defines boundaries in the data memory (106, 204, 304, 502).
6. The method of claim 5, wherein the method includes storing the boundaries of the granules as in-memory objects in an object descriptor (504).
7. The method of claim 6, wherein the method includes, when a pointer (202, 302, 402) is dereferenced, retrieving the boundaries from the object descriptor (504) and comparing the boundaries with accessed memory addresses in the data memory (106, 204, 304, 502) to determine whether or not the data memory violation has occurred during execution of the computer code.
8. The method of any one of the preceding claims, wherein the method includes:
(i) using the compiler to retrieve boundaries of granules used by the program code, as represented by the third group of tags (210); and
(ii) using the data processor (104) to check the boundaries of the granules at run-time to raise a run-time exception warning in an event of the boundaries of the granules, as represented by the first and second groups of tags, being violated by the program code.
9. The method of claim 8, wherein the method includes using the compiler to insert instructions for the data processor (104) to retrieve the boundaries, to check them at run time, and to raise a run-time exception warning when a boundary violation occurs.
10. A device (102) including a data processor (104) coupled to a data memory (106, 204, 304, 502), wherein the data processor (104) is configured to verify execution of a program code, wherein the program code when executed on the data processor (104) accesses the data memory (106, 204, 304, 502), wherein the data processor (104) is configured:
(i) to allocate a first group of tags (206, 306) to respective granules of the data memory (106, 204, 304, 502);
(ii) to allocate a second group of predetermined tags (208) to respective pointers and capabilities used by the program code when executed, wherein the pointers and capabilities occupy an integer number of granules;
(iii) to allocate a third group of tags (210), wherein a given tag of the third group corresponds to a descriptor of a corresponding object that occupies a corresponding granule; and
(iv) during run time of the program code, to use a compiler to add instructions for use in checking whether or not the tags of the first, second and third groups mutually match to check whether or not a data memory violation has occurred.
11. A computer program comprising computer-readable instructions which, when executed by a computerized device, cause the computerized device to execute a method of any one of claims 1 to 9.
12. A computer program product comprising a non-transitory computer-readable storage medium having computer-readable instructions stored thereon, the computer-readable instructions being executable by a computerized device comprising processing hardware to execute a method of any one of claims 1 to 9.
PCT/EP2021/059611 2021-04-14 2021-04-14 Method and device for verifying execution of a program code WO2022218517A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2021/059611 WO2022218517A1 (en) 2021-04-14 2021-04-14 Method and device for verifying execution of a program code

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2021/059611 WO2022218517A1 (en) 2021-04-14 2021-04-14 Method and device for verifying execution of a program code

Publications (1)

Publication Number Publication Date
WO2022218517A1 (en) 2022-10-20

Family

ID=75539335

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2021/059611 WO2022218517A1 (en) 2021-04-14 2021-04-14 Method and device for verifying execution of a program code

Country Status (1)

Country Link
WO (1) WO2022218517A1 (en)

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DONGWEI CHEN ET AL: "SMA: Eliminate Memory Spatial Errors via Saturation Memory Access", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 7 February 2020 (2020-02-07), XP081594029 *
KOSTYA SEREBRYANY ET AL: "Memory Tagging and how it improves C/C++ memory safety", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 26 February 2018 (2018-02-26), XP081212932 *
WATSON ROBERT ET AL: "Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 8)", 31 October 2020 (2020-10-31), XP055849025, Retrieved from the Internet <URL:https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-951.pdf> [retrieved on 20211007], DOI: 10.48456/tr-951 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21719112; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21719112; Country of ref document: EP; Kind code of ref document: A1)