CN116340081A - RISCV memory access violation detection method and device based on hardware virtualization - Google Patents



Publication number
CN116340081A
Authority
CN
China
Prior art keywords
memory
memory access
address
riscv
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111579284.0A
Other languages
Chinese (zh)
Inventor
杨轶
苏璞睿
黄桦烽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Software of CAS
Original Assignee
Institute of Software of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Software of CAS
Priority to CN202111579284.0A
Publication of CN116340081A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/301 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is a virtual computing platform, e.g. logically partitioned systems
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G06F11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • G06F11/3037 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a method and a device for detecting RISCV memory access violations based on hardware virtualization. The method comprises the following steps: reverse-engineering the operating system kernel running on RISCV to obtain the process kernel data structure; simulating a RISCV CPU with a hardware simulator; constructing a basic process list of the operating system; acquiring the characteristic information of a new process from the sptbr register and the process kernel data structure; screening the characteristic information against the basic process list to obtain a target process; performing API detection and instruction analysis based on the process information and dynamic runtime information of the target process, and comparing the resulting memory access data with a memory access permission list to obtain a violation detection result. The invention can completely and transparently monitor the entire execution of a program on the RISCV CPU and provides a configurable interface for defining memory access violations. Monitoring does not depend on functions or interfaces provided by the monitored system, which makes process monitoring and violation detection transparent and improves memory access violation detection capability and accuracy.

Description

RISCV memory access violation detection method and device based on hardware virtualization
Technical Field
The invention belongs to the field of computer science and technology, and particularly relates to a method and a device for detecting RISCV memory access violations based on hardware virtualization.
Background
A memory access violation is a memory access whose purpose differs from the intended design, for example writing to read-only memory or dynamically updating a static variable. At present the main way to detect memory access violations is dynamic analysis, but because RISCV is a brand-new hardware platform, little research has been done on dynamic analysis of programs on RISCV. Detection therefore relies mainly on manual analysis or on embedding detection code in the source code. Manual analysis is time-consuming and labor-intensive, and requires the analyst to have a solid technical background. Source-code-based methods do improve the ability to detect memory access violations to some extent, but much current software is released without source code, so these methods are severely limited and are difficult to apply to software distributed in binary form.
Currently, several methods are used for detecting memory access violations on RISCV platforms, such as the following:
1. Debugger-based detection of access violations
An access violation in code is a program error that is difficult to detect and analyze. In most cases it does not crash the program or the system; instead, it causes unexpected changes to the values of certain variables during execution, which makes the problem hard to locate and debug. Currently, the main approach is for a developer to investigate with the gdb debugging tool once a logic error shows up at run time. Because of the nature of access violations, however, this analysis is time-consuming and needs a great deal of manual effort, so the approach is quite limited.
2. Compiler-based detection of access violations
Some work also builds on the compiler's code optimization function: in the code optimization stage, handwritten code for detecting access violations is embedded into the target program, and at run time the RISCV program is analyzed dynamically to detect whether the code contains memory access violations. This improves detection capability to some extent, but most software is released in binary form and its source code cannot be obtained, so detection that depends on source code is quite limited.
In summary, the main drawbacks of current methods for dynamically analyzing programs on RISCV hardware are as follows. An access violation usually does not crash the program or the system but changes some of the program's variables, so it is hard for an analyst to locate and debug. Existing analysis methods based on debugging tools have significant limitations. Although some work uses compiler optimization techniques to insert analysis code into the source code and detects memory access violations through the inserted code, the source code of much software is difficult to obtain, so source-code-based approaches are also quite limited.
Disclosure of Invention
Existing detection of memory access violations in programs on a RISCV CPU depends on manual analysis or on source code, which requires considerable manpower and material resources, is time-consuming, and is quite limited. To address these problems, the invention provides a hardware virtualization-based RISCV memory access violation detection method and device.
The technical content of the invention comprises:
A method for detecting RISCV memory access violations based on hardware virtualization comprises the following steps:
reverse-engineering the operating system kernel running on RISCV to obtain the process kernel data structure;
simulating a RISCV CPU with a hardware simulator, and constructing a basic process list of the operating system;
acquiring the characteristic information of a new process from the sptbr register and the process kernel data structure, and obtaining a target process according to the basic process list and the characteristic information;
performing API detection and instruction analysis based on the process information and dynamic runtime information of the target process, wherein a memory access permission list is obtained through API detection and the memory access data of the target process is obtained through instruction analysis;
and comparing the memory access data with the memory access permission list to obtain an access violation detection result.
Further, the operating system includes: a Linux operating system or a Windows operating system.
Further, the types of hardware simulator include: the Qemu hardware simulator.
Further, the characteristic information of the new process is obtained through the following steps:
1) Monitoring changes of the sptbr register, and treating the appearance of a new address as the appearance of a new process;
2) Taking the physical page pointed to by sptbr as the starting point, searching for the process kernel data structure by its features to obtain the characteristic information of the new process.
Further, the characteristic information includes: module load address, length, thread information, and memory information.
Further, the process information of the target process includes: a process structure address, a page table physical address, a process name, a module structure information list, and a pointer to the current module structure of the process.
Further, the memory access permission list is obtained through the following steps:
1) Intercepting all ecall instructions, and acquiring the address, function name, input/output parameters, and return value of each API call;
2) Based on the API call information, judging whether the function at the API call address is a memory allocation/release/permission operation function:
if yes, updating the existing memory access permission list with the user-configured process name, memory region start address, memory region length, and memory region access permission, and using the updated list as the memory access permission list;
if not, using the existing memory access permission list as the memory access permission list.
Further, the memory access data is obtained by:
1) Intercepting all LOAD instructions and STORE instructions;
2) Based on the LOAD and STORE instructions, obtaining the opcode, operands, registers, memory address, and memory contents of each instruction;
3) Obtaining the memory access data from the address and register read by each LOAD instruction and the address and register written by each STORE instruction.
A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above method when run.
An electronic device comprising a memory and a processor, wherein the memory stores a program for performing the above-described method.
The invention has the following advantages and positive effects:
The invention can completely and transparently monitor the entire execution of a program on the RISCV CPU and provides a configurable interface for defining memory access violations. Monitoring does not depend on functions or interfaces provided by the monitored system, which makes process monitoring and violation detection transparent and effectively improves memory access violation detection capability and accuracy.
Drawings
FIG. 1 is a flow chart of a method for detecting a RISCV memory access violation based on hardware virtualization according to the present invention.
Detailed Description
The present invention will be further described in detail below with reference to specific embodiments and with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent.
The RISCV memory access violation detection method of the invention comprises the following steps:
installing an operating system on the Qemu hardware simulator;
based on the Qemu hardware simulator, using the virtual sptbr register as a clue to distinguish different processes;
based on the Qemu hardware simulator, constructing a virtual process kernel data structure register, analyzing the contents of physical memory, and searching for the process kernel data structure;
based on the Qemu hardware simulator, modifying the decoding engine so that, when an ecall instruction is executed in user mode, it detects whether a memory allocation/release/permission update operation is being performed, and marks the access permissions and access rules of the memory region;
based on the Qemu hardware simulator, providing a user-mode interface through which the user marks the access permissions and access rules of memory regions;
based on the Qemu hardware simulator, modifying the decoding engine to add callback functions before and after LOAD and STORE instructions, which analyze each access against the memory region access rules to detect memory access violations;
and outputting the memory access violation detection result in JSON file format.
Specifically, as shown in fig. 1, the steps involved are described as follows:
1) Manually reverse-engineer the operating system kernel running on RISCV and analyze its kernel data structures, chiefly so that the process kernel data structure can be found in physical memory by mutual verification of multi-level pointers (the operating system's kernel data structures are linked in a doubly linked list, so whether a candidate is a legitimate kernel data structure can be checked by testing whether the two pointer values between neighboring structures point to legitimately associated addresses), then proceed to step 2);
2) On the RISCV CPU simulated by the Qemu hardware simulator, install a Linux operating system, record the basic processes that a typical Linux operating system needs to start, and construct a basic process list; these processes are not monitored in the later analysis. Proceed to step 3);
3) Start the Linux operating system and the target process, construct the process kernel data structure register, and proceed to step 4);
4) Monitor changes of the sptbr register in the system. When a new address appears, a new process is considered to have appeared. Then, taking the physical page pointed to by sptbr as the starting point, search for the process kernel data structure by its features to obtain the characteristic information of the current process, including the module load address, length, thread information, memory information, and so on. Based on the characteristic information, judge whether the process belongs to the basic process list; if so, ignore it. If not, record the process information, which includes: a process structure address, a page table physical address, a process name, a module structure information list, and a pointer to the current module structure of the process, and proceed to step 5);
5) For the target process, modify the decoding engine by adding API detection and instruction analysis code to Qemu's decoding mechanism, so that when Qemu actually executes, dynamic runtime information is extracted in addition to the process information. Proceed to step 6);
6) For the target process, intercept all ecall instructions, obtain the address, function name, input/output parameters, and return value of each API call, and judge whether the function at the call address is a memory allocation/release/permission operation function: if yes, update the memory access permission list and proceed to step 7); if not, proceed to step 8);
7) The user enters command lines through the provided interface to configure information such as the process name, the memory region start address, the memory region length, and the memory region access permission, thereby adding, deleting, modifying, and querying memory access rules. Proceed to step 8);
8) For the target process, intercept all LOAD and STORE instructions, obtain information such as the opcode, operands, registers, memory address, and memory contents of each instruction, compare the address and register read by each LOAD instruction and the address written by each STORE instruction with the memory access permission list, and judge whether a memory access violation exists; if so, output a violation detection result. Proceed to step 9);
9) Judge whether the target process has exited; if so, output the dynamic information as a JSON file; if not, return to step 4).
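The JSON output mentioned in step 9) could look roughly like the sketch below. The patent does not specify a schema, so the field names (`process`, `violation_count`, `violations`) and the function name `render_report` are purely hypothetical.

```python
import json

def render_report(process_name, violations):
    """Serialize detected violations to a JSON string.

    `violations` is a list of dicts with at least "kind", "addr" and "size".
    All field names here are hypothetical; the patent only states that the
    result is written in JSON file format.
    """
    report = {
        "process": process_name,
        "violation_count": len(violations),
        # Addresses are rendered as hex strings for readability.
        "violations": [{**v, "addr": hex(v["addr"])} for v in violations],
    }
    return json.dumps(report, indent=2, sort_keys=True)

if __name__ == "__main__":
    print(render_report("demo", [{"kind": "STORE", "addr": 0x10F8, "size": 8}]))
```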
Further, regarding installing the operating system on the Qemu hardware simulator: Windows does not currently support the RISCV CPU, so only Linux is currently available. However, the principle of monitoring processes on Windows is the same as on Linux, so the invention can also support the Windows operating system.
Further, when the Qemu hardware simulator uses the virtual sptbr register as a clue to distinguish different processes, sptbr holds the physical address of each process's page table. Because different processes use different page tables, the page table information uniquely identifies a process, and process information is recorded in memory in a hash table indexed by page table address.
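A minimal sketch of such a hash table, written in Python rather than inside the simulator, may make the idea concrete. The class and method names (`ProcessTable`, `on_sptbr_write`, `is_target`) are hypothetical, and the sketch assumes the hook receives the page table root as a plain integer. Note that later revisions of the RISC-V privileged specification name this register satp.

```python
from dataclasses import dataclass

@dataclass
class ProcessRecord:
    page_table_root: int      # value observed in sptbr (hypothetical field)
    name: str = "<unknown>"   # filled in later from kernel data structures

class ProcessTable:
    """Hash table of processes keyed by page table root address."""

    def __init__(self, baseline_names):
        self._by_root = {}                    # page table root -> record
        self._baseline = set(baseline_names)  # basic process list

    def on_sptbr_write(self, new_root):
        """Return (record, is_new); a previously unseen root means a new process."""
        is_new = new_root not in self._by_root
        if is_new:
            self._by_root[new_root] = ProcessRecord(new_root)
        return self._by_root[new_root], is_new

    def is_target(self, record):
        """A process is monitored only if it is not in the basic process list."""
        return record.name not in self._baseline
```

A Python `dict` already is a hash table, so the lookup on every register write stays cheap even with many processes.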
Further, based on the Qemu hardware simulator, the virtual kernel data structure register is used as a clue to traverse the linked list in physical memory, search for the kernel process data structure, and extract process information.
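The pointer mutual-verification idea behind this traversal can be sketched as follows. This is only an illustration under assumptions: physical memory is modeled as a Python dict from address to 64-bit pointer value, and each list node is assumed to store its `next` pointer at offset 0 and its `prev` pointer at offset 8. The offsets and function names are hypothetical, not taken from the patent.

```python
NEXT_OFF = 0  # assumed offset of the "next" pointer inside a node
PREV_OFF = 8  # assumed offset of the "prev" pointer inside a node

def read_ptr(mem, addr):
    """Read a pointer from the modeled memory; None if the address is unmapped."""
    return mem.get(addr)

def is_valid_node(mem, node):
    """A node is plausible if its neighbors point back at it."""
    nxt = read_ptr(mem, node + NEXT_OFF)
    prv = read_ptr(mem, node + PREV_OFF)
    if nxt is None or prv is None:
        return False
    return (read_ptr(mem, nxt + PREV_OFF) == node
            and read_ptr(mem, prv + NEXT_OFF) == node)

def walk_list(mem, head, limit=4096):
    """Follow "next" pointers from head; return the nodes if the ring closes.

    Returns an empty list if any node fails verification or the list does
    not return to head within `limit` nodes.
    """
    nodes = []
    node = head
    while len(nodes) < limit:
        if not is_valid_node(mem, node):
            return []
        nodes.append(node)
        node = read_ptr(mem, node + NEXT_OFF)
        if node == head:
            return nodes
    return []
```

Checking both directions at every node is what lets a scanner reject random memory that merely looks like a pointer.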
Further, the Qemu hardware simulator, through the modified decoding engine, detects when the program executes an ecall instruction whether the target of the instruction is a memory allocation/release/permission-setting function, and records the memory region and its corresponding access permission.
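One way to picture this bookkeeping is the sketch below. It assumes a Linux guest using the usual RISC-V system call convention (call number in a7, arguments in a0 onward, result in a0) and the generic Linux syscall numbers for `munmap`, `mmap`, and `mprotect`; the patent itself does not name specific calls. The class `PermissionList` is hypothetical, and region handling is deliberately simplified to exact start-address matches.

```python
PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4

# Generic Linux syscall numbers as used on riscv64 (assumed guest ABI).
SYS_MUNMAP, SYS_MMAP, SYS_MPROTECT = 215, 222, 226

class PermissionList:
    """Tracks memory regions and their permissions from intercepted syscalls."""

    def __init__(self):
        self.regions = {}  # start address -> (length, prot bits)

    def on_syscall_return(self, number, args, ret):
        """Update the list after an ecall returns; True if the list changed.

        `ret` is the signed return value; negative values mean failure.
        """
        if ret < 0:
            return False
        if number == SYS_MMAP:
            # mmap(addr, length, prot, ...) returns the mapped start address.
            self.regions[ret] = (args[1], args[2])
            return True
        if number == SYS_MPROTECT:
            # mprotect(addr, length, prot); simplified: keyed by start only.
            self.regions[args[0]] = (args[1], args[2])
            return True
        if number == SYS_MUNMAP:
            # munmap(addr, length); simplified: drop an exactly matching start.
            return self.regions.pop(args[0], None) is not None
        return False
```

A fuller version would split and merge overlapping ranges, which this sketch leaves out to keep the control flow visible.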
Further, a user interface is added to the Qemu hardware simulator so that the user can define the access permissions of the target process's memory regions by entering commands.
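The patent does not give the command syntax, so the following sketch invents one purely for illustration: `add`, `mod`, `del`, and `query` commands that take a process name, a start address, a length, and a permission string made of `r`, `w`, and `x`. The function name `apply_command` and the rule layout are likewise hypothetical.

```python
def apply_command(rules, line):
    """Apply one command line to `rules`, a dict keyed by (process, start).

    Hypothetical syntax:
      add   <process> <start> <length> <perms>
      mod   <process> <start> <length> <perms>
      del   <process> <start>
      query <process>
    Returns the matching rules for "query", otherwise None.
    """
    parts = line.split()
    if not parts:
        raise ValueError("empty command")
    op, args = parts[0], parts[1:]
    if op in ("add", "mod"):
        proc, start, length, perms = args
        if not set(perms) <= set("rwx"):
            raise ValueError(f"bad permission string: {perms}")
        # int(text, 0) accepts both decimal and 0x-prefixed input.
        rules[(proc, int(start, 0))] = (int(length, 0), perms)
        return None
    if op == "del":
        proc, start = args
        rules.pop((proc, int(start, 0)), None)
        return None
    if op == "query":
        (proc,) = args
        return sorted((s, l, p) for (n, s), (l, p) in rules.items() if n == proc)
    raise ValueError(f"unknown command: {op}")
```

For example, `apply_command(rules, "add demo 0x1000 0x100 r")` would mark a 256-byte region of the process `demo` as read-only.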
Further, based on the Qemu hardware simulator, the decoding engine is modified to add callback functions before and after LOAD and STORE instructions; the callbacks analyze the memory address and length that each instruction reads or writes, and judge against the predefined memory access permissions whether the operation is a violation.
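The check such a callback performs can be sketched as a range-overlap test. The representation is an assumption: regions are `(start, length, perms)` tuples with `perms` a string of `r`/`w`/`x`, and the function name `check_access` is hypothetical. The sketch only flags accesses that touch a listed region without the required permission; how accesses to unlisted memory should be treated is not stated in the patent and is left out.

```python
def check_access(regions, kind, addr, size):
    """Return a violation record, or None if the access is allowed.

    kind is "LOAD" or "STORE"; the accessed bytes are [addr, addr + size).
    """
    need = "r" if kind == "LOAD" else "w"
    for start, length, perms in regions:
        # Half-open interval overlap test between the access and the region.
        overlaps = addr < start + length and start < addr + size
        if overlaps and need not in perms:
            return {"kind": kind, "addr": addr, "size": size,
                    "region_start": start, "perms": perms}
    return None
```

Testing overlap rather than just the start address catches accesses that straddle a region boundary, such as an 8-byte store beginning four bytes before a read-only region.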
By modifying a hardware simulator for the RISCV CPU, the invention analyzes registers in the virtual CPU, locates and reads key operating system data structures in physical memory, identifies processes, and intercepts the function calls and executed instructions of a process, thereby detecting memory access violations while the process runs. The invention can completely and transparently monitor the entire execution of a program on the RISCV CPU and provides a configurable interface for defining memory access violations. Monitoring does not depend on functions or interfaces provided by the monitored system, which makes process monitoring and violation detection transparent and effectively improves memory access violation detection capability and accuracy.
Although specific embodiments and the accompanying drawings of the present invention are disclosed for illustrative purposes, to aid in understanding and practicing the invention, those skilled in the art will understand that various alternatives, variations, and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the present invention should not be limited to the preferred embodiments and the disclosure of the drawings; the scope of the invention is defined by the appended claims.

Claims (10)

1. A method for detecting RISCV memory access violations based on hardware virtualization, comprising the following steps:
reverse-engineering the operating system kernel running on RISCV to obtain the process kernel data structure;
simulating a RISCV CPU with a hardware simulator, and constructing a basic process list of the operating system;
acquiring the characteristic information of a new process from the sptbr register and the process kernel data structure, and obtaining a target process according to the basic process list and the characteristic information;
performing API detection and instruction analysis based on the process information and dynamic runtime information of the target process, wherein a memory access permission list is obtained through API detection and the memory access data of the target process is obtained through instruction analysis;
and comparing the memory access data with the memory access permission list to obtain an access violation detection result.
2. The method of claim 1, wherein the operating system comprises: a Linux operating system or a Windows operating system.
3. The method of claim 1, wherein the types of hardware simulator include: the Qemu hardware simulator.
4. The method of claim 1, wherein the characteristic information of the new process is obtained by:
1) Monitoring changes of the sptbr register, and treating the appearance of a new address as the appearance of a new process;
2) Taking the physical page pointed to by sptbr as the starting point, searching for the process kernel data structure by its features to obtain the characteristic information of the new process.
5. The method of claim 1, wherein the characteristic information comprises: module load address, length, thread information, and memory information.
6. The method of claim 1, wherein the process information of the target process comprises: a process structure address, a page table physical address, a process name, a module structure information list, and a pointer to the current module structure of the process.
7. The method of claim 1, wherein the memory access permission list is obtained by:
1) Intercepting all ecall instructions, and acquiring the address, function name, input/output parameters, and return value of each API call;
2) Based on the API call information, judging whether the function at the API call address is a memory allocation/release/permission operation function:
if yes, updating the existing memory access permission list with the user-configured process name, memory region start address, memory region length, and memory region access permission, and using the updated list as the memory access permission list;
if not, using the existing memory access permission list as the memory access permission list.
8. The method of claim 1, wherein the memory access data is obtained by:
1) Intercepting all LOAD instructions and STORE instructions;
2) Based on the LOAD and STORE instructions, obtaining the opcode, operands, registers, memory address, and memory contents of each instruction;
3) Obtaining the memory access data from the address and register read by each LOAD instruction and the address and register written by each STORE instruction.
9. A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the method of any of claims 1-8 when run.
10. An electronic device comprising a memory, in which a computer program is stored, and a processor arranged to run the computer program to perform the method of any of claims 1-8.
CN202111579284.0A 2021-12-22 2021-12-22 RISCV memory access violation detection method and device based on hardware virtualization Pending CN116340081A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111579284.0A CN116340081A (en) 2021-12-22 2021-12-22 RISCV memory access violation detection method and device based on hardware virtualization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111579284.0A CN116340081A (en) 2021-12-22 2021-12-22 RISCV memory access violation detection method and device based on hardware virtualization

Publications (1)

Publication Number Publication Date
CN116340081A (en) 2023-06-27

Family

ID=86877574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111579284.0A Pending CN116340081A (en) 2021-12-22 2021-12-22 RISCV memory access violation detection method and device based on hardware virtualization

Country Status (1)

Country Link
CN (1) CN116340081A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination