WO2021248315A1 - Unpacking processing method, device, equipment, and storage medium - Google Patents

Unpacking processing method, device, equipment, and storage medium

Info

Publication number
WO2021248315A1
Authority
WO
WIPO (PCT)
Prior art keywords
function
extracted
source
identifier
unpacked
Prior art date
Application number
PCT/CN2020/095133
Other languages
English (en)
French (fr)
Inventor
郭子亮
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司
Priority to PCT/CN2020/095133 (WO2021248315A1)
Priority to CN202080100486.XA (CN115552402A)
Publication of WO2021248315A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements

Definitions

  • This application relates to computer technology, and in particular to a method, device, equipment, and storage medium for unpacking processing.
  • The memory location of the reinforced Dex file is determined, the structure of the reinforced Dex file is identified, all the mapping fields in the structure are traversed and loaded in a loop, and all the mapping fields in the structure are reorganized and repaired to obtain the unpacked Dex source file.
  • The above technical solution mainly solves the problem of discontinuous loading of Dex files in memory and does not specifically address the problem of instruction extraction. As a result, the bytecode of the function bodies (CodeItem) in the unpacked Dex source file contains a large number of empty bytecodes; that is, the complete Dex source file cannot be obtained after the unpacking operation.
  • The embodiments of the present application are intended to provide an unpacking processing method, device, equipment, and storage medium that restore all the source functions to obtain a complete Dex source file.
  • an embodiment of the present application provides a shelling processing method, which includes:
  • the source function corresponding to the at least one extracted function is backfilled into the executable file to be unpacked to obtain the unpacked executable source file.
  • an embodiment of the present application provides a shelling processing device, which includes:
  • the obtaining part is configured to obtain the executable file to be unpacked of the target application
  • the acquiring part is further configured to acquire the function identifier of at least one extracted function in the executable file to be unpacked;
  • the acquiring part is further configured to acquire the source function corresponding to the at least one extracted function from a preset storage space based on the function identifier of the at least one extracted function;
  • the backfilling part is configured to backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
  • An unpacking processing device is provided, including a processor and a memory configured to store a computer program that can run on the processor, wherein the processor is configured to execute the steps of the aforementioned method when running the computer program.
  • a computer-readable storage medium is provided, and a computer program is stored thereon, wherein the computer program implements the steps of the foregoing method when the computer program is executed by a processor.
  • The technical solution of the embodiments of the present application acquires the function identifier of at least one extracted function, downloads from the preset storage space the pre-reinforcement source function corresponding to each extracted function, and backfills the source function at the location of the extracted function in the executable file to be unpacked, so that the executable source file can be completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes in the function bodies of the unpacked Dex source file.
  • FIG. 1 is a schematic diagram of the unpacking process that restores the Dex source file of a packed Android application.
  • FIG. 2 is a schematic diagram of the first process of the unpacking processing method in an embodiment of the application.
  • FIG. 3 is a schematic diagram of the second process of the unpacking processing method in an embodiment of the application.
  • FIG. 4 is a schematic diagram of detecting a reinforced executable file in an embodiment of the application.
  • FIG. 5 is a schematic diagram of the third process of the unpacking processing method in an embodiment of the application.
  • FIG. 6 is a schematic diagram of the fourth process of the unpacking processing method in an embodiment of the application.
  • FIG. 7 is a schematic diagram of the structure of the unpacking processing device in an embodiment of the application.
  • FIG. 8 is a schematic diagram of the structure of the unpacking processing equipment in an embodiment of the application.
  • FIG. 9 is a schematic structural block diagram of a chip in an embodiment of the application.
  • FIG. 1 is a schematic diagram of the unpacking process that restores the Dex source file of a packed Android application. As shown in FIG. 1, the process is as follows:
  • Step 101 Locate the Dex file in memory
  • DexFileHeader includes: type_ids_off, type_ids_size, string_ids_off, string_ids_size, proto_ids_off, proto_ids_size, field_ids_off, field_ids_size, method_ids_off, method_ids_size, class_defs_off.
  • Step 102 Traverse and load all the mapping fields in a loop
  • Memory block 1 stores the mapping tables, mainly: the strings mapping table, types mapping table, protos mapping table, methods mapping table, class defs mapping table, and fields mapping table;
  • Memory block 2 stores the structure, mainly: type_ids_off, string_ids_off, proto_ids_off, field_ids_off, method_ids_off, class_defs_off.
  • Step 103 Reorganize and repair the mappings and fields in the structure to obtain the source Dex file
  • The source Dex file includes at least a structure segment, a mapping table segment, and a data segment.
  • The technical solution shown in FIG. 1 reorganizes and restores the Dex file according to the mapping information of the fields in the structure after the Dex file has been located in memory. This solves the problem of discontinuous loading of the Dex file in memory but not the problem of instructions being extracted, which leaves a large number of empty bytecodes in the CodeItem of the functions in the unpacked Dex source file. Even if the extracted instructions of the few functions that have already been executed can be restored, most function bodies remain extracted. Therefore, the complete Dex source file cannot be obtained after the unpacking operation.
  • This application proposes an unpacking processing method that can completely restore Dex files.
  • The specific implementation is as follows:
  • FIG. 2 is a schematic diagram of the first process of the unpacking processing method in an embodiment of the application.
  • The unpacking processing method may specifically include:
  • Step 201 Obtain the executable file to be unpacked of the target application
  • The executable file to be unpacked is obtained indirectly, based on the address of the target application's Dex file in memory. In other words, the Dex file must be obtained before the executable file to be unpacked.
  • The Dex file here is the file produced by the reinforcement operation. A Dex file is the executable file of the Android system and contains all the operating instructions and runtime data of the application.
  • Usually, the address of the Dex file in memory is located first and the Dex file of the target application is obtained from that address; the structure contained in the Dex file is then parsed to obtain the executable file to be unpacked.
  • The address of the Dex file is obtained after the Android application package (APK) starts running, by dynamically debugging the system functions related to the structure in the Dex file, that is, by setting a breakpoint to obtain the address of the Dex file.
  • Step 202 Obtain the function identifier of at least one extracted function in the executable file to be unpacked;
  • An extracted function is a function whose instructions were extracted (in other words, hidden or encrypted) during the reinforcement operation; the CodeItem corresponding to an extracted function is empty bytecode.
  • The function identifier indicates the corresponding function; for example, the function identifier can be a function name.
  • Specifically, the method includes: obtaining at least one function and its function identifier from the executable file to be unpacked; determining that each empty function among the at least one function is an extracted function; and obtaining the function identifier of the at least one extracted function.
  • The executable file to be unpacked includes at least one function and its function identifier. The at least one function includes functions that were extracted during the reinforcement operation and functions that were not; each function corresponds to one function identifier.
  • The CodeItem corresponding to each of the at least one function therefore needs to be checked. If the CodeItem of the current function is empty bytecode, the function is treated as an extracted function and its function identifier is obtained. Note that an extracted function remains in the extracted state.
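  • The check described above can be sketched in Python as follows. The `Method` record, the helper names, and the rule that an "empty" CodeItem is one that is missing or consists only of zero bytes are assumptions made for illustration; the application does not specify how emptiness is encoded.

```python
from dataclasses import dataclass


@dataclass
class Method:
    identifier: str   # function identifier, for example the function name
    code_item: bytes  # bytecode of the function body (CodeItem)


def is_empty(code_item: bytes) -> bool:
    """Assumption: an extracted body is missing or contains only zero bytes."""
    return not code_item or not any(code_item)


def extracted_identifiers(methods):
    """Return the identifiers of all functions whose CodeItem is empty."""
    return [m.identifier for m in methods if is_empty(m.code_item)]


methods = [
    Method("LoginActivity.onCreate", b""),
    Method("Utils.add", b"\x90\x00\x01\x02\x0f\x00"),
    Method("Crypto.decrypt", b"\x00\x00\x00\x00"),
]
extracted = extracted_identifiers(methods)
```

  • Here `extracted` holds the identifiers of the first and third functions, which are the inputs to step 203.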
  • Step 203 Obtain the source function corresponding to the at least one extracted function from a preset storage space based on the function identifier of the at least one extracted function;
  • The preset storage space is used to store the source function corresponding to each function identifier.
  • The preset storage space may be local storage or cloud storage.
  • A table that stores the mapping between function identifiers and source functions is established in advance, locally or in the cloud, and each source function is stored in it under its function identifier. This also makes it convenient to store further function identifiers and source functions later, and to query or download the source function corresponding to a function identifier.
  • For each extracted function, the corresponding source function is downloaded from the preset storage space based on its function identifier. Each time the source function corresponding to an extracted function is obtained, it is stored in a function mapping table; once the source functions of all the extracted functions have been stored, a complete function mapping table is obtained.
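  • A minimal sketch of this lookup is given below. The preset storage space is modeled as a plain Python dict from function identifier to source bytecode; a real local or cloud store, and the names `build_function_map` and `preset_storage`, are assumptions for illustration only.

```python
def build_function_map(extracted_ids, storage):
    """Collect the source function for each extracted function identifier.

    Returns the function mapping table and the identifiers that could not
    be found in the storage.
    """
    function_map = {}
    missing = []
    for ident in extracted_ids:
        source = storage.get(ident)
        if source:
            function_map[ident] = source
        else:
            missing.append(ident)
    return function_map, missing


# Hypothetical storage contents; the byte strings are placeholders.
preset_storage = {
    "LoginActivity.onCreate": b"\x6f\x20\x01\x00\x0e\x00",
    "Crypto.decrypt": b"\x12\x00\x0f\x00",
}
function_map, missing = build_function_map(
    ["LoginActivity.onCreate", "Crypto.decrypt", "Other.run"], preset_storage
)
```

  • Reporting the identifiers that are missing, rather than silently skipping them, makes it visible when the function mapping table is not yet complete.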
  • Step 204 Backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
  • A target function identifier is obtained from the function identifiers of the at least one extracted function; based on the target function identifier, the corresponding target source function is obtained from the function mapping table; and the target source function is backfilled at the original location, in the executable file to be unpacked, of the extracted function that the target function identifier refers to.
  • After the extracted functions corresponding to all the function identifiers of the at least one extracted function have been backfilled, the unpacked executable source file is obtained.
  • Steps 201 to 204 may be executed by the processor of the unpacking processing device.
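  • The backfilling of step 204 can be sketched as writing each source function back at the byte range of its emptied body. The `locations` table (identifier to offset and length) and the requirement that the source function has exactly the length of the emptied slot are simplifying assumptions of this sketch, not statements from the application.

```python
def backfill(image: bytes, locations: dict, function_map: dict) -> bytearray:
    """Write each source function back at the original location of its body.

    locations maps a function identifier to the (offset, length) of the
    emptied body inside the image.
    """
    out = bytearray(image)
    for ident, (offset, length) in locations.items():
        source = function_map[ident]
        if len(source) != length:
            raise ValueError(f"size mismatch for {ident}")
        out[offset:offset + length] = source
    return out


# Toy image: two emptied bodies surrounded by unrelated data.
image = b"HEAD" + b"\x00" * 4 + b"MID" + b"\x00" * 2 + b"END"
locations = {"A.run": (4, 4), "B.stop": (11, 2)}
function_map = {"A.run": b"\x12\x34\x56\x78", "B.stop": b"\x0e\x00"}
restored = backfill(image, locations, function_map)
```

  • The function returns a new buffer and leaves the input untouched, so the incomplete image remains available for comparison.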
  • The technical solution of the embodiments of the present application acquires the function identifier of at least one extracted function, downloads from the preset storage space the pre-reinforcement source function corresponding to each extracted function, and backfills the source function at the location of the extracted function in the executable file to be unpacked, so that the executable source file can be completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes in the function bodies of the unpacked Dex source file.
  • The embodiments of the present application also provide another unpacking processing method.
  • The unpacking processing method may specifically include:
  • Step 301 Obtain the executable file to be unpacked of the target application
  • The executable file to be unpacked is obtained indirectly, based on the address of the target application's Dex file in memory. In other words, the Dex file must be obtained before the executable file to be unpacked.
  • The Dex file here is the file produced by the reinforcement operation. A Dex file is the executable file of the Android system and contains all the operating instructions and runtime data of the application.
  • Step 302 Obtain the function identifier of at least one extracted function in the executable file to be unpacked;
  • An extracted function is a function whose instructions were extracted (in other words, hidden or encrypted) during the reinforcement operation; the CodeItem corresponding to an extracted function is empty bytecode.
  • The function identifier indicates the corresponding function; for example, the function identifier can be a function name.
  • Specifically, the method includes: obtaining at least one function and its function identifier from the executable file to be unpacked; determining that each empty function among the at least one function is an extracted function; and obtaining the function identifier of the at least one extracted function.
  • The executable file to be unpacked includes at least one function and its function identifier. The at least one function includes functions that were extracted during the reinforcement operation and functions that were not; each function corresponds to one function identifier.
  • The CodeItem corresponding to each of the at least one function therefore needs to be checked. If the CodeItem of the current function is empty bytecode, the function is treated as an extracted function and its function identifier is obtained. Note that an extracted function remains in the extracted state.
  • Step 303 Based on the function identifier of the at least one extracted function, run the modified calling function that comes with the system, and obtain the source function corresponding to the at least one extracted function from a preset storage space;
  • First, the function mapping relationship table is initialized with the function identifiers of the at least one extracted function; at this point the CodeItem of each extracted function in the table is empty bytecode. Second, the function identifier of each extracted function whose bytecode is empty is obtained from the table and the preset calling function is executed once for it, so that a non-empty source function is obtained from the preset storage space. Finally, the obtained source function is used to update the empty entry of that extracted function in the function mapping relationship table.
  • The function mapping relationship table is established; when the source function corresponding to at least one extracted function is obtained, it is stored directly in that table. The table then holds the mapping relationship between the function identifier of each extracted function and its source function.
  • The preset calling function mentioned above may be the system's built-in calling function after modification.
  • Specifically, this includes: adding a backup function to the system's built-in calling function to obtain the first modified calling function, wherein the backup function, when running, is used to obtain the source function corresponding to the at least one extracted function.
  • The above steps directly modify the source code of the system's built-in calling function so that it obtains the source function corresponding to each extracted function, constructs the function mapping relationship table, and stores the source function of each extracted function in that table.
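  • The following Python sketch models this second process. `ToyRuntime` is a stand-in for the modified system and is not ART code; the class, its methods, and the sample byte strings are hypothetical and only show how a backup step placed inside the calling function fills the function mapping relationship table.

```python
class ToyRuntime:
    """Toy stand-in for the system runtime; this is not ART code."""

    def __init__(self, preset_storage):
        self.preset_storage = preset_storage  # identifier -> source function
        self.function_map = {}                # function mapping relationship table

    def backup(self, identifier):
        """Backup function: copy the source function into the table."""
        source = self.preset_storage.get(identifier, b"")
        if source:
            self.function_map[identifier] = source

    def invoke(self, identifier):
        """Modified calling function: the backup step runs on every call."""
        self.backup(identifier)
        # The unmodified call logic of the system would continue here.


def restore_all(runtime, extracted_ids):
    # Initialize the table: every extracted function starts out empty.
    runtime.function_map = {ident: b"" for ident in extracted_ids}
    # Execute the calling function once for each entry that is still empty.
    for ident in extracted_ids:
        if not runtime.function_map[ident]:
            runtime.invoke(ident)
    return runtime.function_map


runtime = ToyRuntime({"A.run": b"\x12\x01\x0f\x01", "B.stop": b"\x0e\x00"})
table = restore_all(runtime, ["A.run", "B.stop", "C.unknown"])
```

  • Entries that cannot be restored stay empty in the table, which keeps the distinction between restored and still-extracted functions explicit.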
  • Step 304 Backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
  • A target function identifier is obtained from the function identifiers of the at least one extracted function; based on the target function identifier, the corresponding target source function is obtained from the function mapping table; and the target source function is backfilled at the original location, in the executable file to be unpacked, of the extracted function that the target function identifier refers to.
  • After the extracted functions corresponding to all the function identifiers of the at least one extracted function have been backfilled, the unpacked executable source file is obtained.
  • An unpacking system is built according to the above unpacking processing method; the unpacking system is then flashed into an emulator to obtain an emulator with the unpacking system; running that emulator performs the unpacking.
  • In this way, an emulator with the unpacking system is obtained and run to unpack the reinforced Dex file.
  • FIG. 4 is a schematic diagram of detecting a reinforced executable file in an embodiment of the application. As shown in FIG. 4, the process is as follows:
  • Step 401 Upload the hardened APK file;
  • Step 402 The emulator with the unpacking system restores the code;
  • Step 403 Perform risk detection and audit on the restored code; if it passes, execute step 404; if it fails, execute step 405;
  • Step 404 Publish the hardened APK to the application market;
  • Step 405 After the APK has been hardened again, execute step 401 again.
  • Step 405 is executed when the executable file restored by the emulator with the unpacking system fails detection and audit: the APK must be hardened again and then checked once more to determine whether it passes detection and audit.
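  • The loop formed by steps 401 to 405 can be sketched as below. The callables `restore`, `audit`, and `reharden`, and the `max_rounds` limit, are hypothetical; the application does not state how many resubmissions are allowed, so the limit is an assumption that keeps the sketch from looping forever.

```python
def review_apk(apk, restore, audit, reharden, max_rounds=3):
    """Run the upload, restore, and audit loop of steps 401 to 405.

    Returns ("published", rounds_used) or ("rejected", max_rounds).
    """
    for round_no in range(1, max_rounds + 1):
        code = restore(apk)               # step 402: restore the code
        if audit(code):                   # step 403: risk detection and audit
            return "published", round_no  # step 404: publish to the market
        apk = reharden(apk)               # step 405, then back to step 401
    return "rejected", max_rounds


# Stub callables for demonstration only.
def restore(apk):
    return apk["code"]


def audit(code):
    return "risky_call" not in code


def reharden(apk):
    # Stands for the developer revising the APK and hardening it again.
    return {"code": apk["code"].replace("risky_call", "safe_call")}


result = review_apk({"code": "init; risky_call; exit"}, restore, audit, reharden)
```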
  • The technical solution of the embodiments of the present application acquires the function identifier of at least one extracted function, downloads from the preset storage space the pre-reinforcement source function corresponding to each extracted function, and backfills the source function at the location of the extracted function in the executable file to be unpacked, so that the executable source file can be completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes in the function bodies of the unpacked Dex source file.
  • The embodiments of the present application also provide another unpacking processing method.
  • The unpacking processing method may specifically include:
  • Step 501 Obtain the executable file to be unpacked of the target application
  • Step 502 Obtain the function identifier of at least one extracted function in the executable file to be unpacked;
  • Step 503 Run a custom function based on the function identifier of the at least one extracted function, and obtain the source function corresponding to the at least one extracted function from a preset storage space;
  • First, the function mapping relationship table is initialized with the function identifiers of the at least one extracted function; at this point the CodeItem of each extracted function in the table is empty bytecode. Second, the function identifier of each extracted function whose bytecode is empty is obtained from the table and the preset calling function is executed once for it, so that a non-empty source function is obtained from the preset storage space. Finally, the obtained source function is used to update the empty entry of that extracted function in the function mapping relationship table.
  • The function mapping relationship table is established; when the source function corresponding to at least one extracted function is obtained, it is stored directly in that table. The table then holds the mapping relationship between the function identifier of each extracted function and its source function.
  • The preset calling function mentioned above may be a custom function.
  • The method further includes: obtaining the system's built-in calling function; adding a backup function to it to obtain the second modified calling function, wherein the backup function, when running, is used to obtain the source function corresponding to the at least one extracted function; using the second modified calling function to create a custom function; and using the custom function as the preset calling function.
  • The at least one extracted function is called through the created custom function, and the source function is obtained by the backup function inside the custom function, thereby achieving the purpose of unpacking.
  • The second modified calling function is obtained by copying the system's built-in calling function and adding a backup function to the copy.
  • In a normal system, calls to Java-layer functions are implemented through the invoke method in the ArtMethod class.
  • This application invokes the at least one extracted function of the Java layer through a custom invoke function.
  • First, a custom invoke function (the custom function) is created.
  • Second, executing the custom invoke function, which in turn executes the modified invoke function of the ArtMethod class, obtains the source function corresponding to at least one extracted function and constructs the function mapping relationship table, storing each source function at the corresponding position in the table. Once the source functions of all the extracted functions have been obtained, the complete function mapping relationship table is obtained.
  • Executing the preset calling function includes: executing the custom function through a hook method.
  • A hook is a technique that changes the execution flow of a program.
  • With a hook, execution can jump directly from the system's own calling function to the custom function.
  • The hook method hijacks the invoke method in the ArtMethod class of the system.
  • As a result, the custom invoke method is executed instead of the invoke method that comes with the system.
  • The hook technology is implemented under the Xposed or Frida runtime framework.
  • Note that in this case the system functions of the emulator are not changed when it runs.
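  • The idea of redirecting a call to a custom function at run time can be illustrated with a Python analogy. This is not Xposed or Frida code and does not touch ART; `ArtMethodModel`, `install_hook`, and `remove_hook` are hypothetical names. The sketch swaps a class's `invoke` method for a custom one that performs the backup step and then calls the original.

```python
class ArtMethodModel:
    """Toy model of a runtime method object; not the real ART class."""

    def __init__(self, name, body):
        self.name = name
        self.body = body

    def invoke(self):
        return f"ran {self.name}"


dumped = {}  # function identifier -> dumped source function


def install_hook(cls):
    """Replace cls.invoke with a custom invoke; return the original."""
    original = cls.invoke

    def custom_invoke(self):
        dumped[self.name] = self.body  # backup step added by the hook
        return original(self)          # then continue with the original call

    cls.invoke = custom_invoke
    return original


def remove_hook(cls, original):
    """Restore the original invoke so the class itself is left unchanged."""
    cls.invoke = original


method = ArtMethodModel("A.run", b"\x12\x01\x0f\x01")
original_invoke = install_hook(ArtMethodModel)
hooked_result = method.invoke()
remove_hook(ArtMethodModel, original_invoke)
```

  • Because the hook is installed and removed at run time, the class definition is never edited, which mirrors the point that the system functions of the emulator are not changed.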
  • Step 504 Backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
  • A target function identifier is obtained from the function identifiers of the at least one extracted function; based on the target function identifier, the corresponding target source function is obtained from the function mapping table; and the target source function is backfilled at the original location, in the executable file to be unpacked, of the extracted function that the target function identifier refers to.
  • After the extracted functions corresponding to all the function identifiers of the at least one extracted function have been backfilled, the unpacked executable source file is obtained.
  • The technical solution of the embodiments of the present application acquires the function identifier of at least one extracted function, downloads from the preset storage space the pre-reinforcement source function corresponding to each extracted function, and backfills the source function at the location of the extracted function in the executable file to be unpacked, so that the executable source file can be completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes in the function bodies of the unpacked Dex source file.
  • FIG. 6 is a schematic diagram of the fourth process of the unpacking processing method in an embodiment of this application.
  • Instruction-extraction reinforcement protects the method bodies of functions in the Dex file by extracting their instructions.
  • The Dex file is an executable file of the Android system, which contains all the operating instructions and runtime data of the application.
  • This application actively calls Java-layer functions. When a function is called, its extracted instructions are dumped, and the call then returns directly without executing them. The dumped instructions are backfilled into the incomplete Dex file to obtain the complete Dex source file.
  • A Java method is represented by ArtMethod in the ART virtual machine.
  • After the virtual machine environment is created during zygote startup, the Java environment is entered through AndroidRuntime::start(), which calls CallStaticVoidMethod() and, further down, the invoke method of ArtMethod; every Java method is executed through the invoke method. Therefore, this application hooks the invoke function to invoke each extracted function and dump the extracted instructions.
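  • The "dump, then return without executing" behavior of an active call can be sketched as follows. `ToyMethod`, the `active_call` flag, and the assumption that the instructions are already available on the method object when it is invoked are all illustrative; they are not details given in the application.

```python
class ToyMethod:
    """Toy model of a Java-layer method; not the real ArtMethod."""

    def __init__(self, name, instructions):
        self.name = name
        self.instructions = instructions  # assumed to be available at call time


executed = []  # names of methods whose bodies actually ran
dumped = {}    # name -> dumped instructions


def invoke(method, active_call=False):
    if active_call:
        # Active call made by the unpacker: dump and return directly.
        dumped[method.name] = method.instructions
        return None
    # Ordinary call made by the application: run the body as usual.
    executed.append(method.name)
    return f"result of {method.name}"


all_methods = [
    ToyMethod("A.run", b"\x12\x01\x0f\x01"),
    ToyMethod("B.stop", b"\x0e\x00"),
]
for m in all_methods:
    invoke(m, active_call=True)
```

  • Returning before the body runs means every function can be visited once without triggering application side effects, which is why functions that the application never calls on its own can also be restored.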
  • Step 601 Parse the Dex file in memory
  • The Dex file here is the file produced by the reinforcement operation. A Dex file is the executable file of the Android system and contains all the operating instructions and runtime data of the application.
  • Usually, the address of the Dex file in memory is located first and the Dex file of the target application is obtained from that address; the structure contained in the Dex file is then parsed to obtain the executable file to be unpacked.
  • The address of the Dex file is obtained after the APK starts running, by dynamically debugging the system functions related to the structure in the Dex file, that is, by setting a breakpoint to obtain the address of the Dex file.
  • Step 602 Identify the extracted functions and create the function mapping relationship table list
  • The parsed Dex file is the executable file to be unpacked.
  • An extracted function is a function whose instructions were extracted (in other words, hidden or encrypted) during the reinforcement operation; the CodeItem corresponding to an extracted function is empty bytecode.
  • The executable file to be unpacked contains at least one function and its function identifier. The CodeItem of each function therefore needs to be checked here: a function whose CodeItem is empty is defined as an extracted function, so that at least one extracted function and its corresponding function identifier are obtained.
  • The at least one extracted function and the function identifier corresponding to each extracted function are used to create the function mapping relationship table list. The function identifier indicates the corresponding function; for example, the function identifier may be a function name.
  • The extracted functions mentioned below are extracted functions that have not yet been restored, that is, functions still in the extracted state.
  • Step 603 Create a custom invoke function that calls the invoke function of the ArtMethod class to which the dump function has been added; the dump function is used to download the source function corresponding to each extracted function and to build the function mapping table;
  • The source function here is the pre-reinforcement function downloaded from the preset storage space.
  • The custom invoke function is executed to call and dump the at least one extracted function of the Java layer.
  • Step 604 Backfill the source function corresponding to the extracted function to the location of the extracted function;
  • A target function identifier is obtained from the function identifiers of the at least one extracted function; the source function corresponding to the target function identifier is then obtained from the reconstructed function mapping table list1 or the updated function mapping table list; and the source function is backfilled to the location of the corresponding extracted function.
  • Step 605 Obtain the complete Dex source file.
  • Once all the extracted functions have been backfilled, the complete Dex source file is obtained.
  • This application can provide reference and assistance for Android projects, for example: assisting the analysis of competing Android reinforcement solutions; serving as a reference for testing Android reinforcement solutions; and assisting the risk assessment of developer applications in the company's application market.
  • The technical solution of the embodiments of the present application acquires the function identifier of at least one extracted function, downloads from the preset storage space the pre-reinforcement source function corresponding to each extracted function, and backfills the source function at the location of the extracted function in the executable file to be unpacked, so that the executable source file can be completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes in the function bodies of the unpacked Dex source file.
  • the embodiment of the present application also provides an unpacking processing apparatus. As shown in FIG. 7, the apparatus includes:
  • the acquiring part 701 is configured to obtain the executable file to be unpacked of the target application;
  • the acquiring part 701 is further configured to acquire the function identifier of at least one extracted function in the executable file to be unpacked;
  • the acquiring part 701 is further configured to acquire the source function corresponding to the at least one extracted function from a preset storage space based on the function identifier of the at least one extracted function;
  • the backfill part 702 is configured to backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
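The division of work between the acquiring part 701 and the backfill part 702 can be sketched as two small classes. This is only an illustrative model under stated assumptions: the executable is modelled as a dictionary from function identifier to body bytes, an extracted function is modelled as an empty body, and the class and method names (`AcquisitionPart`, `BackfillPart`, `unpack`) are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import Dict, List

EMPTY = b""  # assumption: an extracted function is modelled as an empty body


@dataclass
class AcquisitionPart:
    """Hypothetical stand-in for the acquiring part 701."""
    storage: Dict[str, bytes]  # hypothetical preset storage space: identifier -> source body

    def extracted_ids(self, executable: Dict[str, bytes]) -> List[str]:
        # Collect the identifiers of functions whose body is empty.
        return [fid for fid, body in executable.items() if body == EMPTY]

    def source_functions(self, ids: List[str]) -> Dict[str, bytes]:
        # Look up the source function for each extracted identifier.
        return {fid: self.storage[fid] for fid in ids}


class BackfillPart:
    """Hypothetical stand-in for the backfill part 702."""

    def backfill(self, executable: Dict[str, bytes], sources: Dict[str, bytes]) -> Dict[str, bytes]:
        restored = dict(executable)  # leave the input untouched
        restored.update(sources)
        return restored


def unpack(executable: Dict[str, bytes], storage: Dict[str, bytes]) -> Dict[str, bytes]:
    acquiring = AcquisitionPart(storage)
    ids = acquiring.extracted_ids(executable)
    return BackfillPart().backfill(executable, acquiring.source_functions(ids))
```

Keeping acquisition and backfilling separate mirrors the two parts named in the text; a real apparatus would operate on Dex structures in memory rather than on a dictionary.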
  • in some embodiments, the acquiring part 701 is further configured to obtain at least one function and its function identifier from the executable file to be unpacked; determine that the empty functions among the at least one function are the at least one extracted function; and obtain the function identifier of the at least one extracted function.
  • in some embodiments, the acquiring part 701 is further configured to initialize a function mapping table using the obtained function identifier of the at least one extracted function, wherein the function mapping table includes the mapping between the function identifier of the at least one extracted function and the at least one extracted function; execute the preset calling function based on the function identifier of the at least one extracted function in the function mapping table, to obtain the source function corresponding to the at least one extracted function from the preset storage space; and store the source function corresponding to the at least one extracted function in the function mapping table.
  • in some embodiments, the acquiring part 701 is further configured to execute a preset calling function based on the function identifier of the at least one extracted function, to obtain the source function corresponding to the at least one extracted function from the preset storage space; and construct a function mapping table from the function identifier of the at least one extracted function and the corresponding source function.
  • in some embodiments, a backup function is added to the built-in calling function of the system to obtain a first modified calling function, wherein the backup function, when run, is used to obtain the source function corresponding to the at least one extracted function; and the first modified calling function is used as the preset calling function.
  • in some embodiments, the built-in calling function of the system is obtained; a backup function is added to the built-in calling function to obtain a second modified calling function, wherein the backup function, when run, is used to obtain the source function corresponding to the at least one extracted function; a custom function is created using the second modified calling function; and the custom function is used as the preset calling function.
  • in some embodiments, the custom function is executed by means of a hook.
  • in the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.
  • the embodiment of the present application also provides an unpacking processing device.
  • as shown in FIG. 8, the device includes: a processor 801 and a memory 802 configured to store a computer program that can run on the processor;
  • the processor 801 is configured to execute the method steps of the foregoing embodiments when running the computer program.
  • in practice, as shown in FIG. 8, the components of the device are coupled together through a bus system 803, which implements connection and communication between these components.
  • in addition to a data bus, the bus system 803 also includes a power bus, a control bus, and a status signal bus.
  • for clarity of description, however, the various buses are all marked as the bus system 803 in FIG. 8.
  • in practice, the above-mentioned processor may be at least one of an application-specific integrated circuit (ASIC), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a controller, a microcontroller, and a microprocessor. It is understandable that, for different devices, other electronic components may be used to implement the above-mentioned processor functions, which is not specifically limited in the embodiments of the present application.
  • the above-mentioned memory may be a volatile memory (volatile memory), such as a random access memory (RAM, Random-Access Memory); or a non-volatile memory (non-volatile memory), such as a read-only memory (ROM, Read-Only Memory), flash memory (flash memory), hard disk (HDD, Hard Disk Drive) or solid state drive (SSD, Solid-State Drive); or a combination of the above types of memory, and provides instructions and data to the processor.
  • the embodiments of the present application also provide a computer-readable storage medium storing computer-executable instructions which, when executed, implement the method steps of the foregoing embodiments.
  • if the above-mentioned device in the embodiments of the present application is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of the embodiments of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, and other media that can store program code. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
  • FIG. 9 is a schematic structural diagram of a chip of an embodiment of the present application.
  • the chip 901 shown in FIG. 9 includes a processor 902, and the processor 902 can call and run a computer program from the memory 904 to implement the method in the embodiment of the present application.
  • the chip 901 may further include a memory 904.
  • the processor 902 can call and run a computer program from the memory 904 to implement the method in the embodiment of the present application.
  • the memory 904 may be a separate device independent of the processor 902, or may be integrated in the processor 902.
  • the chip 901 may further include an input interface 903.
  • the processor 902 can control the input interface 903 to communicate with other devices or chips, and specifically, can obtain information or data sent by other devices or chips.
  • the chip 901 may further include an output interface 905.
  • the processor 902 can control the output interface 905 to communicate with other devices or chips, and specifically, can output information or data to other devices or chips.
  • the chip can be applied to the network device in the embodiment of the present application, and the chip can implement the corresponding process implemented by the network device in each method of the embodiment of the present application.
  • the chip can be applied to the terminal device in the embodiment of the present application, and the chip can implement the corresponding process implemented by the terminal device in each method of the embodiment of the present application.
  • the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
  • an embodiment of the present application further provides a computer storage medium in which a computer program is stored, and the computer program is configured to execute the unpacking processing method of the embodiments of the present application.
  • in the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

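An unpacking processing method, apparatus, device, and storage medium. The method includes: obtaining the executable file to be unpacked of a target application (201); obtaining the function identifier of at least one extracted function in the executable file to be unpacked (202); obtaining, based on the function identifier of the at least one extracted function, the source function corresponding to the at least one extracted function from a preset storage space (203); and backfilling the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file (204). In this way, the corresponding source function is downloaded from the preset storage space by means of the function identifier and backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. The Dex source file can thus be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.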

Description

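Unpacking processing method, apparatus, device, and storage medium
Technical Field
This application relates to computer technology, and in particular to an unpacking processing method, apparatus, device, and storage medium.
Background
With the gradual development of executable file (Dex) reinforcement technology in the field of mobile security, unpacking technology has also been updated. For early reinforcement techniques, a variety of unpacking applications and unpackers emerged; since the advent of instruction-extraction reinforcement, however, fully automatic unpacking has been difficult to achieve.
In the prior art, the memory location of the reinforced Dex file is determined, the structure of the reinforced Dex file is confirmed, all mapped fields in the structure are traversed in a loop and loaded, and all mapped fields in the structure are then reorganized and repaired to obtain the unpacked Dex source file.
However, the above technical solution mainly solves the problem that the Dex file is loaded non-contiguously in memory and does not specifically address the problem of extracted instructions. As a result, a large number of empty bytecodes appear in the bytecode (CodeItem) of the function bodies in the unpacked Dex source file; that is, a complete Dex source file cannot be obtained after the unpacking operation.
Summary
To solve the above technical problem, the embodiments of the present application provide an unpacking processing method, apparatus, device, and storage medium, with the aim of restoring all source functions and obtaining a complete Dex source file.
In a first aspect, an embodiment of the present application provides an unpacking processing method, including:
obtaining the executable file to be unpacked of a target application;
obtaining the function identifier of at least one extracted function in the executable file to be unpacked;
obtaining, based on the function identifier of the at least one extracted function, the source function corresponding to the at least one extracted function from a preset storage space;
backfilling the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
In a second aspect, an embodiment of the present application provides an unpacking processing apparatus, including:
an acquiring part, configured to obtain the executable file to be unpacked of a target application;
the acquiring part being further configured to obtain the function identifier of at least one extracted function in the executable file to be unpacked;
the acquiring part being further configured to obtain, based on the function identifier of the at least one extracted function, the source function corresponding to the at least one extracted function from a preset storage space;
a backfill part, configured to backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
In a third aspect, an unpacking processing device is provided, including a processor and a memory configured to store a computer program capable of running on the processor, wherein the processor is configured to execute the steps of the foregoing method when running the computer program.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the foregoing method.
In the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.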
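Brief Description of the Drawings
FIG. 1 is a schematic diagram of the unpacking process for restoring the Dex source file from a packed Android application;
FIG. 2 is a first schematic flowchart of the unpacking processing method in an embodiment of the present application;
FIG. 3 is a second schematic flowchart of the unpacking processing method in an embodiment of the present application;
FIG. 4 is a schematic diagram of the inspection of a reinforced executable file in an embodiment of the present application;
FIG. 5 is a third schematic flowchart of the unpacking processing method in an embodiment of the present application;
FIG. 6 is a fourth schematic flowchart of the unpacking processing method in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of the unpacking processing apparatus in an embodiment of the present application;
FIG. 8 is a schematic structural diagram of the unpacking processing device in an embodiment of the present application;
FIG. 9 is a schematic structural block diagram of a chip in an embodiment of the present application.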
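Detailed Description
To provide a more thorough understanding of the features and technical content of the embodiments of the present application, the implementation of the embodiments is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present application.
FIG. 1 is a schematic diagram of the unpacking process for restoring the Dex source file from a packed Android application. As shown in FIG. 1, specifically:
Step 101: Locate the Dex file in memory;
Locate the Dex file in memory and obtain the corresponding structure (DexFileHeader), where DexFileHeader includes: type_ids_off, type_ids_size, string_ids_off, string_ids_size, proto_ids_off, proto_ids_size, field_ids_off, field_ids_size, method_ids_off, method_ids_size, class_defs_off, class_defs_size.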
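As an illustration of the header fields listed above, the following sketch reads the size/offset pairs from a Dex header held in a byte buffer. It assumes the standard little-endian Dex header layout, in which the header is 0x70 bytes long and `string_ids_size` begins at offset 0x38; the function name `parse_dex_header` is hypothetical, and locating the buffer in process memory is outside the scope of the sketch.

```python
import struct

# Assumption: standard little-endian DEX header (0x70 bytes); string_ids_size is at 0x38.
_IDS_OFFSET = 0x38
_SECTIONS = ("string_ids", "type_ids", "proto_ids", "field_ids", "method_ids", "class_defs")


def parse_dex_header(data: bytes) -> dict:
    """Return the *_size / *_off pairs of a DEX header as a dictionary."""
    if len(data) < 0x70 or data[:4] != b"dex\n":
        raise ValueError("not a DEX header")
    # Six (size, offset) pairs of unsigned 32-bit integers follow each other.
    values = struct.unpack_from("<12I", data, _IDS_OFFSET)
    header = {}
    for i, name in enumerate(_SECTIONS):
        header[f"{name}_size"] = values[2 * i]
        header[f"{name}_off"] = values[2 * i + 1]
    return header
```

The returned offsets are relative to the start of the Dex file, which is why the prior-art flow has to resolve them against the in-memory base address before the mapped sections can be collected.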
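Step 102: Traverse in a loop and load all mapped fields;
Traverse in a loop and load all mapped fields in the structure; that is, the mapping tables and the structure are stored in different memory blocks. Memory block 1 stores the mapping tables, mainly including the strings, types, protos, methods, class defs, and fields mapping tables; memory block 2 stores the structure, mainly including type_ids_off, string_ids_off, proto_ids_off, field_ids_off, method_ids_off, and class_defs_off.
Step 103: Organize and repair the mappings and fields in the structure to obtain the source Dex file;
The source Dex file includes at least a structure section, a mapping table section, and a data section.
In the technical solution shown in FIG. 1, after the Dex file is located in memory, it is reorganized and restored according to the mapping information of the fields in the structure, which solves the problem that the Dex file is loaded non-contiguously in memory. However, because the problem of extracted instructions is not specifically addressed, a large number of empty bytecodes appear in the CodeItem of functions in the unpacked Dex source file. Even though the extracted instructions of the few functions that have already been executed can be restored, most function bodies remain in the extracted state, so a complete Dex source file cannot be obtained after the unpacking operation.
To this end, the present application proposes an unpacking processing method that can completely restore the Dex file. The specific implementation is as follows:
FIG. 2 is a first schematic flowchart of the unpacking processing method in an embodiment of the present application. As shown in FIG. 2, the unpacking processing method may specifically include:
Step 201: Obtain the executable file to be unpacked of the target application;
It should be noted that the executable file to be unpacked is obtained indirectly, based on the in-memory address of the Dex file of the target application. In other words, the Dex file must be obtained before the executable file to be unpacked. The Dex file here is the file after the reinforcement operation; a Dex file is the executable file of the Android system and contains all operation instructions and runtime data of the application.
In practice, the in-memory address of the Dex file is usually located first, and the Dex file of the target application is obtained based on that address; the structure contained in the Dex file is then parsed to obtain the executable file to be unpacked.
Specifically, the address of the Dex file is obtained after the Android application package (APK) starts running, by applying dynamic debugging, that is, setting breakpoints, to the system functions related to the structure in the Dex file.
Step 202: Obtain the function identifier of at least one extracted function in the executable file to be unpacked;
Here, an extracted function can be understood as a function whose instructions were extracted (equivalently, hidden or encrypted) during the reinforcement operation; the CodeItem corresponding to an extracted function is empty bytecode. A function identifier indicates the corresponding function; for example, the function identifier may be the function name.
In some embodiments, the method specifically includes: obtaining at least one function and its function identifier from the executable file to be unpacked; determining that the empty functions among the at least one function are the at least one extracted function; and obtaining the function identifier of the at least one extracted function.
It should be noted that the executable file to be unpacked includes at least one function and its function identifier; the at least one function includes functions that were extracted during the reinforcement operation and functions that were not; each function corresponds to one function identifier.
Specifically, the CodeItem corresponding to each of the at least one function is checked. If the CodeItem of the current function is empty bytecode, the function is called an extracted function, and its function identifier is obtained. It should be noted that an extracted function is in the extracted state.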
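The CodeItem check described above can be sketched as a simple filter. The sketch assumes that each function is available as a pair of identifier and instruction bytes, and that "empty bytecode" means either zero-length or consisting only of zero bytes (NOPs); both the data model and the name `find_extracted` are illustrative assumptions rather than details given in the application.

```python
from typing import Dict, List


def find_extracted(functions: Dict[str, bytes]) -> List[str]:
    """Return the identifiers of functions whose CodeItem bytecode is empty.

    Assumption: 'empty' means zero-length or all-zero (NOP-filled) instruction bytes.
    """
    return [fid for fid, insns in functions.items() if not any(insns)]
```

For example, a function whose body is `b"\x0e\x00"` (a `return-void` instruction) is treated as intact, while bodies of `b""` or `b"\x00\x00"` are reported as extracted.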
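Step 203: Based on the function identifier of the at least one extracted function, obtain the source function corresponding to the at least one extracted function from the preset storage space;
It should be noted that the preset storage space is used to store the source functions corresponding to function identifiers. The preset storage space may be local storage or cloud storage.
In practice, a table of the mapping between function identifiers and source functions is established in advance, locally or in the cloud, and each source function is stored in it according to its function identifier. This also makes it convenient to store further function identifiers and source functions later, and to query or download the source function corresponding to a function identifier.
Specifically, based on the function identifier of the at least one extracted function obtained from the executable file to be unpacked, the corresponding source function is downloaded from the preset storage space. Each time the source function corresponding to an extracted function is obtained, it is stored in a function mapping table; once the source functions corresponding to all extracted functions have been stored in this table, the complete function mapping table is obtained.
In the subsequent backfill operation, the corresponding source function can be obtained directly from the function mapping table and used to update the extracted function; once the source functions corresponding to all extracted functions have been updated, the Dex file is completely restored.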
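The lookup loop described above can be sketched as follows. The preset storage space is modelled as a plain dictionary, which is an assumption; the application only says it may be local or in the cloud. The name `build_mapping_table` and the separate list of identifiers with no stored source function are likewise illustrative additions.

```python
from typing import Dict, Iterable, List, Tuple


def build_mapping_table(
    extracted_ids: Iterable[str],
    storage: Dict[str, bytes],  # hypothetical preset storage space
) -> Tuple[Dict[str, bytes], List[str]]:
    """Fetch each source function by identifier and collect it in a mapping table."""
    table: Dict[str, bytes] = {}
    missing: List[str] = []
    for fid in extracted_ids:
        source = storage.get(fid)
        if source is None:
            missing.append(fid)   # nothing stored for this identifier
        else:
            table[fid] = source   # store as soon as it is obtained
    return table, missing
```

Reporting the missing identifiers makes it visible when the table is not yet complete, which matters because the restoration is only complete once every extracted function has a source function.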
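Step 204: Backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
In practice, a target function identifier is obtained from the function identifiers of the at least one extracted function; based on the target function identifier, the corresponding target source function is obtained from the function mapping table; and the target source function is backfilled to the original location of the target extracted function corresponding to the target function identifier in the executable file to be unpacked. When the extracted functions corresponding to the function identifiers of the at least one extracted function have all been backfilled, the unpacked executable source file is obtained.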
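Backfilling to the original location can be sketched as an in-place write into a byte buffer. The sketch assumes that the offset and length of each emptied function body are already known and that the source function has exactly that length; the `locations` map and the function name `backfill` are hypothetical.

```python
from typing import Dict, Tuple


def backfill(
    image: bytearray,
    locations: Dict[str, Tuple[int, int]],  # identifier -> (offset, length) of the emptied body
    table: Dict[str, bytes],                # function mapping table: identifier -> source function
) -> None:
    """Write each source function back to the position of its extracted function."""
    for fid, source in table.items():
        offset, length = locations[fid]
        if len(source) != length:
            raise ValueError(f"size mismatch for {fid}")
        image[offset:offset + length] = source
```

The length check is a defensive choice in this sketch: writing a body of a different size would shift every later offset in the file.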
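Here, steps 201 to 204 may be executed by the processor of the unpacking processing apparatus.
In the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.
To implement the method of the embodiments of the present application, and based on the same inventive concept, an embodiment of the present application further provides another unpacking processing method. FIG. 3 is a second schematic flowchart of the unpacking processing method in an embodiment of the present application. As shown in FIG. 3, the unpacking processing method may specifically include:
Step 301: Obtain the executable file to be unpacked of the target application;
It should be noted that the executable file to be unpacked is obtained indirectly, based on the in-memory address of the Dex file of the target application. In other words, the Dex file must be obtained before the executable file to be unpacked. The Dex file here is the file after the reinforcement operation; a Dex file is the executable file of the Android system and contains all operation instructions and runtime data of the application.
Step 302: Obtain the function identifier of at least one extracted function in the executable file to be unpacked;
Here, an extracted function can be understood as a function whose instructions were extracted (equivalently, hidden or encrypted) during the reinforcement operation; the CodeItem corresponding to an extracted function is empty bytecode. A function identifier indicates the corresponding function; for example, the function identifier may be the function name.
In some embodiments, the method specifically includes: obtaining at least one function and its function identifier from the executable file to be unpacked; determining that the empty functions among the at least one function are the at least one extracted function; and obtaining the function identifier of the at least one extracted function.
It should be noted that the executable file to be unpacked includes at least one function and its function identifier; the at least one function includes functions that were extracted during the reinforcement operation and functions that were not; each function corresponds to one function identifier.
Specifically, the CodeItem corresponding to each of the at least one function is checked. If the CodeItem of the current function is empty bytecode, the function is called an extracted function, and its function identifier is obtained. It should be noted that an extracted function is in the extracted state.
Step 303: Based on the function identifier of the at least one extracted function, run the modified built-in calling function of the system to obtain the source function corresponding to the at least one extracted function from the preset storage space;
In practice, first, the function mapping table is initialized; that is, the CodeItem of the extracted function corresponding to each function identifier of the at least one extracted function is empty bytecode. Second, each time the function identifier corresponding to an empty extracted function is obtained from the function mapping table, the preset calling function is executed once, and a non-empty source function is obtained from the preset storage space. Finally, the obtained non-empty source function is used to update the empty extracted function in the function mapping table.
That is, the function mapping table is established after the function identifier of the at least one extracted function is obtained; once the source function corresponding to the at least one extracted function is obtained, it is stored directly in that table, which then holds the mapping between the function identifier of the at least one extracted function and the corresponding source function.
Alternatively, in practice, starting from the first extracted function whose CodeItem is empty bytecode, the preset calling function is executed once according to the corresponding function identifier; when the non-empty source function is obtained from the preset storage space, the function mapping table is constructed and the function identifier and source function are stored in it. The subsequent extracted functions are processed in the same way, which is not repeated here.
That is, the function mapping table is constructed when the source function of the at least one extracted function is obtained.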
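The two ways of obtaining the function mapping table described above differ only in when the table comes into existence. The following sketch contrasts them under the assumption that `fetch` stands for one execution of the preset calling function for a single identifier; both function names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional


def table_initialized_first(ids: Iterable[str], fetch: Callable[[str], bytes]) -> Dict[str, Optional[bytes]]:
    # Variant 1: initialize the table with empty entries, then update each entry.
    table: Dict[str, Optional[bytes]] = {fid: None for fid in ids}
    for fid in table:
        table[fid] = fetch(fid)
    return table


def table_built_lazily(ids: Iterable[str], fetch: Callable[[str], bytes]) -> Dict[str, bytes]:
    # Variant 2: create the table only when the first source function arrives.
    table: Optional[Dict[str, bytes]] = None
    for fid in ids:
        source = fetch(fid)
        if table is None:
            table = {}
        table[fid] = source
    return table or {}
```

Both variants end with the same identifier-to-source-function mapping; the first additionally records which entries are still empty while the retrieval is in progress.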
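The preset calling function mentioned above may be the modified built-in calling function of the system.
In some embodiments, the method specifically includes: adding a backup function to the built-in calling function of the system to obtain a first modified calling function, wherein the backup function, when run, is used to obtain the source function corresponding to the at least one extracted function; and using the first modified calling function as the preset calling function.
It should be noted that the above steps directly modify the source code of the built-in calling function of the system in order to obtain the source function corresponding to the at least one extracted function, construct the function mapping table, and store the source function corresponding to the at least one extracted function in the function mapping table.
Specifically, the built-in calling function is located in the system source code by its function name, and the backup function is added to it. The backup function provides an acquisition capability, obtaining the source function corresponding to the at least one extracted function, and a backup capability, constructing the function mapping table and storing the source function corresponding to the at least one extracted function in it.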
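The idea of adding a backup function to the built-in calling function can be illustrated with a wrapper. This is a language-level analogy only: in the embodiment the change is made in the system source code and compiled into the system image, whereas the sketch below wraps an ordinary Python callable. The names `make_modified_invoke`, `original_invoke`, and `backup` are hypothetical.

```python
from typing import Any, Callable


def make_modified_invoke(
    original_invoke: Callable[..., Any],
    backup: Callable[[str], None],
) -> Callable[..., Any]:
    """Return a calling function that runs the backup step before the original call."""

    def modified_invoke(method_id: str, *args: Any) -> Any:
        backup(method_id)                       # acquisition + backup into the mapping table
        return original_invoke(method_id, *args)  # original behaviour is preserved

    return modified_invoke
```

A caller would pass a `backup` that looks up the source function for `method_id` and records it in the function mapping table, so every invocation leaves a table entry behind.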
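The function mapping table holding the source functions prepares for the subsequent backfill operation.
Step 304: Backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
In practice, a target function identifier is obtained from the function identifiers of the at least one extracted function; based on the target function identifier, the corresponding target source function is obtained from the function mapping table; and the target source function is backfilled to the original location of the target extracted function corresponding to the target function identifier in the executable file to be unpacked. When the extracted functions corresponding to the function identifiers of the at least one extracted function have all been backfilled, the unpacked executable source file is obtained.
In practice, an unpacking system is built according to the above unpacking processing method; the unpacking system is then flashed onto an emulator, yielding an emulator equipped with the unpacking system; unpacking is achieved by running that emulator.
For example, the Invoke function in ArtMethod in the system source code (the built-in calling function of the system) is modified, and the system is then compiled and flashed, finally yielding an emulator equipped with the unpacking system; running this emulator achieves the purpose of unpacking the reinforced Dex file.
Specifically, a reinforced APK file is uploaded to the emulator equipped with the unpacking system. By running the emulator, the source function corresponding to each extracted function is obtained according to its function identifier and stored in the function mapping table; after the source function corresponding to every function identifier has been backfilled based on the function mapping table, the unpacked executable source file is obtained.
The technical solution of the embodiments of the present application can unpack reinforced applications in the OPPO application market, facilitating subsequent risk detection and code auditing. FIG. 4 is a schematic diagram of the inspection of a reinforced executable file in an embodiment of the present application. As shown in FIG. 4, specifically:
Step 401: Upload the reinforced APK file;
A reinforced APK file is uploaded to the emulator equipped with the unpacking system.
Step 402: The unpacking system of the emulator restores the code;
Step 403: Perform risk detection and auditing on the restored code; if it passes, go to step 404; if it does not, go to step 405;
Step 404: Publish the reinforced APK to the application market;
Step 405: Reinforce the APK again, and then perform step 401 again.
Step 405 is performed because the emulator equipped with the unpacking system was able to restore the executable file; that is, the detection and audit were not passed, so the APK must be reinforced again and then checked once more to determine whether it passes the detection and audit.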
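The flow of steps 401 to 405 can be sketched as a loop. The callables `unpack`, `audit`, and `reharden` are placeholders for the emulator's unpacking system, the risk detection and audit, and the reinforcement tool; the `max_rounds` bound is an assumption added so that the sketch terminates, since the flow in FIG. 4 itself has no stated limit.

```python
from typing import Any, Callable, Tuple


def review_apk(
    apk: Any,
    unpack: Callable[[Any], Any],
    audit: Callable[[Any], bool],
    reharden: Callable[[Any], Any],
    max_rounds: int = 3,  # assumption: bound the loop for the sketch
) -> Tuple[Any, int]:
    """Return the APK that passed the audit and the round in which it passed."""
    for round_no in range(1, max_rounds + 1):
        restored = unpack(apk)        # steps 401-402: upload and restore the code
        if audit(restored):           # step 403: risk detection and audit
            return apk, round_no      # step 404: publish to the application market
        apk = reharden(apk)           # step 405: reinforce again, then repeat
    raise RuntimeError("APK did not pass the audit within the allowed rounds")
```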
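In the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.
To implement the method of the embodiments of the present application, and based on the same inventive concept, an embodiment of the present application further provides another unpacking processing method. FIG. 5 is a third schematic flowchart of the unpacking processing method in an embodiment of the present application. As shown in FIG. 5, the unpacking processing method may specifically include:
Step 501: Obtain the executable file to be unpacked of the target application;
Step 502: Obtain the function identifier of at least one extracted function in the executable file to be unpacked;
Step 503: Based on the function identifier of the at least one extracted function, run a custom function to obtain the source function corresponding to the at least one extracted function from the preset storage space;
In practice, first, the function mapping table is initialized; that is, the CodeItem of the extracted function corresponding to each function identifier of the at least one extracted function is empty bytecode. Second, each time the function identifier corresponding to an empty extracted function is obtained from the function mapping table, the preset calling function is executed once, and a non-empty source function is obtained from the preset storage space. Finally, the obtained non-empty source function is used to update the empty extracted function in the function mapping table.
That is, the function mapping table is established after the function identifier of the at least one extracted function is obtained; once the source function corresponding to the at least one extracted function is obtained, it is stored directly in that table, which then holds the mapping between the function identifier of the at least one extracted function and the corresponding source function.
Alternatively, in practice, starting from the first extracted function whose CodeItem is empty bytecode, the preset calling function is executed once according to the corresponding function identifier; when the non-empty source function is obtained from the preset storage space, the function mapping table is constructed and the function identifier and source function are stored in it. The subsequent extracted functions are processed in the same way, which is not repeated here.
That is, the function mapping table is constructed when the source function of the at least one extracted function is obtained.
The preset calling function mentioned above may be a custom function.
In some embodiments, the method further includes: obtaining the built-in calling function of the system; adding a backup function to the built-in calling function to obtain a second modified calling function, wherein the backup function, when run, is used to obtain the source function corresponding to the at least one extracted function; creating a custom function using the second modified calling function; and using the custom function as the preset calling function.
It should be noted that the created custom function is used here to call the at least one extracted function, and the backup function in the custom function obtains the source function, thereby achieving the purpose of unpacking.
Specifically, executing the created custom function in turn executes the second modified calling function inside it, and the backup function in the second modified calling function obtains the source function corresponding to the at least one extracted function and constructs the function mapping table. The second modified calling function is obtained by copying the built-in calling function of the system and adding the backup function to the copy.
For example, in a normal system, Java-layer functions are called through the invoke method of the ArtMethod class. The present application calls the at least one extracted function of the Java layer through a custom invoke function. First, a custom invoke function (the custom function) is created, and the modified invoke function of the ArtMethod class (the second modified calling function) is added to it; a dump function, which obtains the source function corresponding to the at least one extracted function and constructs the function mapping table, is added to the invoke function of the ArtMethod class. Second, executing the custom invoke function in turn executes the modified invoke function of the ArtMethod class, which obtains the source function corresponding to the at least one extracted function, constructs the function mapping table, and stores the source function at the corresponding position in the table. After the source functions of all extracted functions have been obtained, a complete function mapping table results.
In some embodiments, executing the preset calling function includes: executing the custom function by means of a hook.
It should be noted that hooking is a technique for changing the execution flow of a program. Here, when the system would execute its built-in calling function, the hook bypasses the built-in calling function and executes the custom function instead.
For example, the hook hijacks the invoke method of the system's ArtMethod class, so that when the system calls the invoke method, the custom invoke method is executed rather than the system's built-in invoke method.
In practice, the hook is applied under the Xposed or Frida framework. In this case, unlike the previous embodiment, running the emulator does not change the emulator's system functions.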
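The redirection performed by a hook can be illustrated in-process with attribute replacement. This is not Xposed or Frida code and makes no claim about their APIs; `Runtime`, `hook`, and `make_custom_invoke` are hypothetical names, and the dictionary `storage` stands in for the preset storage space.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


class Runtime:
    """Hypothetical runtime whose invoke() stands in for the built-in calling function."""

    def invoke(self, method_id: str) -> str:
        return f"run {method_id}"


@contextmanager
def hook(cls: type, name: str, replacement: Callable) -> Iterator[Callable]:
    """Temporarily redirect cls.name to replacement, restoring it afterwards."""
    original = getattr(cls, name)
    setattr(cls, name, replacement)
    try:
        yield original
    finally:
        setattr(cls, name, original)  # the built-in function itself is never modified


def make_custom_invoke(table: Dict[str, bytes], storage: Dict[str, bytes]) -> Callable:
    def custom_invoke(self: Runtime, method_id: str) -> None:
        table[method_id] = storage[method_id]  # dump step: record the source function
        return None

    return custom_invoke
```

Because the original attribute is restored when the context exits, the runtime behaves as before once the hook is removed, which corresponds to the remark that this variant does not change the emulator's system functions.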
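Step 504: Backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
In practice, a target function identifier is obtained from the function identifiers of the at least one extracted function; based on the target function identifier, the corresponding target source function is obtained from the function mapping table; and the target source function is backfilled to the original location of the target extracted function corresponding to the target function identifier in the executable file to be unpacked. When the extracted functions corresponding to the function identifiers of the at least one extracted function have all been backfilled, the unpacked executable source file is obtained.
That is, each extracted function is run by means of the hook until all corresponding source functions have been obtained based on the function identifiers of the extracted functions and stored in the function mapping table; the source functions are then obtained from the function mapping table and backfilled, yielding the complete Dex source file.
In the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.
On the basis of the above embodiments, the present application provides a specific unpacking processing method. FIG. 6 is a fourth schematic flowchart of the unpacking processing method in an embodiment of the present application.
Instruction-extraction reinforcement refers to protecting the method bodies of functions in the Dex file (the extracted instructions). A Dex file is the executable file of the Android system and contains all operation instructions and runtime data of the application.
The present application actively invokes Java-layer functions; when a function is invoked, its extracted instructions are dumped, the function is not executed, and the call returns directly. The extracted instructions thus obtained are backfilled into the incomplete Dex file, yielding the complete Dex source file.
Specifically, a Java method is represented in the ART virtual machine as an ArtMethod. After zygote starts and creates the virtual machine environment, the Java environment is entered through AndroidRuntime::start(); start() calls the CallStaticVoidMethod() function, which in turn calls the invoke method of ArtMethod. Java methods are all executed through the invoke method. Therefore, the present application hooks the invoke function, executes the invoke of each extracted function, and dumps the extracted instructions.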
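The "invoke, dump, return without executing" behaviour can be modelled with a small stand-in class. Everything here is an assumption made for illustration: `FakeMethod` is not the real ArtMethod, the `dump_only` flag and `sink` parameter are hypothetical, and the body becoming available at the moment of invocation is modelled by a stored `_restore` value rather than by any real reinforcement shell.

```python
from typing import Dict, Iterable, Optional


class FakeMethod:
    """Hypothetical stand-in for a Java method in the runtime (not the real ArtMethod)."""

    def __init__(self, name: str, restore: bytes) -> None:
        self.name = name
        self.code = b""           # extracted: the body is empty before invocation
        self._restore = restore   # assumption: the body becomes available on invocation
        self.executed = False

    def invoke(self, dump_only: bool = False, sink: Optional[Dict[str, bytes]] = None) -> Optional[int]:
        self.code = self._restore
        if dump_only and sink is not None:
            sink[self.name] = self.code  # dump the instructions ...
            return None                  # ... and return directly without executing
        self.executed = True
        return len(self.code)


def dump_all(methods: Iterable[FakeMethod]) -> Dict[str, bytes]:
    """Actively invoke every extracted method in dump-only mode."""
    sink: Dict[str, bytes] = {}
    for method in methods:
        if not method.code:
            method.invoke(dump_only=True, sink=sink)
    return sink
```

Returning before execution is what allows every extracted function to be invoked safely, including functions that the application would never reach during a normal run.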
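As shown in FIG. 6, the details are as follows:
Step 601: Parse the Dex file in memory;
The Dex file here is the file after the reinforcement operation; a Dex file is the executable file of the Android system and contains all operation instructions and runtime data of the application.
In practice, the in-memory address of the Dex file is usually located first, and the Dex file of the target application is obtained based on that address; the structure contained in the Dex file is then parsed to obtain the executable file to be unpacked.
Specifically, the address of the Dex file is obtained after the APK starts running, by applying dynamic debugging, that is, setting breakpoints, to the system functions related to the structure in the Dex file.
Step 602: Identify the extracted functions and create the function mapping table list;
The parsed Dex file obtained from the above step is the executable file to be unpacked.
Here, an extracted function can be understood as a function whose instructions were extracted (equivalently, hidden or encrypted) during the reinforcement operation; the CodeItem corresponding to an extracted function is empty bytecode.
The executable file to be unpacked contains at least one function and its function identifier. The CodeItem of each function therefore needs to be checked here; functions whose CodeItem is empty are defined as extracted functions, yielding at least one extracted function, and the corresponding function identifiers are obtained.
A function mapping table list is created from the at least one extracted function and the corresponding function identifiers; a function identifier indicates the corresponding function, for example the function name.
The extracted functions mentioned below are extracted functions that have not yet been restored; that is, they are in the extracted state.
Step 603: Create a custom invoke function and call the invoke function of the ArtMethod class to which a dump function has been added; the dump function downloads the source function corresponding to the extracted function and creates a function mapping table;
The source function here is the pre-reinforcement function downloaded from the preset storage space.
First, a custom invoke function is created, and the modified invoke function of the ArtMethod class is added to it; a dump function, which obtains the source function corresponding to the at least one extracted function and constructs the function mapping table, is added to the invoke function of the ArtMethod class. Second, executing the custom invoke function in turn executes the modified invoke function of the ArtMethod class, which obtains the source function corresponding to the at least one extracted function and either stores the mapping between function identifier and source function in a newly constructed function mapping table list1, or directly uses the source function to update the corresponding extracted function without constructing another function mapping table. Finally, after the source functions of all extracted functions have been obtained, a complete function mapping table results.
Based on the hook mechanism under the Xposed or Frida framework, the custom invoke function is executed, which calls the at least one extracted function of the Java layer and performs the function of the dump function.
Step 604: Backfill the source function corresponding to the extracted function to the location of the extracted function;
The target function identifier is obtained from the function identifiers of the at least one extracted function; the source function corresponding to the target function identifier is then obtained based on the newly constructed function mapping table list1 or the updated function mapping table list, and the source function is backfilled to the location of the corresponding extracted function.
Step 605: Obtain the complete Dex source file.
After the source functions corresponding to all extracted functions have been backfilled into the executable file, the complete Dex source file is obtained.
The present application can provide reference and assistance for Android projects, for example: assisting the analysis of competitors' Android reinforcement solutions; serving as a reference for testing Android reinforcement solutions; and assisting risk assessment of developer applications in the company's application market.
In the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.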
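To implement the method of the embodiments of the present application, and based on the same inventive concept, an embodiment of the present application further provides an unpacking processing apparatus. As shown in FIG. 7, the apparatus includes:
an acquiring part 701, configured to obtain the executable file to be unpacked of the target application;
the acquiring part 701 being further configured to obtain the function identifier of at least one extracted function in the executable file to be unpacked;
the acquiring part 701 being further configured to obtain, based on the function identifier of the at least one extracted function, the source function corresponding to the at least one extracted function from a preset storage space;
a backfill part 702, configured to backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
In some embodiments, the acquiring part 701 is further configured to obtain at least one function and its function identifier from the executable file to be unpacked; determine that the empty functions among the at least one function are the at least one extracted function; and obtain the function identifier of the at least one extracted function.
In some embodiments, the acquiring part 701 is further configured to initialize a function mapping table using the obtained function identifier of the at least one extracted function, wherein the function mapping table includes the mapping between the function identifier of the at least one extracted function and the at least one extracted function; execute the preset calling function based on the function identifier of the at least one extracted function in the function mapping table, to obtain the source function corresponding to the at least one extracted function from the preset storage space; and store the source function corresponding to the at least one extracted function in the function mapping table.
In some embodiments, the acquiring part 701 is further configured to execute the preset calling function based on the function identifier of the at least one extracted function, to obtain the source function corresponding to the at least one extracted function from the preset storage space; and construct a function mapping table from the function identifier of the at least one extracted function and the corresponding source function.
In some embodiments, a backup function is added to the built-in calling function of the system to obtain a first modified calling function, wherein the backup function, when run, is used to obtain the source function corresponding to the at least one extracted function; and the first modified calling function is used as the preset calling function.
In some embodiments, the built-in calling function of the system is obtained; a backup function is added to the built-in calling function to obtain a second modified calling function, wherein the backup function, when run, is used to obtain the source function corresponding to the at least one extracted function; a custom function is created using the second modified calling function; and the custom function is used as the preset calling function.
In some embodiments, the custom function is executed by means of a hook.
In the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.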
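An embodiment of the present application further provides an unpacking processing device. As shown in FIG. 8, the device includes: a processor 801 and a memory 802 configured to store a computer program capable of running on the processor;
wherein the processor 801 is configured to execute the method steps of the foregoing embodiments when running the computer program.
In practice, as shown in FIG. 8, the components of the device are coupled together through a bus system 803. It can be understood that the bus system 803 implements connection and communication between these components. In addition to a data bus, the bus system 803 also includes a power bus, a control bus, and a status signal bus. For clarity of description, however, the various buses are all marked as the bus system 803 in FIG. 8.
In practice, the above processor may be at least one of an application-specific integrated circuit (ASIC), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a controller, a microcontroller, and a microprocessor. It can be understood that, for different devices, other electronic components may be used to implement the above processor functions, which is not specifically limited in the embodiments of the present application.
The above memory may be a volatile memory, such as a random-access memory (RAM); or a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the above types of memory, and it provides instructions and data to the processor.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions which, when executed, implement the method steps of the foregoing embodiments.
If the above device of the embodiments of the present application is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, and other media that can store program code. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.
FIG. 9 is a schematic structural diagram of a chip of an embodiment of the present application. The chip 901 shown in FIG. 9 includes a processor 902, and the processor 902 can call and run a computer program from the memory 904 to implement the method in the embodiments of the present application.
Optionally, as shown in FIG. 9, the chip 901 may further include a memory 904. The processor 902 can call and run a computer program from the memory 904 to implement the method in the embodiments of the present application.
The memory 904 may be a separate component independent of the processor 902, or may be integrated in the processor 902.
Optionally, the chip 901 may further include an input interface 903. The processor 902 can control the input interface 903 to communicate with other devices or chips, and specifically, can obtain information or data sent by other devices or chips.
Optionally, the chip 901 may further include an output interface 905. The processor 902 can control the output interface 905 to communicate with other devices or chips, and specifically, can output information or data to other devices or chips.
Optionally, the chip can be applied to the network device in the embodiments of the present application, and the chip can implement the corresponding processes implemented by the network device in the methods of the embodiments of the present application; for brevity, details are not repeated here.
Optionally, the chip can be applied to the terminal device in the embodiments of the present application, and the chip can implement the corresponding processes implemented by the terminal device in the methods of the embodiments of the present application; for brevity, details are not repeated here.
It should be understood that the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
Correspondingly, an embodiment of the present application further provides a computer storage medium in which a computer program is stored, and the computer program is configured to execute the unpacking processing method of the embodiments of the present application.
It should be noted that "first", "second", and the like are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
The methods disclosed in the several method embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.
The features disclosed in the several product embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.
The features disclosed in the several method or device embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The above are only specific implementations of the present application, but the scope of protection of the present application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.
Industrial Applicability
In the technical solution of the embodiments of the present application, the function identifier of at least one extracted function is obtained, the source function that corresponded to the extracted function before reinforcement is downloaded from the preset storage space, and the source function is backfilled to the location of the extracted function in the executable file to be unpacked, so that the executable source file is completely restored. In this way, the Dex source file can be completely restored even when the target application contains many extracted functions, which solves the problem of a large number of empty bytecodes appearing in the bytecode of function bodies in the unpacked Dex source file.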

Claims (10)

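  1. An unpacking processing method, comprising:
    obtaining the executable file to be unpacked of a target application;
    obtaining the function identifier of at least one extracted function in the executable file to be unpacked;
    obtaining, based on the function identifier of the at least one extracted function, the source function corresponding to the at least one extracted function from a preset storage space;
    backfilling the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
  2. The method according to claim 1, wherein obtaining the function identifier of at least one extracted function in the executable file to be unpacked comprises:
    obtaining at least one function and its function identifier from the executable file to be unpacked;
    determining that the empty functions among the at least one function are the at least one extracted function;
    obtaining the function identifier of the at least one extracted function.
  3. The method according to claim 2, wherein obtaining, based on the function identifier of the at least one extracted function, the source function corresponding to the at least one extracted function from a preset storage space comprises:
    initializing a function mapping table using the obtained function identifier of the at least one extracted function, wherein the function mapping table includes the mapping between the function identifier of the at least one extracted function and the at least one extracted function;
    executing a preset calling function based on the function identifier of the at least one extracted function in the function mapping table, to obtain the source function corresponding to the at least one extracted function from the preset storage space;
    storing the source function corresponding to the at least one extracted function in the function mapping table.
  4. The method according to claim 2, wherein obtaining, based on the function identifier of the at least one extracted function, the source function corresponding to the at least one extracted function from a preset storage space comprises:
    executing a preset calling function based on the function identifier of the at least one extracted function, to obtain the source function corresponding to the at least one extracted function from the preset storage space;
    constructing a function mapping table from the function identifier of the at least one extracted function and the corresponding source function.
  5. The method according to claim 3 or 4, wherein the method further comprises:
    adding a backup function to the built-in calling function of the system to obtain a first modified calling function, wherein the backup function, when run, is used to obtain the source function corresponding to the at least one extracted function;
    using the first modified calling function as the preset calling function.
  6. The method according to claim 3 or 4, wherein the method further comprises:
    obtaining the built-in calling function of the system;
    adding a backup function to the built-in calling function of the system to obtain a second modified calling function, wherein the backup function, when run, is used to obtain the source function corresponding to the at least one extracted function;
    creating a custom function using the second modified calling function;
    using the custom function as the preset calling function.
  7. The method according to claim 6, wherein executing the preset calling function comprises:
    executing the custom function by means of a hook.
  8. An unpacking processing apparatus, comprising:
    an acquiring part, configured to obtain the executable file to be unpacked of a target application;
    the acquiring part being further configured to obtain the function identifier of at least one extracted function in the executable file to be unpacked;
    the acquiring part being further configured to obtain, based on the function identifier of the at least one extracted function, the source function corresponding to the at least one extracted function from a preset storage space;
    a backfill part, configured to backfill the source function corresponding to the at least one extracted function into the executable file to be unpacked to obtain the unpacked executable source file.
  9. An unpacking processing device, wherein the device comprises: a processor and a memory configured to store a computer program capable of running on the processor,
    wherein the processor is configured to execute the steps of the method according to any one of claims 1 to 7 when running the computer program.
  10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.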
PCT/CN2020/095133 2020-06-09 2020-06-09 Unpacking processing method, apparatus, device, and storage medium WO2021248315A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/095133 WO2021248315A1 (zh) 2020-06-09 2020-06-09 Unpacking processing method, apparatus, device, and storage medium
CN202080100486.XA CN115552402A (zh) 2020-06-09 2020-06-09 Unpacking processing method, apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/095133 WO2021248315A1 (zh) 2020-06-09 2020-06-09 Unpacking processing method, apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021248315A1 true WO2021248315A1 (zh) 2021-12-16

Family

ID=78846636

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/095133 WO2021248315A1 (zh) 2020-06-09 2020-06-09 Unpacking processing method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN115552402A (zh)
WO (1) WO2021248315A1 (zh)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100115260A1 (en) * 2008-11-05 2010-05-06 Microsoft Corporation Universal secure token for obfuscation and tamper resistance
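CN105989252A (zh) * 2015-12-12 2016-10-05 武汉安天信息技术有限责任公司 Unpacking method and system for function-level packing
CN105574411A (zh) * 2015-12-25 2016-05-11 北京奇虎科技有限公司 Dynamic unpacking method, apparatus, and device
CN105740708A (zh) * 2016-01-28 2016-07-06 博雅网信(北京)科技有限公司 Automatic Android application unpacking method based on the Java reflection mechanism
CN106022130A (zh) * 2016-05-20 2016-10-12 中国科学院信息工程研究所 Unpacking method and apparatus for reinforced applications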

Also Published As

Publication number Publication date
CN115552402A (zh) 2022-12-30

Similar Documents

Publication Publication Date Title
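CN109491695B (zh) Incremental update method for integrated Android applications
US10846083B2 (en) Semantic-aware and self-corrective re-architecting system
US8489925B1 (en) System and method for processing of system errors
CN102402427B (zh) Method and apparatus for updating a Java application
CN108229107B (zh) Unpacking method and container for Android platform applications
CN108229148B (zh) Sandbox unpacking method and system based on the Android virtual machine
CN105574411A (zh) Dynamic unpacking method, apparatus, and device
WO2022148390A1 (zh) Method for deploying, updating, and calling smart contracts in a blockchain
CN110032502B (zh) Exception handling method and apparatus, and electronic device
CN107567629A (zh) Dynamic firmware module loader in a trusted execution environment container
CN112189187A (zh) Extensibility of a unified platform
CN110704113B (zh) Startup method and system based on an FPGA platform, and development board apparatus
CN115629971A (zh) Application development system and development method
US20080109793A1 (en) Verifying loaded module during debugging
CN112214267A (zh) Android unpacking acceleration method and apparatus, storage medium, and computer device
WO2022156277A1 (zh) Application installation method and apparatus, computing device, and readable storage medium
WO2021248315A1 (zh) Unpacking processing method, apparatus, device, and storage medium
CN112748905B (zh) Initialization calling method and apparatus for a base library, electronic device, and storage medium
US10552135B1 (en) Reducing a size of an application package
JP4931711B2 (ja) Kernel update method, information processing apparatus, program, and storage medium
CN110765008B (zh) Data processing method and apparatus
CN109214184B (zh) General automated unpacking method and apparatus for reinforced Android applications
CN111625225A (zh) Method and apparatus for outputting program-specified data
CN108536444B (zh) Plug-in compilation method and apparatus, computer device, and storage medium
CN116173511A (zh) Game hot-update method and apparatus, computer device, and storage medium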

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20939507

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 31/01/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20939507

Country of ref document: EP

Kind code of ref document: A1