WO2021147282A1 - Method, apparatus and device for detecting malicious file, and storage medium

Info

Publication number
WO2021147282A1
Authority
WO
WIPO (PCT)
Prior art keywords
api
sequence
instruction
operating system
file
Prior art date
Application number
PCT/CN2020/104449
Other languages
French (fr)
Chinese (zh)
Inventor
郑宝龙
陈甲
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021147282A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/53 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements

Definitions

  • This application relates to the field of computer technology, and in particular to a method, apparatus, and device for detecting malicious files, and a storage medium.
  • A malicious file is a file containing a program written by a programmer with the intent to attack. Malicious files can spread over a network, and computers can receive them from the network. When a computer runs a malicious file, it performs malicious operations such as information theft, infection, or extortion according to the program contained in the file, which seriously affects the security of the network system. For this reason, malicious file detection has become a research hotspot in this field.
  • Dynamic behavior detection based on a virtualized operating environment is a newer type of malicious file detection technology.
  • The advantage of this detection technology is that it can discover previously unknown malicious files.
  • The basic principle is to use virtualization technology to generate a virtualized operating environment similar to a user host, such as a virtual machine.
  • A hook function is set in the virtualized operating environment to intercept predetermined application programming interface (API) calls.
  • The test file (that is, the file to be tested) is run in the virtualized operating environment, and the hook function captures the series of API calls made while the file runs, which yields the dynamic behavior of the test file. Whether the test file is a malicious file is then judged according to that dynamic behavior.
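The hook principle described above can be illustrated with a minimal Python sketch. It is not the mechanism of the application: the decorator `make_hook`, the stand-in APIs `create_file` and `write_file`, and the function `run_sample` are all hypothetical names introduced here, and a real sandbox would hook operating system APIs rather than Python functions.

```python
import functools

def make_hook(api_name, call_log):
    """Return a decorator that records every call to the wrapped API."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            call_log.append((api_name, args))  # record the intercepted call
            return func(*args, **kwargs)       # then let the call proceed
        return wrapper
    return decorator

call_log = []

# Hypothetical stand-ins for APIs exposed by the virtualized environment.
@make_hook("CreateFile", call_log)
def create_file(path):
    return f"handle:{path}"

@make_hook("WriteFile", call_log)
def write_file(handle, data):
    return len(data)

def run_sample():
    """Stand-in for the test file: it only calls the hooked APIs."""
    handle = create_file("C:/tmp/a.txt")
    write_file(handle, b"abc")

run_sample()
api_sequence = [name for name, _ in call_log]
print(api_sequence)
```

The recorded `api_sequence` plays the role of the dynamic behavior that is later compared against known malicious behavior.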
  • The current detection technology has limitations in its application. For example, in terms of usage scenarios, the current virtualization technology can only run on a detection device that supports the Microsoft Windows operating system (Windows); otherwise the test file cannot run and detection cannot be performed.
  • The embodiments of the present application provide a method, apparatus, and device for detecting malicious files, and a storage medium, which can address the limitations of existing malicious file detection technologies to a certain extent.
  • A detection device obtains a test file, where the test file is an executable file that runs based on a first operating system.
  • The detection device runs the test file in a virtual operating environment, where the virtual operating environment is generated based on container technology.
  • The detection device obtains a first API sequence called by the test file during running. The first API sequence includes at least one API, and the APIs in the first API sequence belong to a first API set. The first API set includes multiple APIs required for software operation that are provided by the virtual operating environment, and the identifiers of the APIs in the first API set are the same as the identifiers of the APIs in a second API set. The second API set includes multiple APIs required for software operation that are provided by the first operating system.
  • The detection device executes a second API sequence in a second operating system. The second API sequence includes at least one API, and the APIs in the second API sequence are APIs of the second operating system. The first API in the second API sequence has a mapping relationship with the first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device.
  • The detection device determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process of calling the first API sequence.
  • The embodiments of the present application use a virtual operating environment generated based on container technology to simulate the operating environment provided by the operating system with which the test file is compatible.
  • The detection device converts the APIs called by the test file in the virtual operating environment into APIs provided by the operating system of the detection device, and executes the converted APIs in that operating system. Executing the APIs of the detection device's operating system thus has the effect of simulating execution of the APIs of the first operating system.
  • The virtual operating environment provided by the detection device is therefore compatible with normal running of the test file. This removes the dependence of the test file on a specific architecture or platform (that is, the requirement that the detection device be based on a specific architecture or platform), and so addresses the limitations of existing malicious file detection technology to a certain extent.
  • Cross-platform malicious program detection can be realized.
  • Container technology avoids the resource overhead of a hypervisor and a guest OS and runs directly on the kernel of the host. Because a container image is much smaller than a virtual machine image, the detection method of the embodiments is more lightweight, consumes fewer CPU processing resources, and occupies less memory.
  • The detection method of the embodiments runs the malicious program at the process level, so detection is faster.
  • The time and performance overhead caused by repeatedly resetting virtual machines, and the overhead of operations such as creating and scheduling traditional virtual machines, can be avoided.
  • The execution of the second API sequence by the detection device in the second operating system includes the following steps.
  • The detection device obtains, according to each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment, thereby obtaining a first function sequence. The functions in the first function sequence are used to implement the APIs in the first API sequence.
  • The detection device obtains, according to each function in the first function sequence, the mapped function from the dynamic link library of the second operating system, thereby generating a second function sequence. The functions in the second function sequence are used to implement the APIs in the second API sequence, and the first function in the second function sequence has a mapping relationship with the first function in the first function sequence.
  • The detection device performs operations in the kernel of the second operating system according to the second function sequence.
  • The above process provides an optional implementation of operating system simulation.
  • The functions that implement the APIs of the virtual operating environment are encapsulated in one dynamic link library, and the functions that implement the APIs of the detection device's operating system are encapsulated in another dynamic link library.
  • When an API sequence of the virtual operating environment is called, the two dynamic link libraries are accessed in turn to find a function sequence with similar functionality provided by the operating system of the detection device.
  • Executing that function sequence simulates the effect of executing the API sequence of the first operating system. This achieves system simulation, so the test file can run normally in the virtual operating environment without depending on the first operating system.
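The two-library lookup described above can be sketched as follows. This is a simplified illustration under stated assumptions: the dictionaries `VIRTUAL_ENV_LIB`, `FUNCTION_MAPPING`, and `HOST_LIB`, and every function name inside them, are hypothetical, and plain Python dictionaries stand in for real dynamic link libraries.

```python
import os
import tempfile

# Hypothetical "dynamic link library" of the virtual operating environment:
# API identifier -> name of the function that implements it.
VIRTUAL_ENV_LIB = {
    "CreateDirectory": "env_create_directory",
    "RemoveDirectory": "env_remove_directory",
}

# Hypothetical mapping relationship between functions of the two libraries.
FUNCTION_MAPPING = {
    "env_create_directory": "host_mkdir",
    "env_remove_directory": "host_rmdir",
}

# Hypothetical library of the detection device's operating system.
HOST_LIB = {
    "host_mkdir": os.mkdir,
    "host_rmdir": os.rmdir,
}

def to_second_function_sequence(first_api_sequence):
    """Resolve each API to its first-library function, then to the mapped host function."""
    first_functions = [VIRTUAL_ENV_LIB[api] for api in first_api_sequence]
    return [HOST_LIB[FUNCTION_MAPPING[name]] for name in first_functions]

# Execute the mapped sequence against a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "sample_dir")
    create, remove = to_second_function_sequence(["CreateDirectory", "RemoveDirectory"])
    create(target)
    created = os.path.isdir(target)
    remove(target)
    removed = not os.path.exists(target)
print(created, removed)
```

Keeping the API-to-function table separate from the function-to-function table mirrors the text, where each library is consulted in turn.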
  • The first operating system is a Windows operating system, and the second operating system is a Linux operating system.
  • Obtaining, by the detection device according to each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment includes: the detection device obtains the corresponding function from a dynamic link library (DLL) file according to each API in the first API sequence.
  • Obtaining, by the detection device according to each function in the first function sequence, the mapped function from the dynamic link library of the second operating system includes: the detection device obtains the mapped function from a shared object (SO) file according to each function in the first function sequence and the mapping relationship between functions.
  • The above process provides an optional implementation for simulating the Windows operating system under the Linux operating system.
  • A function sequence provided by the Linux operating system whose functionality is similar to that of the Windows operating system is found.
  • Executing that function sequence simulates the effect of executing the API sequence of the Windows operating system. This achieves system simulation, so the test file can run normally in the virtual operating environment without depending on the Windows operating system.
  • This method can dynamically detect a PE file under the Linux operating system, which removes the dependence of the detected PE file on the Windows operating system.
  • A detection device based on the X86 platform can thus dynamically detect test files that run on the Windows system, which expands the usage scenarios of malicious file detection technology.
  • The first operating system is a Linux operating system, and the second operating system is a Windows operating system.
  • Obtaining, by the detection device according to each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment includes: the detection device obtains the corresponding function from the SO file according to each API in the first API sequence.
  • Obtaining, by the detection device according to each function in the first function sequence, the mapped function from the dynamic link library of the second operating system includes: the detection device obtains the mapped function from the DLL file according to each function in the first function sequence and the mapping relationship between functions.
  • The above process provides an optional implementation for simulating the Linux operating system under the Windows operating system.
  • The functions that implement the APIs in the Windows operating system are encapsulated in the DLL file.
  • When an API sequence of the virtual operating environment is called, the SO file and the DLL file are accessed in turn to find a function sequence provided by the Windows operating system whose functionality is similar to that of the Linux operating system.
  • Executing that function sequence simulates the effect of executing the API sequence of the Linux operating system. This achieves system simulation, so the test file can run normally in the virtual operating environment without depending on the Linux operating system.
  • When the test file is an ELF file and the operating system of the detection device is the Windows operating system, this method can dynamically detect the ELF file under the Windows operating system, which removes the dependence of the detected ELF file on the Linux operating system. A detection device based on the X86 platform can therefore dynamically detect test files that run on the Linux system, which expands the usage scenarios of malicious file detection technology.
  • The execution of the second API sequence by the detection device in the second operating system includes: the detection device obtains first-type parameters called in the first API sequence, where the first-type parameters are input parameters of the APIs in the first API sequence; in the second operating system, the detection device executes the second API sequence according to second-type parameters, where the second-type parameters are input parameters of the APIs in the second API sequence, and the first parameter among the second-type parameters has a mapping relationship with the first parameter among the first-type parameters.
  • The above process provides an optional implementation of system simulation. Because different APIs may take different input parameters, when input parameters are passed in the API sequence of the virtual operating environment, they are mapped to the input parameters of the APIs of the detection device's operating system. The API sequence is then executed with appropriate parameters, which avoids passing incorrect parameters when executing it.
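As a small illustration of parameter mapping, the sketch below translates a Windows-style access mask into POSIX-style open flags. The choice of this particular parameter pair is an assumption made for illustration and is not taken from the application; `map_access_to_open_flags` is a hypothetical helper. `GENERIC_READ` and `GENERIC_WRITE` carry the values defined for the Windows `CreateFile` API, and the `os.O_*` constants come from the Python standard library.

```python
import os

# Access-mask values defined for the Windows CreateFile API.
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000

def map_access_to_open_flags(desired_access):
    """Map a first-type parameter (access mask) to a second-type parameter (open flags).

    Illustrative mapping only; a real translation layer handles many more cases.
    """
    can_read = bool(desired_access & GENERIC_READ)
    can_write = bool(desired_access & GENERIC_WRITE)
    if can_read and can_write:
        return os.O_RDWR
    if can_write:
        return os.O_WRONLY
    if can_read:
        return os.O_RDONLY
    raise ValueError("unsupported access mask in this sketch")

flags = map_access_to_open_flags(GENERIC_READ | GENERIC_WRITE)
print(flags == os.O_RDWR)
```

Raising an error for unmapped values, rather than guessing a default, keeps an incorrect parameter from silently reaching the host API.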
  • The execution of the second API sequence by the detection device in the second operating system includes: the detection device obtains a first instruction sequence triggered during running of the test file, where the first instruction sequence includes at least one instruction and each instruction in the first instruction sequence instructs a call to an API in the first API sequence; the detection device performs a first instruction conversion on the instructions in the first instruction sequence and obtains a second instruction sequence according to the result, where the second instruction sequence includes at least one instruction and each instruction in the second instruction sequence instructs a call to an API in the second API sequence.
  • The first instruction conversion converts instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device; the detection device executes the second instruction sequence to implement the operations corresponding to the second API sequence.
  • For example, where the CPU of the detection device is based on an instruction set B architecture, the detection device performs instruction conversion so that the instructions triggered by the test file are converted from instructions in an instruction set A into instructions in instruction set B.
  • In this way, the CPU of the detection device can execute the instructions triggered by the test file and run the test file normally. This removes the dependence of running the test file on a specific instruction set architecture, so the malicious file detection solution can be widely used in various hardware environments.
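The conversion from instruction set A to instruction set B can be pictured with a toy table-driven translator. Everything here is hypothetical: the mnemonics in `A_TO_B_OPCODES` do not belong to any real instruction set, and real binary translation between architectures such as X86 and ARM is far more involved than a one-to-one opcode lookup.

```python
# Hypothetical opcode table: instruction set A mnemonic -> instruction set B mnemonic.
A_TO_B_OPCODES = {
    "LOADA": "LDB",
    "ADDA": "ADB",
    "CALLA": "BLB",
}

def convert_instructions(first_sequence, table=A_TO_B_OPCODES):
    """Convert (opcode, operands) pairs from set A into set B, preserving order."""
    second_sequence = []
    for opcode, operands in first_sequence:
        if opcode not in table:
            raise KeyError(f"no conversion rule for {opcode}")
        second_sequence.append((table[opcode], operands))
    return second_sequence

first = [("LOADA", ("r1", 5)), ("CALLA", ("CreateFile",))]
second = convert_instructions(first)
print(second)
```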
  • The first operating system is a Windows operating system, and the computer instruction set architecture of the detection device is the Advanced RISC Machine (ARM) architecture.
  • Performing, by the detection device, the first instruction conversion on the instructions in the first instruction sequence and obtaining the second instruction sequence according to the result includes: the detection device converts each X86 instruction in the first instruction sequence into an ARM instruction, and obtains the second instruction sequence from the converted ARM instructions.
  • The CPU of the detection device is based on the ARM instruction set architecture, and the detection device converts the X86 instructions triggered by the test file into ARM instructions. In this way, the CPU of the detection device can execute the ARM instructions and run the test file normally. This removes the dependence of running the test file on the X86 instruction set architecture, so the malicious file detection solution can be widely used in ARM hardware environments.
  • The first operating system is a Linux operating system, and the computer instruction set architecture of the detection device is the X86 architecture.
  • Performing, by the detection device, the first instruction conversion on the instructions in the first instruction sequence and obtaining the second instruction sequence according to the result includes: the detection device converts each ARM instruction in the first instruction sequence into an X86 instruction, and obtains the second instruction sequence from the converted X86 instructions.
  • The CPU of the detection device is based on the X86 instruction set architecture, and the detection device converts the ARM instructions triggered by the test file into X86 instructions. In this way, the CPU of the detection device can execute the X86 instructions and run the test file normally. This removes the dependence of running the test file on the ARM instruction set architecture, so the malicious file detection solution can be widely used in X86 hardware environments.
  • The method further includes: the detection device obtains a third instruction sequence, where the third instruction sequence represents the result obtained after the second API sequence is executed and the instructions in the third instruction sequence belong to the computer instruction set of the detection device; the detection device performs a second instruction conversion on each instruction in the third instruction sequence and obtains a fourth instruction sequence according to the result.
  • The instructions in the fourth instruction sequence belong to the computer instruction set of the virtual operating environment, and the second instruction conversion converts instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based; the detection device inputs the fourth instruction sequence into the virtual operating environment.
  • In this way, the execution result of the API sequence is converted into a form compatible with the test file and returned to the test file running in the virtual operating environment, so that the test file can continue to run according to the result of the previous API call and continue to exhibit its dynamic behavior.
  • The container technology includes Docker container technology, the virtual operating environment is started by a Docker daemon, and the Docker daemon is a process run by the detection device based on the second operating system.
  • A virtual operating environment is started through the Docker daemon; the virtual operating environment is, for example, an instance of a Docker container.
  • Using Docker containers makes the solution more lightweight and enables process-level malicious file detection.
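One way to picture starting such a container is the `docker run` command, through which the Docker CLI asks the Docker daemon to create and start a container. The sketch below only assembles the command line and does not execute it. The image name `sandbox-image:latest`, the sample path, and the helper `build_docker_run_command` are hypothetical; the isolation flags (`--rm`, `--network none`, a read-only `-v` mount) are one reasonable choice, not settings taken from the application.

```python
import shlex

def build_docker_run_command(image, sample_path, container_args=()):
    """Assemble a `docker run` invocation for an isolated, disposable container."""
    return [
        "docker", "run",
        "--rm",                             # remove the container when it exits
        "--network", "none",                # no network access for the sample
        "-v", f"{sample_path}:/sample:ro",  # mount the test file read-only
        image,
        *container_args,
    ]

cmd = build_docker_run_command("sandbox-image:latest", "/tmp/sample.exe", ["/sample"])
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` on a host where Docker is installed.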
  • In a second aspect, an apparatus for detecting malicious files is provided. The apparatus includes at least one module, and the at least one module is used to implement the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect.
  • For specific details of the apparatus for detecting malicious files provided by the second aspect, reference may be made to the foregoing first aspect or any one of the optional manners of the first aspect, which will not be repeated here.
  • In a third aspect, a detection device is provided. The detection device includes a processor configured to execute instructions so that the detection device executes the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect.
  • In a fourth aspect, a detection device is provided, including a network interface, a memory, and a processor connected to the memory.
  • The network interface is used to obtain a test file, where the test file is an executable file that runs based on a first operating system.
  • The memory is used to store program instructions.
  • The processor is configured to execute the program instructions, so that the detection device performs the following operations:
  • running the test file in a virtual operating environment generated based on container technology, and obtaining a first API sequence called by the test file during running, where the first API sequence includes at least one API, the APIs in the first API sequence are APIs in a first API set, the first API set includes multiple APIs required for software operation that are provided by the virtual operating environment, the identifiers of the APIs in the first API set are the same as those of the APIs in a second API set, and the second API set includes multiple APIs required for software operation that are provided by the first operating system;
  • executing a second API sequence in a second operating system, where the second API sequence includes at least one API, the APIs in the second API sequence are APIs of the second operating system, the first API in the second API sequence has a mapping relationship with the first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device;
  • determining whether the test file is a malicious file based on the behavior characteristics of the test file in the process of calling the first API sequence.
  • The detection device provided in the fourth aspect is further configured to execute the malicious file detection method provided in any one of the optional manners of the first aspect.
  • For specific details of the detection device provided in the fourth aspect, reference may be made to the foregoing first aspect or any one of the optional manners of the first aspect, and details are not described here again.
  • A computer-readable storage medium is provided. The storage medium stores at least one instruction, and the instruction is read by a processor to make a detection device execute the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect.
  • A computer program product is provided. When the computer program product runs on a detection device, the detection device executes the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect.
  • A chip is provided. When the chip runs on a detection device, the detection device executes the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect.
  • FIG. 1 is a schematic diagram of a network system provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a detection device provided by an embodiment of the present application.
  • FIG. 3 is a logical functional architecture diagram for detecting malicious files provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application.
  • FIG. 7 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application.
  • FIG. 8 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application.
  • FIG. 9 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application.
  • FIG. 10 is a flowchart of a method for protecting network security according to an embodiment of the present application.
  • FIG. 11 is a schematic diagram of an enterprise network provided by an embodiment of the present application.
  • FIG. 12 is a flowchart of a method for protecting network security according to an embodiment of the present application.
  • FIG. 13 is a flowchart of a method for protecting network security according to an embodiment of the present application.
  • FIG. 14 is a flowchart of a method for protecting network security according to an embodiment of the present application.
  • FIG. 15 is a flowchart of a method for protecting network security according to an embodiment of the present application.
  • FIG. 16 is a schematic diagram of a virtualization architecture provided by an embodiment of the present application.
  • FIG. 17 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application.
  • FIG. 18 is a schematic structural diagram of a malicious file detection apparatus provided by an embodiment of the present application.
  • An instruction set is the set of instructions that a processor can recognize.
  • Instruction sets include complex instruction sets (Complex Instruction Set Computing, CISC) and reduced instruction sets (Reduced Instruction Set Computing, RISC).
  • X86 is a computer language instruction set executed by a microprocessor.
  • X86 refers to a standard number abbreviation of Intel's general-purpose computer series.
  • X86 also identifies a set of general computer instructions.
  • X86 is CISC. The name of X86 comes from the Intel 8086 central processing unit introduced in 1978.
  • The X86 architecture usually refers to processors that are compatible with the X86 instruction set.
  • The X86 architecture is widely used in personal computers (PCs).
  • The X86 platform generally refers to hardware devices based on the X86 architecture.
  • Such a hardware device uses an Intel processor or another processor compatible with the X86 instruction set.
  • The hardware device usually runs the Microsoft Windows operating system (Windows).
  • The hardware device is, for example, an X86 server.
  • ARM is a 32-bit RISC.
  • The ARM architecture usually refers to processors that are compatible with the ARM instruction set.
  • The ARM architecture is widely used in mobile terminals.
  • The ARM platform generally refers to hardware devices based on the ARM architecture.
  • Such a hardware device uses a processor compatible with the ARM instruction set.
  • The hardware device usually runs a Linux operating system (a family of free-to-use and freely distributed Unix-like operating systems).
  • An executable file is a file that can be loaded and executed by the operating system.
  • Executable files include portable executable (PE) files.
  • PE files include .exe files, .sys files, .com files, and other types of files. Here, "." is the separator between the file name and the extension.
  • An .exe file is a file with the extension exe.
  • Executable files also include Executable and Linkable Format (ELF) files.
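The two executable formats named above can be told apart by their leading bytes, as the following sketch shows. It relies on two well-known format facts: an ELF file begins with the bytes `0x7F 'E' 'L' 'F'`, and a PE file begins with an `MZ` DOS header whose 4-byte little-endian field at offset 0x3C points to the `PE\0\0` signature. The function name `executable_format` is hypothetical, and the check is deliberately minimal rather than a full parser.

```python
import struct

def executable_format(data: bytes) -> str:
    """Classify a byte string as 'ELF', 'PE', or 'unknown' from its header."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:2] == b"MZ" and len(data) >= 0x40:
        # The 4-byte little-endian value at 0x3C is the offset of the PE signature.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"

# Synthetic headers built only for demonstration.
fake_elf = b"\x7fELF" + b"\x00" * 12
fake_pe = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40) + b"PE\x00\x00"
print(executable_format(fake_elf), executable_format(fake_pe))
```

A detection device could use a check of this kind to decide which operating system environment a test file needs before running it.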
  • A malicious file is a file containing a program written by a programmer with the intent to attack.
  • Malicious files exploit vulnerabilities in computer systems to perform malicious tasks, such as stealing confidential information or destroying stored data.
  • Malicious files are often executable files, such as viruses, worms, and Trojan horse programs that perform malicious tasks on computer systems. Malicious files can cause serious damage to computer system security.
  • Static detection is a method of analyzing a program without running it, for example, checking whether a sample file is malicious only by analyzing its source code, assembly, syntax, structure, flow, interfaces, and so on.
  • Dynamic behavior detection simulates the execution of the test file, obtains the behavior or behavior sequence generated during execution, matches it against the dynamic behavior characteristics of known malicious files, and judges whether the test file is a malicious file according to the matching result.
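The matching step of dynamic behavior detection can be sketched as an ordered-subsequence test between an observed API sequence and known behavior characteristics. The matching rule, the signature name `hypothetical-ransom-pattern`, and its contents are assumptions made for illustration; the application does not specify how matching is performed.

```python
def contains_in_order(behavior_sequence, signature):
    """True if every API in `signature` occurs in `behavior_sequence` in the same order."""
    remaining = iter(behavior_sequence)
    # `api in remaining` advances the iterator, so order is enforced.
    return all(api in remaining for api in signature)

# Hypothetical behavior characteristics of known malicious files.
KNOWN_MALICIOUS_SIGNATURES = {
    "hypothetical-ransom-pattern": ["FindFirstFile", "ReadFile", "WriteFile", "DeleteFile"],
}

def matched_signatures(behavior_sequence):
    """Return the names of all signatures matched by the observed sequence."""
    return [
        name
        for name, signature in KNOWN_MALICIOUS_SIGNATURES.items()
        if contains_in_order(behavior_sequence, signature)
    ]

observed = ["FindFirstFile", "CreateFile", "ReadFile", "WriteFile", "CloseHandle", "DeleteFile"]
print(matched_signatures(observed))
```

A test file would be judged malicious in this sketch when `matched_signatures` returns a non-empty list.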
  • Many dynamic behavior detection technologies are restricted by the operating environment. Specifically, after the test file is obtained, it often cannot be run because it is incompatible with the operating environment, so its dynamic behavior cannot be monitored and malicious files cannot be detected. Dynamic behavior detection is usually implemented using sandbox technology.
  • Sandbox is a security mechanism that provides an isolation environment for test files in execution by providing a virtual operating environment. Programs running in the sandbox will not have a permanent impact on the hardware.
  • the sandbox can be implemented through the real operating system of the host, or through a virtual machine.
  • the driver framework provided by Microsoft is usually used to add monitoring programs. The monitoring programs monitor process creation, document creation, registry modification and other behaviors.
  • Operating System refers to a computer program that manages and controls computer hardware resources and software resources.
  • the operating system is the basic system software, which is the interface between computer hardware and other software, and other software must run under the support of the operating system.
  • the operating system can provide a running environment for executable files, such as providing APIs required for software operation.
  • the kernel of an operating system is a kind of software, the kernel is a part of the operating system, and the kernel is the core of the operating system.
  • the kernel of the operating system can be used to manage various resources of the operating system.
  • the kernel can be understood as a bridge between applications and hardware, or as an interface between applications and hardware.
  • the kernel is a software entity that runs directly on the hardware and is used to provide application programs with access to computer hardware. In addition, the kernel can determine when and how long a program operates on a certain part of the hardware.
  • the kernel can provide a hardware abstraction layer, disk and file system control, multitasking, and other functions.
  • Container technology is a kind of virtualization technology.
  • Container technology can be used to generate containers, which can provide a virtual operating environment for software execution.
  • a container is a kind of software.
  • a container is a packaging of executable files and the resources that executable files depend on.
  • the container contains the resources necessary for the executable file to run. Such as code, operating environment, system tools, system libraries and settings.
  • the container can be created from an image. Compared with virtual machines, containers have the advantage of being lighter, occupying fewer resources, and running faster.
  • a virtual machine packages virtual hardware, a kernel (i.e., an operating system), and user space in a new virtual machine.
  • When an application is run through a virtual machine, the virtual machine first needs to virtualize a physical environment, then build a complete operating system, and then build a runtime layer, after which the application runs.
  • the container usually directly installs the container layer on the operating system of the host.
  • the container layer can be, for example, a Linux container (LXC) or libcontainer (a package for container management in Docker, implemented in the Go language).
  • the container uses the kernel of the physical machine to run, and multiple containers can share the operating system of the physical machine.
  • because the container directly uses the kernel of the host, the process of building an operating system and the process of assigning an independent operating system to the applications contained in the container are eliminated, and there are fewer virtualized objects. For example, in some cases, only binary files and libraries need to be built independently for the container. The libraries contain the content that the binary files depend on, and there is no need to package a complete operating system as with a virtual machine, so the container is lighter and starts faster.
  • container management also has more convenient advantages. Specifically, the running state of the container corresponds to a set of standard management operations, for example, starting the container, stopping the container, suspending the container, deleting the container, etc. The container can be conveniently managed through these standard management operations.
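  • The standard management operations mentioned above can be modelled as a small state machine. The sketch below is a simplified illustration only; the state names and the transition table are assumptions and do not reproduce the exact lifecycle of any particular container engine.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"
    DELETED = "deleted"


# Allowed transitions: (current state, operation) -> next state.
TRANSITIONS = {
    (State.CREATED, "start"): State.RUNNING,
    (State.RUNNING, "pause"): State.PAUSED,
    (State.PAUSED, "resume"): State.RUNNING,
    (State.RUNNING, "stop"): State.STOPPED,
    (State.STOPPED, "start"): State.RUNNING,
    (State.CREATED, "delete"): State.DELETED,
    (State.STOPPED, "delete"): State.DELETED,
}


class Container:
    """Hypothetical container handle that only tracks its lifecycle state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.CREATED

    def apply(self, operation: str) -> State:
        key = (self.state, operation)
        if key not in TRANSITIONS:
            raise ValueError(
                f"cannot {operation} a container in state {self.state.value}")
        self.state = TRANSITIONS[key]
        return self.state


sandbox = Container("sandbox")
for op in ("start", "pause", "resume", "stop", "delete"):
    sandbox.apply(op)
print(sandbox.state)
```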
  • the image is used to encapsulate the content required to run the container, such as files such as programs, libraries, resources, configuration, and some configuration parameters.
  • the image is usually stored in a hierarchical structure.
  • the image includes at least one image layer (image layer).
  • image layer is a read-only template that is used to build the container.
  • the image layer is used to store applications and migrate applications.
  • Cross-platform is a term in software technology. It refers to applications developed under one operating system that can still run normally under another operating system. For example, if application A is developed under the Windows operating system, and the application A can still run normally under the Linux operating system, application A can be called a cross-platform application. Under normal circumstances, cross-platform applications must meet the conditions of not relying on the operating system.
  • API stands for application programming interface.
  • the operating system provides an API for the application program, and the application program calls the API to instruct the operating system to perform operations.
  • API is a set of preset functions.
  • the operating system can be regarded as a service center, which can provide various services for applications, and will encapsulate the instructions for implementing various services in various functions.
  • when an application wants to use a certain service of the operating system, it calls the function corresponding to the service, and the operating system performs the operation corresponding to the function to provide the service for the application. Since the service object of this kind of function is an application, this kind of function is called an application programming interface.
  • An API provides applications and developers with the ability to access a set of routines based on software or hardware, without having to access the source code or understand the internal working mechanism, which reduces complexity.
  • Docker is an open-source software container platform that supports the development, deployment, and operation of containers. With Docker, you can easily create and use containers and put your own applications into a container. Containers can also be versioned, copied, shared, and modified, and continuous integration, continuous delivery, and deployment can be achieved through custom application images. Docker generally includes a Docker client (Docker Client) and a Docker daemon (Docker Daemon). The Docker daemon, also known as Docker Engine, is a daemon process used to manage images and containers. It is a background process running on the operating system. The Docker client triggers various instructions according to the user's input operations and interacts with the Docker daemon through these instructions. The Docker daemon receives instructions from the Docker client, creates corresponding jobs according to the instructions, and executes the jobs.
  • the instance of Docker container is the running state of the Docker image. Docker containers can be created, started, stopped, deleted, suspended, etc.
  • the user inputs a viewing instruction (for example, the docker ps command), and the computer responds to the viewing instruction by displaying a list of the Docker containers running on the host. Docker containers have the advantage of being lighter.
  • a virtual machine usually includes a virtual machine management system (Hypervisor) and a guest operating system (Guest Operating System, Guest OS).
  • the hypervisor is used to run a virtual guest operating system on the host operating system and to virtualize hardware resources.
  • the disk space, CPU, and memory occupied by the guest operating system are very large.
  • Docker Daemon is used to replace Hypervisor and Guest OS.
  • the Docker daemon is a background process running on the operating system and is responsible for managing Docker containers.
  • the Docker daemon can communicate directly with the operating system of the host and allocate resources for each Docker container, eliminating the overhead that a virtual machine incurs by communicating indirectly through the Hypervisor.
  • Hypervisor virtualizes hardware resources, and Docker can directly use hardware resources, thereby improving the utilization of hardware resources.
  • a Docker image (Docker image) is used to create a Docker container.
  • a Docker image is an executable package that contains the content required to run a Docker application, such as the code, libraries, environment variables, and configuration files of the Docker application.
  • the Docker image can run in an environment with Docker Engine. When the Docker image runs, it will be created as a Docker container.
  • the Docker container can shield the software and hardware outside the container.
  • the Docker image includes a metadata file, a configuration file, and at least one image layer file.
  • the metadata file is a manifest.json file
  • the metadata file records the metadata of all image layer files, for example, the sha256 value (hash value) of each image layer file.
  • the configuration file records the memory size occupied by the Docker image, the type of instructions contained in the Docker image, etc.
  • the image layer file is a layer file.
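  • Recording a sha256 value for each image layer file, as described above, allows the layers to be verified later. The following sketch illustrates the idea with Python's standard hashlib module; the dictionary keys ("config", "layers", "file", "sha256") are hypothetical and do not reproduce the actual schema of Docker's manifest.json.

```python
import hashlib
from typing import Dict


def sha256_hex(data: bytes) -> str:
    """Return the hexadecimal sha256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


def build_metadata(config_name: str, layers: Dict[str, bytes]) -> dict:
    """Record one digest per layer file (hypothetical metadata layout)."""
    return {
        "config": config_name,
        "layers": [{"file": name, "sha256": sha256_hex(blob)}
                   for name, blob in layers.items()],
    }


def verify_layers(metadata: dict, layers: Dict[str, bytes]) -> bool:
    """Check that every recorded layer is present and matches its digest."""
    return all(entry["file"] in layers
               and sha256_hex(layers[entry["file"]]) == entry["sha256"]
               for entry in metadata["layers"])


layers = {"layer_0.tar": b"base files", "layer_1.tar": b"application files"}
metadata = build_metadata("config.json", layers)
print(verify_layers(metadata, layers))
```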
  • the dynamic link library is used to encapsulate the functions and resources that the running process of the application depends on.
  • the dynamic link library is also called shared function library or shared library.
  • the functions and resources in the dynamic link library can be shared by multiple applications. The dynamic link library helps to avoid code duplication and promotes the effective use of memory, allowing an application to modularize each of its functions.
  • the dynamic link library is usually stored in the computer in the form of a file.
  • the files that encapsulate the dynamic link library have different formats and different names. For example, under the Windows operating system, the dynamic link library is encapsulated in a DLL file; under the Linux operating system, the dynamic link library is encapsulated in a shared object (so) file.
  • the dynamic link library is an implementation of the API.
  • the DLL can encapsulate the code of the API and serve as the carrier of the API.
  • the operating system will access the dynamic link library, obtain the code of the API from the dynamic link library, and run the code to perform the corresponding operation.
  • the operating system can provide a registry API, and when an application calls the registry API, it can access the registry.
  • the code that needs to run when using the registry API can be stored in the dynamic link library.
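  • As an illustration of a dynamic link library acting as the carrier of callable functions, the sketch below loads the C standard library through Python's ctypes module and calls its abs function. It assumes a POSIX system such as Linux; on Windows the library would have to be located differently, which is not shown here.

```python
import ctypes
import ctypes.util

# On POSIX systems find_library("c") usually returns a name such as "libc.so.6".
# If it returns None, CDLL(None) exposes the symbols already loaded into the
# process, which normally include the C library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature int abs(int) so that arguments are converted correctly.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

result = libc.abs(-42)
print(result)
```

  • The calling program contains no implementation of abs; the code is obtained from the shared library at run time, which is the sharing mechanism described above.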
  • the DLL file is a file that contains a dynamic link library in the Windows operating system.
  • the DLL file contains many functions and resources that a Windows program depends on when it runs in the Windows environment.
  • DLL files are also called "application extensions".
  • the suffix of the DLL file is .dll.
  • the DLL file includes the kernel32.dll file, the user32.dll file, and the gdi32.dll file.
  • These three files encapsulate the API functions of the Windows operating system.
  • the DLL file is stored in the system directory, for example, in the C:\Windows\System32\ directory.
  • the kernel32.dll file is an important 32-bit dynamic link library file in Windows 9x/Me, which is a kernel-level file.
  • user32.dll implements the application programming interface related to the Windows user interface, covering window handling, basic user interface functions, and other features such as creating windows and sending messages.
  • gdi32.dll is a dynamic link library stored in the Windows system folder. It is an application extension of the graphical user interface under Windows. It is usually created automatically during the installation of the operating system.
  • Many application programs are not a complete executable file. This kind of application program will be divided into some relatively independent dynamic link libraries, namely DLL files, which are placed in the system. When the application is executed, the DLL file corresponding to the application will be called. Among them, one application can use multiple DLL files, and one DLL file may also be used by different applications.
  • the ntdll.dll file is a kind of DLL file.
  • the ntdll.dll file contains functions for calling the kernel, which can be understood as the core DLL file.
  • In the Windows operating system, when an application calls the Windows API, it accesses a series of DLL files, and the calls to the functions in these DLL files are eventually directed to ntdll.dll.
  • When the functions in the ntdll.dll file are called, the kernel is called to perform the operation corresponding to the function.
  • the ntdll.dll file is the entry point of the Windows system from ring3 to ring0. All Win32 APIs located in kernel32.dll and user32.dll are finally implemented by calling functions in ntdll.dll.
  • the function in ntdll.dll uses the SYSENTER instruction to enter ring0, and the implementation entity of the function is in ring0.
  • a shared object (so) file is a file containing a dynamic link library in the Linux operating system.
  • the SO file includes the functions that the application of the Linux operating system depends on when running based on the Linux operating system.
  • the suffix of SO files is .so.
  • SO file is a binary ELF file.
  • SO files are also called shared libraries or shared object libraries.
  • FIG. 1 is a schematic diagram of an application scenario of a malicious file detection method provided by an embodiment of the present application.
  • the network system in FIG. 1 includes a detection device.
  • the scenario shown in FIG. 1 also includes an analysis device in the cloud.
  • the network system in FIG. 1 includes a data center 1102, a core office area, an office area A, and an office area B, and their respective local area networks 1103.
  • the local area networks 1103 of the data center 1102, the core office area, the office area A, and the office area B are connected to the firewall 1105 through a switch.
  • the firewall 1105 is further connected to the wide area network or the Internet through a router 1101, a network address translation (NAT) device (not shown in the figure), a gateway device (not shown in the figure), and so on.
  • the firewall 1105 is used to isolate the network system from the wide area network or the Internet, and to perform security protection for data interacting between the network system and the wide area network or the Internet.
  • a possible deployment location of the detection device is the network exit of the network system, that is, between the firewall 1105 and the router 1101.
  • the detection device is integrated in an egress firewall, an egress router, or a bypass firewall.
  • The detection device is used to block malicious files and malicious web traffic from the Internet.
  • the detection device is any one of a gateway device, a firewall device, an intrusion detection system (IDS) type device, and an intrusion prevention system (IPS) type device.
  • the IDS category refers to monitoring the operating conditions of the network and system through software and hardware in accordance with certain security policies, and discovering various attack attempts, attack behaviors, or attack results, so as to ensure the confidentiality, integrity, and availability of network system resources.
  • the IPS category refers to monitoring the message transmission behavior of the network or network equipment, and promptly interrupting, adjusting, or isolating abnormal or harmful message transmission behavior.
  • the detection device may also be an independent sandbox device or other devices that integrate sandbox functions.
  • the detection device can be a security gateway, a firewall, and so on.
  • independent sandbox devices are usually deployed at the Internet egress of the enterprise in a bypass manner.
  • the enterprise's local area network is connected to the Internet through a gateway device or router, and the sandbox device is connected to the gateway device or router in a bypass manner.
  • the detection device is integrated in the analysis device in the cloud.
  • The detection device provides detection services to other network devices through network applications and open service ports.
  • the detection device receives a test file from another network device in the network (such as a security gateway or a firewall), performs the detection method shown in the embodiment of the present application on the test file, and returns the detection result, which indicates whether the test file is a malicious file, to the network device that provided the test file.
  • the detection device is the detection device in the network system shown in FIG. 1.
  • FIG. 2 is a schematic structural diagram of a detection device provided by an embodiment of the present application.
  • the detection device shown in FIG. 2 includes at least one processor 21, at least one memory 22, and a network interface 23.
  • the processor 21, the memory 22, and the network interface 23 are connected to each other through a bus 24.
  • the processor 21 is a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or one or more integrated circuits for implementing the solution of the present application, for example, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the above-mentioned PLD is a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the processor 21 is a single-CPU.
  • the processor 21 is a multi-core processor (multi-CPU).
  • the memory 22 includes a register and a volatile memory, such as a random-access memory (RAM); optionally, the memory includes a non-volatile memory, for example, flash memory, a hard disk drive (HDD) or a solid-state drive (SSD), cloud storage, network attached storage (NAS), or a network disk.
  • the memory may also include a combination of the above-mentioned types of memory or other media or products in any form with a storage function.
  • the memory 22 includes electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 22 exists independently and is connected to the processor 21 through the bus 24.
  • the memory 22 and the processor 21 are integrated together.
  • the network interface 23 uses any transceiver-type device to communicate with other devices or a communication network.
  • the network interface 23 includes a wired communication interface.
  • the network interface 23 also includes a wireless communication interface.
  • the wired communication interface is, for example, an Ethernet interface, such as a Gigabit Ethernet (GE for short) interface.
  • the Ethernet interface is an optical interface, an electrical interface or a combination thereof.
  • the wired communication interface is a Fiber Distributed Data Interface (FDDI) interface.
  • the wireless communication interface is a wireless local area network (WLAN) interface, a cellular network communication interface, or a combination thereof.
  • the bus 24 is used to transfer information between the above-mentioned components.
  • the bus 24 is divided into an address bus, a data bus, a control bus, and the like.
  • the bus 24 is a peripheral component interconnect express (PCIe) bus (a high-speed serial computer expansion bus standard), an Advanced Microcontroller Bus Architecture (AMBA) bus, a Huawei cache-coherent system (HCCS) bus (HCCS is a protocol standard for maintaining the consistency of service data between multiple ports), or a peripheral component interconnect (PCI) bus.
  • the detection device in FIG. 2 further includes an input and output interface 25, and the input and output interface 25 is used to connect the output device and the input device.
  • the input device communicates with the processor 21.
  • the input device receives the user's input in multiple ways.
  • the input device is a mouse, a keyboard, a touch screen device, or a sensor device.
  • the output device communicates with the processor 21.
  • the output device displays information in multiple ways.
  • the output device is a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector.
  • the hardware structure of the detection device is exemplarily introduced above, and the software architecture of the detection device is exemplarily described below.
  • the software of the detection device adopts a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture, etc.
  • the software of the detection device includes at least one functional module, and each functional module is implemented by software.
  • the functional module is generated after the processor 21 of the detection device reads the program code stored in the memory 22.
  • the software of the detection device includes a container 221, an instruction conversion module 222, and an operating system 223.
  • the container 221 is used to provide a virtual operating environment.
  • the instruction conversion module 222 is used to convert the instructions transmitted between the container 221 and the operating system 223. For example, when the container 221 sends an instruction to the operating system 223, the instruction conversion module 222 intercepts the instruction, converts the instruction, and sends the converted instruction to the operating system 223. For another example, when the operating system 223 sends an instruction to the container 221, the instruction conversion module 222 intercepts the instruction, converts the instruction, and sends the converted instruction to the container 221.
  • the instruction conversion module 222 is provided outside the container 221.
  • the software of the detection device further includes a container 224 and an instruction conversion module 225.
  • the instruction conversion module 225 is provided in the container 224.
  • the instruction conversion module 225 is used to convert the instructions transmitted between the container 224 and the operating system 223.
  • the virtual environment provided by the container 224 is different from the virtual environment provided by the container 221.
  • the container 221 provides a virtual operating environment for simulating the Windows operating system
  • the container 224 provides a virtual operating environment for simulating the Linux operating system.
  • the same detection device can use the container 221 and the container 224 to provide multiple virtual operating environments.
  • the Docker container instances in FIG. 3 are examples of the containers in FIG. 2. Please refer to FIG. 3.
  • the container 221 in Fig. 2 is Docker_1 or Docker_2 in Fig. 3.
  • the container 224 in FIG. 2 is Docker_n in FIG. 3.
  • Docker_1, Docker_2, and Docker_n are three instances of Docker containers.
  • the instruction conversion module 222 in FIG. 2 is the instruction conversion process 1 or the instruction conversion driver 1 in FIG. 3.
  • the instruction conversion module 225 in FIG. 2 is the instruction conversion process 2 or the instruction conversion driver 2 in FIG. 3.
  • the operating system 223 in FIG. 2 is the Linux operating system in FIG. 3, and the Linux operating system can run based on the ARM instruction set.
  • the container in Figure 3 can realize the function of system simulation.
  • ntdll.dll and kernel32.dll are dynamic link libraries of the virtual operating environment.
  • the behavior monitoring module is used to monitor the dynamic behavior of the test file in the container.
  • the Docker instance communicates with the operating system through the instruction conversion module.
  • the instruction conversion module serves as a communication medium between the Docker instance and the operating system. Method 1 and Method 2 below are two examples.
  • Method 1: An instruction conversion module is set outside the Docker instance. Specifically, the instruction conversion module receives the instruction generated by the Docker instance, converts the instruction, and sends the converted instruction to the operating system.
  • Docker_1 is externally provided with instruction conversion process 1 or instruction conversion driver 1. After Docker_1 generates X86 instructions, Docker_1 sends the X86 instructions to instruction conversion process 1 or instruction conversion driver 1, which receives the X86 instructions, converts them into ARM instructions, and sends the ARM instructions to the Linux operating system. After the Linux operating system executes the ARM instructions, it carries the execution result in an ARM instruction and returns it to instruction conversion process 1 or instruction conversion driver 1.
  • Instruction conversion process 1 or instruction conversion driver 1 converts ARM instructions to X86 instructions, and returns X86 instructions to Docker_1.
  • Docker_1 communicates with the Linux operating system through the instruction conversion process 1 or the instruction conversion driver 1.
  • the Docker instance and the instruction conversion module have a clear role and division of labor.
  • the function of system simulation is realized by the Docker instance, and the function of instruction conversion is realized by the instruction conversion module, thereby decoupling the two functions of system simulation and instruction conversion. It is convenient to expand, upgrade and update the functions of the system simulation separately in the future.
  • Method 2: The Docker instance contains an instruction conversion module inside.
  • the Docker instance communicates with the operating system through the internal instruction conversion module.
  • Docker_n is internally provided with instruction conversion process 2 or instruction conversion driver 2. After the system simulation part in Docker_n generates X86 instructions, it sends the X86 instructions to instruction conversion process 2 or instruction conversion driver 2, which receives the X86 instructions and converts them into ARM instructions; Docker_n then sends the ARM instructions to the Linux operating system. After the Linux operating system executes the ARM instructions, it carries the execution result in an ARM instruction and returns it to Docker_n.
  • Docker_n receives the ARM instruction, and converts the ARM instruction to X86 instruction through its own instruction conversion process 2 or instruction conversion driver 2, and the instruction conversion process 2 or instruction conversion driver 2 returns the X86 instruction to the part of the operating system simulation in Docker_n. In this way, Docker_n communicates with the Linux operating system through the instruction conversion process 2 or the instruction conversion driver 2 contained inside.
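  • The two placements of the instruction conversion module described above can be contrasted with a toy model. In the sketch below, an instruction is reduced to a label naming its instruction set plus an opaque payload; no real binary translation between X86 and ARM is performed, and all class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    isa: str       # "x86" or "arm"
    payload: str   # opaque content; real binary translation is out of scope


class InstructionConverter:
    """Toy converter: only relabels the instruction set of a message."""

    def to_arm(self, ins: Instruction) -> Instruction:
        if ins.isa != "x86":
            raise ValueError("expected an X86 instruction")
        return Instruction("arm", ins.payload)

    def to_x86(self, ins: Instruction) -> Instruction:
        if ins.isa != "arm":
            raise ValueError("expected an ARM instruction")
        return Instruction("x86", ins.payload)


class HostOS:
    """Stand-in for the ARM-based Linux host; accepts ARM instructions only."""

    def execute(self, ins: Instruction) -> Instruction:
        if ins.isa != "arm":
            raise ValueError("host only executes ARM instructions")
        return Instruction("arm", f"result({ins.payload})")


def run_external(x86_ins: Instruction, converter: InstructionConverter,
                 host: HostOS) -> Instruction:
    """Method 1: the converter sits outside the container, between it and the host."""
    arm_result = host.execute(converter.to_arm(x86_ins))
    return converter.to_x86(arm_result)


class ContainerWithConverter:
    """Method 2: the container owns its converter and talks to the host itself."""

    def __init__(self, host: HostOS) -> None:
        self._converter = InstructionConverter()
        self._host = host

    def run(self, x86_ins: Instruction) -> Instruction:
        arm_result = self._host.execute(self._converter.to_arm(x86_ins))
        return self._converter.to_x86(arm_result)


host = HostOS()
request = Instruction("x86", "open_file")
reply_method_1 = run_external(request, InstructionConverter(), host)
reply_method_2 = ContainerWithConverter(host).run(request)
print(reply_method_1, reply_method_2)
```

  • In this model both methods return the same reply; the difference lies only in which component owns the converter, which mirrors the decoupling argument given for Method 1.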
  • the hardware structure and software architecture of the detection device are introduced above, and the following exemplarily introduces the method flow of the detection device for detecting malicious files.
  • FIG. 4 is a flowchart of a method for detecting malicious files according to an embodiment of the present application.
  • FIG. 4 is described by taking the detection device as the execution subject as an example. The method includes the following steps 401 to 406.
  • Step 401: The detection device generates a virtual operating environment based on the container technology.
  • the virtual operating environment is isolated from the real operating environment of the host, so that the test files running in the virtual operating environment will not have a permanent impact on the hardware.
  • the detection device runs the test file in the virtual operating environment, and then deletes the changes produced by the running test file.
  • the virtual operating environment is implemented through container technology, and the virtual operating environment is an instance of the container. For example, please refer to Figure 3. If a virtual operating environment is generated based on Docker container technology, the virtual operating environment is provided by Docker_1, Docker_2 or Docker_n in Figure 3.
  • a virtual operating environment is generated based on the image of the container.
  • the virtual operating environment is started through the Docker daemon, and the virtual operating environment is an instance of the Docker container.
  • by using a Docker container instance, which has the advantage of being lightweight, process-level malicious file detection is realized.
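  • As one possible illustration of step 401, the sketch below builds, without executing it, a docker run command line that starts an isolated container and discards the container when it exits. The flags --rm, --name, --network, and -v are standard docker run options; the image name, the mount point, and the assumption that the image's entry point accepts the sample path as an argument are hypothetical.

```python
import shlex
from typing import List


def build_docker_run(image: str, sample_path: str, name: str) -> List[str]:
    """Build (but do not execute) a `docker run` command line for one sample."""
    return [
        "docker", "run",
        "--rm",                              # remove the container, and its changes, on exit
        "--name", name,
        "--network", "none",                 # no network access from inside the container
        "-v", f"{sample_path}:/sample:ro",   # mount the sample read-only
        image,
        "/sample",                           # argument passed to the image's entry point
    ]


cmd = build_docker_run("example/sandbox:latest", "/tmp/sample.bin", "sandbox_1")
print(shlex.join(cmd))
```

  • The resulting list could be handed to subprocess.run on a host where Docker is installed; that step is omitted so that the sketch stays runnable without Docker.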
  • step 401 is only described by taking the process of generating a virtual operating environment as an example.
  • the detection device generates multiple virtual environments.
  • the generation process of each virtual operating environment may be similar to step 401.
  • multiple sample files can be detected in parallel, thereby improving the detection efficiency.
  • different virtual operating environments are used to simulate different operating systems, for example, virtual operating environment A simulates a Windows operating system, and virtual operating environment B simulates a Linux operating system, so as to adapt to the operating requirements of sample files with different requirements.
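  • Parallel detection across several virtual operating environments can be sketched as follows. The detect function is a placeholder that performs no real analysis, and the environment names and the routing table from file type to environment are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def detect(sample_name: str, environment: str) -> dict:
    """Placeholder for running one sample in one virtual operating environment."""
    # A real implementation would start the sample in the container and monitor it.
    return {"sample": sample_name, "environment": environment, "malicious": None}


# Hypothetical routing from sample type to the environment that simulates its OS.
ENVIRONMENT_FOR = {"pe": "windows_env_A", "elf": "linux_env_B"}

samples = [("a.exe", "pe"), ("b.so", "elf"), ("c.sys", "pe")]

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(detect, name, ENVIRONMENT_FOR[kind])
               for name, kind in samples]
    # Collect results in submission order, regardless of completion order.
    results = [future.result() for future in futures]

print(results)
```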
  • Step 402: The detection device obtains a test file.
  • the detection device receives a test file input by the user.
  • the detection device collects the data stream transmitted in the network, and obtains the file carried by the data stream by reorganizing the payload of the packet contained in the data stream.
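  • Reorganizing packet payloads into a file can be sketched as below, assuming that each captured segment carries the byte offset of its payload within the stream. The Segment type is hypothetical, and the sketch ignores sequence-number wraparound and the application-layer decoding (for example HTTP or SMTP) that a real detection device would also need.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    seq: int        # byte offset of this payload within the stream
    payload: bytes


def reassemble(segments: Iterable[Segment]) -> bytes:
    """Rebuild the byte stream from segments that may be reordered or retransmitted."""
    stream = bytearray()
    for seg in sorted(segments, key=lambda s: s.seq):
        end = seg.seq + len(seg.payload)
        if end <= len(stream):
            continue                      # pure retransmission, nothing new
        if seg.seq > len(stream):
            raise ValueError(f"gap in stream at offset {len(stream)}")
        # Keep only the bytes that extend the stream (handles overlaps).
        stream.extend(seg.payload[len(stream) - seg.seq:])
    return bytes(stream)


captured = [
    Segment(5, b" worl"),
    Segment(0, b"hello"),
    Segment(0, b"hello"),   # retransmission
    Segment(8, b"rld"),     # overlaps the previous segment
]
data = reassemble(captured)
print(data)
```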
  • the detection device parses the parent file embedded with the test file to obtain the test file carried in the parent file.
  • the parent file refers to the file in which the test file is embedded.
  • the parent file contains the test file.
  • the parent file is also called the original file, or has different titles according to different manufacturers, standards, or scenarios.
  • the parent file is an email
  • the test file is an attachment carried in the email
  • the attachment is an executable file.
  • the detection device parses the mail and obtains the attachments carried in the mail.
  • the parent file is a word document
  • the test file is an executable file linked in the word document
  • the detection device parses the word document to obtain the test file.
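The email case above can be sketched with Python's standard `email` package. This is only an illustration, not the method of the embodiment: the function names `extract_attachments` and `build_demo_mail` and the sample attachment bytes are assumptions introduced here.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_attachments(raw_mail: bytes) -> list:
    """Parse a parent file (an email) and return (filename, content) pairs
    for every attachment it carries."""
    msg = message_from_bytes(raw_mail, policy=policy.default)
    found = []
    for part in msg.iter_attachments():
        name = part.get_filename() or "unnamed"
        # decode=True undoes the transfer encoding (for example base64).
        found.append((name, part.get_payload(decode=True)))
    return found


def build_demo_mail() -> bytes:
    """Build a small demo email with one attachment (hypothetical sample data)."""
    mail = EmailMessage()
    mail["Subject"] = "demo"
    mail.set_content("see attachment")
    mail.add_attachment(b"MZ\x90\x00", maintype="application",
                        subtype="octet-stream", filename="sample.exe")
    return mail.as_bytes()
```

Each returned pair could then be treated as one test file obtained in step 402.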
  • the test file is an executable file running based on the first operating system.
  • the first operating system is an operating system compatible with the test file.
  • for example, if the test file is a PE file, the first operating system is a Windows operating system; for another example, if the test file is an ELF file, the first operating system is a Linux operating system.
  • the test file is an .exe file, a .sys file, or a .com file
  • the first operating system refers to the Windows operating system.
  • the Linux operating system and the Windows operating system are examples of the first operating system.
  • the first operating system is the Android operating system (an operating system developed by Google), the iOS operating system (Apple's mobile operating system), the Mac OS operating system (Apple's desktop operating system), the BlackBerry operating system, the UNIX operating system (a multi-user, multi-tasking operating system), or the NetWare system (a network operating system launched by Novell).
  • the test file is an executable file based on these operating systems.
  • the operating systems listed above are only examples of the first operating system, and this embodiment does not limit the specific type of the first operating system.
  • the step of generating a virtual operating environment shown in step 401 is executed once, and the step of obtaining a test file shown in step 402 is executed multiple times. In other words, it is not necessary to regenerate the virtual operating environment before each test file is obtained.
  • the detection device executes the processes shown in step 402 to step 406 on multiple test files based on the first operating system simulated by the virtual operating environment.
  • Step 403 The detection device runs the test file in the virtual operating environment.
  • the detection device transfers the test file to an instance of the Docker container, and sends an instruction to start and run the test file to the instance of the Docker container.
  • the instance of the Docker container will run the test file in response to the instruction to start and run the test file, so that the test file can be run in the virtual operating environment.
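As a rough sketch of the transfer-and-run step, the two commands below use the standard Docker CLI subcommands `docker cp` and `docker exec`. The container name, the destination directory `/tmp`, and the helper name `docker_run_commands` are assumptions made for illustration; the embodiment does not specify how the instruction is delivered to the container instance.

```python
import posixpath


def docker_run_commands(container: str, local_path: str, dest_dir: str = "/tmp"):
    """Build the argv lists that copy a test file into a container instance
    and then start it there. Nothing is executed by this function."""
    dest = posixpath.join(dest_dir, posixpath.basename(local_path))
    copy_cmd = ["docker", "cp", local_path, f"{container}:{dest}"]
    exec_cmd = ["docker", "exec", container, dest]
    return copy_cmd, exec_cmd


copy_cmd, exec_cmd = docker_run_commands("Docker_2", "/samples/test.elf")
# To actually run them (requires a Docker daemon and a running container):
#   import subprocess
#   subprocess.run(copy_cmd, check=True)
#   subprocess.run(exec_cmd, check=True)
```

Building the commands separately from running them keeps the sketch testable without a Docker daemon.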
  • the detection device generates only one virtual operating environment, and sends the test file to the virtual operating environment to run. For example, if a test file is to be run by simulating a Windows operating system, a virtual operating environment for simulating the Windows operating system is generated, and the test file is fixedly sent to the virtual operating environment. In other optional embodiments, the detection device generates multiple virtual operating environments, executes the optional processes described in the following steps 4031 to 4033, and sends the test file to the virtual operating environment it needs to run.
  • Step 4031 The detection device determines the file type of the test file.
  • the detection device can determine the file type of the test file in multiple ways. For example, the detection device recognizes the file type of the test file according to the file suffix name. For another example, the detection device recognizes the file type of the test file according to the file header information. Specifically, the detection device pre-stores the data structures of the file headers (or files) of various file types. After receiving the test file, the detection device sequentially compares the file header of the test file with the data structures of the file headers of the various file types to find the data structure that the file header of the test file conforms to, and uses the file type corresponding to that data structure as the file type of the test file. Other ways for the detection device to determine the file type are not listed here.
  • Step 4032 The detection device determines the virtual operating environment required by the test file according to the file type of the test file.
  • the detection device pre-stores the mapping relationship between the file type and the virtual operating environment, and by querying the mapping relationship, the operating environment required for the operation of the test file is determined.
  • the mapping relationship is shown in Table 1.
  • Step 4033 The detection device runs the test file in the virtual operating environment required by the test file.
  • for example, if the detection device determines through suffix comparison that the suffix of a test file is ".elf", it determines that the file type of the test file is an ELF file. Then, by querying Table 1, the detection device learns that the operating environment required by the ELF file is Docker_2, and sends the test file to Docker_2 to run.
  • Docker_2 is used to simulate the operating environment of the Linux operating system.
  • the detection device determines that the test file conforms to the data structure of the file header of the PE file according to the content of the designated field in the file header of the test file, and therefore determines that the file type of the test file is a PE file. Then, the detection device queries Table 1 to know that the operating environment required by the PE file is Docker_1, and sends the test file to Docker_1 for operation. Among them, Docker_1 is used to simulate the operating environment of the Windows operating system.
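Steps 4031 to 4033 can be sketched as a header check followed by a table lookup. The signatures `MZ` (PE) and `\x7fELF` (ELF) are the well-known leading bytes of those formats. The dictionary `ENVIRONMENT_FOR_TYPE` is only a stand-in for Table 1, filled with the two pairings mentioned in the text, and the function names are assumptions introduced for this sketch.

```python
import os
from typing import Optional

PE_MAGIC = b"MZ"        # DOS header signature at the start of PE files
ELF_MAGIC = b"\x7fELF"  # identification bytes at the start of ELF files

# Stand-in for Table 1: file type -> required virtual operating environment.
ENVIRONMENT_FOR_TYPE = {"PE": "Docker_1", "ELF": "Docker_2"}

# Fallback when the header is inconclusive: file suffix -> file type.
SUFFIX_TO_TYPE = {".exe": "PE", ".elf": "ELF"}


def detect_file_type(name: str, header: bytes) -> Optional[str]:
    """Step 4031: prefer the file header, fall back to the suffix name."""
    if header.startswith(ELF_MAGIC):
        return "ELF"
    if header.startswith(PE_MAGIC):
        return "PE"
    suffix = os.path.splitext(name)[1].lower()
    return SUFFIX_TO_TYPE.get(suffix)


def select_environment(name: str, header: bytes) -> Optional[str]:
    """Step 4032: query the mapping to find the required environment."""
    return ENVIRONMENT_FOR_TYPE.get(detect_file_type(name, header))
```

A result of `None` means the sketch could not classify the file; how the detection device handles that case is not described in this section.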
  • Step 404 The detection device obtains the first API sequence called during the running of the test file.
  • the two APIs CreateFile() and WriteFile() are called in sequence during the running of the test file, and the first API sequence includes CreateFile() and WriteFile().
  • the virtual operating environment provides multiple APIs for the software operation provided by the test file.
  • the embodiment of the present application refers to the multiple APIs, required for software operation, that the virtual operating environment provides for the test file as the first API set.
  • the first API set includes multiple APIs.
  • the test file may call all APIs in the first API set; or, the test file may call only some APIs in the first API set, in which case the remaining APIs in the first API set are not called by the test file.
  • the embodiment of the present application refers to the series of APIs in the first API set that are called by the test file as the first API sequence, and refers to the series of APIs actually executed by the detection device as the second API sequence.
  • in the process of running, the test file calls only one API in the first API set, and the first API sequence includes only this one API.
  • the test file successively calls multiple APIs in the first API set, and the multiple called APIs form a first API sequence, and the first API sequence includes multiple APIs. That is, the first API sequence includes at least one API, and this embodiment does not limit whether the first API sequence includes one API or multiple APIs.
  • the first API sequence includes multiple APIs, and different APIs in the first API sequence are sorted in order of the time when they are called. For example, if the test file first calls API_1 provided by the virtual operating environment, then calls API_2 provided by the virtual operating environment, and finally calls API_3 provided by the virtual operating environment, the first API sequence is expressed as (API_1, API_2, API_3).
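The relationship between the first API set and the first API sequence can be illustrated with a small recorder that appends each called API in call order. The class name `ApiRecorder` is hypothetical, and how a real virtual operating environment intercepts calls is not described here.

```python
class ApiRecorder:
    """Records which APIs of the first API set a test file calls, in call order."""

    def __init__(self, first_api_set):
        self.first_api_set = frozenset(first_api_set)
        self.first_api_sequence = []

    def call(self, api_name):
        # Only APIs offered by the virtual operating environment can be called.
        if api_name not in self.first_api_set:
            raise KeyError(f"{api_name} is not in the first API set")
        self.first_api_sequence.append(api_name)


recorder = ApiRecorder({"CreateFile", "ReadFile", "WriteFile"})
recorder.call("CreateFile")
recorder.call("WriteFile")
# ReadFile belongs to the first API set but was never called,
# so it does not appear in the first API sequence.
```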
  • the second API set represents the multiple APIs, required for software operation, that are provided for the test file by the operating system that is compatible with the test file and simulated by the virtual operating environment (that is, the first operating system).
  • the second API set is the API provided by the first operating system for the operation of the software required by the test file.
  • the second API set includes multiple APIs.
  • the identifier of the API in the first API set is the same as the identifier of the API in the second API set.
  • the identifier of the API is the name of the API.
  • the API identifier is used to identify the corresponding API.
  • the test file calls the API in the currently running operating environment through the identifier of the API. In other words, whether the test file runs in the virtual operating environment or in the first operating system, it uses the same API identifier to call the API in the virtual operating environment or the API in the first operating system.
  • the identifier of the API for writing files is WriteFile() or fwrite().
  • taking the case where the first operating system is the Windows operating system as an example, the Windows operating system provides the second API set for the PE file.
  • the second API set includes APIs for writing files, APIs for creating files, APIs for reading files, and so on.
  • the identifier of the API for writing files is WriteFile()
  • the identifier of the API for creating files is CreateFile()
  • the identifier of the API for reading files is ReadFile().
  • the virtual operating environment provides the first API set for the PE file.
  • the first API set includes APIs for writing files, APIs for creating files, APIs for reading files, and so on.
  • the identifier of the API used to write files in the first API set is WriteFile()
  • the identifier of the API used to create files in the first API set is CreateFile()
  • the identifier of the API used to read files in the first API set is ReadFile().
  • the image of the aforementioned container may be provided to the user by the software provider.
  • the software provider encapsulates the first API set in the image.
  • the instance of the container provides the first API set for the test file running in it. For example, to simulate the operating environment provided by the Windows operating system through container technology, the software provider encapsulates in the image some APIs with the same API identifiers as those in the Windows operating system. To simulate the operating environment provided by the Linux operating system through container technology, the software provider encapsulates in the image some APIs with the same API identifiers as those in the Linux operating system.
  • Step 405 The detection device executes the second API sequence in the second operating system.
  • the second operating system is an operating system based on the computer instruction set architecture of the detection device.
  • the second operating system is the real operating environment on the detection device.
  • the processor of the detection device adopts the X86 architecture
  • the second operating system is the Windows operating system (of course, the operating system based on the X86 architecture may also be the Linux operating system; the Windows operating system is used here only for illustration).
  • the processor of the detection device adopts the ARM architecture
  • the second operating system is the Linux operating system (of course, the operating system based on the ARM architecture can also be an operating system developed by the manufacturer of the detection device, or a specific Windows series operating system, such as Windows 10).
  • the embodiment of the present application refers to the sequence of APIs actually executed in the real operating environment (i.e., the second operating system) as the second API sequence, and refers to the sequence of APIs called in the virtual operating environment generated based on the container technology as the first API sequence.
  • the second operating system may be the same as the first operating system, and the second operating system may also be different from the first operating system.
  • the embodiment of the present application takes the case where the second operating system is different from the first operating system as an example for illustration.
  • the API in the first API sequence is different from the API in the second API sequence.
  • the identifier of the API in the first API sequence may be the same as the identifier of the API in the second API sequence.
  • the detection device determines the second API sequence according to the first API sequence. By executing the second API sequence in the second operating system, the hardware of the detection device is instructed to perform the corresponding operations, so that the API sequence of the first operating system is executed in simulation and the effect of operating system simulation is realized.
  • the second API sequence includes at least one API, and the API in the second API sequence is an API in the second operating system.
  • the first API in the second API sequence has a mapping relationship with the first API in the first API sequence.
  • the first operating system is a Windows operating system.
  • the second operating system is the Linux operating system.
  • the first API sequence is (CreateFile(), ReadFile()).
  • the second API sequence is (fcreate(), fread()).
  • fcreate() in the second API sequence has a mapping relationship with CreateFile() in the first API sequence.
  • fread() in the second API sequence has a mapping relationship with ReadFile() in the first API sequence.
  • the second API sequence includes multiple APIs
  • different APIs in the second API sequence are sorted according to the order of the corresponding APIs in the first API sequence.
  • the following embodiments of this application use the form of "API+number" to simplify the expression of the above-mentioned specific APIs without introducing difficulties in understanding.
  • for example, where the specific APIs are CreateFile(), ReadFile(), fcreate(), fread() and so on, each API can be abbreviated to a form such as API_1.
  • for example, if the first API sequence is (API_1, API_2, API_3), API_4 in the second operating system has a mapping relationship with API_1, API_5 in the second operating system has a mapping relationship with API_2, and API_6 in the second operating system has a mapping relationship with API_3, then the second API sequence is (API_4, API_5, API_6).
  • the mapping relationship between the API in the first API sequence and the API in the second API sequence is constructed as follows: the functions of the APIs of the first operating system and the functions of the APIs of the second operating system are determined, APIs with similar functions in the two operating systems are identified, and a mapping relationship is established between the APIs with similar functions.
  • multiple APIs of the second operating system are used to implement one API of the first operating system, and then in the mapping relationship, the multiple APIs of the second operating system correspond to one API of the first operating system.
  • an API of the second operating system is used to implement an API of the first operating system, and in the mapping relationship, an API of the second operating system corresponds to an API of the first operating system.
  • This embodiment does not specifically limit whether the mapping relationship between APIs is a one-to-one relationship or a one-to-many relationship.
  • the mapping relationship between the first API in the first API sequence and the first API in the second API sequence is constructed as follows: if calling, in the first operating system, the API with the same identifier as the first API in the first API sequence instructs the device to perform operation A, calling the first API in the second API sequence in the second operating system instructs the device to perform operation B, and operation A is the same as operation B, then a mapping relationship between the first API in the first API sequence and the first API in the second API sequence is established.
  • for example, under the Windows operating system, the computer device instructs a file write operation by calling the WriteFile() API; under the Linux operating system, the computer device instructs a file write operation by calling the fwrite() API. Since both WriteFile() and fwrite() are used to instruct the execution of file write operations, a mapping relationship between WriteFile() and fwrite() can be established.
  • in the process of packaging the image of the container, the software provider encapsulates in the image the mapping relationship between the API in the first API set and the API in the second operating system. After the detection device generates the instance of the container based on the image, the instance of the container can access the mapping relationship between the API in the first API set and the API in the second operating system.
  • the process of determining the first API in the second API sequence includes: the detection device determines the first API from the first API sequence.
  • the detection device uses the identifier of the API in the first API sequence as an index to query the mapping relationship between the APIs to obtain the identifier of the first API in the second API sequence.
  • the detection device determines the first API in the second API sequence according to the identifier of the first API in the second API sequence. For example, referring to Table 2 below, if the first API sequence is (CreateFile(), ReadFile()), the detection device determines ReadFile() from the first API sequence.
  • the detection device uses ReadFile() as an index, queries Table 2, and determines that the first API in the second API sequence is fread().
  • in the same way, the APIs other than the first API in the second API sequence are determined, and the determined APIs are then sorted and combined to obtain the second API sequence.
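The lookup described above amounts to using each API identifier in the first API sequence as a key into a mapping table and concatenating the results in order. In the sketch below, `API_MAP` is a stand-in for Table 2 containing only the two pairings named in the text; the entry `API_A` is a purely hypothetical example of one API of the first operating system being implemented by several APIs of the second operating system.

```python
# Stand-in for Table 2: API identifier in the first API sequence ->
# identifiers of the corresponding API(s) in the second operating system.
API_MAP = {
    "CreateFile": ("fcreate",),
    "ReadFile": ("fread",),
    "API_A": ("API_B", "API_C"),  # hypothetical one-to-many entry
}


def to_second_api_sequence(first_api_sequence):
    """Map each API in call order, preserving the order of the first sequence."""
    second = []
    for api in first_api_sequence:
        second.extend(API_MAP[api])
    return tuple(second)
```

For the example in the text, `to_second_api_sequence(("CreateFile", "ReadFile"))` yields `("fcreate", "fread")`.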
  • the embodiment of the application actually executes the API in the second operating system, but does not actually execute the API in the first operating system, achieving the effect of simulating the running process of the test file in the first operating system.
  • the execution process of WriteFile() of the Windows operating system is simulated by executing fwrite() of the Linux operating system.
  • by executing fcreate() of the Linux operating system, the execution process of CreateFile() of the Windows operating system is simulated.
  • the virtual operating environment can simulate the API required for software operation provided by the first operating system for the test file. Since the request of the API called by the test file can be responded to, the purpose of operating system simulation is realized, so that the test file can run normally in the virtual operating environment, thereby getting rid of the dependence of the test file on the first operating system.
  • calling the API is implemented based on a dynamic link library. Accordingly, the process of using the API of the second operating system to simulate the API of the first operating system includes the following steps 1 to 3.
  • Step 1 The detection device obtains the corresponding function from the dynamic link library of the virtual operating environment according to each API in the first API sequence, thereby obtaining the first function sequence.
  • the code of the test file references functions and resources in the dynamic link library.
  • the testing device accesses the dynamic link library of the virtual operating environment, and obtains the first function sequence from the dynamic link library.
  • the first function sequence includes at least one function, and the functions in the first function sequence are used to implement the API in the first API sequence. For example, if the test file is a file in PE format and the API sequence called by the test file is (CreateFile(), ReadFile()), a function in the first function sequence is the function used to implement CreateFile() or the function used to implement ReadFile().
  • Step 2 The detection device obtains the corresponding function from the dynamic link library of the second operating system according to each function in the first function sequence, thereby generating the second function sequence.
  • the second function sequence includes at least one function, and the functions in the second function sequence are used to implement the API in the second API sequence.
  • the operating system of the detection device is a Linux operating system
  • the API in the second API sequence is fcreate()
  • the function in the second function sequence is a function for implementing fcreate().
  • the first function in the second function sequence has a mapping relationship with the first function in the first function sequence.
  • the function used to implement fcreate() has a mapping relationship with the function used to implement CreateFile().
  • the process of determining the first function in the second function sequence includes the detection device selecting the first function from the first function sequence.
  • the detection device uses the identifier of the first function in the first function sequence as an index to query the mapping relationship between the identifiers of the functions in the first function sequence and the identifiers of the functions in the second function sequence, to obtain the identifier of the first function in the second function sequence.
  • the detection device determines the first function in the second function sequence according to the identifier of the first function in the second function sequence.
  • functions in the second function sequence corresponding to other functions in the first function sequence are determined, for example, the second function in the second function sequence corresponding to the second function in the first function sequence and so on.
  • after the detection device obtains the function in the second function sequence corresponding to each function in the first function sequence, it combines the obtained functions to obtain the second function sequence.
  • multiple functions of the second operating system are used to implement one function of the first operating system, and then in the mapping relationship between the functions, the multiple functions of the second operating system correspond to one function of the first operating system.
  • one function of the second operating system is used to implement one function of the first operating system, and then in the mapping relationship, one function of the second operating system corresponds to one function of the first operating system.
  • This embodiment does not specifically limit whether the mapping relationship between functions is a one-to-one relationship or a one-to-many relationship.
  • the mapping relationship between the function in the first function sequence and the function in the second function sequence is encapsulated in the image, and the container is generated based on the image.
  • the instance of the container can access the mapping relationship between the function in the first function sequence and the function in the second function sequence.
  • Step 3 The detection device executes operations in the kernel of the second operating system according to the second function sequence.
  • the detection device converts the process of calling the dynamic link library of the virtual runtime environment into the process of calling the function of the second operating system, thereby using the function provided by the second operating system to simulate the function of the first operating system.
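Steps 1 to 3 can be pictured as two lookups followed by execution on the host. In this sketch, plain dictionaries stand in for the dynamic link libraries, ordinary Python file operations stand in for functions of the second operating system, and all identifiers (`vfn_create_file`, `host_create`, `run_api_sequence`, and so on) are hypothetical names chosen for illustration.

```python
import os
import tempfile

# Step 1 stand-in: the virtual environment's library maps each API in the
# first API sequence to the identifier of the function implementing it.
VIRTUAL_LIBRARY = {"CreateFile": "vfn_create_file", "ReadFile": "vfn_read_file"}


def host_create(path):
    """Host-side function: create an empty file and return its path."""
    with open(path, "wb"):
        pass
    return path


def host_read(path):
    """Host-side function: return the full content of a file."""
    with open(path, "rb") as handle:
        return handle.read()


# Step 2 stand-in: function identifier in the first function sequence ->
# function provided by the second operating system.
FUNCTION_MAP = {"vfn_create_file": host_create, "vfn_read_file": host_read}


def run_api_sequence(first_api_sequence, path):
    first_functions = [VIRTUAL_LIBRARY[api] for api in first_api_sequence]
    second_functions = [FUNCTION_MAP[name] for name in first_functions]
    # Step 3: execute the second function sequence on the host.
    return [function(path) for function in second_functions]


def demo():
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "t.bin")
        created, content = run_api_sequence(("CreateFile", "ReadFile"), target)
        return os.path.basename(created), content
```

The test file only ever names `CreateFile` and `ReadFile`; the host functions that actually run are chosen entirely by the two tables.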
  • the parameters of different function calls may have differences, for example, the encoding methods and value ranges of the parameters may have differences.
  • the detection device further converts between the parameters of different functions. For example, the detection device obtains the first type of parameters called in the first API sequence, and the detection device executes the second API sequence according to the second type of parameters in the second operating system.
  • the first type of parameters include at least one parameter
  • the parameters included in the first type of parameters are input parameters of the API in the first API sequence.
  • the second type of parameter includes at least one parameter.
  • the second type of parameter includes the input parameter of the API in the second API sequence.
  • the first parameter of the second type of parameter has a mapping relationship with the first parameter of the first type of parameter.
  • the API in the first API sequence is CreateFile()
  • the first parameter in the first type of parameters is the parameter representing the file name in CreateFile().
  • the API in the second API sequence is fcreate()
  • the first parameter in the second type of parameters is the parameter representing the file name in fcreate().
  • the detection device uses the second type of parameter as the input parameter of the second API sequence, and executes the second API sequence.
  • the detection device determines the first parameter from the first type of parameters.
  • the detection device uses the identifier of the first parameter as an index to query the mapping relationship between the parameters to obtain the identifier of the first parameter in the second type of parameters.
  • the detection device determines the first parameter in the second type of parameters according to the identifier of the first parameter in the second type of parameters.
  • the detection device determines parameters other than the first parameter in the second type of parameters in the same way, and the detection device combines the determined parameters to obtain the second type of parameters.
  • the identifier of the first parameter is used to identify the first parameter. For example, the identifier of the first parameter is the name of the first parameter.
  • in the process of packaging the image of the container, the software provider encapsulates in the image the mapping relationship between the first parameter in the first type of parameters and the first parameter in the second type of parameters. After the detection device generates an instance of the container based on the image, the instance of the container can access the mapping relationship between the first parameter in the first type of parameters and the first parameter in the second type of parameters.
  • the detection device stores the functions that implement the API in library files other than the dynamic link library, and the detection device obtains the first function sequence and the second function sequence from these other library files in the same way.
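A parameter conversion of this kind can be sketched as renaming each parameter through a mapping table and, where encodings differ, re-encoding its value. The Win32 name `lpFileName` is the file-name parameter of CreateFile; the target name `filename` for fcreate() and the choice of UTF-16LE input are assumptions made for illustration.

```python
# (API in first sequence, parameter identifier) ->
# (API in second sequence, parameter identifier); the target name is hypothetical.
PARAMETER_MAP = {
    ("CreateFile", "lpFileName"): ("fcreate", "filename"),
}


def convert_parameters(api, first_type_parameters):
    """Turn first-type parameters into second-type parameters."""
    second_type_parameters = {}
    for name, value in first_type_parameters.items():
        _, target_name = PARAMETER_MAP[(api, name)]
        if isinstance(value, bytes):
            # Assumed encoding difference: a NUL-terminated UTF-16LE string
            # on the Windows side becomes an ordinary text string.
            value = value.decode("utf-16-le").rstrip("\x00")
        second_type_parameters[target_name] = value
    return second_type_parameters


wide_name = "a.txt\x00".encode("utf-16-le")
converted = convert_parameters("CreateFile", {"lpFileName": wide_name})
```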
  • the other library file is a static library.
  • the library file is only an example of the storage method of the function that implements the API, and it does not limit that the function that implements the API must be obtained from the library file.
  • the software provider configures the storage address of the function implementing the API in the container instance in advance, and the container instance accesses the preset storage address to obtain the first function sequence and the second function sequence.
  • the test file requests to perform operations on the first resource set by calling the first API sequence during the running process, and accordingly, the detection device executes the second API sequence to perform operations on the second resource set.
  • the first resource set includes at least one resource.
  • the resources in the first resource set are objects of API operations in the first API sequence.
  • the second resource set includes at least one resource.
  • the resources in the second resource set are objects of API operations in the second API sequence, and the first resource in the second resource set has a mapping relationship with the first resource in the first resource set.
  • for example, the test file calls ReadFile() during the running process to request access to the file system of Windows.
  • the detection device maps the file system of Windows to directory A of Linux, and calls fread() to access directory A of Linux.
  • the API in the first API sequence is ReadFile(), and the resource in the first resource set is the Windows file system.
  • the API in the second API sequence is fread(), and the resource in the second resource set is Linux directory A.
  • the test file calls API_1 for network communication in Windows during the running process to request network communication through the Windows network system.
  • the testing device maps the Windows network system to the Linux protocol stack, and calls API_2 for network communication in Linux to request network communication through the Linux protocol stack.
  • the resources in the first resource set are Windows network systems
  • the resources in the second resource set are Linux protocol stacks.
  • the test file requests an input/output (IO) device managed by the Windows system to perform an operation by calling the first API sequence, and the detection device executes the second API sequence to control the IO device managed by the Linux system to perform an operation.
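The file-system case of resource mapping can be sketched as rewriting a Windows path into a path under directory A on the Linux side. The location `/sandbox/directory_a`, the per-drive subdirectory layout, and the function name are assumptions made for this sketch; the text does not say how directory A is organized.

```python
from pathlib import PurePosixPath, PureWindowsPath


def map_windows_path(win_path: str, directory_a: str = "/sandbox/directory_a") -> str:
    """Map a resource in the first resource set (a Windows path) to a
    resource in the second resource set (a path under Linux directory A)."""
    path = PureWindowsPath(win_path)
    # Drop the anchor ('C:\\') and keep the remaining components.
    parts = path.parts[1:] if path.anchor else path.parts
    if ".." in parts:
        # Refuse paths that could climb out of directory A.
        raise ValueError("parent directory references are not mapped")
    drive = path.drive.rstrip(":").lower()  # 'C:' -> 'c'
    return str(PurePosixPath(directory_a, drive, *parts))
```

With these assumptions, `C:\Users\test\a.txt` maps to `/sandbox/directory_a/c/Users/test/a.txt`, so an fread() on the Linux side only ever touches files under directory A.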
  • the instructions triggered by the test file are converted into instructions executable by the second operating system.
  • the instruction conversion process includes the following steps a to b.
  • Step a The detection device obtains the first instruction sequence triggered during the running of the test file.
  • the testing device stores the test file in the disk.
  • the form of the program in the test file is a bunch of binary codes.
  • when the detection device receives a test instruction for the test file, it responds to the test instruction by reading the program in the test file, loading the program into system memory, interpreting the program in system memory as instructions one by one, and executing the instructions to realize the corresponding functions.
  • instructions are instructions and commands that direct the work of the computer, and a program is a series of instructions arranged in a certain order, and the working process of the computer is the process of executing the instructions.
  • the test file calls the first API sequence by triggering the first instruction sequence.
  • the first instruction sequence includes at least one instruction, and each instruction in the first instruction sequence is used to instruct to call an API in the first API sequence.
  • Each instruction in the first instruction sequence belongs to a computer instruction set compatible with the operating system simulated by the virtual operating environment.
  • the virtual operating environment is used to simulate the operating environment of the Windows operating system, and each instruction in the first instruction sequence is an X86 instruction in the X86 instruction set.
  • the virtual operating environment is used to simulate the operating environment of the Linux operating system, and each instruction in the first instruction sequence is an ARM instruction in the ARM instruction set.
  • the detection device receives each instruction triggered by running the test file in the virtual operating environment, and obtains the first instruction sequence. For example, if the virtual operating environment is generated based on the Docker container technology, the detection device receives each instruction triggered by the test file in the instance of the Docker container to obtain the first instruction sequence.
  • Step b The detection device performs a first instruction conversion on the instructions in the first instruction sequence, and obtains the second instruction sequence according to the result of the first instruction conversion.
  • the first instruction conversion is used to convert the instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device. For example, if the computer instruction set of the detection device is the ARM instruction set, and the instruction set on which the first operating system is based is the X86 instruction set, the first instruction conversion refers to the process of converting X86 instructions into ARM instructions. For another example, if the computer instruction set of the detection device is the X86 instruction set, and the instruction set on which the first operating system is based is the ARM instruction set, the first instruction conversion refers to the process of converting the ARM instructions into X86 instructions.
  • the second instruction sequence includes at least one instruction.
  • Each instruction in the second instruction sequence is used to instruct to call an API in the second API sequence.
  • Each instruction in the second instruction sequence belongs to the computer instruction set of the detection device. Therefore, each instruction in the second instruction sequence can be recognized and executed by the architecture of the detection device. For example, if the computer instruction set of the detection device is the ARM instruction set, each instruction in the second instruction sequence is an ARM instruction in the ARM instruction set. If the computer instruction set of the detection device is the X86 instruction set, each instruction in the second instruction sequence is an X86 instruction in the X86 instruction set.
  • the detection device executes the second instruction sequence to implement the operation corresponding to the second API sequence.
  • For example, if each instruction in the second instruction sequence is an ARM instruction, the detection device executes each ARM instruction through the ARM CPU. Since a Linux API is implemented by executing ARM instructions, executing the ARM instructions also implements the series of operations corresponding to the Linux APIs.
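  • The first instruction conversion can be pictured with a deliberately simplified sketch. The mnemonic table, the tuple layout and the function name below are assumptions made for illustration; real binary translation works on machine code and must also handle registers, flags and memory ordering, which this toy does not attempt.

```python
# Toy illustration only: instructions are modelled as (mnemonic, operands)
# pairs, and operands are carried over unchanged as opaque strings.

# Hypothetical mapping from X86-style mnemonics to ARM-style mnemonics.
X86_TO_ARM = {
    "mov": "MOV",
    "add": "ADD",
    "sub": "SUB",
    "call": "BL",  # ARM branch-with-link plays the role of a call
}


def convert_first_to_second(first_sequence):
    """Translate every instruction of the first sequence into the host set."""
    second_sequence = []
    for mnemonic, operands in first_sequence:
        if mnemonic not in X86_TO_ARM:
            raise ValueError(f"no translation rule for {mnemonic!r}")
        second_sequence.append((X86_TO_ARM[mnemonic], operands))
    return second_sequence


first = [("mov", "reg0, 1"), ("call", "WriteFile")]
second = convert_first_to_second(first)
print(second)
```

  • Rejecting an unknown mnemonic instead of passing it through keeps the sketch honest: an instruction with no rule cannot be executed by the host CPU.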
  • the detection device runs the instruction conversion process, and performs instruction conversion on the first instruction sequence through the instruction conversion process to obtain the second instruction sequence.
  • the instruction conversion process is deployed outside the virtual operating environment.
  • an instruction conversion process may be inserted between the instance of the Docker container and the second operating system, so that the instruction conversion process is deployed outside the instance of the Docker container.
  • Alternatively, the instruction conversion process is deployed within the virtual operating environment. For example, the instruction conversion process is placed in the instance of the Docker container.
  • Instruction conversion is also called instruction translation.
  • The detection device divides the first instruction sequence into several micro-instructions, each of which is implemented by a simple piece of code; these pieces of code are then compiled into a target file, which is installed on the detection device.
  • The target file, which contains the second instruction sequence, is then executed.
  • By means of instruction conversion, if the test file is an executable file written for instruction set A and the CPU of the detection device is based on instruction set B, the detection device converts the instructions triggered by the test file from instructions in instruction set A into instructions in instruction set B. In this way, the CPU of the detection device can execute the instructions triggered by the test file and thus run the test file normally. This removes the dependence of running test files on a specific hardware environment and allows the malicious file detection scheme to be used across a wide range of hardware environments.
  • After the detection device executes the second API sequence in the second operating system, the detection device obtains a third instruction sequence, performs a second instruction conversion on each instruction in the third instruction sequence, and obtains a fourth instruction sequence according to the result of the second instruction conversion.
  • the detection device inputs the fourth instruction sequence into the virtual operating environment, so that the test file continues to run based on the fourth instruction sequence. For example, if the virtual operating environment is generated based on the Docker container technology, the detection device inputs the fourth instruction sequence into the instance of the Docker container, and continues to run the test file according to the fourth instruction sequence in the instance of the Docker container.
  • the second instruction conversion is used to convert instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based.
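  • For example, if the computer instruction set of the detection device is the ARM instruction set and the instruction set on which the first operating system is based is the X86 instruction set, the second instruction conversion refers to the process of converting ARM instructions into X86 instructions.
  • For another example, if the computer instruction set of the detection device is the X86 instruction set and the instruction set on which the first operating system is based is the ARM instruction set, the second instruction conversion refers to the process of converting X86 instructions into ARM instructions.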
  • the third instruction sequence represents the result obtained after executing the second API sequence, and the instructions in the third instruction sequence belong to the computer instruction set of the detection device.
  • the instruction set corresponding to the third instruction sequence is the same as the instruction set corresponding to the second instruction sequence.
  • the instructions in the third instruction sequence belong to the computer instruction set of the detection device.
  • the instruction set corresponding to the fourth instruction sequence is the same as the instruction set corresponding to the first instruction sequence.
  • the instructions in the fourth instruction sequence belong to the computer instruction set of the virtual operating environment.
  • For example, if the processor of the detection device is an ARM architecture processor, the result obtained is usually in the form of ARM instructions, whereas the file in the virtual operating environment needs to determine the result from X86 instructions.
  • The instruction conversion process therefore converts the ARM instructions returned by the ARM architecture processor into X86 instructions and inputs the X86 instructions into the virtual operating environment, so that the test file running in the virtual operating environment receives feedback from the processor. Based on the returned X86 instructions, the test file can determine the result of the earlier API call and continue to run according to that result.
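  • The round trip described above (guest request, first conversion, execution on the host, second conversion, guest continues) can be sketched as follows. Everything here is a stand-in chosen for illustration: the guest is a Python generator, the "instructions" are tagged tuples, and `open_file`, `write_file` and the returned values are hypothetical.

```python
def guest_program():
    """Stand-in for the test file: issues calls and continues on the results."""
    handle = yield ("x86", "open_file", "demo.txt")
    status = yield ("x86", "write_file", handle)
    return f"finished with status {status}"


def to_host(request):
    # First conversion: guest instruction set -> host instruction set.
    _, operation, argument = request
    return ("arm", operation, argument)


def host_execute(request):
    # Stand-in for executing the mapped API on the host; result is host-format.
    _, operation, _ = request
    return ("arm", 3 if operation == "open_file" else 0)


def to_guest(result):
    # Second conversion: host-format result -> guest-format result.
    _, value = result
    return ("x86", value)


def run(program):
    guest = program()
    request = next(guest)
    while True:
        _, value = to_guest(host_execute(to_host(request)))
        try:
            # Feed the converted result back so the guest can keep running.
            request = guest.send(value)
        except StopIteration as done:
            return done.value


print(run(guest_program))
```

  • The second call of the guest depends on the handle returned by the first, which mirrors why the result has to be converted back before the test file can continue.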
  • Step 406 The detection device judges whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.
  • the detection device monitors the API provided by the virtual operating environment. After monitoring that the first API sequence is called, the detection device extracts the behavior characteristics of the test file in the process of calling the first API sequence. The detection device determines whether the behavior characteristics meet the preset conditions. If the behavior characteristic meets the preset condition, the detection device determines that the test file is a malicious file. If the behavior characteristics do not meet the preset conditions, the detection device determines that the test file is not a malicious file, that is, a normal file.
  • After the detection device starts to run the test file in the virtual operating environment, the subsequent call actions of the test file are revealed only once the API currently called by the test file in the virtual operating environment has been converted into an API of the second operating system and actually executed in the second operating system; in this way the complete first API sequence is obtained. The detection device then determines whether the test file is a malicious file according to the first API sequence.
  • The detection device can monitor the APIs provided by the virtual operating environment in multiple ways.
  • When a software provider designs the program code for implementing a virtual operating environment based on container technology, it can adopt a variety of methods to enable the virtual operating environment to output the behavior characteristics of the test file in the process of calling the first API sequence.
  • When a software provider designs the first API set that supports the virtual operating environment, a piece of program code for outputting information is embedded in some or all of the APIs in the first API set.
  • the function of the program code is to output the related information that the embedded API is called when it is executed.
  • the related information of the API being called includes, but is not limited to, the identification of the API, the parameters passed in, the time of being called, and so on.
  • The software provider may embed the above-mentioned program code for outputting information in only those APIs that are of interest and that are more effective at distinguishing normal behavior from abnormal behavior.
  • The above program code for outputting information writes the information related to the call of the embedded API to a designated storage space, for example a designated log file, so that the detection device can read the data in the designated storage space and obtain the behavior characteristics of the PE file calling the first API sequence.
  • An API in the first API set therefore has two features: its identifier is the same as that of the corresponding API in the API set of the first operating system, and it outputs information about the call when it is called.
  • For example, the API for writing files in the first API set is WriteFile(), which has the same name as the file writing API of the Windows operating system. When WriteFile() in the first API set is called, it also outputs information about the call to the log file. This information includes the parameters passed in when WriteFile() is called, such as the file name, the file storage location, the content of the string to be written to the file, and the offset of the written data relative to the file header.
  • A log recording function is preset, and when a function in the first function sequence is called, the log recording function records the event. If the detection device reads the log and finds that each function in the first function sequence is recorded in it, the detection device determines that the first API sequence has been called. For example, the detection device monitors the API for setting the registry provided by the virtual operating environment, that is, it monitors the function that implements RegSetValue(). On receiving the event that the RegSetValue() function has been called, the detection device determines that the API for setting the registry has been called. By analogy, the detection device uses this monitoring mechanism to monitor each API provided by the virtual operating environment.
  • Alternatively, the detection device uses this monitoring mechanism to monitor only the key APIs among all the APIs provided by the virtual operating environment. For example, the detection device monitors the API used to set the registry, the API used to set the file system, the API used to control file permissions, the APIs used to manipulate processes and threads, and the API used to control memory.
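  • The logging-based monitoring described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `CALL_LOG` stands in for the designated log file, the decorator `logged_api` plays the role of the embedded output code, and the Python functions named `WriteFile` and `RegSetValue` are hypothetical stand-ins that perform no real file or registry operation.

```python
import functools
import json
import time

CALL_LOG = []  # stand-in for the designated log file


def logged_api(func):
    """Embed output code in an API: record identifier, parameters and time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALL_LOG.append(json.dumps({
            "api": func.__name__,
            "args": [repr(a) for a in args],
            "kwargs": {k: repr(v) for k, v in kwargs.items()},
            "time": time.time(),
        }))
        return func(*args, **kwargs)
    return wrapper


@logged_api
def WriteFile(name, data, offset=0):
    return len(data)  # hypothetical stand-in; nothing is written


@logged_api
def RegSetValue(key, value):
    return 0  # hypothetical stand-in; no registry is touched


def sequence_was_called(log_lines, expected):
    """Return True if every API identifier in `expected` appears in the log."""
    seen = {json.loads(line)["api"] for line in log_lines}
    return all(name in seen for name in expected)


WriteFile("a.txt", b"hello", offset=4)
RegSetValue("Run", "a.exe")
print(sequence_was_called(CALL_LOG, ["WriteFile", "RegSetValue"]))
```

  • Monitoring only key APIs corresponds to applying the decorator to a subset of the functions rather than to all of them.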
  • the detection device obtains the identifier of each API in the first API sequence, obtains the parameter value passed by the test file to each API, and extracts the behavior characteristics according to the identifier of each API and the parameter value passed by each API.
  • Behavior characteristics are used to represent one or more dynamic behaviors of the test file.
  • the form of the behavior feature is a sequence, and the sequence is formed by sorting the features of each dynamic behavior in the order of occurrence time.
  • the dynamic behavior of the file in the virtual operating environment includes, for example, one or more of process operations, file operations, registry operations, port access, release or loading of DLLs, and so on.
  • Process operations include one or more of creating a process and ending a process; file operations include one or more of creating files, modifying files, reading files, deleting files, and so on; registry operations include one or more of creating registry keys, modifying registry keys, querying registry keys, deleting registry keys, and so on.
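  • As an illustration of how a behavior feature sequence could be assembled from API identifiers and parameter values, the sketch below sorts call records by time and tags each with a behavior category. The record layout, the `CATEGORY` table and the sample values are assumptions made for this example, not details from the embodiment.

```python
from typing import NamedTuple


class CallRecord(NamedTuple):
    time: float    # time at which the API was called
    api: str       # identifier of the API
    params: tuple  # parameter values passed by the test file


# Hypothetical grouping of API identifiers into dynamic-behavior categories.
CATEGORY = {
    "CreateProcess": "process operation",
    "WriteFile": "file operation",
    "RegSetValue": "registry operation",
}


def extract_behavior_features(records):
    """Order the calls by occurrence time and describe each as a feature."""
    ordered = sorted(records, key=lambda record: record.time)
    return [(CATEGORY.get(r.api, "other"), r.api, r.params) for r in ordered]


records = [
    CallRecord(3.0, "RegSetValue", ("Run", "a.exe")),
    CallRecord(1.0, "CreateProcess", ("a.exe",)),
    CallRecord(2.0, "WriteFile", ("a.txt", "hello")),
]
features = extract_behavior_features(records)
for feature in features:
    print(feature)
```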
  • If the test file calls other APIs provided by the virtual operating environment during operation, whether the test file is a malicious file is likewise judged based on the behavior characteristics in the process of calling those APIs.
  • For example, during operation the test file may also call an API for network communication, an API for sending short messages, an API for operating the address book, an API for displaying pop-up windows, and so on, all provided by the virtual operating environment.
  • Correspondingly, the behavior characteristics are those of network communication, sending short messages, operating the address book, displaying pop-up windows, and so on.
  • The process by which the detection device extracts the behavior characteristics of calls to other APIs and makes a judgment is the same as described above and is not repeated here.
  • After the detection device obtains the behavior characteristics of the test file, it confirms whether the test file is a malicious file according to the obtained behavior characteristics. For example, the detection device matches the behavior characteristics of the test file against predetermined rules and confirms whether it is a malicious file according to the matching result; or it inputs the behavior characteristics of the test file into a classification model trained in advance by a machine learning algorithm and confirms whether it is a malicious file according to the output of the classification model; or the behavior characteristics of the test file are analyzed manually to confirm whether it is a malicious file.
  • For example, the detection device obtains, in the above manner, the behavior characteristics of the test file as {RegDeleteValue (parameter A), RegSetValue (parameter B), SetWindowsHook (parameter C)}.
  • These behavior characteristics indicate that the test file first deletes the startup item of the antivirus software through RegDeleteValue, then adds itself to the startup items through the registry setting function RegSetValue to become resident in the system, and then sets a global hook through the SetWindowsHook function to intercept user input data and steal sensitive information.
  • Based on this series of behavior characteristics, the detection device determines that the test file is a malicious file.
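  • Matching behavior characteristics against a predetermined rule can be sketched as an ordered-subsequence check. The rule below is modelled on the example sequence in the text; the function name, the rule name and the sample traces are assumptions introduced for illustration, and a practical rule would also inspect the parameters.

```python
def contains_in_order(observed, pattern):
    """Return True if `pattern` occurs in `observed` as an ordered subsequence."""
    remaining = iter(observed)
    # `step in remaining` consumes the iterator up to the first match,
    # so later steps can only match later calls.
    return all(step in remaining for step in pattern)


# Hypothetical rule: delete a startup item, add a new one, then set a hook.
PERSISTENCE_AND_HOOK = ["RegDeleteValue", "RegSetValue", "SetWindowsHook"]

suspicious = ["CreateFile", "RegDeleteValue", "RegSetValue",
              "WriteFile", "SetWindowsHook"]
benign = ["CreateFile", "RegSetValue", "WriteFile"]

print(contains_in_order(suspicious, PERSISTENCE_AND_HOOK))
print(contains_in_order(benign, PERSISTENCE_AND_HOOK))
```

  • Unrelated calls interleaved between the three steps do not prevent a match, while the same three calls in a different order do.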
  • different weights are set for different dynamic behaviors, and the weights indicate the degree of threat of the corresponding dynamic behaviors. For example, the greater the weight, the greater the degree of threat.
  • The detection device obtains the weight of the dynamic behavior corresponding to each behavior characteristic, and determines whether the test file is a malicious file according to the behavior characteristics and the weights.
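  • A weighted judgment of this kind might look like the following sketch. The weight values, the threshold and the rule of summing weights are all assumptions chosen for illustration; the text only states that a greater weight indicates a greater degree of threat.

```python
# Hypothetical weights: larger values indicate more threatening behavior.
WEIGHTS = {
    "RegDeleteValue": 3,
    "RegSetValue": 2,
    "SetWindowsHook": 5,
    "ReadFile": 0,
}
THRESHOLD = 8  # hypothetical cut-off for declaring a file malicious


def threat_score(apis):
    """Sum the weights of the observed behaviors; unknown APIs count as 0."""
    return sum(WEIGHTS.get(api, 0) for api in apis)


def is_malicious(apis):
    return threat_score(apis) >= THRESHOLD


print(threat_score(["RegDeleteValue", "RegSetValue", "SetWindowsHook"]))
print(is_malicious(["ReadFile", "RegSetValue"]))
```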
  • The APIs called by the test file in the virtual operating environment can be used to abstract the dynamic behavior of the test file. If the dynamic behavior of the test file is found to match the dynamic behavior of a malicious file, the test file is determined to be a malicious file, thus realizing dynamic detection of malicious files.
  • the embodiment of the present application provides a solution that can realize cross-platform dynamic detection of malicious files, and simulates the operating environment provided by an operating system compatible with test files through a virtual operating environment generated based on container technology.
  • The detection device converts the API called by the test file in the virtual operating environment into an API provided by the operating system of the detection device, and executes the converted API in the operating system of the detection device. Executing the API of the operating system of the detection device achieves the effect of simulating the execution of the API of the first operating system.
  • The virtual operating environment provided by the detection device therefore allows the test file to run normally, which removes the dependence of the test file on a specific architecture or platform (that is, the requirement that the detection device be based on a specific architecture or platform) and thus achieves cross-platform malicious file detection.
  • the container technology can avoid the resource overhead caused by Hypervisor and Guest OS, and directly use the kernel of the host to run. Since the size of the image of the container is much smaller than the size of the image of the virtual machine, the detection method of the embodiment of the present application is lighter, consumes less CPU processing resources, and occupies less memory space.
  • the detection method of the embodiment of the present application realizes the operation of malicious files at the process level, and the detection speed is faster.
  • the time-consuming and performance overhead caused by repeated resetting of the virtual machine can be avoided, and the overhead caused by operations such as the creation and scheduling of the traditional virtual machine can be avoided.
  • the detection device is a computer device based on the ARM platform.
  • the test file is a PE file.
  • The method flow described in FIG. 5 relates to how a detection device based on the ARM platform detects malicious files that use the Windows operating system to perform malicious operations. It should be understood that for steps in the embodiment of FIG. 5 that are the same as those in the embodiment of FIG. 4, please refer to the embodiment of FIG. 4; they are not repeated in the embodiment of FIG. 5.
  • FIG. 5 is a flowchart of a malicious file detection method provided by an embodiment of the present application. The method includes the following steps 501 to 505.
  • Step 501 The detection device obtains a PE file, which is a test file in the embodiment of the present application.
  • the detection device is a specific case of the detection device in the foregoing embodiment, and the operating system of the detection device is the Linux operating system.
  • the PE file is a specific case of the test file in the foregoing embodiment, the PE file is an executable file running based on the Windows operating system, and the format of the PE file is the PE format.
  • Step 502 The detection device runs the PE file in the virtual operating environment.
  • the virtual operating environment is generated based on container technology.
  • the first API set includes multiple APIs.
  • the first API set includes multiple APIs required for software operation provided by the virtual operating environment.
  • the identifier of the API in the first API set is the same as the identifier of the Windows API in the Windows API set.
  • The Windows API set includes multiple APIs, provided by the Windows operating system, that are required for running the test file.
  • The Windows API set includes multiple Windows APIs.
  • Step 503 The detection device obtains a first API sequence called during the running of the PE file, and the first API sequence includes at least one API.
  • Step 504 The detection device executes a second API sequence in the Linux operating system, the second API sequence includes at least one Linux API, and the first Linux API in the second API sequence has a mapping relationship with the first API in the first API sequence.
  • the Linux API is executed to achieve the effect of simulating the execution of the Windows API, thereby simulating the operating environment provided by the Windows system for the test file, and achieving the purpose of operating system simulation.
  • applications notify the operating system to perform corresponding functions in the form of function calls, and the functions involved in the Windows API call are usually provided by the system's dynamic link library.
  • the process of simulating the execution of the Windows API includes the following steps (1) to (3).
  • Step (1) The detection device obtains the corresponding function from the DLL file according to each API in the first API sequence.
  • the DLL file is provided by the virtual runtime environment.
  • the dynamic link library of the virtual runtime environment includes DLL files.
  • In the process of packaging the image of the container, the software provider encapsulates the DLL file in the image. The detection device generates an instance of the container based on the image, and the instance of the container provides the DLL file to the PE file running in it, so that the PE file can call the first function sequence in the DLL file.
  • the Windows operating system includes multiple DLL files, and the process of accessing the DLL file includes a process of sequentially accessing the multiple DLL files in a certain order.
  • the last DLL file accessed is the ntdll.dll file.
  • Step (2) The detection device obtains the mapped function from the SO file according to each function in the first function sequence and the mapping relationship between the functions.
  • the SO file is a file containing a dynamic link library in the Linux operating system.
  • the SO file runs on the ARM platform.
  • the SO file is provided by the virtual operating environment.
  • the dynamic link library of the virtual runtime environment includes SO files.
  • In the process of packaging the image of the container, the software provider encapsulates the SO file in the image. After the detection device generates an instance of the container based on the image, if the first function sequence is called, the detection device can access the SO file to obtain the second function sequence.
  • Step (3) In the kernel of the Linux operating system, an operation is performed according to the second function sequence.
  • For example, under the Windows operating system, when the PE file calls the API WriteFile(), it calls into the dynamic link library kernel32.dll, which in turn calls the function NtWriteFile() in the dynamic link library ntdll.dll, and finally the file writing operation is performed in the kernel of the Windows system.
  • In the process of running the test file under the Linux operating system, when the PE file calls the API WriteFile(), the detection device optionally calls the function fwrite() in the SO file, and the file writing operation is performed in the Linux kernel according to fwrite().
  • the calling process of the DLL file on the Windows operating system can be simulated.
  • the process of performing operations based on functions in the Windows kernel can be simulated.
  • For example, the file writing operation performed according to fwrite() in the kernel of the Linux system simulates the file writing operation performed according to NtWriteFile() in the kernel of the Windows operating system.
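  • Steps (1) to (3) can be pictured as two table lookups followed by a host-side call. The sketch below is a simplified model under stated assumptions: the two dictionaries stand in for the DLL file and the SO file, the function named in the text is read here as the C library's `fwrite`, and `host_fwrite` uses Python's own file API as a stand-in for that library function.

```python
import os
import tempfile

# Step (1): API in the first API sequence -> function found in the DLL file.
API_TO_DLL_FUNCTION = {"WriteFile": "NtWriteFile"}

# Step (2): DLL function -> mapped function in the SO file (assumed mapping).
DLL_TO_SO_FUNCTION = {"NtWriteFile": "fwrite"}


def host_fwrite(path, data):
    """Stand-in for the host library function; appends bytes to a file."""
    with open(path, "ab") as handle:
        return handle.write(data)


SO_FUNCTIONS = {"fwrite": host_fwrite}


def call_windows_api(api_name, *args):
    """Resolve a Windows-style API name to a host function and run it."""
    dll_function = API_TO_DLL_FUNCTION[api_name]
    so_function = DLL_TO_SO_FUNCTION[dll_function]
    # Step (3): the host function performs the operation via the host kernel.
    return SO_FUNCTIONS[so_function](*args)


with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "demo.txt")
    written = call_windows_api("WriteFile", target, b"hello")
    with open(target, "rb") as handle:
        content = handle.read()

print(written, content)
```

  • Keeping the two mappings separate mirrors the text: the DLL side preserves the Windows call chain, while the SO side decides which host function actually does the work.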
  • the instructions triggered by the test file are converted into instructions executable by the Linux operating system.
  • the instruction conversion process includes the following steps A to B.
  • Step A The detection device obtains the first instruction sequence triggered during the running of the test file.
  • Each instruction in the first instruction sequence is an X86 instruction, and each X86 instruction in the first instruction sequence instructs a call to one of the Windows APIs in the first API sequence.
  • Step B The detection device converts each X86 instruction in the first instruction sequence into an ARM instruction, and obtains a second instruction sequence according to the converted ARM instruction, and the second instruction sequence includes at least one ARM instruction.
  • Each ARM instruction in the second instruction sequence is used to instruct to call a Linux API in the second API sequence.
  • Each instruction in the second instruction sequence belongs to the ARM instruction set.
  • the detection device executes each ARM instruction in the second instruction sequence through the ARM CPU to implement the operation corresponding to each Linux API in the second API sequence, thereby simulating the Windows API through the Linux API.
  • The detection device obtains a third instruction sequence, where the third instruction sequence represents the result obtained after the second API sequence is executed, and the instructions in the third instruction sequence are ARM instructions.
  • The detection device converts each ARM instruction in the third instruction sequence into an X86 instruction and obtains a fourth instruction sequence from the converted X86 instructions; the instructions in the fourth instruction sequence are X86 instructions in the X86 instruction set.
  • the detection device inputs the fourth instruction sequence into the virtual operating environment.
  • For example, when an X86 application (APP) is obtained, the X86 APP triggers calls to the DLL files in the virtual operating environment and generates X86 instructions; the X86 instructions are converted into ARM instructions, so that they are executed through an ARM-based operating system.
  • An X86 APP is an application developed based on the X86 instruction set, and it can be packaged as a PE file.
  • Step 505 The detection device determines whether the PE file is a malicious file based on the behavior characteristics of the PE file in the process in which the first API sequence is called.
  • the malicious file detection method provided in this embodiment is implemented by software alone, for example, all are implemented in the form of a computer program product.
  • the software can be provided to users by software providers.
  • the software provider can be different from the manufacturer of the detection device.
  • the hardware of the detection device is provided by the manufacturer of the network device separately, and the software for detecting malicious files running on the detection device is provided by the service provider of the Internet application separately.
  • When the software provider designs the program code for implementing the virtual operating environment based on the container technology, it can adopt a variety of methods to enable the virtual operating environment to output the behavior characteristics of the test file in the process of calling the first API sequence.
  • a piece of program code for outputting information is embedded in some or all of the APIs in the first API set.
  • the function of the program code is to output the related information that the embedded API is called when it is executed.
  • the related information of the API being called includes, but is not limited to, the identification of the API, the parameters passed in, the time of being called, and so on.
  • The software provider may embed the above-mentioned program code for outputting information in only those APIs that are of interest and that are more effective at distinguishing normal behavior from abnormal behavior.
  • The above program code for outputting information writes the information related to the call of the embedded API to a designated storage space, for example a designated log file, so that the detection device can read the data in the designated storage space and obtain the behavior characteristics of the PE file calling the first API sequence.
  • An API in the first API set therefore has two features: its identifier is the same as that of the corresponding Windows API in the Windows API set, and it outputs information about the call when it is called.
  • For example, the API for writing files in the first API set is WriteFile(), which has the same name as the file writing API of the Windows operating system. When WriteFile() in the first API set is called, it also outputs information about the call to the log file. This information includes the parameters passed in when WriteFile() is called, such as the file name, the file storage location, the content of the string to be written to the file, and the offset of the written data relative to the file header.
  • After the detection device obtains the behavior characteristics of the PE file in the above manner, it confirms whether the PE file is a malicious file according to the obtained behavior characteristics. For example, the detection device matches the behavior characteristics of the PE file against predetermined rules and confirms whether it is a malicious file according to the matching result; or it inputs the behavior characteristics of the PE file into a classification model trained in advance by a machine learning algorithm and confirms whether it is a malicious file according to the output of the classification model; or the behavior characteristics of the PE file are analyzed manually to confirm whether it is a malicious file.
  • For example, the detection device obtains, in the above manner, the behavior characteristics of the PE file as {RegDeleteValue (parameter A), RegSetValue (parameter B), SetWindowsHook (parameter C)}.
  • These behavior characteristics indicate that the PE file first deletes the startup item of the antivirus software through RegDeleteValue, then adds itself to the startup items through the registry setting function RegSetValue to become resident in the system, and then sets a global hook through the SetWindowsHook function to intercept user input data and steal sensitive information.
  • Based on this series of behavior characteristics, the detection device judges that the PE file serving as the test file is a malicious file.
  • the process provided in the embodiment of FIG. 5 is an exemplary illustration of a solution for detecting Windows executable files by a detection device based on a non-X86 platform, and is not the only required implementation for detection devices based on a non-X86 platform to detect Windows executable files.
  • the Linux operating system of the detection device in the embodiment of FIG. 5 may be replaced with another operating system based on the ARM instruction set architecture.
  • the detection device needs to replace the SO file in the embodiment of FIG. 5 with the dynamic link library of the other operating system.
  • test files are still executable files running based on the Windows operating system.
  • When an existing detection device based on a non-X86 platform attempts to detect a test file intended for the Windows operating system, it faces an inherent obstacle because the test file cannot be executed.
  • With the method of this embodiment, the detection device based on a non-X86 platform uses the virtual operating environment generated based on container technology to simulate an operating environment similar to that of the Windows operating system, so that Windows executable files can execute normally in the virtual operating environment.
  • When the test file is an executable file compatible with the Windows operating system, such as a PE file, and the operating system of the detection device is the Linux operating system, the method provided by the embodiments of the present application can be used to dynamically detect the PE file under the Linux operating system.
  • A detection device based on a non-X86 platform can thus dynamically detect a test file based on the Windows system, without the detection device having to be based on an X86 platform compatible with the Windows operating system. This overcomes the limitations on the use of detection devices and expands the use scenarios of malicious file detection technology.
  • the detection device is a computer device based on the ARM platform.
  • the test file is a Windows EXE file.
  • the method flow described in FIG. 6 relates to how the detection device based on the ARM platform detects whether the Windows EXE file is a malicious file.
  • the method includes the following steps 601 to 607.
  • Step 601 The detection device obtains a Windows EXE file as a test file.
  • Step 602 The detection device obtains multiple Windows APIs called by the Windows EXE file in the virtual running environment.
  • Step 603 The detection device sequentially accesses the dynamic link library gdi32.dll, user32.dll, or kernel32.dll according to the multiple APIs called in step 602.
  • the dynamic link library gdi32.dll, user32.dll, or kernel32.dll contains the execution functions corresponding to the multiple APIs that are called.
  • Step 604 According to the function hit in gdi32.dll, user32.dll, or kernel32.dll (that is, the execution function corresponding to the multiple APIs called in step 603 above), the detection device accesses the kernel-level dynamic link library ntdll.dll, which contains the basic function corresponding to the hit function.
  • Step 605 The system simulation process run by the detection device determines the Linux API to which the basic function in ntdll.dll, corresponding to the function hit in step 604, is mapped.
  • Step 606 The detection device accesses the Linux library running on the device and obtains the function corresponding to the above-mentioned Linux API.
  • Step 607 The Unix kernel running on the detection device controls the Unix device to perform operations through the Unix device driver according to the function corresponding to the Linux API in step 606.
  • the malicious file detection method described in FIG. 6 of the embodiment of the present application will be illustrated by using the embodiment of FIG. 7 as an example.
  • the test file is malware.exe.
  • The word malware is a combination of the two words malicious and software. It is a general term for malicious software and refers to software programs that can threaten computers, such as viruses, worms, Trojan horses, and spyware.
  • Malware.exe is a PE file. Under the X86 platform, while malware.exe is running, different DLL files are called according to different function calls. All DLL calls eventually reach the ntdll.dll file, which enters the kernel of the Windows system to carry out the function call.
  • the detection device based on the ARM platform executes the following steps 701 to 705 to run the malware.exe and detect the malicious files contained in the malware.exe.
  • Step 701 The detection device starts malware.exe.
  • a Docker instance includes a system simulation process, for example, a Docker instance is a parent process, and a system simulation process is a child process.
  • the system simulation process is a process used to simulate the operating system of the detection device in the Docker instance.
  • the system simulation process can access the DLL file, obtain the API encapsulated in the DLL file, and perform API conversion.
  • the Docker instance can start malware.exe through the system simulation process.
  • the Docker instance loads the binary image of malware.exe into the memory space of the detection device through the system simulation process, and starts the binary image.
  • the system simulation process is used to access the DLL files and SO files required by malware.exe to ensure the normal execution of DLL calls and functions during the running of malware.exe.
  • the memory space is pre-applied by the Docker instance to the real operating system of the detection device.
  • Step 702 The detection device calls the function in the DLL file.
  • the functions in the DLL file can be used to compose the first API sequence.
  • requests for the registry, files, and system IO of the Windows operating system will be generated. These requests are notified to the Windows operating system in the form of function calls.
  • The called functions are located in the DLL file. One or more called functions can form an API.
  • the system simulation process can use the resources of the Linux operating system to simulate the resources of the Windows operating system.
  • the file system of Windows is mapped to a certain directory of Linux, so as to simulate the file system of Windows through the Linux directory.
  • the Windows network system is implemented through Linux-based protocol stack simulation.
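  • The file system mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions: the sandbox root `/opt/sandbox/drive_c` and the function name `map_windows_path` are hypothetical, and the embodiment does not specify how the mapping is implemented.

```python
from pathlib import PurePosixPath, PureWindowsPath


def map_windows_path(win_path: str, root: str = "/opt/sandbox/drive_c") -> str:
    """Map a Windows path onto a directory of the Linux host.

    The drive anchor (for example "C:\\") is dropped and the remaining
    components are appended to `root`, a hypothetical sandbox directory.
    """
    path = PureWindowsPath(win_path)
    parts = path.parts[1:] if path.anchor else path.parts
    if ".." in parts:
        # Refuse paths that could escape the mapped directory.
        raise ValueError(f"path traversal is not allowed: {win_path!r}")
    return str(PurePosixPath(root, *parts))


print(map_windows_path(r"C:\Windows\System32\kernel32.dll"))
```

  • Pure path classes are used so that the sketch runs on any host operating system without touching the real file system.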
  • Step 703 The detection device performs instruction conversion.
  • the instructions generated by malware.exe running in the Docker instance are still X86 instructions.
  • the instruction conversion process converts X86 instructions into ARM instructions.
  • Step 704 The detection device executes the ARM instruction on the Linux operating system to implement the operation indicated by the ARM instruction.
  • the Linux operating system of the detection device executes the ARM instructions through a CPU based on the ARM architecture.
  • the CPU can control other computer hardware (such as peripheral input and output devices) to execute the operation corresponding to the ARM instruction.
  • the computer hardware can return the execution result generated by the operation to the Linux operating system, and the Linux operating system returns the execution result to the instruction conversion process.
  • the instruction conversion process converts the execution result from ARM instructions to X86 instructions, and feeds back the X86 instructions to the Docker instance process to ensure that malware.exe continues to execute.
  • Step 705 The detection device makes a threat judgment based on the dynamic behavior in the calling process.
  • the detection device monitors the simulated Windows API, abstracts the behavior characteristics of malware.exe based on the called functions and parameters, and performs malicious behavior determination based on the behavior characteristics of malware.exe, thereby completing the dynamic detection of malware.exe.
  • the detection equipment based on the ARM platform realizes the dynamic detection of the files of the X86 platform on the ARM platform through the mode of operating system simulation and the mode of instruction conversion.
  • The operation of malicious files can be realized at the process level, which avoids operations such as creating and scheduling traditional virtual machines, occupies fewer resources, runs fast, and ultimately achieves the purpose of cross-platform detection of malicious files.
  • the detection device is a computer device based on an X86 platform.
  • the test file is an ELF file.
  • The method flow described in FIG. 8 relates to how a detection device based on the X86 platform detects malicious files that use the Linux operating system to perform malicious operations. It should be understood that, for steps in the embodiment in FIG. 8 that are the same as those in the embodiment in FIG. 4, reference may be made to the embodiment in FIG. 4; they are not repeated here.
  • FIG. 8 is a flowchart of a method for detecting malicious files according to an embodiment of the present application. The method includes the following steps 801 to 805.
  • Step 801 The detection device obtains an ELF file, which is a test file in the embodiment of the present application.
  • the detection device is a specific case of the detection device in the foregoing embodiment.
  • the operating system of the detection device is a Windows operating system.
  • the ELF file is a specific case of the test file in the above embodiment.
  • ELF files are executable files that run based on the Linux operating system.
  • the format of the ELF file is ELF format.
  • Step 802 The detection device runs the ELF file in the virtual operating environment.
  • the virtual operating environment is generated based on container technology.
  • the first API set includes multiple APIs.
  • the first API set includes multiple APIs required for software operation provided by the virtual operating environment.
  • the identifier of the API in the first API set is the same as the identifier of the Linux API in the Linux API set.
  • the Linux API collection includes multiple APIs required by the Linux operating system for running the software provided by the test file.
  • the Linux API collection includes multiple Linux APIs.
  • Step 803 The detection device obtains the first API sequence called during the running of the ELF file, the first API sequence includes at least one API, and the APIs in the first API sequence are APIs in the first API set.
  • Step 804 The detection device executes a second API sequence in the Windows operating system, the second API sequence includes at least one Windows API, and the first Windows API in the second API sequence has a mapping relationship with the first API in the first API sequence.
  • the Windows API is executed to achieve the effect of simulating the execution of the Linux API, thereby simulating the operating environment provided by the Linux system for the test file, and achieving the purpose of operating system simulation.
  • the process of simulating the execution of the Linux API includes the following steps 8041 to 8043.
  • Step 8041 The detection device obtains the corresponding function from the SO file according to each API in the first API sequence.
  • the SO file is provided by the virtual operating environment.
  • the dynamic link library of the virtual runtime environment includes SO files.
  • In the process of packaging the image of the container, the software provider encapsulates the SO file in the image. The detection device generates an instance of the container based on the image, and the instance of the container provides the SO file for the ELF file running in it, so that the ELF file can call the first function sequence in the SO file.
  • Step 8042 The detection device obtains the mapped function from the DLL file according to each function in the first function sequence and the mapping relationship between the functions.
  • the DLL file is provided by the virtual runtime environment.
  • the dynamic link library of the virtual runtime environment includes DLL files.
  • The software provider encapsulates the DLL file in the image, the detection device generates an instance of the container based on the image, and the instance of the container provides the DLL file for the ELF file running in it.
  • the detection device can access the DLL file to obtain the second function sequence.
  • Step 8043 In the kernel of the Windows operating system, perform an operation according to the second function sequence.
  • On the Linux operating system, the function fwrite in the SO file is called to perform the file writing operation in the kernel of the Linux system.
  • In the virtual operating environment, the detection device instead calls kernel32.dll, then further calls the function NtWriteFile() in ntdll.dll, and finally performs the file writing operation in the kernel of the Windows system.
  • the calling process of the SO file on the Linux operating system can be simulated, and the process of performing operations according to the function in the Linux kernel can be simulated by performing operations according to functions in the Windows kernel.
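  • The mapping from the first API sequence to the second API sequence (steps 8041 to 8043) can be sketched as a table lookup. This is an illustration only: the table `LINUX_TO_WINDOWS` and the function `translate_sequence` are hypothetical names, and only the pairing of fwrite with NtWriteFile() comes from the text above; the fopen entry is an assumption added to make the example less trivial.

```python
from typing import Dict, List

# Assumed mapping table between Linux library functions and Windows functions.
LINUX_TO_WINDOWS: Dict[str, str] = {
    "fwrite": "NtWriteFile",   # pairing taken from the description
    "fopen": "NtCreateFile",   # illustrative assumption, not from the source
}


def translate_sequence(first_api_sequence: List[str]) -> List[str]:
    """Translate each Linux API name into its mapped Windows function name."""
    second_api_sequence = []
    for api in first_api_sequence:
        if api not in LINUX_TO_WINDOWS:
            raise KeyError(f"no Windows mapping defined for {api!r}")
        second_api_sequence.append(LINUX_TO_WINDOWS[api])
    return second_api_sequence


print(translate_sequence(["fopen", "fwrite"]))
```

  • The lookup raises an error for an unmapped API instead of silently skipping it, because a skipped call would change the behavior that the test file observes.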
  • the instructions triggered by the test file are converted into instructions executable by the Windows operating system.
  • the instruction conversion process includes the following steps one to two.
  • Step 1 The testing device obtains the first instruction sequence triggered during the running of the test file.
  • Each instruction in the first instruction sequence is an ARM instruction, and each ARM instruction in the first instruction sequence is used to instruct a call to one of the Linux APIs in the first API sequence.
  • Step 2 The detection device converts each ARM instruction in the first instruction sequence into an X86 instruction, and obtains a second instruction sequence according to the converted X86 instruction, and the second instruction sequence includes at least one X86 instruction.
  • Each X86 instruction in the second instruction sequence is used to instruct to call a Windows API in the second API sequence.
  • Each instruction in the second instruction sequence belongs to the X86 instruction set.
  • The detection device executes each X86 instruction in the second instruction sequence through the X86 CPU to implement the operation corresponding to each Windows API in the second API sequence, thereby simulating the Linux API through the Windows API.
  • The detection device obtains a third instruction sequence, where the third instruction sequence represents the result obtained after the second API sequence is executed, and the instructions in the third instruction sequence are X86 instructions.
  • The detection device converts each X86 instruction in the third instruction sequence into an ARM instruction and obtains a fourth instruction sequence according to the converted ARM instructions, where the instructions in the fourth instruction sequence belong to the ARM instruction set.
  • the detection device inputs the fourth instruction sequence into the virtual operating environment.
  • the ARM APP when the ARM APP is obtained, the ARM APP will trigger the call to the SO file in the virtual operating environment, generate ARM instructions, convert the ARM instructions into X86 instructions, and execute the X86 instructions through the X86-based operating system.
  • ARM APP is an application developed based on the ARM instruction set.
  • the ARM APP is packaged as an ELF file.
  • Step 805 The detection device judges whether the ELF file is a malicious file based on the behavior characteristics of the ELF file during the calling process of the first API sequence.
  • the software provider when the software provider designs the program code for realizing the virtual operating environment based on the container technology, it adopts multiple methods to enable the virtual operating environment to output the behavior characteristics of the test file in the process of calling the first API sequence.
  • a piece of program code for outputting information is embedded in some or all of the APIs in the first API set.
  • the function of the program code is to output the related information that the embedded API is called when it is executed.
  • the related information of the API being called includes, but is not limited to, the identification of the API, the parameters passed in, the time of being called, and so on.
  • Alternatively, the software provider embeds the above-mentioned program code for outputting information only in the subset of APIs that are of interest and that are more effective at distinguishing normal behavior from abnormal behavior.
  • The above program code for outputting information outputs the information related to the call of the embedded API to a designated storage space, for example, to a designated log file, so that the detection device can read the data in the designated storage space and obtain the behavior characteristics of the ELF file in the process of calling the first API sequence.
  • the feature of the APIs in the first API set is that on the one hand, the identification of the API is the same as the identification of the Linux API in the Linux API set, and on the other hand, it can output the related information that is called when it is called.
  • For example, the API for writing files in the first API set is fwrite(), which has the same name as the file writing API of the Linux operating system.
  • When fwrite(), the API used to write files in the first API set, is called, it also outputs the related information of the call to the log file.
  • The related information includes the parameters passed in when fwrite() is called, such as the file name, the file storage location, the content of the string to be written to the file, and the offset of the written data relative to the file header.
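  • The embedded output code described above can be sketched as a wrapper around each monitored API. This is a minimal illustration: the decorator `logged_api`, the in-memory list `CALL_LOG` (standing in for the designated log file), and the stand-in `fwrite` with its parameter names are all assumptions made for this example.

```python
import functools
import json
import time
from typing import Any, Callable, List

# Stand-in for the designated log file; one JSON record per API call.
CALL_LOG: List[str] = []


def logged_api(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap an API so that each call records its name, parameters, and time."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        record = {
            "api": func.__name__,
            "args": [repr(a) for a in args],
            "kwargs": {k: repr(v) for k, v in kwargs.items()},
            "time": time.time(),
        }
        CALL_LOG.append(json.dumps(record))
        return func(*args, **kwargs)

    return wrapper


@logged_api
def fwrite(data: bytes, file_name: str, offset: int = 0) -> int:
    """Hypothetical stand-in for the simulated API; it writes nothing and
    only returns the number of bytes it was asked to write."""
    return len(data)


fwrite(b"abc", "report.txt", offset=16)
print(CALL_LOG[-1])
```

  • The detection device would then read these records back to assemble the behavior characteristics; embedding the wrapper only in selected APIs corresponds to the selective variant described above.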
  • After the detection device obtains the behavior characteristics of the ELF file in the above manner, it confirms whether the ELF file is a malicious file according to the obtained behavior characteristics. For example, the detection device matches the behavior characteristics of the ELF file against predetermined rules and confirms whether the file is malicious according to the matching result; or it inputs the behavior characteristics of the ELF file into a classification model generated in advance by training with a machine learning algorithm and confirms whether the file is malicious according to the output of the classification model; or the behavior characteristics of the ELF file are analyzed manually to confirm whether the file is malicious.
  • For example, the detection device obtains the behavior characteristics of the ELF file in the above manner as {RegDeleteValue (parameter A), RegSetValue (parameter B), SetLinuxHook (parameter C)}.
  • The above behavior characteristics indicate that the ELF file first deletes the startup item of the antivirus software through RegDeleteValue, then adds itself to the startup items through the registry setting function RegSetValue so as to remain resident in the system, and then sets a global hook through the SetLinuxHook function to intercept user input data and steal sensitive information.
  • the detection device judges the ELF file as the test file as a malicious file based on the series of behavior characteristics.
  • the ELF file calls the file-opening API during the running process, and passes the parameter value representing the Linux kernel symbol table to the system file API, and the behavior characteristics include fopen, proc/kallsyms, and r.
  • fopen means to open a file
  • proc/kallsyms and r means the Linux kernel symbol table. Since malicious files usually obtain ROOT permissions by opening the Linux kernel symbol table, when the behavior characteristics include fopen, proc/kallsyms, and r, the ELF file is judged to be a malicious file.
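  • The kernel symbol table rule above can be sketched as a single predicate. This is an illustration only: the function name `is_kallsyms_probe` and the path normalization are assumptions, and the sketch treats the third element `r` as the mode argument passed to fopen.

```python
def is_kallsyms_probe(api: str, path: str, mode: str) -> bool:
    """Return True if a logged call opens the Linux kernel symbol table for reading.

    The path is normalized so that both "proc/kallsyms" (as written in the
    behavior characteristics) and "/proc/kallsyms" are recognized.
    """
    normalized = "/" + path.strip().lstrip("/")
    return api == "fopen" and normalized == "/proc/kallsyms" and mode.startswith("r")


print(is_kallsyms_probe("fopen", "proc/kallsyms", "r"))
```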
  • the process provided in the embodiment of FIG. 8 is an exemplary description of a solution for detecting a Linux executable file by a detection device based on the X86 platform, and is not the only required implementation method for a detection device based on the X86 platform to detect a Linux executable file.
  • the Windows operating system in the embodiment of FIG. 8 is replaced with another operating system based on the X86 instruction set architecture.
  • the detection device needs to replace the DLL file in the embodiment of FIG. 8 with the dynamic link library of the other operating system.
  • test files are still executable files running based on the Linux operating system.
  • When an existing detection device based on the X86 platform tries to detect a test file that runs under the Linux operating system, the existing technology faces an inherent obstacle because the test file cannot be executed.
  • The detection device based on the X86 platform can simulate an operating environment similar to that provided by the Linux operating system through the virtual operating environment generated based on container technology, and thereby uses the virtual operating environment to support the normal execution of Linux executable files.
  • When the test file is an executable file compatible with the Linux operating system, such as an ELF file, and the operating system of the detection device is a Windows operating system, the ELF file is dynamically detected under the Windows operating system using the method provided by the embodiments of the present application, which removes the dependence of test file detection on the Linux operating system.
  • The detection device based on the X86 platform can therefore dynamically detect test files based on the Linux system, without requiring the detection device to be based on an ARM platform compatible with the Linux operating system. This overcomes the limitation on the use of detection devices and expands the usage scenarios of malicious file detection technology.
  • the product form of the foregoing malicious file detection method is a container application, which can provide a function of detecting malicious files.
  • the above software provider may be a cloud computing service provider.
  • a cloud computing service provider provides container applications for enterprise networks.
  • The cloud server deploys the container applications on the enterprise network, and the detection device in the enterprise network can implement the method provided in the foregoing embodiments by running the container application.
  • Container clusters are deployed in the enterprise network, and life cycle management of the container applications, such as deployment, management, scaling, upgrading, uninstallation, service discovery, and load balancing, is performed in the cloud.
  • the users of the enterprise network can use CCE to conveniently manage the container applications deployed in the enterprise network according to the needs of detecting malicious files.
  • the malicious file detection method described in FIG. 4 of the embodiment of the present application will be illustrated by using the embodiment of FIG. 9 as an example.
  • the form of the container is a container application
  • the cloud computing service provider obtains the image of the container application by operating the container service management entity.
  • the method includes the following steps 901 to 907.
  • Step 901 The container service management entity creates an image of the container application.
  • The container service management entity is a container as a service (CaaS) manager.
  • CaaS is a platform as a service (PaaS) for providing container services.
  • CaaS is located at the bottom of the PaaS layer and integrates the service capabilities of the PaaS layer and the IaaS layer.
  • CaaS includes container applications at the PaaS layer and container resources at the IaaS layer.
  • the CaaS manager is an entity used to manage container services in CaaS, and the container service management entity is used to manage and orchestrate CaaS.
  • the name CaaS Manager is just an example, and other names can also be used to refer to the entity used to manage container services in CaaS.
  • The container service management entity encapsulates the resources of the virtual operating environment based on Docker technology to obtain a Docker image. For example, it packages the registry, the DLL calls, the services, and so on to obtain the Docker image.
  • Step 902 The container service management entity sends the image of the container application to the detection device.
  • the detection device receives the image, creates a container application based on the image, and runs the container instance.
  • the Docker image library (also called Docker registry) in the Docker technology is used to send the image to the detection device.
  • the container service management entity sends a Docker image to the Docker image library, and the detection device downloads the Docker image from the Docker image library, thereby obtaining the image sent by the container service management entity.
  • The container service management entity sends the Docker image to the Docker image library through an image sending instruction (for example, docker push). The detection device sends an image download instruction (for example, the docker pull command) to the Docker image library, and the Docker image library, in response to the image download instruction, sends the Docker image to the detection device, thereby deploying the Docker image on the detection device.
  • the Docker image library is a node device used to store and distribute Docker images in the cluster.
  • the Docker image repository stores a large number of Docker images.
  • the Docker image library is implemented based on the Docker registry protocol or the Docker hub protocol.
  • the Docker image library is used to store and distribute Docker images.
  • The Docker image is stored in the Docker image library in the form of multiple image layers and one piece of image description information.
  • Step 903 The detection device obtains the test file.
  • Step 904 The detection device runs the test file in the virtual operating environment.
  • Step 905 The detection device obtains the first API sequence called during the running of the test file.
  • Step 906 The detection device executes the second API sequence in the second operating system.
  • Step 907 The detection device determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.
  • Referring to FIG. 3, the process of implementing operating system simulation and the process of behavior monitoring described in the embodiment of FIG. 9 are executed in a container application.
  • an executable file is started in each container application, and an executable file is detected by each container application.
  • Referring to FIG. 3, three container instances are created, namely container Docker_1, container Docker_2, and container Docker_n.
  • Start APP2 through the container Docker_2, and detect APP2 in the container Docker_2.
  • Start APPn through the container Docker_n, and detect APPn in the container Docker_n, so that different test files can be detected in parallel through different containers.
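  • The parallel detection of different test files in different containers can be sketched as follows. This is an illustration only: each worker thread merely stands in for one container instance (no real containers are started), and the names `detect_in_parallel` and `is_malicious` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def detect_in_parallel(
    samples: Dict[str, List[str]],
    is_malicious: Callable[[List[str]], bool],
) -> Dict[str, bool]:
    """Judge every sample concurrently, one worker per sample.

    `samples` maps a sample name (for example "APP1") to the API names
    recorded while that sample ran in its own container.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(samples))) as pool:
        futures = {
            name: pool.submit(is_malicious, trace)
            for name, trace in samples.items()
        }
        return {name: future.result() for name, future in futures.items()}


traces = {
    "APP1": ["fopen", "fwrite"],
    "APP2": ["RegSetValue", "SetWindowsHook"],
}
print(detect_in_parallel(traces, lambda trace: "SetWindowsHook" in trace))
```

  • Keeping one sample per worker mirrors the one-executable-per-container arrangement, so a crash or hang in one sample does not affect the verdict for another.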
  • the detection device is specifically a network security device such as a firewall, a router, a security gateway, and an intrusion detection device.
  • Network security equipment ensures the security of the network by detecting malicious files spreading on the network.
  • FIG. 10 is a flowchart of a network security protection method provided by an embodiment of the present application.
  • the method includes the following steps 1001 to 1007.
  • Step 1001 The network security device obtains the data stream transmitted in the network.
  • A data stream refers to a series of packets from a source host to a destination, where the destination can be another host, a multicast group containing multiple hosts, or a broadcast domain.
  • The network security device is an IDS device and obtains the data stream in bypass mode. That is, the network security device does not block the transmission of packets in the network, but uses port mirroring to copy the packets flowing through the mirrored port, obtains the mirrored packets, and parses the mirrored packets to obtain the test file.
  • Alternatively, the network security device is an IPS device and checks each passing packet in real time in in-line mode, so that when the test file carried in a packet is a malicious file, the transmission of the packet in the network is blocked.
  • Step 1002 The network security device obtains a test file from the data stream.
  • The network security device reassembles the payload data of all the packets in the data stream according to the sequence numbers of the packets, thereby obtaining the test file.
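  • The reassembly in step 1002 can be sketched as follows. This is a simplified illustration under stated assumptions: each packet is modeled as a (sequence number, payload) pair, sequence numbers are plain per-packet indices without wraparound, retransmitted duplicates are dropped, and missing packets are not detected. The function name `reassemble` is hypothetical.

```python
from typing import Dict, Iterable, Tuple


def reassemble(packets: Iterable[Tuple[int, bytes]]) -> bytes:
    """Rebuild a file from packet payloads ordered by sequence number."""
    payload_by_seq: Dict[int, bytes] = {}
    for seq, payload in packets:
        # Keep the first copy of each sequence number; ignore retransmissions.
        payload_by_seq.setdefault(seq, payload)
    return b"".join(payload_by_seq[seq] for seq in sorted(payload_by_seq))


captured = [(2, b"lo"), (1, b"hel"), (2, b"lo"), (3, b"!")]
print(reassemble(captured))
```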
  • Step 1003 The network security device runs the test file in the virtual operating environment.
  • Step 1004 The network security device obtains the first API sequence called during the running of the test file.
  • Step 1005 The network security device executes the second API sequence in the second operating system.
  • Step 1006 The network security device judges whether the test file is a malicious file based on the behavioral characteristics of the test file in the process in which the first API sequence is called.
  • Step 1007 The network security device performs intrusion prevention on the network according to the detection result.
  • the network security device is an IDS device, and if the network security device determines that the test file is a malicious file, the network security device sends an alarm message.
  • the network security device is an IPS device. If the network security device determines that the test file is a malicious file, the network security device discards the message, thereby blocking the transmission of the message and sending an alarm message.
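  • The difference between the IDS and IPS responses in step 1007 can be sketched as follows. This is an illustration only: the enum `DeviceMode`, the function `respond`, and the action names are assumptions. In particular, returning "forward" for a benign packet on an IPS device, and no action for a benign mirrored packet on an IDS device, is an interpretation of the in-line and bypass modes rather than something stated in the text.

```python
from enum import Enum
from typing import List


class DeviceMode(Enum):
    IDS = "ids"  # bypass mode: works on mirrored packets, cannot block
    IPS = "ips"  # in-line mode: sits on the path, can drop packets


def respond(mode: DeviceMode, is_malicious: bool) -> List[str]:
    """Return the actions taken for one packet after detection."""
    if not is_malicious:
        return ["forward"] if mode is DeviceMode.IPS else []
    if mode is DeviceMode.IPS:
        return ["drop", "alarm"]
    return ["alarm"]


print(respond(DeviceMode.IPS, True))
```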
  • The network security device detects the test file carried in the packets by performing cross-platform dynamic detection of malicious files and performs intrusion prevention based on the detection result, so that malicious packets transmitted in the network can be detected in time and the security of the network is improved.
  • network security devices get rid of the dependence of the detection process on the operating system by means of operating system simulation.
  • the virtual operating environment is used to simulate the Windows operating system, and the malicious files are run in the virtual operating environment to detect such malicious packets.
  • The network security device can dynamically detect both test files based on the Windows system and test files based on the Linux system, without requiring the network security device to be based on an X86 platform compatible with the Windows system or an ARM platform compatible with the Linux system. This overcomes the limitation on the use of network security devices, greatly expands the application scenarios of the network intrusion prevention method, and improves the security of the network system.
  • the network security device provided in the embodiment of FIG. 10 is applied in an enterprise network, and is deployed on the gateway device and cloud platform entrance of the enterprise network.
  • The network security device can dynamically detect malicious files by executing the method provided by the embodiments of the present application, so as to provide a solution for the network security of the enterprise network.
  • the enterprise network includes a headquarters local area network and several local area networks of branch offices.
  • the headquarters LAN includes the data center 1102, the core office area, office area A, and office area B's respective LANs.
  • the respective local area networks of the data center 1102, the core office area, the office area A, and the office area B are connected to the firewall 1105 through a switch.
  • the firewall 1105 is further connected to the wide area network or the Internet through a router 1101, a NAT device (not shown in the figure), a gateway device (not shown in the figure), and so on.
  • the firewall 1105 is used to isolate the headquarters local area network from the wide area network or the Internet, and to protect the data exchanged between the headquarters local area network and the wide area network or the Internet.
  • the headquarters local area network is connected to the local area network 1104 of each branch through a VPN, and the branch offices are branch A, branch B, and branch C as shown in FIG. 11.
  • the network security device provided in FIG. 10 is deployed in the enterprise network shown in FIG. 11.
  • the network security device is a first network security device, a second network security device, a third network security device, or a fourth network security device.
  • the first network security device is deployed at the network exit of the headquarters LAN, that is, between the firewall 1105 and the router 1101.
  • the first network security device is integrated in an exit firewall, an exit router, or a bypass firewall.
  • the first network security device is used to prevent malicious test files from the Internet and malicious web traffic.
  • the second network security device is deployed on the border of the data center 1102 of the headquarters LAN.
  • It is an independent device deployed in-line between the data center 1102 and the firewall 1105 to protect the core assets of the servers and to discover hidden attacks, malicious scanning, penetration, and the like in the internal network.
  • The third network security device is deployed on the border of the core department 1103 of the headquarters LAN, for example, as a separate device deployed in-line between the switch in the core office area and the firewall 1105, to prevent suspicious test files from being transmitted on the intranet and laterally infecting the core department.
  • The fourth network security device is deployed on the boundary of the branch LAN 1104, for example, as a separate device deployed in-line between the branch LAN and the WAN routing device, to prevent malicious test files and unknown threats from spreading freely between the branch LAN and the headquarters LAN.
  • the network security device is the first network security device in FIG. 11.
  • the test file is the file carried in the incoming or outgoing packets from the network exit of the headquarters LAN.
  • The method flow described in FIG. 12 relates to how the network security device deployed at the network exit of the headquarters LAN protects the network security of the headquarters LAN. For steps in the embodiment of FIG. 12 that are the same as those in the embodiment of FIG. 10, refer to the embodiment of FIG. 10; they are not repeated here.
  • FIG. 12 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 12, the method may include the following steps 1201 to 1206.
  • Step 1201. The first network security device collects a message that enters or exits from the network exit of the headquarters LAN, and obtains a test file carried in the message, and the test file is an executable file of the first operating system.
  • Step 1202 The first network security device runs the test file in the virtual operating environment.
  • Step 1203 The first network security device obtains the first API sequence called during the running of the test file.
  • Step 1204 The first network security device executes the second API sequence in the second operating system.
  • Step 1205 The first network security device judges whether the test file is a malicious file based on the behavioral characteristics of the test file in the process in which the first API sequence is called.
  • For specific details of steps 1202, 1203, 1204, and 1205 in the embodiment of FIG. 12, refer to steps 403, 404, 405, and 406 in the embodiment of FIG. 4, respectively.
  • Step 1206 If the test file is a malicious file, the first network security device reports that malicious traffic is detected at the network exit of the headquarters LAN.
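
Steps 1201 to 1206 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the API names, the conversion table `API_MAP`, the rule `SUSPICIOUS_PATTERN`, and every function name are hypothetical, and actually running the test file is replaced by a pre-recorded call trace.

```python
from dataclasses import dataclass

# Hypothetical mapping from APIs of the first operating system (simulated
# by the virtual operating environment) to APIs of the second operating
# system (the one actually running on the network security device).
API_MAP = {
    "CreateFileW": "open",
    "WriteFile": "write",
    "DeleteFileW": "unlink",
    "CloseHandle": "close",
}

# Hypothetical behavior rule: a write followed later by a file deletion.
SUSPICIOUS_PATTERN = ("WriteFile", "DeleteFileW")


@dataclass
class Message:
    """A message collected at the network exit (step 1201)."""
    test_file: bytes
    recorded_calls: list  # assumption: stands in for really running the file


def run_and_trace(message):
    """Steps 1202-1203: run the test file and return the first API sequence."""
    return list(message.recorded_calls)


def convert(first_api_sequence):
    """Step 1204 (conversion part): map each call to a second-OS API."""
    return [API_MAP[name] for name in first_api_sequence if name in API_MAP]


def is_malicious(first_api_sequence):
    """Step 1205: check that the suspicious calls occur in order."""
    remaining = iter(first_api_sequence)
    return all(name in remaining for name in SUSPICIOUS_PATTERN)


def inspect(message):
    """Steps 1201-1206 end to end; returns a report string or None."""
    first_seq = run_and_trace(message)
    second_seq = convert(first_seq)  # would be executed on the second OS
    if is_malicious(first_seq):
        return f"malicious traffic detected ({len(second_seq)} converted calls)"
    return None
```

The judgment here is a simple ordered-subsequence rule chosen only to keep the sketch short; the application does not prescribe this particular rule.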
  • network security equipment is deployed at the network exit of the headquarters LAN.
  • The network security equipment collects messages entering and leaving the network exit and applies the cross-platform dynamic detection scheme for malicious files to the test files carried in those messages; intrusion prevention is then performed based on the detection results.
  • The network security device uses the virtual operating environment to simulate the Windows operating system and runs the malicious file in the virtual operating environment, thereby detecting such malicious messages.
  • The network security device uses a virtual operating environment to simulate the Linux operating system and runs the malicious file in the virtual operating environment, thereby detecting such malicious messages. It can be seen that this method can effectively block malicious traffic from the Internet for the headquarters LAN and improve the network security of the headquarters LAN.
  • The malicious file detection method described in FIG. 10 of the embodiment of the present application is further illustrated by the following embodiment.
  • the network security device is the second network security device in FIG. 11.
  • the test file is the file carried in the packets entering or leaving the boundary of the data center.
  • The method flow described in FIG. 13 relates to how the network security device deployed at the boundary of the data center protects the network security of the data center. For steps in the embodiment of FIG. 13 that are the same as those in the embodiment of FIG. 10, refer to the embodiment of FIG. 10; they are not repeated here.
  • FIG. 13 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 13, the method may include the following steps 1301 to 1306.
  • Step 1301 The second network security device collects a message entering or exiting from the boundary of the data center, and obtains a test file carried by the message, and the test file is an executable file of the first operating system.
  • Step 1302 The second network security device runs the test file in the virtual operating environment.
  • Step 1303 The second network security device obtains the first API sequence called during the running of the test file.
  • Step 1304 The second network security device executes the second API sequence in the second operating system.
  • Step 1305 The second network security device judges whether the test file is a malicious file based on the behavior characteristics of the test file during the calling process of the first API sequence.
  • For specific details of steps 1302, 1303, 1304, and 1305 in the embodiment of FIG. 13, refer to steps 403, 404, 405, and 406 in the embodiment of FIG. 4, respectively.
  • Step 1306 If the test file is a malicious file, the second network security device reports that malicious traffic is detected at the boundary of the data center.
  • The network security equipment collects messages coming in and out of the network exit and applies the cross-platform dynamic detection scheme for malicious files to the test files carried in those messages; intrusion prevention is then performed based on the detection results.
  • The network security device uses the virtual operating environment to simulate the Windows operating system and runs the malicious file in the virtual operating environment, thereby detecting such malicious messages.
  • The network security device uses a virtual operating environment to simulate the Linux operating system and runs the malicious file in the virtual operating environment, thereby detecting such malicious messages. It can be seen that this method helps to find malicious files spreading in the data center's intranet, helps protect the core assets of the servers, and helps discover potential attacks, malicious scans, and infiltration in the intranet.
  • the network security device is the third network security device in FIG. 11.
  • the test file is the file carried in the message transmitted internally by the core department.
  • The method flow described in FIG. 14 relates to how the network security device deployed at the boundary of the core department protects the network security of the core department. For steps in the embodiment of FIG. 14 that are the same as those in the embodiment of FIG. 10, refer to the embodiment of FIG. 10; they are not repeated here.
  • FIG. 14 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 14, the method may include the following steps 1401 to 1406.
  • Step 1401 The third network security device collects the message transmitted internally by the core department to obtain a test file carried by the message, and the test file is an executable file of the first operating system.
  • Step 1402 The third network security device runs the test file in the virtual operating environment.
  • Step 1403 The third network security device obtains the first API sequence called during the running of the test file.
  • Step 1404 The third network security device executes the second API sequence in the second operating system.
  • Step 1405 The third network security device judges whether the test file is a malicious file based on the behavior characteristics of the test file during the calling process of the first API sequence.
  • For specific details of steps 1402, 1403, 1404, and 1405 in the embodiment of FIG. 14, refer to steps 403, 404, 405, and 406 in the embodiment of FIG. 4, respectively.
  • Step 1406 If the test file is a malicious file, the third network security device reports that malicious traffic is detected in the core department.
  • The network security equipment collects messages entering and leaving the core department and applies the cross-platform dynamic detection scheme for malicious files to the test files carried in those messages; intrusion prevention is then performed based on the detection results.
  • The network security device uses the virtual operating environment to simulate the Windows operating system and runs the malicious file in the virtual operating environment, thereby detecting such malicious messages.
  • The network security device uses the virtual operating environment to simulate the Linux operating system and runs the malicious file in the virtual operating environment, thereby detecting such malicious messages. It can be seen that this method helps to discover malicious files spreading on the intranet of the core department, and helps prevent suspicious test files from spreading on the intranet and laterally infecting the core department.
  • the network security device is the fourth network security device in FIG. 11.
  • the test file is the file carried in the incoming or outgoing packets from the boundary of the branch LAN.
  • The method flow described in FIG. 15 relates to how the network security device deployed at the boundary of the branch local area network protects the network security of the branch local area network. For steps in the embodiment of FIG. 15 that are the same as those in the embodiment of FIG. 10, refer to the embodiment of FIG. 10; they are not repeated here.
  • FIG. 15 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 15, the method may include the following steps 1501 to 1506.
  • Step 1501 The fourth network security device collects a message that enters or exits from the boundary of the branch LAN, and obtains a test file carried by the message.
  • The message includes at least one of: a message transmitted within the branch LAN, a message transmitted between the corporate headquarters and the branch LAN, a message flowing from the external network to the branch LAN, or a message flowing from the branch LAN to the external network.
  • Step 1502 the fourth network security device runs the test file in the virtual operating environment.
  • Step 1503 The fourth network security device obtains the first API sequence called during the running of the test file.
  • Step 1504 The fourth network security device executes the second API sequence in the second operating system.
  • Step 1505 The fourth network security device determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.
  • For specific details of steps 1502, 1503, 1504, and 1505 in the embodiment of FIG. 15, refer to steps 403, 404, 405, and 406 in the embodiment of FIG. 4, respectively.
  • Step 1506 If the test file is a malicious file, the fourth network security device reports that malicious traffic is detected at the boundary of the branch LAN.
  • network security equipment is deployed at the boundary of the local area network of the branch.
  • The network security equipment collects messages entering and leaving the boundary and applies the cross-platform dynamic detection scheme for malicious files to the test files carried in those messages; intrusion prevention is then performed based on the detection results.
  • The network security device uses the virtual operating environment to simulate the Windows operating system and runs the malicious file in the virtual operating environment, thereby detecting such malicious messages.
  • The network security device uses a virtual operating environment to simulate the Linux operating system and runs the malicious file in the virtual operating environment, thereby detecting such malicious messages. It can be seen that this method can effectively block malicious traffic from the headquarters LAN for the branch LAN, prevent malicious test files and unknown threats from spreading between the branch LAN and the headquarters LAN, and improve the network security of the branch LAN.
  • the above method embodiments are applied in a virtualization architecture, and the execution subject of the method embodiments is an entity corresponding to a network element in the virtualization architecture.
  • the virtualization architecture is the NFV architecture.
  • the NFV architecture includes NFV MANO and VNF.
  • NFV MANO has three main functional blocks, namely NFV orchestrator, VNF manager, and virtualized infrastructure manager (VIM).
  • the NFV orchestrator can orchestrate services and resources, control new network services and integrate VNFs into the virtual architecture.
  • the NFV orchestrator can also verify and authorize resource requests from the NFV infrastructure.
  • the VNF manager can manage the life cycle of the VNF.
  • VIM can control and manage NFV infrastructure, including computing resources, storage resources, and network resources.
  • OSS operation support system
  • BSS business support system
  • each component in FIG. 16 is as follows.
  • The network function virtualization orchestrator (NFVO) is used to manage and process the network service descriptor (NSD) and the virtual network function forwarding graph (VNFFG), to manage the life cycle of network services, and to cooperate with the virtual network function manager (VNFM) to realize life cycle management of the virtual network function (VNF) and the global view function of virtual resources.
  • The VNFM is used to manage the life cycle of the VNF, including VNF descriptor (VNFD) management, VNF instantiation, elastic scaling of VNF instances (for example, scaling out/up and/or scaling in/down), healing of VNF instances, and termination of VNF instances.
  • VNFM also supports receiving elastic scaling (scaling) policies issued by NFVO to realize automated VNF elastic scaling.
  • The virtualized infrastructure manager is mainly responsible for the management (including reservation and allocation) of hardware resources and virtualized resources of the infrastructure layer, as well as the monitoring and fault reporting of virtual resource status, and provides a virtualized resource pool for upper-layer applications.
  • Operation and business support systems refer to the existing operation and maintenance systems of operators.
  • the element manager performs traditional fault, configuration, user, performance, and security management (fault management, configuration management, account management, performance management, security management, FCAPS) functions for the VNF.
  • The virtualized network function corresponds to the physical network function (PNF) in the traditional non-virtualized network, for example, nodes of the virtualized evolved packet core (EPC) such as the mobility management entity (MME), the service gateway (SGW), and the packet data network gateway (PGW).
  • EPC virtualized packet core
  • MME mobility management entity
  • SGW service gateway
  • PGW packet data gateway
  • the VNF includes one or more VNF components (virtual network function component, VNFC) of a lower functional level.
  • The NFV infrastructure (NFVI) includes hardware resources, virtual resources and a virtualization layer. From the perspective of the VNF, the virtualization layer and the hardware resources appear to be a single entity that provides the required virtual resources.
  • the hardware resource of the NFVI is a heterogeneous system.
  • the heterogeneous system includes hardware using different types of instruction sets and architectures.
  • the hardware includes computing hardware, storage hardware, network hardware, and the like.
  • the heterogeneous system includes X86 CPU and ARM CPU.
  • the virtualization layer of NFVI is used to implement the function of operating system simulation and the function of instruction conversion described in the foregoing method embodiment.
  • the virtual resource of the NFVI includes a container, which is used to provide the virtual operating environment described in the above method embodiment, and the container is provided as a VNF.
  • the malicious file detection method described in FIG. 4 of the embodiment of the present application will be illustrated by using the embodiment of FIG. 17 as an example.
  • the malicious file detection method is applied in the NFV architecture, and the detection device is a VNF.
  • the software provider issues test files to the testing equipment through NFVO, VNFM or VIM in NFV MANO, and the testing results of the testing equipment on the test files can be returned to NFV MANO.
  • the VNF runs the container to perform malicious detection, and the container is delivered to the VNF by the NFV MANO.
  • a CaaS manager is deployed in the NFV MANO, and the CaaS manager delivers the container to the VNF.
  • other network elements in the NFV MANO deliver the container to the VNF.
  • a containerized VNF refers to a VNF created on a container, and examples of the containerized VNF include one or more VNFC instances.
  • one VNFC is mapped to one container application in the CaaS service, or one VNF is mapped to one container application in the CaaS service.
  • the following describes the process of the method for detecting malicious files based on the NFV architecture with reference to FIG. 17.
  • the method may include the following steps 1701 to 1706.
  • Step 1701 NFV MANO sends a test file to the VNF, where the test file is an executable file of the first operating system.
  • Step 1702 the VNF receives the test file, and runs the test file in the virtual operating environment.
  • Step 1703 The VNF obtains the first API sequence called during the running of the test file.
  • Step 1704 The VNF executes the second API sequence in the second operating system.
  • Step 1705 The VNF determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.
  • Step 1706 The VNF sends the detection result to the NFV MANO.
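
The exchange between NFV MANO and the VNF in steps 1701 to 1706 can be sketched as a simple request and result round trip. This is a hedged illustration only: the class names `NfvMano`, `DetectionVnf`, and `DetectionResult`, the `file_id` field, and the injected `detector` callable (which collapses steps 1702 to 1705 into one function) are assumptions introduced for the example and do not appear in the application.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DetectionResult:
    """Result that the VNF returns to NFV MANO (step 1706)."""
    file_id: str
    malicious: bool


class DetectionVnf:
    """Stands in for the VNF that hosts the container-based detector."""

    def __init__(self, detector: Callable[[bytes], bool]):
        # Assumption: steps 1702-1705 are collapsed into one callable.
        self._detector = detector

    def handle(self, file_id: str, test_file: bytes) -> DetectionResult:
        return DetectionResult(file_id, self._detector(test_file))


class NfvMano:
    """Stands in for the NFV MANO side of the exchange."""

    def __init__(self, vnf: DetectionVnf):
        self._vnf = vnf
        self.results: List[DetectionResult] = []

    def submit(self, file_id: str, test_file: bytes) -> DetectionResult:
        # Step 1701: send the test file; step 1706: receive the result.
        result = self._vnf.handle(file_id, test_file)
        self.results.append(result)
        return result
```

In a real deployment the two sides would communicate over a management interface rather than a direct method call; the direct call keeps the sketch self-contained.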
  • Each network element in the telecommunications network is virtualized into software, and each piece of software is deployed on common hardware, so as to decouple software from hardware.
  • Because the malicious file detection function no longer depends on specific hardware, dedicated hardware is not needed to implement the detection process, which matches the fundamental goal of decoupling software and hardware in NFV.
  • the function of detecting malicious files is virtualized as a VNF and applied under the virtualized architecture of NFV, thereby expanding the scenario of malicious file detection for NFV applications.
  • The malicious file detection method of the embodiment of the present application is introduced above, and the malicious file detection device of the embodiment of the present application is introduced below. It should be understood that the malicious file detection device has any function of the execution subject of the above method embodiments.
  • FIG. 18 is a schematic structural diagram of a malicious file detection device provided by an embodiment of the present application. As shown in FIG. 18, the malicious file detection device includes an acquisition module 1801, an operation module 1802, an execution module 1803, and a judgment module 1804.
  • The obtaining module 1801 is used to obtain a test file; for example, it can be used to execute step 402, step 501, step 801, step 903, step 1002, step 1201, step 1301, step 1401, step 1501, or step 1702 in the above method embodiments;
  • The running module 1802 is used to run the test file; for example, it can be used to execute step 403, step 502, step 802, step 904, step 1003, step 1202, step 1302, step 1402, step 1502, or step 1703 in the above method embodiments;
  • The obtaining module 1801 is also used to obtain the first API sequence; for example, it can be used to execute step 404, step 503, step 803, step 905, step 1004, step 1203, step 1303, step 1403, step 1503, or step 1703 in the above method embodiments;
  • The execution module 1803 is used to execute the second API sequence; for example, it can be used to execute step 405, step 504, step 804, step 906, step 1005, step 1204, step 1304, step 1404, step 1504, or step 1704;
  • The judging module 1804 is used to judge whether the test file is a malicious file; for example, it can be used to execute step 406, step 505, step 805, step 907, step 1006, step 1205, step 1305, step 1405, step 1505, or step 1705.
  • the execution module 1803 is configured to execute step one to step three in step 405.
  • the execution module 1803 is configured to execute step (1) to step (3) in step 504.
  • the execution module 1803 is configured to execute step 8041 to step 8043.
  • the execution module 1803 is configured to execute step a to step b in step 405.
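
The module division of FIG. 18 can be pictured as a device object that delegates to one callable per responsibility. The sketch below is illustrative only: the class name, the constructor parameters, and the `detect` method are hypothetical, and each module is reduced to a plain function so that the wiring between modules 1801 to 1804 is visible.

```python
from typing import Callable, List, Sequence


class MaliciousFileDetectionDevice:
    """Wires together stand-ins for modules 1801-1804 of FIG. 18."""

    def __init__(
        self,
        obtain_file: Callable[[], bytes],                      # module 1801
        run_file: Callable[[bytes], object],                   # module 1802
        obtain_api_sequence: Callable[[object], List[str]],    # module 1801
        execute: Callable[[Sequence[str]], List[str]],         # module 1803
        judge: Callable[[Sequence[str]], bool],                # module 1804
    ):
        self.obtain_file = obtain_file
        self.run_file = run_file
        self.obtain_api_sequence = obtain_api_sequence
        self.execute = execute
        self.judge = judge
        self.executed: List[str] = []

    def detect(self) -> bool:
        """Run the modules in the order used by the method embodiments."""
        test_file = self.obtain_file()
        trace = self.run_file(test_file)
        first_api_sequence = self.obtain_api_sequence(trace)
        # The execution module returns what it executed on the second OS.
        self.executed = self.execute(first_api_sequence)
        return self.judge(first_api_sequence)
```

Passing the modules in as callables mirrors the statement that the functions may be allocated to different functional modules as required.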
  • the device for detecting malicious files provided in the embodiment of FIG. 18 corresponds to the device for detecting malicious files in the foregoing method embodiments.
  • The modules in the device for detecting malicious files and the other operations and/or functions described above are used to implement the steps and methods carried out by the malicious file detection device in the method embodiments. For specific details, please refer to the foregoing method embodiments; for brevity, details are not repeated here.
  • When the device for detecting malicious files provided in the embodiment of FIG. 18 detects malicious files, the division into the above-mentioned functional modules is only used as an example. The above-mentioned functions can be allocated to different functional modules as required; that is, the internal structure of the malicious file detection device can be divided into different functional modules to complete all or part of the functions described above.
  • the malicious file detection apparatus provided in the foregoing embodiment belongs to the same concept as the foregoing malicious file detection method embodiment. For the specific implementation process, please refer to the method embodiment, which will not be repeated here.
  • the embodiments of the present application also provide a computer program product, which when the computer program product runs on a detection device, causes the detection device to execute the malicious file detection method provided in the foregoing method embodiment.
  • the embodiment of the present application also provides a chip, which when the chip runs on a detection device, causes the detection device to execute the malicious file detection method provided by the foregoing method embodiment.
  • the chip may be a general-purpose processor, the general-purpose processor includes a processing circuit and an input interface and an output interface that are internally connected and communicated with the processing circuit, and the processing circuit is used to execute the steps of obtaining the test file in the above-mentioned various method embodiments through the input interface
  • the processing circuit is used to execute the steps of running the test file, acquiring the first API sequence, executing the second API sequence, and judging whether the test file is a malicious file in the foregoing method embodiments.
  • the general-purpose processor may further include a storage medium, and the processing circuit is configured to execute the storage steps in each of the foregoing method embodiments through the storage medium.
  • the storage medium may store instructions executed by the processing circuit, and the processing circuit is configured to execute the instructions stored in the storage medium to execute the foregoing method embodiments.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the unit described as a separate component may or may not be physically separated, and the component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may also be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this application is essentially or the part that contributes to the existing technology, or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium. It includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods in the various embodiments of the present application.
  • The aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
  • The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented in software, the embodiments can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer program instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer program instructions can be passed from a website, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state drive).

Abstract

The present application relates to the technical field of computers, and provided therein are a method, apparatus and device for detecting a malicious file, and a storage medium. The present application provides a solution that may achieve cross-platform dynamic detection of a malicious file. The method comprises: generating a virtual operating environment on the basis of container technology; simulating an operating environment provided by an operating system compatible with a test file; after the test file has called an API provided by the virtual operating environment, converting the API called by the test file in the virtual operating environment to an API provided by an operating system of a detection device; and executing the converted API in the operating system of the detection device. By executing an API of an operating system of a detection device, the effect of simulating the execution of an API of a first operating system is thus achieved. Therefore, a virtual operating environment provided by the detection device may be compatible with the normal operation of a test file, thereby eliminating the dependence of the test file on a specific operating system, and thus achieving cross-platform detection of a malicious file.

Description

Method, apparatus and device for detecting malicious file, and storage medium
This application claims priority to Chinese patent application No. 202010065766.3, filed on January 20, 2020 and entitled "Malicious file detection method, apparatus, device, and storage medium", which is incorporated herein by reference in its entirety.
Technical field
This application relates to the field of computer technology, and in particular to a method, apparatus, device, and storage medium for detecting malicious files.
Background
A malicious file is a file containing a program written by a programmer with intent to attack. Malicious files can spread across a network, and a computer can receive malicious files from the network. When a computer runs a malicious file, it performs malicious operations such as information theft, infection, or extortion according to the program contained in the file, which greatly compromises the security of the network system. For this reason, malicious file detection has become a research hotspot in this field.
Dynamic behavior detection based on a virtualized operating environment is a new type of malicious file detection technology. Its advantage is that it can discover new, previously unknown malicious files. The basic principle is to use virtualization technology to generate a virtualized operating environment similar to a user's host, for example a virtual machine. Hook functions are set in the virtualized operating environment to intercept predetermined application programming interface (API) calls. The test file (that is, the file to be detected) is run in the virtualized operating environment, and the series of API calls that the test file makes while running is captured through the hook functions, which yields the dynamic behavior of the test file. Whether the test file is a malicious file is then determined from that dynamic behavior.
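The hook-based capture described above can be illustrated with a minimal Python sketch. It is not the implementation of this application: the `hook` decorator, the `call_trace` list, and the stand-in functions `create_file` and `write_file` are hypothetical names introduced only to show how wrapping an API lets every call be recorded before it is forwarded.

```python
import functools

# Hypothetical in-memory trace; a real sandbox would persist this elsewhere.
call_trace = []

def hook(api_name):
    """Wrap a function so that every call is recorded before it runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Record the API identifier and its positional arguments.
            call_trace.append((api_name, args))
            return func(*args, **kwargs)
        return wrapper
    return decorator

@hook("CreateFile")
def create_file(path):
    # Stand-in for the real API (assumption); it only returns a fake handle.
    return f"handle:{path}"

@hook("WriteFile")
def write_file(handle, data):
    # Stand-in for the real API (assumption); it reports the byte count.
    return len(data)

handle = create_file("C:\\temp\\a.txt")
write_file(handle, b"abc")

# The ordered list of intercepted API names is the observed dynamic behavior.
api_sequence = [name for name, _ in call_trace]
```

In this sketch `api_sequence` plays the role of the series of API calls obtained through the hook functions; how that series is judged malicious or benign is outside the scope of the example.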
Current detection technology has limitations in its application. For example, in terms of usage scenarios, current virtualization technology can only run on a detection device that supports the Microsoft Windows operating system; otherwise, the test file cannot run and detection cannot be performed.
Summary
The embodiments of this application provide a method, apparatus, device, and storage medium for detecting malicious files, which can address the limitations of existing malicious file detection technology to a certain extent.
In a first aspect, a method for detecting a malicious file is provided. In this method, a detection device obtains a test file, where the test file is an executable file that runs based on a first operating system; the detection device runs the test file in a virtual operating environment, where the virtual operating environment is generated based on container technology; the detection device obtains a first API sequence called by the test file during running, where the first API sequence includes at least one API, the APIs included in the first API sequence are APIs in a first API set, the first API set includes multiple APIs that are provided by the virtual operating environment and required for software to run, the identifiers of the APIs in the first API set are the same as the identifiers of the APIs in a second API set, and the second API set includes multiple APIs that are provided by the first operating system and required for software to run; the detection device executes a second API sequence in a second operating system, where the second API sequence includes at least one API, the APIs included in the second API sequence are APIs in the second operating system, a first API in the second API sequence has a mapping relationship with a first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device; and the detection device determines whether the test file is a malicious file based on behavior characteristics of the test file while the first API sequence is being called.
In the embodiments of this application, a virtual operating environment generated based on container technology simulates the operating environment provided by an operating system compatible with the test file. After the test file calls an API provided by the virtual operating environment, the detection device converts the API called by the test file in the virtual operating environment into an API provided by the operating system of the detection device, and executes the converted API in the operating system of the detection device. Executing the API of the detection device's operating system achieves the effect of simulating execution of the API of the first operating system. The virtual operating environment provided by the detection device is therefore compatible with normal running of the test file, which removes the test file's dependence on a specific architecture or platform (that is, the requirement that the detection device be based on a specific architecture or platform) and thus addresses the limitations of existing malicious file detection technology to a certain extent. Cross-platform malicious program detection can also be achieved. In addition, because the virtual operating environment is generated based on container technology, the resource overhead of a hypervisor and a guest OS is avoided and the kernel of the host is used directly. Because a container image is much smaller than a virtual machine image, the detection method of the embodiments of this application is more lightweight, consumes fewer CPU processing resources, and occupies less memory. The detection method runs the malicious program at the process level, so detection is faster. It also avoids the time and performance overhead of repeatedly resetting a virtual machine, as well as the overhead of operations such as creating and scheduling traditional virtual machines.
Optionally, the detection device executing the second API sequence in the second operating system includes: the detection device obtains, for each API in the first API sequence, the corresponding function from a dynamic link library of the virtual operating environment, thereby obtaining a first function sequence, where the functions in the first function sequence are used to implement the APIs in the first API sequence; the detection device obtains, for each function in the first function sequence, the mapped function from a dynamic link library of the second operating system, thereby generating a second function sequence, where the functions in the second function sequence are used to implement the APIs in the second API sequence, and a first function in the second function sequence has a mapping relationship with a first function in the first function sequence; and the detection device performs operations in the kernel of the second operating system according to the second function sequence.
The above process provides an optional implementation of operating system simulation. The functions that implement APIs in the virtual operating environment are encapsulated in one dynamic link library, and the functions that implement APIs in the operating system of the detection device are encapsulated in another dynamic link library. When an API sequence of the virtual operating environment is called, the different dynamic link libraries are accessed in turn to find a function sequence with similar functionality provided by the operating system of the detection device. Executing that function sequence simulates the effect of executing the API sequence of the first operating system. This achieves system simulation and allows the test file to run normally in the virtual operating environment, removing the test file's dependence on the first operating system.
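The two-stage lookup described above (API to function of the virtual operating environment, then to mapped function of the detection device's operating system) can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionaries stand in for the two dynamic link libraries, and the names `GUEST_LIBRARY`, `FUNCTION_MAP`, `guest_create_file`, `host_open`, and `host_unlink` are hypothetical. The host functions are stubs that return a tuple instead of touching the file system.

```python
# Hypothetical stand-in for the virtual environment's library:
# API identifier -> name of the function that implements it.
GUEST_LIBRARY = {
    "CreateFileA": "guest_create_file",
    "DeleteFileA": "guest_delete_file",
}

def host_open(path):
    # Stub for a host-side function (assumption); no real I/O happens.
    return ("host_open", path)

def host_unlink(path):
    # Stub for a host-side function (assumption); no real I/O happens.
    return ("host_unlink", path)

# Hypothetical mapping relationship between guest and host functions.
FUNCTION_MAP = {
    "guest_create_file": host_open,
    "guest_delete_file": host_unlink,
}

def to_first_function_sequence(api_sequence):
    """Look up, for each called API, the implementing guest function."""
    return [GUEST_LIBRARY[api] for api in api_sequence]

def to_second_function_sequence(first_functions):
    """Look up, for each guest function, the mapped host function."""
    return [FUNCTION_MAP[name] for name in first_functions]

def execute(api_calls):
    """Run the mapped host functions for a list of (api_name, args) calls."""
    names = [name for name, _ in api_calls]
    first = to_first_function_sequence(names)
    second = to_second_function_sequence(first)
    return [fn(*args) for fn, (_, args) in zip(second, api_calls)]

results = execute([
    ("CreateFileA", ("/tmp/x",)),
    ("DeleteFileA", ("/tmp/x",)),
])
```

Keeping the two lookups separate mirrors the text: the first table preserves the API identifiers the test file expects, while the second table is the only place that depends on the detection device's operating system.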
Optionally, the first operating system is a Windows operating system and the second operating system is a Linux operating system. The detection device obtaining, for each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment includes: the detection device obtains, for each API in the first API sequence, the corresponding function from a dynamic link library (DLL) file. The detection device obtaining, for each function in the first function sequence, the mapped function from the dynamic link library of the second operating system includes: the detection device obtains the mapped function from a shared object (SO) file according to each function in the first function sequence and the mapping relationship between functions.
The above process provides an optional implementation of simulating the Windows operating system under the Linux operating system. The functions that implement APIs in the virtual operating environment are encapsulated in a DLL file, and the functions that implement APIs in the Linux operating system are encapsulated in an SO file. While the test file is running, when an API sequence of the virtual operating environment is called, the DLL file and the SO file are accessed in turn to find a function sequence provided by the Linux operating system whose functionality is similar to that of the Windows operating system. Executing that function sequence simulates the effect of executing the API sequence of the Windows operating system. This achieves system simulation and allows the test file to run normally in the virtual operating environment, removing the test file's dependence on the Windows operating system. As a result, even if the test file is a PE file and the operating system of the detection device is Linux, the PE file can be dynamically detected under Linux, so detecting PE files no longer depends on the Windows operating system. A detection device based on a non-X86 platform can therefore dynamically detect test files that run on Windows, which extends the usage scenarios of malicious file detection technology.
Optionally, the first operating system is a Linux operating system and the second operating system is a Windows operating system. The detection device obtaining, for each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment includes: the detection device obtains, for each API in the first API sequence, the corresponding function from an SO file. The detection device obtaining, for each function in the first function sequence, the mapped function from the dynamic link library of the second operating system includes: the detection device obtains the mapped function from a DLL file according to each function in the first function sequence and the mapping relationship between functions.
The above process provides an optional implementation of simulating the Linux operating system under the Windows operating system. The functions that implement APIs in the virtual operating environment are encapsulated in an SO file, and the functions that implement APIs in the Windows operating system are encapsulated in a DLL file. While the test file is running, when an API sequence of the virtual operating environment is called, the SO file and the DLL file are accessed in turn to find a function sequence provided by the Windows operating system whose functionality is similar to that of the Linux operating system. Executing that function sequence simulates the effect of executing the API sequence of the Linux operating system. This achieves system simulation and allows the test file to run normally in the virtual operating environment, removing the test file's dependence on the Linux operating system. As a result, even if the test file is an ELF file and the operating system of the detection device is Windows, the ELF file can be dynamically detected under Windows, so detecting ELF files no longer depends on the Linux operating system. A detection device based on an X86 platform can therefore dynamically detect test files that run on Linux, which extends the usage scenarios of malicious file detection technology.
Optionally, the detection device executing the second API sequence in the second operating system includes: the detection device obtains first-type parameters used in the first API sequence, where the first-type parameters are input parameters of the APIs in the first API sequence; and the detection device executes, in the second operating system, the second API sequence according to second-type parameters, where the second-type parameters are input parameters of the APIs in the second API sequence, and a first parameter among the second-type parameters has a mapping relationship with a first parameter among the first-type parameters.
The above process provides an optional implementation of system simulation. Because the input parameters of different APIs may differ, when input parameters are passed in the API sequence of the virtual operating environment, they are mapped to input parameters of the APIs of the detection device's operating system. The API sequence is then executed with suitable parameters, which avoids passing incorrect parameters when the API sequence is executed.
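As an illustration of such parameter mapping, the Python sketch below translates two inputs of a Windows-style file-open call into values a POSIX-style call could accept. It is a simplified example rather than the mapping used by this application: `GENERIC_READ` and `GENERIC_WRITE` are the Win32 access-right values, `os.O_RDONLY`, `os.O_WRONLY`, and `os.O_RDWR` are standard Python constants, while the helper names, the `/sandbox` root directory, and the assumption that every path starts with a drive letter are hypothetical.

```python
import os

# Win32 generic access rights (values as defined in the Windows SDK headers).
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000

def map_access_flags(desired_access):
    """Translate a CreateFile-style access mask into POSIX open() flags."""
    can_read = bool(desired_access & GENERIC_READ)
    can_write = bool(desired_access & GENERIC_WRITE)
    if can_read and can_write:
        return os.O_RDWR
    if can_write:
        return os.O_WRONLY
    return os.O_RDONLY

def map_path(windows_path, root="/sandbox"):
    """Map 'C:\\dir\\file' onto a hypothetical directory inside the container.

    Assumes the path begins with a drive letter followed by a colon.
    """
    drive, _, rest = windows_path.partition(":")
    parts = [p for p in rest.split("\\") if p]
    return "/".join([root, drive.lower(), *parts])

flags = map_access_flags(GENERIC_READ | GENERIC_WRITE)
path = map_path("C:\\Users\\a.txt")
```

The point of the sketch is only that each input parameter of the first API has a counterpart for the second API; a complete mapping would also have to cover sharing modes, creation dispositions, and handle types.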
Optionally, the detection device executing the second API sequence in the second operating system includes: the detection device obtains a first instruction sequence triggered by the test file during running, where the first instruction sequence includes at least one instruction, and each instruction in the first instruction sequence is used to instruct a call to one API in the first API sequence; the detection device performs first instruction conversion on the instructions in the first instruction sequence and obtains a second instruction sequence from the result of the first instruction conversion, where the second instruction sequence includes at least one instruction, each instruction in the second instruction sequence is used to instruct a call to one API in the second API sequence, and the first instruction conversion is used to convert instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device; and the detection device executes the second instruction sequence to implement the operations corresponding to the second API sequence.
In this optional manner, if the test file is an executable file written with instruction set A and the CPU of the test device is a CPU of the instruction set B architecture, the test device performs instruction conversion to convert the instructions triggered by the test file from instructions in instruction set A into instructions in instruction set B. The CPU of the test device can then execute the instructions triggered by the test file and run the test file normally. This technical means therefore removes the dependence of running the test file on a specific instruction set architecture, so that the malicious file detection solution can be widely applied in various hardware environments.
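The conversion from instruction set A to instruction set B can be pictured as a table-driven translation. The Python sketch below is a toy model under loud assumptions: the opcodes (`A_LOAD`, `B_BRANCH`, and so on) are invented placeholders and do not correspond to real X86 or ARM instructions, and the rule that one source instruction may expand into several target instructions is an assumption of the sketch, not a statement from this application.

```python
# Hypothetical translation table: one instruction of set A may map to
# one or more instructions of set B. All opcodes are made-up placeholders.
TRANSLATION_TABLE = {
    "A_LOAD": ["B_LOAD"],
    "A_CALL": ["B_SAVE_RETURN", "B_BRANCH"],
    "A_RET": ["B_BRANCH_RETURN"],
}

def translate(first_sequence):
    """Convert a list of (opcode, operand) pairs from set A to set B."""
    second_sequence = []
    for opcode, operand in first_sequence:
        if opcode not in TRANSLATION_TABLE:
            raise ValueError(f"no translation for {opcode}")
        targets = TRANSLATION_TABLE[opcode]
        # Helper instructions carry no operand; the operand is attached
        # to the last emitted instruction only.
        for target in targets[:-1]:
            second_sequence.append((target, None))
        second_sequence.append((targets[-1], operand))
    return second_sequence

translated = translate([("A_LOAD", "r1"), ("A_CALL", "CreateFile")])
```

Raising an error on an unknown opcode is a deliberate choice in this sketch: silently skipping an instruction would change the behavior of the test file and could hide exactly the actions the detection is meant to observe.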
Optionally, the first operating system is a Windows operating system and the computer instruction set architecture of the detection device is the Advanced RISC Machine (ARM) architecture. The detection device performing first instruction conversion on the instructions in the first instruction sequence and obtaining the second instruction sequence from the result of the first instruction conversion includes: the detection device converts each X86 instruction in the first instruction sequence into an ARM instruction, and obtains the second instruction sequence from the ARM instructions produced by the conversion.
In this optional manner, if the test file is an executable file written with the X86 instruction set and the CPU of the test device is a CPU of the ARM instruction set architecture, the test device performs instruction conversion to convert the X86 instructions triggered by the test file into ARM instructions. The CPU of the test device can then execute the ARM instructions and run the test file normally. This technical means therefore removes the dependence of running the test file on the X86 instruction set architecture, so that the malicious file detection solution can be widely applied in ARM hardware environments.
Optionally, the first operating system is a Linux operating system and the computer instruction set architecture of the detection device is the X86 architecture. The detection device performing first instruction conversion on the instructions in the first instruction sequence and obtaining the second instruction sequence from the result of the first instruction conversion includes: the detection device converts each ARM instruction in the first instruction sequence into an X86 instruction, and obtains the second instruction sequence from the X86 instructions produced by the conversion.
In this optional manner, if the test file is an executable file written with the ARM instruction set and the CPU of the test device is a CPU of the X86 instruction set architecture, the test device performs instruction conversion to convert the ARM instructions triggered by the test file into X86 instructions. The CPU of the test device can then execute the X86 instructions and run the test file normally. This technical means therefore removes the dependence of running the test file on the ARM instruction set architecture, so that the malicious file detection solution can be widely applied in X86 hardware environments.
Optionally, after the detection device executes the second API sequence in the second operating system, the method further includes: the detection device obtains a third instruction sequence, where the third instruction sequence represents the result obtained after the second API sequence is executed, and the instructions in the third instruction sequence belong to the computer instruction set of the detection device; the detection device performs second instruction conversion on each instruction in the third instruction sequence and obtains a fourth instruction sequence from the result of the second instruction conversion, where the instructions in the fourth instruction sequence belong to the computer instruction set of the virtual operating environment, and the second instruction conversion is used to convert instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based; and the detection device inputs the fourth instruction sequence into the virtual operating environment.
In this optional manner, the execution result of the API sequence can be converted into a form compatible with the test file and returned to the test file running in the virtual operating environment, so that the test file can continue to run based on the result of the API sequence it previously called and keep exhibiting its dynamic behavior.
Optionally, the container technology includes Docker container technology, the virtual operating environment is started by a Docker daemon, and the Docker daemon is a process that the detection device runs based on the second operating system.
In this optional manner, based on Docker container technology, the virtual operating environment is started by the Docker daemon; the virtual operating environment is, for example, an instance of a Docker container. Using a Docker container has the advantage of being more lightweight and enables process-level malicious file detection.
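For illustration only, the Python sketch below assembles a `docker run` command line that asks the Docker daemon to start a short-lived container for one sample. The flags `--rm`, `--network none`, `--name`, and `--volume` with the `:ro` suffix are standard Docker CLI options; the image name, the container name, the `/sample` mount point, and the choice to disable networking are assumptions of this sketch and are not specified by this application.

```python
def docker_run_command(image, sample_path, container_name):
    """Build the argument list for starting one analysis container."""
    return [
        "docker", "run",
        "--rm",                    # remove the container when it exits
        "--network", "none",       # assumption: isolate the sample from the network
        "--name", container_name,
        "--volume", f"{sample_path}:/sample:ro",  # mount the sample read-only
        image,
        "/sample",                 # hypothetical: the image runs the mounted sample
    ]

command = docker_run_command(
    image="example/analysis-env:latest",  # hypothetical image name
    sample_path="/var/samples/test.bin",  # hypothetical host path
    container_name="analysis-001",
)
```

The list could be passed to `subprocess.run(command)` on a host where Docker is installed; the sketch stops at building the arguments so that it has no side effects.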
In a second aspect, an apparatus for detecting a malicious file is provided. The apparatus includes at least one module, and the at least one module is used to implement the method for detecting a malicious file provided in the first aspect or any optional manner of the first aspect. For specific details of the apparatus provided in the second aspect, refer to the first aspect or any optional manner of the first aspect; details are not repeated here.
In a third aspect, a detection device is provided. The detection device includes a processor, and the processor is configured to execute instructions so that the detection device performs the method for detecting a malicious file provided in the first aspect or any optional manner of the first aspect. For specific details of the detection device provided in the third aspect, refer to the first aspect or any optional manner of the first aspect; details are not repeated here.
In a fourth aspect, a detection device is provided. The detection device includes a network interface, a memory, and a processor connected to the memory.
The network interface is configured to obtain a test file, where the test file is an executable file that runs based on a first operating system.
The memory is configured to store program instructions.
The processor is configured to execute the program instructions so that the detection device performs the following operations:
running the test file in a virtual operating environment, where the virtual operating environment is generated based on container technology;
obtaining a first API sequence called by the test file during running, where the first API sequence includes at least one API, the APIs included in the first API sequence are APIs in a first API set, the first API set includes multiple APIs that are provided by the virtual operating environment and required for software to run, the identifiers of the APIs in the first API set are the same as the identifiers of the APIs in a second API set, and the second API set includes multiple APIs that are provided by the first operating system and required for software to run; executing a second API sequence in a second operating system, where the second API sequence includes at least one API, the APIs included in the second API sequence are APIs in the second operating system, a first API in the second API sequence has a mapping relationship with a first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device; and determining whether the test file is a malicious file based on behavior characteristics of the test file while the first API sequence is being called.
Optionally, the detection device provided in the fourth aspect is further configured to perform the method for detecting a malicious file provided in any optional manner of the first aspect. For specific details of the detection device provided in the fourth aspect, refer to the first aspect or any optional manner of the first aspect; details are not repeated here.
In a fifth aspect, a computer-readable storage medium is provided. The storage medium stores at least one instruction, and the instruction is read by a processor so that a detection device performs the method for detecting a malicious file provided in the first aspect or any optional manner of the first aspect.
In a sixth aspect, a computer program product is provided. When the computer program product runs on a detection device, the detection device performs the method for detecting a malicious file provided in the first aspect or any optional manner of the first aspect.
In a seventh aspect, a chip is provided. When the chip runs on a detection device, the detection device performs the method for detecting a malicious file provided in the first aspect or any optional manner of the first aspect.
Description of the drawings
FIG. 1 is a schematic diagram of a network system provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a detection device provided by an embodiment of the present application;
FIG. 3 is a diagram of a logical functional architecture for detecting malicious files provided by an embodiment of the present application;
FIG. 4 is a flowchart of a malicious file detection method provided by an embodiment of the present application;
FIG. 5 is a flowchart of a malicious file detection method provided by an embodiment of the present application;
FIG. 6 is a flowchart of a malicious file detection method provided by an embodiment of the present application;
FIG. 7 is a flowchart of a malicious file detection method provided by an embodiment of the present application;
FIG. 8 is a flowchart of a malicious file detection method provided by an embodiment of the present application;
FIG. 9 is a flowchart of a malicious file detection method provided by an embodiment of the present application;
FIG. 10 is a flowchart of a network security protection method provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of an enterprise network provided by an embodiment of the present application;
FIG. 12 is a flowchart of a network security protection method provided by an embodiment of the present application;
FIG. 13 is a flowchart of a network security protection method provided by an embodiment of the present application;
FIG. 14 is a flowchart of a network security protection method provided by an embodiment of the present application;
FIG. 15 is a flowchart of a network security protection method provided by an embodiment of the present application;
FIG. 16 is a schematic diagram of a virtualization architecture provided by an embodiment of the present application;
FIG. 17 is a flowchart of a malicious file detection method provided by an embodiment of the present application;
FIG. 18 is a schematic structural diagram of a malicious file detection apparatus provided by an embodiment of the present application.
Detailed description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described below in further detail with reference to the accompanying drawings.
In this application, terms such as "first" and "second" are used to distinguish between identical or similar items whose roles and functions are basically the same. It should be understood that there is no logical or temporal dependency among "first", "second", and "nth", and that these terms do not limit quantity or execution order.
In this application, the term "at least one" means one or more, and the term "multiple" means two or more. For example, multiple second messages means two or more second messages. The terms "system" and "network" are often used interchangeably herein.
It should also be understood that the term "if" can be interpreted to mean "when" or "upon", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined..." or "if [the stated condition or event] is detected" can be interpreted to mean "upon determining...", "in response to determining...", "upon detecting [the stated condition or event]", or "in response to detecting [the stated condition or event]".
The technical terms used in this application are introduced below.
An instruction set is the complete set of instructions that a processor can recognize. Instruction sets include complex instruction set computing (CISC) instruction sets and reduced instruction set computing (RISC) instruction sets.
X86 is an instruction set executed by microprocessors. X86 is an abbreviated designation for a family of Intel general-purpose processors, and it also identifies a common set of computer instructions. X86 is a CISC instruction set. The name X86 derives from the Intel 8086 central processing unit introduced in 1978.
The X86 architecture usually refers to processors that are compatible with the X86 instruction set. The X86 architecture is widely used in personal computers (PCs).
The X86 platform generally refers to hardware devices based on the X86 architecture. Such a hardware device uses an Intel processor or another processor compatible with the X86 instruction set. In addition, such a hardware device usually has the Microsoft Windows operating system installed. For example, the hardware device is an X86 server.
ARM is a RISC instruction set, originally 32-bit. The ARM architecture usually refers to processors that are compatible with the ARM instruction set. The ARM architecture is widely used in mobile terminals.
The ARM platform generally refers to hardware devices based on the ARM architecture. Such a hardware device uses a processor compatible with the ARM instruction set. In addition, such a hardware device usually has the Linux operating system installed (a free-to-use and freely distributed Unix-like operating system).
An executable file is a file that can be loaded and executed by an operating system. Optionally, under the Windows operating system, executable files include portable executable (PE) files, and PE files include file types such as .exe, .sys, and .com files. Here, "." is the separator between the file name and the extension; for example, a .exe file is a file whose extension is exe. Under the Linux operating system, executable files include Executable and Linkable Format (ELF) files.
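The distinction between PE and ELF files, and between X86 and ARM targets, can be illustrated by reading the file header. The following is a minimal sketch, not part of the described method: the function name `identify_executable` and the two sample byte strings are hypothetical, and only a few machine codes from the ELF and PE/COFF specifications are listed.

```python
import struct

# A few machine-type codes from the ELF and PE/COFF specifications (not exhaustive).
ELF_MACHINES = {0x03: "x86", 0x3E: "x86-64", 0x28: "ARM", 0xB7: "AArch64"}
PE_MACHINES = {0x014C: "x86", 0x8664: "x86-64", 0x01C0: "ARM", 0xAA64: "ARM64"}


def identify_executable(data: bytes):
    """Return (format, architecture) for a PE or ELF image, or None otherwise."""
    # ELF: 4-byte magic; e_machine is a 16-bit field at offset 18.
    if data[:4] == b"\x7fELF" and len(data) >= 20:
        # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
        endian = ">" if data[5] == 2 else "<"
        (machine,) = struct.unpack_from(endian + "H", data, 18)
        return "ELF", ELF_MACHINES.get(machine, hex(machine))
    # PE: the DOS header starts with "MZ"; offset 0x3C holds the PE header offset.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        has_signature = data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
        if has_signature and len(data) >= pe_offset + 6:
            # The COFF Machine field follows the 4-byte PE signature.
            (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
            return "PE", PE_MACHINES.get(machine, hex(machine))
    return None


# Hypothetical minimal headers, built only to exercise the function.
elf_sample = b"\x7fELF" + bytes([2, 1, 1]) + b"\x00" * 9 + struct.pack("<HH", 2, 0x3E)
pe_sample = (b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40)
             + b"PE\x00\x00" + struct.pack("<H", 0xAA64))

print(identify_executable(elf_sample))
print(identify_executable(pe_sample))
```

A detection device could use a check of this kind to decide which runtime environment a test file expects before attempting to run it; real files should be parsed with a full PE or ELF parser rather than this header-only sketch.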
A malicious file is a file that contains a program written by its author with attack intent. Malicious files exploit vulnerabilities in computer systems to perform malicious tasks, such as stealing confidential information or destroying stored data. Malicious files are often executable files, for example viruses, worms, and Trojan horse programs that perform malicious tasks on computer systems. Malicious files can cause serious damage to the security of computer systems.
Static detection is a method of analyzing a program without running it. For example, a sample file is checked for maliciousness solely by analyzing its source code, assembly, syntax, structure, procedures, interfaces, and so on.
Dynamic behavior detection simulates the execution of a test file, obtains the behaviors or behavior sequences produced during execution, matches them against the dynamic behavior features of known malicious files, and determines from the matching result whether the test file is malicious. At present, many dynamic behavior detection techniques are constrained by the runtime environment. Specifically, after a test file is obtained, the runtime environment is often incompatible with the test file, so the test file cannot run; as a result, no dynamic behavior can be monitored and the malicious file cannot be detected. Dynamic behavior detection is usually implemented using sandbox technology.
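The matching step of dynamic behavior detection can be sketched as follows. This is an illustrative assumption rather than the matching rule of the present application: the text does not specify how sequences are compared, so the sketch uses ordered subsequence matching, and the behavior names, `SIGNATURES`, and `trace` are hypothetical.

```python
def contains_subsequence(observed, pattern):
    """Return True if the items of `pattern` occur in `observed` in order (gaps allowed)."""
    events = iter(observed)
    # Each `any` call consumes the shared iterator up to the next match,
    # so later pattern steps can only match later events.
    return all(any(step == event for event in events) for step in pattern)


def match_behavior(observed, signatures):
    """Return the names of all signatures whose pattern appears in the observed sequence."""
    return [name for name, pattern in signatures.items()
            if contains_subsequence(observed, pattern)]


# Hypothetical dynamic behavior features of known malicious files.
SIGNATURES = {
    "dropper": ["create_file", "write_file", "create_process"],
    "persistence": ["open_registry_key", "set_registry_value"],
}

# Hypothetical behavior sequence collected while a test file ran in a sandbox.
trace = ["create_file", "read_file", "write_file", "create_process", "exit"]

verdict = match_behavior(trace, SIGNATURES)
print(verdict)  # a non-empty list means the test file is judged malicious
```

If the test file cannot run in the sandbox, `trace` stays empty and no signature can match, which is the limitation the paragraph above describes.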
A sandbox is a security mechanism that provides an isolated environment for a test file under execution by providing a virtual runtime environment. Programs running in a sandbox have no permanent effect on the hardware. Optionally, a sandbox can be implemented on the real operating system of a host or in a virtual machine. To collect the behaviors or behavior sequences produced while a test file runs, a monitoring program needs to be added to the sandbox. In a virtual machine running the Windows operating system, the monitoring program is usually added by using the driver framework provided by Microsoft, and it monitors behaviors such as process creation, document creation, and registry modification.
An operating system (OS) is a computer program that manages and controls the hardware and software resources of a computer. The operating system is the basic system software and serves as the interface between computer hardware and other software; other software runs with the support of the operating system. The operating system can provide a runtime environment for executable files, for example by providing the APIs that software needs in order to run.
The kernel of an operating system is software; it is a part of the operating system and is its core. The kernel can be used to manage the various resources of the operating system. The kernel can be understood as a bridge between applications and hardware, that is, it acts as an interface between applications and hardware. The kernel is a software entity that runs directly on the hardware and provides applications with access to computer hardware. In addition, the kernel can decide when and for how long a program operates on a given piece of hardware. The kernel can provide functions such as a hardware abstraction layer, disk and file system control, and multitasking.
Container technology is a virtualization technology. It can be used to create containers, which provide a virtual runtime environment for software execution. A container is software: a package of an executable file together with the resources the executable file depends on. A container contains the resources required to run the executable file, such as code, a runtime environment, system tools, system libraries, and settings. A container can be created from an image. Compared with virtual machines, containers are more lightweight, occupy fewer resources, and run faster. Specifically, a virtual machine packages virtual hardware, a kernel (that is, an operating system), and user space into a new virtual machine. When an application runs in a virtual machine, the virtual machine first needs to virtualize a physical environment, then build a complete operating system, then set up a runtime layer, and only then can the application run. A container, by contrast, usually installs a container layer directly on top of the host operating system; the container layer can be, for example, Linux Containers (LXC) or libcontainer (a package for container management in Docker, implemented in the Go language).
There is no operating system inside a container; the container runs on the kernel of the physical machine, and multiple containers can share the operating system of the physical machine. Because a container uses the host kernel directly, the steps of building an operating system and of allocating a separate operating system to the applications in the container are eliminated, and fewer objects are virtualized. For example, in some cases only the binaries and libraries need to be built separately for a container, where the libraries contain what the binaries depend on, and there is no need to package a complete operating system as a virtual machine does. Containers are therefore more lightweight and start very quickly. Container management is also more convenient. Specifically, the running state of a container corresponds to a set of standard management operations, such as starting, stopping, pausing, and deleting the container, through which containers can be managed conveniently.
An image is used to encapsulate the content required to run a container, such as program, library, resource, and configuration files, as well as some configuration parameters. An image is usually stored in a layered structure and includes at least one image layer. For example, an image layer is a read-only template used to build a container. Image layers are used to store and migrate applications.
Cross-platform is a term in software technology. It means that an application developed under one operating system still runs normally under another operating system. For example, if application A is developed under the Windows operating system and still runs normally under the Linux operating system, application A can be called a cross-platform application. In general, a cross-platform application must not depend on the operating system.
An application programming interface (API) is a communication interface between an operating system and an application. The operating system provides APIs to the application, and the application calls an API to instruct the operating system to perform an operation. Technically, an API is a set of predefined functions. Put simply, the operating system can be regarded as a service center that provides various services to applications and encapsulates the instructions that implement each service in a function. If an application wants to use a service of the operating system, it calls the function corresponding to that service, and the operating system performs the operation corresponding to the function, thereby serving the application. Because these functions serve applications, they are called application programming interfaces. An API gives applications and developers the ability to access a set of routines based on software or hardware without having to access the source code or understand the internal working mechanism, which reduces complexity.
Docker is a software container platform developed by Docker, Inc. that supports developing, deploying, and running containers. With Docker, users can conveniently create and use containers and place their own applications in containers. Containers also support version management, copying, sharing, and modification, and continuous integration, continuous delivery, and deployment can be achieved by customizing application images. Docker generally includes a Docker client and a Docker daemon. The Docker daemon, also called the Docker Engine, is a daemon process for managing images and containers; it is a background process running on top of the operating system. The Docker client triggers various commands according to the user's input and interacts with the Docker daemon through these commands. The Docker daemon receives a command from the Docker client, creates the corresponding job according to the command, and executes the job.
A Docker container is a running instance of a Docker image. Docker containers can be created, started, stopped, deleted, paused, and so on. When the user enters a viewing command (for example, the docker ps command), the computer responds by displaying a list of the Docker containers running on the host. Docker containers have the advantage of being more lightweight. Specifically, a virtual machine usually includes a virtual machine manager (hypervisor) and a guest operating system (guest OS). The hypervisor is used to run the virtual guest operating system on the host operating system and must virtualize hardware resources. The guest operating system occupies a large amount of disk space, CPU, and memory. With Docker containers, the Docker daemon takes the place of the hypervisor and the guest OS. The Docker daemon is a background process running on top of the operating system and is responsible for managing Docker containers. It communicates directly with the host operating system and allocates resources to each Docker container, which avoids the overhead incurred when a virtual machine communicates indirectly through a hypervisor. In addition, a hypervisor virtualizes hardware resources, whereas Docker can use hardware resources directly, which improves hardware resource utilization.
A Docker image is used to create a Docker container. A Docker image is an executable package that contains the content required to run a Docker application, such as the application's code, libraries, environment variables, and configuration files. A Docker image can run in an environment where the Docker Engine is installed. When a Docker image is run, it is created as a Docker container, and the Docker container shields the application from the software and hardware outside the container. Optionally, a Docker image includes a metadata file, a configuration file, and at least one image layer file. For example, the metadata file is a manifest.json file that records the metadata of all image layer files, such as the SHA-256 hash value of each image layer file. The configuration file records, for example, the amount of memory occupied by the Docker image and the types of instructions contained in the Docker image. An image layer file is a layer file.
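The role of the per-layer SHA-256 values can be sketched as an integrity check. This is a minimal sketch under stated assumptions: the `recorded` dictionary is a simplified stand-in for the recorded metadata and does not reproduce the actual manifest.json schema, and the layer names and contents are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def verify_layers(recorded, layers):
    """Return the names of layers whose content does not match the recorded digest."""
    return [name for name, blob in layers.items()
            if sha256_hex(blob) != recorded.get(name)]


# Hypothetical layer contents; real layer files are tar archives.
layers = {"layer0.tar": b"base filesystem", "layer1.tar": b"application files"}

# Simplified stand-in for the per-layer hash values kept in the metadata file.
recorded = {name: sha256_hex(blob) for name, blob in layers.items()}

# Simulate a layer that was modified after the metadata was written.
tampered = {**layers, "layer1.tar": b"application files + injected payload"}

print(verify_layers(recorded, layers))
print(verify_layers(recorded, tampered))
```

Comparing recomputed digests with the recorded ones reveals any layer whose content changed after the image was built.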
A dynamic link library encapsulates the functions and resources that an application depends on at run time. A dynamic link library is also called a shared function library or shared library, and its functions and resources can be shared by multiple applications. Dynamic link libraries promote code reuse and efficient memory usage, and allow an application to modularize each of its functions. A dynamic link library is usually stored on a computer as a file. Optionally, on different operating systems, files that encapsulate dynamic link libraries have different formats and different names. For example, under the Windows operating system, a dynamic link library is encapsulated in a DLL file; under the Linux operating system, a dynamic link library is encapsulated in a shared object (SO) file.
A dynamic link library is one way of implementing an API. Specifically, a dynamic link library can encapsulate the code of an API and act as the carrier of the API. While an application is executing, if the application issues a call to an API, the operating system accesses the dynamic link library, obtains the code of the API from it, and runs the code to perform the corresponding operation. For example, the operating system can provide a registry API, and an application can access the registry by calling it. The code that runs when the registry API is used can be stored in a dynamic link library.
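The way a dynamic link library carries the code behind a function call can be illustrated with Python's `ctypes` module. This is a minimal sketch, not the registry API mentioned above: it assumes the C runtime library can be located on the host (`msvcrt` on Windows, the C library found by `ctypes.util.find_library` elsewhere), and the helper name `load_c_runtime` is hypothetical.

```python
import ctypes
import ctypes.util
import sys


def load_c_runtime():
    """Load the platform's C runtime as a dynamic link library."""
    if sys.platform.startswith("win"):
        # On Windows the C runtime is packaged as a DLL file.
        return ctypes.CDLL("msvcrt")
    # On Linux and other Unix-like systems it is a shared object (.so) file.
    # If find_library returns None, CDLL(None) exposes the symbols already
    # loaded into the current process, which normally include the C library.
    return ctypes.CDLL(ctypes.util.find_library("c"))


libc = load_c_runtime()

# Declare the signature of the C function abs(int) before calling it.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

# The call is resolved by name; the code that runs lives in the library.
result = libc.abs(-5)
print(result)
```

The calling program contains no implementation of `abs`; the loader locates the function in the shared library at run time, which is the same mechanism by which a DLL or SO file serves as the carrier of an operating system API.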
A DLL file is a file in the Windows operating system that contains a dynamic link library. DLL files contain many of the functions and resources that Windows programs depend on when running in a Windows environment. DLL files are also called "application extensions". The suffix of a DLL file is .dll. For example, DLL files include kernel32.dll, user32.dll, and gdi32.dll; these three files encapsulate API functions of the Windows operating system. Optionally, DLL files are stored in the system directory, for example in C:\Windows\System32\. The kernel32.dll file is an important 32-bit dynamic link library file in Windows 9x/Me and is a kernel-level file. The user32.dll file provides the application programming interfaces related to the Windows user interface, covering features such as window handling and the basic user interface, for example creating windows and sending messages. The gdi32.dll file is a dynamic link library stored in the Windows system folder; it is an application extension for the graphical user interface under Windows and is usually created automatically when the operating system is installed. Many applications are not a single complete executable file; instead, they are split into relatively independent dynamic link libraries, that is, DLL files, which are placed in the system. When the application is executed, the DLL files corresponding to the application are called. One application can use multiple DLL files, and one DLL file can be used by different applications.
The ntdll.dll file is a DLL file that contains the functions used to call into the kernel, and it can be understood as the core DLL file. In the Windows operating system, when an application calls a Windows API, a series of DLL files is accessed, and the calls to functions in these DLL files are ultimately directed to ntdll.dll. When a function in ntdll.dll is called, the kernel is invoked to perform the operation corresponding to the function. The ntdll.dll file is the entry point from ring 3 to ring 0 in Windows. All Win32 APIs in kernel32.dll and user32.dll are ultimately implemented by calling functions in ntdll.dll. The functions in ntdll.dll use SYSENTER to enter ring 0, where the functions are actually implemented.
A shared object (SO) file is a file in the Linux operating system that contains a dynamic link library. SO files include the functions that Linux applications depend on when running on the Linux operating system. The suffix of an SO file is .so. An SO file is a binary ELF file. SO files are also called shared libraries or shared object libraries.
An application scenario of the malicious file detection method provided by the embodiments of the present application is described below by way of example.
Refer to FIG. 1, which is a schematic diagram of an application scenario of a malicious file detection method provided by an embodiment of the present application. The network system in FIG. 1 includes a detection device. Optionally, the scenario shown in FIG. 1 also includes an analysis device in the cloud.
The network system in FIG. 1 includes a data center 1102 and the respective local area networks 1103 of a core office area, an office area A, and an office area B. The data center 1102 and the local area networks 1103 of the core office area, office area A, and office area B are connected to a firewall 1105 through switches. The firewall 1105 is further connected to a wide area network or the Internet through a router 1101, a network address translation (NAT) device (not shown in the figure), a gateway device (not shown in the figure), and so on. The firewall 1105 isolates the network system from the wide area network or the Internet and protects the data exchanged between the network system and the wide area network or the Internet.
As shown in FIG. 1, one possible deployment location of the detection device is the network egress of the network system, that is, between the firewall 1105 and the router 1101. For example, the detection device is integrated in an egress firewall, an egress router, or a bypass firewall. The detection device guards against malicious files and malicious web traffic from the Internet. Optionally, the detection device is any one of a gateway device, a firewall device, an intrusion detection system (IDS) device, and an intrusion prevention system (IPS) device. An IDS device monitors the operating status of networks and systems through software and hardware in accordance with a security policy and discovers attack attempts, attack behaviors, or attack results, in order to ensure the confidentiality, integrity, and availability of network system resources. An IPS device monitors the packet transmission behavior of a network or network devices and promptly interrupts, adjusts, or isolates abnormal or harmful packet transmission behavior. Optionally, the detection device may also be a standalone sandbox device or another device that integrates a sandbox function; for example, the detection device may be a security gateway or a firewall. A standalone sandbox device is usually deployed in bypass mode at the Internet egress of an enterprise. For example, the enterprise's local area network is connected to the Internet through a gateway device or router, and the sandbox device is connected to the gateway device or router in bypass mode.
In a possible implementation, the detection device is integrated in the analysis device in the cloud. The detection device provides detection services to other network devices through a web application, an open service port, or the like. In this case, the detection device receives a test file from another network device in the network (such as a security gateway or a firewall), performs the detection method described in the embodiments of the present application on the test file, and then returns the detection result, indicating whether the test file is a malicious file, to the network device that provided the test file.
The application scenarios of the malicious file detection method are described above. The following describes, by way of example, the detection device provided in the embodiments of this application. Optionally, the detection device is the detection device in the network system shown in FIG. 1.
Refer to FIG. 2, which is a schematic structural diagram of a detection device provided by an embodiment of this application.
The detection device shown in FIG. 2 includes at least one processor 21, at least one memory 22, and a network interface 23. The processor 21, the memory 22, and the network interface 23 are connected to each other through a bus 24.
Optionally, the processor 21 is a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or one or more integrated circuits for implementing the solutions of this application, for example, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. Optionally, the PLD is a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. Optionally, the processor 21 is a single-core processor (single-CPU). Optionally, the processor 21 is a multi-core processor (multi-CPU).
Optionally, the memory 22 includes a register and a volatile memory, for example, a random-access memory (RAM). Optionally, the memory includes a non-volatile memory, for example, a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), cloud storage, network attached storage (NAS), or a network drive. Optionally, the memory further includes a combination of the foregoing types of memory, or any other form of medium or product with a storage function. For example, the memory 22 includes an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. Optionally, the memory 22 exists independently and is connected to the processor 21 through the bus 24. Optionally, the memory 22 and the processor 21 are integrated together.
The network interface 23 uses any transceiver-type apparatus to communicate with other devices or a communication network. The network interface 23 includes a wired communication interface. Optionally, the network interface 23 further includes a wireless communication interface. The wired communication interface is, for example, an Ethernet interface, such as a Gigabit Ethernet (GE) interface. Optionally, the Ethernet interface is an optical interface, an electrical interface, or a combination thereof. Optionally, the wired communication interface is a Fiber Distributed Data Interface (FDDI) interface. Optionally, the wireless communication interface is a wireless local area network (WLAN) interface, a cellular network communication interface, or a combination thereof.
The bus 24 is used to transfer information between the foregoing components. Optionally, the bus 24 is divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus. For example, the bus 24 is a Peripheral Component Interconnect Express (PCIe) bus, an Advanced Microcontroller Bus Architecture (AMBA) bus, a Huawei cache-coherent system (HCCS, a protocol standard for maintaining service data consistency between multiple ports) bus, or a Peripheral Component Interconnect (PCI) bus.
Optionally, the detection device in FIG. 2 further includes an input/output interface 25, which is used to connect an output device and an input device. The input device communicates with the processor 21. Optionally, the input device receives user input in multiple ways. For example, the input device is a mouse, a keyboard, a touchscreen device, or a sensing device. The output device communicates with the processor 21. Optionally, the output device displays information in multiple ways. For example, the output device is a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector.
The hardware structure of the detection device is described above by way of example. The following describes the software architecture of the detection device by way of example.
Optionally, the software of the detection device adopts a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, a cloud architecture, or the like. The software of the detection device includes at least one functional module, and each functional module is implemented in software. In other words, a functional module is generated after the processor 21 of the detection device reads the program code stored in the memory 22.
For example, referring to FIG. 2, the software of the detection device includes a container 221, an instruction conversion module 222, and an operating system 223.
The container 221 is used to provide a virtual operating environment.
The instruction conversion module 222 is used to convert instructions transmitted between the container 221 and the operating system 223. For example, when the container 221 sends an instruction to the operating system 223, the instruction conversion module 222 intercepts the instruction, converts it, and sends the converted instruction to the operating system 223. Likewise, when the operating system 223 sends an instruction to the container 221, the instruction conversion module 222 intercepts the instruction, converts it, and sends the converted instruction to the container 221.
Optionally, the instruction conversion module 222 is provided outside the container 221.
Optionally, the software of the detection device further includes a container 224 and an instruction conversion module 225. The instruction conversion module 225 is provided inside the container 224 and is used to convert instructions transmitted between the container 224 and the operating system 223.
Optionally, the virtual environment provided by the container 224 is different from the virtual environment provided by the container 221. For example, the container 221 provides a virtual operating environment that simulates the Windows operating system, and the container 224 provides a virtual operating environment that simulates the Linux operating system. In this way, a single detection device can use the container 221 and the container 224 to provide multiple virtual operating environments.
The following describes the software architecture shown in FIG. 2 by using an example in which the computer instruction set architecture of the detection device is the ARM instruction set architecture.
Optionally, the instances of the Docker container in FIG. 3 are the containers in FIG. 2. Referring to FIG. 3, for example, the container 221 in FIG. 2 is Docker_1 or Docker_2 in FIG. 3, and the container 224 in FIG. 2 is Docker_n in FIG. 3. Docker_1, Docker_2, and Docker_n are three instances of the Docker container. For example, the instruction conversion module 222 in FIG. 2 is instruction conversion process 1 or instruction conversion driver 1 in FIG. 3, and the instruction conversion module 225 in FIG. 2 is instruction conversion process 2 or instruction conversion driver 2 in FIG. 3. For example, the operating system 223 in FIG. 2 is the Linux operating system in FIG. 3, which can run on the ARM instruction set.
The containers in FIG. 3 can implement the system simulation function. Referring to FIG. 3, Docker_1 contains ntdll.dll, kernel32.dll, and a behavior monitoring module. ntdll.dll and kernel32.dll are dynamic link libraries of the virtual operating environment and are used for system simulation. The behavior monitoring module monitors the dynamic behavior of the test file in the container.
Optionally, a Docker instance communicates with the operating system by means of an instruction conversion module. In other words, the instruction conversion module serves as the communication medium between the Docker instance and the operating system. Method 1 and Method 2 below are examples.
Method 1: An instruction conversion module is provided outside the Docker instance. Specifically, the instruction conversion module receives an instruction generated by the Docker instance, converts the instruction, and sends the converted instruction to the operating system. For example, referring to FIG. 3, instruction conversion process 1 or instruction conversion driver 1 is provided outside Docker_1. After Docker_1 generates an X86 instruction, Docker_1 sends the X86 instruction to instruction conversion process 1 or instruction conversion driver 1, which receives the X86 instruction, converts it into an ARM instruction, and sends the ARM instruction to the Linux operating system. After executing the ARM instruction, the Linux operating system carries the execution result in an ARM instruction and returns it to instruction conversion process 1 or instruction conversion driver 1, which converts the ARM instruction into an X86 instruction and returns the X86 instruction to Docker_1. In this way, Docker_1 communicates with the Linux operating system through instruction conversion process 1 or instruction conversion driver 1. With this method, the Docker instance and the instruction conversion module have clear roles and a clear division of labor: the system simulation function is implemented by the Docker instance, and the instruction conversion function is implemented by the instruction conversion module. The two functions are thereby decoupled, which makes it easier to later extend, upgrade, and update the system simulation function on its own.
Method 2: The Docker instance contains an instruction conversion module. The Docker instance communicates with the operating system through this internal instruction conversion module. For example, referring to FIG. 3, instruction conversion process 2 or instruction conversion driver 2 is provided inside Docker_n. After the system simulation part of Docker_n generates an X86 instruction, it sends the X86 instruction to instruction conversion process 2 or instruction conversion driver 2, which receives the X86 instruction and converts it into an ARM instruction. Docker_n then sends the ARM instruction to the Linux operating system. After executing the ARM instruction, the Linux operating system carries the execution result in an ARM instruction and returns it to Docker_n. Docker_n receives the ARM instruction and converts it into an X86 instruction through its own instruction conversion process 2 or instruction conversion driver 2, which returns the X86 instruction to the operating system simulation part of Docker_n. In this way, Docker_n communicates with the Linux operating system through the instruction conversion process 2 or instruction conversion driver 2 that it contains.
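The difference between Method 1 and Method 2 can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the converter of this application: the `Instruction` type, the `convert` function (which only relabels the instruction set instead of re-encoding anything), and `host_os_execute` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    isa: str      # "x86" or "arm"
    payload: str  # opaque stand-in for the encoded instruction


def convert(instr: Instruction, target_isa: str) -> Instruction:
    """Toy conversion: only relabels the ISA. A real converter would re-encode."""
    return Instruction(target_isa, instr.payload)


def host_os_execute(instr: Instruction) -> Instruction:
    """Hypothetical ARM-based host OS: rejects anything that is not ARM."""
    if instr.isa != "arm":
        raise ValueError("host operating system only accepts ARM instructions")
    # The execution result is carried back in an ARM instruction.
    return Instruction("arm", f"result({instr.payload})")


class ExternalConverter:
    """Method 1: the conversion module sits outside the container."""

    def relay(self, x86_instr: Instruction) -> Instruction:
        arm_result = host_os_execute(convert(x86_instr, "arm"))
        return convert(arm_result, "x86")


class ContainerMethod1:
    """Container that only simulates the system and delegates conversion."""

    def __init__(self, converter: ExternalConverter) -> None:
        self.converter = converter

    def run(self, payload: str) -> Instruction:
        return self.converter.relay(Instruction("x86", payload))


class ContainerMethod2:
    """Method 2: the container converts internally and talks to the OS itself."""

    def run(self, payload: str) -> Instruction:
        arm_instr = convert(Instruction("x86", payload), "arm")
        arm_result = host_os_execute(arm_instr)
        return convert(arm_result, "x86")
```

In both placements the simulated system only ever sees X86 instructions and the host only ever sees ARM instructions; what changes is which component owns the conversion step.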
It should be understood that the division into functional modules in the software architecture described in FIG. 2 and FIG. 3 is only an example of how malicious files are detected. In practical applications, the foregoing functions are optionally allocated to different functional modules as needed, that is, the internal structure of the detection device is divided into different functional modules to complete all or some of the functions described above. In other words, the functional modules in FIG. 2 and FIG. 3 can be combined or split without affecting the overall function of the detection device. In addition, the software architecture provided in FIG. 2 and FIG. 3 and the method embodiment provided in FIG. 4 below belong to the same concept. For the specific implementation process, refer to the method embodiment; details are not repeated here.
The hardware structure and software architecture of the detection device are described above. The following describes, by way of example, the method flow by which the detection device detects malicious files.
Refer to FIG. 4, which is a flowchart of a malicious file detection method provided by an embodiment of this application. FIG. 4 is described by using an example in which the execution subject is the detection device. The method includes the following steps 401 to 406.
Step 401: The detection device generates a virtual operating environment based on container technology.
Optionally, the virtual operating environment is isolated from the real operating environment of the host, so that a test file running in the virtual operating environment has no permanent impact on the hardware. The detection device runs the test file inside the virtual operating environment and then deletes the changes produced by running the test file. Optionally, the virtual operating environment is implemented through container technology, and the virtual operating environment is an instance of a container. For example, referring to FIG. 3, if the virtual operating environment is generated based on Docker container technology, the virtual operating environment is provided by Docker_1, Docker_2, or Docker_n in FIG. 3.
Optionally, in some embodiments, the virtual operating environment is generated based on a container image. For example, based on Docker container technology, the virtual operating environment is started through the Docker daemon, and the virtual operating environment is an instance of a Docker container. Using an instance of a Docker container has the advantage of being more lightweight and enables process-level malicious file detection.
It should be understood that step 401 is described by using the process of generating one virtual operating environment as an example. In some optional embodiments, the detection device generates multiple virtual operating environments. When the detection device contains more virtual operating environments, the generation process of each virtual operating environment can be similar to step 401. By generating multiple virtual operating environments, multiple sample files can be detected in parallel, which improves detection efficiency. In addition, optionally, different virtual operating environments are used to simulate different operating systems. For example, virtual operating environment A simulates the Windows operating system, and virtual operating environment B simulates the Linux operating system, so as to meet the running requirements of different sample files.
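As a rough illustration of step 401, the following Python sketch builds the Docker CLI command that would start one container instance per simulated operating system. The image names and the `ENVIRONMENT_IMAGES` mapping are hypothetical placeholders (this application does not name any images); only the standard `docker run` options `--detach`, `--rm`, and `--name` are assumed.

```python
import subprocess
from typing import Dict, List

# Hypothetical image names; the source does not specify any.
ENVIRONMENT_IMAGES: Dict[str, str] = {
    "windows": "example/windows-emulation:latest",
    "linux": "example/linux-emulation:latest",
}


def build_run_command(simulated_os: str, instance_name: str) -> List[str]:
    """Return the `docker run` command for one virtual operating environment."""
    image = ENVIRONMENT_IMAGES[simulated_os]
    # --detach keeps the instance running in the background;
    # --rm discards the container's filesystem changes when it stops.
    return ["docker", "run", "--detach", "--rm", "--name", instance_name, image]


def start_environment(simulated_os: str, instance_name: str) -> str:
    """Start the container and return the container ID printed by `docker run`."""
    completed = subprocess.run(
        build_run_command(simulated_os, instance_name),
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout.strip()
```

Calling `start_environment` once per entry in `ENVIRONMENT_IMAGES` would yield one environment for each simulated operating system; it requires a running Docker daemon and images that actually exist.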
Step 402: The detection device obtains a test file.
Optionally, the test file is obtained in one of multiple ways. For example, the detection device receives a test file input by a user. As another example, the detection device collects a data stream transmitted in the network and obtains the file carried by the data stream by reassembling the payloads of the packets contained in the data stream. As another example, the detection device parses a parent file in which the test file is embedded to obtain the test file carried in the parent file. A parent file is a file that contains an embedded test file. The parent file is also called the original file, or has other names depending on the vendor, standard, or scenario. For example, the parent file is an email, and the test file is an attachment carried in the email, where the attachment is an executable file. The detection device parses the email to obtain the attachment carried in it. As another example, the parent file is a Word document, and the test file is an executable file linked in the Word document. The detection device parses the Word document to obtain the test file.
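The email case can be sketched with the Python standard library `email` package. This is only an illustration of extracting a test file from a parent file; the function names, addresses, and the sample attachment are made up for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import List, Tuple


def extract_attachments(raw_email: bytes) -> List[Tuple[str, bytes]]:
    """Parse a raw email (the parent file) and return (filename, content) pairs."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    attachments = []
    for part in msg.iter_attachments():
        name = part.get_filename() or "unnamed"
        # decode=True reverses the transfer encoding (for example, base64).
        attachments.append((name, part.get_payload(decode=True)))
    return attachments


def build_sample_email() -> bytes:
    """Build a small email with one binary attachment, for demonstration only."""
    msg = EmailMessage()
    msg["Subject"] = "invoice"
    msg["From"] = "sender@example.com"
    msg["To"] = "receiver@example.com"
    msg.set_content("See the attached file.")
    msg.add_attachment(
        b"MZ\x90\x00",
        maintype="application",
        subtype="octet-stream",
        filename="invoice.exe",
    )
    return msg.as_bytes()
```

Each returned pair is a candidate test file; in the flow described here it would then go through file type identification before being run.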
The test file is an executable file that runs on a first operating system. The first operating system is an operating system compatible with the test file. For example, if the test file is a PE file, the first operating system is the Windows operating system; if the test file is an ELF file, the first operating system is the Linux operating system. As another example, if the test file is a file of a type such as .exe, .sys, or .com, the first operating system is the Windows operating system.
It should be understood that there are many combinations of executable files and operating systems, and that the Linux operating system and the Windows operating system are only examples of the first operating system. Optionally, the first operating system is any one of the Android operating system (developed by Google), the iOS operating system (Apple's mobile operating system), the Mac OS operating system (Apple's operating system), the BlackBerry operating system, the UNIX operating system (a multi-user, multitasking operating system), or the NetWare system (a network operating system launched by Novell). Correspondingly, the test file is an executable file that runs on one of these operating systems. The operating systems listed above are only examples of the first operating system, and this embodiment does not limit the specific type of the first operating system.
It should be noted that the step of generating the virtual operating environment shown in step 401 is performed once, whereas the step of obtaining a test file shown in step 402 is performed multiple times. In other words, it is not necessary to regenerate the virtual operating environment each time before a test file is obtained. After the virtual operating environment is generated in step 401, the detection device performs the flow shown in steps 402 to 406 separately for each of multiple test files that run on the first operating system simulated by the virtual operating environment.
Step 403: The detection device runs the test file in the virtual operating environment.
For example, the detection device transfers the test file into an instance of the Docker container and sends an instruction to the instance to start running the test file. In response to the instruction, the instance of the Docker container runs the test file, so that the test file runs in the virtual operating environment.
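One way to picture this step is with the standard `docker cp` and `docker exec` commands. The sketch below only builds the command lines; the container-side directory `/samples` is an assumption that would have to exist in the image, and how the emulation layer actually launches the sample is not specified by this application.

```python
import os
import posixpath
from typing import List

# Hypothetical directory inside the container where samples are placed.
SAMPLE_DIR = "/samples"


def build_copy_command(local_path: str, container: str) -> List[str]:
    """Command that transfers the test file into the container instance."""
    dest = posixpath.join(SAMPLE_DIR, os.path.basename(local_path))
    return ["docker", "cp", local_path, f"{container}:{dest}"]


def build_exec_command(local_path: str, container: str) -> List[str]:
    """Command that asks the container instance to start running the test file."""
    dest = posixpath.join(SAMPLE_DIR, os.path.basename(local_path))
    return ["docker", "exec", container, dest]
```

The two command lists could be passed to `subprocess.run` in order: first the copy, then the exec.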
In some optional embodiments, the detection device generates only one type of virtual operating environment and sends the test file to that virtual operating environment to run. For example, if the test file is to be run by simulating the Windows operating system, a virtual operating environment that simulates the Windows operating system is generated, and the test file is always sent to that virtual operating environment. In other optional embodiments, the detection device generates multiple types of virtual operating environments and performs the optional flow described in the following steps 4031 to 4033 to send the test file to the virtual operating environment it requires.
Step 4031: The detection device determines the file type of the test file.
Optionally, the detection device determines the file type of the test file in one of multiple ways. For example, the detection device identifies the file type of the test file according to the file suffix. As another example, the detection device identifies the file type of the test file according to the file header information. Specifically, the detection device pre-stores the data structures of the file headers (or files) of various file types. After receiving the test file, the detection device compares the file header of the test file with the file header data structure of each file type in turn, finds the data structure that the file header of the test file conforms to, and uses the file type corresponding to that data structure as the file type of the test file. Other ways in which the detection device determines the file type are not listed one by one here.
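Header-based identification can be sketched as follows. The sketch does not use the pre-stored header data structures of this application; instead it assumes the well-known signatures of the two formats mentioned in this document: an ELF file begins with the bytes `0x7F 'E' 'L' 'F'`, and a PE file begins with `MZ` and stores, at offset 0x3C, the offset of a `PE\0\0` signature. The function name is hypothetical.

```python
import struct


def identify_file_type(data: bytes) -> str:
    """Classify a file as "ELF", "PE", or "unknown" from its header bytes."""
    # ELF: fixed 4-byte magic at the start of the file.
    if data[:4] == b"\x7fELF":
        return "ELF"
    # PE: DOS header starts with "MZ"; the 32-bit little-endian field at
    # offset 0x3C points to the "PE\0\0" signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"
```

Checking the header rather than the suffix means a renamed executable is still classified by its actual format.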
Step 4032: The detection device determines, according to the file type of the test file, the virtual operating environment required by the test file.
Optionally, the detection device pre-stores a mapping between file types and virtual operating environments, and determines the operating environment required to run the test file by querying the mapping. For example, the mapping is shown in Table 1.
Table 1

File type    Virtual operating environment required by the test file
PE file      Docker_1
ELF file     Docker_2
Step 4033: The detection device runs the test file in the virtual operating environment required by the test file.
For example, the detection device determines through suffix comparison that the suffix of a test file is ".elf" and therefore determines that the file type of the test file is an ELF file. The detection device then queries Table 1, learns that the operating environment required by an ELF file is Docker_2, and sends the test file to Docker_2 to run. Docker_2 is used to simulate the operating environment of the Linux operating system.
As another example, the detection device determines, according to the content of specified fields in the file header of the test file, that the test file conforms to the file header data structure of a PE file, and therefore determines that the file type of the test file is a PE file. The detection device then queries Table 1, learns that the operating environment required by a PE file is Docker_1, and sends the test file to Docker_1 to run. Docker_1 is used to simulate the operating environment of the Windows operating system.
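The lookup in steps 4032 and 4033 amounts to a dictionary query. The following Python sketch mirrors Table 1; the names `ENVIRONMENT_BY_FILE_TYPE` and `select_environment` are hypothetical, and the behavior for an unmapped file type (raising an error) is an assumption, since this application does not say what happens in that case.

```python
from typing import Dict

# Mirrors Table 1: file type -> virtual operating environment.
ENVIRONMENT_BY_FILE_TYPE: Dict[str, str] = {
    "PE file": "Docker_1",
    "ELF file": "Docker_2",
}


def select_environment(file_type: str) -> str:
    """Return the virtual operating environment required by the file type."""
    try:
        return ENVIRONMENT_BY_FILE_TYPE[file_type]
    except KeyError:
        # Assumed behavior: refuse to run a file whose type has no mapping.
        raise LookupError(
            f"no virtual operating environment registered for {file_type!r}"
        ) from None
```

Supporting an additional operating system would then only require a new container image and one more entry in the mapping.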
Step 404: The detection device obtains a first API sequence called by the test file during running.
For example, if the test file is a file in PE format and calls the two APIs CreateFile() and WriteFile() in sequence during running, the first API sequence includes CreateFile() and WriteFile().
The virtual operating environment provides multiple APIs required for the software in the test file to run. To distinguish them from the multiple APIs that the operating system simulated by the virtual operating environment and compatible with the test file provides for the same purpose, the embodiments of this application refer to the multiple APIs provided by the virtual operating environment as a first API set. The first API set includes multiple APIs. While the test file runs in the virtual operating environment, the test file may call all of the APIs in the first API set, or it may call only some of them, in which case the remaining APIs in the first API set are not called by the test file. To distinguish the APIs called by the test file from the APIs actually executed, the embodiments of this application refer to the series of APIs in the first API set that are called by the test file as a first API sequence, and refer to the series of APIs actually executed by the detection device as a second API sequence. In addition, optionally, if the test file calls only one API in the first API set while running, the first API sequence includes only that API. Optionally, if the test file calls multiple APIs in the first API set one after another over time, the called APIs form the first API sequence, and the first API sequence includes multiple APIs. That is, the first API sequence includes at least one API, and this embodiment does not limit whether the first API sequence includes one API or multiple APIs.
In addition, optionally, when the first API sequence includes multiple APIs, the APIs are ordered by the time at which they were called. For example, if the test file first calls API_1 provided by the virtual operating environment, then API_2, and finally API_3, the first API sequence is expressed as (API_1, API_2, API_3).
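The ordering rule above can be illustrated with a minimal sketch. The `CallRecorder` class, its method names, and the use of a numeric timestamp are assumptions made for illustration only; the application does not describe how call times are recorded.

```python
class CallRecorder:
    """Hypothetical recorder that collects (call time, API identifier) pairs."""

    def __init__(self):
        self._calls = []

    def record(self, api_name, called_at):
        # called_at is any comparable timestamp supplied by the caller.
        self._calls.append((called_at, api_name))

    def first_api_sequence(self):
        # Order the recorded APIs by the time at which they were called.
        ordered = sorted(self._calls, key=lambda call: call[0])
        return tuple(name for _, name in ordered)
```

Recording API_2 at time 2, API_1 at time 1 and API_3 at time 3 yields the sequence (API_1, API_2, API_3), regardless of the order in which the records arrive.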
The embodiments of this application use the second API set to denote the multiple APIs, required for running the software carried in the test file, that are provided by the operating system which the virtual operating environment simulates and which is compatible with the test file. In this embodiment, the second API set consists of the APIs that the first operating system provides for running the software carried in the test file. The second API set includes multiple APIs.
The identifiers of the APIs in the first API set are the same as the identifiers of the APIs in the second API set. Optionally, the identifier of an API is the name of the API. The identifier of an API is used to identify the corresponding API. The test file calls an API in whichever runtime environment it is currently running in by means of the API's identifier. In other words, whether the test file runs in the virtual operating environment or in the first operating system, it uses the same API identifier to call the API in the virtual operating environment or the API in the first operating system.
For example, the identifier of an API for writing files is WriteFile() or fwrite(). Take the case in which the test file is in the PE format and the first operating system is the Windows operating system. The Windows operating system provides the second API set for PE files. The second API set includes an API for writing files, an API for creating files, an API for reading files, and so on. The identifier of the API for writing files is WriteFile(), the identifier of the API for creating files is CreateFile(), and the identifier of the API for reading files is ReadFile(). The virtual operating environment provides the first API set for PE files. The first API set likewise includes an API for writing files, an API for creating files, an API for reading files, and so on. In the first API set, the identifier of the API for writing files is WriteFile(), the identifier of the API for creating files is CreateFile(), and the identifier of the API for reading files is ReadFile().
In some possible embodiments, the image of the above container may be provided to the user by a software provider. When packaging the container image, the software provider encapsulates the first API set in the image, so that after the detection device generates a container instance from the image, the container instance provides the first API set to the test file running in it. For example, to simulate the runtime environment provided by the Windows operating system by means of container technology, the software provider encapsulates in the image a number of APIs whose identifiers are the same as those of the APIs in the Windows operating system. To simulate the runtime environment provided by the Linux operating system by means of container technology, the software provider encapsulates in the image a number of APIs whose identifiers are the same as those of the APIs in the Linux operating system.
Step 405: The detection device executes the second API sequence in the second operating system.
The second operating system is an operating system based on the computer instruction set architecture of the detection device. In other words, the second operating system is the real runtime environment on the detection device. For example, the processor of the detection device uses the X86 architecture and the second operating system is the Windows operating system (an operating system based on the X86 architecture may of course also be the Linux operating system; Windows is used here only as an example). As another example, the processor of the detection device uses the ARM architecture and the second operating system is the Linux operating system (an operating system based on the ARM architecture may of course also be an operating system developed by the manufacturer of the detection device, or a particular operating system of the Windows family, such as Windows 10). To distinguish the two sequences, the embodiments of this application refer to the sequence of APIs actually executed in the real runtime environment (that is, the second operating system) as the second API sequence, and to the sequence of APIs called in the virtual operating environment generated by container technology as the first API sequence.
Optionally, the second operating system may be the same as the first operating system or different from it. For ease of understanding, the embodiments of this application are described using the case in which the two differ. When the second operating system differs from the first operating system, the APIs in the first API sequence differ from the APIs in the second API sequence. When the second operating system is the same as the first operating system, the identifiers of the APIs in the first API sequence may be the same as the identifiers of the APIs in the second API sequence.
In some embodiments, when the first API sequence is called, the detection device determines the second API sequence according to the first API sequence and executes the second API sequence in the second operating system, thereby instructing the hardware of the detection device to perform the corresponding operations. In this way the detection device simulates execution of the API sequence of the first operating system and achieves the effect of operating system simulation.
The second API sequence includes at least one API, and the APIs in the second API sequence are APIs of the second operating system. The first API in the second API sequence has a mapping relationship with the first API in the first API sequence. For example, the first operating system is the Windows operating system and the second operating system is the Linux operating system. The first API sequence is (CreateFile(), ReadFile()) and the second API sequence is (fcreate(), fread()). In the second API sequence, fcreate() has a mapping relationship with CreateFile(), and fread() has a mapping relationship with ReadFile().
In addition, if the second API sequence includes multiple APIs, the APIs in the second API sequence are ordered according to the order of the corresponding APIs in the first API sequence. For brevity, where no difficulty of understanding arises, the embodiments of this application hereafter use the form "API + number", such as API_1, as shorthand for specific APIs such as CreateFile(), ReadFile(), fcreate() and fread().
For example, suppose the first API sequence is (API_1, API_2, API_3). If, in the second operating system, API_4 has a mapping relationship with API_1, API_5 has a mapping relationship with API_2, and API_6 has a mapping relationship with API_3, then the second API sequence is (API_4, API_5, API_6).
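This example can be sketched as a dictionary lookup that preserves the order of the first API sequence. The names `API_MAP` and `to_second_api_sequence` are hypothetical, and the sketch assumes a one-to-one mapping.

```python
# Mapping from APIs in the first API sequence to APIs of the second operating
# system, taken from the example above.
API_MAP = {"API_1": "API_4", "API_2": "API_5", "API_3": "API_6"}


def to_second_api_sequence(first_api_sequence, api_map=API_MAP):
    """Translate each API in call order; a KeyError signals an unmapped API."""
    return tuple(api_map[name] for name in first_api_sequence)
```

Because the translation iterates over the first API sequence in order, the second API sequence inherits that order without a separate sorting step.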
Optionally, in some embodiments, the mapping relationship between the APIs in the first API sequence and the APIs in the second API sequence is constructed as follows: determine the functions of the APIs of the first operating system, determine the functions of the APIs of the second operating system, identify APIs of the two operating systems that have similar functions, and establish a mapping relationship between those APIs. It should be understood that, optionally, multiple APIs of the second operating system are used to implement one API of the first operating system, in which case multiple APIs of the second operating system correspond to one API of the first operating system in the mapping relationship. Alternatively, optionally, one API of the second operating system is used to implement one API of the first operating system, in which case one API of the second operating system corresponds to one API of the first operating system in the mapping relationship. This embodiment does not limit whether the mapping relationship between APIs is one-to-one or one-to-many.
For example, when the first operating system and the second operating system are both operating systems that actually run on computer devices, the mapping relationship between the first API in the first API sequence and the first API in the second API sequence is constructed as follows. Suppose that calling, in the first operating system, the API with the same identifier as the first API in the first API sequence instructs the device to perform operation A, and that calling, in the second operating system, the first API in the second API sequence instructs the device to perform operation B. If operation A is the same as operation B, a mapping relationship may be established between the first API in the first API sequence and the first API in the second API sequence. For instance, under the Windows operating system a computer device calls the WriteFile() API to instruct a file write operation, and under the Linux operating system a computer device calls the fwrite() API to instruct a file write operation. Because WriteFile() and fwrite() are both used to instruct a file write operation, a mapping relationship can be established between these two APIs.
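A minimal sketch of this construction follows. It assumes that the operation each API instructs can be described by a comparable label; the label strings, the two tables, and the function name `build_api_mapping` are hypothetical, while the API names follow the document's own examples.

```python
# Operation instructed by each API, described by a hypothetical label.
FIRST_OS_OPERATIONS = {
    "ReadFile()": "read file",
    "CreateFile()": "create file",
    "UnmappedApi()": "no counterpart",  # hypothetical API with no equivalent
}
SECOND_OS_OPERATIONS = {"fread()": "read file", "fcreate()": "create file"}


def build_api_mapping(first_os_operations, second_os_operations):
    """Map each first-OS API to the second-OS APIs that instruct the same operation."""
    apis_by_operation = {}
    for api, operation in second_os_operations.items():
        apis_by_operation.setdefault(operation, []).append(api)
    return {
        api: tuple(apis_by_operation[operation])
        for api, operation in first_os_operations.items()
        if operation in apis_by_operation
    }
```

Storing each mapped value as a tuple keeps the same structure usable when several APIs of the second operating system correspond to one API of the first operating system.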
In some embodiments, when packaging the container image, the software provider encapsulates in the image the mapping relationship between the APIs in the first API set and the APIs of the second operating system, so that after the detection device generates a container instance from the image, the container instance can access that mapping relationship.
By way of example, the process of determining the first API in the second API sequence includes the following. The detection device determines the first API from the first API sequence. Using the identifier of that API as an index, the detection device queries the mapping relationship between APIs and obtains the identifier of the first API in the second API sequence. The detection device then determines the first API in the second API sequence according to that identifier. For example, referring to Table 2 below, if the first API sequence is (CreateFile(), ReadFile()), the detection device determines ReadFile() from the first API sequence, queries Table 2 using ReadFile() as the index, and determines that the first API in the second API sequence is fread(). By analogy, optionally, the other APIs in the second API sequence are determined in the same manner, and the determined APIs are then ordered and combined to obtain the second API sequence.
Table 2
Windows operating system API    Linux operating system API
WriteFile()                     fwrite()
ReadFile()                      fread()
CreateFile()                    fcreate()
With the above method, what the embodiments of this application actually execute are APIs of the second operating system rather than APIs of the first operating system, which achieves the effect of simulating the process of running the test file in the first operating system. For example, referring to Table 2 above, execution of WriteFile() of the Windows operating system is simulated by executing fwrite() of the Linux operating system, and execution of CreateFile() of the Windows operating system is simulated by executing fcreate() of the Linux operating system. The virtual operating environment can thus simulate the APIs that the first operating system provides for running the software carried in the test file. Because the API calls made by the test file are responded to, operating system simulation is achieved and the test file can run normally in the virtual operating environment, which removes the test file's dependence on the first operating system.
Optionally, in some possible embodiments, API calls are implemented on the basis of a dynamic link library. Accordingly, the process of using APIs of the second operating system to simulate APIs of the first operating system includes the following steps 1 to 3.
Step 1: For each API in the first API sequence, the detection device obtains the corresponding function from the dynamic link library of the virtual operating environment, thereby obtaining the first function sequence.
The code of the test file references functions and resources in the dynamic link library. While the detection device runs the test file, after the test file calls the first API sequence, the detection device accesses the dynamic link library of the virtual operating environment and obtains the first function sequence from it. The first function sequence includes at least one function, and the functions in the first function sequence are used to implement the APIs in the first API sequence. For example, if the test file is in the PE format and the API sequence called by the test file is (CreateFile(), ReadFile()), the functions in the first function sequence are the function that implements CreateFile() and the function that implements ReadFile().
Step 2: For each function in the first function sequence, the detection device obtains the corresponding function from the dynamic link library of the second operating system, thereby generating the second function sequence.
The second function sequence includes at least one function, and the functions in the second function sequence are used to implement the APIs in the second API sequence. For example, if the operating system of the detection device is the Linux operating system and an API in the second API sequence is fcreate(), the corresponding function in the second function sequence is the function that implements fcreate(). The first function in the second function sequence has a mapping relationship with the first function in the first function sequence. For example, the function that implements fcreate() has a mapping relationship with the function that implements CreateFile().
By way of example, the process of determining the first function in the second function sequence includes the following. The detection device selects the first function from the first function sequence. Using the identifier of that function as an index, the detection device queries the mapping relationship between the identifiers of functions in the first function sequence and the identifiers of functions in the second function sequence, and obtains the identifier of the first function in the second function sequence. The detection device then determines the first function in the second function sequence according to that identifier. By analogy, the functions in the second function sequence that correspond to the other functions in the first function sequence are determined in the same manner, for example the second function in the second function sequence that corresponds to the second function in the first function sequence.
After obtaining the function in the second function sequence that corresponds to each function in the first function sequence, the detection device combines the obtained functions to obtain the second function sequence.
It should be understood that, optionally, multiple functions of the second operating system are used to implement one function of the first operating system, in which case multiple functions of the second operating system correspond to one function of the first operating system in the mapping relationship between functions. Alternatively, optionally, one function of the second operating system is used to implement one function of the first operating system, in which case one function of the second operating system corresponds to one function of the first operating system in the mapping relationship. This embodiment does not limit whether the mapping relationship between functions is one-to-one or one-to-many.
In some possible embodiments, when packaging the container image, the software provider encapsulates in the image the mapping relationship between the functions in the first function sequence and the functions in the second function sequence, so that after a container instance is generated from the image, the container instance can access that mapping relationship.
Step 3: The detection device performs operations in the kernel of the second operating system according to the second function sequence.
With the above implementation, the detection device converts the process of calling the dynamic link library of the virtual operating environment into the process of calling functions of the second operating system, and thus uses the functions provided by the second operating system to simulate the functions of the first operating system.
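Steps 1 to 3 can be sketched with two lookup tables standing in for the two dynamic link libraries. This is an illustration under stated assumptions, not the implementation described by the application: the table names, the function identifiers `virt_create_file` and `virt_read_file`, and the host-side functions are all hypothetical, and ordinary Python file operations stand in for functions of the second operating system.

```python
# Step 1 table: API identifier -> identifier of the function exported by the
# virtual operating environment's library (hypothetical identifiers).
VIRTUAL_ENV_LIBRARY = {
    "CreateFile()": "virt_create_file",
    "ReadFile()": "virt_read_file",
}


def host_create_file(path):
    """Host-side function: create an empty file, failing if it already exists."""
    with open(path, "x"):
        pass
    return path


def host_read_file(path):
    """Host-side function: return the file content as bytes."""
    with open(path, "rb") as handle:
        return handle.read()


# Step 2 table: function in the first function sequence -> function provided
# by the second operating system's library.
HOST_LIBRARY = {
    "virt_create_file": host_create_file,
    "virt_read_file": host_read_file,
}


def run_first_api_sequence(first_api_sequence, path):
    # Step 1: obtain the first function sequence.
    first_function_sequence = [VIRTUAL_ENV_LIBRARY[api] for api in first_api_sequence]
    # Step 2: obtain the second function sequence.
    second_function_sequence = [HOST_LIBRARY[name] for name in first_function_sequence]
    # Step 3: perform the operations on the host.
    return [function(path) for function in second_function_sequence]
```

Keeping the two tables separate mirrors the two-stage lookup in the text, so either library can be replaced without touching the other.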
Optionally, in some embodiments, the parameters of different function calls may differ, for example in encoding or value range, so the detection device additionally converts between the parameters of different functions. For example, the detection device obtains the first-type parameters used in the calls of the first API sequence, determines the second-type parameters according to the first-type parameters, and executes the second API sequence in the second operating system according to the second-type parameters.
The first-type parameters include at least one parameter, and the parameters they include are input parameters of the APIs in the first API sequence. The second-type parameters include at least one parameter, and the parameters they include are input parameters of the APIs in the second API sequence. The first parameter among the second-type parameters has a mapping relationship with the first parameter among the first-type parameters. For example, an API in the first API sequence is CreateFile(), and the first parameter among the first-type parameters is the parameter of CreateFile() that denotes the file name. The corresponding API in the second API sequence is fcreate(), and the first parameter among the second-type parameters is the parameter of fcreate() that denotes the file name. The detection device uses the second-type parameters as the input parameters of the second API sequence and executes the second API sequence.
By way of example, the detection device determines the first parameter from the first-type parameters. Using the identifier of that parameter as an index, the detection device queries the mapping relationship between parameters and obtains the identifier of the first parameter among the second-type parameters. The detection device then determines the first parameter among the second-type parameters according to that identifier. By analogy, the detection device determines the other second-type parameters in the same manner and combines the determined parameters to obtain the second-type parameters. The identifier of the first parameter is used to identify the first parameter; for example, it is the name of the first parameter.
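The parameter lookup and conversion can be sketched as follows. The sketch assumes that a file name arrives as UTF-16LE bytes and must be passed on as UTF-8 bytes, and that access rights are given as symbolic strings rather than numeric flags. `lpFileName` and `dwDesiredAccess` are parameter names of the Windows CreateFile function; the target names `path` and `mode`, the tables, and the function names are hypothetical.

```python
def utf16le_to_utf8(value):
    """Convert a UTF-16LE encoded file name to UTF-8 bytes (encoding difference)."""
    return value.decode("utf-16-le").encode("utf-8")


# Hypothetical value mapping for access rights (value-range difference).
ACCESS_TO_MODE = {"GENERIC_READ": "rb", "GENERIC_WRITE": "wb"}

# Mapping relationship between parameters, indexed by the identifier (name)
# of the first-type parameter: (second-type identifier, value converter).
PARAMETER_MAP = {
    "lpFileName": ("path", utf16le_to_utf8),
    "dwDesiredAccess": ("mode", ACCESS_TO_MODE.__getitem__),
}


def to_second_type_parameters(first_type_parameters):
    """Look up each first-type parameter by name and convert its value."""
    second_type_parameters = {}
    for name, value in first_type_parameters.items():
        target_name, convert = PARAMETER_MAP[name]
        second_type_parameters[target_name] = convert(value)
    return second_type_parameters
```

Pairing each target identifier with a converter keeps the identifier lookup and the value conversion in one table, which is one possible way to package the mapping relationship.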
In some possible embodiments, when packaging the container image, the software provider encapsulates in the image the mapping relationship between the first parameter among the first-type parameters and the first parameter among the second-type parameters, so that after the detection device generates a container instance from the image, the container instance can access that mapping relationship.
It should be understood that obtaining the function sequences through a dynamic link library is only an example. In some optional embodiments, the detection device stores the functions that implement the APIs in library files other than a dynamic link library and, in the same manner, obtains the first function sequence and the second function sequence from those library files. Optionally, such a library file is a static library. A library file is itself only one example of how the functions that implement the APIs may be stored, and the functions are not required to be obtained from a library file. For example, the software provider configures in advance, in the container instance, the storage addresses of the functions that implement the APIs, and the container instance accesses the preset storage addresses to obtain the first function sequence and the second function sequence.
Optionally, in some embodiments, the test file, while running, calls the first API sequence to request operations on a first resource set, and the detection device accordingly executes the second API sequence to perform operations on a second resource set. The first resource set includes at least one resource, and the resources in the first resource set are the objects operated on by the APIs in the first API sequence. The second resource set includes at least one resource, and the resources in the second resource set are the objects operated on by the APIs in the second API sequence. The first resource in the second resource set has a mapping relationship with the first resource in the first resource set.
For example, while running, the test file calls ReadFile() to request access to the Windows file system. To respond to this call, the detection device maps the Windows file system to directory A in Linux and calls fread() to access directory A. In this example, the API in the first API sequence is ReadFile() and the resource in the first resource set is the Windows file system; the API in the second API sequence is fread() and the resource in the second resource set is directory A in Linux.
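One way to picture the file system mapping is a path translation that places every Windows path beneath directory A. This is a minimal sketch: the directory location `/srv/sandbox/dir_a` and the function name are hypothetical, and the application does not specify how the mapping is performed.

```python
from pathlib import PurePosixPath, PureWindowsPath


def map_to_directory_a(windows_path, directory_a="/srv/sandbox/dir_a"):
    """Map a Windows path to a location under directory A on the Linux side."""
    source = PureWindowsPath(windows_path)
    # Drop the drive/root anchor (for example "C:\\") so the rest is relative.
    relative_parts = source.parts[1:] if source.anchor else source.parts
    if ".." in relative_parts:
        # Refuse paths that could climb out of directory A.
        raise ValueError(f"path escapes the mapped directory: {windows_path}")
    return PurePosixPath(directory_a, *relative_parts)
```

Under these assumptions, `C:\Users\test\a.txt` maps to `/srv/sandbox/dir_a/Users/test/a.txt`, so reads requested against the Windows file system are confined to directory A.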
As another example, while running, the test file calls API_1, the Windows API for network communication, to request network communication through the Windows network system. To respond to this call, the detection device maps the Windows network system to the Linux protocol stack and calls API_2, the Linux API for network communication, to request network communication through the Linux protocol stack. In this example, the resource in the first resource set is the Windows network system and the resource in the second resource set is the Linux protocol stack.
By analogy, optionally, the test file calls the first API sequence to request that an input/output (IO) device managed by the Windows system perform an operation, and the detection device executes the second API sequence to control an IO device managed by the Linux system to perform the operation.
Optionally, in some embodiments, instruction conversion is performed to convert the instructions triggered by the test file into instructions executable by the second operating system. Optionally, the instruction conversion process includes the following steps a and b.
Step a: The detection device obtains the first instruction sequence triggered by the test file while it runs.
Optionally, before the test file is run, the detection device stores the test file on disk, where the program in the test file takes the form of binary code. When the detection device receives a test instruction for the test file, it responds by reading the program in the test file, loading the program into system memory, interpreting the program in system memory as individual instructions, and executing the instructions to implement the corresponding functions. An instruction is a directive or command that directs the work of a computer, a program is a series of instructions arranged in a particular order, and the working process of a computer is the process of executing instructions.
The test file calls the first API sequence by triggering the first instruction sequence. The first instruction sequence includes at least one instruction, and each instruction in the first instruction sequence instructs the calling of one API in the first API sequence. Each instruction in the first instruction sequence belongs to a computer instruction set compatible with the operating system simulated by the virtual operating environment. For example, if the virtual operating environment simulates the operating environment of the Windows operating system, each instruction in the first instruction sequence is an X86 instruction in the X86 instruction set. If the virtual operating environment simulates the operating environment of the Linux operating system, each instruction in the first instruction sequence is an ARM instruction in the ARM instruction set.
The detection device receives each instruction triggered by running the test file in the virtual operating environment, and thereby obtains the first instruction sequence. For example, if the virtual operating environment is generated based on Docker container technology, the detection device receives each instruction triggered by the test file in the instance of the Docker container to obtain the first instruction sequence.
Step b: The detection device performs a first instruction conversion on the instructions in the first instruction sequence, and obtains the second instruction sequence from the result of the first instruction conversion.
The first instruction conversion converts instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device. For example, if the computer instruction set of the detection device is the ARM instruction set and the instruction set on which the first operating system is based is the X86 instruction set, the first instruction conversion is the process of converting X86 instructions into ARM instructions. As another example, if the computer instruction set of the detection device is the X86 instruction set and the instruction set on which the first operating system is based is the ARM instruction set, the first instruction conversion is the process of converting ARM instructions into X86 instructions.
The second instruction sequence includes at least one instruction. Each instruction in the second instruction sequence instructs the calling of one API in the second API sequence. Each instruction in the second instruction sequence belongs to the computer instruction set of the detection device, and is therefore an instruction that the architecture of the detection device can recognize and execute. For example, if the computer instruction set of the detection device is the ARM instruction set, each instruction in the second instruction sequence is an ARM instruction in the ARM instruction set. If the computer instruction set of the detection device is the X86 instruction set, each instruction in the second instruction sequence is an X86 instruction in the X86 instruction set.
After obtaining the second instruction sequence, the detection device executes it to carry out the operations corresponding to the second API sequence. For example, if each instruction in the second instruction sequence is an ARM instruction, the detection device executes each ARM instruction on an ARM CPU. Because the Linux APIs are implemented by executing ARM instructions, executing each ARM instruction carries out the operations corresponding to a series of Linux APIs.
Optionally, in some embodiments, the detection device runs an instruction conversion process, which performs instruction conversion on the first instruction sequence to obtain the second instruction sequence. Optionally, from the perspective of software architecture, the instruction conversion process is deployed outside the virtual operating environment. For example, the instruction conversion process may be inserted between the instance of the Docker container and the second operating system, so that it is deployed outside the instance of the Docker container. Alternatively, the instruction conversion process is deployed inside the virtual operating environment, for example, in the instance of the Docker container.
Instruction conversion is also called instruction translation. Optionally, in some embodiments, the detection device splits the first instruction sequence into several micro-instructions, implements each micro-instruction with a short piece of code, compiles that code into an object file, and executes the object file on the detection device. The object file contains the second instruction sequence.
With instruction conversion, if the test file is an executable file written for instruction set A and the CPU of the detection device is based on instruction set B, the detection device converts the instructions triggered by the test file from instructions in instruction set A into instructions in instruction set B. The CPU of the detection device can then execute the instructions triggered by the test file, so that the test file runs normally. This technical means therefore removes the dependence of running the test file on a specific hardware environment, so that the solution for detecting malicious files can be applied widely across various hardware environments.
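The idea of converting instructions from instruction set A into instruction set B can be illustrated with a deliberately small, table-driven sketch. It operates on textual mnemonics rather than machine code, the register and opcode tables are invented for illustration, and it is not the conversion procedure of this application; real binary translation is far more involved.

```python
# Toy tables, assumed for illustration only.
REG_MAP = {"eax": "r0", "ebx": "r1", "ecx": "r2"}
OP_MAP = {"add": "add", "sub": "sub"}


def translate_operand(operand):
    """Rename a register, or format anything else as an integer immediate."""
    if operand in REG_MAP:
        return REG_MAP[operand]
    return f"#{int(operand)}"


def translate(first_sequence):
    """Translate (op, dst, src) triples of set A into set-B style text."""
    second_sequence = []
    for op, dst, src in first_sequence:
        d = translate_operand(dst)
        s = translate_operand(src)
        if op == "mov":
            second_sequence.append(f"mov {d}, {s}")
        else:
            # Two-operand "op dst, src" becomes three-operand "op dst, dst, src".
            second_sequence.append(f"{OP_MAP[op]} {d}, {d}, {s}")
    return second_sequence


second = translate([("mov", "eax", "1"), ("add", "eax", "ebx")])
```

An opcode missing from the tables raises `KeyError`, which mirrors the fact that a translator can only handle instructions it has a rule for.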
Optionally, in some embodiments, after the detection device executes the second API sequence in the second operating system, the detection device obtains a third instruction sequence, performs a second instruction conversion on each instruction in the third instruction sequence, and obtains a fourth instruction sequence from the result of the second instruction conversion. The detection device inputs the fourth instruction sequence into the virtual operating environment, so that the test file continues to run based on the fourth instruction sequence. For example, if the virtual operating environment is generated based on Docker container technology, the detection device inputs the fourth instruction sequence into the instance of the Docker container, and the test file continues to run in that instance according to the fourth instruction sequence.
The second instruction conversion converts instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based. For example, if the computer instruction set of the detection device is the ARM instruction set and the instruction set on which the first operating system is based is the X86 instruction set, the second instruction conversion is the process of converting ARM instructions into X86 instructions. As another example, if the computer instruction set of the detection device is the X86 instruction set and the instruction set on which the first operating system is based is the ARM instruction set, the second instruction conversion is the process of converting X86 instructions into ARM instructions.
The third instruction sequence represents the result obtained after the second API sequence is executed. The instructions in the third instruction sequence belong to the computer instruction set of the detection device; optionally, the instruction set corresponding to the third instruction sequence is the same as that corresponding to the second instruction sequence. The instruction set corresponding to the fourth instruction sequence is the same as that corresponding to the first instruction sequence, and the instructions in the fourth instruction sequence belong to the computer instruction set of the virtual operating environment.
For example, if the processor of the detection device is an ARM-architecture processor, the result obtained after the processor executes ARM instructions is usually still in the form of ARM instructions, whereas the file in the virtual operating environment needs X86 instructions to determine the result. The instruction conversion process therefore converts the ARM instructions returned by the ARM-architecture processor into X86 instructions and inputs the X86 instructions into the virtual operating environment, so that the test file running in the virtual operating environment receives feedback from the processor. Based on the returned X86 instructions, the test file can determine the result of the earlier API call and continue running according to that result.
Step 406: The detection device determines whether the test file is a malicious file based on the behavior characteristics of the test file while the first API sequence is being called.
Optionally, in some embodiments, the detection device monitors the APIs provided by the virtual operating environment. After detecting that the first API sequence has been called, the detection device extracts the behavior characteristics of the test file in the process of calling the first API sequence, and determines whether the behavior characteristics satisfy a preset condition. If the behavior characteristics satisfy the preset condition, the detection device determines that the test file is a malicious file. If the behavior characteristics do not satisfy the preset condition, the detection device determines that the test file is not a malicious file, that is, it is a normal file.
Optionally, in some embodiments, after the detection device starts running the test file in the virtual operating environment, the test file exhibits its subsequent calls only if the API it currently calls in the virtual operating environment is converted into an API of the second operating system and actually executed in the second operating system. Only in this way can the complete first API sequence be obtained. The detection device then determines, according to the first API sequence, whether the test file is a malicious file.
Optionally, the detection device can monitor the APIs provided by the virtual operating environment in multiple ways. For example, when a software provider designs, based on container technology, the program code that implements the virtual operating environment, it can use various approaches to enable the virtual operating environment to output the behavior characteristics of the test file in the process of calling the first API sequence.
Take the monitoring of the first API sequence as an example. When designing the first API set that supports the virtual operating environment, the software provider embeds, in some or all of the APIs in the first API set, a piece of program code for outputting information. When executed, this program code outputs information about the call to the API in which it is embedded. The information about the call includes, but is not limited to, the identifier of the API, the parameters passed in, and the time of the call. Optionally, when designing the first API set, the software provider embeds this program code only in the APIs of interest, namely those that are more effective for distinguishing normal behavior from abnormal behavior.
Optionally, the above program code for outputting information writes the information about the call to the embedded API into a designated storage space, for example, a designated log file, so that the detection device can read the data in the designated storage space and obtain the behavior characteristics of the PE file in calling the first API sequence.
In other words, an API in the first API set has two features: its identifier is the same as the identifier of an API in the API set of the first operating system, and it outputs information about the call when it is called.
Take the case in which the first operating system is the Windows operating system. For example, the API for writing files in the first API set is WriteFile(), which has the same name as the file-writing API of the Windows operating system. When WriteFile() in the first API set is called, it also outputs information about the call to a log file. This information includes the parameters passed in when WriteFile() is called, such as the file name, the file storage location, the string content to be written to the file, and the offset of the written data relative to the file header.
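A minimal sketch of such an instrumented API follows, assuming a decorator that records each call before forwarding it. The in-memory list `CALL_LOG` stands in for the designated log file, and the `WriteFile` body and its parameter names are hypothetical stand-ins rather than the real Windows signature.

```python
import functools
import time

CALL_LOG = []  # stands in for the designated log file


def logged(api):
    """Wrap an API so that each call records its name, arguments and time."""
    @functools.wraps(api)
    def wrapper(*args, **kwargs):
        CALL_LOG.append({
            "api": api.__name__,
            "args": [repr(a) for a in args],
            "kwargs": {k: repr(v) for k, v in kwargs.items()},
            "time": time.time(),
        })
        return api(*args, **kwargs)
    return wrapper


@logged
def WriteFile(file_name, data, offset=0):
    # Hypothetical stand-in: a real shim would forward the write to the host OS.
    return len(data)


written = WriteFile("a.txt", b"hello", offset=4)
```

Keeping the logging in a wrapper means the same code can be attached to only the APIs of interest, as the text suggests.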
In other embodiments, a logging function is set in advance. When a function in the first function sequence is called, the logging function records the event that the function was called. If the detection device reads the log and finds that every function in the first function sequence is recorded, the detection device determines that the first API sequence has been called. For example, the detection device monitors the API for setting the registry provided by the virtual operating environment, that is, it monitors the function that implements RegSetValue(). On receiving the event that this function has been called, the detection device determines that the API for setting the registry has been called. By analogy, the detection device uses this monitoring mechanism to monitor every API provided by the virtual operating environment. Alternatively, the detection device uses this monitoring mechanism to monitor only the key APIs among those provided by the virtual operating environment, for example, the APIs for setting the registry, setting the file system, controlling file permissions, operating on processes, operating on threads, and controlling memory.
Optionally, behavior characteristics can be extracted from the process of calling the first API sequence in multiple ways. For example, the detection device obtains the identifier of each API in the first API sequence and the parameter values that the test file passes to each API, and extracts the behavior characteristics from these identifiers and parameter values.
The behavior characteristics represent one or more dynamic behaviors of the test file. Optionally, the behavior characteristics take the form of a sequence in which the characteristics of the individual dynamic behaviors are ordered by time of occurrence. The dynamic behaviors of a file in the virtual operating environment include, for example, one or more of process operations, file operations, registry operations, port access, and releasing or loading DLLs. Process operations include one or more of creating a process and ending a process. File operations include one or more of creating, modifying, reading, and deleting a file. Registry operations include one or more of creating, modifying, querying, and deleting a registry entry.
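As an illustration of a behavior-characteristic sequence ordered by time of occurrence, the sketch below sorts recorded API calls by timestamp and keeps each API identifier with its parameters. The record layout (`ApiCall`) and the sample values are assumptions made for this example.

```python
from typing import NamedTuple


class ApiCall(NamedTuple):
    time: float   # time at which the API was called
    api: str      # identifier of the API
    args: tuple   # parameter values passed in by the test file


def behavior_sequence(calls):
    """Return (api, args) pairs ordered by the time the calls occurred."""
    ordered = sorted(calls, key=lambda call: call.time)
    return [(call.api, call.args) for call in ordered]


calls = [
    ApiCall(3.0, "SetWindowsHook", ("C",)),
    ApiCall(1.0, "RegDeleteValue", ("A",)),
    ApiCall(2.0, "RegSetValue", ("B",)),
]
features = behavior_sequence(calls)
```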
It should be understood that the called APIs and behavior characteristics described above are only examples. If the test file calls other APIs provided by the virtual operating environment while running, whether the test file is a malicious file is determined based on the behavior characteristics in the process of calling those APIs. For example, optionally, the test file also calls APIs provided by the virtual operating environment for network communication, sending short messages, operating on the address book, or displaying pop-up windows, in which case the behavior characteristics are those of network communication, sending short messages, operating on the address book, or displaying pop-up windows. The process by which the detection device extracts and evaluates the behavior characteristics of calls to other APIs is the same as described above and is not repeated here.
After obtaining the behavior characteristics of the test file, the detection device determines from them whether the test file is a malicious file. For example, the detection device matches the behavior characteristics against predetermined rules and decides according to the matching result; or it inputs the behavior characteristics into a classification model trained in advance with a machine learning algorithm and decides according to the output of the classification model; or the behavior characteristics are analyzed manually to decide whether the test file is a malicious file.
For example, the behavior characteristics of the test file obtained by the detection device in the above manner are {RegDeleteValue(parameter A), RegSetValue(parameter B), SetWindowsHook(parameter C)}. These behavior characteristics indicate that the test file first deletes the startup entry of the antivirus software through RegDeleteValue, then adds itself to the startup entries through the registry setting function RegSetValue so that it remains resident in the system, and then sets a global hook through the SetWindowsHook function to intercept user input and steal sensitive information. Based on this series of behavior characteristics, the detection device determines that the test file is a malicious file.
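One possible form of rule matching is to check whether the APIs named in a rule occur, in order, within the observed call sequence. The sketch below assumes that form; the rule content comes from the example above, while the extra calls in `observed` (such as `CreateFile`) are invented for illustration.

```python
def matches_rule(observed, rule):
    """Return True if rule occurs in observed as an ordered subsequence."""
    remaining = iter(observed)
    # Each membership test consumes the iterator up to the match, so the
    # APIs of the rule must appear in the same order as in the rule.
    return all(name in remaining for name in rule)


RULE = ["RegDeleteValue", "RegSetValue", "SetWindowsHook"]

observed = ["CreateFile", "RegDeleteValue", "RegSetValue",
            "WriteFile", "SetWindowsHook"]
is_match = matches_rule(observed, RULE)
```

Matching a subsequence rather than an exact sequence tolerates unrelated calls interleaved between the suspicious ones.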
Optionally, in some embodiments, different weights are set for different dynamic behaviors. A weight indicates the threat level of the corresponding dynamic behavior; for example, the greater the weight, the greater the threat. Optionally, after determining the behavior characteristics, the detection device obtains the weights of the dynamic behaviors corresponding to the behavior characteristics, and determines from the behavior characteristics and the weights whether the test file is a malicious file.
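A weighted decision of this kind could, for instance, sum the weights of the observed dynamic behaviors and compare the total with a threshold. The weights, behavior names, and threshold below are hypothetical values chosen for this sketch; the application does not specify how the weights are combined.

```python
# Hypothetical weights: a larger weight means a greater threat.
WEIGHTS = {
    "registry_delete": 3,
    "registry_set": 3,
    "global_hook": 5,
    "file_read": 1,
}
THRESHOLD = 8  # assumed decision threshold


def threat_score(behaviors):
    """Sum the weights of the observed behaviors; unknown ones count as 0."""
    return sum(WEIGHTS.get(behavior, 0) for behavior in behaviors)


def is_malicious(behaviors):
    """Flag the test file when the total score reaches the threshold."""
    return threat_score(behaviors) >= THRESHOLD


score = threat_score(["registry_delete", "registry_set", "global_hook"])
```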
As APT attacks continue to evolve, attack techniques such as packing, obfuscation, and encryption are constantly updated, and traditional detection based on matching malicious signatures finds it increasingly difficult to cope. In terms of its purpose, however, a malicious file, even if obfuscated or packed, still triggers the infected machine to perform actions in order to carry out malicious behavior such as information theft, infection, or extortion. With the above implementations, the APIs called by the test file in the virtual operating environment can be used to abstract the dynamic behavior of the test file. If the dynamic behavior of the test file is found to match that of a malicious file, the test file is determined to be a malicious file, thereby achieving dynamic detection of malicious files.
The embodiments of the present application provide a solution for cross-platform dynamic detection of malicious files. A virtual operating environment generated based on container technology simulates the operating environment provided by an operating system compatible with the test file. After the test file calls an API provided by the virtual operating environment, the detection device converts that API into an API provided by the operating system of the detection device and executes the converted API in the operating system of the detection device. Executing the API of the operating system of the detection device achieves the effect of simulating the execution of the API of the first operating system. The virtual operating environment provided by the detection device is therefore compatible with normal running of the test file, which removes the dependence of the test file on a specific architecture or platform (that is, the requirement that the detection device be based on a specific architecture or platform) and achieves cross-platform malicious file detection. In addition, because the virtual operating environment is generated based on container technology, it avoids the resource overhead of a hypervisor and a guest OS and runs directly on the kernel of the host. Because a container image is much smaller than a virtual machine image, the detection method of the embodiments of the present application is more lightweight, consumes fewer CPU processing resources, and occupies less memory. The detection method runs the malicious file at the process level, so detection is faster. It also avoids the time and performance overhead of repeatedly resetting a virtual machine, as well as the overhead of creating and scheduling a traditional virtual machine.
The following uses the embodiment of FIG. 5 to illustrate the malicious file detection method described with reference to FIG. 4. In the embodiment shown in FIG. 5, the detection device is a computer device based on the ARM platform, and the test file is a PE file. In other words, the method flow described in FIG. 5 concerns how a detection device based on the ARM platform detects malicious files that use the Windows operating system to perform malicious operations. It should be understood that, for steps in the embodiment of FIG. 5 that are the same as those in the embodiment of FIG. 4, reference is made to the embodiment of FIG. 4; they are not repeated in the embodiment of FIG. 5.
Referring to FIG. 5, FIG. 5 is a flowchart of a malicious file detection method provided by an embodiment of the present application. The method includes the following steps 501 to 505.
Step 501: The detection device obtains a PE file, which is the test file in this embodiment of the present application.
Here, the detection device is a specific case of the detection device in the foregoing embodiments, and its operating system is the Linux operating system. The PE file is a specific case of the test file in the foregoing embodiments: it is an executable file that runs on the Windows operating system, and its format is the PE format.
Step 502: The detection device runs the PE file in the virtual operating environment.
Here, the virtual operating environment is generated based on container technology. The first API set includes the multiple APIs, provided by the virtual operating environment, that software needs in order to run. The identifier of an API in the first API set is the same as the identifier of a Windows API in the Windows API set. The Windows API set includes the multiple Windows APIs that the Windows operating system provides to the test file and that software needs in order to run.
Step 503: The detection device obtains the first API sequence called by the PE file during running. The first API sequence includes at least one API.
Step 504: The detection device executes a second API sequence in the Linux operating system. The second API sequence includes at least one Linux API, and the first Linux API in the second API sequence has a mapping relationship with the first API in the first API sequence.
In this step, executing Linux APIs achieves the effect of simulating the execution of Windows APIs, thereby simulating the operating environment that the Windows system provides for the test file and achieving operating system simulation. Under the Windows operating system, an application notifies the operating system to perform a function by means of function calls, and the functions involved in a Windows API call are usually provided by the dynamic link libraries of the system. Correspondingly, the process of simulating the execution of the Windows APIs optionally includes the following steps (1) to (3).
Step (1): For each API in the first API sequence, the detection device obtains the corresponding function from a DLL file.
The DLL file is provided by the virtual operating environment. For example, the dynamic link libraries of the virtual operating environment include the DLL file. In a possible implementation, the software provider packages the DLL file into the container image when building the image. After the detection device generates a container instance from the image, the container instance provides the DLL file to the PE file running in it, so that the PE file can call the first function sequence in the DLL file.
在一些实施例中,可选地,Windows操作系统包括多个DLL文件,访问DLL文件的过程包括按照一定的顺序依次访问多个DLL文件的过程。其中,可选地,最后一个访问的DLL文件是ntdll.dll文件。In some embodiments, optionally, the Windows operating system includes multiple DLL files, and the process of accessing the DLL file includes a process of sequentially accessing the multiple DLL files in a certain order. Among them, optionally, the last DLL file accessed is the ntdll.dll file.
Step (2): The detection device obtains, for each function in the first function sequence and according to the mapping relationship between functions, the mapped function from an SO file.
An SO file is a file in the Linux operating system that contains a dynamic link library. The SO file runs on the ARM platform and is provided by the virtual runtime environment. For example, the dynamic link libraries of the virtual runtime environment include the SO file. In one possible implementation, the software provider packages the SO file into the container image. After the detection device generates a container instance from the image, when the first function sequence is called, the detection device can access the SO file to obtain a second function sequence.
Step (3): In the kernel of the Linux operating system, operations are performed according to the second function sequence.
For example, under the Windows operating system, when a PE file calls the API WriteFile(), the call goes into the dynamic link library kernel32.dll, which in turn calls the function NtWriteFile() in the dynamic link library ntdll.dll, and the file write operation is finally performed in the kernel of the Windows system. In this embodiment, by contrast, while the test file runs under the Linux operating system, when the PE file calls the API WriteFile(), the detection device optionally calls the function fwrite() in the SO file, and the operation is completed in the Linux kernel according to fwrite().
Calling the SO files of the Linux operating system simulates the calling process of DLL files on the Windows operating system. Performing operations in the Linux kernel according to functions simulates the process of performing operations in the Windows kernel according to functions. In the example in the previous paragraph, the file write performed in the Linux kernel according to fwrite() simulates the file write performed in the Windows kernel according to NtWriteFile().
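The mapping idea in steps (1) to (3) can be illustrated with a minimal Python sketch. It is not the implementation described in this application: the table `API_MAP`, the dispatcher `emulate_windows_api`, and the stand-in `so_fwrite` are hypothetical names introduced here, and a plain Python file write stands in for the fwrite() function that the SO file would provide.

```python
import os
import tempfile


def so_fwrite(path: str, data: bytes) -> int:
    """Hypothetical stand-in for the mapped SO-file function (fwrite() in the text)."""
    with open(path, "ab") as f:
        return f.write(data)


# Hypothetical mapping relationship: Windows API name -> mapped Linux-side function.
API_MAP = {"WriteFile": so_fwrite}


def emulate_windows_api(name: str, *args):
    """Look up the function mapped to a Windows API name and execute it."""
    try:
        mapped = API_MAP[name]
    except KeyError:
        raise NotImplementedError(f"no mapping for Windows API {name!r}") from None
    return mapped(*args)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "out.txt")
        # The test file "calls WriteFile"; the Linux-side function does the work.
        print(emulate_windows_api("WriteFile", target, b"hello"))
```

A lookup table keeps the mapping relationship explicit, so an API without a mapped function fails loudly instead of being silently ignored.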
In some embodiments, instruction conversion is optionally performed to convert the instructions triggered by the test file into instructions executable under the Linux operating system. Optionally, the instruction conversion process includes the following steps A and B.
Step A: The detection device obtains a first instruction sequence triggered by the test file during running. Each instruction in the first instruction sequence is an X86 instruction, and each X86 instruction in the first instruction sequence indicates a call to one Windows API in the first API sequence.
Step B: The detection device converts each X86 instruction in the first instruction sequence into an ARM instruction and obtains a second instruction sequence from the converted ARM instructions. The second instruction sequence includes at least one ARM instruction. Each ARM instruction in the second instruction sequence indicates a call to one Linux API in the second API sequence and belongs to the ARM instruction set.
The detection device then executes each ARM instruction in the second instruction sequence on an ARM CPU to perform the operation corresponding to each Linux API in the second API sequence, so that the Windows APIs are simulated through Linux APIs.
In some embodiments, after the detection device executes the second API sequence in the Linux operating system, the detection device optionally obtains a third instruction sequence. The third instruction sequence represents the result of executing the second API sequence, and its instructions are ARM instructions. The detection device converts each ARM instruction in the third instruction sequence into an X86 instruction and obtains a fourth instruction sequence from the converted X86 instructions; the instructions in the fourth instruction sequence belong to the X86 instruction set. The detection device inputs the fourth instruction sequence into the virtual runtime environment.
Referring to FIG. 3 and based on the above method flow, once an X86 application (APP) is obtained, the X86 APP triggers calls to the DLL files in the virtual runtime environment and generates X86 instructions. The X86 instructions are converted into ARM instructions, so that the X86 instructions are in effect executed through the ARM-based operating system. Here, an X86 APP is an application developed based on the X86 instruction set, and an X86 APP can be packaged as a PE file.
Step 505: The detection device determines whether the PE file is a malicious file based on the behavior characteristics of the PE file while the first API sequence is being called.
Optionally, the malicious file detection method provided in this embodiment is implemented in software alone, for example entirely in the form of a computer program product. The software can be supplied to users by a software provider, which may differ from the manufacturer of the detection device. For example, the hardware of the detection device is supplied by a network equipment manufacturer, while the malicious file detection software running on the detection device is supplied by an Internet application service provider. When designing the program code that implements the virtual runtime environment based on container technology, the software provider can use various means to make the virtual runtime environment output the behavior characteristics of the test file as it calls the first API sequence.
For example, when designing the first API set that supports the virtual runtime environment, the software provider embeds a piece of program code for outputting information in some or all of the APIs in the first API set. When executed, this program code outputs information about the call to the API in which it is embedded. The information about the call includes, but is not limited to, the identifier of the API, the parameters passed in, and the time of the call. Optionally, when designing the first API set, the software provider embeds this program code only in those APIs that are of interest and that discriminate well between normal and abnormal behavior.
Optionally, the program code for outputting information writes the information about the call to a designated storage space, for example a designated log file, so that the detection device can read the data in the designated storage space and obtain the behavior characteristics of the PE file calling the first API sequence.
In other words, an API in the first API set has two features: its identifier is the same as that of a Windows API in the Windows API set, and it outputs information about the call when it is called.
For example, the API for writing files in the first API set is WriteFile(), which has the same name as the file-writing API of the Windows operating system. When WriteFile() in the first API set is called, it also outputs information about the call to the log file. This information includes the parameters passed to WriteFile(), such as the file name, the file storage location, the string content to be written to the file, and the offset of the written data relative to the start of the file.
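The embedded output code can be sketched as a Python decorator. This is an illustrative assumption rather than the code of this application: `log_api_call`, the in-memory `CALL_LOG` (standing in for the designated log file), and the stub body of `WriteFile` are all hypothetical.

```python
import functools
import json
import time

# Stands in for the designated storage space (for example, a log file).
CALL_LOG = []


def log_api_call(func):
    """Record the API identifier, the parameters passed in, and the call time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALL_LOG.append({
            "api": func.__name__,
            "args": [repr(a) for a in args],
            "kwargs": {k: repr(v) for k, v in kwargs.items()},
            "time": time.time(),
        })
        return func(*args, **kwargs)
    return wrapper


@log_api_call
def WriteFile(file_name, data, offset=0):
    """Simulated API named like the Windows API; the body is only a stub."""
    return len(data)


if __name__ == "__main__":
    WriteFile("C:\\temp\\a.txt", b"payload", offset=0)
    print(json.dumps(CALL_LOG, indent=2))
```

Because the logging lives in a wrapper, it can be applied only to the APIs of interest, which matches the optional selective embedding described above.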
After obtaining the behavior characteristics of the PE file in the above manner, the detection device determines from them whether the PE file is malicious. For example, the detection device matches the behavior characteristics of the PE file against predetermined rules and decides according to the matching result; or it inputs the behavior characteristics into a classification model trained in advance with a machine learning algorithm and decides according to the output of the model; or the behavior characteristics are analyzed manually to decide whether the file is malicious.
For example, the behavior characteristics of the PE file obtained by the detection device in the above manner are {RegDeleteValue(parameter A), RegSetValue(parameter B), SetWindowsHook(parameter C)}. These behavior characteristics indicate that the PE file first deletes the startup entry of the antivirus software through RegDeleteValue, then adds itself to the auto-start entries through the registry-setting function RegSetValue to achieve persistence on the system, and then sets a global hook through the SetWindowsHook function to intercept user input and steal sensitive information. Based on this series of behavior characteristics, the detection device determines that the PE file serving as the test file is malicious.
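Rule matching of this kind can be sketched in Python as an ordered-subsequence check over the recorded API names. The rule format and the names `SUSPICIOUS_RULE` and `matches_rule` are assumptions made for illustration; the application does not specify how the predetermined rules are encoded, and parameters are ignored here.

```python
from typing import Iterable, Sequence

# Hypothetical rule taken from the example above: the three calls, in this order.
SUSPICIOUS_RULE = ("RegDeleteValue", "RegSetValue", "SetWindowsHook")


def matches_rule(calls: Iterable[str], rule: Sequence[str]) -> bool:
    """Return True if `rule` occurs in `calls` as an ordered subsequence.

    Other calls may be interleaved; only the relative order matters.
    """
    it = iter(calls)
    # Each `any` consumes the iterator up to the next match, preserving order.
    return all(any(call == step for call in it) for step in rule)


if __name__ == "__main__":
    observed = ["CreateFile", "RegDeleteValue", "ReadFile",
                "RegSetValue", "SetWindowsHook"]
    print(matches_rule(observed, SUSPICIOUS_RULE))
```

Allowing interleaved calls is a design choice of this sketch: a sample that performs unrelated operations between the three steps would still match the rule.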
It should be understood that the flow provided in the embodiment of FIG. 5 is an exemplary illustration of how a detection device based on a non-X86 platform detects Windows executable files, and is not the only possible implementation. Optionally, in other embodiments, the Linux operating system of the detection device in the embodiment of FIG. 5 is replaced with another operating system based on the ARM instruction set architecture. In that case, the detection device needs to replace the SO file in the embodiment of FIG. 5 with the dynamic link library of that other operating system. These implementations are specific cases of the embodiment of FIG. 4 and shall also fall within the protection scope of the embodiments of this application.
Computer devices based on non-X86 platforms are increasingly common in existing networks, while a large number of test files are still executable files that run on the Windows operating system. When an existing detection device on a non-X86 platform tries to detect a test file for the Windows operating system, the prior art faces an inherent obstacle because the test file cannot be executed.
With the method provided in the embodiments of this application, a detection device based on a non-X86 platform uses a virtual runtime environment generated with container technology to simulate a runtime environment similar to the Windows operating system, so that Windows executable files can run normally in that virtual runtime environment. Therefore, even if the test file is an executable file for the Windows operating system, such as a PE file, and the operating system of the detection device is Linux, the method can dynamically detect the PE file under the Linux operating system. This removes the test file's dependence on the Windows operating system: a detection device based on a non-X86 platform can dynamically detect test files that run on Windows, and the detection device is not required to be based on an X86 platform compatible with the Windows operating system. The method thus overcomes this limitation of detection devices and extends the usage scenarios of malicious file detection technology.
The following uses the embodiment of FIG. 6 to illustrate the malicious file detection method described in FIG. 4 and FIG. 5 of the embodiments of this application. In the embodiment shown in FIG. 6, the detection device is a computer device based on the ARM platform, and the test file is a Windows EXE file. In other words, the method flow described in FIG. 6 concerns how an ARM-based detection device detects whether a Windows EXE file is malicious. The method includes the following steps 601 to 607.
Step 601: The detection device obtains a Windows EXE file as the test file.
Step 602: The detection device obtains the multiple Windows APIs called by the Windows EXE file in the virtual runtime environment.
Step 603: According to the multiple APIs called in step 602, the detection device accesses the dynamic link libraries gdi32.dll, user32.dll, or kernel32.dll in turn. These libraries contain the execution functions corresponding to the called APIs.
Step 604: According to the functions hit in gdi32.dll, user32.dll, or kernel32.dll (that is, the execution functions corresponding to the multiple APIs called in step 603), the detection device accesses the kernel-level dynamic link library ntdll.dll, which contains the basic functions corresponding to the hit functions.
Step 605: The system simulation process run by the detection device determines the Linux APIs to which the basic functions in ntdll.dll (those corresponding to the functions hit in step 604) are mapped.
Step 606: The detection device accesses the Linux libraries to obtain the functions corresponding to those Linux APIs.
Step 607: The Unix kernel run by the detection device, according to the functions corresponding to the Linux APIs in step 606, controls Unix devices through Unix device drivers to perform the operations.
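The lookup chain of steps 603 to 606 can be sketched in Python as two table lookups. The table names `USER_DLL_EXPORTS` and `NTDLL_TO_LINUX` and the function `resolve` are hypothetical; the only entries filled in are the WriteFile, NtWriteFile, and fwrite names that the description itself gives as an example.

```python
# Steps 603/604: user-level DLLs map an API to a basic function in ntdll.dll.
# Only the entry named in the description is filled in; the rest is left empty.
USER_DLL_EXPORTS = {
    "gdi32.dll": {},
    "user32.dll": {},
    "kernel32.dll": {"WriteFile": "NtWriteFile"},
}

# Step 605: basic functions in ntdll.dll map to Linux-side functions.
NTDLL_TO_LINUX = {"NtWriteFile": "fwrite"}


def resolve(api_name: str) -> list:
    """Return the resolution chain for one Windows API as a list of names."""
    for dll, exports in USER_DLL_EXPORTS.items():
        if api_name in exports:
            basic = exports[api_name]
            linux_func = NTDLL_TO_LINUX.get(basic)
            if linux_func is None:
                raise LookupError(f"no Linux mapping for {basic}")
            return [f"{dll}!{api_name}", f"ntdll.dll!{basic}", linux_func]
    raise LookupError(f"{api_name} not found in any user-level DLL")


if __name__ == "__main__":
    print(" -> ".join(resolve("WriteFile")))
```

Step 607, in which the kernel drives the device, is outside the scope of this sketch.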
The following uses the embodiment of FIG. 7 to illustrate the malicious file detection method described in FIG. 6 of the embodiments of this application. In the embodiment shown in FIG. 7, the test file is malware.exe. The word "malware" is a blend of "malicious" and "software"; it is the term for software programs that can threaten a computer, such as viruses, worms, Trojan horses, and spyware. malware.exe is a PE file. On the X86 platform, running malware.exe calls different DLL files depending on the function calls made, and all DLL calls eventually reach ntdll.dll, from which execution enters the kernel of the Windows system to carry out the function calls. In this embodiment, by contrast, the ARM-based detection device performs the following steps 701 to 705 to run malware.exe and detect the malicious file contained in malware.exe.
Step 701: The detection device starts malware.exe.
For example, a Docker instance includes a system simulation process: the Docker instance is the parent process and the system simulation process is a child process. The system simulation process is the process in the Docker instance that is used to simulate the operating system of the detection device. For example, the system simulation process can access DLL files, obtain the APIs encapsulated in the DLL files, and perform API conversion.
The Docker instance can start malware.exe through the system simulation process. Through the system simulation process, the Docker instance loads the binary image of malware.exe into the memory space of the detection device and starts the binary image. In addition, the system simulation process accesses the DLL files and SO files required by malware.exe, which ensures that DLL calls and functions execute normally while malware.exe runs. The memory space is requested in advance by the Docker instance from the real operating system of the detection device.
Step 702: The detection device calls functions in the DLL files.
The functions in the DLL files can make up the first API sequence. While malware.exe runs, it generates requests for the registry, files, and system I/O of the Windows operating system. These requests are communicated to the Windows operating system as function calls; the called functions can reside in DLL files, and one or more called functions can make up an API.
Optionally, the system simulation process uses resources of the Linux operating system to simulate resources of the Windows operating system. For example, the Windows file system is mapped to a directory in Linux, so that the Linux directory simulates the Windows file system. Optionally, the Windows network system is simulated on top of the Linux protocol stack.
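Mapping the Windows file system onto a Linux directory can be sketched with Python's `pathlib`. The function `map_windows_path`, the default root `/opt/sandbox/drives`, and the one-subdirectory-per-drive layout are assumptions for illustration only; the application does not specify the directory layout.

```python
from pathlib import PurePosixPath, PureWindowsPath


def map_windows_path(win_path: str, root: str = "/opt/sandbox/drives") -> str:
    """Map an absolute Windows path to a path under a Linux directory.

    Assumed layout: <root>/<lowercase drive letter>/<remaining components>.
    """
    p = PureWindowsPath(win_path)
    # Accept only "X:\..." style paths; reject relative and UNC paths.
    if len(p.drive) != 2 or p.drive[1] != ":" or not p.root:
        raise ValueError(f"not an absolute drive path: {win_path!r}")
    parts = p.parts[1:]  # drop the anchor, e.g. "C:\\"
    if ".." in parts:
        raise ValueError("parent references are not allowed")
    return str(PurePosixPath(root, p.drive[0].lower(), *parts))


if __name__ == "__main__":
    print(map_windows_path("C:\\Users\\test\\a.txt"))
```

Rejecting `..` components keeps a mapped path from escaping the chosen Linux directory, which is a precaution added in this sketch rather than something stated in the description.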
Step 703: The detection device performs instruction conversion.
The instructions generated by malware.exe running inside the Docker instance are still X86 instructions. Optionally, an instruction conversion process converts the X86 instructions into ARM instructions.
Step 704: The detection device executes the ARM instructions on the Linux operating system to perform the operations indicated by the ARM instructions.
Optionally, the Linux operating system of the detection device executes the ARM instructions on a CPU based on the ARM architecture. While executing the ARM instructions, the CPU can control other computer hardware (such as peripheral input/output devices) to perform the operations corresponding to the ARM instructions. The computer hardware can return the execution result of an operation to the Linux operating system, which returns the execution result to the instruction conversion process. The instruction conversion process converts the execution result from ARM instructions into X86 instructions and feeds the X86 instructions back to the Docker instance process, so that malware.exe continues to execute.
Step 705: The detection device makes a threat determination based on the dynamic behavior observed during the calls.
For example, the detection device monitors the simulated Windows APIs, abstracts the behavior characteristics of malware.exe from the called functions and their parameters, and determines whether the behavior is malicious according to those behavior characteristics, thereby completing the dynamic detection of malware.exe.
Through the above flow, the ARM-based detection device uses operating system simulation and instruction conversion to dynamically detect X86-platform files on the ARM platform. In addition, the malicious file runs at the process level, which avoids operations such as creating and scheduling a traditional virtual machine, consumes few resources, and runs fast, ultimately achieving cross-platform detection of malicious files.
The following uses the embodiment of FIG. 8 to illustrate the malicious file detection method described in FIG. 4 of the embodiments of this application. In the embodiment shown in FIG. 8, the detection device is a computer device based on the X86 platform, and the test file is an ELF file. In other words, the method flow described in FIG. 8 concerns how an X86-based detection device detects malicious files that use the Linux operating system to perform malicious operations. It should be understood that, for steps in the embodiment of FIG. 8 that are analogous to those in the embodiment of FIG. 4, reference is made to the embodiment of FIG. 4; they are not repeated here.
Referring to FIG. 8, FIG. 8 is a flowchart of a malicious file detection method provided in an embodiment of this application. The method includes the following steps 801 to 805.
Step 801: The detection device obtains an ELF file, which is the test file in this embodiment of the application.
The detection device is a specific case of the detection device in the foregoing embodiments. Optionally, the operating system of the detection device is the Windows operating system. The ELF file is a specific case of the test file in the foregoing embodiments. An ELF file is an executable file that runs on the Linux operating system, and its format is the ELF format.
Step 802: The detection device runs the ELF file in the virtual runtime environment.
The virtual runtime environment is generated based on container technology. The first API set includes multiple APIs, namely the APIs that the virtual runtime environment provides and that software needs in order to run. The identifier of an API in the first API set is the same as the identifier of a Linux API in the Linux API set. The Linux API set includes multiple Linux APIs, namely the APIs that the Linux operating system provides for the test file and that software needs in order to run.
Step 803: The detection device obtains the first API sequence called by the ELF file during running. The first API sequence includes at least one API, and the APIs in the first API sequence are APIs in the first API set.
Step 804: The detection device executes a second API sequence in the Windows operating system. The second API sequence includes at least one Windows API, and the first Windows API in the second API sequence has a mapping relationship with the first API in the first API sequence.
In this step, executing Windows APIs achieves the effect of simulating the execution of Linux APIs. This simulates the runtime environment that the Linux system provides for the test file and thereby achieves operating system simulation. Optionally, the process of simulating the execution of a Linux API includes the following steps 8041 to 8043.
Step 8041: The detection device obtains, for each API in the first API sequence, the corresponding function from an SO file.
The SO file is provided by the virtual runtime environment. For example, the dynamic link libraries of the virtual runtime environment include the SO file. In one possible implementation, the software provider packages the SO file into the container image. After the detection device generates a container instance from the image, the container instance provides the SO file to the ELF file running in it, so that the ELF file can call the first function sequence in the SO file.
Step 8042: The detection device obtains, for each function in the first function sequence and according to the mapping relationship between functions, the mapped function from a DLL file.
The DLL file is provided by the virtual runtime environment. For example, the dynamic link libraries of the virtual runtime environment include the DLL file. In one possible implementation, the software provider packages the DLL file into the container image. After the detection device generates a container instance from the image, the container instance provides the DLL file to the ELF file running in it, so that after the first function sequence is called, the detection device can access the DLL file to obtain a second function sequence.
Step 8043: In the kernel of the Windows operating system, operations are performed according to the second function sequence.
For example, under the Linux operating system, when an ELF file calls the API fwrite(), the call goes to the function fwrite in the SO file, and the file write operation is performed in the kernel of the Linux system. In this embodiment, by contrast, while the ELF file runs under the Windows operating system, when the ELF file calls the API fwrite(), the detection device calls into kernel32.dll, which in turn calls the function NtWriteFile() in ntdll.dll, and the file write operation is finally performed in the kernel of the Windows system.
Using the DLL files of the Windows operating system simulates the calling process of SO files on the Linux operating system. Performing operations in the Windows kernel according to functions simulates the process of performing operations in the Linux kernel according to functions.
In some embodiments, optionally, instruction conversion is performed to convert the instructions triggered by the test file into instructions executable by the Windows operating system. Optionally, the instruction conversion process includes the following Step 1 and Step 2.
Step 1: The detection device obtains the first instruction sequence triggered by the test file during running. Each instruction in the first instruction sequence is an ARM instruction, and each ARM instruction is used to indicate a call to one Linux API in the first API sequence.
Step 2: The detection device converts each ARM instruction in the first instruction sequence into an X86 instruction and obtains a second instruction sequence from the converted X86 instructions. The second instruction sequence includes at least one X86 instruction. Each instruction in the second instruction sequence belongs to the X86 instruction set and is used to indicate a call to one Windows API in the second API sequence.
The detection device then executes each X86 instruction in the second instruction sequence on the X86 CPU to carry out the operation corresponding to each Windows API in the second API sequence, thereby simulating the Linux APIs through the Windows APIs.
In some embodiments, optionally, after the detection device executes the second API sequence in the Windows operating system, the detection device obtains a third instruction sequence. The third instruction sequence represents the result obtained after the second API sequence is executed, and the instructions in the third instruction sequence are X86 instructions. The detection device converts each X86 instruction in the third instruction sequence into an ARM instruction and obtains a fourth instruction sequence from the converted ARM instructions; the instructions in the fourth instruction sequence belong to the ARM instruction set. The detection device inputs the fourth instruction sequence into the virtual operating environment.
For example, based on the above method flow, once the ARM APP is obtained, it triggers calls to the SO files in the virtual operating environment and thereby generates ARM instructions. The ARM instructions are converted into X86 instructions, which are executed by the X86-based operating system. Here, an ARM APP is an application developed based on the ARM instruction set; optionally, the ARM APP is packaged as an ELF file.
Step 805: The detection device determines whether the ELF file is a malicious file based on the behavior characteristics of the ELF file while the first API sequence is called.
Optionally, when the software provider designs the program code that implements the virtual operating environment based on container technology, it can use various methods to enable the virtual operating environment to output the behavior characteristics of the test file while the test file calls the first API sequence.
For example, when designing the first API set that supports the virtual operating environment, the software provider embeds a piece of program code for outputting information in some or all of the APIs in the first API set. When executed, this program code outputs information about the call to the API in which it is embedded. The information about the call includes, but is not limited to, the identifier of the API, the parameters passed in, and the time of the call. Optionally, when designing the first API set, the software provider embeds this program code only in the APIs of interest, namely those that are effective for distinguishing normal behavior from abnormal behavior.
Optionally, the program code for outputting information writes the information about the call to the embedded API into a designated storage space, for example a designated log file, so that the detection device can read the data in the designated storage space and thereby obtain the behavior characteristics of the ELF file calling the first API sequence.
In other words, an API in the first API set has two features: its identifier is the same as the identifier of a Linux API in the Linux API set, and it outputs information about the call when it is called.
For example, the API for writing files in the first API set is fwrite(), which has the same name as the file-writing API of the Linux operating system. When fwrite() in the first API set is called, it additionally outputs information about the call to the log file. This information includes the parameters passed in when fwrite() is called, such as the file name, the file storage location, the string content to be written to the file, and the offset of the written data relative to the start of the file.
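One way to picture an API that both keeps its Linux name and reports its own invocation is a logging wrapper. The sketch below is an assumption-laden illustration rather than the implementation of the first API set: the decorator log_api_call, the JSON record layout, the log file name api_calls.log and the stub body of fwrite are all hypothetical.

```python
import functools
import json
import time


def log_api_call(log_path):
    """Decorator factory: append one JSON record per call to log_path."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            record = {
                "api": func.__name__,                       # identifier of the API
                "args": [repr(a) for a in args],            # positional parameters
                "kwargs": {k: repr(v) for k, v in kwargs.items()},
                "time": time.time(),                        # time of the call
            }
            with open(log_path, "a", encoding="utf-8") as log:
                log.write(json.dumps(record) + "\n")
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical stand-in for the emulated API; it only reports the byte count
# and does not touch any file.
@log_api_call("api_calls.log")
def fwrite(data: bytes, path: str, offset: int = 0) -> int:
    return len(data)
```

Decorating a function has no side effects in this sketch; a record is appended only when the wrapped API is actually called, which is when the detection device would later find it in the log file.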
After obtaining the behavior characteristics of the ELF file in the above manner, the detection device determines whether the ELF file is a malicious file according to those behavior characteristics. For example, the detection device matches the behavior characteristics of the ELF file against predetermined rules and determines whether the file is malicious according to the matching result; or it inputs the behavior characteristics into a classification model trained in advance by a machine learning algorithm and determines whether the file is malicious according to the output of the model; or the behavior characteristics are analyzed manually to determine whether the file is malicious.
For example, the behavior characteristics of the ELF file obtained by the detection device in the above manner are {RegDeleteValue(parameter A), RegSetValue(parameter B), SetLinuxHook(parameter C)}. These behavior characteristics indicate that the ELF file first deletes the startup entry of the antivirus software through RegDeleteValue, then adds itself to the startup entries through the registry setting function RegSetValue so that it persists in the system, and further sets a global hook through the SetLinuxHook function to intercept user input and steal sensitive information. Based on this series of behavior characteristics, the detection device determines that the ELF file serving as the test file is a malicious file.
As another example, during running the ELF file calls the API for opening a file and passes to that API parameter values representing the Linux kernel symbol table, so the behavior characteristics include fopen, proc/kallsyms, and r. Here, fopen indicates opening a file, and proc/kallsyms and r indicate the Linux kernel symbol table. Because malicious files often obtain ROOT privileges by opening the Linux kernel symbol table, the ELF file is determined to be a malicious file when the behavior characteristics are found to include fopen, proc/kallsyms, and r.
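The two rule-matching examples above can be sketched as follows. This is a minimal illustration, assuming the behavior characteristics have already been parsed into (API name, arguments) pairs; the type alias Call and the function names contains_in_order, opens_kernel_symbols and is_malicious are hypothetical, and a practical rule set would be considerably richer.

```python
from typing import Iterable, Sequence, Tuple

# One observed call: (API name, tuple of string arguments). Assumed record shape.
Call = Tuple[str, Tuple[str, ...]]


def contains_in_order(calls: Sequence[Call], names: Iterable[str]) -> bool:
    """True if the API names occur in the given order (not necessarily adjacent)."""
    seen = iter(api for api, _ in calls)
    # `name in seen` advances the iterator, so matches must appear in order.
    return all(name in seen for name in names)


def opens_kernel_symbols(calls: Sequence[Call]) -> bool:
    """True if fopen is called on proc/kallsyms in read mode."""
    return any(
        api == "fopen"
        and len(args) >= 2
        and args[0].endswith("proc/kallsyms")
        and args[1] == "r"
        for api, args in calls
    )


def is_malicious(calls: Sequence[Call]) -> bool:
    """Apply the two example rules from the description."""
    persistence_and_hook = contains_in_order(
        calls, ["RegDeleteValue", "RegSetValue", "SetLinuxHook"]
    )
    return persistence_and_hook or opens_kernel_symbols(calls)
```

The ordered-subsequence check reflects that the first example describes a series of actions, whereas the second example is triggered by a single call with specific parameter values.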
It should be understood that the procedure provided in the embodiment of FIG. 8 is an exemplary description of a solution in which an X86-based detection device detects a Linux executable file, and is not the only possible implementation of such a solution. In other embodiments, optionally, the Windows operating system in the embodiment of FIG. 8 is replaced with another operating system based on the X86 instruction set architecture. In that case, the detection device needs to replace the DLL files in the embodiment of FIG. 8 with the dynamic link libraries of that other operating system. These implementations are specific cases of the embodiment of FIG. 4 and also fall within the protection scope of the embodiments of the present application.
Computer devices based on the X86 platform are increasingly common in existing networks, while a large number of test files are still executable files that run on the Linux operating system. When an existing X86-based detection device tries to detect such a test file, the prior art faces an inherent obstacle because the test file cannot be executed.
With the method provided by the embodiments of the present application, an X86-based detection device can use a virtual operating environment generated based on container technology to simulate an operating environment similar to that provided by the Linux operating system, so that Linux executable files can execute normally in the virtual operating environment. As a result, even if the test file is an executable file compatible with the Linux operating system, such as an ELF file, and the operating system of the detection device is the Windows operating system, the ELF file can be dynamically detected under the Windows operating system. Detection of the test file therefore no longer depends on the Linux operating system: an X86-based detection device can dynamically detect test files that run on Linux, and the detection device is not required to be based on an ARM platform compatible with the Linux operating system. This overcomes the usage limitations of the detection device and extends the scenarios in which malicious file detection technology can be used.
In some possible embodiments, optionally, the product form of the above malicious file detection method is a container application that provides a malicious file detection function. For example, the software provider mentioned above may be a cloud computing service provider that provides the container application for an enterprise network: a cloud server deploys the container application in the enterprise network, and the detection device in the enterprise network implements the method provided in the foregoing embodiments by running the container application. In addition, optionally, a container cluster is deployed in the enterprise network through a Cloud Container Engine (CCE), and life cycle management of the container application, such as deployment, management, capacity expansion, upgrade, uninstallation, scaling, service discovery, and load balancing, is performed in the cloud. Users of the enterprise network can use CCE to conveniently manage the container applications deployed in the enterprise network according to their malicious file detection needs.
The malicious file detection method described in FIG. 4 is illustrated below by the embodiment of FIG. 9. In the embodiment shown in FIG. 9, the container takes the form of a container application, and the cloud computing service provider obtains the image of the container application by operating the container service management entity. Referring to FIG. 9, the method includes the following steps 901 to 907.
Step 901: The container service management entity creates an image of the container application.
In some possible embodiments, optionally, the container service management entity is a container as a service (CaaS) manager. CaaS is a kind of platform as a service (PaaS) for providing container services. CaaS is located at the bottom of the PaaS layer and integrates the service capabilities of the PaaS layer and the IaaS layer; for example, optionally, CaaS includes container applications at the PaaS layer and container resources at the IaaS layer. The CaaS manager is the entity in CaaS that manages container services, and the container service management entity is used to manage and orchestrate CaaS. The name CaaS manager is only an example; other names may also be used to refer to the entity in CaaS that manages container services.
In some embodiments, optionally, the container service management entity packages the resources of the virtual operating environment based on Docker technology to obtain a Docker image. For example, the various registries, DLL calls, services, and the like are packaged to obtain the Docker image.
Step 902: The container service management entity sends the image of the container application to the detection device.
Optionally, the detection device receives the image, creates the container application from the image, and runs the resulting container instance.
In some possible embodiments, optionally, the image is sent to the detection device through the Docker image repository (also called the Docker registry) of Docker technology. For example, the container service management entity sends the Docker image to the Docker image repository, and the detection device downloads the Docker image from the Docker image repository, thereby obtaining the image sent by the container service management entity. For example, the container service management entity sends the Docker image to the Docker image repository through an image sending instruction (for example, a Docker push instruction); the detection device sends an image download instruction (for example, a Docker pull instruction) to the Docker image repository; and the Docker image repository responds to the image download instruction by sending the Docker image to the detection device, so that the Docker image is deployed on the detection device.
The Docker image repository is a node device in the cluster that stores and distributes Docker images, and it holds a large number of Docker images. Optionally, the Docker image repository is implemented based on a protocol such as the Docker registry protocol or Docker hub. Optionally, a Docker image is stored in the Docker image repository in the form of multiple image layers and one piece of image description information.
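The push and pull exchange with the image repository can be sketched with the Docker command line, whose `docker push` and `docker pull` subcommands exist as such. Everything else in the sketch is assumed for illustration: the helper names registry_command and run, and any image reference passed to them, are hypothetical and not part of the described method.

```python
import subprocess
from typing import List


def registry_command(action: str, image: str) -> List[str]:
    """Build the Docker CLI command for sending or downloading an image."""
    if action not in ("push", "pull"):
        raise ValueError(f"unsupported action: {action!r}")
    return ["docker", action, image]


def run(command: List[str]) -> None:
    """Execute the command; raises CalledProcessError if Docker reports failure."""
    subprocess.run(command, check=True)
```

In this picture the container service management entity would run the command built with "push", and the detection device would run the one built with "pull" against the same image reference.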
Step 903: The detection device obtains the test file.
Step 904: The detection device runs the test file in the virtual operating environment.
Step 905: The detection device obtains the first API sequence called by the test file during running.
Step 906: The detection device executes the second API sequence in the second operating system.
Step 907: The detection device determines whether the test file is a malicious file based on the behavior characteristics of the test file while the first API sequence is called.
Referring to FIG. 3, the operating system simulation procedure and the behavior monitoring procedure described in the embodiment of FIG. 9 are executed in the container application. In some embodiments, optionally, one executable file is started in each container application, and each container application detects one executable file. For example, as shown in FIG. 3, three container instances are created: container Docker_1, container Docker_2, and container Docker_n. APP1 is started and detected in container Docker_1, APP2 is started and detected in container Docker_2, and APPn is started and detected in container Docker_n, so that different test files are detected in parallel in different containers.
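The one-container-per-file arrangement can be sketched with a thread pool that hands each test file to its own container. This is only an illustration under assumptions: the function detect_all is hypothetical, and the callable detect_one stands in for whatever actually starts the file in a container and returns a verdict, which is not modelled here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence, TypeVar

Verdict = TypeVar("Verdict")


def detect_all(
    samples: Sequence[str],
    detect_one: Callable[[str, str], Verdict],
    max_workers: int = 4,
) -> Dict[str, Verdict]:
    """Run detect_one(container_name, sample) for every sample in parallel."""
    # One container per sample, named after the Docker_1 ... Docker_n pattern.
    containers = [f"Docker_{i}" for i in range(1, len(samples) + 1)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so zip() pairs them correctly.
        verdicts = list(pool.map(detect_one, containers, samples))
    return dict(zip(samples, verdicts))
```

Because each call to detect_one receives a distinct container name, a file that misbehaves affects only its own container while the other detections proceed.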
The malicious file detection method described in FIG. 4 is illustrated below by the embodiment of FIG. 10. In the embodiment shown in FIG. 10, the malicious file detection method is applied in the field of network security. The detection device is specifically a network security device, for example a firewall, a router, a security gateway, or an intrusion detection device. The network security device ensures network security by detecting malicious files propagated in the network. It should be understood that, for the steps of the embodiment of FIG. 10 that are the same as those of the embodiment of FIG. 4, reference is made to the embodiment of FIG. 4, and they are not repeated here.
Referring to FIG. 10, FIG. 10 is a flowchart of a network security protection method provided by an embodiment of the present application. The method includes the following steps 1001 to 1007.
Step 1001: The network security device obtains a data stream transmitted in the network.
In this field, a data stream (also called a packet stream) refers to a series of packets from a source host to a destination, where the destination may be another host, a multicast group containing multiple hosts, or a broadcast domain.
Optionally, the network security device is an IDS-type device and obtains the data stream in bypass mode. That is, the network security device does not block the transmission of packets in the network; instead, it uses port mirroring to copy the packets flowing through the mirrored port, obtains the mirrored packets, and parses them to obtain the test file. Optionally, the network security device is an IPS-type device and inspects every passing packet in real time in in-line mode, so that it can block the transmission of a packet in the network when the test file carried in the packet is a malicious file.
Step 1002: The network security device obtains the test file from the data stream. Optionally, after obtaining the payload data of each packet in the data stream, the network security device reassembles the payload data of all packets in the data stream according to the sequence numbers of the packets, thereby obtaining the test file.
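The reassembly in step 1002 can be sketched as sorting payloads by sequence number and concatenating them. This is a simplified illustration: the function reassemble and the (sequence number, payload) pair format are assumptions, and the sketch treats sequence numbers purely as ordering keys without handling gaps, overlapping segments, or sequence number wraparound.

```python
from typing import Dict, Iterable, Tuple


def reassemble(packets: Iterable[Tuple[int, bytes]]) -> bytes:
    """Rebuild the file from (sequence_number, payload) pairs of one data stream."""
    by_sequence: Dict[int, bytes] = {}
    for sequence_number, payload in packets:
        # Keep the first copy of a segment and ignore retransmitted duplicates.
        by_sequence.setdefault(sequence_number, payload)
    # Concatenate the payloads in ascending sequence-number order.
    return b"".join(by_sequence[n] for n in sorted(by_sequence))
```

For instance, packets that arrive out of order or more than once still produce the same reconstructed byte string, which is then treated as the test file.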
Step 1003: The network security device runs the test file in the virtual operating environment.
Step 1004: The network security device obtains the first API sequence called by the test file during running.
Step 1005: The network security device executes the second API sequence in the second operating system.
Step 1006: The network security device determines whether the test file is a malicious file based on the behavior characteristics of the test file while the first API sequence is called.
Step 1007: The network security device performs intrusion prevention on the network according to the detection result.
For example, if the network security device is an IDS-type device and determines that the test file is a malicious file, the network security device sends an alarm message. As another example, if the network security device is an IPS-type device and determines that the test file is a malicious file, the network security device discards the packet, thereby blocking its transmission, and sends an alarm message.
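The difference between the two device types in step 1007 can be sketched as a small decision function. The enum DeviceMode, the function respond and the action strings are hypothetical names chosen for the illustration; the behavior for non-malicious files (no action on a mirrored copy, forwarding on an in-line device) is an assumption, since the description only covers the malicious case.

```python
from enum import Enum
from typing import List


class DeviceMode(Enum):
    IDS = "ids"  # bypass deployment: works on mirrored packets
    IPS = "ips"  # in-line deployment: sits on the forwarding path


def respond(mode: DeviceMode, malicious: bool) -> List[str]:
    """Return the actions taken for one inspected packet."""
    if mode is DeviceMode.IDS:
        # An IDS cannot block traffic; it can only raise an alarm.
        return ["alert"] if malicious else []
    if mode is DeviceMode.IPS:
        # An IPS drops the packet to block transmission and also raises an alarm.
        return ["drop", "alert"] if malicious else ["forward"]
    raise ValueError(f"unknown mode: {mode!r}")
```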
With the method provided by the embodiments of the present application, the network security device implements the cross-platform dynamic malicious file detection solution to detect the test files carried in packets and performs intrusion prevention based on the detection results, so that malicious packets transmitted in the network are detected in time and network security is improved. In particular, by means of operating system simulation, the network security device removes the dependence of the detection process on the operating system. When a packet transmitted in the network carries a malicious file that performs malicious operations using the Windows operating system, the virtual operating environment is used to simulate the Windows operating system, and the malicious file is run in the virtual operating environment so that the malicious packet is detected. When a packet transmitted in the network carries a malicious file that performs malicious operations using the Linux operating system, the virtual operating environment is used to simulate the Linux operating system, and the malicious file is run in the virtual operating environment so that the malicious packet is detected. The network security device can therefore dynamically detect both test files that run on Windows and test files that run on Linux, without being required to be based on an X86 platform compatible with Windows or an ARM platform compatible with Linux. This overcomes the usage limitations of the network security device, greatly expands the scenarios to which the network intrusion prevention method applies, and improves the security of the network system.
Optionally, the network security device provided in the embodiment of FIG. 10 is applied in an enterprise network and is deployed at the gateway device and the cloud platform entrance of the enterprise network. By executing the method provided by the embodiments of the present application, the network security device can dynamically detect malicious files and thereby provide a network security solution for the enterprise network.
FIG. 11 shows examples of several possible deployment scenarios of the network security device. For example, referring to FIG. 11, the enterprise network includes a headquarters local area network and the local area networks of several branch offices. The headquarters local area network includes the respective local area networks of the data center 1102, the core office area, office area A, and office area B, each of which is connected to the firewall 1105 through a switch. The firewall 1105 is further connected to the wide area network or the Internet through the router 1101, a NAT device (not shown in the figure), a gateway device (not shown in the figure), and so on. The firewall 1105 isolates the headquarters local area network from the wide area network or the Internet and protects the data exchanged between them. Optionally, the headquarters local area network is connected through a VPN to the local area network 1104 of each branch office, such as branch office A, branch office B, and branch office C shown in FIG. 11.
Optionally, the network security device provided in FIG. 10 is deployed in the enterprise network shown in FIG. 11. For example, referring to FIG. 11, the network security device is the first network security device, the second network security device, the third network security device, or the fourth network security device.
The first network security device is deployed at the network egress of the headquarters local area network, that is, between the firewall 1105 and the router 1101. For example, the first network security device is integrated in an egress firewall, an egress router, or a bypass firewall. The first network security device is used to guard against malicious test files and malicious web traffic from the Internet.
The second network security device is deployed at the boundary of the data center 1102 of the headquarters local area network. For example, it is an independent device deployed in-line between the data center 1102 and the firewall 1105. It is used to protect core server assets and to discover latent attacks, malicious scanning, penetration, and the like in the intranet.
The third network security device is deployed at the boundary of the core department 1103 of the headquarters local area network. For example, it is an independent device deployed in-line between the switch of the core office area and the firewall 1105. It is used to prevent suspicious test files from spreading in the intranet and laterally infecting the core department.
The fourth network security device is deployed at the boundary of the branch office local area network 1104. For example, it is an independent device deployed in-line between the branch office local area network and the wide area network routing device. It is used to prevent malicious test files and unknown threats from spreading freely between the branch office local area network and the headquarters local area network.
The malicious file detection method described in FIG. 10 is illustrated below by the embodiment of FIG. 12. In the embodiment shown in FIG. 12, the network security device is the first network security device in FIG. 11, and the test file is a file carried in a packet entering or leaving through the network egress of the headquarters local area network. In other words, the method flow described in FIG. 12 concerns how a network security device deployed at the network egress of the headquarters local area network protects the network security of the headquarters local area network. It should be understood that, for the steps of the embodiment of FIG. 12 that are the same as those of the embodiment of FIG. 10, reference is made to the embodiment of FIG. 10, and they are not repeated here.
Referring to FIG. 12, FIG. 12 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 12, the method may include the following steps 1201 to 1206.
步骤1201、第一网络安全设备采集从总部局域网的网络出口进入或流出的报文,得到该报文携带的测试文件,该测试文件为第一操作系统的可执行文件。 Step 1201. The first network security device collects a message that enters or exits from the network exit of the headquarters LAN, and obtains a test file carried in the message, and the test file is an executable file of the first operating system.
步骤1202、第一网络安全设备在虚拟运行环境中运行测试文件。Step 1202: The first network security device runs the test file in the virtual operating environment.
步骤1203、第一网络安全设备获得测试文件在运行过程中调用的第一API序列。Step 1203: The first network security device obtains the first API sequence called during the running of the test file.
步骤1204、第一网络安全设备在第二操作系统中执行第二API序列。Step 1204: The first network security device executes the second API sequence in the second operating system.
步骤1205、第一网络安全设备基于第一API序列被调用过程中测试文件的行为特征,判断测试文件是否为恶意文件。Step 1205: The first network security device judges whether the test file is a malicious file based on the behavioral characteristics of the test file in the process in which the first API sequence is called.
图12实施例中的步骤1202的具体细节请参考图4实施例中的步骤403,图12实施例中的步骤1203的具体细节请参考图4实施例中的步骤404,图12实施例中的步骤1204的具体细节请参考图4实施例中的步骤405,图12实施例中的步骤1205的具体细节请参考图4实施例中的步骤406,为了简洁,在此不再赘述。For specific details of step 1202 in the embodiment of FIG. 12, please refer to step 403 in the embodiment of FIG. 4, and for specific details of step 1203 in the embodiment of FIG. 12, please refer to step 404 in the embodiment of FIG. For specific details of step 1204, please refer to step 405 in the embodiment of FIG. 4, and for specific details of step 1205 in the embodiment of FIG. 12, please refer to step 406 in the embodiment of FIG.
步骤1206、若测试文件为恶意文件,第一网络安全设备上报在总部局域网的网络出口检测到恶意流量。Step 1206: If the test file is a malicious file, the first network security device reports that malicious traffic is detected at the network exit of the headquarters LAN.
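Steps 1201 to 1206 can be sketched roughly as follows. This is a minimal illustration, not the implementation of the embodiment: the API name mapping `API_MAP`, the rule set `SUSPICIOUS_CALLS`, the `run_and_trace` callback, and the assumption that the second API sequence is derived by mapping each call one-to-one are all hypothetical and introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical mapping from first-OS API names (Windows-style here) to
# second-OS API names (POSIX-style here). The real mapping is not given here.
API_MAP = {"CreateFileW": "open", "WriteFile": "write", "CloseHandle": "close"}

# Hypothetical behavioural rule: any of these calls marks the file as malicious.
SUSPICIOUS_CALLS = {"CreateRemoteThread", "WriteProcessMemory"}


@dataclass
class Verdict:
    malicious: bool
    first_api_sequence: List[str]
    second_api_sequence: List[str]
    report: Optional[str]


def inspect_file(
    test_file: bytes,
    run_and_trace: Callable[[bytes], List[str]],
    location: str,
) -> Verdict:
    # Steps 1202-1203: run the file in the virtual runtime environment and
    # record the first API sequence it calls.
    first_seq = run_and_trace(test_file)
    # Step 1204 (assumed form): derive the second API sequence by mapping each
    # call that has a known counterpart; a real device would then execute it.
    second_seq = [API_MAP[name] for name in first_seq if name in API_MAP]
    # Step 1205: judge the file from the behaviour observed during the calls.
    malicious = any(name in SUSPICIOUS_CALLS for name in first_seq)
    # Step 1206: report only when the file is judged malicious.
    report = f"malicious traffic detected at {location}" if malicious else None
    return Verdict(malicious, first_seq, second_seq, report)
```

The tracing callback is passed in so that the same flow could be reused by a device deployed at any of the locations shown in FIG. 11; only the `location` string in the report would differ.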
With the method provided by this embodiment, a network security device is deployed at the network egress of the headquarters LAN. The device captures the packets passing through the network egress and applies the cross-platform dynamic malicious file detection scheme to the test files carried in those packets, and intrusion prevention is performed based on the detection result. If a packet sent from the Internet to the headquarters LAN carries a malicious file that performs malicious operations by means of the Windows operating system, the network security device uses the virtual runtime environment to simulate the Windows operating system and runs the malicious file in that environment, thereby detecting the malicious packet. If a packet sent from the Internet to the headquarters LAN carries a malicious file that performs malicious operations by means of the Linux operating system, the network security device uses the virtual runtime environment to simulate the Linux operating system and runs the malicious file in that environment, thereby detecting the malicious packet. The method can therefore effectively protect the headquarters LAN against malicious traffic from the Internet and improve the network security of the headquarters LAN.
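This section does not say how the device decides which operating system to simulate for a given test file. One common approach, shown here purely as an assumption, is to inspect the leading magic bytes of the file: Windows PE executables begin with the DOS header bytes `MZ`, and Linux ELF executables begin with `0x7F 'E' 'L' 'F'`. The function name `guess_target_os` is hypothetical.

```python
def guess_target_os(file_bytes: bytes) -> str:
    """Guess which operating system an executable targets from its magic bytes."""
    if file_bytes[:4] == b"\x7fELF":  # ELF header, used by Linux executables
        return "linux"
    if file_bytes[:2] == b"MZ":  # DOS/PE header, used by Windows executables
        return "windows"
    return "unknown"
```

A real device would likely validate more of the header before choosing an environment, since two or four bytes are easy to forge.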
The malicious file detection method described in FIG. 10 of the embodiments of this application is illustrated below through the embodiment of FIG. 13. In the embodiment shown in FIG. 13, the network security device is the second network security device in FIG. 11, and the test file is a file carried in a packet entering or leaving across the boundary of the data center. In other words, the method flow described in FIG. 13 concerns how a network security device deployed at the boundary of the data center protects the network security of the data center. It should be understood that, for steps in the embodiment of FIG. 13 that are similar to those in the embodiment of FIG. 10, reference may be made to the embodiment of FIG. 10; they are not described again in the embodiment of FIG. 13.
Referring to FIG. 13, FIG. 13 is a flowchart of a network security protection method provided by an embodiment of this application. As shown in FIG. 13, the method may include the following steps 1301 to 1306.
Step 1301: The second network security device captures a packet entering or leaving across the boundary of the data center and obtains the test file carried in the packet, where the test file is an executable file of a first operating system.
Step 1302: The second network security device runs the test file in a virtual runtime environment.
Step 1303: The second network security device obtains the first API sequence called by the test file while it runs.
Step 1304: The second network security device executes a second API sequence in a second operating system.
Step 1305: The second network security device determines whether the test file is a malicious file based on the behavioral characteristics of the test file while the first API sequence is called.
For details of step 1302 in the embodiment of FIG. 13, refer to step 403 in the embodiment of FIG. 4; for details of step 1303, refer to step 404 in the embodiment of FIG. 4; for details of step 1304, refer to step 405 in the embodiment of FIG. 4; and for details of step 1305, refer to step 406 in the embodiment of FIG. 4. For brevity, they are not repeated here.
Step 1306: If the test file is a malicious file, the second network security device reports that malicious traffic has been detected at the boundary of the data center.
With the method provided by this embodiment, a network security device is deployed at the boundary of the data center. The device captures the packets passing through the network egress and applies the cross-platform dynamic malicious file detection scheme to the test files carried in those packets, and intrusion prevention is performed based on the detection result. If a packet transmitted inside the data center carries a malicious file that performs malicious operations by means of the Windows operating system, the network security device uses the virtual runtime environment to simulate the Windows operating system and runs the malicious file in that environment, thereby detecting the malicious packet. If a packet sent from the Internet to the data center carries a malicious file that performs malicious operations by means of the Linux operating system, the network security device uses the virtual runtime environment to simulate the Linux operating system and runs the malicious file in that environment, thereby detecting the malicious packet. The method therefore helps to discover malicious files spreading on the data center intranet, to protect core server assets, and to discover latent attacks, malicious scanning, penetration, and similar activity on the intranet.
The malicious file detection method described in FIG. 10 of the embodiments of this application is illustrated below through the embodiment of FIG. 14. In the embodiment shown in FIG. 14, the network security device is the third network security device in FIG. 11, and the test file is a file carried in a packet transmitted inside the core department. In other words, the method flow described in FIG. 14 concerns how a network security device deployed at the boundary of the core department protects the network security of the core department. It should be understood that, for steps in the embodiment of FIG. 14 that are similar to those in the embodiment of FIG. 10, reference may be made to the embodiment of FIG. 10; they are not described again in the embodiment of FIG. 14.
Referring to FIG. 14, FIG. 14 is a flowchart of a network security protection method provided by an embodiment of this application. As shown in FIG. 14, the method may include the following steps 1401 to 1406.
Step 1401: The third network security device captures a packet transmitted inside the core department and obtains the test file carried in the packet, where the test file is an executable file of a first operating system.
Step 1402: The third network security device runs the test file in a virtual runtime environment.
Step 1403: The third network security device obtains the first API sequence called by the test file while it runs.
Step 1404: The third network security device executes a second API sequence in a second operating system.
Step 1405: The third network security device determines whether the test file is a malicious file based on the behavioral characteristics of the test file while the first API sequence is called.
For details of step 1402 in the embodiment of FIG. 14, refer to step 403 in the embodiment of FIG. 4; for details of step 1403, refer to step 404 in the embodiment of FIG. 4; for details of step 1404, refer to step 405 in the embodiment of FIG. 4; and for details of step 1405, refer to step 406 in the embodiment of FIG. 4. For brevity, they are not repeated here.
Step 1406: If the test file is a malicious file, the third network security device reports that malicious traffic has been detected inside the core department.
With the method provided by this embodiment, a network security device is deployed at the boundary of the core department. The device captures the packets entering and leaving the core department and applies the cross-platform dynamic malicious file detection scheme to the test files carried in those packets, and intrusion prevention is performed based on the detection result. If a packet transmitted inside the core department carries a malicious file that performs malicious operations by means of the Windows operating system, the network security device uses the virtual runtime environment to simulate the Windows operating system and runs the malicious file in that environment, thereby detecting the malicious packet. If a packet sent from the Internet to the core department carries a malicious file that performs malicious operations by means of the Linux operating system, the network security device uses the virtual runtime environment to simulate the Linux operating system and runs the malicious file in that environment, thereby detecting the malicious packet. The method therefore helps to discover malicious files spreading on the core department intranet and to prevent suspicious test files from spreading over the intranet and laterally infecting the core department.
The malicious file detection method described in FIG. 10 of the embodiments of this application is illustrated below through the embodiment of FIG. 15. In the embodiment shown in FIG. 15, the network security device is the fourth network security device in FIG. 11, and the test file is a file carried in a packet entering or leaving across the boundary of the branch LAN. In other words, the method flow described in FIG. 15 concerns how a network security device deployed at the boundary of the branch LAN protects the network security of the branch LAN. It should be understood that, for steps in the embodiment of FIG. 15 that are similar to those in the embodiment of FIG. 10, reference may be made to the embodiment of FIG. 10; they are not described again in the embodiment of FIG. 15.
Referring to FIG. 15, FIG. 15 is a flowchart of a network security protection method provided by an embodiment of this application. As shown in FIG. 15, the method may include the following steps 1501 to 1506.
Step 1501: The fourth network security device captures a packet entering or leaving across the boundary of the branch LAN and obtains the test file carried in the packet.
For example, optionally, the packet includes at least one of the following: a packet transmitted within the branch LAN, a packet transmitted between the enterprise headquarters and the branch LAN, a packet flowing from an external network into the branch LAN, or a packet flowing from the branch LAN to an external network.
Step 1502: The fourth network security device runs the test file in a virtual runtime environment.
Step 1503: The fourth network security device obtains the first API sequence called by the test file while it runs.
Step 1504: The fourth network security device executes a second API sequence in a second operating system.
Step 1505: The fourth network security device determines whether the test file is a malicious file based on the behavioral characteristics of the test file while the first API sequence is called.
For details of step 1502 in the embodiment of FIG. 15, refer to step 403 in the embodiment of FIG. 4; for details of step 1503, refer to step 404 in the embodiment of FIG. 4; for details of step 1504, refer to step 405 in the embodiment of FIG. 4; and for details of step 1505, refer to step 406 in the embodiment of FIG. 4. For brevity, they are not repeated here.
Step 1506: If the test file is a malicious file, the fourth network security device reports that malicious traffic has been detected at the boundary of the branch LAN.
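The four kinds of packet that step 1501 may capture can be told apart by their source and destination addresses. The sketch below is an assumption made for illustration only: the address ranges `BRANCH_NET` and `HQ_NET`, the `PacketKind` enum, and the `classify` function are hypothetical and do not come from the embodiment.

```python
import ipaddress
from enum import Enum
from typing import Optional

# Hypothetical address plan, used only for this example.
BRANCH_NET = ipaddress.ip_network("10.2.0.0/16")
HQ_NET = ipaddress.ip_network("10.1.0.0/16")


class PacketKind(Enum):
    BRANCH_INTERNAL = "within the branch LAN"
    HQ_BRANCH = "between headquarters and the branch LAN"
    INBOUND = "external network to branch LAN"
    OUTBOUND = "branch LAN to external network"


def classify(src: str, dst: str) -> Optional[PacketKind]:
    """Return the category of a packet, or None if it does not touch the branch LAN."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    s_branch, d_branch = s in BRANCH_NET, d in BRANCH_NET
    if s_branch and d_branch:
        return PacketKind.BRANCH_INTERNAL
    if (s_branch and d in HQ_NET) or (s in HQ_NET and d_branch):
        return PacketKind.HQ_BRANCH
    if d_branch:
        return PacketKind.INBOUND
    if s_branch:
        return PacketKind.OUTBOUND
    return None
```

Returning `None` for traffic that never touches the branch LAN keeps the device from inspecting packets outside the scope described for the fourth network security device.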
With the method provided by this embodiment, a network security device is deployed at the boundary of the branch LAN. The device captures the packets crossing the boundary and applies the cross-platform dynamic malicious file detection scheme to the test files carried in those packets, and intrusion prevention is performed based on the detection result. If a packet sent from the headquarters LAN to the branch LAN carries a malicious file that performs malicious operations by means of the Windows operating system, the network security device uses the virtual runtime environment to simulate the Windows operating system and runs the malicious file in that environment, thereby detecting the malicious packet. If a packet sent from the headquarters LAN to the branch LAN carries a malicious file that performs malicious operations by means of the Linux operating system, the network security device uses the virtual runtime environment to simulate the Linux operating system and runs the malicious file in that environment, thereby detecting the malicious packet. The method can therefore effectively protect the branch LAN against malicious traffic from the headquarters LAN, keep malicious test files and unknown threats from spreading freely between the branch LAN and the headquarters LAN, and improve the network security of the branch LAN.
In some possible embodiments, optionally, the above method embodiments are applied in a virtualization architecture, and the execution body of the method embodiments is an entity corresponding to a network element in the virtualization architecture.
For example, the virtualization architecture is the NFV architecture.
Referring to FIG. 16, the NFV architecture includes NFV MANO and VNFs. NFV MANO has three main functional blocks: the NFV orchestrator, the VNF manager, and the virtualised infrastructure manager (VIM). Put simply, the NFV orchestrator orchestrates services and resources, controls new network services, and integrates VNFs into the virtual architecture; it also validates and authorizes resource requests for the NFV infrastructure. The VNF manager manages the life cycle of VNFs. The VIM controls and manages the NFV infrastructure, including compute, storage, and network resources. For NFV MANO to be effective, it must integrate with the APIs of existing systems so that technologies from multiple vendors can be used across multiple network domains. Likewise, the operator's operations support system (OSS) and business support system (BSS) need to interoperate with the NFV MANO system.
For example, optionally, the function of each component in FIG. 16 is as follows.
The network function virtualization orchestrator (NFVO) manages and processes network service descriptors (NSDs) and virtual network function forwarding graphs (VNFFGs), manages the life cycle of network services, and, together with the virtual network function manager (VNFM), manages the life cycle of virtual network functions (VNFs) and provides a global view of virtual resources.
The VNFM manages the life cycle of VNFs, including management of VNF descriptors (VNFDs), instantiation of VNFs, elastic scaling of VNF instances (for example, scaling out/up and/or scaling in/down), healing of VNF instances, and termination of VNF instances. The VNFM also supports receiving scaling policies delivered by the NFVO in order to scale VNFs automatically.
The virtualised infrastructure manager (VIM) is mainly responsible for managing (including reserving and allocating) the hardware resources and virtualized resources of the infrastructure layer, for monitoring the status of virtual resources and reporting faults, and for providing a virtualized resource pool to upper-layer applications.
The operations and business support systems (OSS/BSS) are the operator's existing operation and maintenance systems.
The element manager (EM) performs the traditional fault, configuration, accounting, performance, and security management (FCAPS) functions for VNFs.
A virtualized network function (VNF) corresponds to a physical network function (PNF) in a traditional non-virtualized network, for example a virtualized node of the evolved packet core (EPC) such as a mobility management entity (MME), a serving gateway (SGW), or a packet data network gateway (PGW). The functional behavior and state of a network function are independent of whether it is virtualized; the NFV technical requirements expect a VNF and a PNF to have the same functional behavior and external interfaces. Optionally, a VNF includes one or more VNF components (VNFCs) of a lower functional level.
NFV infrastructure (NFVI): includes hardware resources, virtual resources, and a virtualization layer. From the perspective of a VNF, the virtualization layer and the hardware resources appear as a single entity that provides the required virtual resources.
In some embodiments, optionally, the hardware resources of the NFVI form a heterogeneous system, that is, a system that includes hardware using different types of instruction sets and architectures, such as computing hardware, storage hardware, and network hardware. For example, as shown in FIG. 16, the heterogeneous system includes an X86 CPU and an ARM CPU. The virtualization layer of the NFVI implements the operating system simulation function and the instruction conversion function described in the above method embodiments. The virtual resources of the NFVI include a container, which provides the virtual runtime environment described in the above method embodiments and is provided as a VNF.
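On such a heterogeneous system, instruction conversion is only needed when the instruction set of the test file differs from that of the CPU running it. As a hedged illustration for ELF executables, the sketch below reads the `e_machine` field of the ELF header (a 16-bit value at byte offset 18, with the byte order given by byte 5) and compares the CPU family with that of the host. The function names and the two-family grouping are assumptions made for this example; the `e_machine` constants are those defined by the ELF specification.

```python
import struct

# e_machine values defined by the ELF specification.
EM_386, EM_ARM, EM_X86_64, EM_AARCH64 = 3, 40, 62, 183
FAMILY = {EM_386: "x86", EM_X86_64: "x86", EM_ARM: "arm", EM_AARCH64: "arm"}


def elf_cpu_family(header: bytes) -> str:
    """Return 'x86', 'arm', or 'other' for the start of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # EI_DATA (byte 5): 1 means little-endian, 2 means big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return FAMILY.get(e_machine, "other")


def needs_instruction_translation(header: bytes, host_family: str) -> bool:
    """True when the file targets a CPU family different from the host's."""
    return elf_cpu_family(header) != host_family
```

For instance, an x86-64 ELF file handled on the ARM CPU of FIG. 16 would need instruction conversion, while the same file handled on the X86 CPU would not.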
The malicious file detection method described in FIG. 4 of the embodiments of this application is illustrated below through the embodiment of FIG. 17. In the embodiment shown in FIG. 17, the malicious file detection method is applied in the NFV architecture, and the detection device is a VNF. A software provider delivers the test file to the detection device through the NFVO, VNFM, or VIM in NFV MANO, and the detection device can return its detection result for the test file to NFV MANO.
Optionally, if the malicious file detection method is implemented by a container, the VNF performs malicious file detection by running the container, and the container is delivered to the VNF by NFV MANO. Optionally, a CaaS manager is deployed in NFV MANO, and the CaaS manager delivers the container to the VNF. Optionally, another network element in NFV MANO delivers the container to the VNF. This yields a containerized VNF, which performs malicious file detection on the test file by running the container. Here, a containerized VNF is a VNF created on a container, and an instance of a containerized VNF includes one or more VNFC instances. Optionally, one VNFC is mapped to one container application in the CaaS service, or one VNF is mapped to one container application in the CaaS service.
The flow of the malicious file detection method based on the NFV architecture is described below with reference to FIG. 17. The method may include the following steps 1701 to 1706.
Step 1701: NFV MANO sends a test file to the VNF, where the test file is an executable file of a first operating system.
Step 1702: The VNF receives the test file and runs it in a virtual runtime environment.
Step 1703: The VNF obtains the first API sequence called by the test file while it runs.
Step 1704: The VNF executes a second API sequence in a second operating system.
Step 1705: The VNF determines whether the test file is a malicious file based on the behavioral characteristics of the test file while the first API sequence is called.
Step 1706: The VNF sends the detection result to NFV MANO.
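The exchange between NFV MANO and the VNF in steps 1701 and 1706 can be pictured as a request carrying the test file and a reply carrying the verdict. The embodiment does not define a message format, so the JSON layout, the field names `file_id`, `payload`, and `malicious`, and the `is_malicious` callback standing in for steps 1702 to 1705 are all assumptions made for this sketch.

```python
import base64
import json
from typing import Callable


def build_request(file_id: str, test_file: bytes) -> str:
    """Step 1701 (assumed encoding): NFV MANO wraps the test file in a JSON message."""
    payload = base64.b64encode(test_file).decode("ascii")
    return json.dumps({"file_id": file_id, "payload": payload})


def vnf_handle(request: str, is_malicious: Callable[[bytes], bool]) -> str:
    """Steps 1702-1706: the VNF decodes the file, runs detection, and builds the reply."""
    message = json.loads(request)
    test_file = base64.b64decode(message["payload"])
    verdict = is_malicious(test_file)  # stands in for steps 1702 to 1705
    return json.dumps({"file_id": message["file_id"], "malicious": verdict})
```

Base64 is used here only because JSON cannot carry raw bytes; any transport that preserves the executable unchanged would serve the same purpose.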
In the NFV architecture, the functions of network elements generally no longer depend on dedicated hardware. Instead, each network element in the telecommunications network is virtualized as software, and the software is deployed on general-purpose hardware, which decouples software from hardware.
With the method provided by the embodiments of this application, the malicious file detection function no longer depends on specific hardware, and no dedicated hardware is needed to implement the detection flow. This matches the fundamental goal of decoupling software from hardware in NFV: the malicious file detection function can be virtualized as a VNF and applied in a virtualization architecture such as NFV, which extends NFV to the malicious file detection scenario.
The malicious file detection method of the embodiments of this application has been described above. The malicious file detection apparatus of the embodiments of this application is described below. It should be understood that the malicious file detection apparatus has any function of the execution body of the above method embodiments.
FIG. 18 is a schematic structural diagram of a malicious file detection apparatus provided by an embodiment of this application. As shown in FIG. 18, the malicious file detection apparatus includes an acquisition module 1801, a running module 1802, an execution module 1803, and a judgment module 1804.
The acquisition module 1801 is configured to obtain the test file; for example, it may be used to perform step 402, step 501, step 801, step 903, step 1002, step 1201, step 1301, step 1401, step 1501, or step 1702 in the above method embodiments;
The running module 1802 is configured to run the test file; for example, it may be used to perform step 403, step 502, step 802, step 904, step 1003, step 1202, step 1302, step 1402, step 1502, or step 1703 in the above method embodiments;
The acquisition module 1801 is further configured to obtain the first API sequence; for example, it may be used to perform step 404, step 503, step 803, step 905, step 1004, step 1203, step 1303, step 1403, step 1503, or step 1703 in the above method embodiments;
The execution module 1803 is configured to execute the second API sequence; for example, it may be used to perform step 405, step 504, step 804, step 906, step 1005, step 1204, step 1304, step 1404, step 1504, or step 1704 in the above method embodiments;
The judgment module 1804 is configured to determine whether the test file is a malicious file; for example, it may be used to perform step 406, step 505, step 805, step 907, step 1006, step 1205, step 1305, step 1405, step 1505, or step 1705 in the above method embodiments.
Optionally, the execution module 1803 is configured to perform step 1 to step 3 in step 405.
Optionally, the execution module 1803 is configured to perform step (1) to step (3) in step 504.
Optionally, the execution module 1803 is configured to perform step 8041 to step 8043.
Optionally, the execution module 1803 is configured to perform step a to step b in step 405.
It should be understood that the apparatus for detecting a malicious file provided in the embodiment of FIG. 18 corresponds to the detection device in the foregoing method embodiments. The modules in the apparatus, and the other operations and/or functions described above, respectively implement the steps and methods performed by the detection device in the method embodiments. For specific details, refer to the foregoing method embodiments; for brevity, they are not repeated here.
It should be understood that, when the apparatus provided in the embodiment of FIG. 18 detects a malicious file, the division into the foregoing functional modules is used only as an example. In actual applications, the foregoing functions may be assigned to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. In addition, the apparatus provided in the foregoing embodiment and the foregoing embodiments of the method for detecting a malicious file are based on the same concept. For the specific implementation process, refer to the method embodiments; details are not repeated here.
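As an illustration only, the division into obtaining, running, execution, and judging functions described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiments: the names `API_MAP`, `SUSPICIOUS_APIS`, and `Detector` are hypothetical, the API name pairs in the mapping table are illustrative placeholders, and the running function is supplied by the caller as a stub instead of launching a container-based virtual running environment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical mapping from first-operating-system API names to
# second-operating-system API names (illustrative pairs only).
API_MAP: Dict[str, str] = {
    "CreateFileW": "open",
    "WriteFile": "write",
    "CloseHandle": "close",
}

# Hypothetical behavior rule: API names treated as suspicious.
SUSPICIOUS_APIS = {"WriteProcessMemory", "CreateRemoteThread"}


@dataclass
class Detector:
    # Obtaining function: returns the bytes of the test file.
    acquire: Callable[[], bytes]
    # Running function: runs the test file and returns the first API sequence.
    run: Callable[[bytes], List[str]]

    def execute(self, first_seq: List[str]) -> List[str]:
        """Execution function: build the second API sequence by mapping."""
        return [API_MAP[api] for api in first_seq if api in API_MAP]

    def judge(self, first_seq: List[str]) -> bool:
        """Judging function: flag the file if any suspicious API was called."""
        return any(api in SUSPICIOUS_APIS for api in first_seq)

    def detect(self) -> Tuple[List[str], bool]:
        test_file = self.acquire()
        first_seq = self.run(test_file)
        second_seq = self.execute(first_seq)
        return second_seq, self.judge(first_seq)


# Usage with stub functions standing in for the real modules.
detector = Detector(
    acquire=lambda: b"MZ",
    run=lambda _file: ["CreateFileW", "WriteProcessMemory", "CloseHandle"],
)
second_sequence, is_malicious = detector.detect()
```

Because the obtaining and running functions are passed in as callables, the same pipeline can be regrouped into a different set of modules without changing the overall flow, which is the point made about module division above.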
An embodiment of this application further provides a computer program product. When the computer program product runs on a detection device, the detection device is caused to perform the method for detecting a malicious file provided in the foregoing method embodiments.
An embodiment of this application further provides a chip. When the chip runs on a detection device, the detection device is caused to perform the method for detecting a malicious file provided in the foregoing method embodiments. The chip may be a general-purpose processor that includes a processing circuit, and an input interface and an output interface that are internally connected to and communicate with the processing circuit. The processing circuit is configured to perform, through the input interface, the step of obtaining the test file in the foregoing method embodiments, and to perform the steps of running the test file, obtaining the first API sequence, executing the second API sequence, and determining whether the test file is a malicious file in the foregoing method embodiments. Optionally, the general-purpose processor may further include a storage medium, and the processing circuit is configured to perform, through the storage medium, the storage steps in the foregoing method embodiments. The storage medium may store instructions to be executed by the processing circuit, and the processing circuit is configured to execute the instructions stored in the storage medium to perform the foregoing method embodiments.
A person of ordinary skill in the art may be aware that the method steps and units described with reference to the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the steps and composition of the embodiments have been described above generally in terms of their functions. Whether these functions are performed by hardware or software depends on the specific application and the design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each specific application, but such an implementation should not be considered to go beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may also be electrical, mechanical, or other forms of connection.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of this application.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any equivalent modification or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer program instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer program instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless manner. The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), a semiconductor medium (for example, a solid-state drive), or the like.
A person of ordinary skill in the art can understand that all or some of the steps of the foregoing embodiments may be completed by hardware, or by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing descriptions are merely optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims (22)

  1. A method for detecting a malicious file, wherein the method comprises:
    obtaining, by a detection device, a test file, wherein the test file is an executable file that runs on a first operating system;
    running, by the detection device, the test file in a virtual running environment, wherein the virtual running environment is generated based on a container technology;
    obtaining, by the detection device, a first API sequence called by the test file during running, wherein the first API sequence comprises at least one API, the APIs comprised in the first API sequence are APIs in a first API set, the first API set comprises a plurality of APIs that are provided by the virtual running environment and required for running software, identifiers of the APIs in the first API set are the same as identifiers of APIs in a second API set, and the second API set comprises a plurality of APIs that are provided by the first operating system and required for running software;
    executing, by the detection device, a second API sequence in a second operating system, wherein the second API sequence comprises at least one API, the APIs comprised in the second API sequence are APIs in the second operating system, a first API in the second API sequence has a mapping relationship with a first API in the first API sequence, and the second operating system is an operating system based on a computer instruction set architecture of the detection device; and
    determining, by the detection device, whether the test file is a malicious file based on behavior characteristics of the test file while the first API sequence is being called.
  2. The method according to claim 1, wherein the executing, by the detection device, a second API sequence in a second operating system comprises:
    obtaining, by the detection device for each API in the first API sequence, a corresponding function from a dynamic link library of the virtual running environment, to obtain a first function sequence, wherein the functions comprised in the first function sequence are used to implement the APIs comprised in the first API sequence;
    obtaining, by the detection device for each function in the first function sequence, a mapped function from a dynamic link library of the second operating system, to generate a second function sequence, wherein the functions comprised in the second function sequence are used to implement the APIs comprised in the second API sequence, and a first function in the second function sequence has a mapping relationship with a first function in the first function sequence; and
    performing, by the detection device, operations in a kernel of the second operating system according to the second function sequence.
  3. The method according to claim 2, wherein the first operating system is a Windows operating system, the second operating system is a Linux operating system, and the obtaining, by the detection device for each API in the first API sequence, a corresponding function from a dynamic link library of the virtual running environment comprises:
    obtaining, by the detection device for each API in the first API sequence, a corresponding function from a dynamic link library (DLL) file; and
    the obtaining, by the detection device for each function in the first function sequence, a mapped function from a dynamic link library of the second operating system comprises:
    obtaining, by the detection device, a mapped function from a shared object (SO) file according to each function in the first function sequence and a mapping relationship between functions.
  4. The method according to claim 2, wherein the first operating system is a Linux operating system, the second operating system is a Windows operating system, and the obtaining, by the detection device for each API in the first API sequence, a corresponding function from a dynamic link library of the virtual running environment comprises:
    obtaining, by the detection device for each API in the first API sequence, a corresponding function from an SO file; and
    the obtaining, by the detection device for each function in the first function sequence, a mapped function from a dynamic link library of the second operating system comprises:
    obtaining, by the detection device, a mapped function from a DLL file according to each function in the first function sequence and a mapping relationship between functions.
  5. The method according to any one of claims 1 to 4, wherein the executing, by the detection device, a second API sequence in a second operating system comprises:
    obtaining, by the detection device, first-type parameters called in the first API sequence, wherein the first-type parameters are input parameters of the APIs in the first API sequence; and
    executing, by the detection device, the second API sequence in the second operating system according to second-type parameters, wherein the second-type parameters are input parameters of the APIs in the second API sequence, and a first parameter in the second-type parameters has a mapping relationship with a first parameter in the first-type parameters.
  6. The method according to any one of claims 1 to 5, wherein the executing, by the detection device, a second API sequence in a second operating system comprises:
    obtaining, by the detection device, a first instruction sequence triggered by the test file during running, wherein the first instruction sequence comprises at least one instruction, and each instruction in the first instruction sequence is used to instruct calling of one API in the first API sequence;
    performing, by the detection device, a first instruction conversion on the instructions in the first instruction sequence, and obtaining a second instruction sequence according to a result of the first instruction conversion, wherein the second instruction sequence comprises at least one instruction, each instruction in the second instruction sequence is used to instruct calling of one API in the second API sequence, and the first instruction conversion is used to convert instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device; and
    executing, by the detection device, the second instruction sequence to implement operations corresponding to the second API sequence.
  7. The method according to claim 6, wherein the first operating system is a Windows operating system, the computer instruction set architecture of the detection device is an Advanced RISC Machine (ARM) architecture, and the performing, by the detection device, a first instruction conversion on the instructions in the first instruction sequence, and obtaining a second instruction sequence according to a result of the first instruction conversion comprises:
    converting, by the detection device, each X86 instruction in the first instruction sequence into an ARM instruction, and obtaining the second instruction sequence according to the ARM instructions obtained through conversion.
  8. The method according to claim 6, wherein the first operating system is a Linux operating system, the computer instruction set architecture of the detection device is an X86 architecture, and the performing, by the detection device, a first instruction conversion on the instructions in the first instruction sequence, and obtaining a second instruction sequence according to a result of the first instruction conversion comprises:
    converting, by the detection device, each ARM instruction in the first instruction sequence into an X86 instruction, and obtaining the second instruction sequence according to the X86 instructions obtained through conversion.
  9. The method according to claim 1, wherein after the executing, by the detection device, a second API sequence in a second operating system, the method further comprises:
    obtaining, by the detection device, a third instruction sequence, wherein the third instruction sequence represents a result obtained after the second API sequence is executed, and the instructions in the third instruction sequence belong to the computer instruction set of the detection device;
    performing, by the detection device, a second instruction conversion on each instruction in the third instruction sequence, and obtaining a fourth instruction sequence according to a result of the second instruction conversion, wherein the instructions in the fourth instruction sequence belong to a computer instruction set of the virtual running environment, and the second instruction conversion is used to convert instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based; and
    inputting, by the detection device, the fourth instruction sequence into the virtual running environment.
  10. The method according to any one of claims 1 to 9, wherein the virtual running environment is generated based on an image of a container, and the image encapsulates the first API set.
  11. The method according to any one of claims 1 to 10, wherein the container technology comprises Docker container technology, the virtual running environment is started by a Docker daemon, and the Docker daemon is a process that the detection device runs on the second operating system.
  12. An apparatus for detecting a malicious file, wherein the apparatus comprises:
    an obtaining module, configured to obtain a test file, wherein the test file is an executable file that runs on a first operating system;
    a running module, configured to: run the test file in a virtual running environment, wherein the virtual running environment is generated based on a container technology; and obtain a first API sequence called by the test file during running, wherein the first API sequence comprises at least one API, the APIs comprised in the first API sequence are APIs in a first API set, the first API set comprises a plurality of APIs that are provided by the virtual running environment and required for running software, identifiers of the APIs in the first API set are the same as identifiers of APIs in a second API set, and the second API set comprises a plurality of APIs that are provided by the first operating system and required for running software;
    an execution module, configured to execute a second API sequence in a second operating system, wherein the second API sequence comprises at least one API, the APIs comprised in the second API sequence are APIs in the second operating system, a first API in the second API sequence has a mapping relationship with a first API in the first API sequence, and the second operating system is an operating system based on a computer instruction set architecture of the detection device; and
    a judging module, configured to determine whether the test file is a malicious file based on behavior characteristics of the test file while the first API sequence is being called.
  13. The apparatus according to claim 12, wherein the execution module is configured to: obtain, for each API in the first API sequence, a corresponding function from a dynamic link library of the virtual running environment, to obtain a first function sequence, wherein the functions comprised in the first function sequence are used to implement the APIs comprised in the first API sequence; obtain, for each function in the first function sequence, a mapped function from a dynamic link library of the second operating system, to generate a second function sequence, wherein the functions comprised in the second function sequence are used to implement the APIs comprised in the second API sequence, and a first function in the second function sequence has a mapping relationship with a first function in the first function sequence; and perform operations in a kernel of the second operating system according to the second function sequence.
  14. The apparatus according to claim 13, wherein the first operating system is a Windows operating system, the second operating system is a Linux operating system, and the execution module is configured to: obtain, for each API in the first API sequence, a corresponding function from a dynamic link library (DLL) file; and obtain a mapped function from a shared object (SO) file according to each function in the first function sequence and a mapping relationship between functions.
  15. The apparatus according to claim 13, wherein the first operating system is a Linux operating system, the second operating system is a Windows operating system, and the execution module is configured to: obtain, for each API in the first API sequence, a corresponding function from an SO file; and obtain a mapped function from a DLL file according to each function in the first function sequence and a mapping relationship between functions.
  16. The apparatus according to any one of claims 12 to 15, wherein the execution module is configured to: obtain first-type parameters called in the first API sequence, wherein the first-type parameters are input parameters of the APIs in the first API sequence; and execute the second API sequence in the second operating system according to second-type parameters, wherein the second-type parameters are input parameters of the APIs in the second API sequence, and a first parameter in the second-type parameters has a mapping relationship with a first parameter in the first-type parameters.
  17. The apparatus according to any one of claims 12 to 16, wherein the execution module is configured to: obtain a first instruction sequence triggered by the test file during running, wherein the first instruction sequence comprises at least one instruction, and each instruction in the first instruction sequence is used to instruct calling of one API in the first API sequence; perform a first instruction conversion on the instructions in the first instruction sequence, and obtain a second instruction sequence according to a result of the first instruction conversion, wherein the second instruction sequence comprises at least one instruction, each instruction in the second instruction sequence is used to instruct calling of one API in the second API sequence, and the first instruction conversion is used to convert instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device; and execute the second instruction sequence to implement operations corresponding to the second API sequence.
  18. 根据权利要求17所述的装置,其特征在于,所述第一操作系统为Windows操作系统,所述检测设备的计算机指令集架构为进阶精简指令集机器ARM架构,所述执行模块,用于将所述第一指令序列中的每个X86指令转换为ARM指令,根据转换得到的ARM指令得到所述第二指令序列。The apparatus according to claim 17, wherein the first operating system is a Windows operating system, the computer instruction set architecture of the detection device is an advanced reduced instruction set machine ARM architecture, and the execution module is used for Each X86 instruction in the first instruction sequence is converted into an ARM instruction, and the second instruction sequence is obtained according to the converted ARM instruction.
  19. 根据权利要求17所述的装置,其特征在于,所述第一操作系统为Linux操作系统,所述检测设备的计算机指令集架构为X86架构,所述执行模块,用于将所述第一指令序列中的每个ARM指令转换为X86指令,根据转换得到的X86指令得到所述第二指令序列。The apparatus according to claim 17, wherein the first operating system is a Linux operating system, the computer instruction set architecture of the detection device is an X86 architecture, and the execution module is configured to transfer the first instruction Each ARM instruction in the sequence is converted into an X86 instruction, and the second instruction sequence is obtained according to the converted X86 instruction.
  20. The apparatus according to claim 17, wherein the execution module is configured to: obtain a third instruction sequence, wherein the third instruction sequence represents a result obtained after the second API sequence is executed, and the instructions in the third instruction sequence belong to the computer instruction set of the detection device; perform a second instruction conversion on each instruction in the third instruction sequence, and obtain a fourth instruction sequence according to a result of the second instruction conversion, wherein the instructions in the fourth instruction sequence belong to the computer instruction set of the virtual operating environment, and the second instruction conversion is used to convert instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based; and input the fourth instruction sequence into the virtual operating environment.
  21. A detection device, comprising a network interface, a memory, and a processor connected to the memory, wherein:
    the network interface is configured to obtain a test file, the test file being an executable file that runs on a first operating system;
    the memory is configured to store program instructions; and
    the processor is configured to execute the program instructions, so that the detection device performs the following operations:
    running the test file in a virtual operating environment, wherein the virtual operating environment is generated based on container technology;
    obtaining a first API sequence called by the test file while it is running, wherein the first API sequence comprises at least one API, each API in the first API sequence is an API in a first API set, the first API set comprises a plurality of APIs that are provided by the virtual operating environment and required for running software, an identifier of each API in the first API set is the same as an identifier of an API in a second API set, and the second API set comprises a plurality of APIs that are provided by the first operating system and required for running software;
    executing a second API sequence in a second operating system, wherein the second API sequence comprises at least one API, each API in the second API sequence is an API in the second operating system, a first API in the second API sequence has a mapping relationship with a first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device; and
    determining, based on behavior characteristics of the test file while the first API sequence is being called, whether the test file is a malicious file.
  22. A computer-readable storage medium, wherein the storage medium stores at least one instruction, and the instruction is read by a processor to cause a detection device to perform the method according to any one of claims 1 to 11.
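
The operations recited in claim 21 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the mapping table `API_MAP`, the host-side names in it, the rule list `SUSPICIOUS_PATTERNS`, and both helper functions are hypothetical and chosen only for illustration. The sketch assumes that the first API sequence has already been recorded inside the container-based virtual operating environment, maps each recorded API to a second-operating-system API through a lookup table, and uses a simple ordered-subsequence rule as a stand-in for the behavior-based determination, which the claim does not specify in detail.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical mapping: identifiers of first-OS APIs (as exposed by the
# virtual operating environment) -> names of second-OS (host) APIs.
API_MAP = {
    "CreateFileW": "host_open",
    "WriteFile": "host_write",
    "OpenProcess": "host_open_process",
    "WriteProcessMemory": "host_write_process_memory",
    "CreateRemoteThread": "host_create_remote_thread",
}

# Hypothetical behavior rules: ordered API subsequences treated as suspicious.
SUSPICIOUS_PATTERNS: List[Tuple[str, ...]] = [
    ("OpenProcess", "WriteProcessMemory", "CreateRemoteThread"),
]


def map_api_sequence(first_seq: Sequence[str]) -> List[str]:
    """Build the second API sequence from the first one via the mapping table.

    Raises KeyError when a recorded API has no mapped counterpart.
    """
    return [API_MAP[name] for name in first_seq]


def contains_subsequence(seq: Iterable[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `seq` in order (gaps allowed)."""
    it = iter(seq)
    # Each membership test consumes the iterator up to the match,
    # so later pattern items must appear after earlier ones.
    return all(step in it for step in pattern)


def is_malicious(first_seq: Sequence[str]) -> bool:
    """Stand-in verdict: flag the file if any suspicious pattern is observed."""
    return any(contains_subsequence(first_seq, p) for p in SUSPICIOUS_PATTERNS)


if __name__ == "__main__":
    recorded = ["CreateFileW", "OpenProcess", "WriteProcessMemory", "CreateRemoteThread"]
    print(map_api_sequence(recorded))
    print(is_malicious(recorded))
```

In this sketch the verdict is computed from the first API sequence, in line with the claim wording, while the mapped second sequence only represents what the host operating system would execute on behalf of the test file.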
PCT/CN2020/104449 2020-01-20 2020-07-24 Method, apparatus and device for detecting malicious file, and storage medium WO2021147282A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010065766.3 2020-01-20
CN202010065766.3A CN113139176A (en) 2020-01-20 2020-01-20 Malicious file detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2021147282A1 true WO2021147282A1 (en) 2021-07-29

Family

ID=76808881

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/104449 WO2021147282A1 (en) 2020-01-20 2020-07-24 Method, apparatus and device for detecting malicious file, and storage medium

Country Status (2)

Country Link
CN (1) CN113139176A (en)
WO (1) WO2021147282A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114995877A (en) * 2022-08-03 2022-09-02 平安银行股份有限公司 Variable configuration method and device
CN115686984A (en) * 2022-12-29 2023-02-03 江西萤火虫微电子科技有限公司 Board card function testing method and device, computer and readable storage medium
CN116107913A (en) * 2023-04-06 2023-05-12 阿里云计算有限公司 Test control method, device and system of single-node server

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742002A (en) * 2021-09-10 2021-12-03 上海达梦数据库有限公司 Method, device, equipment and storage medium for acquiring dependency relationship of dynamic library
TWI827203B (en) * 2022-08-18 2023-12-21 中華電信股份有限公司 Verification system and verification method for malicious file of container
CN115328580B (en) * 2022-10-13 2022-12-16 中科方德软件有限公司 Processing method, device and medium for registry operation in application migration environment
CN116760620B (en) * 2023-07-10 2024-03-26 释空(上海)品牌策划有限公司 Network risk early warning and management and control system of industrial control system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1701586A (en) * 2003-10-01 2005-11-23 株式会社东芝 Flexible protocol stack
US20160156665A1 (en) * 2014-05-15 2016-06-02 Lynx Software Technologies, Inc. Systems and Methods Involving Aspects of Hardware Virtualization such as hypervisor, detection and interception of code or instruction execution including API calls, and/or other features
CN109669782A (en) * 2017-10-13 2019-04-23 阿里巴巴集团控股有限公司 Hardware abstraction layer multiplexing method, device, operating system and equipment
CN110210219A (en) * 2018-05-30 2019-09-06 腾讯科技(深圳)有限公司 Recognition methods, device, equipment and the storage medium of virus document

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114995877A (en) * 2022-08-03 2022-09-02 平安银行股份有限公司 Variable configuration method and device
CN115686984A (en) * 2022-12-29 2023-02-03 江西萤火虫微电子科技有限公司 Board card function testing method and device, computer and readable storage medium
CN116107913A (en) * 2023-04-06 2023-05-12 阿里云计算有限公司 Test control method, device and system of single-node server
CN116107913B (en) * 2023-04-06 2023-11-14 阿里云计算有限公司 Test control method, device and system of single-node server

Also Published As

Publication number Publication date
CN113139176A (en) 2021-07-20

Similar Documents

Publication Publication Date Title
WO2021147282A1 (en) Method, apparatus and device for detecting malicious file, and storage medium
JP6942824B2 (en) Configurable logical platform
US11960605B2 (en) Dynamic analysis techniques for applications
US11720393B2 (en) Enforcing compliance rules using guest management components
US11604878B2 (en) Dynamic analysis techniques for applications
US10469512B1 (en) Optimized resource allocation for virtual machines within a malware content detection system
US11625489B2 (en) Techniques for securing execution environments by quarantining software containers
US10963268B1 (en) Interception of identifier indicative of client configurable hardware logic and configuration data
US10417031B2 (en) Selective virtualization for security threat detection
US11328060B2 (en) Multi-tiered sandbox based network threat detection
US10025612B2 (en) Enforcing compliance rules against hypervisor and host device using guest management components
Wu et al. AirBag: Boosting Smartphone Resistance to Malware Infection.
US8910238B2 (en) Hypervisor-based enterprise endpoint protection
RU2679175C1 (en) Method of behavioral detection of malicious programs using a virtual interpreter machine
US9251343B1 (en) Detecting bootkits resident on compromised computers
US10715554B2 (en) Translating existing security policies enforced in upper layers into new security policies enforced in lower layers
CN110362994B (en) Malicious file detection method, device and system
US10795742B1 (en) Isolating unresponsive customer logic from a bus
US11120148B2 (en) Dynamically applying application security settings and policies based on workload properties
US11914711B2 (en) Systems and methods for automatically generating malware countermeasures
Sandhu Implementation of Portable Security Analysis Tool
Aderholdt et al. Review of enabling technologies to facilitate secure compute customization

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20915033; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20915033; Country of ref document: EP; Kind code of ref document: A1)